Number 228 - May 2002

Obscure Any URL
from PChelp - January 23, 2002
   The weird-looking addresses above take advantage of several things many people don't know about the structure of a valid URL. There's a little more to Internet addressing than commonly meets the eye; there are conventions which allow for some interesting variations in how an Internet address is expressed. These tricks are known to the spammers and scammers, and they're used freely in unsolicited mails. You'll also see them in ad-related URLs and occasionally on web pages where the writer hopes to avoid recognition of a linked address for whatever reason. Now, I'm making these tricks known to you. Read on, and you'll soon be very hard to fool. (Note: Depending on your browser type and its version, some of the oddly-formatted URLs on this page may not work. Also if you're on a LAN and using a proxy [gateway] for Internet access, many of them are unlikely to work. Also, fear not; this page does not exploit the "Dotless IP Address" vulnerability of some IE versions.)

HOW IT'S DONE
    First take note of the "@" symbol that appears amid all those numbers. Everything between "http://" and "@" is irrelevant to where the browser goes. Just about anything can go in there, and it makes no difference to the final result. This part of the URL is actually meant for authentication: if a login name and/or password is required to access a web page, it can be included here and login will be automatic. But if the page requires no authentication, the authentication text is in effect ignored by both browser and server. This presents interesting possibilities for confusing the unsuspecting user. How about this one:

    Http://www.playboy.com@3484559912/obscure.htm

    If you didn't know better, you might think this page were at playboy.com!
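
    To see which part of such an address is the real host, a URL parser can split it for you. The following is a minimal sketch using Python's standard urllib.parse module; the function name split_userinfo is only an illustration and is not something from the original page.

```python
from urllib.parse import urlsplit


def split_userinfo(url):
    """Return (decoy_text, real_host) for a URL that may contain an "@"."""
    parts = urlsplit(url)
    # username is whatever sits between "http://" and "@" (None if absent);
    # hostname is the part after "@", which is where the request really goes.
    return parts.username, parts.hostname


decoy, host = split_userinfo("Http://www.playboy.com@3484559912/obscure.htm")
print("decoy text:", decoy)
print("real host: ", host)
```

    Here the decoy text comes back as www.playboy.com and the real host as 3484559912, the number explained next.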

    By the way, the @ symbol can be represented by its hex code %40 to further confuse things; this works for the IE browser, but not for Netscape. All right, so what about that long number after the "@"? How does 3484559912 get you to www.pc-help.org?

    In actual fact, the two are equivalent to one another. This takes a little explaining so follow me carefully here.

    The first thing you need to know (most Net users know this), is that Internet names translate to numbers called IP addresses. An IP address is normally seen in "dotted decimal" format. www.pc-help.org translates to 207.178.42.40.

    Numeric IP addresses are generally unrecognizable to people. That's why we use names for network locations in the first place.

    Merely using an IP address, in its usual dotted-decimal format, in place of the name is commonly done and can be quite effective at leaving the human reader in the dark.

    But there are other ways to express that same number. The alternate formats are:

    * "dword" - meaning double word because it consists essentially of two binary "words" of 16 bits; but it is expressed in decimal (base 10);

    * "octal", meaning it's expressed in base 8; and

    * "hexadecimal", meaning it's expressed in base 16 (hexa = 6 + deci = 10).

    The dword equivalent of 207.178.42.40 is 3484559912. Its octal and hexadecimal equivalents are also illustrated below.
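
    As a rough illustration of the three alternate formats, the sketch below derives each one from a dotted-decimal address. The function name ip_formats and the dictionary keys are my own choices for this example, and the code assumes a well-formed address of four numbers.

```python
def ip_formats(dotted):
    """Return the dword, octal and hexadecimal spellings of a dotted IP."""
    octets = [int(part) for part in dotted.split(".")]

    # Build the dword by treating the four octets as digits in base 256.
    dword = 0
    for octet in octets:
        dword = dword * 256 + octet

    return {
        "dword": str(dword),
        # A leading 0 marks each number as octal.
        "octal": ".".join("0%o" % o for o in octets),
        # A leading 0x marks each number as hexadecimal.
        "hex": ".".join("0x%02X" % o for o in octets),
    }


for name, value in ip_formats("207.178.42.40").items():
    print(name, value)
```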

    Okay, so what about the rest of the URL?

    Here's how all that gibberish on the right works:

    Individual characters of a URL's path and filename can be represented by their numbers in hexadecimal form. Each hex number is preceded by a "%" symbol to identify the following two numbers/letters as a hexadecimal representation of the character. The practical use for this is to make it possible to include spaces and unusual characters in a URL. But it works for all characters and can render perfectly readable text into a complete hash.

    In my example, I have interspersed hex representations with the real letters of the URL. It simply spells out "/obscure.htm" in the final analysis:

    /  o  %62  s  %63  ur  %65  %2e  %68  t  %6D
    /  o  b    s  c    ur  e    .    h    t  m

    The letters used in the hex numbers can be either upper or lower case. The "slashes" in the address cannot be represented in hex; nor can the IP address be rendered this particular way. But everything else can be.
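
    The hex trick is easy to undo and to reproduce. The sketch below decodes the example path with unquote from Python's standard urllib.parse module, and includes a small helper that obscures a path the same way. The helper obscure_path and its every parameter are hypothetical names for this example; it assumes plain ASCII text and leaves slashes alone, as described above.

```python
from urllib.parse import unquote


def obscure_path(path, every=2):
    """Replace every `every`-th character with its %XX hex code, except "/"."""
    out = []
    for i, ch in enumerate(path):
        if ch != "/" and i % every == 0:
            out.append("%%%02X" % ord(ch))  # e.g. "b" becomes "%62"
        else:
            out.append(ch)
    return "".join(out)


# Decoding the example from the text recovers the readable path.
print(unquote("/o%62s%63ur%65%2e%68t%6D"))

# Obscuring and then decoding gives back the original.
hidden = obscure_path("/obscure.htm")
print(hidden, "->", unquote(hidden))
```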

HEXADECIMAL CHARACTER CODES
    Hex character codes are simply the hexadecimal (base 16) numbers for the ASCII character set; that is, the number-to-letter representations which comprise virtually all computer text.

    For most people, the conversion is probably best done with a chart. The best ASCII-to-hex chart I have ever seen is on the website of Jim Price: http://www.jimprice.com/jim-asc.htm. Jim explains the ASCII character set wonderfully well, and provides a wealth of handy charts.

MORE ON DOTTED-DECIMAL IPS
    Here's another address for this page: http://463.434.298.552/obscure.htm

    Normally, the four IP numbers in a standard dotted-decimal address will all be between 0 and 255. In fact they must translate to an 8-bit binary number (ones and zeroes), which can represent a quantity no higher than 255.

    But the way this number is handled by some software often allows for a value higher than 255. The program uses only the 8 right-hand digits of the binary number, and will drop the rest if the number is too large.

    This means you can add multiples of 256 to any or all of the 4 segments of an IP address, and it will often still work. In my tests, it was limited to 3 digits per number; values over 999 didn't work.
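
    Dropping everything but the 8 right-hand binary digits is the same as taking the remainder after dividing by 256. A minimal sketch of that reduction follows; wrap_octets is an illustrative name, and the code only models the arithmetic, not the behavior of any particular browser.

```python
def wrap_octets(dotted):
    """Reduce each number of a dotted address to its low 8 bits."""
    return ".".join(str(int(part) % 256) for part in dotted.split("."))


# The oversized address from the text reduces to the normal one.
print(wrap_octets("463.434.298.552"))
```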

CONVERTING AN IP ADDRESS TO DWORD FORMAT
    Here's a way to do this with very simple math.

    Multiply the numbers of the IP address by the following fixed values (which are powers of 256), then add the results:

    For example, for the address 206.159.40.2:

    3456106496 = 206 x 16777216 (256^3)
      10420224 = 159 x 65536    (256^2)
         10240 =  40 x 256      (256^1)
             2 =   2 x 1        (256^0)
    __________
    3466536962

    Now, there is a further step that can make this address even more obscure. You can add to this dword number any multiple of the quantity 4294967296 (256^4), and it will still work. This is because when the sum is converted to its basic digital form, the last 8 hexadecimal digits will remain the same. Everything to the left of those 8 hex digits is discarded by the IP software and therefore irrelevant.
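
    The multiply-and-add method, and the effect of adding multiples of 4294967296, can be sketched in a few lines. The function names below are assumptions made for this example; the reverse conversion simply keeps the low 32 bits (the last 8 hex digits) and splits them into four numbers.

```python
def dotted_to_dword(dotted):
    """Multiply each number by its power of 256 and add the results."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return a * 256**3 + b * 256**2 + c * 256**1 + d * 256**0


def dword_to_dotted(value):
    """Discard everything above 32 bits, then split into four 8-bit numbers."""
    value %= 256**4  # 4294967296
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


dword = dotted_to_dword("207.178.42.40")
print(dword)

# Adding any multiple of 256^4 leaves the resulting address unchanged.
padded = dword + 5 * 4294967296
print(padded, "->", dword_to_dotted(padded))
```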

OCTAL IP ADDRESSES
    As if all this weren't enough, an IP address can also be represented in octal form - base 8. The URL for this page with its IP address in octal form looks like this: http://0317.0262.052.050/obscure.htm

    Note the leading zeroes. They're necessary to convey to your browser the fact that this is an octal number. Any number of leading zeroes can be added to any or all of the numbers in the address.
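
    Reading an octal address back is straightforward, because extra leading zeroes do not change a number's value. A minimal sketch, with octal_to_dotted as an illustrative name:

```python
def octal_to_dotted(octal_ip):
    """Convert a dotted octal address to ordinary dotted decimal."""
    # int(text, 8) reads the digits as base 8; leading zeroes are harmless.
    return ".".join(str(int(part, 8)) for part in octal_ip.split("."))


print(octal_to_dotted("0317.0262.052.050"))
print(octal_to_dotted("000317.0262.0000052.050"))
```

    Both calls print the same dotted-decimal address.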

    I'll spare you a detailed description of octal conversion. For those who can't figure it out, there's a nifty URLomatic at www.samspade.org that will do it for you.

    There is yet another obscure way to express an IP address. Using the method outlined above, calculate the hexadecimal number for 207.178.42.40. That number (CFB22A28) can be expressed as an IP address in this manner:

    0xCF.0xB2.0x2A.0x28

    The "0x" designates each number as a hex quantity. The dots can be omitted, and the entire hex number preceded by 0x:

    0xCFB22A28

    And additional arbitrary hex digits can be added to the left of the "real" number:

    0x9A3F0800CFB22A28

    Some browsers (Netscape 3.x and 4.x for instance) won't work with hex IPs; but for IE users, this page's URL can be:

    http://0xCF.0xB2.0x2A.0x28/obscure.htm
    or: http://0xCFB22A28/obscure.htm
    or: http://0x9A3F0800CFB22A28/obscure.htm
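
    All three hex spellings reduce to the same address once the extra digits on the left are thrown away. The sketch below handles both the dotted and the undotted forms; hex_to_dotted is a hypothetical name, and the code models only the arithmetic described here, not how any given browser parses the address.

```python
def hex_to_dotted(hex_ip):
    """Convert a hex IP (dotted or one long number) to dotted decimal."""
    if "." in hex_ip:
        # Dotted form: each part is its own hex number, e.g. 0xCF.
        octets = [int(part, 16) for part in hex_ip.split(".")]
    else:
        # Single number: keep only the last 8 hex digits (32 bits).
        value = int(hex_ip, 16) & 0xFFFFFFFF
        octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(o) for o in octets)


for text in ("0xCF.0xB2.0x2A.0x28", "0xCFB22A28", "0x9A3F0800CFB22A28"):
    print(text, "->", hex_to_dotted(text))
```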

IN SUM
    URLs can be obscured at least three ways:

    1) Meaningless or deceptive text can be added after the "http://" and before an "@" symbol.

    2) The domain name can be expressed as an IP address, in dotted-decimal, dword, octal or hexadecimal format; and all of these formats have variants.

    3) Characters in the URL can also be expressed as hexadecimal (base 16) numbers.