Gilles Detillieux wrote:
> I tried entering the following string as the search word in a search.html
> form that uses the GET method, in Netscape Communicator 4.04:
>
> ~!@#$%^&*()_+`-=[]{}\|;:'",<.>/?
>
> This is what appeared in the words= part of the query string:
>
>
> %7E%21@%23%24%25%5E%26*%28%29_%2B%60-%3D%5B%5D%7B%7D%5C%7C%3B%3A%27%22%2C%3C.%3E%2F%3F
>
> So it seems the only unencoded punctuation characters, from Netscape, are
> @*_-. while the current default for encodeURL is ?_@.=&/:
>
> Lynx seems to give the same results. Unless someone points me to a
> standard reference that contradicts this, I'm tempted to go with what
> Lynx & Netscape use for encoding.

RFC1738 <http://info.internet.isi.edu/in-notes/rfc/files/rfc1738.txt> says to
encode:

  Unprintable US-ASCII:
    0x00-0x1F, 0x7F  (control characters)
    0x80-0xFF        (outside US-ASCII)

  Unsafe (MUST encode in URL):
    <>"              (URL delimiters)
    #                (fragment delimiter)
    %                (URL encoder)
    {}|\^~[]`        (unsafe due to gateway and transport agent modifications)

  Reserved for schemes (encode when not used for reserved purpose):
    ;/?:@=&

RFC1738 summarizes: Thus, only alphanumerics, the special characters
"$-_.+!*'(),", and reserved characters used for their reserved purposes may be
used unencoded within a URL.
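Thus it appears that Netscape and Lynx depart from this RFC in only one
respect: they leave @ unencoded. The other characters they leave alone
(*_-.) are all on the RFC's list of characters that may appear unencoded.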
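
As a rough check of the comparison above, here is a minimal Python sketch. It
assumes the RFC1738 rule is applied literally (alphanumerics plus "$-_.+!*'(),"
stay as they are, everything else becomes %XX). The names encode_rfc1738 and
unencoded_chars are hypothetical helpers for illustration, not ht://Dig
functions.

```python
import re
import string

# Characters RFC1738 allows unencoded: alphanumerics plus these specials.
RFC1738_SPECIALS = "$-_.+!*'(),"
RFC1738_SAFE = set(string.ascii_letters + string.digits + RFC1738_SPECIALS)


def encode_rfc1738(text):
    """Percent-encode every byte that is not in the RFC1738 safe set."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in RFC1738_SAFE else "%{:02X}".format(byte))
    return "".join(out)


def unencoded_chars(query_value):
    """Return the set of characters left after stripping %XX escapes."""
    return set(re.sub(r"%[0-9A-Fa-f]{2}", "", query_value))


# The words= value Netscape produced, copied from the quoted message.
netscape = ("%7E%21@%23%24%25%5E%26*%28%29_%2B%60-%3D%5B%5D%7B%7D"
            "%5C%7C%3B%3A%27%22%2C%3C.%3E%2F%3F")

# Characters Netscape left alone that RFC1738 would have encoded.
print(sorted(unencoded_chars(netscape) - RFC1738_SAFE))  # ['@']
```

Note that this sketch leaves + unencoded, as the RFC summary permits. In a
form query string a + is conventionally read back as a space, so an encoder
meant for the words= parameter would probably want to encode it as well.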

RFC2368 <http://info.internet.isi.edu/in-notes/rfc/files/rfc2368.txt> says to
encode ampersands as &amp; when a mailto: URL appears in an HTML document.
Presumably this is because encoding it as a hex code would, after decoding,
expose a naked ampersand to the HTML parser. This RFC (which defines the
mailto: URL scheme) also says to encode parentheses, but this applies only to
the RFC822 (address) portion of mailto: URLs.
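
To illustrate the HTML side of this, a small Python sketch: the ampersand that
separates query parameters stays a plain & in the URL itself and is written as
&amp; only when the URL is placed in HTML. The function name href_for_html and
the example.com URL are made up for the example.

```python
import html


def href_for_html(url):
    """Escape a URL for use inside an HTML attribute value."""
    # html.escape turns & into &amp; (and also escapes <, >, and quotes).
    return html.escape(url)


url = "http://example.com/search?words=a&page=2"  # hypothetical URL
print(href_for_html(url))  # http://example.com/search?words=a&amp;page=2
```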
--
Fred Condo + [EMAIL PROTECTED] + http://webclass.csuchico.edu/
[EMAIL PROTECTED] + fredcondo on Yahoo Pager