Hi,

I believe that the standard for URLs calls for always encoding in UTF-8, but that all non-ASCII bytes (bytes with the high bit set) are to be further encoded using %xx hex notation. So the URL is always transmitted as an ASCII string, but is easily converted into a UTF-8 string simply by converting the %xx codes back into binary bytes. Thus firewalls and proxies need only deal with ASCII.

You're right, except for one thing: when the standard was created, UTF-8 did not exist yet, so it can't be the default. I think the standard says nothing about how non-ASCII characters are encoded (ISO-8859-*, UTF-8, or something else). And I am sure that browsers send back non-ASCII characters in the same encoding as the page that contained the form. So UTF-8 is not the default; there is no default.
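
To make the point concrete, here is a small sketch in Python (my choice of language for illustration; the helper name escape_as is made up). It percent-encodes the same character as a browser might for an ISO-8859-2 page and for a UTF-8 page. The escapes differ, and the receiver cannot tell from the URL alone which encoding was used.

```python
from urllib.parse import quote, unquote

def escape_as(text: str, page_encoding: str) -> str:
    """Percent-encode text after converting it to bytes in page_encoding.

    Hypothetical helper: it mimics a browser submitting a form in the
    encoding of the page that contained the form.
    """
    return quote(text, safe="", encoding=page_encoding)

# The same character, submitted from two differently encoded pages.
latin2_escape = escape_as("ő", "iso-8859-2")  # one byte:  %F5
utf8_escape = escape_as("ő", "utf-8")         # two bytes: %C5%91

# Decoding only works if the receiver guesses the right encoding.
right_guess = unquote(latin2_escape, encoding="iso-8859-2")
wrong_guess = unquote(latin2_escape, encoding="utf-8")  # invalid UTF-8

print(latin2_escape, utf8_escape)
print(right_guess, repr(wrong_guess))
```

With the wrong guess, Python substitutes the replacement character U+FFFD instead of the original letter, which is the kind of damage a server sees when it assumes UTF-8 for a form that was posted from an ISO-8859-2 page.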


Bye,
  Andras
