Dan Jacobson <[EMAIL PROTECTED]> writes:
> looking at a lasttime index
>
> Hmmm, maybe unicode query strings that look bad in the indexes, unless
> one's browser is in unicode mode, should be converted to &#nnn; sequences,
>
>
>http://news.google.com/news?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=%E5%B7%AB%E6%BA%90%E8%B2%B4&sa=N&tab=wn
>
> [the normal big5 strings look fine to me when converted in the browser]:
>
>
>http://ofind.sina.com.tw/cgi-bin/sinanews.exe?query=%B1%E6%BB%B7%C3%E8&database=monthly&sort=date&paging=1&redirect=1

The problem is that there is no way to know that the URL has Unicode
in it, so there is no way to choose the correct conversion.

This is a general problem with URLs, and it is why there are defined ways
of converting a URL to a limited set of characters that everybody can
represent. A URL carries no tag that tells you the format or meaning of
its contents, unlike a web page, which can declare that it uses Unicode.
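
To illustrate the point, here is a rough Python sketch (not anything
WWWOFFLE does) that unescapes the two query strings quoted above and
tries each candidate charset on the resulting bytes. The helper names
try_decode and to_ncr, and the choice of "utf-8" and "big5" as the
candidates, are assumptions made for this example only.

```python
from urllib.parse import unquote_to_bytes

# Query values copied from the two URLs quoted above.
UTF8_QUERY = "%E5%B7%AB%E6%BA%90%E8%B2%B4"
BIG5_QUERY = "%B1%E6%BB%B7%C3%E8"


def try_decode(raw, charset):
    """Return raw decoded with charset, or None if the bytes are not valid in it."""
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


def to_ncr(text):
    """Convert non-ASCII characters to &#nnn; numeric character references."""
    return "".join(c if ord(c) < 128 else "&#%d;" % ord(c) for c in text)


# The URL itself only yields bytes; the charset has to be guessed.
for query in (UTF8_QUERY, BIG5_QUERY):
    raw = unquote_to_bytes(query)
    for charset in ("utf-8", "big5"):
        decoded = try_decode(raw, charset)
        shown = "invalid" if decoded is None else to_ncr(decoded)
        print(query, charset, shown)
```

A failed decode only rules a charset out. When the bytes happen to be
valid in more than one charset, nothing in the URL says which reading
was intended, so any conversion to &#nnn; sequences would rest on a
guess.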
--
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop [EMAIL PROTECTED]
http://www.gedanken.demon.co.uk/
WWWOFFLE users page:
http://www.gedanken.demon.co.uk/wwwoffle/version-2.7/user.html