On Oct 1, 2007, at 8:52 AM, Hossein Sharifi wrote:
I don't see this problem in 4.0 or 4.5.
It looks like searching for 秋 on bookmooch first goes to
http://www.bookmooch.com/search?w=%E7%A7%8B&search.x=14&search.y=13
(which looks fine - that's the correct URL encoding of the UTF-8
representation of that character)
but that page immediately redirects to a cleaner search URL
( http://bookmooch.com/m/s/...) which contains the $map(...).
Maybe it's related to the code that builds the cleaner URL?
Thanks, Hossein, for the insight.
You were right: my problem was caused by a bug in the ncgi URL
encoding function.
ncgi builds a $map() array of character conversions, puts
$map(character) around anything that isn't A-Za-z0-9, and then puts a
[subst -nocommand $string] around all that.
Higher Unicode characters such as 秋 cause problems with the ncgi
function, and they emerge from it with $map() still wrapped around
them. I added a simple regexp to remove $map() from what ncgi can't
encode, and now it works perfectly.
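
For anyone hitting the same thing, here is a minimal sketch of the general
idea, in Python rather than Tcl and not the actual ncgi code or my regexp
fix. It assumes the safest approach is to percent-encode the UTF-8 bytes of
the string instead of looking up whole characters in a table. The name
url_encode is hypothetical, and unlike a form encoder it writes a space as
%20 rather than +.

```python
# Hypothetical helper, not part of ncgi: percent-encode by UTF-8 byte,
# so every character maps to bytes 0-255 and nothing falls outside the table.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def url_encode(text: str) -> str:
    """Leave A-Za-z0-9 alone and percent-encode every other byte."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Two uppercase hex digits per byte, e.g. 0xE7 -> %E7
            out.append("%{:02X}".format(byte))
    return "".join(out)


if __name__ == "__main__":
    print(url_encode("秋"))  # %E7%A7%8B
```

The output for 秋 matches the w=%E7%A7%8B parameter in the first search URL
above.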
I have to say that AOLserver's handling of UTF-8 is really well done.
At Lyris, we never did quite get the UTF-8 handling in TclHttpd done
perfectly; there were still some fringe cases that caused garbling.
With AOLserver, except for this ncgi problem, and figuring out that I
needed to switch to utf-8 as the default (from the default
ISO 8859-1), non-English character sets have worked perfectly.
-john