On Oct 1, 2007, at 8:52 AM, Hossein Sharifi wrote:
I don't see this problem in 4.0 or 4.5.
It looks like searching for 秋 on bookmooch first goes to
http://www.bookmooch.com/search?w=%E7%A7%8B&search.x=14&search.y=13
(which looks fine - that's the correct URL encoding of the UTF-8 representation of that character) but that page immediately redirects to a cleaner search URL ( http://bookmooch.com/m/s/...) which contains the $map(...). Maybe it's related to the code that builds the cleaner URL?

Thanks Hossein, for the insight.

You were right, my problem was caused by a bug in the ncgi url encoding function.

ncgi builds a $map() array of character conversions, wraps $map(character) around anything that isn't A-Za-z0-9, and then runs [subst -nocommand $string] over the result.

The higher-UTF characters cause problems with the ncgi function, and they emerge from it with $map() still wrapped around them. I added a simple regexp to remove $map() from whatever ncgi can't encode, and now it works perfectly.
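
For anyone following along, here is a rough sketch of what a URL encoder needs to do with a character like 秋: convert it to its UTF-8 bytes first, then percent-encode each byte. This is Python rather than Tcl, it is not the ncgi code and not my actual fix, and the function name percent_encode is made up for illustration.

```python
import string

# Characters passed through untouched, mirroring the A-Za-z0-9 rule above.
SAFE = set(string.ascii_letters + string.digits)


def percent_encode(text: str) -> str:
    """Percent-encode everything except ASCII letters and digits.

    Hypothetical helper for illustration only: each unsafe character is
    first turned into its UTF-8 bytes, and every byte becomes %XX.
    """
    out = []
    for ch in text:
        if ch in SAFE:
            out.append(ch)
        else:
            # A multi-byte character yields several %XX groups.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode("秋"))  # %E7%A7%8B
```

That output matches the w= parameter in the first search URL quoted above, which is why that URL looked fine.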

I have to say that the aolserver handling of UTF is really well done.

At Lyris, we never did quite get the UTF handling in TclHttpd done perfectly; there were still some fringe cases that caused garbling.

With aolserver, apart from this ncgi problem and figuring out that I needed to switch the default encoding to UTF-8 (from the default ISO 8859-1), non-English character sets have worked perfectly.

-john


