I think there might be a problem with _normalize_path, from HTTP::Cookies. I'll explain what happens with my Python port, because I have no idea how Perl and unicode interact: a unicode URI got passed to my equivalent of _normalize_path() (in Python, a unicode string is a separate type from an ordinary byte-string). That function complained because there were non-ASCII characters in the unicode string, and it refused to guess which encoding to use.
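For illustration, here's roughly the shape of that check in my port (a hypothetical sketch, not the real code; urllib.parse.quote stands in for the escaping step):

```python
from urllib.parse import quote

def normalize_path(path: str) -> str:
    # Sketch only (hypothetical, not my actual implementation):
    # refuse to guess an encoding for non-ASCII input rather than
    # silently pick one.
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        raise ValueError("non-ASCII path: which encoding?")
    # Escape anything unsafe; '/' and existing '%' escapes are
    # left alone so already-escaped sequences survive.
    return quote(path, safe="/%")
```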
The stated purpose of _normalize_path is to allow plain string comparison of HTTP URI paths, but I don't understand a) how that's possible, given that the URI character encoding isn't always known, and b) why it's necessary -- why not just compare without any normalization?

The trouble is that RFC 2396 doesn't specify any URI character encoding, but it does allow %-escapes, which are defined in terms of octets. So, when you see a URI containing %-escaped characters, you have to know the original URI character encoding in order to work out what characters they represent. Unfortunately, I don't think that's always possible (is it?), so normalizing to "fully-escaped" form (as _normalize_path does) may involve assuming a different encoding than the one used to partially escape the URI before HTTP::Cookies had anything to do with it. Escaping with inconsistent character encodings certainly seems bad.

Am I correct? Why not just leave URIs un-normalized? If they must be normalized, how should unicode URIs (or non-ASCII ones, generally) get normalized?

This is all very confusing, especially to an English speaker who never reads or writes anything but ASCII!

John
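P.S. A concrete illustration of the ambiguity (Python, with a made-up 'é' path): the same character produces different octets, hence different %-escapes, under different encodings, so two "fully escaped" forms of one path need not compare equal.

```python
from urllib.parse import quote

# The octets behind a %-escape depend on the original character
# encoding, which RFC 2396 does not record anywhere in the URI.
path = "/caf\u00e9"  # '/café'

latin1 = quote(path.encode("latin-1"))  # '/caf%E9'
utf8 = quote(path.encode("utf-8"))      # '/caf%C3%A9'

# Both are fully-escaped forms of the same path, yet plain
# string comparison treats them as different paths.
assert latin1 != utf8
```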