Hi,

I've got a terrible headache... It happens every time I try to touch the bugs related to encodings - any of them...

I'm sure you already know (but I just found out) what "surrogate" characters are. I knew that Unicode is _not_ just 16 bits, but I had no idea that UTF-16 with surrogates covers 21 bits (as opposed to UCS-4 - 31 bits).

I'll try to get something working this weekend.

Craig - you may want to take a look: the code in "DefaultServlet" creates a writer for each encoding (that's terribly expensive) and doesn't seem to deal with surrogates (well, the second part is not a problem - I doubt anyone would use hieroglyphs or musical signs in a URL).

Now, the biggest problem is, as usual, M$. For some strange reason, MSIE's javascript encode() method generates %XXXX sequences instead of %XX%XX (as most would expect). That means the whole decoding might have to be rewritten for 3.3 (Apache doesn't deal with that either).

Question: what should happen with the context path? It is supposed to be returned in its original form (not decoded) - but that can't work, since a given path can be encoded in many ways.

I'm also not sure what should happen in web.xml and in server.xml (where the path is defined) - should we use %xx-encoded URLs there? But what would that mean for characters that have multiple encodings?

The solution I have in mind right now is to keep doing all the mappings and the web.xml processing - and all internal operations - with decoded characters, while keeping the "original" form for the facade, so servlets get what they expect.

Any ideas? I'm not sure I can handle this.

Costin
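
For illustration only, here is a minimal sketch of the decoding problem described above. It is not the Tomcat or Apache code. It assumes the non-standard MSIE form is the "%uXXXX" escape (one UTF-16 code unit per escape, as produced by JavaScript's escape() function), whereas the message writes it as %XXXX, and it assumes that ordinary %XX escapes carry UTF-8 bytes. The class and method names (PercentDecoderSketch, decode) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Tomcat.
class PercentDecoderSketch {

    // Decodes "%XX" escapes as UTF-8 bytes (an assumption) and
    // "%uXXXX" escapes as single UTF-16 code units (also an assumption).
    static String decode(String s) {
        StringBuilder out = new StringBuilder();
        // Consecutive %XX bytes are collected so that multi-byte UTF-8
        // sequences are decoded as a unit.
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c != '%') {
                flush(pending, out);
                out.append(c);
                i++;
            } else if (i + 1 < s.length()
                    && (s.charAt(i + 1) == 'u' || s.charAt(i + 1) == 'U')) {
                flush(pending, out);
                // Two such escapes in a row may form a surrogate pair.
                out.append((char) hex(s, i + 2, 4));
                i += 6;
            } else {
                pending.write(hex(s, i + 1, 2));
                i += 3;
            }
        }
        flush(pending, out);
        return out.toString();
    }

    // Converts the collected bytes to characters; invalid UTF-8 becomes U+FFFD.
    private static void flush(ByteArrayOutputStream pending, StringBuilder out) {
        if (pending.size() > 0) {
            out.append(new String(pending.toByteArray(), StandardCharsets.UTF_8));
            pending.reset();
        }
    }

    // Parses exactly `count` ASCII hex digits starting at `start`.
    private static int hex(String s, int start, int count) {
        if (start + count > s.length()) {
            throw new IllegalArgumentException("truncated escape in: " + s);
        }
        int value = 0;
        for (int k = start; k < start + count; k++) {
            char ch = s.charAt(k);
            int d = (ch >= '0' && ch <= '9') ? ch - '0'
                  : (ch >= 'a' && ch <= 'f') ? ch - 'a' + 10
                  : (ch >= 'A' && ch <= 'F') ? ch - 'A' + 10
                  : -1;
            if (d < 0) {
                throw new IllegalArgumentException("bad hex digit in: " + s);
            }
            value = value * 16 + d;
        }
        return value;
    }

    public static void main(String[] args) {
        // Three spellings of the same path decode to the same string.
        String a = decode("/caf%C3%A9");
        String b = decode("/caf%u00E9");
        String c = decode("/%63af%C3%A9");
        System.out.println(a.equals(b) && b.equals(c)); // prints true
    }
}
```

The three inputs in main show why the undecoded form cannot serve as a key for mappings: several spellings name the same path, so under these assumptions the comparison has to happen on decoded characters, with the original string kept only to hand back to the servlet. A character outside the 16-bit range, such as U+1D11E, arrives either as four UTF-8 bytes or as two %uXXXX escapes forming a surrogate pair, and both decode to the same two-char Java string.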