Re: Basic Authentication Failed with multibyte username
André,

On 1/21/2010 6:35 PM, André Warnier wrote:
> Basically, I would tend to say that if the server knows who the clients are and vice-versa, you should be free to use any encoding you want, with the limitation that what is exchanged on the wire conforms to HTTP (because there may be proxies on the way which are not so tolerant).

+1

> What the client is sending is already (in a way) conformant to HTTP, because it is base64 encoded and so, on the surface, it does not contain non-ascii characters.

+1

> But the problem is that the standard Tomcat code which decodes the Basic Authorization header does not work in the way you want, for these illegal headers. And this code should preferably not be changed in a way which breaks the conformance with standard HTTP. Because if you do that, then your Tomcat becomes useless for anything else than your special client.

+1

Another possibility would be to use something like SecurityFilter, which allows you to (more easily) write your own authenticator and realm implementations, and you could write a BasicAuthenticator that reads these specially-formatted credentials.

I checked the sf source, and it looks like we might have a bug:

    private String decodeBasicAuthorizationString(String authorization) {
        if (authorization == null || !authorization.toLowerCase().startsWith("basic ")) {
            return null;
        } else {
            authorization = authorization.substring(6).trim();
            // Decode and parse the authorization credentials
            return new String(Base64.decodeBase64(authorization.getBytes()));
        }
    }

That authorization.getBytes() is just asking for trouble, because it uses the platform default encoding to convert characters to bytes. It should be using US-ASCII, ISO-8859-1, or something like that.

It also calls the String constructor with a byte array without specifying the encoding, therefore using the platform default.

Finally, this method is private, so a subclass cannot override it; making it overridable would be a nice feature. Maybe I'll fix all that. :)

> Or, you drop the container-managed security, and you use something like the SecurityFilter (http://securityfilter.sourceforge.net/), but read the homepage carefully first.

Note that the warning about BASIC authentication is waaay outdated: sf definitely does support BASIC auth.

-chris

To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
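The three fixes listed above (an explicit charset for getBytes(), an explicit charset for the String constructor, and a non-private method) could look roughly like the sketch below. This is not SecurityFilter's actual code: the class name BasicCredentialsDecoder and the constructor parameter are hypothetical, and java.util.Base64 (Java 8+) stands in for the commons-codec call in the quoted snippet so that the example is self-contained.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Hypothetical helper class; the name and shape are assumptions for illustration.
class BasicCredentialsDecoder {

    // Charset used to turn the decoded credential bytes into a String.
    private final Charset credentialCharset;

    BasicCredentialsDecoder(Charset credentialCharset) {
        this.credentialCharset = credentialCharset;
    }

    // Protected rather than private, so a subclass can override the decoding.
    protected String decodeBasicAuthorizationString(String authorization) {
        // Locale.ROOT keeps the scheme check independent of the default locale.
        if (authorization == null
                || !authorization.toLowerCase(Locale.ROOT).startsWith("basic ")) {
            return null;
        }
        String token = authorization.substring(6).trim();
        // Base64 text is pure ASCII, so name that charset instead of relying
        // on the platform default. decode() throws IllegalArgumentException
        // if the token is not valid Base64.
        byte[] raw = Base64.getDecoder().decode(token.getBytes(StandardCharsets.US_ASCII));
        // Interpret the credential bytes with the configured charset.
        return new String(raw, credentialCharset);
    }
}
```

Which charset to pass in is exactly the open question of this thread; the sketch only makes the choice explicit instead of inheriting it from the JVM.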
Re: Basic Authentication Failed with multibyte username
Christopher Schultz wrote:
> ...
> That authorization.getBytes() is just asking for trouble, because it uses the platform default encoding to convert characters to bytes. It should be using US-ASCII, ISO-8859-1, or something like that.

-1
I don't think you have a problem there, because what you are converting into bytes there IS bytes already (it is base64-encoded, so it contains only ASCII characters).

> It also calls the String constructor with a byte array without specifying the encoding, therefore using the platform default.

+1
That is indeed where you have a problem. There you SHOULD always decode it as US-ASCII (or maybe iso-8859-1, I'm not quite sure what the spec says exactly).

Let's say that the spec is clear and says that the header value is *TEXT, and that *TEXT is always US-ASCII (or ISO-8859-1) by default.

Let's take it from the browser side first.
If the userid:password is indeed composed only of us-ascii characters, then the browser base64-encodes this directly and it is trivial.
But let's say that userid:password is something else than us-ascii. Another part of the spec says that then, you have to encode it according to RFC2047.
My contention is then that the browser should first RFC2047-encode userid:password, and then base64-encode the result.

Back on the server side.
The server base64-decodes the authorization token into an ascii string. It can always do that, because either the string was ascii to start with, or else it was not, but then it has been RFC2047-encoded, yielding a result that is ascii (like: =?iso-8859-2?B?base64-encoded stuff...?= ).
Then the server must do another round of decoding via RFC2047. That consists of a double decoding again: base64-decode the string between the ?? into bytes, and then decode those bytes into Unicode, using the charset indicated at the beginning of the rfc2047-encoded sequence.

The above, I believe, would be totally consistent with the current RFCs.

But there is a major catch: I don't believe that there is a browser on the market today which properly encodes the userid:password string via rfc2047 when it isn't ascii.
And the OP's special client sends UTF-8, but also does not rfc2047-encode it.
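The server-side double decoding described above can be sketched as follows. This illustrates the proposed scheme only; as noted, no known client sends credentials this way. The class name EncodedWordCredentials is hypothetical, only the "B" encoding of RFC 2047 is handled, and the whole userid:password string is assumed to be one encoded-word.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the proposed two-round decoding.
class EncodedWordCredentials {

    // Matches =?charset?B?base64-text?= (the "Q" encoding is not handled).
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Bb]\\?([A-Za-z0-9+/=]*)\\?=");

    static String decode(String basicToken) {
        // Round 1: decode the Basic token; under this scheme the result is ASCII.
        String outer = new String(Base64.getDecoder().decode(basicToken),
                StandardCharsets.US_ASCII);
        Matcher m = ENCODED_WORD.matcher(outer);
        if (!m.matches()) {
            // Plain ASCII userid:password, nothing more to do.
            return outer;
        }
        // Round 2: base64-decode the encoded-text, then apply the charset
        // named at the start of the encoded-word.
        Charset charset = Charset.forName(m.group(1));
        return new String(Base64.getDecoder().decode(m.group(2)), charset);
    }
}
```

A token that decodes to plain ASCII passes through unchanged, so a server using this approach would still accept ordinary Basic credentials.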
Re: Basic Authentication Failed with multibyte username
Auth Gábor wrote:
> Hi,
>
> I've found a potential bug in the Basic Authentication module. I have users, and some users' usernames contain national characters (encoded in UTF-8). HTTP header based authentication fails when the username or the password contains multibyte characters.
>
> The root of the bug is the Base64 decoder, which decodes the Base64 stream to a char array: it converts each byte to an individual char, and this decode method corrupts the multibyte characters...

Hi.
Before declaring that this is a bug, I suggest that you read the other thread entitled "mod_jk codepage in header values".
The main point is: according to the HTTP RFCs, an HTTP header value is supposed to contain /only/ US-ASCII characters. Some byte values in UTF-8 encoding are /not/ valid US-ASCII characters, so strictly speaking and according to the RFC, HTTP headers which would contain them are invalid.
It's a pain, but it's (probably) not a bug.
Re: Basic Authentication Failed with multibyte username
On 21/01/2010 05:54, Auth Gábor wrote:
> Hi,
>
> I've found a potential bug in the Basic Authentication module. I have users, and some users' usernames contain national characters (encoded in UTF-8). HTTP header based authentication fails when the username or the password contains multibyte characters.

That sounds like a bug to me.

> The root of the bug is the Base64 decoder, which decodes the Base64 stream to a char array: it converts each byte to an individual char, and this decode method corrupts the multibyte characters...

And that sounds like the root cause.

> It works, because the byte[] to String conversion supports the multibyte conversion and uses the encoding of the JVM.
>
> What do you think about it?

I haven't tested it or looked at the detail of the base 64 decoding but on the basis it works for you then... Great! Many thanks.

Please create a Bugzilla entry and add your patch to it. Patches sent to the mailing list are too easy to forget.

Before you do, I have one improvement suggestion. Using the platform default encoding to convert bytes to String is something that itself has caused bugs in the past and I can see it doing so here too. I'd suggest adding a characterEncoding attribute to the BasicAuthenticator (like there is for FormAuthenticator). Don't forget to document this new attribute in your patch.

The tricky question is what should the default be. I see the options as ISO-8859-1 or UTF-8. I'd use UTF-8 since that will work for most input including all ISO-8859-1 input.

Thanks again for the patch.

Mark
Re: Basic Authentication Failed with multibyte username
Hi,

André Warnier wrote:
>> I've found a potential bug in the Basic Authentication module. I have users, and some users' usernames contain national characters (encoded in UTF-8). HTTP header based authentication fails when the username or the password contains multibyte characters.
>>
>> The root of the bug is the Base64 decoder, which decodes the Base64 stream to a char array: it converts each byte to an individual char, and this decode method corrupts the multibyte characters...
>
> Before declaring that this is a bug, I suggest that you read the other thread entitled "mod_jk codepage in header values".

I've read that.

> The main point is: according to the HTTP RFCs, an HTTP header value is supposed to contain /only/ US-ASCII characters. Some byte values in UTF-8 encoding are /not/ valid US-ASCII characters, so strictly speaking and according to the RFC, HTTP headers which would contain them are invalid.
> It's a pain, but it's (probably) not a bug.

Hmm... the Basic Authorization header looks like this:

Authorization: BASIC w7pzZXJfMDA3MjpqZWxzem8xMkFB

Where do you see a non-US-ASCII character in the header? :)

Gábor Auth
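To make the point concrete, the sketch below decodes the token from the header above and interprets the resulting bytes with two different charsets. The header on the wire is pure ASCII; the ambiguity only appears once the decoded bytes must become a Java String. The class name HeaderCharsetDemo is an illustrative assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: shows that one Base64 token yields different Strings
// depending on the charset applied after decoding.
class HeaderCharsetDemo {

    static byte[] credentialBytes(String headerValue) {
        // Drop the scheme name ("BASIC") and decode the Base64 token.
        String token = headerValue.substring(headerValue.indexOf(' ') + 1).trim();
        return Base64.getDecoder().decode(token);
    }

    public static void main(String[] args) {
        byte[] raw = credentialBytes("BASIC w7pzZXJfMDA3MjpqZWxzem8xMkFB");
        String asUtf8 = new String(raw, StandardCharsets.UTF_8);
        String asLatin1 = new String(raw, StandardCharsets.ISO_8859_1);
        // The first two bytes (0xC3 0xBA) form one character in UTF-8
        // but two separate characters in ISO-8859-1.
        System.out.println(raw.length + " bytes");
        System.out.println(asUtf8.length() + " chars as UTF-8");
        System.out.println(asLatin1.length() + " chars as ISO-8859-1");
    }
}
```

Only the UTF-8 interpretation reproduces the username the client intended, which is why a server that decodes with ISO-8859-1 (or with a non-UTF-8 platform default) fails to find the user.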
Re: Basic Authentication Failed with multibyte username
On 21/01/2010 06:12, André Warnier wrote:
> Auth Gábor wrote:
>> ...
>
> Hi.
> Before declaring that this is a bug, I suggest that you read the other thread entitled "mod_jk codepage in header values".
> The main point is: according to the HTTP RFCs, an HTTP header value is supposed to contain /only/ US-ASCII characters. Some byte values in UTF-8 encoding are /not/ valid US-ASCII characters, so strictly speaking and according to the RFC, HTTP headers which would contain them are invalid.
> It's a pain, but it's (probably) not a bug.

In this case I think it is a bug. The authorisation header is base64 encoded so it is automatically compliant with RFC2616.

Mark
Re: Basic Authentication Failed with multibyte username
Mark Thomas wrote:
> ...
> In this case I think it is a bug. The authorisation header is base64 encoded so it is automatically compliant with RFC2616.

Yes, it sounds like you're right; my mistake.
(Also for Gabor, I admit my mistake.)
I agree that the HTTP header itself is correct.

But there is still something which puzzles me in the absolute.
Suppose that the browser and the server know nothing particular about one another, and that the server gets such an Authorization header from the browser.
The Base64 decoding is done, and yields a series of bytes.
Now this series of bytes has to be interpreted, to be translated into a string in Java (which is Unicode).
Which encoding should be chosen to decode the byte array?
If you use the default platform JVM encoding, you are making the assumption that the browser knew what this encoding is, aren't you?
On the other hand, the browser sent nothing to indicate in which encoding this string was before it encoded it using Base64, or did it?
Re: Basic Authentication Failed with multibyte username
On 21/01/2010 06:55, André Warnier wrote:
> Mark Thomas wrote:
>> The authorisation header is base64 encoded so it is automatically compliant with RFC2616.
>
> Yes, it sounds like you're right; my mistake.
> (Also for Gabor, I admit my mistake.)
> I agree that the HTTP header itself is correct.
>
> But there is still something which puzzles me in the absolute.
> Suppose that the browser and the server know nothing particular about one another, and that the server gets such an Authorization header from the browser.
> The Base64 decoding is done, and yields a series of bytes.
> Now this series of bytes has to be interpreted, to be translated into a string in Java (which is Unicode).
> Which encoding should be chosen to decode the byte array?
> If you use the default platform JVM encoding, you are making the assumption that the browser knew what this encoding is, aren't you?
> On the other hand, the browser sent nothing to indicate in which encoding this string was before it encoded it using Base64, or did it?

RFC2617 to the rescue...

    basic-credentials = base64-user-pass
    base64-user-pass  = <base64 [4] encoding of user-pass,
                         except not limited to 76 char/line>
    user-pass         = userid ":" password
    userid            = *<TEXT excluding ":">
    password          = *TEXT

*TEXT is defined in RFC2616

    TEXT  = <any OCTET except CTLs, but including LWS>

and finally

    OCTET = <any 8-bit sequence of data>
    CTL   = <any US-ASCII control character
             (octets 0 - 31) and DEL (127)>

So actually, Tomcat is correct in the current treatment of credentials. Therefore, not a bug.

Also André's comments regarding ISO-8859-1 were right if considering the actual user name and password rather than the header.

Supporting other encodings would be a useful enhancement but the default will have to be ISO-8859-1 to remain spec compliant. What the browsers will do for user names and passwords in other encodings is not defined so it will be a case of YMMV.

Mark
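The grammar quoted above translates fairly directly into code. The sketch below checks that the decoded octets satisfy the TEXT rule and splits user-pass at the first colon (userid excludes ":", password does not). The class name UserPassParser is hypothetical, ISO-8859-1 is used as the spec-compliant default named in the message, and the LWS check is simplified to accept any CR, LF, SP or HT rather than only properly folded whitespace.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical parser following the RFC 2617 / RFC 2616 rules quoted above.
class UserPassParser {

    // TEXT = any OCTET except CTLs (0-31 and 127), but including LWS.
    // Simplification: CR, LF, SP and HT are accepted individually.
    static boolean isText(byte[] octets) {
        for (byte b : octets) {
            int v = b & 0xFF;
            boolean lws = v == '\t' || v == '\r' || v == '\n' || v == ' ';
            boolean ctl = v <= 31 || v == 127;
            if (ctl && !lws) {
                return false;
            }
        }
        return true;
    }

    // Returns { userid, password }.
    static String[] parse(String base64UserPass) {
        byte[] octets = Base64.getDecoder().decode(base64UserPass);
        if (!isText(octets)) {
            throw new IllegalArgumentException("credentials contain control characters");
        }
        // ISO-8859-1 maps every octet to exactly one char, so no byte is lost.
        String userPass = new String(octets, StandardCharsets.ISO_8859_1);
        int colon = userPass.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing ':' separator");
        }
        // userid may not contain ':', so the first colon is the separator;
        // any later colons belong to the password.
        return new String[] { userPass.substring(0, colon), userPass.substring(colon + 1) };
    }
}
```

Note that octets above 127 pass the TEXT check, which is the basis for the later observation in this thread that UTF-8 encoded credentials are syntactically TEXT even though their interpretation is not defined.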
Re: Basic Authentication Failed with multibyte username
Mark Thomas wrote:
> ...
> So actually, Tomcat is correct in the current treatment of credentials. Therefore, not a bug.
>
> Also André's comments regarding ISO-8859-1 were right if considering the actual user name and password rather than the header.
>
> Supporting other encodings would be a useful enhancement but the default will have to be ISO-8859-1 to remain spec compliant. What the browsers will do for user names and passwords in other encodings is not defined so it will be a case of YMMV.
>
> Mark

Let me be even more pernickety:

According to the HTTP 1.1 RFC 2616, HTTP header fields MAY contain *TEXT portions representing character sets other than US-ASCII. But then, such header field values MUST be encoded according to the rules of RFC 2047.
RFC 2047 in turn, in "2. Syntax of encoded-words", indicates that this should be done using the form:

    encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

for example:

    Header-name: =?iso-8859-1?B?some iso-8859-1 text, base-64 encoded?=

or

    Header-name: =?utf-8?B?some unicode/utf-8 text, base-64 encoded?=

(I am not quite sure here of the utf-8 part as the correct name for the charset.)
(Note: that is something one does find regularly in email headers; but I have never seen it used in HTTP headers until now.)

On the other hand, regarding authentication mechanisms, RFC 2616 refers to RFC 2617, which itself indicates the following format for an authorization header sent by the browser to the server:

    Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

When base64-decoded, the above string should look like userid:password.

I did not find in RFC 2617 any specific mention of character set encoding, but it itself refers back to RFC 2616 as being the base rules.
And the base rules in RFC 2616 seem to be that header values are US-ASCII unless otherwise indicated.

In other words, my contention is as follows:
- if the userid:password above contains only US-ASCII characters, then the above simple form of the header is fine.
- if the userid:password string above contains characters other than US-ASCII however, then they should be further encoded, using the rules of RFC 2047.

This would mean that you should have something like:

    Authorization: Basic =?utf-8?B?QWxhZGRpbjpvcGVuIHNlc2FtZQ==?=

(or, maybe, the other way around: it is the QWxhZGRpbjpvcGVuIHNlc2FtZQ string which, when base64-decoded, should yield a new string of the form =?utf-8?B?QWxhZGRpbjpvcGVuIHNlc2FtZQ==?=, which should then be decoded once more to give the userid:password string).

Now, I am not sure that if you pass such an HTTP header, encoded as above, from Apache to Tomcat, the Tomcat getHeader() call will properly decode it using the indicated charset.
And I am not sure either that there exists any browser on the market that will encode a userid:password string that way.
Re: Basic Authentication Failed with multibyte username
Hi,

Mark Thomas wrote:
>     OCTET = <any 8-bit sequence of data>
>     CTL   = <any US-ASCII control character
>              (octets 0 - 31) and DEL (127)>
>
> So actually, Tomcat is correct in the current treatment of credentials. Therefore, not a bug.

Yes, but UTF-8 encoded text consists of 8-bit sequences of data without control characters, so IMHO UTF-8 encoded text is TEXT.

> Also André's comments regarding ISO-8859-1 were right if considering the actual user name and password rather than the header.

Yes, that's right. The default header encoding is ISO-8859-1.

> Supporting other encodings would be a useful enhancement but the default will have to be ISO-8859-1 to remain spec compliant. What the browsers will do for user names and passwords in other encodings is not defined so it will be a case of YMMV.

I've found some information about this issue:
http://stackoverflow.com/questions/702629/utf-8-characters-mangled-in-http-basic-auth-username

So... this is the real chaos... :)

By the way, my users do not use HTML browsers; they use JAX-WS in their client program, and JAX-WS sends authentication data in UTF-8 (like Opera), because the default encoding is UTF-8 in the client JVM (and on the server too).

Gábor Auth
Re: Basic Authentication Failed with multibyte username
Gábor,

On 1/21/2010 9:16 AM, Auth Gábor wrote:
> Mark Thomas wrote:
>>     OCTET = <any 8-bit sequence of data>
>>     CTL   = <any US-ASCII control character
>>              (octets 0 - 31) and DEL (127)>
>>
>> So actually, Tomcat is correct in the current treatment of credentials. Therefore, not a bug.
>
> Yes, but UTF-8 encoded text consists of 8-bit sequences of data without control characters, so IMHO UTF-8 encoded text is TEXT.

Sure, UTF-8 encoded text is TEXT, but you may not get the String value you expect. André is correct in that non-Latin characters appear to be unsupported by the HTTP Authorization header.

Now, there /are/ things that can be done to accommodate you. See below.

The patch you posted probably will only work when the platform encoding is set to UTF-8. Instead, an encoding setting would probably have to be provided to the BasicAuthenticator to allow the Base64-encoded header value to use the desired encoding.

Actually, the code as it looks right now does have a bug: the platform default encoding is used to decode the Base64-decoded bytes from the Authorization header. Instead, it should probably be ASCII or maybe ISO-8859-1.

>> Also André's comments regarding ISO-8859-1 were right if considering the actual user name and password rather than the header.
>
> Yes, that's right. The default header encoding is ISO-8859-1.

It's ASCII, though ISO-8859-1 is backward-compatible (as is UTF-8).

> I've found some information about this issue:
> http://stackoverflow.com/questions/702629/utf-8-characters-mangled-in-http-basic-auth-username

Nice that someone looked at actual behavior of the browsers.

It would be pretty trivial to add a settable charset to the BasicAuthenticator, and also to allow things like RFC 2047 charset-in-value decoding, though I don't think that's appropriate because the Base64 value has already been decoded.

-chris
Re: Basic Authentication Failed with multibyte username
Christopher Schultz wrote:
> ...
> Nice that someone looked at actual behavior of the browsers.

There is an easy way to find out what really happens.

Gábor, I presume that you have a workstation set for iso-8859-2 (or whichever non iso-8859-1 charset is appropriate for Magyar, I forgot), and a browser set up similarly.
Could you get one of these add-ons like Fiddler2 or LiveHttpHeaders, and arrange to capture what is sent by the browser in its authorization header when you enter a user-id/password containing some characters of the range above \x9F?
That should be the base64 encoding of whatever the browser is sending.
Then of course you'll have to find a way to show us the base64-encoded form, and the corresponding non-encoded form of ditto (but I think that composing and sending your post as UTF-8 should do the trick).

We could probably do much the same with our own charset-challenged browsers, but we don't have the easiest keyboards for that.

It is my deep suspicion that the browsers will just take the input as iso-latin-x (whatever the workstation/browser is set for), and base64-encode it, without bothering to indicate the real charset in any way.
But we'll see.

Kösönöm szepen, I think it is...
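Short of capturing real browser traffic, the client side of the experiment above can be simulated. The sketch below builds the Basic token for the same userid:password under two charsets; the tokens differ, and nothing in either one says which charset produced it. The class name ClientTokenDemo is an illustrative assumption, and real browsers may of course behave differently from this simulation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative simulation of what a client would send for a given charset.
class ClientTokenDemo {

    static String basicToken(String userPass, Charset charset) {
        // The client first turns characters into bytes (charset-dependent),
        // then Base64-encodes those bytes (charset-independent).
        return Base64.getEncoder().encodeToString(userPass.getBytes(charset));
    }

    public static void main(String[] args) {
        String userPass = "\u00faser_0072:jelszo12AA"; // starts with u-acute (U+00FA)
        System.out.println("ISO-8859-1: " + basicToken(userPass, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + basicToken(userPass, StandardCharsets.UTF_8));
    }
}
```

The UTF-8 token produced here is the same one quoted earlier in the thread (w7pzZXJfMDA3MjpqZWxzem8xMkFB), which is consistent with a client that encodes credentials as UTF-8 before Base64.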
Re: Basic Authentication Failed with multibyte username
To get back to the underlying issue:

Auth Gábor wrote:
> So... this is the real chaos... :)

Yes.

> By the way, my users do not use HTML browsers; they use JAX-WS in their client program, and JAX-WS sends authentication data in UTF-8 (like Opera), because the default encoding is UTF-8 in the client JVM (and on the server too).

Basically, I would tend to say that if the server knows who the clients are and vice-versa, you should be free to use any encoding you want, with the limitation that what is exchanged on the wire conforms to HTTP (because there may be proxies on the way which are not so tolerant).

What the client is sending is already (in a way) conformant to HTTP, because it is base64 encoded and so, on the surface, it does not contain non-ascii characters.
And (I presume) you cannot change the code of the client, so it will continue to send these invalid headers with a UTF-8 value, base64-encoded.

But the problem is that the standard Tomcat code which decodes the Basic Authorization header does not work in the way you want, for these illegal headers. And this code should preferably not be changed in a way which breaks the conformance with standard HTTP. Because if you do that, then your Tomcat becomes useless for anything else than your special client.

An additional complication is that, if you want to use the embedded container-managed Tomcat authentication mechanisms, then you have to do something very early in the cycle, because that authentication takes place even before any servlet filter is invoked.
Up to Tomcat 5.5, you would have to do this in a Valve then, which has the drawback that it is Tomcat-specific. (I think Tomcat 6 may give other options, maybe not Tomcat-specific.)
Or, you drop the container-managed security, and you use something like the SecurityFilter (http://securityfilter.sourceforge.net/), but read the homepage carefully first.

So, to be pragmatic, I would tend to go in the following direction:
- create a Valve which
  - checks the User-Agent. If it does not match your special client, do nothing. If it matches, then
  - get the Authorization header. If there is none, do nothing
  - else, decode its value properly into a Unicode string
  - re-encode this string in a way that fits with standard HTTP. For example, replace each character by a string like {xxxx}, where xxxx is the hex value of the Unicode codepoint of the character. (That is always valid us-ascii, but check the maximum length).
  - re-encode the result using base64
  - replace the Authorization header value with this new string
- in your back-end authentication mechanism (I will suppose it is a database of userids/passwords), encode the userids/passwords the same way, and make this an alternate key

The embedded Tomcat authentication will then decode the new base64 string, split it into userid:password, and use them to verify the credentials, which will match.

If you do not like a Valve, then use a front-end server like Apache, and do the transformation of the header there, before the request is passed to Tomcat.
Alternatively then, you could also do the user authentication at the Apache level, and just pass the user-id to Tomcat.
(Being an Apache/mod_perl guy myself, I find this last option much easier, but YMMV.)

And all that for a few Ö's and Á's and ß's

Another option is to use a front-end Apache httpd server, which would modify the requests as follows:
(I presume that you have a way to identify requests coming from this particular client, e.g. the User-Agent header.)
Create a filter at the Apache level, which detects your special client. If it detects it, then it adds an additional header to the request
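The header transformation at the core of the Valve recipe above (decode, escape to ASCII, re-encode) can be sketched as a pure function, independent of any Tomcat API. The class name CredentialReencoder is hypothetical, the client is assumed to send UTF-8, and the sketch escapes only non-ASCII characters as {hex codepoint}, which is one reading of the recipe rather than the only possible one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Hypothetical helper for the re-encoding steps; wiring it into a Valve or
// a front-end server is left out on purpose.
class CredentialReencoder {

    // Replaces every non-ASCII code point with {HEX}, leaving ASCII as is.
    static String asciiEscape(String text) {
        StringBuilder ascii = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                ascii.append((char) cp);
            } else {
                ascii.append('{')
                     .append(Integer.toHexString(cp).toUpperCase(Locale.ROOT))
                     .append('}');
            }
        });
        return ascii.toString();
    }

    // Takes the Base64 token sent by the client, returns the replacement token.
    static String reencode(String base64Utf8Token) {
        // Decode the client's token, assuming it carries UTF-8 bytes.
        String decoded = new String(Base64.getDecoder().decode(base64Utf8Token),
                StandardCharsets.UTF_8);
        // Escape to pure ASCII, then Base64-encode again.
        byte[] asciiBytes = asciiEscape(decoded).getBytes(StandardCharsets.US_ASCII);
        return Base64.getEncoder().encodeToString(asciiBytes);
    }
}
```

The same asciiEscape function would be applied to the stored userids and passwords to build the alternate key. One caveat of this particular escape format: a username that literally contains text such as {FA} becomes indistinguishable from an escaped character, so a real deployment would need to escape the braces as well.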