Christopher Schultz wrote:

André,

On 1/21/2010 6:35 PM, André Warnier wrote:
Basically, I would tend to say that if the server knows who the clients
are and vice-versa, you should be free to use any encoding you want,
with the limitation that what is exchanged on the wire conforms to HTTP
(because there may be proxies on the way which are not so tolerant).

+1

What the client is sending is already (in a way) conformant to HTTP,
because it is base64 encoded and so, on the surface, it does not contain
non-ascii characters.

+1

But the problem is that the standard Tomcat code which decodes the Basic
Authorization header does not work in the way you want, for these
illegal headers.
And this code should preferably not be changed in a way which breaks the
conformance with standard HTTP.
Because if you do that, then your Tomcat becomes useless for anything
other than your special client.

+1

Another possibility would be to use something like SecurityFilter, which
allows you to (more easily) write your own authenticator and realm
implementations, and you could write a BasicAuthenticator that reads
these specially-formatted credentials.

I checked the sf source, and it looks like we might have a bug:

   private String decodeBasicAuthorizationString(String authorization) {
      if (authorization == null
            || !authorization.toLowerCase().startsWith("basic ")) {
         return null;
      } else {
         authorization = authorization.substring(6).trim();
         // Decode and parse the authorization credentials
         return new String(Base64.decodeBase64(authorization.getBytes()));
      }
   }

That "authorization.getBytes()" is just asking for trouble, because it
uses the platform default encoding to convert characters to bytes. It
should be using US-ASCII, ISO-8859-1, or something like that.

-1
I don't think you have a problem there, because the string you are converting to bytes at that point is base64-encoded, so it contains only ASCII characters.


It also calls the String constructor with a byte array without
specifying the encoding, therefore using the platform default.

+1
That is indeed where you have a problem. There you SHOULD always decode it as US-ASCII (or maybe iso-8859-1, I'm not quite sure what the spec says exactly).
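
To make the point concrete, here is a minimal sketch of the same method with both charsets spelled out. It is not the SecurityFilter code: the class name BasicAuthDecoder is made up for illustration, it uses java.util.Base64 (Java 8 or later) instead of the commons-codec call quoted above, and the choice of ISO-8859-1 for the final step is my assumption, since I am not sure what the spec mandates.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Hypothetical helper for illustration; not the actual SecurityFilter source.
public class BasicAuthDecoder {

    public static String decodeBasicAuthorization(String authorization) {
        if (authorization == null
                || !authorization.toLowerCase(Locale.ROOT).startsWith("basic ")) {
            return null;
        }
        String token = authorization.substring(6).trim();

        // The base64 token is pure ASCII, so name the charset explicitly
        // instead of relying on the platform default.
        byte[] tokenBytes = token.getBytes(StandardCharsets.US_ASCII);
        byte[] credentialBytes = Base64.getDecoder().decode(tokenBytes);

        // Assumption: decode the credentials as ISO-8859-1. This is the step
        // where the platform default would otherwise change the result.
        return new String(credentialBytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Sample value taken from the Basic authentication example in RFC 2617.
        System.out.println(
                decodeBasicAuthorization("Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="));
    }
}
```

With explicit charsets, the result no longer depends on how the JVM happens to be configured.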


Let's say that the spec is clear and says that the header value is *TEXT, and that *TEXT is always US-ASCII (or ISO-8859-1) by default.

Let's take it from the browser side first.
If the "userid:password" is indeed composed only of us-ascii characters, then the browser base64-encodes this directly and it is trivial.(*)

But let's say that "userid:password" contains something other than us-ascii characters.
Another part of the spec says that in this case you have to encode it according to RFC2047. My contention is that the browser should then first RFC2047-encode "userid:password", and then base64-encode the result.
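
As a rough sketch of what I mean on the client side (this is my reading of the RFCs, not something any browser is known to do): the class and method names below are invented for illustration, only the "B" form of RFC2047 is produced, and the 75-character limit on encoded-words is ignored.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical client-side helper; names are illustrative only.
public class Rfc2047BasicEncoder {

    // Wrap text as a single RFC2047 "B" encoded-word: =?charset?B?...?=
    public static String encodeWord(String text, Charset charset) {
        String payload = Base64.getEncoder().encodeToString(text.getBytes(charset));
        return "=?" + charset.name() + "?B?" + payload + "?=";
    }

    // Build the Authorization header value for Basic authentication.
    public static String buildAuthorizationHeader(
            String userid, String password, Charset charset) {
        String credentials = userid + ":" + password;

        // Pure us-ascii credentials go through unchanged; anything else is
        // RFC2047-encoded first, which always yields an ascii string.
        boolean asciiOnly =
                StandardCharsets.US_ASCII.newEncoder().canEncode(credentials);
        String wireText = asciiOnly ? credentials : encodeWord(credentials, charset);

        // Outer base64 step required by Basic authentication.
        String token = Base64.getEncoder()
                .encodeToString(wireText.getBytes(StandardCharsets.US_ASCII));
        return "Basic " + token;
    }
}
```

Either way, what ends up inside the base64 token is ascii, which is the whole point of the extra encoding round.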

Back on the server side.
The server base64-decodes the authorization token, into an ascii string.
It can always do that, because either the string was ascii to start with, or else it was not, but then it has been RFC2047-encoded, yielding a result that is ascii.
(like : =?iso-8859-2?B?....base64-encoded stuff...?= )

Then the server must do another round of decoding via RFC2047.
That consists of a double decoding again : base64-decode the string between the ?? into bytes, and then decode those bytes into Unicode, using the charset indicated at the beginning of the rfc2047-encoded sequence.
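
The server-side steps above could be sketched roughly as follows. This is an illustration under assumptions, not Tomcat or SecurityFilter code: the class name is invented, only a single "B" encoded-word covering the whole string is recognized (no "Q" encoding, no multiple words), and anything that does not match is returned as plain ascii.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical server-side helper; names are illustrative only.
public class Rfc2047BasicDecoder {

    // Matches one whole-string encoded-word of the form =?charset?B?payload?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Bb]\\?([^?]*)\\?=");

    // token is the base64 part of the header, after the "Basic " prefix.
    public static String decodeCredentials(String token) {
        // First round: base64-decode the token into an ascii string.
        byte[] outer = Base64.getDecoder().decode(token.trim());
        String ascii = new String(outer, StandardCharsets.US_ASCII);

        Matcher m = ENCODED_WORD.matcher(ascii);
        if (!m.matches()) {
            return ascii; // plain us-ascii "userid:password"
        }

        // Second round: base64-decode the payload into bytes, then decode
        // those bytes using the charset named in the encoded-word.
        Charset charset = Charset.forName(m.group(1));
        byte[] inner = Base64.getDecoder().decode(m.group(2));
        return new String(inner, charset);
    }
}
```

Note that Charset.forName throws for a charset name the JVM does not know, so a real implementation would have to decide how to reject such credentials.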


The above, I believe, would be totally consistent with the current RFCs.

But there is a major catch : I don't believe that there is a browser on the market today, which "properly" encodes the "userid:password" string via rfc2047 when it isn't ascii.

And the OP's special client sends UTF-8, but also does not rfc2047-encode it.


