Re: [codec] RE: Base64.java

Jeffrey Dever Tue, 04 Feb 2003 12:11:25 -0800

Right now the Base64 class is entirely static, with static methods and static members. The constructor is private, so it cannot be instantiated. I don't think we are looking for any OO here; it's functional in nature. We are not looking for radical redesign, changes or improvements, just a nice logical home in commons for a class that is currently widely used, replicated and forked.

Using static flags to configure Base64 behaviour is *not* going to work, particularly with the concept of commons: small reusable components. Here is why:

Let's talk about the usage pattern for those flags. There really are only two ways to go:
1) One of the classes in the package sets the flags in a static block.
2) The flags are set before each call to the Base64 encode/decode methods.

In case 1), if you are the only package loaded in the JVM, and you always use Base64 the same way, then you are fine. But if there are other packages loaded in the JVM that also use Base64, but with different flags, then the class loading order determines which flags are used. You are broken.

In case 2), if everything is running in one thread, then you are ok. But if there are multiple threads, then you can have your flags set out from under you while doing a decode/encode. You are broken.
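Both failure cases can be reproduced in a few lines. This is only a sketch: FlaggedBase64 and its chunk flag are hypothetical stand-ins for a statically configured class, not the actual commons code, and the encoding itself is delegated to java.util.Base64 (Java 8 and later). The thread interleaving in case 2 is written out sequentially so that the result is the same on every run.

```java
import java.util.Base64;

// Hypothetical stand-in for a Base64 class configured through a static flag.
final class FlaggedBase64 {
    // Shared by every caller in the JVM: this is the problem.
    static boolean chunk = false;

    static String encode(byte[] data) {
        Base64.Encoder encoder = chunk ? Base64.getMimeEncoder() : Base64.getEncoder();
        return encoder.encodeToString(data);
    }
}

final class StaticFlagDemo {
    // 60 input bytes encode to 80 Base64 characters, so chunked output needs a line break.
    static final byte[] DATA = new byte[60];

    // Case 1: two "packages" each configure the flag once; the last writer wins.
    static String caseOne() {
        FlaggedBase64.chunk = true;   // first package wants chunked output
        FlaggedBase64.chunk = false;  // second package is loaded later
        return FlaggedBase64.encode(DATA); // first package now gets one long line
    }

    // Case 2: the flag is set before each call, but another thread's write lands in between.
    static String caseTwo() {
        FlaggedBase64.chunk = false;  // thread A asks for a single line
        FlaggedBase64.chunk = true;   // thread B sets its own preference
        return FlaggedBase64.encode(DATA); // thread A gets chunked output anyway
    }

    public static void main(String[] args) {
        System.out.println("case 1 contains CRLF: " + caseOne().contains("\r\n"));
        System.out.println("case 2 contains CRLF: " + caseTwo().contains("\r\n"));
    }
}
```

In both cases the caller that loses is the one whose code looks correct when read on its own.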

As we get more applications built out of commons projects, and more commons projects that depend on other commons projects (like HttpClient), these static issues become more and more likely. To use flags safely, you would have to make them instance members and require instantiation of Base64 objects. This does more harm than good.

Static Flags Considered Harmful!

There does not seem to be much choice other than overloading the method signatures:
public static byte[] decode(byte[] data);
public static byte[] decode(byte[] data, boolean chunk);
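
A minimal sketch of how such overloads could look, with the option passed per call instead of stored anywhere. The class name Base64Sketch, the matching encode overloads, the choice of unchunked as the default, and the reading of chunk on decode as "line breaks are allowed in the input" are all assumptions for illustration. The work is delegated to java.util.Base64 (Java 8 and later) rather than to the class under discussion.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch of the overloaded API; the real class may behave differently.
final class Base64Sketch {
    private Base64Sketch() { }  // static-only, no instances

    public static byte[] encode(byte[] data) {
        return encode(data, false);
    }

    // chunk == true: 76-character lines separated by CRLF.
    public static byte[] encode(byte[] data, boolean chunk) {
        Base64.Encoder encoder = chunk ? Base64.getMimeEncoder() : Base64.getEncoder();
        return encoder.encode(data);
    }

    public static byte[] decode(byte[] data) {
        return decode(data, false);
    }

    // chunk == true: line breaks in the input are skipped.
    // chunk == false: the input must be one unbroken run of Base64 characters.
    public static byte[] decode(byte[] data, boolean chunk) {
        Base64.Decoder decoder = chunk ? Base64.getMimeDecoder() : Base64.getDecoder();
        return decoder.decode(data);
    }

    public static void main(String[] args) {
        byte[] original = new byte[60];
        byte[] chunked = encode(original, true);
        byte[] plain = encode(original);
        System.out.println(chunked.length + " bytes chunked, " + plain.length + " bytes plain");
        System.out.println(Arrays.equals(original, decode(chunked, true)));
        System.out.println(Arrays.equals(original, decode(plain)));
    }
}
```

Because nothing is stored between calls, two callers with different needs cannot interfere with each other, whatever the class loading order or thread schedule.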

Jandalf.

O'brien, Tim wrote:

Here's Martin's post on rpc-dev re: cr/lf: http://nagoya.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=713

Here's some observations, I've copied individuals from both the xml-rpc and
the httpclient project. It all boils down to using Base64 encoding in the
context of two different RFCs, 2045 and 2616. I believe that we can come to
an agreement here by adding some option flags to the method signatures.

*** XML-RPC facts:

1. I believe that XML-RPC is using Base64 in the context of RFC 2045 which
requires Base64 content to be encoded in 76 character "chunks" separated by
a newline character. The trailing newline character is added to "terminate"
the final chunk.

2. XML-RPC is also adhering to the requirement to discard all whitespace
when decoding base64 data.

3. XML-RPC is not complying with the requirement to convert text to
canonical form - replacing "text line breaks" with "CRLF sequences".

*** HTTPClient facts:

1. HttpClient's usage of Base64 does not create chunks of 76 characters
separated by newlines - as this would interfere with HTTP headers.

2. HttpClient's Base64 doesn't discard whitespace because in the context of
usage, no whitespace is added to the encoded output - see #1
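
To illustrate point 1, here is a small sketch of what chunking does to a header value. The Basic Authorization header, the class name HeaderValueDemo and the sample credentials are assumptions chosen for the example, and java.util.Base64 (Java 8 and later) stands in for the Base64 class under discussion.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class HeaderValueDemo {
    // Builds a Basic Authorization header line, without the terminating CRLF.
    static String basicAuthHeader(String user, String password, boolean chunk) {
        byte[] credentials = (user + ":" + password).getBytes(StandardCharsets.ISO_8859_1);
        Base64.Encoder encoder = chunk ? Base64.getMimeEncoder() : Base64.getEncoder();
        return "Authorization: Basic " + encoder.encodeToString(credentials);
    }

    public static void main(String[] args) {
        // Made-up credentials, 80 bytes in total, so the encoded form exceeds 76 characters.
        String user = "a-fairly-long-user-name@example.org";
        String password = "and-an-even-longer-pass-phrase-to-go-with-it";

        String unchunked = basicAuthHeader(user, password, false);
        String chunked = basicAuthHeader(user, password, true);

        System.out.println("unchunked has CRLF: " + unchunked.contains("\r\n"));
        // A CRLF inside the value would end the header line early.
        System.out.println("chunked has CRLF:   " + chunked.contains("\r\n"));
    }
}
```

Short values never reach the 76-character limit, so the problem only shows up once the encoded value is long enough to wrap.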

** Here is RFC 2045 Multipurpose Internet Mail Extensions:
http://www.ietf.org/rfc/rfc2045.txt

2045 requirement 1: RFC 2045 on converting text material to canonical form:
"Care must be taken to use the proper octets for line breaks if base64
encoding is applied directly to text material that has not been converted to
canonical form. In particular, text line breaks must be converted into CRLF
sequences prior to base64 encoding. The important thing to note is that
this may be done directly by the encoder rather than in a prior
canonicalization step in some implementations."

2045 requirement 2: In terms of RFC 2045, requirement for "chunking" and
ignoring white space when decoding: "The encoded output stream must be
represented in lines of no more than 76 characters each. All line breaks or
other characters not found in Table 1 must be ignored by decoding software.
In base64 data, characters other than those in Table 1, line breaks, and
other white space probably indicate a transmission error, about which a
warning message or even a message rejection might be appropriate under some
circumstances."

** Here is RFC 2616 HTTP 1.1 which talks about base64 of an MD5 digest in a
header: http://www.ietf.org/rfc/rfc2616.txt?number=2616

"Conversion of all line breaks to CRLF MUST NOT be done before computing or
checking the digest: the line break convention used in the text actually
transmitted MUST be left unaltered when computing the digest."

"Note: while the definition of Content-MD5 is exactly the same for HTTP as
in RFC 1864 for MIME entity-bodies, there are several ways in which the
application of Content-MD5 to HTTP entity-bodies differs from its
application to MIME entity-bodies. One is that HTTP, unlike MIME, does not
use Content-Transfer-Encoding, and does use Transfer-Encoding and
Content-Encoding. Another is that HTTP more frequently uses binary content
types than MIME, so it is worth noting that, in such cases, the byte order
used to compute the digest is the transmission byte order defined for the
type. Lastly, HTTP allows transmission of text types with any of several
line break conventions and not just the canonical form using CRLF."
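
The effect of the line break rule on the digest can be shown with a short sketch. The class name ContentMd5Demo and the sample bodies are made up for the example; MessageDigest and java.util.Base64 (Java 8 and later) are standard JDK classes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class ContentMd5Demo {
    // Base64 of the MD5 digest of the body bytes exactly as they are transmitted.
    static String contentMd5(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            // 16 digest bytes always encode to 24 characters, so no chunking is involved.
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] lf = "line one\nline two\n".getBytes(StandardCharsets.US_ASCII);
        byte[] crlf = "line one\r\nline two\r\n".getBytes(StandardCharsets.US_ASCII);

        // Same text, different line breaks, different digests.
        System.out.println("Content-MD5 (LF):   " + contentMd5(lf));
        System.out.println("Content-MD5 (CRLF): " + contentMd5(crlf));
    }
}
```

An encoder that silently converted line breaks to CRLF before hashing would produce a digest that does not match the bytes on the wire.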

--------
Tim O'Brien

-----Original Message-----
From: Jeffrey Dever [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 04, 2003 10:16 AM
To: O'brien, Tim
Cc: 'Martin Redington'; [EMAIL PROTECTED]
Subject: Re: Base64.java

Http is very cr/lf aware. We use Base64 for encoding/decoding values that are added to headers. Each header is always terminated with a cr/lf, so a value must not contain the line delimiter.

Which rfc states that there must be a trailing cr/lf, and where?

Jandalf.


