The following link defines the Base64 encoding:
http://www.ietf.org/rfc/rfc2045.txt
You mention that you have a Base64-encoded object, right? So your data is already encoded, and
encoded data should not have line feeds.
So from the Borenstein paper (RFC 2045), I quote: "The encoded output stream must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in Table 1 must be ignored by decoding software. In base64 data, characters other than those in Table 1, line breaks, and other white space probably indicate a transmission error, about which a warning message or even a message rejection might be appropriate under some circumstances."
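To make the 76-character rule concrete, here is a minimal sketch of how an encoder might wrap its output. The function name wrapBase64 is hypothetical, and the CR LF line break is an assumption (it is the usual MIME line ending); this is not taken from Xerces or from any Java library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: breaks already-encoded Base64 text into lines of at
// most `width` characters. CR LF is assumed as the line break.
std::string wrapBase64(const std::string& encoded, std::size_t width = 76)
{
    if (width == 0)
        return encoded; // nothing sensible to do, leave the text unwrapped

    std::string out;
    for (std::size_t i = 0; i < encoded.size(); i += width)
    {
        if (i != 0)
            out += "\r\n";             // separator between lines, none at the end
        out.append(encoded, i, width); // append() clips the last, shorter line
    }
    return out;
}
```

Note that each line break produced this way is two whitespace characters in a row, so the decoder on the other side has to be prepared for that.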
So the decoder method in our Base64 should ignore such whitespace. This is done differently
in the Java and C++ implementations.
The problem that I can see is in the C++ implementation with the following line
    while ( inputIndex < inputLength )
    {
        if (!XMLChar1_0::isWhitespace(inputData[inputIndex]))
        {
            rawInputData[ rawInputLength++ ] = inputData[ inputIndex ];
            inWhiteSpace = false;
        }
        else
        {
            if (inWhiteSpace)   // <-- Here
                return 0; // more than 1 whitespaces encountered
            else
                inWhiteSpace = true;
        }
        inputIndex++;
    }
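To see the effect in isolation, here is a self-contained sketch that mimics the loop above. It is not the Xerces code: the function name stripSingleWhitespace is made up, std::string stands in for the raw buffers, and std::isspace is only an approximate substitute for XMLChar1_0::isWhitespace.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for the quoted loop. Copies the non-whitespace
// characters of `input` into `raw` and gives up as soon as two whitespace
// characters are adjacent. std::isspace is an assumption made to keep the
// example standalone; it does not classify exactly the same characters as
// XMLChar1_0::isWhitespace.
bool stripSingleWhitespace(const std::string& input, std::string& raw)
{
    raw.clear();
    bool inWhiteSpace = false;
    for (unsigned char c : input)
    {
        if (!std::isspace(c))
        {
            raw.push_back(static_cast<char>(c));
            inWhiteSpace = false;
        }
        else
        {
            if (inWhiteSpace)
                return false; // second consecutive whitespace: reject
            inWhiteSpace = true;
        }
    }
    return true;
}
```

With this sketch, "YWJj\nZGVm" is accepted and yields "YWJjZGVm", while "YWJj\n\nZGVm" is rejected. A CR LF pair between the two halves is rejected as well, because it counts as two whitespace characters here.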
So in other words, in the Java implementation we ignore all whitespace and decode the array,
so:
"asjejjejere rer linefeed
jjjjjwerwerrrr linefeed
asadasdasdad"
would be decoded.
In the C++ implementation, if we find more than one consecutive whitespace character we return 0. No decoding.
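For comparison, the lenient behaviour described for the Java side could look roughly like this in C++. This is only a sketch under my own assumptions: the names base64Value and decodeLenient are hypothetical, it is not the actual Java or Xerces code, and it does not check that the unused bits before the padding are zero.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Maps one Base64 alphabet character to its 6-bit value, or -1 if the
// character is not in the alphabet (ASCII is assumed).
static int base64Value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Hypothetical lenient decoder: drops every whitespace character first,
// however many there are in a row, and only then decodes. Returns false on
// a character outside the alphabet or on malformed padding.
bool decodeLenient(const std::string& input, std::string& output)
{
    std::string raw;
    for (unsigned char c : input)
        if (!std::isspace(c))
            raw.push_back(static_cast<char>(c));

    output.clear();
    if (raw.size() % 4 != 0)
        return false;

    for (std::size_t i = 0; i < raw.size(); i += 4)
    {
        const bool last = (i + 4 == raw.size());
        int v[4];
        int pad = 0;
        for (int k = 0; k < 4; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(raw[i + k]);
            if (c == '=')
            {
                // '=' is only legal in the last two positions of the final quartet
                if (!last || k < 2) return false;
                v[k] = 0;
                ++pad;
            }
            else
            {
                if (pad > 0) return false;   // data after padding
                v[k] = base64Value(c);
                if (v[k] < 0) return false;  // not a Base64 character
            }
        }
        const unsigned long bits = (v[0] << 18) | (v[1] << 12) | (v[2] << 6) | v[3];
        output.push_back(static_cast<char>((bits >> 16) & 0xFF));
        if (pad < 2) output.push_back(static_cast<char>((bits >> 8) & 0xFF));
        if (pad < 1) output.push_back(static_cast<char>(bits & 0xFF));
    }
    return true;
}
```

Here decodeLenient("aGVs\n\n  bG8=", out) succeeds and leaves "hello" in out, which is exactly the kind of input the strict C++ check refuses.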
What PeiYong mentioned about the Schema production should probably be the job of the Base64 validator: to validate and enforce the production's rule of zero or one whitespace.
Just curious why your encoded data has multiple line feeds. How did you encode
the data? Did you use Xerces C++?
Cheers,
Jeffrey Rodriguez Silicon Valley
From: "PeiYong Zhang" <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: <[EMAIL PROTECTED]>
Subject: Re: Base64 decoder extremely (overly?) strict
Date: Wed, 3 Sep 2003 16:09:23 -0400
Scott,
Following is the production copied from the Schema Datatype errata (http://www.w3.org/2001/05/xmlschema-errata#Errata2); it allows ONE optional whitespace ONLY.

    Base64Binary ::= S? B64quartet* B64final?
    B64quartet   ::= B64 S? B64 S? B64 S? B64 S?
    B64final     ::= B64 S? B04 S? '=' S? '=' S?
                   | B64 S? B64 S? B16 S? '=' S?
    B04          ::= [AQgw]
    B16          ::= [AEIMQUYcgkosw048]
    B64          ::= [A-Za-z0-9+/]
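A validator for this production could be sketched as follows. It assumes, following the reading above, that S stands for a single whitespace character, so two adjacent whitespace characters never match. The function names are hypothetical and this is not the Xerces validator.

```cpp
#include <cstddef>
#include <string>

namespace {

// XML whitespace: space, tab, carriage return, line feed.
bool isXmlSpace(char c) { return c == ' ' || c == '\t' || c == '\r' || c == '\n'; }

bool isB64(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '+' || c == '/';
}
bool isB16(char c) { return std::string("AEIMQUYcgkosw048").find(c) != std::string::npos; }
bool isB04(char c) { return std::string("AQgw").find(c) != std::string::npos; }

} // namespace

// Hypothetical check of `text` against the production, with S treated as
// exactly one whitespace character (an assumption, see the lead-in).
bool matchesBase64Production(const std::string& text)
{
    // Pass 1: every S? slot may hold at most one whitespace character.
    std::string chars;
    bool previousWasSpace = false;
    for (char c : text)
    {
        if (isXmlSpace(c))
        {
            if (previousWasSpace) return false;
            previousWasSpace = true;
        }
        else
        {
            chars.push_back(c);
            previousWasSpace = false;
        }
    }

    // Pass 2: B64quartet* B64final? on the remaining characters.
    const std::size_t n = chars.size();
    if (n % 4 != 0) return false;
    if (n == 0) return true;

    std::size_t dataEnd = n;
    if (chars[n - 1] == '=')
    {
        if (chars[n - 2] == '=')
        {   // B64 B04 '=' '='
            if (!isB64(chars[n - 4]) || !isB04(chars[n - 3])) return false;
        }
        else
        {   // B64 B64 B16 '='
            if (!isB64(chars[n - 4]) || !isB64(chars[n - 3]) || !isB16(chars[n - 2]))
                return false;
        }
        dataEnd = n - 4;
    }
    for (std::size_t i = 0; i < dataEnd; ++i)
        if (!isB64(chars[i])) return false;
    return true;
}
```

Under these assumptions "aGVs\nbG8=" matches, while "aGVs\n\nbG8=" does not, and "YR==" fails because R is not in B04.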
Rgds,
PeiYong
----- Original Message -----
From: Scott Cantor
To: [EMAIL PROTECTED]
Sent: Wednesday, September 03, 2003 3:20 PM
Subject: Base64 decoder extremely (overly?) strict
There was a change made to the Base64.cpp source around Xerces 2.2 that has
created some real problems handling base64 encoded objects, in particular
because of linefeeds. I'm not sure what's legal and what's not, but I
thought I'd mention it, because if it's *allowable* to permit extra linefeed
characters and/or extra lines in the encoded data, it's really a hassle that
it doesn't permit it.
The culprit is the whitespace checking:
    if (inWhiteSpace)
        return 0; // more than 1 whitespaces encountered
    else
        inWhiteSpace = true;
A lot of Java code I'm dealing with, xml-security in particular, is creating
base64 that breaks on that line (returns 0 because of sequences of multiple
whitespace in the data due to extra linefeeds).
So, I could try reporting this as a bug there, but I'm not sure if it even
is.
-- Scott
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]