On 19 March 2013 18:32, Gary Gregory <garydgreg...@gmail.com> wrote:
> Yes, why not depend on commons-codec?

Codec is nearly 4 times the size of file upload.
Not an ideal partner.

> Gary
>
> On Mar 19, 2013, at 14:28, Simone Tripodi <simonetrip...@apache.org> wrote:
>
>> Hi there,
>>
>>>> is there anything you want to discuss before I go ahead with a new RC?
>>>> I am performing some ITs in non-public projects and I planned to cut a
>>>> new RC in 1-2 days...
>>>
>>> I think there are still some issues in Base64Decoder (this time not
>>> ones I accidentally introduced - sorry about that!).
>>
>> don't worry, it can happen - unit tests saved us :)
>>
>>>
>>> For example, it does not seem to handle whitespace properly -
>>> certainly not in the last 4 bytes.
>>> Nor does it detect invalid characters in input - AFAICT these will be
>>> treated as binary (because the decoding table is initialised to 0).
>>
>> I am honestly not worried about white spaces, since those decoders are
>> not part of APIs and are used by the MimeUtility which tokenizes
>> strings by whitespaces
>>
>>> With the new  version of the code, skipped characters can cause array
>>> bounds errors after the main loop, because finish is calculated
>>> without taking skipped characters into account, thus the loop can
>>> consume additional bytes. In the original version, this would not
>>> cause array bounds errors, but the same byte could be used more than
>>> once.
>>
>> that is worrying
>>
>>> I think it really needs to be rewritten, which I am happy to do.
>>> I intend to fetch the bytes, skipping as necessary otherwise storing
>>> them in an array. When there are 3 bytes available, output them and
>>> resume.
>>> At end of input, output any remaining bytes.
>>>
>>> It remains to be decided how to handle '=' before the end of the byte
>>> data - does that signal end of input, or is it an error if '=' appears
>>> other than at the end?
>>> Indeed, should it handle whitespace at all, and if so are there
>>> restrictions where that may appear?
>>>
>>> QPD has problems too: it assumes that characters following '=' are
>>> hex, but does not validate this.
>>> It also fails to handle lowercase hex, which is not very tolerant.
>>> Also the exception messages aren't specific enough - cannot
>>> distinguish truncated encoded sequence from invalid CRLF pair.
>>> Again, I can take this on.
>>
>> I won't appose to a rewrite, but at that point, since both decoders
>> are buggy: what about adding a dependency to commons-coded? It already
>> handles both Base64 and QuotedPrintable codecs.
>> We could shading the interested classes, if we want to avoid less
>> dependencies as possible, but a dependency to another commons
>> component sounds fair to me.
>>
>>> Given that MimeUtil seems to strip spaces on input, do the decoders
>>> need to handle embedded spaces?
>>
>> not in my opinion, as I wrote before.
>>
>> Thanks and best,
>> -Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://simonetripodi.livejournal.com/
>> http://twitter.com/simonetripodi
>> http://www.99soft.org/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to