Hi,

It appears to be possible to do something like

boolean de/encode(ByteBuffer src, ByteBuffer dst);

which returns true if all remaining bytes in src are en/decoded, and
false if dst is not big enough for all the output bytes; in the latter
case src.position() will be advanced to the position of the next
un-en/decoded byte, and dst.position() will be updated accordingly as
well.

This would avoid having the en/decoder hold any internal state.
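
A caller could then drive it with a simple drain loop, something like
the sketch below (StatelessEncoder/encodeAll are just placeholders for
whatever shape we end up with):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

class EncodeLoop {
    // Placeholder for the proposed stateless API:
    //   true  -> all remaining bytes in src were encoded
    //   false -> dst ran out of room; src/dst positions were advanced
    interface StatelessEncoder {
        boolean encode(ByteBuffer src, ByteBuffer dst);
    }

    static void encodeAll(StatelessEncoder enc, ByteBuffer src,
                          WritableByteChannel sink) throws IOException {
        ByteBuffer dst = ByteBuffer.allocate(8192);
        while (!enc.encode(src, dst)) {  // false: dst full, src not done
            dst.flip();
            sink.write(dst);             // drain the encoded bytes
            dst.clear();                 // continue from src.position()
        }
        dst.flip();
        sink.write(dst);                 // write the final chunk
    }
}

Since each call picks up exactly where src.position() points, no state
needs to live in the en/decoder itself.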

-Sherman

On 10/12/2012 12:47 PM, Ariel Weisberg wrote:
Hi,

Thanks for doing this BTW.

I think that including a ByteBuffer API, even if it isn't as efficient
as raw byte arrays, is better than not having it in the API at all. If
that means allocating a byte array for the output and then doing a put
on the ByteBuffer, that is fine.
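
For example, a fallback along these lines would be fine with me (just a
sketch; encodeToBytes stands in for the existing array-based encode):

import java.nio.ByteBuffer;

class CopyFallback {
    // Copy-based fallback: pull the input into a byte array, encode
    // with the array API, then put the result into the destination
    // ByteBuffer. Costs one extra allocation and copy per call.
    static void encode(ByteBuffer src, ByteBuffer dst) {
        byte[] in = new byte[src.remaining()];
        src.get(in);                     // copy input out of the buffer
        byte[] out = encodeToBytes(in);  // stand-in for array-based encode
        dst.put(out);
    }

    static byte[] encodeToBytes(byte[] in) {
        throw new UnsupportedOperationException("stand-in only");
    }
}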

Down the line, if someone has a particularly powerful itch to scratch
WRT performance, they can add more code to the library to make it more
efficient at handling ByteBuffers, and then everyone will benefit; or
they can do their own implementation.

Thanks,
Ariel


On Fri, Oct 12, 2012, at 02:56 PM, Xueming Shen wrote:
Hi,

This is exactly the reason I was trying to skip en/decode(ByteBuffer in,
ByteBuffer out) for now. I'm struggling with / can't make up my mind on
whether or not the en/decoder should have internal state, like the
charset en/decoder. It appears the API is being pushed in that
direction though :-)

-Sherman

On 10/12/2012 11:39 AM, Michael Schierl wrote:
Hello,

(sorry if the threading is broken, but I was not subscribed to the list
and only found the discussion on Twitter and read it in the mailing list
archive)

Ariel Weisberg wrote on Thu Oct 11 11:30:56 PDT 2012:
I know that ByteBuffers are a pain, but I did notice that you can't
specify a source/dest pair when using ByteBuffers and that ByteBuffers
without arrays have to be copied. I don't see a simple safe way to
normalize access to them the way you can if everything is a byte array.

Agreed. One of the advantages of using byte buffers is reducing
allocations, resulting in fewer garbage collections.

In addition, in this implementation the ByteBuffers have to contain the
full data.

What I like about most ByteBuffer APIs is that I can pass in a
ByteBuffer with incomplete data or maybe an output ByteBuffer that is
too small to hold the complete result, and it will just process as much
as it can, and leave the rest for the next round (which should work well
for Base64, too, as it always processes chunks of 3 or 4 bytes).

So, a useful ByteBuffer API in my opinion needs a method like

public boolean encode(ByteBuffer in, ByteBuffer out,
    boolean endOfInput);

public boolean decode(ByteBuffer in, ByteBuffer out,
    boolean endOfInput);

(similar to CharsetEncoder#encode) that can process partial input and
will return true if all processable input has been processed (i.e. in
has to be refilled) or false if some input could not be processed
(i.e. out has to be flushed).

Users have to call it again and again until they call it with
endOfInput=true and get true back. (Using an enum result similar to
CoderResult#UNDERFLOW and CoderResult#OVERFLOW might be another option
if the boolean results are too cryptic.)
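
For example, pumping from one channel to another could look roughly
like this (just a sketch, assuming blocking channels; Base64Like stands
in for the encoder type):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

class PumpLoop {
    // Stand-in for the proposed API: returns true on underflow (in
    // needs refilling), false on overflow (out needs flushing).
    interface Base64Like {
        boolean encode(ByteBuffer in, ByteBuffer out, boolean endOfInput);
    }

    static void pump(Base64Like enc, ReadableByteChannel src,
                     WritableByteChannel dst) throws IOException {
        ByteBuffer in = ByteBuffer.allocate(3 * 1024);  // multiple of 3
        ByteBuffer out = ByteBuffer.allocate(4 * 1024); // multiple of 4
        in.flip();                            // start with an empty input
        boolean eof = false;
        while (true) {
            if (enc.encode(in, out, eof)) {   // underflow
                if (eof)
                    break;                    // endOfInput=true, true back
                in.compact();                 // keep any unprocessed tail
                eof = (src.read(in) == -1);   // refill from the channel
                in.flip();
            } else {                          // overflow
                out.flip();
                dst.write(out);               // flush the full out buffer
                out.clear();
            }
        }
        out.flip();
        dst.write(out);                       // flush whatever is left
    }
}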

Having a ByteBuffer Base64 API might be useful (although I'm not sure
yet whether I'll ever need it), but as it is now, it is mostly useless
for serious ByteBuffer usage: if I have to split and copy the data
manually anyway, I might as well use the array APIs.


Just my 0.02 EUR,


Michael
