Hi,

On 09/22/2014 01:25 PM, Richard Warburton wrote:
Hi all,

A long-standing issue with Strings in Java is the ease and performance of
creating a String from a ByteBuffer. People who use NIO to bring in
data off the network receive that data in the form of ByteBuffers
and convert it to some form of String. For example, RESTful systems
receiving XML or JSON messages.

The current workaround is to create a byte[] from the ByteBuffer - a
copying action for any direct bytebuffer - and then pass that to the
String.
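For reference, the workaround described above can be sketched as follows. This is only an illustration: the class and method names are made up, and UTF-8 is assumed as the charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the byte[] workaround; not part of any JDK API.
public class ByteBufferToStringCopy {

    public static String viaByteArray(ByteBuffer in) {
        // First copy: move the remaining bytes out of the (possibly direct) buffer.
        byte[] bytes = new byte[in.remaining()];
        in.get(bytes);
        // Second step: decode the bytes into the String's own storage.
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The intermediate byte[] exists only to bridge the two APIs, which is the overhead the proposal is aimed at.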

An alternative is to use CharsetDecoder to run a "decoding operation" on the input ByteBuffer(s), writing the result to CharBuffer(s). If the resulting CharBuffer is a single object (big enough), it can be converted to a String via a simple CharBuffer.toString(), which is a copying operation. In situations where the number of resulting characters can be anticipated in advance (for example, when we know the number of bytes to be decoded and the charset has a fixed, or nearly fixed as with UTF-8, number of bytes per char), a simple static utility method somewhere in the java.nio package could be used to optimize this operation:

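    public static String decodeString(CharsetDecoder dec, ByteBuffer in)
        throws CharacterCodingException {

        // Decodes all remaining bytes into a freshly allocated CharBuffer.
        CharBuffer cb = dec.decode(in);

        // cb.hb is the package-private backing char[] of a heap CharBuffer,
        // so this method would have to live inside java.nio.
        if (cb.length() == cb.hb.length) {
            // optimized no-copy String construction: the decoded chars fill
            // the whole array, so it can be handed to the String directly
            return SharedSecrets.getJavaLangAccess().newStringUnsafe(cb.hb);
        } else {
            return cb.toString();
        }
    }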
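The method above relies on JDK internals, so it cannot be compiled outside the JDK. A minimal sketch of the same decoding path using only public API follows; it always takes the copying toString() branch. The class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;

// Hypothetical class name; uses only public java.nio API.
public class DecodeWithPublicApi {

    public static String decodeString(CharsetDecoder dec, ByteBuffer in)
            throws CharacterCodingException {
        // decode(ByteBuffer) resets the decoder, decodes the remaining bytes
        // and flushes, returning a newly allocated CharBuffer.
        CharBuffer cb = dec.decode(in);
        // Copies the decoded chars into the new String.
        return cb.toString();
    }
}
```

This works for heap and direct buffers alike and avoids the intermediate byte[], but it still pays for one copy from the CharBuffer into the String.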


  I'd like to propose that we add an additional constructor to the
String class that takes a ByteBuffer as an argument, and directly create
the char[] value inside the String from the ByteBuffer.

Similarly if you have a String that you want to encode onto the wire then
you need to call String.getBytes(), then write your byte[] into a
ByteBuffer or send it over the network.

Again, an alternative is to use CharBuffer.wrap(CharSequence cs, int start, int end) to wrap the String in a CharBuffer facade and then use CharsetEncoder to encode it directly into the resulting ByteBuffer. No additional copying is needed.
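A rough sketch of that encoding path, using only public API, might look like this. The class and method names are hypothetical, and the boolean return value is just one possible way to signal that the output buffer was too small or that the input could not be encoded.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

// Hypothetical helper; not part of any JDK API.
public class EncodeIntoBuffer {

    /** Encodes s into out; returns true only if all of s was encoded and flushed. */
    public static boolean encodeInto(String s, CharsetEncoder enc, ByteBuffer out) {
        // Read-only CharBuffer view over the String; the chars are not copied.
        CharBuffer chars = CharBuffer.wrap(s);
        enc.reset();
        // endOfInput = true: s is the complete input of this encoding operation.
        CoderResult result = enc.encode(chars, out, true);
        if (result.isUnderflow()) {
            result = enc.flush(out);
        }
        // Overflow (out too small) or an encoding error yields false.
        return result.isUnderflow();
    }
}
```

A caller would typically flip() the output buffer afterwards and hand it to a channel write.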

Regards, Peter

This ends up allocating a byte[] to
do the copy and also trimming the byte[] back down again, usually
allocating another byte[]. To address this problem I've added a couple of
getBytes() overloads that take byte[] and ByteBuffer arguments and write
directly to those buffers.

I've put together a patch that implements this to demonstrate the overall
direction.

http://cr.openjdk.java.net/~rwarburton/string-patch-webrev-5/

I'm happy to take any feedback on direction or the patch itself or the
overall idea/approach. I think there are a number of similar API situations
in other places as well, for example StringBuffer/StringBuilder instances
which could have similar appends directly from ByteBuffer instances instead
of byte[] instances.

I'll also be at JavaOne next week, so if you want to talk about this, just
let me know.

regards,

   Richard Warburton

   http://insightfullogic.com
   @RichardWarburto <http://twitter.com/richardwarburto>

PS: I appreciate that I'm adding a method to the public API, which
consequently requires standardisation, but I think that this could get
incorporated into the Java 9 umbrella JSR.
