Hi Brian,
I found another improvement. When reading from files there's no
difference, but when reading from, say, a socket, there may be short
reads (read() returning 0 even though the stream is not at EOF). The
current code bails out of the inner loop when that happens and
consequently does not fill buf to the top, even though retrying the
inner loop could fill it. Here is a variant in which the inner loop
guarantees that either buf is filled to the top or the stream is at EOF:
public byte[] readAllBytes() throws IOException {
    List<byte[]> bufs = null;
    byte[] result = null;
    byte[] buf = new byte[DEFAULT_BUFFER_SIZE];
    int total = 0;
    int remaining; // # of bytes remaining to fill buf
    do {
        remaining = buf.length;
        int n;
        // read to fill the buf or until EOF. Loop ends when either:
        // - remaining == 0 and there's possibly more to be read from stream; or
        // - remaining > 0 and the stream is at EOF
        while (remaining > 0 &&
               (n = read(buf, buf.length - remaining, remaining)) >= 0) {
            remaining -= n;
        }
        int nread = buf.length - remaining;
        if (nread > 0) {
            if (MAX_BUFFER_SIZE - total < nread) {
                throw new OutOfMemoryError("Required array size too large");
            }
            total += nread;
            byte[] copy;
            if (remaining == 0) { // buf is filled to the top
                copy = buf;
                buf = new byte[DEFAULT_BUFFER_SIZE];
            } else {
                copy = Arrays.copyOf(buf, nread);
            }
            if (result == null) {
                result = copy;
            } else {
                if (bufs == null) {
                    bufs = new ArrayList<>(8);
                    bufs.add(result);
                }
                bufs.add(copy);
            }
        }
    } while (remaining == 0); // there may be more bytes in stream
    if (bufs == null) {
        return result == null ? new byte[0] : result;
    }
    result = new byte[total];
    int offset = 0;
    for (byte[] b : bufs) {
        System.arraycopy(b, 0, result, offset, b.length);
        offset += b.length;
    }
    return result;
}
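As a quick illustration of why the retry matters (ShortReadInputStream is a hypothetical test stream, not part of the patch): a stream that never delivers more than a few bytes per read() call still ends up with a fully filled buf under the inner loop above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that returns at most 7 bytes per read() call,
// simulating the short reads a socket stream can produce.
class ShortReadInputStream extends InputStream {
    private final InputStream in;
    ShortReadInputStream(byte[] data) { this.in = new ByteArrayInputStream(data); }
    @Override public int read() throws IOException { return in.read(); }
    @Override public int read(byte[] b, int off, int len) throws IOException {
        return in.read(b, off, Math.min(len, 7));
    }
}

public class ShortReadDemo {
    // Same inner loop as above: retry until buf is full or the stream
    // is at EOF; returns the number of bytes actually read into buf.
    static int fillFully(InputStream in, byte[] buf) throws IOException {
        int remaining = buf.length;
        int n;
        while (remaining > 0 &&
               (n = in.read(buf, buf.length - remaining, remaining)) >= 0) {
            remaining -= n;
        }
        return buf.length - remaining;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        InputStream s = new ShortReadInputStream(data);
        byte[] buf = new byte[64];
        // despite 7-byte short reads, buf is filled to the top
        System.out.println("filled " + fillFully(s, buf) + " of " + buf.length);
    }
}
```

A single read() here would return at most 7 bytes; the loop keeps going until the buffer is full, so the outer do-while sees remaining == 0 and correctly assumes there may be more data.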
Regards, Peter
On 12/20/2017 12:59 PM, Peter Levart wrote:
Hi Brian,
There's also a possible variation of the copying fragment that replaces
copying with allocation:
byte[] copy;
if (nread == DEFAULT_BUFFER_SIZE) {
    copy = buf;
    if (n >= 0) {
        buf = new byte[DEFAULT_BUFFER_SIZE];
    }
} else {
    copy = Arrays.copyOf(buf, nread);
}
For a big FileInputStream, buf will be fully read (nread ==
DEFAULT_BUFFER_SIZE) most of the time, so this might be an improvement
if allocation (which involves pre-zeroing) is faster than
Arrays.copyOf() (which avoids pre-zeroing but involves copying).
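A rough sketch of how one might probe that hypothesis (this is not a proper benchmark; JMH would be needed for reliable numbers, and the class name and iteration count here are arbitrary):

```java
import java.util.Arrays;

// Rough comparison of allocating a fresh zeroed array vs. Arrays.copyOf,
// the two alternatives in the fragment above. Numbers from a loop like
// this are only indicative; a real measurement should use JMH.
public class AllocVsCopy {
    static final int SIZE = 8192;  // assumed DEFAULT_BUFFER_SIZE
    static volatile byte[] sink;   // defeat dead-code elimination

    static long time(Runnable r, int iters) {
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) r.run();
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[SIZE];
        int iters = 200_000;
        // warm-up passes, results discarded
        time(() -> sink = new byte[SIZE], iters);
        time(() -> sink = Arrays.copyOf(buf, SIZE), iters);
        long alloc = time(() -> sink = new byte[SIZE], iters);
        long copy  = time(() -> sink = Arrays.copyOf(buf, SIZE), iters);
        System.out.printf("alloc: %d ns/op, copy: %d ns/op%n",
                          alloc / iters, copy / iters);
    }
}
```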
Regards, Peter
On 12/20/2017 12:45 PM, Peter Levart wrote:
Hi Brian,
On 12/20/2017 12:22 AM, Brian Burkhalter wrote:
On Dec 19, 2017, at 2:36 PM, Brian Burkhalter
<brian.burkhal...@oracle.com> wrote:
You can also simplify the "for(;;) + break" into a do-while loop:

do {
    int nread = 0;
    ...
} while (n > 0);
Good suggestion but I think that this needs to be "while (n >= 0)."
Updated version here:
http://cr.openjdk.java.net/~bpb/8193832/webrev.02/
Thanks,
Brian
Looks good. There is one case that could be further optimized. When
result.length <= DEFAULT_BUFFER_SIZE, the allocation of the ArrayList
could be avoided. Imagine a use case where lots of small files are
read into byte[] arrays. For example:
public byte[] readAllBytes() throws IOException {
    List<byte[]> bufs = null;
    byte[] result = null;
    byte[] buf = new byte[DEFAULT_BUFFER_SIZE];
    int total = 0;
    int n;
    do {
        int nread = 0;
        // read to EOF which may read more or less than buffer size
        while ((n = read(buf, nread, buf.length - nread)) > 0) {
            nread += n;
        }
        if (nread > 0) {
            if (MAX_BUFFER_SIZE - total < nread) {
                throw new OutOfMemoryError("Required array size too large");
            }
            total += nread;
            byte[] copy = (n < 0 && nread == DEFAULT_BUFFER_SIZE) ?
                buf : Arrays.copyOf(buf, nread);
            if (result == null) {
                result = copy;
            } else {
                if (bufs == null) {
                    bufs = new ArrayList<>(8);
                    bufs.add(result);
                }
                bufs.add(copy);
            }
        }
    } while (n >= 0); // if the last call to read returned -1, then break
    if (bufs == null) {
        return result == null ? new byte[0] : result;
    }
    result = new byte[total];
    int offset = 0;
    for (byte[] b : bufs) {
        System.arraycopy(b, 0, result, offset, b.length);
        offset += b.length;
    }
    return result;
}
There is a possibility that the JIT already avoids allocating the
ArrayList via escape analysis if all the involved ArrayList methods
inline, so this potential optimization should be tested first to see
if it actually helps improve the "small file" case.
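A quick functional check of the contract the patch must keep (a standalone sketch using the JDK's own InputStream.readAllBytes, available since JDK 9, as the reference behaviour; the class and method names here are placeholders): both the small single-buffer path and input spanning multiple buffers must round-trip intact.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Round-trip check: a byte pattern written into a stream must come back
// unchanged from readAllBytes, for sizes below and above the 8 KiB
// default buffer.
public class ReadAllBytesCheck {
    static byte[] roundTrip(int size) throws IOException {
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) data[i] = (byte) i;
        return new ByteArrayInputStream(data).readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(100).length);    // small: single-buffer path
        System.out.println(roundTrip(20_000).length); // spans multiple buffers
    }
}
```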
Regards, Peter