Hi Brian,

I found another improvement. If you are reading from files, there's no difference. But if you read from say socket(s), there may be short reads (read() returning 0). Your current code bails out from inner loop when this happens and consequently does not fill the buf up to the top when it could fill it if it retried the inner loop. This is a variant where inner loop guarantees that either buf is filled to the top or stream is at EOF:

    public byte[] readAllBytes() throws IOException {
        List<byte[]> bufs = null;
        byte[] result = null;
        byte[] buf = new byte[DEFAULT_BUFFER_SIZE];
        int total = 0;
        int remaining; // # of bytes remaining to fill buf
        do {
            remaining = buf.length;
            int n;
            // read to fill the buf or until EOF. Loop ends when either:
            // - remaining == 0 and there's possibly more to be read from stream; or
            // - remaining > 0 and the stream is at EOF
            while (remaining > 0 &&
                   (n = read(buf, buf.length - remaining, remaining)) >= 0) {
                remaining -= n;
            }

            int nread = buf.length - remaining;
            if (nread > 0) {
                if (MAX_BUFFER_SIZE - total < nread) {
                    throw new OutOfMemoryError("Required array size too large");
                }
                total += nread;
                byte[] copy;
                if (remaining == 0) { // buf is filled to the top
                    copy = buf;
                    buf = new byte[DEFAULT_BUFFER_SIZE];
                } else {
                    copy = Arrays.copyOf(buf, nread);
                }
                if (result == null) {
                    result = copy;
                } else {
                    bufs = new ArrayList<>(8);
                    bufs.add(result);
                    bufs.add(copy);
                }
            }
        } while (remaining == 0); // there may be more bytes in stream

        if (bufs == null) {
            return result == null ? new byte[0] : result;
        }

        result = new byte[total];
        int offset = 0;
        for (byte[] b : bufs) {
            System.arraycopy(b, 0, result, offset, b.length);
            offset += b.length;
        }

        return result;
    }



Regards, Peter

On 12/20/2017 12:59 PM, Peter Levart wrote:
Hi Brian,

There's also a variation of copy-ing fragment possible, that replaces copying with allocation:

                byte[] copy;
                if (nread == DEFAULT_BUFFER_SIZE) {
                    copy = buf;
                    if (n >= 0) {
                        buf = new byte[DEFAULT_BUFFER_SIZE];
                    }
                } else {
                    copy = Arrays.copyOf(buf, nread);
                }

For big FileInputStream(s), the buf will be fully read (nread == DEFAULT_BUFFER_SIZE) most of the times. So this might be an improvement if allocation (involving pre-zeroing) is faster than Arrays.copyOf() which avoids pre-zeroing, but involves copying.

Regards, Peter

On 12/20/2017 12:45 PM, Peter Levart wrote:
Hi Brian,

On 12/20/2017 12:22 AM, Brian Burkhalter wrote:
On Dec 19, 2017, at 2:36 PM, Brian Burkhalter <brian.burkhal...@oracle.com> wrote:

You can also simplify the “for(;;) + break" into a do while loop:

do {
  int nread = 0;
  ...
} while (n > 0);
Good suggestion but I think that this needs to be "while (n >= 0)."
Updated version here:

http://cr.openjdk.java.net/~bpb/8193832/webrev.02/

Thanks,

Brian

Looks good. There is one case that could be further optimized. When result.length <= DEFAULT_BUFFER_SIZE, the allocation of ArrayList could be avoided. Imagine a use case where lots of small files are read into byte[] arrays. For exmaple:

    public byte[] readAllBytes() throws IOException {
        List<byte[]> bufs = null;
        byte[] result = null;
        byte[] buf = new byte[DEFAULT_BUFFER_SIZE];
        int total = 0;
        int n;
        do {
            int nread = 0;

            // read to EOF which may read more or less than buffer size
            while ((n = read(buf, nread, buf.length - nread)) > 0) {
                nread += n;
            }

            if (nread > 0) {
                if (MAX_BUFFER_SIZE - total < nread) {
                    throw new OutOfMemoryError("Required array size too large");
                }
                total += nread;
                byte[] copy = (n < 0 && nread == DEFAULT_BUFFER_SIZE) ?
                    buf : Arrays.copyOf(buf, nread);
                if (result == null) {
                    result = copy;
                } else {
                    bufs = new ArrayList<>(8);
                    bufs.add(result);
                    bufs.add(copy);
                }
            }
        } while (n >= 0); // if the last call to read returned -1, then break

        if (bufs == null) {
            return result == null ? new byte[0] : result;
        }

        result = new byte[total];
        int offset = 0;
        for (byte[] b : bufs) {
            System.arraycopy(b, 0, result, offset, b.length);
            offset += b.length;
        }

        return result;
    }


There is a possibility that JIT already avoids allocating ArrayList utilizing EA if all involved ArrayList methods inline, so this potential optimization should be tested 1st to see if it actually helps improve the "small file" case.

Regards, Peter



Reply via email to