Hi Brian,
I found another improvement. When reading from files there's no
difference, but when reading from, say, a socket, there may be short
reads (read() returning 0 even though the stream is not at EOF). The
current code bails out of the inner loop when that happens and
consequently does not fill buf to the top, even though retrying the
inner loop could fill it. Here is a variant in which the inner loop
guarantees that either buf is filled to the top or the stream is at EOF:
public byte[] readAllBytes() throws IOException {
    List<byte[]> bufs = null;
    byte[] result = null;
    byte[] buf = new byte[DEFAULT_BUFFER_SIZE];
    int total = 0;
    int remaining; // # of bytes remaining to fill buf
    do {
        remaining = buf.length;
        int n;
        // read to fill the buf or until EOF. Loop ends when either:
        // - remaining == 0 and there's possibly more to be read from stream; or
        // - remaining > 0 and the stream is at EOF
        while (remaining > 0 &&
               (n = read(buf, buf.length - remaining, remaining)) >= 0) {
            remaining -= n;
        }
        int nread = buf.length - remaining;
        if (nread > 0) {
            if (MAX_BUFFER_SIZE - total < nread) {
                throw new OutOfMemoryError("Required array size too large");
            }
            total += nread;
            byte[] copy;
            if (remaining == 0) { // buf is filled to the top
                copy = buf;
                buf = new byte[DEFAULT_BUFFER_SIZE];
            } else {
                copy = Arrays.copyOf(buf, nread);
            }
            if (result == null) {
                result = copy;
            } else {
                if (bufs == null) {
                    bufs = new ArrayList<>(8);
                    bufs.add(result);
                }
                bufs.add(copy);
            }
        }
    } while (remaining == 0); // there may be more bytes in stream
    if (bufs == null) {
        return result == null ? new byte[0] : result;
    }
    result = new byte[total];
    int offset = 0;
    for (byte[] b : bufs) {
        System.arraycopy(b, 0, result, offset, b.length);
        offset += b.length;
    }
    return result;
}
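As a quick illustration of why the retry matters (ShortReadInputStream is a hypothetical test stream, not part of the patch): a stream that never delivers more than a few bytes per read() call still ends up with a fully filled buf under the inner loop above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that returns at most 7 bytes per read() call,
// simulating the short reads a socket stream can produce.
class ShortReadInputStream extends InputStream {
    private final InputStream in;
    ShortReadInputStream(byte[] data) { this.in = new ByteArrayInputStream(data); }
    @Override public int read() throws IOException { return in.read(); }
    @Override public int read(byte[] b, int off, int len) throws IOException {
        return in.read(b, off, Math.min(len, 7));
    }
}

public class ShortReadDemo {
    // Same inner loop as above: retry until buf is full or the stream
    // is at EOF; returns the number of bytes actually read into buf.
    static int fillFully(InputStream in, byte[] buf) throws IOException {
        int remaining = buf.length;
        int n;
        while (remaining > 0 &&
               (n = in.read(buf, buf.length - remaining, remaining)) >= 0) {
            remaining -= n;
        }
        return buf.length - remaining;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        InputStream s = new ShortReadInputStream(data);
        byte[] buf = new byte[64];
        // despite 7-byte short reads, buf is filled to the top
        System.out.println("filled " + fillFully(s, buf) + " of " + buf.length);
    }
}
```

A single read() here would return at most 7 bytes; the loop keeps going until the buffer is full, so the outer do-while sees remaining == 0 and correctly assumes there may be more data.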
Regards, Peter
On 12/20/2017 12:59 PM, Peter Levart wrote:
Hi Brian,
There's also a possible variation of the copying fragment that replaces
copying with allocation:
byte[] copy;
if (nread == DEFAULT_BUFFER_SIZE) {
    copy = buf;
    if (n >= 0) {
        buf = new byte[DEFAULT_BUFFER_SIZE];
    }
} else {
    copy = Arrays.copyOf(buf, nread);
}
For a big FileInputStream, buf will be fully read (nread ==
DEFAULT_BUFFER_SIZE) most of the time, so this might be an improvement
if allocation (which involves pre-zeroing) is faster than
Arrays.copyOf() (which avoids pre-zeroing but involves copying).
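A rough sketch of how one might probe that hypothesis (this is not a proper benchmark; JMH would be needed for reliable numbers, and the class name and iteration count here are arbitrary):

```java
import java.util.Arrays;

// Rough comparison of allocating a fresh zeroed array vs. Arrays.copyOf,
// the two alternatives in the fragment above. Numbers from a loop like
// this are only indicative; a real measurement should use JMH.
public class AllocVsCopy {
    static final int SIZE = 8192;  // assumed DEFAULT_BUFFER_SIZE
    static volatile byte[] sink;   // defeat dead-code elimination

    static long time(Runnable r, int iters) {
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) r.run();
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[SIZE];
        int iters = 200_000;
        // warm-up passes, results discarded
        time(() -> sink = new byte[SIZE], iters);
        time(() -> sink = Arrays.copyOf(buf, SIZE), iters);
        long alloc = time(() -> sink = new byte[SIZE], iters);
        long copy  = time(() -> sink = Arrays.copyOf(buf, SIZE), iters);
        System.out.printf("alloc: %d ns/op, copy: %d ns/op%n",
                          alloc / iters, copy / iters);
    }
}
```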
Regards, Peter
On 12/20/2017 12:45 PM, Peter Levart wrote:
Hi Brian,
On 12/20/2017 12:22 AM, Brian Burkhalter wrote:
On Dec 19, 2017, at 2:36 PM, Brian Burkhalter
<brian.burkhal...@oracle.com> wrote:
You can also simplify the "for(;;) + break" into a do-while loop:

do {
    int nread = 0;
    ...
} while (n > 0);
Good suggestion but I think that this needs to be "while (n >= 0)."
Updated version here:
http://cr.openjdk.java.net/~bpb/8193832/webrev.02/
Thanks,
Brian
Looks good. There is one case that could be further optimized. When
result.length <= DEFAULT_BUFFER_SIZE, the allocation of the ArrayList
could be avoided. Imagine a use case where lots of small files are
read into byte[] arrays. For example:
public byte[] readAllBytes() throws IOException {
    List<byte[]> bufs = null;
    byte[] result = null;
    byte[] buf = new byte[DEFAULT_BUFFER_SIZE];
    int total = 0;
    int n;
    do {
        int nread = 0;
        // read to EOF which may read more or less than buffer size
        while ((n = read(buf, nread, buf.length - nread)) > 0) {
            nread += n;
        }
        if (nread > 0) {
            if (MAX_BUFFER_SIZE - total < nread) {
                throw new OutOfMemoryError("Required array size too large");
            }
            total += nread;
            byte[] copy = (n < 0 && nread == DEFAULT_BUFFER_SIZE) ?
                buf : Arrays.copyOf(buf, nread);
            if (result == null) {
                result = copy;
            } else {
                if (bufs == null) {
                    bufs = new ArrayList<>(8);
                    bufs.add(result);
                }
                bufs.add(copy);
            }
        }
    } while (n >= 0); // if the last call to read returned -1, then break
    if (bufs == null) {
        return result == null ? new byte[0] : result;
    }
    result = new byte[total];
    int offset = 0;
    for (byte[] b : bufs) {
        System.arraycopy(b, 0, result, offset, b.length);
        offset += b.length;
    }
    return result;
}
There is a possibility that the JIT already avoids allocating the
ArrayList via escape analysis if all the involved ArrayList methods
inline, so this potential optimization should be tested first to see
if it actually helps improve the "small file" case.
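A quick functional check of the contract the patch must keep (a standalone sketch using the JDK's own InputStream.readAllBytes, available since JDK 9, as the reference behaviour; the class and method names here are placeholders): both the small single-buffer path and input spanning multiple buffers must round-trip intact.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Round-trip check: a byte pattern written into a stream must come back
// unchanged from readAllBytes, for sizes below and above the 8 KiB
// default buffer.
public class ReadAllBytesCheck {
    static byte[] roundTrip(int size) throws IOException {
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) data[i] = (byte) i;
        return new ByteArrayInputStream(data).readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(100).length);    // small: single-buffer path
        System.out.println(roundTrip(20_000).length); // spans multiple buffers
    }
}
```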
Regards, Peter