On Sun, 12 Mar 2023 21:25:46 GMT, Eirik Bjorsnos <[email protected]> wrote:
>> Please review this PR which speeds up TestTooManyEntries and clarifies its
>> purpose:
>>
>> - The name 'TestTooManyEntries' does not clearly convey the purpose of the
>> test. What is tested is the validation that the total CEN size fits in a
>> Java byte array. Suggested rename: CenSizeTooLarge
>> - The test creates DEFLATED entries which incurs zlib costs and File Data /
>> Data Descriptors for no additional benefit. We can use STORED instead.
>> - By creating a single LocalDateTime and setting it with
>> `ZipEntry.setTimeLocal`, we can avoid repeated time zone calculations.
>> - The names of the entries are generated by calling UUID.randomUUID; we can
>> use a simple counter instead.
>> - The produced file is unnecessarily large. We know how large a CEN entry
>> is, so let's take advantage of that to create a file with the minimal size.
>> - By adding a maximally large extra field to the CEN entries, we get away
>> with fewer CEN records and save memory.
>> - The summary and comments of the test can be improved to help explain the
>> purpose of the test and how we reach the limit being tested.
>>
>> These speedups reduced the runtime from 4 min 17 sec to 4 seconds on my
>> MacBook Pro. The produced ZIP size was reduced from 5.7 GB to 2 GB. Memory
>> consumption is down from 8 GB to something like 12 MB.
>
> Eirik Bjorsnos has updated the pull request incrementally with two additional
> commits since the last revision:
>
> - MAX_EXTRA_FIELD_SIZE can be better expressed as 0xFFFF
> - Bring back '@requires sun.arch.data.model == 64' for now
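For illustration, the per-entry setup described in the quoted list boils down
to roughly the following (a simplified sketch, not the actual test code; the
header tag value and the names `sharedTime`, `extra` and `n` are made up):

// Simplified sketch: one shared timestamp avoids repeated time zone
// conversions, STORED avoids zlib, and a maximally large extra field makes
// each CEN record bigger so fewer entries are needed
LocalDateTime sharedTime = LocalDateTime.now();

byte[] extra = new byte[0xFFFF];      // maximal extra field length
int dataSize = extra.length - 4;      // extra header is tag (2) + size (2)
extra[0] = (byte) 0x42;               // arbitrary, unused header tag
extra[1] = (byte) 0x42;
extra[2] = (byte) (dataSize & 0xFF);  // data size, little endian
extra[3] = (byte) (dataSize >> 8);

ZipEntry entry = new ZipEntry("entry-" + n);  // n is a simple counter, not a UUID
entry.setMethod(ZipEntry.STORED);
entry.setSize(0);                     // STORED with no data: size and CRC are 0
entry.setCrc(0);
entry.setTimeLocal(sharedTime);
entry.setExtra(extra);
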
The test now runs fast with much less memory, but still consumes 2GB of disk
space.
I brought back `@requires (sun.arch.data.model == "64")`. Is this required for
files > 2GB?
We could bring down the consumed disk space to 131MB by using a sparse file.
Whether this is worth pursuing depends on whether the 2GB file is considered
problematic.
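For reference, the sparse file itself is requested by opening the channel with
a sparse hint; this is just a hint and takes effect only on file systems that
support sparse files (sketch, the path is made up):

FileChannel channel = FileChannel.open(Path.of("cen-size-too-large.zip"),
        StandardOpenOption.CREATE_NEW,  // SPARSE is honored together with CREATE_NEW
        StandardOpenOption.WRITE,
        StandardOpenOption.SPARSE);     // hint that the file will be sparse
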
Here's the SparseOutputStream used to bring the size down to 131MB:
/**
 * By writing the extra fields (which make up most of the file and are all
 * zeros) as sparse 'holes', we can reduce the disk space used by this test
 * from ~2GB to ~131MB
 */
private static class SparseOutputStream extends FilterOutputStream {
    private final byte[] extra;
    private final FileChannel channel;

    public SparseOutputStream(byte[] extra, FileChannel channel) {
        super(new BufferedOutputStream(Channels.newOutputStream(channel)));
        this.extra = extra;
        this.channel = channel;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (b == extra && off == 0 && len == extra.length) {
            // Write only the extra field header (tag + data size,
            // EXTRA_HEADER_LENGTH bytes, defined elsewhere in the test)
            out.write(b, off, EXTRA_HEADER_LENGTH);
            out.flush();
            // The extra field data is all zeros, so instead of writing it we
            // can simply advance the channel position, leaving a sparse hole
            channel.position(channel.position() + len - EXTRA_HEADER_LENGTH);
        } else {
            out.write(b, off, len);
        }
    }
}
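
And a rough sketch of how this could be wired up (simplified, reusing the
made-up names from the sketches above; ZipOutputStream writes the extra array
it was given straight through, which is what the identity check in write()
relies on):

try (ZipOutputStream zip = new ZipOutputStream(
        new SparseOutputStream(extra, channel))) {
    for (int n = 0; n < numEntries; n++) {
        ZipEntry entry = new ZipEntry("entry-" + n);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(0);
        entry.setCrc(0);
        entry.setTimeLocal(sharedTime);
        entry.setExtra(extra);
        zip.putNextEntry(entry);
        zip.closeEntry();
    }
}
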
-------------
PR: https://git.openjdk.org/jdk/pull/12991