On Mon, 30 Jan 2023 10:32:38 GMT, Eirik Bjorsnos <d...@openjdk.org> wrote:

> After finding a hash match, getEntryPos needs to compare the lookup name up 
> to the encoded entry name in the CEN. This comparison is done by decoding the 
> entry name into a String. The names can then be compared using the String 
> API. This decoding step adds a significat cost to this method.
> 
> This PR suggest to update the string comparison such that in the common case 
> where  both the lookup name and the entry name are encoded in 
> ASCII-compatible UTF-8,  decoding can be avoided and the byte arrays can 
> instead be compared direcly. 
> 
> ZipCoder is updated with a new method to compare a string with an encoded 
> byte array range.  The default implementation decodes to string (like the 
> current code), while the UTF-8 implementation uses 
> JavaLangAccess.getBytesNoRepl to get the  bytes. Both methods thes uses 
> Arrays.mismatch for comparison with or without matching trailing slashes. 
> 
> Additionally, this PR suggest to make the following updates to getEntryPos:
> 
> - The try/catch for IAE is redundant and can be safely removed. (initCEN 
> already checks this and will throws IAE for invalid UTF-8). This seems to 
> give a 3-4% speedup on micros)
> - A new test InvalidBytesInEntryNameOrComment is a added to verify that 
> initCEN does in fact reject invalid UTF-8 in CEN file names and comments. (I 
> found no existing test coverage for this)
> - The recursion when looking for "name/" matches is replaced with iteration. 
> We keep track of any "name/" match and return it at the end of the search. (I 
> feel this is easier to follow and it also gives a ~30% speedup for addSlash 
> lookups with no regression on regular lookups)
> 
> (My though is that including these additional updates in this PR might reduce 
> reviewer overhead given that it touches the exact same code. I might be wrong 
> on this, please advise :)
> 
> I'm seeing a ~17% saving on the micro ZipFileGetEntry.getEntryHit (modified 
> to use xalan.jar):
> 
> Baseline:
> 
> Benchmark                             (size)  Mode  Cnt    Score   Error  
> Units
> ZipFileGetEntry.getEntryHit              512  avgt   15   74.941 ± 1.004  
> ns/op
> ZipFileGetEntry.getEntryHit             1024  avgt   15   84.943 ± 1.320  
> ns/op
> ZipFileGetEntry.getEntryHitUncached      512  avgt   15  120.371 ± 2.386  
> ns/op
> ZipFileGetEntry.getEntryHitUncached     1024  avgt   15  126.128 ± 1.075  
> ns/op
> ZipFileGetEntry.getEntryMiss             512  avgt   15   23.818 ± 0.838  
> ns/op
> ZipFileGetEntry.getEntryMiss            1024  avgt   15   29.762 ± 5.998  
> ns/op
> ZipFileGetEntry.getEntryMissUncached     512  avgt   15   59.405 ± 0.545  
> ns/op
> ZipFileGetEntry.getEntryMissUncached    1024  avgt   15   71.840 ± 2.455  
> ns/op
> ZipFileGetEntry.getEntrySlash            512  avgt   15  135.621 ± 4.341  
> ns/op
> ZipFileGetEntry.getEntrySlash           1024  avgt   15  134.190 ± 2.141  
> ns/op
> 
> 
> 
> PR:
> 
> 
> Benchmark                             (size)  Mode  Cnt    Score   Error  
> Units
> ZipFileGetEntry.getEntryHit              512  avgt   15   62.267 ± 1.329  
> ns/op
> ZipFileGetEntry.getEntryHit             1024  avgt   15   72.916 ± 2.428  
> ns/op
> ZipFileGetEntry.getEntryHitUncached      512  avgt   15  101.630 ± 1.154  
> ns/op
> ZipFileGetEntry.getEntryHitUncached     1024  avgt   15  113.161 ± 0.502  
> ns/op
> ZipFileGetEntry.getEntryMiss             512  avgt   15   23.003 ± 1.191  
> ns/op
> ZipFileGetEntry.getEntryMiss            1024  avgt   15   23.236 ± 1.114  
> ns/op
> ZipFileGetEntry.getEntryMissUncached     512  avgt   15   56.781 ± 1.505  
> ns/op
> ZipFileGetEntry.getEntryMissUncached    1024  avgt   15   67.767 ± 1.963  
> ns/op
> ZipFileGetEntry.getEntrySlash            512  avgt   15   73.745 ± 2.717  
> ns/op
> ZipFileGetEntry.getEntrySlash           1024  avgt   15   75.784 ± 1.051  
> ns/op
> 
> 
> To assess the impact on startup/warmup, I made a main method class which 
> measures the total time of calling ZipFile.getEntry for all entries in the 
> 109 JAR file dependenies of spring-petclinic. The shows a nice improvement 
> (time in micros):
> 
> 
> Percentile Baseline Patch
> 50 %          23155 21149
> 75 %          23598 21454
> 90 %          23989 21691
> 95 %          24238 21973
> 99 %          25270 22446
> STDEV           792   549
> Count           500   500

Filed JDK-8301873 for this, update PR title when you're ready.

src/java.base/share/classes/java/lang/System.java line 2668:

> 2666:             @Override
> 2667:             public int mismatchUTF8(String str, byte[] b, int 
> fromIndex, int toIndex) {
> 2668:                 byte[] encoded = str.isLatin1() ? str.value() : 
> str.getBytes(UTF_8.INSTANCE);

I think this is incorrect: latin-1 characters above codepoint 127 (non-ascii) 
would be represented by 2 bytes in UTF-8. What you want here is probably 
`str.isAscii() ? ...`. The ASCII check will have to look at the bytes, so will 
incur a minor penalty.

Good news is that you should already be able to do this with what's already 
exposed via `JLA.getBytesNoRepl(str, StandardCharsets.UTF_8)`, so no need for 
more shared secrets.

src/java.base/share/classes/java/lang/System.java line 2671:

> 2669:                 if (false) {
> 2670:                     // Arrays.mismatch without the range checks (~5% 
> faster micro getEntryHit)
> 2671:                     int aLength = encoded.length;

Part of the difference you're seeing is due to knowing that you'll be matching 
the entire length of the first array (`encoded, 0, encoded.length`).

As an experiment I added `Arrays.mismatch(byte[], byte[], int, int)` to 
mismatch the entire range of the first array argument vs the range of the 
second and can spot an improvement in affected micros:

Benchmark                                     (size)  Mode  Cnt   Score   Error 
 Units
ArraysMismatch.Char.differentSubrangeMatches      90  avgt   10  12.165 ± 0.074 
 ns/op # mismatch(a, aFrom, aTo, b, bFrom, bTo)
ArraysMismatch.Char.subrangeMatches               90  avgt   10  10.748 ± 0.006 
 ns/op # mismatch(a, b, bFrom, bTo)

This might be something we can solve in the JITs without having to add new 
methods to `java.util.Arrays` to deal as efficiently as possible with the case 
when we're matching against the entirety of one of the arrays.

-------------

PR: https://git.openjdk.org/jdk/pull/12290

Reply via email to