Re: RFR: 8253821: Improve ByteBuffer performance with GCM [v2]
On Wed, 7 Oct 2020 19:42:11 GMT, Valerie Peng wrote: >> src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 664: >> >>> 662: * engineUpdate() and engineDoFinal(). >>> 663: */ >>> 664: private int bufferCrypt(ByteBuffer input, ByteBuffer output, >> >> It looks like this method is copied from the CipherSpi. For maintenance, it >> would be nice to add an extra comment to >> state the copy and update. For example, "this method and implementation is >> copied from javax.crypto.CipherSpi, with an >> improvement for GCM mode." > > Agree w/ Xuelei, it'd be nice to mention CipherSpi.bufferCrypt() method. In > case a bug is found, it'd be checked and > fixed in both places. Yeah.. good point. - PR: https://git.openjdk.java.net/jdk/pull/411
Re: RFR: 8153005: Upgrade the default PKCS12 encryption/MAC algorithms
On Wed, 7 Oct 2020 22:08:19 GMT, Hai-May Chao wrote: >> Default algorithms are bumped to be based on PBES2 with AES-256 and SHA-256. >> Please also review the CSR at >> https://bugs.openjdk.java.net/browse/JDK-8228481. > > Looks good. Only minor comments. CSR looks good. In "Specification" section: a typo in 'Thr iteration counts used by'. At the end, it describes that the new system property will override the security properties and use the older and weaker algorithms, so I suggest we also add text about setting the iteration counts to the default legacy values. - PR: https://git.openjdk.java.net/jdk/pull/473
Re: RFR: 8253821: Improve ByteBuffer performance with GCM [v2]
On Thu, 8 Oct 2020 03:21:46 GMT, Anthony Scarpino wrote: >> 8253821: Improve ByteBuffer performance with GCM > > Anthony Scarpino has updated the pull request incrementally with one > additional commit since the last revision: > > Xuelei comments src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 658: > 656: BadPaddingException { > 657: return bufferCrypt(input, output, false); > 658: } Is the override of this method for using a different bufferCrypt impl? There is also engineUpdate(ByteBuffer, ByteBuffer) in CipherSpi, is there a reason for not overriding that here? src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 744: > 742: } else { > 743: return core.doFinal(input, output); > 744: } It seems this block is the only difference between this method and CipherSpi.bufferCrypt(). Have you considered moving this special handling to the overridden engineDoFinal(...) method and not duplicating the whole CipherSpi.bufferCrypt() method here? BTW, instead of using the generic update/doFinal name and then commenting them for GCM usage only, perhaps it's more enticing to name them as gcmUpdate/gcmDoFinal? src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 820: > 818: return cipher.decrypt(src, dst); > 819: } > 820: return cipher.encrypt(src, dst); How about return (decrypting? cipher.decrypt(src, dst) : cipher.encrypt(src, dst)); src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 816: > 814: } > 815: > 816: int update(ByteBuffer src, ByteBuffer dst) throws > ShortBufferException { Is this also one of the "GCM only" methods? If so, same comment as doFinal(ByteBuffer, ByteBuffer)? Maybe the name should be more specific to avoid misuse. 
src/java.base/share/classes/com/sun/crypto/provider/FeedbackCipher.java line 261: > 259: > 260: int encryptFinal(ByteBuffer src, ByteBuffer dst) > 261: throws IllegalBlockSizeException, AEADBadTagException, The tag is generated during encryption, so this can't possibly throw AEADBadTagException, copy-n-paste error maybe? src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 1253: > 1251: if (decrypting) { > 1252: if (buffered > 0) { > 1253: cipher.decrypt(buffer, 0, buffered, new byte[0], 0); This looks a bit strange? The output buffer is 0-length, which would lead to ShortBufferException when the buffered bytes are enough to produce some output. src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 1258: > 1256: } else { > 1257: if (buffered > 0) { > 1258: cipher.encrypt(buffer, 0, buffered, new byte[0], 0); Same comment as above? src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 939: > 937: // if the size of specified output buffer is less than > 938: // the length of the cipher text, then the current > 939: // content of cipher has to be preserved in order for This change is somewhat dangerous. When using the supplied output buffer directly, you may corrupt its content w/ padding bytes. This optimization should only be applied when no padding is involved. In addition, input and output can point to the same buffer with all sorts of index combinations, i.e. inOfs == outOfs, inOfs < outOfs, inOfs > outOfs. With the "outWithPadding" approach, no need to check. However, for "internalOutput", data corruption may happen. This kind of problem can be very hard to diagnose. We have to be very, very careful here as this code may impact all ciphers... src/java.base/share/classes/com/sun/crypto/provider/GCTR.java line 240: > 238: } > 239: } > 240: } See the above red icon? It warns about a missing newline at the end of this file. 
src/java.base/share/classes/com/sun/crypto/provider/GaloisCounterMode.java line 528: > 526: } > 527: > 528: ArrayUtil.blockSizeCheck(src.remaining(), blockSize); Hmm, I am not sure if this check still applies in ByteBuffer case. You are passing the ByteBuffer objs directly from AESCipher->CipherCore->GaloisCounterMode. This is different from the byte[] case where CipherCore would chop up the data into blocks and pass the blocks to the underlying FeedbackCipher impl. Perhaps no existing regression tests covers ByteBuffer inputs w/ non-blocksize data? Otherwise, this should be caught? BTW, why not just use 'len' again? Seems unnecessary to keep calling src.remaining() in various places in this method. - PR: https://git.openjdk.java.net/jdk/pull/411
Re: RFR: 8253821: Improve ByteBuffer performance with GCM [v2]
On Wed, 7 Oct 2020 16:29:32 GMT, Xue-Lei Andrew Fan wrote: >> Anthony Scarpino has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Xuelei comments > > src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 947: > >> 945: // create temporary output buffer if the estimated size is >> larger >> 946: // than the user-provided buffer. >> 947: if (output.length - outputOffset < estOutSize) { > > "outputCapacity" could be used to replace "output.length - outputOffset", and > join the clause with the if-clause above. interesting.. That makes this and the if condition below it able to join again. > src/java.base/share/classes/com/sun/crypto/provider/CounterMode.java line 28: > >> 26: package com.sun.crypto.provider; >> 27: >> 28: import java.nio.ByteBuffer; > > There is no more update except this line. The new import ByteBuffer may not > be used. Thanks. That is another new thing to get used to with git and multiple commits. When cleaning up code it's easy to not see changes like this. - PR: https://git.openjdk.java.net/jdk/pull/411
Re: RFR: 8253821: Improve ByteBuffer performance with GCM [v2]
On Wed, 7 Oct 2020 16:17:38 GMT, Xue-Lei Andrew Fan wrote: >> Anthony Scarpino has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Xuelei comments > > src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 664: > >> 662: * engineUpdate() and engineDoFinal(). >> 663: */ >> 664: private int bufferCrypt(ByteBuffer input, ByteBuffer output, > > It looks like this method is copied from the CipherSpi. For maintenance, it > would be nice to add an extra comment to > state the copy and update. For example, "this method and implementation is > copied from javax.crypto.CipherSpi, with an > improvement for GCM mode." Agree w/ Xuelei, it'd be nice to mention CipherSpi.bufferCrypt() method. In case a bug is found, it'd be checked and fixed in both places. - PR: https://git.openjdk.java.net/jdk/pull/411
Re: RFR: 8253821: Improve ByteBuffer performance with GCM [v2]
On Wed, 7 Oct 2020 22:38:21 GMT, Valerie Peng wrote: >> Anthony Scarpino has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Xuelei comments > > src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 939: > >> 937: // if the size of specified output buffer is less than >> 938: // the length of the cipher text, then the current >> 939: // content of cipher has to be preserved in order for > > This change is somewhat dangerous. When using the supplied output buffer > directly, you may corrupt its content w/ > padding bytes. This optimization should only be applied when no padding is > involved. In addition, input and output can > point to the same buffer with all sorts of index combinations, i.e. inOfs == > outOfs, inOfs < outOfs, inOfs > outOfs. > With the "outWithPadding" approach, no need to check. However, for > "internalOutput", data corruption may happen. This kind > of problem can be very hard to diagnose. We have to be very, very careful here as > this code may impact all ciphers... - I do not understand where the corruption comes from. The user provides a buffer that output is supposed to be placed into; what could it be corrupting? The existing tests (SameBuffer, in particular) work fine with this and the ByteBuffer calls. I spent a lot of time trying to get those same buffer tests to pass. - It was my expectation that checkOutputCapacity() is making sure all the inOfs ==<> outOfs combinations are going to work. Does that not catch all cases? - "outWithPadding" is excessive because it doubles the allocation for every operation followed by a copy to the original buffer, even if the original buffer was adequate. I'm ok with doing the extra alloc & copy in certain situations, but not every time. Can you be more specific about what things may fail that we don't already check for in the regression tests? 
> src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 658: > >> 656: BadPaddingException { >> 657: return bufferCrypt(input, output, false); >> 658: } > > Is the override of this method for using a different bufferCrypt impl? There > is also engineUpdate(ByteBuffer, > ByteBuffer) in CipherSpi, is there a reason for not overriding that here? Yes, thanks. The IDE covered that up by going to the one in CipherSpi and I thought it was calling the AES one. TLS doesn't use update(), so the perf numbers won't change. I'll have to run the tests again. > src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 744: > >> 742: } else { >> 743: return core.doFinal(input, output); >> 744: } > > It seems this block is the only difference between this method and > CipherSpi.bufferCrypt(). Have you considered moving > this special handling to the overridden engineDoFinal(...) method and not > duplicating the whole CipherSpi.bufferCrypt() > method here? BTW, instead of using the generic update/doFinal name and then > commenting them for GCM usage only, perhaps > it's more enticing to name them as gcmUpdate/gcmDoFinal? I didn't see a way to override this because CipherSpi is a public class; any methods I added would become a new API. Also bufferCrypt is private, so I had to copy it. CipherSpi does not know which mode is being used, but AESCipher does. Maybe I'm missing something; I'd rather it not be a copy, but I couldn't see a better way. If you have a specific idea, please give me details. - PR: https://git.openjdk.java.net/jdk/pull/411
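For context on the code path under discussion: `Cipher.doFinal(ByteBuffer, ByteBuffer)` is the public entry point that funnels into `CipherSpi.bufferCrypt()`. A minimal AES/GCM round trip over direct buffers, illustrating that path from the application side (the class name and buffer sizes here are my own, not code from the patch):

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class GcmByteBufferDemo {
    // Encrypts and decrypts 1 KB through the ByteBuffer overloads and
    // reports whether the plaintext survived the round trip.
    static boolean roundTrip() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        GCMParameterSpec spec = new GCMParameterSpec(128, iv);

        ByteBuffer pt = ByteBuffer.allocateDirect(1024);
        for (int i = 0; i < 1024; i++) pt.put((byte) i);
        pt.flip();

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, spec);
        ByteBuffer ct = ByteBuffer.allocateDirect(enc.getOutputSize(pt.remaining()));
        enc.doFinal(pt, ct);          // ByteBuffer path: CipherSpi.bufferCrypt()
        ct.flip();

        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, spec);
        ByteBuffer out = ByteBuffer.allocateDirect(dec.getOutputSize(ct.remaining()));
        dec.doFinal(ct, out);
        out.flip();

        if (out.remaining() != 1024) return false;
        for (int i = 0; i < 1024; i++) {
            if (out.get() != (byte) i) return false;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip() ? "roundtrip ok" : "mismatch");
    }
}
```

Before the patch, both `doFinal` calls above would copy buffer contents to byte arrays inside `bufferCrypt()`; the PR's point is to let GCM operate on the buffers directly.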
Re: RFR: 8242882: opening jar file with large manifest might throw NegativeArraySizeException [v2]
On Wed, 7 Oct 2020 21:40:43 GMT, Brent Christian wrote: >> I decided to slightly change the way this large manifest file was being >> created. I borrowed the idea from >> `Zip64SizeTest`[1] to create the file and set its length to a large value. I >> hope that is OK. If not, let me know, I >> will change this part. [1] >> https://github.com/openjdk/jdk/blob/master/test/jdk/java/util/zip/ZipFile/Zip64SizeTest.java#L121 > > I did some automated test runs, and the duration of this test is sufficiently > improved, IMO. > > While not representative of a real MANIFEST.MF file, I think it works well > enough for this specific test. Thank you for the review Brent. - PR: https://git.openjdk.java.net/jdk/pull/323
Re: RFR: 8253821: Improve ByteBuffer performance with GCM [v2]
> 8253821: Improve ByteBuffer performance with GCM Anthony Scarpino has updated the pull request incrementally with one additional commit since the last revision: Xuelei comments - Changes: - all: https://git.openjdk.java.net/jdk/pull/411/files - new: https://git.openjdk.java.net/jdk/pull/411/files/3a3a0296..b29c1ed5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=411&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=411&range=00-01 Stats: 17 lines in 6 files changed: 6 ins; 7 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/411.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/411/head:pull/411 PR: https://git.openjdk.java.net/jdk/pull/411
Re: RFR: 8242882: opening jar file with large manifest might throw NegativeArraySizeException [v3]
On Wed, 7 Oct 2020 21:40:57 GMT, Brent Christian wrote: >> Jaikiran Pai has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Second round of review comments addressed > > Marked as reviewed by bchristi (Reviewer). Hello Lance, does the latest state of this PR look fine to you? If so, shall I trigger an integrate? - PR: https://git.openjdk.java.net/jdk/pull/323
Re: RFR: 8242882: opening jar file with large manifest might throw NegativeArraySizeException [v2]
On Thu, 1 Oct 2020 14:39:50 GMT, Jaikiran Pai wrote: >> test/jdk/java/util/jar/JarFile/LargeManifestOOMTest.java line 78: >> >>> 76: bw.write("OOM-Test: "); >>> 77: for (long i = 0; i < 2147483648L; i++) { >>> 78: bw.write("a"); >> >> As you probably noticed, this test takes a little while to run. One way to >> speed it up a little would be to write more >> characters at a time. While we're at it, we may as well make the Manifest >> well-formed by breaking it into 72-byte >> lines. See "Line length" under: >> https://docs.oracle.com/en/java/javase/15/docs/specs/jar/jar.html#notes-on-manifest-and-signature-files >> Just write >> enough lines to exceed Integer.MAX_VALUE bytes. > > I decided to slightly change the way this large manifest file was being > created. I borrowed the idea from > `Zip64SizeTest`[1] to create the file and set its length to a large value. I > hope that is OK. If not, let me know, I > will change this part. [1] > https://github.com/openjdk/jdk/blob/master/test/jdk/java/util/zip/ZipFile/Zip64SizeTest.java#L121 I did some automated test runs, and the duration of this test is sufficiently improved, IMO. While not representative of a real MANIFEST.MF file, I think it works well enough for this specific test. - PR: https://git.openjdk.java.net/jdk/pull/323
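The `Zip64SizeTest` trick referred to above — creating the file and setting its length rather than writing every byte — can be sketched like this (`createSparseFile` is my own helper name, not code from the test):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class SparseFileDemo {
    // Sets a file to the requested length without writing the bytes; on
    // most filesystems this produces a sparse file, so even a
    // multi-gigabyte "manifest" is cheap and fast to create.
    static void createSparseFile(File f, long length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(length);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("big-manifest", ".mf");
        f.deleteOnExit();
        createSparseFile(f, 1_000_000L);  // small here; the test needs > 2 GB
        System.out.println(f.length());
    }
}
```

This is why the rewritten test runs so much faster than a loop writing one character at a time: almost no I/O actually happens.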
Re: RFR: 8242882: opening jar file with large manifest might throw NegativeArraySizeException [v3]
On Thu, 1 Oct 2020 14:42:21 GMT, Jaikiran Pai wrote: >> Can I please get a review and a sponsor for a fix for >> https://bugs.openjdk.java.net/browse/JDK-8242882? >> >> As noted in that JBS issue, if the size of the Manifest entry in the jar >> happens to be very large (such that it exceeds >> the `Integer.MAX_VALUE`), then the current code in `JarFile#getBytes` can >> lead to a `NegativeArraySizeException`. This >> is due to the: if (len != -1 && len <= 65535) block which evaluates to >> `true` when the size of the manifest entry is >> larger than `Integer.MAX_VALUE`. As a result, this then ends up calling the >> code which can lead to the >> `NegativeArraySizeException`. The commit in this PR fixes that issue by >> changing those `if/else` blocks to prevent >> this issue and instead use a code path that leads to the >> `InputStream#readAllBytes()` which internally has the >> necessary checks to throw the expected `OutOfMemoryError`. This commit also >> includes a jtreg test case which >> reproduces the issue and verifies the fix. > > Jaikiran Pai has updated the pull request incrementally with one additional > commit since the last revision: > > Second round of review comments addressed Marked as reviewed by bchristi (Reviewer). - PR: https://git.openjdk.java.net/jdk/pull/323
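The shape of the fix described above — route anything that is not a small, trusted size through `InputStream::readAllBytes` — can be sketched as follows (a hypothetical helper for illustration, not the actual `JarFile#getBytes` code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ManifestReadDemo {
    // A size above Integer.MAX_VALUE goes negative when cast to int, which
    // is how a new byte[(int) len] allocation can end up throwing
    // NegativeArraySizeException.
    static byte[] readEntryBytes(InputStream is, long entrySize) throws IOException {
        if (entrySize >= 0 && entrySize <= 65535) {
            // small, trusted size: preallocate exactly
            return is.readNBytes((int) entrySize);
        }
        // unknown or oversized entry: readAllBytes() carries its own
        // checks and throws OutOfMemoryError for absurd sizes instead
        return is.readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "Manifest-Version: 1.0\n".getBytes();
        // small path and "claimed > 2 GB" path both read the same stream
        byte[] small = readEntryBytes(new ByteArrayInputStream(data), data.length);
        byte[] large = readEntryBytes(new ByteArrayInputStream(data), 3_000_000_000L);
        System.out.println(small.length + " " + large.length);
    }
}
```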
Re: RFR: 8153005: Upgrade the default PKCS12 encryption/MAC algorithms
On Wed, 7 Oct 2020 22:20:07 GMT, Hai-May Chao wrote: >> Looks good. Only minor comments. > > CSR looks good. In "Specification" section: a typo in 'Thr iteration counts > used by'. At the end, it describes that the new > system property will override the security properties and use the older and > weaker algorithms, so I suggest we also > add text about setting the iteration counts to the default legacy values. CSR updated. More description, and iteration counts lowered to 1. Will update code soon. - PR: https://git.openjdk.java.net/jdk/pull/473
Re: RFR: 8153005: Upgrade the default PKCS12 encryption/MAC algorithms
On Thu, 1 Oct 2020 20:02:34 GMT, Weijun Wang wrote: > Default algorithms are bumped to be based on PBES2 with AES-256 and SHA-256. > Please also review the CSR at > https://bugs.openjdk.java.net/browse/JDK-8228481. Looks good. Only minor comments. src/java.base/share/classes/sun/security/pkcs12/PKCS12KeyStore.java line 103: > 101: = "PBEWithHmacSHA256AndAES_256"; > 102: private static final String DEFAULT_MAC_ALGORITHM = "HmacPBESHA256"; > 103: private static final int DEFAULT_PBE_ITERATION_COUNT = 5; As we have keystore.pkcs12.certPbeIterationCount and keystore.pkcs12.keyPbeIterationCount, I would like to suggest that we can define DEFAULT_CERT_PBE_ITERATION_COUNT and DEFAULT_KEY_PBE_ITERATION_COUNT, specifying each of the values for finer granularity. Same for LEGACY_PBE_ITERATION_COUNT. test/jdk/sun/security/mscapi/VeryLongAlias.java line 48: > 46: > 47: static String alias = String.format("%0512d", new > Random().nextInt(10)); > 48: Add bug number to @bug. - PR: https://git.openjdk.java.net/jdk/pull/473
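The finer-grained constants suggested in the review might look like the following sketch; the constant names mirror the review comment, and the values are placeholders for illustration, not the ones chosen in the patch:

```java
public class Pkcs12Defaults {
    // Hypothetical split of the single DEFAULT_PBE_ITERATION_COUNT into
    // one constant per security property, as the review suggests
    // (keystore.pkcs12.certPbeIterationCount / keyPbeIterationCount).
    static final int DEFAULT_CERT_PBE_ITERATION_COUNT = 10_000; // placeholder value
    static final int DEFAULT_KEY_PBE_ITERATION_COUNT = 10_000;  // placeholder value
    // and likewise per-property legacy values
    static final int LEGACY_PBE_ITERATION_COUNT = 50_000;       // placeholder value

    public static void main(String[] args) {
        System.out.println(DEFAULT_CERT_PBE_ITERATION_COUNT);
    }
}
```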
Re: RFR: 8153005: Upgrade the default PKCS12 encryption/MAC algorithms
On Wed, 7 Oct 2020 22:06:28 GMT, Hai-May Chao wrote: >> Default algorithms are bumped to be based on PBES2 with AES-256 and SHA-256. >> Please also review the CSR at >> https://bugs.openjdk.java.net/browse/JDK-8228481. > > test/jdk/sun/security/mscapi/VeryLongAlias.java line 48: > >> 46: >> 47: static String alias = String.format("%0512d", new >> Random().nextInt(10)); >> 48: > > Add bug number to @bug. OK. - PR: https://git.openjdk.java.net/jdk/pull/473
Re: RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator)
On Wed, 7 Oct 2020 17:13:22 GMT, Maurizio Cimadamore wrote: > This patch contains the changes associated with the third incubation round of > the foreign memory access API incubation > (see JEP 393 [1]). This iteration focuses on improving the usability of the API > in 3 main ways: > * first, by providing a way to obtain truly *shared* segments, which can be > accessed and closed concurrently from > multiple threads > * second, by providing a way to register a memory segment against a > `Cleaner`, so as to have some (optional) guarantee > that the memory will be deallocated, eventually > * third, by not requiring users to dive deep into var handles when they first > pick up the API; a new `MemoryAccess` class > has been added, which defines several useful dereference routines; these > are really just thin wrappers around memory > access var handles, but they make the barrier of entry for using this API > somewhat lower. > > A big conceptual shift that comes with this API refresh is that the role of > `MemorySegment` and `MemoryAddress` is not > the same as it used to be; it used to be the case that a memory address could > (sometimes, not always) have a back link > to the memory segment which originated it; additionally, memory access var > handles used `MemoryAddress` as a basic unit > of dereference. This has all changed as per this API refresh; now a > `MemoryAddress` is just a dumb carrier which > wraps a pair of object/long addressing coordinates; `MemorySegment` has > become the star of the show, as far as > dereferencing memory is concerned. You cannot dereference memory if you don't > have a segment. 
This improves usability > in a number of ways - first, it is a lot easier to wrap native addresses > (`long`, essentially) into a `MemoryAddress`; > secondly, it is crystal clear what a client has to do in order to dereference > memory: if a client has a segment, it can > use that; otherwise, if the client only has an address, it will have to > create a segment *unsafely* (this can be done > by calling `MemoryAddress::asSegmentRestricted`). A list of the API, > implementation and test changes is provided > below. If you have any questions, or need more detailed explanations, I (and > the rest of the Panama team) will be > happy to point at existing discussions, and/or to provide the feedback > required. A big thank you to Erik Osterlund, > Vladimir Ivanov and David Holmes, without whom the work on shared memory > segment would not have been possible; also I'd > like to thank Paul Sandoz, whose insights on API design have been very > helpful in this journey. Thanks Maurizio > Javadoc: > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html > Specdiff: > > http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html > > CSR: > > https://bugs.openjdk.java.net/browse/JDK-8254163 > > > > ### API Changes > > * `MemorySegment` > * drop factory for restricted segment (this has been moved to > `MemoryAddress`, see below) > * added a no-arg factory for a native restricted segment representing > entire native heap > * rename `withOwnerThread` to `handoff` > * add new `share` method, to create shared segments > * add new `registerCleaner` method, to register a segment against a cleaner > * add more helpers to create arrays from a segment e.g. 
`toIntArray` > * add some `asSlice` overloads (to make up for the fact that now segments > are more frequently used as cursors) > * rename `baseAddress` to `address` (so that `MemorySegment` can implement > `Addressable`) > * `MemoryAddress` > * drop `segment` accessor > * drop `rebase` method and replace it with `segmentOffset` which returns > the offset (a `long`) of this address relative > to a given segment > * `MemoryAccess` > * New class supporting several static dereference helpers; the helpers are > organized by carrier and access mode, where a > carrier is one of the usual suspects (a Java primitive, minus `boolean`); > the access mode can be simple (e.g. access > base address of given segment), or indexed, in which case the accessor > takes a segment and either a low-level byte > offset, or a high level logical index. The classification is reflected in > the naming scheme (e.g. `getByte` vs. > `getByteAtOffset` vs `getByteAtIndex`). > * `MemoryHandles` > * drop `withOffset` combinator > * drop `withStride` combinator > * the basic memory access handle factory now returns a var handle which > takes a `MemorySegment` and a `long` - from which > it is easy to derive all the other handles using plain var handle > combinators. > * `Addressable` > * This is a new interface which is attached to entities which can be > projected to a `MemoryAddress`. For now, both > `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP > 389 [2] to add more implementations.
Re: RFR: 8253821: Improve ByteBuffer performance with GCM
On Tue, 29 Sep 2020 20:22:55 GMT, Anthony Scarpino wrote: > 8253821: Improve ByteBuffer performance with GCM I will take a look as well. - PR: https://git.openjdk.java.net/jdk/pull/411
RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator)
This patch contains the changes associated with the third incubation round of the foreign memory access API incubation (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways: * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from multiple threads * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee that the memory will be deallocated, eventually * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class has been added, which defines several useful dereference routines; these are really just thin wrappers around memory access var handles, but they make the barrier of entry for using this API somewhat lower. A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit of dereference. This has all changed as per this API refresh; now a `MemoryAddress` is just a dumb carrier which wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`; secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done by calling `MemoryAddress::asSegmentRestricted`). 
A list of the API, implementation and test changes is provided below. If you have any questions, or need more detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to provide the feedback required. A big thank you to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segment would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful in this journey. Thanks Maurizio Javadoc: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html Specdiff: http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html CSR: https://bugs.openjdk.java.net/browse/JDK-8254163 ### API Changes * `MemorySegment` * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below) * added a no-arg factory for a native restricted segment representing entire native heap * rename `withOwnerThread` to `handoff` * add new `share` method, to create shared segments * add new `registerCleaner` method, to register a segment against a cleaner * add more helpers to create arrays from a segment e.g. `toIntArray` * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors) * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`) * `MemoryAddress` * drop `segment` accessor * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative to a given segment * `MemoryAccess` * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. 
access base address of given segment), or indexed, in which case the accessor takes a segment and either a low-level byte offset, or a high level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). * `MemoryHandles` * drop `withOffset` combinator * drop `withStride` combinator * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which it is easy to derive all the other handles using plain var handle combinators. * `Addressable` * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`. * `MemoryLayouts` * A new layout, for machine addresses, has been added to the mix. ### Implementation changes There
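The conceptual shift described above — the address as a dumb carrier, the segment as the only unit of dereference — can be illustrated with a toy model. These classes are invented purely for illustration; they are not the real `jdk.incubator.foreign` API:

```java
public class AddressSegmentDemo {
    // A "dumb carrier": just a raw addressing coordinate, with no back
    // link to any segment, mirroring the new role of MemoryAddress.
    static final class ToyAddress {
        final long rawOffset;
        ToyAddress(long rawOffset) { this.rawOffset = rawOffset; }
        // stands in for segmentOffset: this address relative to a segment
        long segmentOffset(ToySegment seg) { return rawOffset - seg.base; }
    }

    // Dereference is only possible through a segment, mirroring the new
    // role of MemorySegment.
    static final class ToySegment {
        final byte[] memory;  // backing storage standing in for native memory
        final long base;      // where the segment starts in the toy address space
        ToySegment(byte[] memory, long base) { this.memory = memory; this.base = base; }
        byte getByteAtOffset(long offset) { return memory[(int) offset]; }
    }

    public static void main(String[] args) {
        ToySegment seg = new ToySegment(new byte[] {10, 20, 30, 40}, 1000L);
        ToyAddress addr = new ToyAddress(1002L);  // points at the third byte
        // the address alone cannot be dereferenced; a segment must be supplied
        System.out.println(seg.getByteAtOffset(addr.segmentOffset(seg)));  // 30
    }
}
```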
Re: RFR: 8253821: Improve ByteBuffer performance with GCM
On Tue, 29 Sep 2020 20:22:55 GMT, Anthony Scarpino wrote: > 8253821: Improve ByteBuffer performance with GCM Impressive update and performance improvement! I have no major concerns; all comments are just about trivial details like indentation. src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 660: > 658: } > 659: > 660: /** There is one extra indent here. src/java.base/share/classes/com/sun/crypto/provider/AESCipher.java line 664: > 662: * engineUpdate() and engineDoFinal(). > 663: */ > 664: private int bufferCrypt(ByteBuffer input, ByteBuffer output, It looks like this method is copied from the CipherSpi. For maintenance, it would be nice to add an extra comment to state the copy and update. For example, "this method and implementation is copied from javax.crypto.CipherSpi, with an improvement for GCM mode." src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 947: > 945: // create temporary output buffer if the estimated size is > larger > 946: // than the user-provided buffer. > 947: if (output.length - outputOffset < estOutSize) { "outputCapacity" could be used to replace "output.length - outputOffset", and join the clause with the if-clause above. src/java.base/share/classes/com/sun/crypto/provider/CipherCore.java line 928: > 926: int outputCapacity = checkOutputCapacity(output, outputOffset, > 927: estOutSize); > 928: int offset = outputOffset; // 0 for decrypting the line comment, "// 0 for decrypting", could be removed as the expression has changed. src/java.base/share/classes/com/sun/crypto/provider/CounterMode.java line 28: > 26: package com.sun.crypto.provider; > 27: > 28: import java.nio.ByteBuffer; There is no other update except this line. The new import ByteBuffer may not be used. src/java.base/share/classes/com/sun/crypto/provider/FeedbackCipher.java line 250: > 248: * ByteBuffer methods should not be accessed as CipherCore and > AESCipher > 249: * copy the data to byte arrays. 
These methods are to satisfy the > compiler. > 250: * There is an extra blank comment line. src/java.base/share/classes/com/sun/crypto/provider/GHASH.java line 203: > 201: > 202: // Maximum buffer size rotating ByteBuffer->byte[] intrinsic copy > 203: static final int MAX_LEN = 1024; This field could be private. I would like to declare class fields at the beginning of a class, for easy eye-searching. Maybe it is nice to use a multiple of the block size (for example 64 * AES_BLOCK_SIZE), just in case someone else updates it to a weird value later. src/java.base/share/classes/com/sun/crypto/provider/GCTR.java line 144: > 142: > 143: // Maximum buffer size rotating ByteBuffer->byte[] intrinsic copy > 144: final int MAX_LEN = 1024; This field could be private. I would like to declare class fields at the beginning of a class, for easy eye-searching. Maybe it is nice to use a multiple of the block size (for example 64 * AES_BLOCK_SIZE), just in case someone else updates it to a weird value later. - Marked as reviewed by xuelei (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/411
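The reviewer's suggestion for the two `MAX_LEN` declarations could look like the following sketch (a standalone illustration; in the review's version the constant would be private, which is relaxed here only so it can be inspected):

```java
public class BufferCap {
    static final int AES_BLOCK_SIZE = 16;
    // Same value as the literal 1024, but its relationship to the block
    // size is explicit, so a later edit can't silently break the alignment.
    static final int MAX_LEN = 64 * AES_BLOCK_SIZE;

    public static void main(String[] args) {
        System.out.println(MAX_LEN);  // 1024
    }
}
```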
Re: RFR: 8253952: Refine ZipOutputStream.putNextEntry() to ignore ZipEntry's compressed size
On Wed, 7 Oct 2020 16:43:53 GMT, Volker Simonis wrote:

>> Alan makes a good point. Perhaps we keep things simple and focus solely on tweaking the description of the change in behavior which is described in your last paragraph
>
> I totally agree. What about using just the last sentence (as you've proposed) in the spec section and adding the other too as an @implNote? Do you think the last sentence will be enough?

I think we can just go with the last sentence/paragraph. Perhaps we can further simplify the paragraph/sentence with something like:

"The compressed entry size will be recalculated for compressed (DEFLATED) entries when ZipEntry::setCompressedSize has not been explicitly called on the ZipEntry."

or

"The compressed (DEFLATED) entry size will be recalculated when ZipEntry::setCompressedSize has not been explicitly called on the ZipEntry."

- PR: https://git.openjdk.java.net/jdk/pull/520
Re: RFR: 8253952: Refine ZipOutputStream.putNextEntry() to ignore ZipEntry's compressed size
On Wed, 7 Oct 2020 16:09:47 GMT, Lance Andersen wrote:

> ### Summary
>
> Work around wrong usage of `ZipOutputStream.putNextEntry()` in user code which can lead to the `ZipException "invalid entry compressed size"`.
>
> ### Motivation
>
> In general it is not safe to directly write a ZipEntry obtained from `ZipInputStream.getNextEntry()`, `ZipFile.entries()`, `ZipFile.getEntry()` or `ZipFile.stream()` with `ZipOutputStream.putNextEntry()` to a `ZipOutputStream` and then read the entry's data from the `ZipInputStream` and write it to the `ZipOutputStream` as follows:
>
>     ZipEntry entry;
>     ZipInputStream zis = new ZipInputStream(...);
>     ZipOutputStream zos = new ZipOutputStream(...);
>     while((entry = zis.getNextEntry()) != null) {
>         zos.putNextEntry(entry);
>         zis.transferTo(zos);
>     }
>
> The problem with this code is that the zip file format does not record the compression level used for deflation in its entries. In general, it doesn't even mandate a predefined compression ratio per compression level. Therefore the compressed size recorded in a `ZipEntry` read from a zip file might differ from the new compressed size produced by the receiving `ZipOutputStream`. Such a difference will result in a `ZipException` with the following message:
>
>     java.util.zip.ZipException: invalid entry compressed size (expected 12 but got 7 bytes)
>
> The correct way of copying all entries from one zip file into another requires the creation of a new `ZipEntry` or at least resetting of the compressed size field. E.g.:
>
>     while((entry = zis.getNextEntry()) != null) {
>         ZipEntry newEntry = new ZipEntry(entry.getName());
>         zos.putNextEntry(newEntry);
>         zis.transferTo(zos);
>     }
>
> or:
>
>     while((entry = zis.getNextEntry()) != null) {
>         entry.setCompressedSize(-1);
>         zos.putNextEntry(entry);
>         zis.transferTo(zos);
>     }
>
> Unfortunately, there's a lot of user code out there which gets this wrong and uses the bad coding pattern described before. Searching for `"java.util.zip.ZipException: invalid entry compressed size (expected 12 but got 7 bytes)"` gives ~2500 hits (~100 on StackOverflow). It's also not hard to find plenty of instances of this anti-pattern on GitHub when doing a code search for `ZipEntry` and `putNextEntry()`. E.g. [Gradle 4.x wrapper task][1] is affected as well as the latest version of the [mockableAndroidJar task][2]. I've recently fixed two occurrences of this pattern in OpenJDK (see [JDK-8240333][3] and [JDK-8240235][4]) but there still exist more of them (e.g. [`test/jdk/java/util/zip/ZipFile/CopyJar.java`][5] which has been there since 1999 :).
>
> ### Description
>
> So while this has clearly been a problem before, it apparently wasn't painful enough to trigger any action from the side of the JDK. However, recently quite a few zlib forks with [superior deflate/inflate performance][6] have evolved. Using them with OpenJDK is quite straightforward: one just has to configure the alternative implementations by setting `LD_LIBRARY_PATH` or `LD_PRELOAD` correspondingly. We've seen big savings by using these new zlib implementations for selected services in production, and the only reason why we haven't enabled them by default until now is the problem I've just described. The reason why these new libraries uncover the described anti-pattern much more often is that their compression ratio is slightly different from that of the default zlib library. This can easily trigger a `ZipException` even if an application is not using different compression levels but just a zip file created with another zlib version. I'd therefore like to propose the following workaround for the wrong `ZipOutputStream.putNextEntry()` usage in user code:
>
> - Ignore the compressed size if it was implicitly determined from the zip file and not explicitly set by calling `ZipEntry.setCompressedSize()`.
> - Change the API documentation of `ZipOutputStream.putNextEntry()` and `JarOutputStream.putNextEntry()` to explain the problem and why `putNextEntry()` will ignore the compressed size of a `ZipEntry` if that was set implicitly when reading that entry from a `ZipFile` or `ZipInputStream`.
>
> ### Technical Details
>
> A zip file consists of a stream of File Entries followed by a Central Directory (see [here for a more detailed specification][7]). Each File Entry is composed of a Local File Header (LFH) followed by the compressed Data and an optional Data Descriptor. The LFH contains the File Name and among other attributes the Compressed and Uncompressed size and CRC of the Data. In the case where the latter three attributes are not available at the time when the LFH is created, this fact will be recorded in a flag of the LFH and will trigger the creation of a Data Descriptor with the corresponding information right after the Data section.
Re: RFR: 8253952: Refine ZipOutputStream.putNextEntry() to ignore ZipEntry's compressed size
On Wed, 7 Oct 2020 15:34:32 GMT, Lance Andersen wrote:

>> src/java.base/share/classes/java/util/jar/JarOutputStream.java line 97:
>>
>>> 95: * and re-compute its value automatically after the associated data has been
>>> 96: * completely deflated.
>>> 97: *
>>
>> We probably need to do a bit of wordsmithing here.
>>
>> I think I'd drop the first two sentences and instead start a new paragraph here (a <p> tag) with "Unless explicitly set, the output stream will ignore ..." and see how that looks. Probably should be "the ZipEntry" rather than "a ZipEntry" to be consistent with the parameter description.
>
> Alan makes a good point. Perhaps we keep things simple and focus solely on tweaking the description of the change in behavior which is described in your last paragraph

I totally agree. What about using just the last sentence (as you've proposed) in the spec section and adding the other too as an @implNote? Do you think the last sentence will be enough?

- PR: https://git.openjdk.java.net/jdk/pull/520
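To make the proposal concrete, the reworked `putNextEntry()` documentation might be shaped roughly like this, with the simplified sentence as the normative text and the rationale demoted to an implementation note (an illustrative sketch only, not the actual patch wording):

```java
/**
 * ...existing putNextEntry() description...
 *
 * <p> The compressed (DEFLATED) entry size will be recalculated when
 * {@code ZipEntry::setCompressedSize} has not been explicitly called on
 * the ZipEntry.
 *
 * @implNote The zip file format does not record the compression level used
 * for deflation, so a compressed size read from an existing zip file may not
 * match the size this stream produces; ignoring an implicitly-set value
 * avoids a spurious {@code ZipException}.
 */
public void putNextEntry(ZipEntry e) throws IOException { ... }
```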
Re: RFR: 8253952: Work around wrong usage of ZipOutputStream.putNextEntry() in user code
On Tue, 6 Oct 2020 10:02:09 GMT, Volker Simonis wrote:

> ### Summary
>
> Work around wrong usage of `ZipOutputStream.putNextEntry()` in user code which can lead to the `ZipException "invalid entry compressed size"`.
>
> ### Motivation
>
> In general it is not safe to directly write a ZipEntry obtained from `ZipInputStream.getNextEntry()`, `ZipFile.entries()`, `ZipFile.getEntry()` or `ZipFile.stream()` with `ZipOutputStream.putNextEntry()` to a `ZipOutputStream` and then read the entry's data from the `ZipInputStream` and write it to the `ZipOutputStream` as follows:
>
>     ZipEntry entry;
>     ZipInputStream zis = new ZipInputStream(...);
>     ZipOutputStream zos = new ZipOutputStream(...);
>     while((entry = zis.getNextEntry()) != null) {
>         zos.putNextEntry(entry);
>         zis.transferTo(zos);
>     }
>
> The problem with this code is that the zip file format does not record the compression level used for deflation in its entries. In general, it doesn't even mandate a predefined compression ratio per compression level. Therefore the compressed size recorded in a `ZipEntry` read from a zip file might differ from the new compressed size produced by the receiving `ZipOutputStream`. Such a difference will result in a `ZipException` with the following message:
>
>     java.util.zip.ZipException: invalid entry compressed size (expected 12 but got 7 bytes)
>
> The correct way of copying all entries from one zip file into another requires the creation of a new `ZipEntry` or at least resetting of the compressed size field. E.g.:
>
>     while((entry = zis.getNextEntry()) != null) {
>         ZipEntry newEntry = new ZipEntry(entry.getName());
>         zos.putNextEntry(newEntry);
>         zis.transferTo(zos);
>     }
>
> or:
>
>     while((entry = zis.getNextEntry()) != null) {
>         entry.setCompressedSize(-1);
>         zos.putNextEntry(entry);
>         zis.transferTo(zos);
>     }
>
> Unfortunately, there's a lot of user code out there which gets this wrong and uses the bad coding pattern described before. Searching for `"java.util.zip.ZipException: invalid entry compressed size (expected 12 but got 7 bytes)"` gives ~2500 hits (~100 on StackOverflow). It's also not hard to find plenty of instances of this anti-pattern on GitHub when doing a code search for `ZipEntry` and `putNextEntry()`. E.g. [Gradle 4.x wrapper task][1] is affected as well as the latest version of the [mockableAndroidJar task][2]. I've recently fixed two occurrences of this pattern in OpenJDK (see [JDK-8240333][3] and [JDK-8240235][4]) but there still exist more of them (e.g. [`test/jdk/java/util/zip/ZipFile/CopyJar.java`][5] which has been there since 1999 :).
>
> ### Description
>
> So while this has clearly been a problem before, it apparently wasn't painful enough to trigger any action from the side of the JDK. However, recently quite a few zlib forks with [superior deflate/inflate performance][6] have evolved. Using them with OpenJDK is quite straightforward: one just has to configure the alternative implementations by setting `LD_LIBRARY_PATH` or `LD_PRELOAD` correspondingly. We've seen big savings by using these new zlib implementations for selected services in production, and the only reason why we haven't enabled them by default until now is the problem I've just described. The reason why these new libraries uncover the described anti-pattern much more often is that their compression ratio is slightly different from that of the default zlib library. This can easily trigger a `ZipException` even if an application is not using different compression levels but just a zip file created with another zlib version. I'd therefore like to propose the following workaround for the wrong `ZipOutputStream.putNextEntry()` usage in user code:
>
> - Ignore the compressed size if it was implicitly determined from the zip file and not explicitly set by calling `ZipEntry.setCompressedSize()`.
> - Change the API documentation of `ZipOutputStream.putNextEntry()` and `JarOutputStream.putNextEntry()` to explain the problem and why `putNextEntry()` will ignore the compressed size of a `ZipEntry` if that was set implicitly when reading that entry from a `ZipFile` or `ZipInputStream`.
>
> ### Technical Details
>
> A zip file consists of a stream of File Entries followed by a Central Directory (see [here for a more detailed specification][7]). Each File Entry is composed of a Local File Header (LFH) followed by the compressed Data and an optional Data Descriptor. The LFH contains the File Name and among other attributes the Compressed and Uncompressed size and CRC of the Data. In the case where the latter three attributes are not available at the time when the LFH is created, this fact will be recorded in a flag of the LFH and will trigger the creation of a Data Descriptor with the corresponding information right after the Data section.
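The second corrective pattern from the description (resetting the compressed size before re-writing the entry) can be exercised end-to-end with a small self-contained program. This is only a sketch; the class and helper names are invented for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipCopyDemo {

    /** Builds a zip with a single DEFLATED entry at the given compression level. */
    static byte[] makeZip(String name, byte[] data, int level) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.setLevel(level);
            zos.putNextEntry(new ZipEntry(name));
            zos.write(data);
            zos.closeEntry();
        }
        return bos.toByteArray();
    }

    /** Copies all entries into a new zip, resetting the compressed size so the
     *  receiving stream may recompress at a different ratio without tripping
     *  over a stale value. */
    static byte[] copyZip(byte[] zip) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip));
             ZipOutputStream zos = new ZipOutputStream(bos)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                entry.setCompressedSize(-1);   // forget the old compressed size
                zos.putNextEntry(entry);
                zis.transferTo(zos);
                zos.closeEntry();
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "data data data data".getBytes(StandardCharsets.UTF_8);
        byte[] original = makeZip("a.txt", data, Deflater.BEST_COMPRESSION);
        byte[] copy = copyZip(original);
        // The copy can be read back like any other zip file.
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(copy))) {
            System.out.println(zis.getNextEntry().getName()); // prints "a.txt"
        }
    }
}
```

Note that when the source is read through `ZipInputStream`, the sizes may not even be known until the entry's data has been fully consumed; the explicit `setCompressedSize(-1)` makes the copy robust regardless of whether the source was a streamed zip or a `ZipFile` with a populated Central Directory.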