On Mon, 28 Mar 2022 09:37:58 GMT, Volker Simonis <simo...@openjdk.org> wrote:
> Add an API note to `InflaterInputStream::read(byte[] b, int off, int len)` to > highlight that it might write more bytes than the returned number of > inflated bytes into the buffer `b`. > > The superclass `java.io.InputStream` specifies that `read(byte[] b, int off, > int len)` will leave the content beyond the last read byte in the read buffer > `b` unaffected. However, the overridden `read` method in > `InflaterInputStream` passes the read buffer `b` to `Inflater::inflate(byte[] > b, int off, int len)` which doesn't provide this guarantee. Depending on > implementation details, `Inflater::inflate` might write more than the > returned number of inflated bytes into the buffer `b`. > > ### TL;DR > > `java.util.zip.Inflater` is the Java wrapper class for zlib's inflater > functionality. `Inflater::inflate(byte[] output, int off, int len)` currently > calls zlib's native `inflate(..)` function and passes the address of > `output[off]` and `len` to it via JNI. > > The specification of zlib's `inflate(..)` function (i.e. the [API > documentation in the original zlib > implementation](https://github.com/madler/zlib/blob/cacf7f1d4e3d44d871b605da3b647f07d718623f/zlib.h#L400)) > doesn't give any guarantees with regard to usage of the output buffer. It > only states that upon completion the function will return the number of bytes > that have been written (i.e. "inflated") into the output buffer. > > The original zlib implementation only wrote as many bytes into the output > buffer as it inflated. However, this is not a hard requirement and newer, > more performant implementations of the zlib library like > [zlib-chromium](https://chromium.googlesource.com/chromium/src/third_party/zlib/) > or [zlib-cloudflare](https://github.com/cloudflare/zlib) can use more bytes > of the output buffer than they actually inflate as a scratch buffer. See > https://github.com/simonis/zlib-chromium for a more detailed description of > their approach and its performance benefit. > > These new zlib versions can still be used transparently from Java (e.g. by > putting them into the `LD_LIBRARY_PATH` or by using `LD_PRELOAD`), because > they still fully comply to specification of `Inflater::inflate(..)`. However, > we might run into problems when using the `Inflater` functionality from the > `InflaterInputStream` class. `InflaterInputStream` is derived from from > `InputStream` and as such, its `read(byte[] b, int off, int len)` method is > quite constrained. It specifically specifies that if *k* bytes have been > read, then "these bytes will be stored in elements `b[off]` through > `b[off+`*k*`-1]`, leaving elements `b[off+`*k*`]` through `b[off+len-1]` > **unaffected**". But `InflaterInputStream::read(byte[] b, int off, int len)` > (which is constrained by `InputStream::read(..)`'s specification) calls > `Inflater::inflate(byte[] b, int off, int len)` and directly passes its > output buffer down to the native zlib `inflate(..)` method which is free to > change the bytes beyond `b[off+`* k*`]` (where *k* is the number of inflated bytes). > > From a practical point of view, I don't see this as a big problem, because > callers of `InflaterInputStream::read(byte[] b, int off, int len)` can never > know how many bytes will be written into the output buffer `b` (and in fact > its content can always be completely overwritten). It therefore makes no > sense to depend on any data there being untouched after the call. Also, > having used zlib-cloudflare productively for about two years, we haven't seen > real-world issues because of this behavior yet. However, from a specification > point of view it is easy to artificially construct a program which violates > `InflaterInputStream::read(..)`'s postcondition if using one of the > alterantive zlib implementations. A recently integrated JTreg test > (test/jdk/jdk/nio/zipfs/ZipFSOutputStreamTest.java) "unintentionally" fails > with zlib-chromium but can fixed easily to run with alternative > implementations as well (see > [JDK-8283756](https://bugs.openjdk.java.net/browse/JDK-8283756)). Hi Volker, I believe your PR should point to the [JBS issue](https://bugs.openjdk.java.net/browse/JDK-8282648) in the title, which references the CSR and not the CSR directly in the title. ------------- PR: https://git.openjdk.java.net/jdk/pull/7986