On Mon, 28 Mar 2022 09:37:58 GMT, Volker Simonis <simo...@openjdk.org> wrote:

> Add an API note to `InflaterInputStream::read(byte[] b, int off, int len)` to 
> highlight that it might  write more bytes than the returned number of 
> inflated bytes into the buffer `b`.
> 
> The superclass `java.io.InputStream` specifies that `read(byte[] b, int off, 
> int len)` will leave the content beyond the last read byte in the read buffer 
> `b` unaffected. However, the overridden `read` method in 
> `InflaterInputStream` passes the read buffer `b` to `Inflater::inflate(byte[] 
> b, int off, int len)` which doesn't provide this guarantee. Depending on 
> implementation details, `Inflater::inflate` might write more than the 
> returned number of inflated bytes into the buffer `b`.
> 
> ### TL;DR
> 
> `java.util.zip.Inflater` is the Java wrapper class for zlib's inflater 
> functionality. `Inflater::inflate(byte[] output, int off, int len)` currently 
> calls zlib's native `inflate(..)` function and passes the address of 
> `output[off]` and `len` to it via JNI.
> 
> The specification of zlib's `inflate(..)` function (i.e. the [API 
> documentation in the original zlib 
> implementation](https://github.com/madler/zlib/blob/cacf7f1d4e3d44d871b605da3b647f07d718623f/zlib.h#L400))
>  doesn't give any guarantees with regard to usage of the output buffer. It 
> only states that upon completion the function will return the number of bytes 
> that have been written (i.e. "inflated") into the output buffer.
> 
> The original zlib implementation only wrote as many bytes into the output 
> buffer as it inflated. However, this is not a hard requirement and newer, 
> more performant implementations of the zlib library like 
> [zlib-chromium](https://chromium.googlesource.com/chromium/src/third_party/zlib/)
>  or [zlib-cloudflare](https://github.com/cloudflare/zlib) can use more bytes 
> of the output buffer than they actually inflate as a scratch buffer. See 
> https://github.com/simonis/zlib-chromium for a more detailed description of 
> their approach and its performance benefit.
> 
> These new zlib versions can still be used transparently from Java (e.g. by 
> putting them into the `LD_LIBRARY_PATH` or by using `LD_PRELOAD`), because 
> they still fully comply to specification of `Inflater::inflate(..)`. However, 
> we might run into problems when using the `Inflater` functionality from the 
> `InflaterInputStream` class. `InflaterInputStream` is derived from from 
> `InputStream` and as such, its `read(byte[] b, int off, int len)` method is 
> quite constrained. It specifically specifies that if *k* bytes have been 
> read, then "these bytes will be stored in elements `b[off]` through 
> `b[off+`*k*`-1]`, leaving elements `b[off+`*k*`]` through `b[off+len-1]` 
> **unaffected**". But `InflaterInputStream::read(byte[] b, int off, int len)` 
> (which is constrained by `InputStream::read(..)`'s specification) calls 
> `Inflater::inflate(byte[] b, int off, int len)` and directly passes its 
> output buffer down to the native zlib `inflate(..)` method which is free to 
> change the bytes beyond `b[off+`*
 k*`]` (where *k* is the number of inflated bytes).
> 
> From a practical point of view, I don't see this as a big problem, because 
> callers of `InflaterInputStream::read(byte[] b, int off, int len)` can never 
> know how many bytes will be written into the output buffer `b` (and in fact 
> its content can always be completely overwritten). It therefore makes no 
> sense to depend on any data there being untouched after the call. Also, 
> having used zlib-cloudflare productively for about two years, we haven't seen 
> real-world issues because of this behavior yet. However, from a specification 
> point of view it is easy to artificially construct a program which violates 
> `InflaterInputStream::read(..)`'s postcondition if using one of the 
> alterantive zlib implementations. A recently integrated JTreg test 
> (test/jdk/jdk/nio/zipfs/ZipFSOutputStreamTest.java) "unintentionally" fails 
> with zlib-chromium but can fixed easily to run with alternative 
> implementations as well (see 
> [JDK-8283756](https://bugs.openjdk.java.net/browse/JDK-8283756)).

Hi Volker,

I believe your PR should point to the [JBS 
issue](https://bugs.openjdk.java.net/browse/JDK-8282648) in the title,   which 
references the CSR and not the CSR directly in the title.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7986

Reply via email to