Hi Martin,

On 05/19/2016 04:27 AM, Martin Buchholz wrote:
Another batch of ZipFile hackery.  I think I finally understand how
weak references and finalizers interact.
We could eliminate the Inflater cache entirely, but this proposal
keeps a one-element Inflater cache.

http://cr.openjdk.java.net/~martin/webrevs/openjdk9/ZipFile-Inflater-cache/
https://bugs.openjdk.java.net/browse/JDK-8156484

- fixes the memory bloat from cached Inflaters
- removes small races in close
- various other minor improvements

The fixes to races are (almost) good. Also other minor improvements and cleanups.

But I have reservation for the implementation of one-element global cache of Inflater. This cache can't be efficient. In order for cache to be efficient, majority of calls to ZipFile.getInputStream(zipEntry) would have to be accompanied by a corresponding explicit close() for the input stream before the WeakReference to the cached Inflater is cleared. We know that all WeakReference(s) to a referent are 1st cleared and then the finalize() method is called on it. GC clears all WeakReference(s) to interconnected graph of weakly-reachable referents atomically, so if the ZipFileInflaterInputStream is closed implicitly by the finalize() invocation which then tries to releaseInflater(), the inflater will not be released as the 'cachedInflater' WeakReference will already be cleared. It is too late to release inflater into cache anyway because if the input stream is already being finalized, then the associated inflater has either already been finalized or will be shortly as GC discovers and enqueues the FinalReference(s) pointing to inflater and corresponding input stream in the same GC cycle - Inflater and corresponding input stream become eligible for finalization at the same time. The "assert !inf.ended()" in releaseInflater() can therefore fail as final() methods on individual objects that are eligible for finalization may be invoked in arbitrary order.

To correct that and improve the efficiency of cache, we have to do two things:

- we can't fix all the usages of ZipFile.getInputStream(zipEntry) out there, but we can at least fix the usages in URLClassLoader and BuiltinClassLoader which amount to most getInputStream() invocations without corresponding close()s when classes are being loaded (one stream for each class).
- we can make the Inflater cache a bit smarter.

http://cr.openjdk.java.net/~plevart/jdk9-dev/ZipFile_Inflater.cleanup/webrev.01/

We don't actually need to cache the Inflater objects themselves, but rather Inflater's internal ZStreamRef(s) which are just objects holding address to native zlib's z_stream structure. I also suggest using new Cleaner API for tracking reachability of Inflater(s) so that Inflater.end() that results from explicit stream close() clears the PhantomCleanable and allows GC to refrain from pushing already end()-ed Inflater(s)/FinalReference(s) through the finalization pipeline. I also took the liberty to create a trusted package-private InflaterInputStream constructor so that trusted subclasses can use it and so that usesDefaultInflater flag can be private and final. Deflater is using ZStreamRef(s) too, so I also had to modify Deflater, but there's no caching of ZStreamRef(s) in deflater as there are too many different kinds of them and this is not critical.

By making ZStreamRef(s) take care of themselves, corresponding logic in ZipFile's streams is much simpler - no need for finalize() method any more on ZipFileInflaterInputStream. finalize() is also not needed on ZipFileInputStream any more as 'streams' WeakHashMap is not holding Inflater instances as values any more and so finalize() would serve only to remove finalized ZipFile[Inflater]InputStream(s) from the WeakHashMap which the map is doing by itself already (expunging).

java.util.zip tests pass with this patch except 2 tests that are ignored and ZipFile/TestZipFile.java that prints:

java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects

...which seems unrelated and the test passes when executed alone.

This touches quite a number of other classes, but it seems necessary.

So Martin, what do you think?

Regards, Peter

Reply via email to