On 05/23/2016 11:00 AM, Peter Levart wrote:
Hi Sherman,

On 05/23/2016 05:44 PM, Xueming Shen wrote:
It appears the scope of the change is bigger than the problem we are trying to 
solve here.

The ClassLoader(s)/Resource changes are there just to measure the impact if 
input streams opened from a jar file are closed explicitly as soon as possible 
when loading classes. This makes the cache/pool of Inflater's native resources 
more efficient. Those changes are not meant to be included in this patch - will 
be stripped off and proposed as a separate enhancement later.

There are also cleanups that are not absolutely necessary such as 
InflaterInputStream's package-private constructor with trusted subclasses using 
it so that usesDefaultInflater flag can be private and final in 
InflaterInputStream. They can be reverted if you don't like them.


The problem we are having here is that in certain use scenario the vm keeps a 
huge number
of zip/jar files open and in which makes even the minimal cache mechanism (keep 
even one
deflater for one zip file) not desirable.

Right. Even if vm keeps huge number of jar files open, it doesn't read from all 
of them at the same time, but typically a single thread only reads from one 
entry of one jar file at a time before going to the next one. So keeping an 
isolated cache of native resource that could be re-used to read from multiple 
jar files, per jar file, is not the most efficient way to cache it. It doesn't 
matter until the number of jar files is large and VMs with gigantic classpaths 
are an important use case.


It might be desirable to have a better cache mechanism, such as the 
"ObjectPool" approach you're
proposing here, to help in this situation. But I doubt if it is necessary to extend the 
"functionality"
to the rest of the package. For example, In most use scenario, we don't have 
such cache issue with
the deflater/inflater, a usual "open-end+finalize()" mechanism appears just 
fine for now, I don't
see a compelling reason to replace the existing mechanism with a new one. It 
might be better
to be conservative here to not make big change to something not proven broken.

As  presented in my 1st message in the thread, what Martin is seeing is a 
combination of several issues:

- class loading is not closing streams explicitly
- caching is not effective for Inflaters for which the streams are closed with 
finalizers because when finalizer invokes close() on stream, the 
WeakHashMap<InputStream, Inflater> is not keeping the InputStream any more, so 
the associated Inflater, while maybe still reachable in object graph, can not be 
retrieved (map.remove(stream) returns null).
- caching of Inflaters for streams that happen to be explicitly closed is not 
bounded so they may pile-up.

I claim that we must solve all 3 issues to make Martin happy again. :-)


It does not seem really necessary to have Deflater to be aware of the 
"ObjectPool" and the cleaner/
dealloctor, maybe all of these should be kept as an implementation inside the 
ObjectPool?  which
can have a cleaner to just invoke deflater.end() when needed?

Deflater is not aware of ObjectPool. It just uses Cleaner instead of finalize() 
because as a twin brother of Inflater, it would be nice for it to have the same 
cleanup mechanism. Cleaner is more efficient than finalization when the 
Deflater is explicitly end()-ed which is also a nice property. Otherwise the 
mechanism of Cleaner is similar to finalization. Cleaner is using 
PhantomReference(s) which are cleared when discovered and enqueued, while 
finalization is using FinalReference(s) which are not cleared and so present 
additional burden to CPU even in cases where Deflater is not explicitly 
end()ed. But if you don't like that, the changes to Deflater could be reverted 
back to using finalize().


My apology. I meant to say "to have Inflater to be aware of the ObjectPool...".
The proposed change switches from the finalizer to the ObjectPool to clean up
the native resource for Inflater. It appears to be a bigger change. Which has
a default 32/maxCacheSize and 1 secod keepAliveTime setting. It might have
a big impact to existing applications on how to use the Inflater. It would be
fine to have the ZipFile to arrange how to use the Inflater based on its use
scenario, but I'm concerned if it's the correct approach to change the default
behavior of Inflater, which might have nothing to do with ZipFile.

sherman

Even Inflater could use finalize() for what it is worth, combined with 
ObjectPool<ZStreamRef>, but why would we not take an opportunity and replace 
finalize() as we go? It's plain and simple. And Cleaner API was developed for just 
that purpose.

Reply via email to