Hi Adrian,
I looked briefly at your code and claims and have some comments inline...
On 09/13/2015 11:08 AM, Adrian wrote:
Hi,
I posted about this earlier (few weeks ago), and got some responses
about concerns which I addressed in my last email, though I didn't
hear back about it.
My apologies if I shouldn't be sending this; I'm not sure what the
protocol is about this stuff
Classloading on a standard Java application with jars on the classpath
currently does a linear search through every jar on the classpath, and
every entry in a jar, for every class loaded.
As URLClassPath searches for an entry/resource/class, it's possible to
cache each entry it encounters -> where to find it, so in the future
if a resource has already been seen we don't need to repeat the ~2d
search
Original thread (august list):
http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-August/035009.html
Last message (september list):
http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-September/035016.html
I got 3 responses: 2 concerning changes to jars at runtime
(invalidating cache), and 1 saying you're not supposed to modify jars
at runtime (can confirm via source code, and manual testing - it
crashes the JVM)
In my last message I addressed
- jars being modified (which you're not supposed to do; the current
classloader does not handle this) or the classpath changing (only
possible if you make public fields/methods via reflection, and this
could easily be handled gracefully)
- some details of the finding resource process (e.g. the meta index,
when the cache for jar entries can't be used because of the semantics
of other loaders/types of entries on the classpath)
- a reference implementation of caching that I believe is simple and
compliant with existing functionality
- some basic numbers on performance
---
So in this email I wanted to explain the problem again, hopefully more
clearly now
URLClassPath is used by URLClassLoader to find classes, though it
could be used for finding any resource on a classpath.
URLClassPath keeps an array of URLs, which are typically folders or
jars on the local filesystem.
They can be http or ftp or other files, but that's not
relevant/doesn't affect this problem
To find a resource/class (URLClassPath#getResource), it:
1. Loops through the URLs in order
2. Creates Loader objects for each URL lazily (URLClassPath$FileLoader
or URLClassPath$JarLoader). So if the Loader for the first URL finds
all the resources, Loaders for the remaining entries on the classpath
are never created/looked at
3. Calls Loader#getResource and returns the resource if found
(otherwise keep searching)
URLClassPath$JarLoader creates its corresponding JarFile either in the
constructor or in getResource (depending on the meta index - the
details I explained in my last email I won't repeat)
When a JarFile is created, it opens the jar file on the file system,
reads the central directory of the jar/zip file, and creates an
internal linked list with all its entries
It's not actually a linked list, but a hash table. See
jdk/src/java.base/share/native/libzip/zip_util.[c,h] ...
the jzfile.table is an array of jint mapping (entry-name-hash-code %
table.length) -> index into jzfile.entries
the jzfile.entries is an array of jzcell which represent heads of hash
buckets that are linked with jzcell.next
So look-ups into individual jar/zip files should be O(1). But each
look-up does cross the Java/JNI boundary, so it has a fixed JNI overhead
too. If there are lots of jar files in class path, you pay for the
unsuccessful hash-table look-ups before finding the resource. This
overhead is not that big, but increases with the number of jar files in
class path. Each look-up into class path is therefore O(N) where N = #
of jar files...
JarFile objects are immutable; you can only open them for read/delete
(see constructor API
http://docs.oracle.com/javase/7/docs/api/java/util/jar/JarFile.html#JarFile(java.io.File,%20boolean,%20int)
), they do not detect if the file has been modified externally, and
you only "append" or "delete" entries by creating a new jar
(JarOutputStream)
When URLClassPath$JarLoader#getResource calls JarFile#getEntry; in the
C code it searches through the linked list
(jdk/src/share/native/java/util/zip/zip_util.c, ZIP_GetEntry - jar
files are just zip files, and the java JarFile object just extends
ZipFile)
Since the order in which jar files and jar entries are searched is
invariant, we can create a map of resource -> first jar which contains
it
However, we don't want to introduce additional overhead.
When a JarFile is created, it already builds the internal linked list
- it's O(number of entries)
I propose that after the JarLoader creates the JarFile, it iterates
through its entries and adds them to the cache (if the map does not
already contain the resource, i.e. an earlier jar contains the
resource)
This adds a small overhead when instantiating loaders - but creating
the JarLoader/JarFile is still technically O(number of entries), and
now getResource is constant time, instead of requiring a linear search
through every jar and the linked list of entries for each jar
(O(number of jars * entries/jar))
Have you measured your entry iteration and cache initialization
overhead? When iterating over JarEntries, the code invokes 10 native
methods for each returned JarEntry:
long jzentry = getNextEntry(jzfile, i++)
getEntryFlag(jzentry);
getEntryBytes(jzentry, JZENTRY_NAME)
getEntryTime(jzentry)
getEntryCrc(jzentry)
getEntrySize(jzentry)
getEntryCSize(jzentry)
getEntryMethod(jzentry)
getEntryBytes(jzentry, JZENTRY_EXTRA)
getEntryBytes(jzentry, JZENTRY_COMMENT)
...it uses CharsetDecoder 1 or 2 times to decode the name and optional
comment of each JarEntry, it allocates several java objects...
So I have a feeling that initializing your cache when JarFiles are
constructed, would increase start-up time and not decrease it. It may
pay-of on the long run though. For achieving short start-up time, JDK
tries to be as lazy as possible. Your cache goes against that.
ZipFile native code tries hard to keep the memory usage down for
maintaining the look-up hash table. It only keeps hash codes in the
table, with offsets into memory mapped ZIP central directory for each
entry. Your proposed java-side cache is also a hash table, but it's
memory footprint is much bigger.
I have made a little experiment to see if JNI overhead for negative
answers from ZipFile.getEntry(name) is really that big. I modified
ZipFile.java/ZipFile.c to expose access to a look-up hash table that is
maintained in native code to the java side via two direct ByteBuffers. I
screen each ZipFile.getEntry(name) with a probe that gives a negative
answer for majority of negative lookups without invoking JNI methods.
Here are the changes I made for this experiment:
http://cr.openjdk.java.net/~plevart/jdk9-dev/ZipFile.getEntry/webrev.01/
I have tested the changed JDK with the following JMH benchmark:
http://cr.openjdk.java.net/~plevart/jdk9-dev/ZipFile.getEntry/ZipFileBench.java
This benchmark measures the ZipFile.getEntry(name) method on 2 jar files
(rt.jar ~20K entries, idea.jar ~40K entries) looking up entries in 1st
jar file with names of the entries taken from the 2nd jar file:
Original:
Benchmark (_zipTuple) Mode Samples Score
Error Units
j.t.ZipFileBench.getEntry rt.jar:rt.jar avgt 8 721.536 ±
17.344 ns/op
j.t.ZipFileBench.getEntry rt.jar:idea.jar avgt 8 501.451 ±
10.298 ns/op
j.t.ZipFileBench.getEntry idea.jar:rt.jar avgt 8 423.513 ±
23.268 ns/op
Patched:
Benchmark (_zipTuple) Mode Samples Score
Error Units
j.t.ZipFileBench.getEntry rt.jar:rt.jar avgt 8 743.281 ±
13.467 ns/op
j.t.ZipFileBench.getEntry rt.jar:idea.jar avgt 8 392.569 ±
7.710 ns/op
j.t.ZipFileBench.getEntry idea.jar:rt.jar avgt 8 333.249 ±
14.069 ns/op
The rt.jar:rt.jar therefore gives the score for successful look-ups,
while rt.jar:idea.jar and idea.jar:rt.jar give the score for
unsuccessful look-ups.
The screening of JNI call improves unsuccessful lookup by 20-25% while
not hurting much the successful lookup. The successful look-up could be
improved so that it wouldn't have any overhead by utilizing the result
of the screening probe that must now be re-computed in native code once
more (see TODO).
So is this worth the trouble? I don't know.
Adrian, does this patch improve your situation (don't forget to set the
system property -Dsun.zip.useNativeTableProbe=true to enable the
feature)? This patch should not hurt startup performance, as it does not
maintain or initialize any additional data structures (besides 2
ByteBuffer objects per ZipFile that only expose native memory to Java code).
Regards, Peter
There are several caveats when the cache cannot be used with non-jar
URLs on the classpath, and the meta index, but I explain them in my
last email along with comments in the reference implementation
---
Regarding modified jars:
- moved/renamed: the file handle is still valid and it doesn't affect
the JVM/classloading
- deleted: the file handle is still valid and it doesn't affect the
JVM/classloading
- modified: the JVM crashes
The first two may not be intuitive, but remember that file handles
point to files; not paths on the filesystem.
So even though a jar appears renamed in the shell, the java process
has opened a file, somewhere in the c implementation of file objects
it has the file handle, and when the JVM does the system call read on
the file handle say to read the class from the jar file, it all works
fine
For what it's worth, here's a stack overflow answer as "source":
http://stackoverflow.com/questions/2028874/what-happens-to-an-open-file-handler-on-linux-if-the-pointed-file-gets-moved-de
---
There is a protected method URLClassLoader#addURL which appends a URL
to the classpath.
People could use reflection to make it public.
Because jars are opened lazily and the cache is also built lazily
whenever a jar is opened, it doesn't matter if paths are appended
Regarding people making extensive use of reflection to modify the
order of entries on the classpath, I believe that's irrelevant as
that's clearly not the semantic of URLClassLoader/URLClassPath.
People who need custom classloading rules create custom classloaders;
that's the purpose of classloaders being extensible
---
Anyways, I hope this was discussion worthy.
I've looked much into this and believe I haven't missed anything, but
if someone knows why it hasn't/can't be done any insight would be
appreciated!
Alan from the last email thread said "There was a lot of experiments
and prototypes in the past along these lines" - are the results
public?
He also mentioned improving classloading in Java's upcoming module
system (originally planned for Java 7, currently delayed to Java 9),
but I believe the algorithmic complexity and performance of
URLClassLoader could be improved without complicated changes
Please let me know what you think, and thanks for your time!
Best regards,
Adrian