Excuse me if this is the wrong list for this discussion.  Please direct me to 
the right place if this isn’t it.

While analysing garbage generation in our application, we discovered a 
significant number of redundant strings generated by the class loader.  In my 
case there are hundreds of jars on the classpath - everything in the 
application is a plugin.  I estimate that, on average, about 10kB of useless 
garbage char[]s is generated per findResource call for plugin resources.

This is caused mostly by the ZipFile implementation.  What is the purpose of 
java.util.zip.ZipCoder’s byte[] getBytes(String s) method?  It seems to be 
simply a custom implementation of String.getBytes(Charset cs), and as such it 
first needs to make a copy of the char[] to work on.  Combined with the need to 
operate on byte[] path names internally in the ZipFile implementation, this 
means that URLClassLoader generates a lot of unnecessary garbage in a 
findResource call, proportional to the number of jars on the classpath.

Since JarFile always forces the ZipFile to be opened with UTF-8, if some API 
were exposed that took a byte[] for the resource name, all of that extra 
string copying and encoding could be hoisted out of the loop in 
sun.misc.URLClassPath.  Would it be worth creating an internal class, 
something like a ‘ClasspathJarFile’, and tweaking ZipFile so that the byte[] 
based method is protected instead of private?
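
To make the idea concrete, here is a rough sketch of the difference between 
encoding the name once per jar and encoding it once per lookup.  It does not 
use the real ZipFile or URLClassPath code: ByteNameJar, jarWith and 
encodeCount are names made up for illustration, and the fake jars hold a 
single entry each.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class HoistedLookup {
    /** Hypothetical stand-in for a jar that can be probed by encoded entry name. */
    interface ByteNameJar {
        boolean hasEntry(byte[] utf8Name);
    }

    /** Counts encodings so the two strategies can be compared. */
    static int encodeCount = 0;

    static byte[] encode(String name) {
        encodeCount++;
        return name.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes the name again for every jar probed. */
    static int findPerJar(List<ByteNameJar> jars, String name) {
        for (int i = 0; i < jars.size(); i++) {
            if (jars.get(i).hasEntry(encode(name))) {
                return i;
            }
        }
        return -1;
    }

    /** Encodes the name once and reuses the byte[] for every jar. */
    static int findHoisted(List<ByteNameJar> jars, String name) {
        byte[] utf8Name = encode(name);
        for (int i = 0; i < jars.size(); i++) {
            if (jars.get(i).hasEntry(utf8Name)) {
                return i;
            }
        }
        return -1;
    }

    /** Builds a fake jar holding exactly one entry (illustration only). */
    static ByteNameJar jarWith(String entry) {
        byte[] stored = entry.getBytes(StandardCharsets.UTF_8);
        return probe -> Arrays.equals(stored, probe);
    }

    public static void main(String[] args) {
        List<ByteNameJar> jars = Arrays.asList(
                jarWith("a/A.class"), jarWith("b/B.class"), jarWith("c/C.class"));

        encodeCount = 0;
        int perJarIndex = findPerJar(jars, "c/C.class");
        int perJarEncodes = encodeCount;

        encodeCount = 0;
        int hoistedIndex = findHoisted(jars, "c/C.class");
        int hoistedEncodes = encodeCount;

        System.out.println("per-jar: index " + perJarIndex + ", encodes " + perJarEncodes);
        System.out.println("hoisted: index " + hoistedIndex + ", encodes " + hoistedEncodes);
    }
}
```

In the per-jar version the number of encodings grows with the number of jars 
probed before a hit; in the hoisted version it is always one.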

I also noticed that sun.net.www.ParseUtil.encodePath(String, boolean) usually 
had nothing useful to do but still made three copies of the string passed in 
anyway (two char arrays to work on, and the String returned).
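
One way to avoid those copies would be to scan the path first and return the 
original String untouched when nothing needs escaping.  The sketch below only 
illustrates that pattern: PathEncodeSketch and isSafe are made-up names, the 
set of safe characters is a simplified assumption, and none of this is 
ParseUtil’s actual logic.

```java
import java.nio.charset.StandardCharsets;

final class PathEncodeSketch {
    private static final String HEX = "0123456789ABCDEF";

    /** Simplified assumption: these characters never need escaping in a path. */
    static boolean isSafe(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || "/-_.~".indexOf(c) >= 0;
    }

    static String encodePath(String path) {
        // Pass 1: find the first character that needs escaping, allocating nothing.
        int firstUnsafe = -1;
        for (int i = 0; i < path.length(); i++) {
            if (!isSafe(path.charAt(i))) {
                firstUnsafe = i;
                break;
            }
        }
        if (firstUnsafe < 0) {
            return path; // common case: hand back the same String, no copies
        }

        // Slow path: copy the safe prefix, then percent-encode the rest as UTF-8.
        StringBuilder sb = new StringBuilder(path.length() + 16);
        sb.append(path, 0, firstUnsafe);
        byte[] rest = path.substring(firstUnsafe).getBytes(StandardCharsets.UTF_8);
        for (byte b : rest) {
            char c = (char) (b & 0xFF);
            if (c < 0x80 && isSafe(c)) {
                sb.append(c);
            } else {
                sb.append('%').append(HEX.charAt((b >> 4) & 0xF)).append(HEX.charAt(b & 0xF));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String clean = "com/example/Foo.class";
        System.out.println(encodePath(clean) == clean);
        System.out.println(encodePath("my dir/Foo.class"));
    }
}
```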



Cheers,

Scott
