On 05/22/2015 11:41 AM, Staffan Friberg wrote:
On 05/21/2015 11:00 AM, Staffan Friberg wrote:
On 05/21/2015 09:48 AM, Staffan Friberg wrote:
On 05/20/2015 10:57 AM, Xueming Shen wrote:
On 05/18/2015 06:44 PM, Staffan Friberg wrote:
Hi,
Wanted to get reviews and feedback on this performance improvement for reading
from JAR/ZIP files during classloading by reducing unnecessary copying and
reading the entry in one go instead of in small portions. This shows a
significant improvement when reading a single entry and for a large application
with 10k classes and 500+ JAR files it improved the startup time by 4%.
For more details on the background and performance results please see the RFE
entry.
RFE - https://bugs.openjdk.java.net/browse/JDK-8080640
WEBREV - http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.0
Cheers,
Staffan
Hi Staffan,
If I did not miss something here, from your use scenario it appears to me the
only thing you really
need here to help boost your performance is
byte[] ZipFile.getAllBytes(ZipEntry ze);
You are allocating a byte[] at use side and wrapping it with a ByteBuffer if
the size is small enough,
otherwise, you letting the ZipFile to allocate a big enough one for you. It
does not look like you
can re-use that byte[] (has to be wrapped by the ByteArrayInputStream and
return), why do you
need two different methods here? The logic would be much easier to simply let
the ZipFile to allocate
the needed buffer with appropriate size, fill the bytes and return, with a
"OOME" if the entry size
is bigger than 2g.
The only thing we use from the input ze is its name, get the size/csize from
the jzentry, I don't think
jzentry.csize/size can be "unknown", they are from the "cen" table.
If the real/final use of the bytes is to wrap it with a
ByteArrayInputStream,why bother using ByteBuffer
here? Shouldn't a direct byte[] with exactly the size of the entry server
better.
-Sherman
Hi Sherman,
Thanks for the comments. I agree, was starting out with bytebuffer because I
was hoping to be able to cache things where the buffer was being used, but
since the buffer is past along further I couldn't figure out a clean way to do
it.
Will rewrite it to simply just return a buffer, and only wrap it in the
Resource class getByteBuffer.
What would be your thought on updating the ZipFile.getInputStream to return
ByteArrayInputStream for small entries? Currently I do that work outside in two
places and moving it would potentially speed up others reading small entries as
well.
Thanks,
Staffan
Just realized that my use of ByteArrayInputStream would miss Jar verification
if enabled so the way to go hear would be to add it if possible to the
ZipFile.getInputStream.
//Staffan
Hi,
Here is an updated webrev which uses a byte[] directly and also uses
ByteArrayInputStream in ZipFile for small entries below 128k.
I'm not sure about the benefit of doing the ByteArrayInputStream in
ZipFile.getInputStream. It has
the consequence of changing the "expected" behavior of getInputStream()
(instead of return an
input stream waiting for reading, it now reads all bytes in advance), something
we might not want
to do in a performance tuning. Though it might be reasonable to guess everyone
get an input stream
is to read all bytes from it later.
-Sherman
http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.1
//Staffan