Hi Sherman,
Removed the check, http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.4
Cheers,
Staffan
On 06/23/2015 11:22 AM, Xueming Shen wrote:
Hi Staffan,
#527 check is probably unnecessary. The size and csize are 32-bit
unsigned integer, they
should never be < 0.
The rest looks good.
Thanks,
-Sherman
On 06/23/2015 10:54 AM, Staffan Friberg wrote:
Hi Sherman,
Thanks for the review. I removed the unused import and the changes to
reduce the lock region.
New webrev, http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.3
Thanks,
Staffan
On 06/08/2015 02:37 PM, Xueming Shen wrote:
Staffan,
(1) ByteArrayInputSteram is no longer needed in ZipFile
(2) You've changed the lock region in ZipFile.getInputSteram. Given
we are not
doing ByteArrayInpusteram for this method, can we just not
touch this method
and the class ZipFileInputSteram()?
The concern is that we did some changes in that area back to
2010 and triggered
a complicated race condition regression [1], it was finally
fixed after lot of rounds
of discussion. I still have all those emails in my inbox. It
would be better to keep
whatever works for now, instead of re-fresh all the memory
(read all those emails)
to figure out if the latest change might have a negative impact.
The getBytes() implementation looks good to me.
Thanks!
-Sherman
[1]
http://mail.openjdk.java.net/pipermail/core-libs-dev/2011-March/006355.html
On 06/05/2015 11:09 AM, Staffan Friberg wrote:
Hi Sherman,
I have a new webrev which reverts that part,
http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.2/
Summary of changes
Reduce lock region in ZipFile.getInputstream
Add private ZipFile.getBytes that can be used in select places
in the JDK where all bytes will be read
Could you sponsor this change once it has been reviewed?
Thanks,
Staffan
On 06/03/2015 10:45 AM, Xueming Shen wrote:
Staffan,
I'm not convinced that the benefit here is significant enough to
change the
getInputStream() to return a ByteArrayInputStream, given this can
be easily
achieved by wrapping the returned byte[] from getBytes(ZipEntry)
at user's
site. I would suggest to file a separate rfe on this disagreement
and move on
with the agreed getBytes() for now.
Thanks,
-Sherman
On 06/02/2015 10:27 AM, Staffan Friberg wrote:
On 05/22/2015 01:15 PM, Staffan Friberg wrote:
On 05/22/2015 11:51 AM, Xueming Shen wrote:
On 05/22/2015 11:41 AM, Staffan Friberg wrote:
On 05/21/2015 11:00 AM, Staffan Friberg wrote:
On 05/21/2015 09:48 AM, Staffan Friberg wrote:
On 05/20/2015 10:57 AM, Xueming Shen wrote:
On 05/18/2015 06:44 PM, Staffan Friberg wrote:
Hi,
Wanted to get reviews and feedback on this performance
improvement for reading from JAR/ZIP files during
classloading by reducing unnecessary copying and reading
the entry in one go instead of in small portions. This
shows a significant improvement when reading a single
entry and for a large application with 10k classes and
500+ JAR files it improved the startup time by 4%.
For more details on the background and performance results
please see the RFE entry.
RFE - https://bugs.openjdk.java.net/browse/JDK-8080640
WEBREV -
http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.0
Cheers,
Staffan
Hi Staffan,
If I did not miss something here, from your use scenario it
appears to me the only thing you really
need here to help boost your performance is
byte[] ZipFile.getAllBytes(ZipEntry ze);
You are allocating a byte[] at use side and wrapping it
with a ByteBuffer if the size is small enough,
otherwise, you letting the ZipFile to allocate a big enough
one for you. It does not look like you
can re-use that byte[] (has to be wrapped by the
ByteArrayInputStream and return), why do you
need two different methods here? The logic would be much
easier to simply let the ZipFile to allocate
the needed buffer with appropriate size, fill the bytes and
return, with a "OOME" if the entry size
is bigger than 2g.
The only thing we use from the input ze is its name, get
the size/csize from the jzentry, I don't think
jzentry.csize/size can be "unknown", they are from the
"cen" table.
If the real/final use of the bytes is to wrap it with a
ByteArrayInputStream,why bother using ByteBuffer
here? Shouldn't a direct byte[] with exactly the size of
the entry server better.
-Sherman
Hi Sherman,
Thanks for the comments. I agree, was starting out with
bytebuffer because I was hoping to be able to cache things
where the buffer was being used, but since the buffer is
past along further I couldn't figure out a clean way to do it.
Will rewrite it to simply just return a buffer, and only
wrap it in the Resource class getByteBuffer.
What would be your thought on updating the
ZipFile.getInputStream to return ByteArrayInputStream for
small entries? Currently I do that work outside in two
places and moving it would potentially speed up others
reading small entries as well.
Thanks,
Staffan
Just realized that my use of ByteArrayInputStream would miss
Jar verification if enabled so the way to go hear would be to
add it if possible to the ZipFile.getInputStream.
//Staffan
Hi,
Here is an updated webrev which uses a byte[] directly and
also uses ByteArrayInputStream in ZipFile for small entries
below 128k.
I'm not sure about the benefit of doing the
ByteArrayInputStream in ZipFile.getInputStream. It has
the consequence of changing the "expected" behavior of
getInputStream() (instead of return an
input stream waiting for reading, it now reads all bytes in
advance), something we might not want
to do in a performance tuning. Though it might be reasonable to
guess everyone get an input stream
is to read all bytes from it later.
-Sherman
http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.1
//Staffan
Agree that it will change the behavior slightly, but as you said
it is probably expected that some one will read the stream
eventually.
We could reduce the size further if that makes a difference, if
the size is below 65k we would not use more memory than the
buffer allocated for the InflaterStream today.
The total allocation would be slightly larger for deflated
entries as we would allocate a byte[] for the compressed bytes,
but it would be GC:able and not kept alive. So from a memory
perspective the difference is very limited.
//Staffan
Hi,
Bumping this thread to get some more comments on the concern
about changing the ZipFile.getInputStream behavior. The benefit
of doing this change is that any read of small entries from ZIP
and JAR files will be much faster and less resources will be
held, including native resources normally held by the
ZipInputStream.
The behavior change that will occur is that the full entry will
be read as part of creating the stream and not lazily as might be
expected. However when getting a today InputStream zip file will
be accessed to read information about the size of the entry, so
the zip file is already touched when getting an InputStream, but
not the compressed bytes.
I'm fine with removing this part of the change and just push the
private getBytes function and the updates to the JDK libraries to
use it.
Thanks,
Staffan