Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater

2018-02-19 Thread David Lloyd
On Sun, Feb 18, 2018 at 3:33 AM, Alan Bateman  wrote:
> Thanks for bringing this one up again

Thanks for taking the time to review.

> I see Sherman is looking at implementation so I'll stick with the javadoc
> for now. At some point it will need a security review to ensure that there
> no possibility of tricking the implementation to access memory outside of
> the input/output. That is, we have to assume the user of
> setInput(ByteBuffer) is evil and will change the position/limit during
> deflate operations. I see the patch computes clamps the remainder before
> accessing memory, it will just need a closer look to make sure there are no
> issues. The patch will also need adjustments to make it consistent with the
> existing code but that can come later.

I did write the code with this in mind: that the address should always
be within the bounds of the buffer (be it heap or direct) at the time
of the call, and that the offset into the buffer should not be beyond
its end (or beginning).  So the effect could be a thrown exception but
escaping the buffer _should_ be impossible (by intent anyway; more
eyes are always better for noticing mistakes of course).

> On the APIs then the inflate and deflate methods look okay, I'll less sure
> that  setDcitionary(ByteBuffer) is really needed. Are you adding for
> setDcitionary(ByteBuffer) for consistency?

Yes; it was easy enough to add it so I did.

> The javadoc doesn't currently specify how it works with the buffer, e.g.
> inflate(ByteBuffer) doesn't specify that adds it bytes to the buffer
> starting at its position, it doesn't say if the position is adjusted. The
> javadoc will also need to set expectations on behavior when
> DataFormatException is thrown, is the position guaranteed to be unchanged or
> is it unspecified?

I intended for it to work similarly to the old code.  But I'll go
through the zlib docs and just make sure all the assumptions are still
correct, and the next version will have more concise docs in this
regard and also with regards to the disposition of the buffer in each
case.

-- 
- DML


Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater

2018-02-19 Thread David Lloyd
On Sat, Feb 17, 2018 at 3:18 PM, Xueming Shen  wrote:
> Hi David,
>
> Thanks for taking this one :-) some comments here.

Thanks for the review!

> (1) I would assume you might have to do more for ByteBuffer, something like
> [...]
> btw, any reason go unsafe to get the byte[]? instead of
> ByteBuffer.getArray()?

Yes: input should always be settable from a read-only heap buffer
without a performance penalty (i.e. copying the contents).  This btw
is why there is also an explicit check for read-only later on.

> (2) It might be a concerned that you have to warp the input byte[] every time
>  the inflate/deflate(byte[]...) is called. (Yes, I'm constantly hearing 
> people
>  complain that you have to wrap the byte[] to ByteBuffer to use CharsetEn/
>  Decoder, or even the implementation detail inside StringCoder), and it 
> might
>  be taken as a "regression". So it might be desired to wire the 
> "bf.hasArray())
>  path inside de/encode(ByteArray) back to de/encode(byte[], int, int), to 
> avoid
>  the "unnecessary" ByteBuffer wrap.

I can do some testing to see if there is an impact.  I am working on
the basis that the wrap may be optimized away (as it is in certain
similar cases), but Inflater/Deflater is not final (nor are the
inflate/deflate methods) that might not be true in practice.

> (3) Same "wrap concern" argument might go to the setInput(bye[] ...) as well.
>  I'm not sure if it is worth keeping both byte[]/int/int and ByteBuffer 
> as the
>  "input" field of In/Deflater.

Since input is stored on the object, the above theoretical
optimization is much less likely.  Again I can do some testing; it
might be a good idea to have separate byte[]/int/int and ByteBuffer in
the end, though the added expense of checking for and updating the
ByteBuffer position after each call in addition to the byte[]/int/int
might nullify the benefit.  Testing is required...

> (4) assume we keep the "wrap" approach. It appears ByteBuffer.wrap() does
> check the byte[]/off/len and throw an IndexOutOfBoundsException. So it might
> be better to take advantage of that check.

I wanted to however it throws the wrong exception type; I could catch
and rethrow I guess though if you think that is better (assuming we
keep the "wrap" approach as you say).

> (5) Deflater.input need to be initialized to a non-null value.
>  Btw ZipUtil.defaultBuf needs to be "direct"?

It doesn't need to be, but the direct buffer code path is possibly
somewhat "friendlier" to GC since there's no GetPrimitiveArrayCritical
call.

> (6) It might be desired to have some jmh measure to make sure byte[] case
> does not have regression.

I'll work on this when I get a chance (maybe not until Friday though).

One other thought I had this weekend was that I could possibly improve
things by getting the direct buffer address on the Java side, and pass
it in to JNI to avoid the calls to GetDirectBufferAddress.  Another
thought was to eliminate the remaining SetBooleanField calls by using
the remaining two bits in the doInflate/doDeflate methods' return
values.  I'm not sure how expensive these calls are though.  I could
also replace the JNI field reads with more method parameters if this
is a valuable thing to do.

-- 
- DML


Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater

2018-02-18 Thread Alan Bateman

On 16/02/2018 22:13, David Lloyd wrote:

It would be convenient to be able to inflate/deflate a direct or heap
byte buffer without having to copy it through an array first.  For my
Friday mini-project this week, I've decided to take a crack at this.
The attached patch is the result.  Would anyone be interested in
reviewing and maybe sponsoring this change?  It would be really great
to get it in to JDK 11.

The patch includes a modification to FlaterTest to run through
permutations of using direct and heap byte buffers.  Though I couldn't
get jtreg working, I did compile and run the test by hand and it seems
to pass; the build also works fine with the new code.

Extra thanks to Martin Balao for pointing me towards mapfile-vers when
I couldn't figure out why I was getting UnsatisfiedLinkError.  That
one was not obvious.
Thanks for bringing this one up again, I think the last time that it was 
discussed here was in 2012 when Martin Kirst proposed a patch to add 
these methods. If you go through the archives then you'll see there were 
several issues with that proposal before the discussion petered out.


I see Sherman is looking at implementation so I'll stick with the 
javadoc for now. At some point it will need a security review to ensure 
that there no possibility of tricking the implementation to access 
memory outside of the input/output. That is, we have to assume the user 
of setInput(ByteBuffer) is evil and will change the position/limit 
during deflate operations. I see the patch computes clamps the remainder 
before accessing memory, it will just need a closer look to make sure 
there are no issues. The patch will also need adjustments to make it 
consistent with the existing code but that can come later.


On the APIs then the inflate and deflate methods look okay, I'll less 
sure that  setDcitionary(ByteBuffer) is really needed. Are you adding 
for setDcitionary(ByteBuffer) for consistency?


The javadoc doesn't currently specify how it works with the buffer, e.g. 
inflate(ByteBuffer) doesn't specify that adds it bytes to the buffer 
starting at its position, it doesn't say if the position is adjusted. 
The javadoc will also need to set expectations on behavior when 
DataFormatException is thrown, is the position guaranteed to be 
unchanged or is it unspecified?


-Alan


Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater

2018-02-17 Thread Xueming Shen

On 2/16/18, 2:13 PM, David Lloyd wrote:

It would be convenient to be able to inflate/deflate a direct or heap
byte buffer without having to copy it through an array first.  For my
Friday mini-project this week, I've decided to take a crack at this.
The attached patch is the result.  Would anyone be interested in
reviewing and maybe sponsoring this change?  It would be really great
to get it in to JDK 11.

The patch includes a modification to FlaterTest to run through
permutations of using direct and heap byte buffers.  Though I couldn't
get jtreg working, I did compile and run the test by hand and it seems
to pass; the build also works fine with the new code.

Extra thanks to Martin Balao for pointing me towards mapfile-vers when
I couldn't figure out why I was getting UnsatisfiedLinkError.  That
one was not obvious.

Hi David,

Thanks for taking this one :-) some comments here.

(1) I would assume you might have to do more for ByteBuffer, something like

if (bf.isDirect()) {
// GetDirectBufferAddress
...
} else if (bf.hasArray()) {
byte[] array = bf.getArray();
do(bf.getArray(), offset + pos, pos - limit);
...
} else {
// probably still have to copy the bytes out of ByteBuffer
...
}

btw, any reason go unsafe to get the byte[]? instead of 
ByteBuffer.getArray()?


(2) It might be a concerned that you have to warp the input byte[] every 
time
 the inflate/deflate(byte[]...) is called. (Yes, I'm constantly 
hearing people
 complain that you have to wrap the byte[] to ByteBuffer to use 
CharsetEn/
 Decoder, or even the implementation detail inside StringCoder), 
and it might
 be taken as a "regression". So it might be desired to wire the 
"bf.hasArray())
 path inside de/encode(ByteArray) back to de/encode(byte[], int, 
int), to avoid

 the "unnecessary" ByteBuffer wrap.

(3) Same "wrap concern" argument might go to the setInput(bye[] ...) as 
well.
 I'm not sure if it is worth keeping both byte[]/int/int and 
ByteBuffer as the

 "input" field of In/Deflater.

(4) assume we keep the "wrap" approach. It appears ByteBuffer.wrap() 
does check
 the byte[]/off/len and throw an IndexOutOfBoundsException. So it 
might be

 better to take advantage of that check.

(5) Deflater.input need to be initialized to a non-null value.
 Btw ZipUtil.defaultBuf needs to be "direct"?

(6) It might be desired to have some jmh measure to make sure byte[] 
case does

 not have regression.

Thanks,
Sherman



Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater

2018-02-16 Thread David Lloyd
Also available in more readable form at:

  
https://github.com/dmlloyd/openjdk/commit/becd36e852e55a29a4685577453944552c817b66

On Fri, Feb 16, 2018 at 4:13 PM, David Lloyd  wrote:
> It would be convenient to be able to inflate/deflate a direct or heap
> byte buffer without having to copy it through an array first.  For my
> Friday mini-project this week, I've decided to take a crack at this.
> The attached patch is the result.  Would anyone be interested in
> reviewing and maybe sponsoring this change?  It would be really great
> to get it in to JDK 11.
>
> The patch includes a modification to FlaterTest to run through
> permutations of using direct and heap byte buffers.  Though I couldn't
> get jtreg working, I did compile and run the test by hand and it seems
> to pass; the build also works fine with the new code.
>
> Extra thanks to Martin Balao for pointing me towards mapfile-vers when
> I couldn't figure out why I was getting UnsatisfiedLinkError.  That
> one was not obvious.
> --
> - DML



-- 
- DML


[JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater

2018-02-16 Thread David Lloyd
It would be convenient to be able to inflate/deflate a direct or heap
byte buffer without having to copy it through an array first.  For my
Friday mini-project this week, I've decided to take a crack at this.
The attached patch is the result.  Would anyone be interested in
reviewing and maybe sponsoring this change?  It would be really great
to get it in to JDK 11.

The patch includes a modification to FlaterTest to run through
permutations of using direct and heap byte buffers.  Though I couldn't
get jtreg working, I did compile and run the test by hand and it seems
to pass; the build also works fine with the new code.

Extra thanks to Martin Balao for pointing me towards mapfile-vers when
I couldn't figure out why I was getting UnsatisfiedLinkError.  That
one was not obvious.
-- 
- DML
commit becd36e852e55a29a4685577453944552c817b66
Author: David M. Lloyd 
Date:   Fri Feb 16 11:00:10 2018 -0600

[JDK-6341887] Update Inflater/Deflator to handle ByteBuffer

diff --git a/make/mapfiles/libzip/mapfile-vers 
b/make/mapfiles/libzip/mapfile-vers
index d711d8e17f4..11ccc2d6ecb 100644
--- a/make/mapfiles/libzip/mapfile-vers
+++ b/make/mapfiles/libzip/mapfile-vers
@@ -33,20 +33,28 @@ SUNWprivate_1.1 {
Java_java_util_zip_CRC32_update;
Java_java_util_zip_CRC32_updateBytes0;
Java_java_util_zip_CRC32_updateByteBuffer0;
-   Java_java_util_zip_Deflater_deflateBytes;
+   Java_java_util_zip_Deflater_deflateBytesBytes;
+   Java_java_util_zip_Deflater_deflateBytesBuffer;
+   Java_java_util_zip_Deflater_deflateBufferBytes;
+   Java_java_util_zip_Deflater_deflateBufferBuffer;
Java_java_util_zip_Deflater_end;
Java_java_util_zip_Deflater_getAdler;
Java_java_util_zip_Deflater_init;
Java_java_util_zip_Deflater_initIDs;
Java_java_util_zip_Deflater_reset;
Java_java_util_zip_Deflater_setDictionary;
+   Java_java_util_zip_Deflater_setDictionaryBuffer;
Java_java_util_zip_Inflater_end;
Java_java_util_zip_Inflater_getAdler;
-   Java_java_util_zip_Inflater_inflateBytes;
+   Java_java_util_zip_Inflater_inflateBytesBytes;
+   Java_java_util_zip_Inflater_inflateBytesBuffer;
+   Java_java_util_zip_Inflater_inflateBufferBytes;
+   Java_java_util_zip_Inflater_inflateBufferBuffer;
Java_java_util_zip_Inflater_init;
Java_java_util_zip_Inflater_initIDs;
Java_java_util_zip_Inflater_reset;
Java_java_util_zip_Inflater_setDictionary;
+   Java_java_util_zip_Inflater_setDictionaryBuffer;
ZIP_Close;
ZIP_CRC32;
ZIP_FreeEntry;
diff --git a/src/java.base/share/classes/java/util/zip/Deflater.java 
b/src/java.base/share/classes/java/util/zip/Deflater.java
index c75dd4a33f0..5ededeb56ca 100644
--- a/src/java.base/share/classes/java/util/zip/Deflater.java
+++ b/src/java.base/share/classes/java/util/zip/Deflater.java
@@ -26,6 +26,9 @@
 package java.util.zip;
 
 import java.lang.ref.Cleaner.Cleanable;
+import java.nio.ByteBuffer;
+import java.util.Objects;
+
 import jdk.internal.ref.CleanerFactory;
 
 /**
@@ -92,8 +95,7 @@ import jdk.internal.ref.CleanerFactory;
 public class Deflater {
 
 private final DeflaterZStreamRef zsRef;
-private byte[] buf = new byte[0];
-private int off, len;
+private ByteBuffer input;
 private int level, strategy;
 private boolean setParams;
 private boolean finish, finished;
@@ -216,17 +218,11 @@ public class Deflater {
  * @see Deflater#needsInput
  */
 public void setInput(byte[] b, int off, int len) {
-if (b== null) {
-throw new NullPointerException();
-}
+Objects.requireNonNull(b);
 if (off < 0 || len < 0 || off > b.length - len) {
 throw new ArrayIndexOutOfBoundsException();
 }
-synchronized (zsRef) {
-this.buf = b;
-this.off = off;
-this.len = len;
-}
+setInput(ByteBuffer.wrap(b, off, len));
 }
 
 /**
@@ -236,7 +232,22 @@ public class Deflater {
  * @see Deflater#needsInput
  */
 public void setInput(byte[] b) {
-setInput(b, 0, b.length);
+// freeload the NPE off of wrap()
+setInput(ByteBuffer.wrap(b));
+}
+
+/**
+ * Sets input data for compression. This should be called whenever
+ * needsInput() returns true indicating that more input data is required.
+ * @param byteBuffer the input data bytes
+ * @see Deflater#needsInput
+ * @since 11
+ */
+public void setInput(ByteBuffer byteBuffer) {
+Objects.requireNonNull(byteBuffer);
+synchronized (zsRef) {
+this.input = byteBuffer;
+}
 }
 
 /**
@@