Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater
On Sun, Feb 18, 2018 at 3:33 AM, Alan Bateman wrote: > Thanks for bringing this one up again Thanks for taking the time to review. > I see Sherman is looking at implementation so I'll stick with the javadoc > for now. At some point it will need a security review to ensure that there > no possibility of tricking the implementation to access memory outside of > the input/output. That is, we have to assume the user of > setInput(ByteBuffer) is evil and will change the position/limit during > deflate operations. I see the patch computes clamps the remainder before > accessing memory, it will just need a closer look to make sure there are no > issues. The patch will also need adjustments to make it consistent with the > existing code but that can come later. I did write the code with this in mind: that the address should always be within the bounds of the buffer (be it heap or direct) at the time of the call, and that the offset into the buffer should not be beyond its end (or beginning). So the effect could be a thrown exception but escaping the buffer _should_ be impossible (by intent anyway; more eyes are always better for noticing mistakes of course). > On the APIs then the inflate and deflate methods look okay, I'll less sure > that setDcitionary(ByteBuffer) is really needed. Are you adding for > setDcitionary(ByteBuffer) for consistency? Yes; it was easy enough to add it so I did. > The javadoc doesn't currently specify how it works with the buffer, e.g. > inflate(ByteBuffer) doesn't specify that adds it bytes to the buffer > starting at its position, it doesn't say if the position is adjusted. The > javadoc will also need to set expectations on behavior when > DataFormatException is thrown, is the position guaranteed to be unchanged or > is it unspecified? I intended for it to work similarly to the old code. But I'll go through the zlib docs and just make sure all the assumptions are still correct, and the next version will have more concise docs in this regard and also with regards to the disposition of the buffer in each case. -- - DML
Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater
On Sat, Feb 17, 2018 at 3:18 PM, Xueming Shen wrote: > Hi David, > > Thanks for taking this one :-) some comments here. Thanks for the review! > (1) I would assume you might have to do more for ByteBuffer, something like > [...] > btw, any reason go unsafe to get the byte[]? instead of > ByteBuffer.getArray()? Yes: input should always be settable from a read-only heap buffer without a performance penalty (i.e. copying the contents). This btw is why there is also an explicit check for read-only later on. > (2) It might be a concerned that you have to warp the input byte[] every time > the inflate/deflate(byte[]...) is called. (Yes, I'm constantly hearing > people > complain that you have to wrap the byte[] to ByteBuffer to use CharsetEn/ > Decoder, or even the implementation detail inside StringCoder), and it > might > be taken as a "regression". So it might be desired to wire the > "bf.hasArray()) > path inside de/encode(ByteArray) back to de/encode(byte[], int, int), to > avoid > the "unnecessary" ByteBuffer wrap. I can do some testing to see if there is an impact. I am working on the basis that the wrap may be optimized away (as it is in certain similar cases), but Inflater/Deflater is not final (nor are the inflate/deflate methods) that might not be true in practice. > (3) Same "wrap concern" argument might go to the setInput(bye[] ...) as well. > I'm not sure if it is worth keeping both byte[]/int/int and ByteBuffer > as the > "input" field of In/Deflater. Since input is stored on the object, the above theoretical optimization is much less likely. Again I can do some testing; it might be a good idea to have separate byte[]/int/int and ByteBuffer in the end, though the added expense of checking for and updating the ByteBuffer position after each call in addition to the byte[]/int/int might nullify the benefit. Testing is required... > (4) assume we keep the "wrap" approach. It appears ByteBuffer.wrap() does > check the byte[]/off/len and throw an IndexOutOfBoundsException. So it might > be better to take advantage of that check. I wanted to however it throws the wrong exception type; I could catch and rethrow I guess though if you think that is better (assuming we keep the "wrap" approach as you say). > (5) Deflater.input need to be initialized to a non-null value. > Btw ZipUtil.defaultBuf needs to be "direct"? It doesn't need to be, but the direct buffer code path is possibly somewhat "friendlier" to GC since there's no GetPrimitiveArrayCritical call. > (6) It might be desired to have some jmh measure to make sure byte[] case > does not have regression. I'll work on this when I get a chance (maybe not until Friday though). One other thought I had this weekend was that I could possibly improve things by getting the direct buffer address on the Java side, and pass it in to JNI to avoid the calls to GetDirectBufferAddress. Another thought was to eliminate the remaining SetBooleanField calls by using the remaining two bits in the doInflate/doDeflate methods' return values. I'm not sure how expensive these calls are though. I could also replace the JNI field reads with more method parameters if this is a valuable thing to do. -- - DML
Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater
On 16/02/2018 22:13, David Lloyd wrote: It would be convenient to be able to inflate/deflate a direct or heap byte buffer without having to copy it through an array first. For my Friday mini-project this week, I've decided to take a crack at this. The attached patch is the result. Would anyone be interested in reviewing and maybe sponsoring this change? It would be really great to get it in to JDK 11. The patch includes a modification to FlaterTest to run through permutations of using direct and heap byte buffers. Though I couldn't get jtreg working, I did compile and run the test by hand and it seems to pass; the build also works fine with the new code. Extra thanks to Martin Balao for pointing me towards mapfile-vers when I couldn't figure out why I was getting UnsatisfiedLinkError. That one was not obvious. Thanks for bringing this one up again, I think the last time that it was discussed here was in 2012 when Martin Kirst proposed a patch to add these methods. If you go through the archives then you'll see there were several issues with that proposal before the discussion petered out. I see Sherman is looking at implementation so I'll stick with the javadoc for now. At some point it will need a security review to ensure that there no possibility of tricking the implementation to access memory outside of the input/output. That is, we have to assume the user of setInput(ByteBuffer) is evil and will change the position/limit during deflate operations. I see the patch computes clamps the remainder before accessing memory, it will just need a closer look to make sure there are no issues. The patch will also need adjustments to make it consistent with the existing code but that can come later. On the APIs then the inflate and deflate methods look okay, I'll less sure that setDcitionary(ByteBuffer) is really needed. Are you adding for setDcitionary(ByteBuffer) for consistency? The javadoc doesn't currently specify how it works with the buffer, e.g. inflate(ByteBuffer) doesn't specify that adds it bytes to the buffer starting at its position, it doesn't say if the position is adjusted. The javadoc will also need to set expectations on behavior when DataFormatException is thrown, is the position guaranteed to be unchanged or is it unspecified? -Alan
Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater
On 2/16/18, 2:13 PM, David Lloyd wrote: It would be convenient to be able to inflate/deflate a direct or heap byte buffer without having to copy it through an array first. For my Friday mini-project this week, I've decided to take a crack at this. The attached patch is the result. Would anyone be interested in reviewing and maybe sponsoring this change? It would be really great to get it in to JDK 11. The patch includes a modification to FlaterTest to run through permutations of using direct and heap byte buffers. Though I couldn't get jtreg working, I did compile and run the test by hand and it seems to pass; the build also works fine with the new code. Extra thanks to Martin Balao for pointing me towards mapfile-vers when I couldn't figure out why I was getting UnsatisfiedLinkError. That one was not obvious. Hi David, Thanks for taking this one :-) some comments here. (1) I would assume you might have to do more for ByteBuffer, something like if (bf.isDirect()) { // GetDirectBufferAddress ... } else if (bf.hasArray()) { byte[] array = bf.getArray(); do(bf.getArray(), offset + pos, pos - limit); ... } else { // probably still have to copy the bytes out of ByteBuffer ... } btw, any reason go unsafe to get the byte[]? instead of ByteBuffer.getArray()? (2) It might be a concerned that you have to warp the input byte[] every time the inflate/deflate(byte[]...) is called. (Yes, I'm constantly hearing people complain that you have to wrap the byte[] to ByteBuffer to use CharsetEn/ Decoder, or even the implementation detail inside StringCoder), and it might be taken as a "regression". So it might be desired to wire the "bf.hasArray()) path inside de/encode(ByteArray) back to de/encode(byte[], int, int), to avoid the "unnecessary" ByteBuffer wrap. (3) Same "wrap concern" argument might go to the setInput(bye[] ...) as well. I'm not sure if it is worth keeping both byte[]/int/int and ByteBuffer as the "input" field of In/Deflater. (4) assume we keep the "wrap" approach. It appears ByteBuffer.wrap() does check the byte[]/off/len and throw an IndexOutOfBoundsException. So it might be better to take advantage of that check. (5) Deflater.input need to be initialized to a non-null value. Btw ZipUtil.defaultBuf needs to be "direct"? (6) It might be desired to have some jmh measure to make sure byte[] case does not have regression. Thanks, Sherman
Re: [JDK-6341887] Patch: java.util.zip: Add ByteBuffer methods to Inflater/Deflater
Also available in more readable form at: https://github.com/dmlloyd/openjdk/commit/becd36e852e55a29a4685577453944552c817b66 On Fri, Feb 16, 2018 at 4:13 PM, David Lloyd wrote: > It would be convenient to be able to inflate/deflate a direct or heap > byte buffer without having to copy it through an array first. For my > Friday mini-project this week, I've decided to take a crack at this. > The attached patch is the result. Would anyone be interested in > reviewing and maybe sponsoring this change? It would be really great > to get it in to JDK 11. > > The patch includes a modification to FlaterTest to run through > permutations of using direct and heap byte buffers. Though I couldn't > get jtreg working, I did compile and run the test by hand and it seems > to pass; the build also works fine with the new code. > > Extra thanks to Martin Balao for pointing me towards mapfile-vers when > I couldn't figure out why I was getting UnsatisfiedLinkError. That > one was not obvious. > -- > - DML -- - DML