arouel opened a new pull request, #3490:
URL: https://github.com/apache/parquet-java/pull/3490

   ### Rationale for this change
   
   `Binary.copy()` is expected to return a value that the caller can safely 
store beyond the lifetime of the source buffer. The base implementation only 
copies when `isBackingBytesReused` is `true`, otherwise it returns `this`. For 
`ByteBufferBackedBinary` instances backed by a direct `ByteBuffer`, 
`isBackingBytesReused` is `false`, so `copy()` is a no-op. The returned object 
still points into the original direct buffer. When that buffer's memory is 
freed (e.g., after closing a `PageReadStore` that used the off-heap 
decompression path), any code that relied on `copy()` for ownership, such as 
`DictionaryValuesWriter.writeBytes()`, hits `IllegalStateException: Already 
closed on the next access`.
   
   ### What changes are included in this PR?
   
   `ByteBufferBackedBinary.copy()` is overridden to check `value.isDirect()`. 
When the backing `ByteBuffer` is direct, it unconditionally materializes to a 
heap-backed `Binary` via `Binary.fromConstantByteArray(getBytes())`. When the 
backing buffer is heap-based, it delegates to `super.copy()` to preserve the 
existing behavior and avoid unnecessary copies.
   
   ### Are these changes tested?
   
   Tests were added that verify `copy()` on a direct `ByteBuffer`-backed 
`Binary` returns an independent heap-backed instance whose data remains 
accessible after the source buffer's is closed. The existing test suites 
continue to pass.
   
   ### Are there any user-facing changes?
   
   No API changes. `Binary.copy()` now returns a heap-backed copy instead of 
this when the source is backed by a direct `ByteBuffer`. This is a correctness 
fix, callers already expected `copy()` to return an owned value, and the 
previous behavior silently violated that contract for direct buffers.
   
   Closes #3489
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to