arouel opened a new pull request, #3490: URL: https://github.com/apache/parquet-java/pull/3490
### Rationale for this change `Binary.copy()` is expected to return a value that the caller can safely store beyond the lifetime of the source buffer. The base implementation only copies when `isBackingBytesReused` is `true`, otherwise it returns `this`. For `ByteBufferBackedBinary` instances backed by a direct `ByteBuffer`, `isBackingBytesReused` is `false`, so `copy()` is a no-op. The returned object still points into the original direct buffer. When that buffer's memory is freed (e.g., after closing a `PageReadStore` that used the off-heap decompression path), any code that relied on `copy()` for ownership, such as `DictionaryValuesWriter.writeBytes()`, hits `IllegalStateException: Already closed on the next access`. ### What changes are included in this PR? `ByteBufferBackedBinary.copy()` is overridden to check `value.isDirect()`. When the backing `ByteBuffer` is direct, it unconditionally materializes to a heap-backed `Binary` via `Binary.fromConstantByteArray(getBytes())`. When the backing buffer is heap-based, it delegates to `super.copy()` to preserve the existing behavior and avoid unnecessary copies. ### Are these changes tested? Tests were added that verify `copy()` on a direct `ByteBuffer`-backed `Binary` returns an independent heap-backed instance whose data remains accessible after the source buffer's is closed. The existing test suites continue to pass. ### Are there any user-facing changes? No API changes. `Binary.copy()` now returns a heap-backed copy instead of this when the source is backed by a direct `ByteBuffer`. This is a correctness fix, callers already expected `copy()` to return an owned value, and the previous behavior silently violated that contract for direct buffers. Closes #3489 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
