arouel opened a new pull request, #3465:
URL: https://github.com/apache/parquet-java/pull/3465

   ### Rationale for this change
   
   `DeltaByteArrayWriter.writeBytes()` is on the hot path for every `Binary` 
value written with `DELTA_BYTE_ARRAY` encoding (the default for BINARY columns 
in V2 data pages). It has two performance issues:
   
   1. **Unnecessary allocation per value.** Line 92 calls `v.getBytes()`, which 
*always* allocates a new `byte[]` and copies into it — even for 
`ByteArrayBackedBinary` (`Arrays.copyOfRange`) and `ByteBufferBackedBinary` 
(`new byte[length]` + ByteBuffer copy). The returned array is used for (a) a 
read-only prefix comparison against `previous` and (b) assignment to `previous` 
for the next iteration. `getBytesUnsafe()` would suffice for (a), and a 
defensive copy is only needed for (b) when `isBackingBytesReused()` is true.
   
   2. **Scalar prefix comparison loop.** The `for (i = 0; (i < length) && 
(previous[i] == vb[i]); i++)` loop cannot be vectorized by the JIT because of 
the per-element early-exit condition. `Arrays.mismatch()` (available since Java 
9) performs the same operation and is [intrinsified by the 
JVM](https://bugs.openjdk.org/browse/JDK-8033148) to use SIMD vector comparison.
   
   In a custom JFR-profiled benchmark merging 60 Parquet files (180M rows 
total, 4 BINARY columns out of 10), `Binary$ByteBufferBackedBinary.getBytes()` 
called from line 92 was the **#1 allocation hotspot**: 7,545 out of 12,711 
total allocation samples (59.4%), accounting for ~20.9 GB of allocation in a 
single operation. The byte-by-byte prefix loop at line 95 consumed 619 CPU 
samples (18.2% of all CPU samples).
   
   ### What changes are included in this PR?
   
   Three changes to `DeltaByteArrayWriter.writeBytes()`:
   
   - `v.getBytes()` → `v.getBytesUnsafe()` — avoids the per-value `byte[]` 
allocation. The array is only used for a read-only prefix comparison; no copy 
is needed.
   - `previous = vb` → `previous = v.isBackingBytesReused() ? v.getBytes() : 
vb` — retains an owned copy only when the caller may reuse the backing bytes. 
In the common case (`isBackingBytesReused == false`, e.g. values read from 
Parquet pages via `ColumnReader`), no copy is made.
   - `for` loop → `Arrays.mismatch(byte[], int, int, byte[], int, int)` — uses 
the JVM SIMD-intrinsified mismatch for prefix matching instead of a scalar 
byte-by-byte loop.
   
   **Custom JFR-profiled benchmark results** (merge of 60 Parquet files, 180M 
rows, 10 columns / 4 BINARY, JDK 26, macOS aarch64):
   
   | Metric | Before | After | Change |
   |---|---|---|---|
   | Allocation rate | 908.9 MB/s | 453.1 MB/s | **-50.2%** |
   | Allocation per op | 42.1 GB | 20.9 GB | **-50.4%** |
   | GC count (STW) | 10 | 7 | **-30%** |
   | GC pause time | 35 ms | 28 ms | **-20%** |
   | G1 evacuation failures | 10 | 0 | **-100%** |
   | `DeltaByteArrayWriter` alloc samples | 7,548 (59.4%) | 11 (0.1%) | 
**-99.9%** |
   | `DeltaByteArrayWriter` CPU samples | 854 (25.2%) | 615 (18.1%) | **-28%** |
   
   ### Are these changes tested?
   
   Yes. Existing tests for `DeltaByteArrayWriter` 
(`TestDeltaByteArrayEncoding`) cover round-trip correctness for the encoding. 
No new tests are needed since the change is purely a performance optimization 
that preserves the exact same semantics — the prefix length computation and 
suffix slicing produce identical output. The `Arrays.mismatch` semantics are 
equivalent to the original `for` loop, and `getBytesUnsafe()` returns the same 
byte content as `getBytes()`.
   
   ### Are there any user-facing changes?
   
   No. This is a transparent performance improvement. The encoding output is 
byte-for-byte identical. No API changes, no configuration changes, no 
behavioral changes. Files written before and after this change are fully 
interchangeable.
   
   
   Closes #3464
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to