arouel opened a new pull request, #3465: URL: https://github.com/apache/parquet-java/pull/3465
### Rationale for this change `DeltaByteArrayWriter.writeBytes()` is on the hot path for every `Binary` value written with `DELTA_BYTE_ARRAY` encoding (the default for BINARY columns in V2 data pages). It has two performance issues: 1. **Unnecessary allocation per value.** Line 92 calls `v.getBytes()`, which *always* allocates a new `byte[]` and copies into it — even for `ByteArrayBackedBinary` (`Arrays.copyOfRange`) and `ByteBufferBackedBinary` (`new byte[length]` + ByteBuffer copy). The returned array is used for (a) a read-only prefix comparison against `previous` and (b) assignment to `previous` for the next iteration. `getBytesUnsafe()` would suffice for (a), and a defensive copy is only needed for (b) when `isBackingBytesReused()` is true. 2. **Scalar prefix comparison loop.** The `for (i = 0; (i < length) && (previous[i] == vb[i]); i++)` loop cannot be vectorized by the JIT because of the per-element early-exit condition. `Arrays.mismatch()` (available since Java 9) performs the same operation and is [intrinsified by the JVM](https://bugs.openjdk.org/browse/JDK-8033148) to use SIMD vector comparison. In a custom JFR-profiled benchmark merging 60 Parquet files (180M rows total, 4 BINARY columns out of 10), `Binary$ByteBufferBackedBinary.getBytes()` called from line 92 was the **#1 allocation hotspot**: 7,545 out of 12,711 total allocation samples (59.4%), accounting for ~20.9 GB of allocation in a single operation. The byte-by-byte prefix loop at line 95 consumed 619 CPU samples (18.2% of all CPU samples). ### What changes are included in this PR? Three changes to `DeltaByteArrayWriter.writeBytes()`: - `v.getBytes()` → `v.getBytesUnsafe()` — avoids the per-value `byte[]` allocation. The array is only used for a read-only prefix comparison; no copy is needed. - `previous = vb` → `previous = v.isBackingBytesReused() ? v.getBytes() : vb` — retains an owned copy only when the caller may reuse the backing bytes. In the common case (`isBackingBytesReused == false`, e.g. values read from Parquet pages via `ColumnReader`), no copy is made. - `for` loop → `Arrays.mismatch(byte[], int, int, byte[], int, int)` — uses the JVM SIMD-intrinsified mismatch for prefix matching instead of a scalar byte-by-byte loop. **Custom JFR-profiled benchmark results** (merge of 60 Parquet files, 180M rows, 10 columns / 4 BINARY, JDK 26, macOS aarch64): | Metric | Before | After | Change | |---|---|---|---| | Allocation rate | 908.9 MB/s | 453.1 MB/s | **-50.2%** | | Allocation per op | 42.1 GB | 20.9 GB | **-50.4%** | | GC count (STW) | 10 | 7 | **-30%** | | GC pause time | 35 ms | 28 ms | **-20%** | | G1 evacuation failures | 10 | 0 | **-100%** | | `DeltaByteArrayWriter` alloc samples | 7,548 (59.4%) | 11 (0.1%) | **-99.9%** | | `DeltaByteArrayWriter` CPU samples | 854 (25.2%) | 615 (18.1%) | **-28%** | ### Are these changes tested? Yes. Existing tests for `DeltaByteArrayWriter` (`TestDeltaByteArrayEncoding`) cover round-trip correctness for the encoding. No new tests are needed since the change is purely a performance optimization that preserves the exact same semantics — the prefix length computation and suffix slicing produce identical output. The `Arrays.mismatch` semantics are equivalent to the original `for` loop, and `getBytesUnsafe()` returns the same byte content as `getBytes()`. ### Are there any user-facing changes? No. This is a transparent performance improvement. The encoding output is byte-for-byte identical. No API changes, no configuration changes, no behavioral changes. Files written before and after this change are fully interchangeable. Closes #3464 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
