iemejia opened a new pull request, #3496: URL: https://github.com/apache/parquet-java/pull/3496
## Summary

Closes #3495.

Two-commit PR optimizing `PlainValuesWriter` and following up with API cleanup of the now-unused `LittleEndianDataOutputStream` wrapper.

### Commit 1: `PlainValuesWriter` direct ByteBuffer writes

Removes the `LittleEndianDataOutputStream` layer between `PlainValuesWriter` and `CapacityByteArrayOutputStream` (CBOS). Adds `writeInt(int)`/`writeLong(long)` methods on CBOS that write directly to its internal `ByteBuffer` slabs (set to `LITTLE_ENDIAN`), making the value write a single HotSpot intrinsic instead of a 4-byte decomposition through a temp array and an `OutputStream` chain.

**`IntEncodingBenchmark.encodePlain`** (100k INT32 / invocation, JMH `-wi 5 -i 10 -f 2`):

| Pattern          | Before (ops/s) | After (ops/s) | Improvement       |
|------------------|---------------:|--------------:|:-----------------:|
| SEQUENTIAL       |     20,944,860 |    53,231,121 | **+154%** (2.54x) |
| RANDOM           |     20,613,242 |    53,419,118 | **+159%**         |
| LOW_CARDINALITY  |     20,749,103 |    53,510,247 | **+158%**         |
| HIGH_CARDINALITY |     20,521,786 |    52,825,012 | **+157%**         |

The same code path is shared by `writeLong`, `writeFloat`, `writeDouble`, and the length prefix in `writeBytes(Binary)`, so PLAIN-encoded `INT64`/`FLOAT`/`DOUBLE`/`BINARY` columns benefit too. Decode benchmarks (`decodePlain` etc.) are unchanged, as expected.

### Commit 2: Deprecate `LittleEndianDataOutputStream`, remove last wrapper usages

Pure API cleanup, **no measurable performance impact**.

After commit 1, `FixedLenByteArrayPlainValuesWriter` and `DeltaLengthByteArrayValuesWriter` were the last two production usages of `LittleEndianDataOutputStream`. Both wrapped a `CapacityByteArrayOutputStream` only to call `Binary.writeTo(out)`, which goes through `OutputStream.write(byte[], int, int)`; the wrapper added nothing for that call. Removing the wrapper allows marking `LittleEndianDataOutputStream` as `@Deprecated` (kept for binary compatibility, scheduled for removal in a future major release).
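To make the "wrapper added nothing" point concrete, here is a minimal sketch using only JDK classes. `LittleEndianWrapper` is a hypothetical stand-in written for this illustration, not the actual `LittleEndianDataOutputStream`, and a `ByteArrayOutputStream` stands in for `CapacityByteArrayOutputStream`. It assumes, as described above, that the wrapper forwards bulk `write(byte[], int, int)` calls unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class WrapperPassThroughSketch {

  // Hypothetical stand-in for a little-endian wrapper stream.
  // Only typed writes such as writeInt do any work of their own;
  // bulk byte writes are forwarded to the wrapped stream untouched.
  static final class LittleEndianWrapper extends FilterOutputStream {
    LittleEndianWrapper(OutputStream out) {
      super(out);
    }

    void writeInt(int v) throws IOException {
      out.write(v);        // OutputStream.write(int) keeps the low 8 bits
      out.write(v >>> 8);
      out.write(v >>> 16);
      out.write(v >>> 24);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      out.write(b, off, len); // plain delegation, no byte reordering
    }
  }

  // Writes the payload through the wrapper, the shape used before the cleanup.
  static byte[] viaWrapper(byte[] payload) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (LittleEndianWrapper out = new LittleEndianWrapper(sink)) {
      out.write(payload, 0, payload.length);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  // Writes the payload straight to the sink, the shape used after the cleanup.
  static byte[] direct(byte[] payload) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    sink.write(payload, 0, payload.length);
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    byte[] payload = {1, 2, 3, 4, 5};
    System.out.println("same bytes: " + Arrays.equals(viaWrapper(payload), direct(payload)));
  }
}
```

Both paths produce the same bytes for a bulk write, which is why dropping the wrapper from writers that only call `Binary.writeTo(out)` should not change the encoded output.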
Benchmarks for the two touched paths (`BinaryEncodingBenchmark`, JMH `-wi 5 -i 10 -f 3`, 30 samples per row) are within ±5%, with allocation rates per op unchanged within 2%, consistent with noise rather than a real effect either way. Rationale is code health (one fewer wrapper layer, deprecation of an internal-shaped public class), not performance. Full numbers are in the commit message.

## Validation

- `parquet-column`: 573 tests pass
- `parquet-common`: 308 tests pass
- Built with `-Dspotless.check.skip=true -Drat.skip=true -Djapicmp.skip=true`

## Related

This is the second in a small series of focused performance PRs from work in https://github.com/iemejia/parquet-perf. The first was #3494.
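## Appendix: sketch of the Commit 1 write path

The difference between the two write shapes described under Commit 1 can be sketched with JDK classes alone. This is an illustration under stated assumptions, not the code in the PR: `PlainIntWriteSketch` and its methods are hypothetical names, a `ByteArrayOutputStream` stands in for the old `OutputStream` chain, and a single heap `ByteBuffer` stands in for a CBOS slab.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class PlainIntWriteSketch {

  // "Before" shape: split each int into 4 bytes in a temp array,
  // then push that array through an OutputStream.
  static byte[] writeViaByteDecomposition(int[] values) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(values.length * 4);
    byte[] tmp = new byte[4];
    for (int v : values) {
      tmp[0] = (byte) v;          // least significant byte first
      tmp[1] = (byte) (v >>> 8);
      tmp[2] = (byte) (v >>> 16);
      tmp[3] = (byte) (v >>> 24);
      out.write(tmp, 0, 4);
    }
    return out.toByteArray();
  }

  // "After" shape: one putInt per value on a buffer whose byte order
  // is already LITTLE_ENDIAN (assumed stand-in for a CBOS slab).
  static byte[] writeDirect(int[] values) {
    ByteBuffer slab = ByteBuffer.allocate(values.length * 4).order(ByteOrder.LITTLE_ENDIAN);
    for (int v : values) {
      slab.putInt(v);
    }
    return Arrays.copyOf(slab.array(), slab.position());
  }

  public static void main(String[] args) {
    int[] values = {0, 1, -1, 0x12345678, Integer.MIN_VALUE, Integer.MAX_VALUE};
    byte[] before = writeViaByteDecomposition(values);
    byte[] after = writeDirect(values);
    System.out.println("identical encoding: " + Arrays.equals(before, after));
  }
}
```

Both methods emit the same little-endian bytes, so the on-disk PLAIN encoding is unaffected; only the number of steps per value differs. The sketch says nothing about throughput, which is what the JMH numbers above measure.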
