[PR] GH-3495: Optimize PlainValuesWriter with direct ByteBuffer slab writes (~2.5x encode speedup) [parquet-java]

via GitHub Sun, 19 Apr 2026 07:13:10 -0700


iemejia opened a new pull request, #3496:
URL: https://github.com/apache/parquet-java/pull/3496


   ## Summary
   
   Closes #3495.
   
   Two-commit PR optimizing `PlainValuesWriter` and following up with API 
cleanup of the now-unused `LittleEndianDataOutputStream` wrapper.
   
   ### Commit 1 — `PlainValuesWriter` direct ByteBuffer writes
   
   Removes the `LittleEndianDataOutputStream` layer between `PlainValuesWriter` 
and `CapacityByteArrayOutputStream` (CBOS). Adds `writeInt(int)`/`writeLong(long)` 
methods on CBOS that write directly to its internal `ByteBuffer` slabs (set to 
`LITTLE_ENDIAN`), so each value write becomes a single HotSpot intrinsic instead 
of a 4-byte decomposition through a temporary array and an `OutputStream` chain.
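
A minimal sketch of the two write shapes described above, assuming a single fixed-size heap buffer. The class and method names (`SlabWriteSketch`, `writeViaStream`, `writeViaSlab`) are hypothetical and not part of parquet-java; the real CBOS manages multiple slabs and growth, which this sketch ignores.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class SlabWriteSketch {

  // Old shape (illustrative): split each int into four bytes in a temp array,
  // then push the array through an OutputStream.
  static byte[] writeViaStream(int[] values) {
    ByteArrayOutputStream out = new ByteArrayOutputStream(values.length * Integer.BYTES);
    byte[] tmp = new byte[Integer.BYTES];
    for (int v : values) {
      tmp[0] = (byte) v;
      tmp[1] = (byte) (v >>> 8);
      tmp[2] = (byte) (v >>> 16);
      tmp[3] = (byte) (v >>> 24);
      out.write(tmp, 0, tmp.length);
    }
    return out.toByteArray();
  }

  // New shape (illustrative): one putInt per value on a little-endian buffer.
  // Assumption: the buffer is large enough, so no slab-overflow handling here.
  static byte[] writeViaSlab(int[] values) {
    ByteBuffer slab = ByteBuffer.allocate(values.length * Integer.BYTES)
        .order(ByteOrder.LITTLE_ENDIAN);
    for (int v : values) {
      slab.putInt(v);
    }
    return Arrays.copyOf(slab.array(), slab.position());
  }

  public static void main(String[] args) {
    int[] values = {1, -1, 0x12345678, Integer.MIN_VALUE};
    // Both paths should produce byte-identical PLAIN-encoded output.
    System.out.println(Arrays.equals(writeViaStream(values), writeViaSlab(values)));
  }
}
```

Both methods emit the same little-endian bytes; the difference is only in how many steps each value takes to reach the backing storage.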
   
   **`IntEncodingBenchmark.encodePlain`** (100k INT32 / invocation, JMH `-wi 5 
-i 10 -f 2`):
   
   | Pattern           | Before (ops/s) | After (ops/s) | Improvement |
   |-------------------|---------------:|--------------:|:-----------:|
   | SEQUENTIAL        |     20,944,860 |    53,231,121 | **+154%** (2.54x) |
   | RANDOM            |     20,613,242 |    53,419,118 | **+159%** |
   | LOW_CARDINALITY   |     20,749,103 |    53,510,247 | **+158%** |
   | HIGH_CARDINALITY  |     20,521,786 |    52,825,012 | **+157%** |
   
   The same code path is shared by `writeLong`, `writeFloat`, `writeDouble`, 
and the length prefix in `writeBytes(Binary)`, so PLAIN-encoded 
`INT64`/`FLOAT`/`DOUBLE`/`BINARY` columns benefit too. Decode benchmarks 
(`decodePlain` etc.) are unchanged, as expected.
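
As a rough illustration of why the other types share the path: in PLAIN encoding a `FLOAT` is its IEEE 754 bits written as a little-endian 4-byte value, a `DOUBLE` is the 8-byte equivalent, and a `BINARY` value is a 4-byte little-endian length followed by the raw bytes. The sketch below assumes a single fixed-capacity buffer; `PlainEncodingSketch` and its methods are hypothetical names, not the actual `PlainValuesWriter` code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class PlainEncodingSketch {
  // Assumption: one fixed-capacity heap buffer stands in for the CBOS slabs.
  private final ByteBuffer slab;

  PlainEncodingSketch(int capacity) {
    this.slab = ByteBuffer.allocate(capacity).order(ByteOrder.LITTLE_ENDIAN);
  }

  void writeInt(int v) {
    slab.putInt(v);
  }

  void writeLong(long v) {
    slab.putLong(v);
  }

  // FLOAT reuses the int path via its raw IEEE 754 bit pattern.
  void writeFloat(float v) {
    writeInt(Float.floatToIntBits(v));
  }

  // DOUBLE reuses the long path the same way.
  void writeDouble(double v) {
    writeLong(Double.doubleToLongBits(v));
  }

  // BINARY: 4-byte little-endian length prefix, then the payload bytes.
  void writeBytes(byte[] v) {
    writeInt(v.length);
    slab.put(v);
  }

  byte[] toByteArray() {
    return Arrays.copyOf(slab.array(), slab.position());
  }

  public static void main(String[] args) {
    PlainEncodingSketch w = new PlainEncodingSketch(64);
    w.writeFloat(1.0f);
    w.writeBytes(new byte[] {'a', 'b'});
    System.out.println(Arrays.toString(w.toByteArray()));
  }
}
```

Every method bottoms out in `writeInt` or `writeLong`, so speeding up those two primitives carries over to the remaining PLAIN types.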
   
   ### Commit 2 — Deprecate `LittleEndianDataOutputStream`, remove last wrapper 
usages
   
   Pure API cleanup, **no measurable performance impact**. After commit 1, 
`FixedLenByteArrayPlainValuesWriter` and `DeltaLengthByteArrayValuesWriter` 
were the last two production usages of `LittleEndianDataOutputStream`. Both 
wrapped a `CapacityByteArrayOutputStream` only to call `Binary.writeTo(out)`, 
which goes through `OutputStream.write(byte[], int, int)` — the wrapper added 
nothing for that call. Removing the wrapper allows marking 
`LittleEndianDataOutputStream` as `@Deprecated` (kept for binary compatibility, 
scheduled for removal in a future major release).
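
To illustrate why the wrapper was a no-op for these two writers, here is a small sketch with a hypothetical stand-in wrapper (`LittleEndianWrapper`) and a hypothetical `writeTo` helper that mimics a bulk `write(byte[], int, int)` call. It is not the parquet-java code, only the general shape of the argument.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WrapperPassthroughSketch {

  /** Hypothetical stand-in for a little-endian wrapper stream. */
  @Deprecated
  static class LittleEndianWrapper extends FilterOutputStream {
    LittleEndianWrapper(OutputStream out) {
      super(out);
    }

    // Bulk writes are forwarded untouched; endianness does not apply to raw bytes.
    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      out.write(b, off, len);
    }
  }

  // Hypothetical helper: writes a payload with a single bulk call.
  static void writeTo(byte[] payload, OutputStream out) throws IOException {
    out.write(payload, 0, payload.length);
  }

  static byte[] writeDirect(byte[] payload) {
    try {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      writeTo(payload, sink);
      return sink.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static byte[] writeWrapped(byte[] payload) {
    try {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      writeTo(payload, new LittleEndianWrapper(sink));
      return sink.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] payload = {1, 2, 3};
    // Same bytes with or without the wrapper in the chain.
    System.out.println(Arrays.equals(writeDirect(payload), writeWrapped(payload)));
  }
}
```

Since only the bulk byte-array write is used, writing to the underlying stream directly yields the same output with one fewer object in the chain.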
   
   Benchmarks for the two touched paths (`BinaryEncodingBenchmark`, JMH `-wi 5 
-i 10 -f 3`, 30 samples per row) are within ±5%, with per-op allocation rates 
unchanged within 2%, which is consistent with noise rather than a real effect 
either way. The rationale is code health (one fewer wrapper layer, deprecation 
of a public class that is effectively internal), not performance. Full numbers 
are in the commit message.
   
   ## Validation
   
   - `parquet-column`: 573 tests pass
   - `parquet-common`: 308 tests pass
   - Built with `-Dspotless.check.skip=true -Drat.skip=true -Djapicmp.skip=true`
   
   ## Related
   
   This is the second in a small series of focused performance PRs from work in 
https://github.com/iemejia/parquet-perf. The first was #3494.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

