iemejia opened a new pull request, #3535: URL: https://github.com/apache/parquet-java/pull/3535
## Summary

- Add `readIntegers()`, `readLongs()`, `readFloats()`, `readDoubles()` batch methods to `ValuesReader` with default loop-based implementations
- Override in specialized readers to amortize per-value overhead across batches

## Overrides

- **RunLengthBitPackingHybridDecoder.readInts()**: batch across RLE runs and packed groups using `Arrays.fill`/`System.arraycopy`
- **DictionaryValuesReader**: batch-decode dictionary IDs first, then batch-lookup values (eliminates per-value `IOException` try/catch)
- **DeltaBinaryPackingValuesReader**: `System.arraycopy` from pre-decoded buffer
- **PlainValuesReader** (all types): loop over `LittleEndianDataInputStream`
- **ByteStreamSplitValuesReader** (all types): indexed `ByteBuffer` bulk read

## Rationale

These APIs enable callers to amortize per-value overhead (virtual dispatch, bounds checks, mode switches) across batches. Combined with other optimizations in this series (ByteBuffer-based RLE decoder, etc.), batch reads yield significant throughput improvements over per-value loops.

All 576 parquet-column tests pass.
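## Illustrative sketch

To make the pattern concrete, here is a minimal, self-contained sketch of a loop-based default batch method and an override that fills whole RLE runs with `Arrays.fill`. It is not the code in this PR: `BatchReadSketch`, `SketchReader`, `RunReader`, the `{value, count}` run layout, and the `(int[] dest, int offset, int length)` signature are all assumptions made for illustration.

```java
import java.util.Arrays;

public class BatchReadSketch {

  // Hypothetical stand-in for ValuesReader (not the parquet-java class).
  abstract static class SketchReader {
    abstract int readInteger();

    // Default batch method: a plain per-value loop, so subclasses that do not
    // override it keep working. The signature is assumed for this sketch.
    void readIntegers(int[] dest, int offset, int length) {
      for (int i = 0; i < length; i++) {
        dest[offset + i] = readInteger();
      }
    }
  }

  // Hypothetical reader over already-parsed RLE runs, each stored as {value, count}.
  static final class RunReader extends SketchReader {
    private final int[][] runs;
    private int runIndex = 0;
    private int usedInRun = 0;

    RunReader(int[][] runs) {
      this.runs = runs;
    }

    @Override
    int readInteger() {
      skipExhaustedRuns();
      usedInRun++;
      return runs[runIndex][0];
    }

    // Batch override: one Arrays.fill per run instead of one call per value.
    @Override
    void readIntegers(int[] dest, int offset, int length) {
      int pos = offset;
      int remaining = length;
      while (remaining > 0) {
        skipExhaustedRuns();
        int n = Math.min(remaining, runs[runIndex][1] - usedInRun);
        Arrays.fill(dest, pos, pos + n, runs[runIndex][0]);
        usedInRun += n;
        pos += n;
        remaining -= n;
      }
    }

    // Advance past runs that are fully consumed; fail if nothing is left.
    private void skipExhaustedRuns() {
      while (runIndex < runs.length && usedInRun == runs[runIndex][1]) {
        runIndex++;
        usedInRun = 0;
      }
      if (runIndex == runs.length) {
        throw new IllegalStateException("no more values");
      }
    }
  }

  public static void main(String[] args) {
    RunReader reader = new RunReader(new int[][] {{7, 3}, {9, 2}});
    int[] dest = new int[6];
    reader.readIntegers(dest, 1, 4); // spans both runs
    System.out.println(Arrays.toString(dest)); // prints [0, 7, 7, 7, 9, 0]
  }
}
```

In this sketch the per-value and batch paths share the same cursor state, so a caller can mix `readInteger()` and `readIntegers()` on one reader. Whether the real overrides behave the same way should be checked against the diff.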
