Thanks Ismaël for working on this. I did a first round of reviews with great 
interest, and I'll do another one soon.

I noticed that there is some overlap with the work by André 
(https://github.com/apache/parquet-java/issues?q=is%3Apr+is%3Aopen+author%3Aarouel);
it may be worth aligning the two efforts.

Thanks!

Kind regards,
Fokko

On 2026/04/29 10:13:43 Steve Loughran wrote:
> there's a JMH comparer tool at https://github.com/JohnTortugo/jmh-tabulate
> ...
> 
> Even though it comes from an AWS engineer I did review that code for
> security, and even got Claude to (dynamically) generate the config file
> needed to run the project in a chroot-style sandbox on macOS. The only
> tangible risk is the chart.js file, and now that's cryptographically
> locked down.
> 
> https://github.com/steveloughran/jmh-tabulate/tree/hardened
> 
> Nobody should be pulling head dependencies from NPM repos, and hard-coded
> version numbers can be subverted by new tags. Hash codes are the only thing
> to trust for something you run on file://
> Even if you bypass the sandbox, the .html file generated does enforce
> chart.js version integrity. So all should be good.
> 
> Given all that, what do your numbers look like?
> 
> 
> 
> 
> On Wed, 29 Apr 2026 at 08:28, Ismaël Mejía <[email protected]> wrote:
> 
> > Hi dev@,
> >
> > I’ve been working on performance improvements across the main
> > encoding/decoding hot paths of Apache Parquet Java. I presented this
> > work during last week’s Parquet community sync and I am sharing a
> > summary here for broader visibility, in line with Apache best
> > practices.
> >
> > Using AI-assisted tools and JMH, I expanded the existing microbenchmark
> > coverage of the critical hot paths. I then iterated on a series of
> > optimizations, validated them for correctness, and reviewed them with
> > other AI tools. The results are promising.
> >
> > The improvements focus on eliminating per-value overhead in the hot
> > loops without changing the file format or public API. Key changes:
> >
> > - Plain INT32/LONG: bulk System.arraycopy instead of per-value
> > ByteBuffer.putInt (~4x encode, ~3x decode)
> > - ByteStreamSplit: zero-allocation batch scatter/gather (3-5x encode, 2x
> > decode)
> > - Dictionary encoding: custom open-addressing hash map replacing
> > java.util.HashMap (up to 80x for low-cardinality string columns)
> > - RLE dictionary index decoder: direct ByteBuffer access bypassing
> > InputStream
> > - New batch read APIs: readIntegers()/readLongs() for vectorized consumers
> >
> > End-to-end file read/write throughput improves by ~13–14% on average
> > across codecs in my test suite (Java 11, AMD EPYC). Full JMH results
> > (303 benchmarks) and a more detailed write-up will follow.
> >
> > Most changes have been grouped and tracked under the following issue,
> > which provides background and links to the related pull requests:
> > https://github.com/apache/parquet-java/issues/3530
> >
> > The first set of pull requests is ready for review. Feedback and
> > comments from Java committers would be greatly appreciated.
> >
> > Thanks,
> > Ismaël
> >
> > ps. Kudos to Fokko Driesprong who already started reviewing some of them.
> >
> 
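On the Plain INT32 point in the quoted list: here is a minimal sketch of
what replacing a per-value ByteBuffer.putInt loop with one bulk copy can
look like. It is not the code from the pull requests. It swaps in
IntBuffer.put(int[]) over a little-endian view as the bulk operation
(System.arraycopy cannot copy an int[] into a byte[] directly), and the
class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical sketch, not parquet-java code.
final class PlainInt32Sketch {

    // Baseline: one putInt call (with its bounds checks) per value.
    static byte[] encodePerValue(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Bulk variant: a single put(int[]) through a little-endian int view.
    static byte[] encodeBulk(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.asIntBuffer().put(values);
        return buf.array();
    }

    // Bulk decode: a single get(int[]) through the same kind of view.
    static int[] decodeBulk(byte[] bytes) {
        int[] out = new int[bytes.length / Integer.BYTES];
        ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asIntBuffer()
                .get(out);
        return out;
    }

    public static void main(String[] args) {
        int[] values = {1, -2, 300, Integer.MAX_VALUE};
        byte[] perValue = encodePerValue(values);
        byte[] bulk = encodeBulk(values);
        System.out.println(Arrays.equals(perValue, bulk));         // true
        System.out.println(Arrays.equals(values, decodeBulk(bulk))); // true
    }
}
```

Both encoders produce the same little-endian bytes, so a JMH benchmark
can compare them on equal footing. Whether the bulk path reaches the
quoted ~4x will depend on the JVM and the buffer type.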

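On the dictionary-encoding point in the quoted list: a rough sketch of
the general technique (an open-addressing map with linear probing that
assigns dictionary ids), to make the idea concrete. It is an assumption
on my side, not the implementation in the pull requests: it handles
primitive int values rather than strings, and the name
IntDictionarySketch, the 0.5 load factor and the hash mixer are all
illustrative choices.

```java
import java.util.Arrays;

// Hypothetical sketch, not parquet-java code.
final class IntDictionarySketch {
    private static final int EMPTY = -1;

    private int[] keys; // value held in each slot (valid where ids[slot] != EMPTY)
    private int[] ids;  // dictionary id per slot, or EMPTY
    private int size;   // number of distinct values seen so far

    IntDictionarySketch(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1; // keep the capacity a power of two
        }
        keys = new int[capacity];
        ids = new int[capacity];
        Arrays.fill(ids, EMPTY);
    }

    // Returns the existing id for value, or assigns the next free id.
    int getOrAdd(int value) {
        if ((size + 1) * 2 > keys.length) {
            resize(); // keep the load factor at or below 0.5
        }
        int mask = keys.length - 1;
        int slot = mix(value) & mask;
        while (ids[slot] != EMPTY) {
            if (keys[slot] == value) {
                return ids[slot];
            }
            slot = (slot + 1) & mask; // linear probing
        }
        keys[slot] = value;
        ids[slot] = size;
        return size++;
    }

    int size() {
        return size;
    }

    // Doubles the table and re-inserts every occupied slot.
    private void resize() {
        int[] oldKeys = keys;
        int[] oldIds = ids;
        keys = new int[oldKeys.length * 2];
        ids = new int[keys.length];
        Arrays.fill(ids, EMPTY);
        int mask = keys.length - 1;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldIds[i] == EMPTY) {
                continue;
            }
            int slot = mix(oldKeys[i]) & mask;
            while (ids[slot] != EMPTY) {
                slot = (slot + 1) & mask;
            }
            keys[slot] = oldKeys[i];
            ids[slot] = oldIds[i];
        }
    }

    // Cheap multiplicative mixer to spread clustered keys.
    private static int mix(int h) {
        h *= 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        IntDictionarySketch dict = new IntDictionarySketch(4);
        int[] column = {7, 7, 42, 7, 42, -1};
        int[] encoded = new int[column.length];
        for (int i = 0; i < column.length; i++) {
            encoded[i] = dict.getOrAdd(column[i]);
        }
        System.out.println(Arrays.toString(encoded)); // [0, 0, 1, 0, 1, 2]
    }
}
```

Compared with java.util.HashMap<Integer, Integer>, the flat int arrays
avoid boxing and per-entry node allocation, which is the per-value
overhead the quoted summary talks about.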