Thanks for the feedback, Robert; makes sense to me. I'll tinker with a forked codec and see if the experimentation turns up anything interesting.
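To make it concrete, what I'm picturing is the usual FilterCodec wrapping pattern — just a rough sketch for now, with the stock Lucene84PostingsFormat standing in for wherever the forked copy with the modified block size ends up living:

import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.PostingsFormat;
import org.apache.lucene.codecs.lucene84.Lucene84PostingsFormat;

/** Wraps the default codec, swapping in only the postings format under test. */
public class ExperimentalPostingsCodec extends FilterCodec {

  // Stand-in: replace with the forked copy of the postings format / ForUtil
  // that hardcodes the block size being benchmarked (e.g. 64 or 256).
  private final PostingsFormat postings = new Lucene84PostingsFormat();

  public ExperimentalPostingsCodec() {
    // The name gets written into the segment, so the codec also needs an SPI
    // entry (META-INF/services/org.apache.lucene.codecs.Codec) to read it back.
    super("ExperimentalPostingsCodec", Codec.getDefault());
  }

  @Override
  public PostingsFormat postingsFormat() {
    return postings;
  }
}

Then it should just be a matter of setCodec(new ExperimentalPostingsCodec()) on the IndexWriterConfig side and letting luceneutil compare it against the default codec.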
When you mention "autogenerated decompression code", do you mean that some of this code is actually being generated?

Cheers,
-Greg

On Sun, Feb 28, 2021 at 5:05 AM Robert Muir <rcm...@gmail.com> wrote:

> If you want to test a different block size (say 64 or 256), I really recommend just forking a different codec for the experiment.
>
> There will likely be higher-level changes you need to make, not just changing a number. For example, if you just increased this number to 256 without doing anything else, I wouldn't be surprised if you saw worse performance. More of the postings would be vint-encoded than before with 128, which might have some consequences, and the skipdata layout might be inappropriate; these things are optimized for blocks of 128.
>
> In general, I recommend making a codec for the benchmarking experiments. Tools like luceneutil support comparing codecs against each other anyway, so you can easily compare fairly against the existing codec. Also, it should be much easier and faster to just make a new codec and adapt it to test what you want!
>
> I think it is an antipattern to make stuff within the codec "flexible"; it is autogenerated decompression code :) I am concerned such "flexibility" would create barriers to future optimizations. For example, we should be able to experiment with converting this compression code over to the explicit vector API in Java.
>
> On Sat, Feb 27, 2021 at 4:29 PM Greg Miller <gsmil...@gmail.com> wrote:
>
>> Hi folks!
>>
>> I've been a bit curious to test out different block-size configurations in the Lucene postings list format, but thought I'd reach out to the community here first to see what work may have gone into this previously. I'm essentially interested in benchmarking different block-size configurations on the real-world application of Lucene I'm working on.
>>
>> If my understanding of the code is correct, we're currently encoding compressed runs of 128 docs per block, relying on ForUtil for encoding/decoding purposes. It looks like we define this in ForUtil#BLOCK_SIZE (and reference it in a few external classes), but I also know that it's not as simple as just changing that one definition. It appears much of the logic in ForUtil relies on the assumption of 128 docs per block.
>>
>> I'm toying with the idea of making ForUtil a bit more flexible so that different block sizes can be tested for the benchmarking I'd like to run, but the class looks heavily optimized to generate SIMD instructions (I think?), so that might be folly. Before I start hacking on a local branch to see what I can learn, is there any prior work that might be useful to be aware of? Has anyone gone down this path who has some learnings to share? Any thoughts would be much appreciated!
>>
>> Cheers,
>> -Greg
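P.S. Just to check that I'm following the vint-tail point correctly, the toy picture in my head is something like the below (a rough sketch of the idea only, not Lucene's actual writer code): only full blocks get the packed ForUtil-style encoding, whatever is left over for a term is written as plain vints, so a bigger block leaves more docs in that slower tail on average.

// Toy illustration only (not the real Lucene writer): counts how many doc
// deltas miss the packed block encoding and fall back to vint for a term.
class BlockTailSketch {
  static int vintTail(int docCount, int blockSize) {
    int fullBlocks = docCount / blockSize;     // packed, SIMD-friendly blocks
    return docCount - fullBlocks * blockSize;  // leftover docs, vint-encoded
  }

  public static void main(String[] args) {
    System.out.println(vintTail(200, 128));    // 72 docs in the vint tail
    System.out.println(vintTail(200, 256));    // all 200 docs in the vint tail
  }
}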