I think its a good idea, especially if the assert can be in a good place (ideally a not-so-hot place, e.g. encoding, patching code). asserts have some costs for this kind of code even when disabled, bytecode count limits are used for compiler threshold and stuff.
On Wed, Mar 3, 2021 at 9:05 PM Greg Miller <gsmil...@gmail.com> wrote: > So, slightly different topic, maybe, but related so tacking onto this > thread... > > While tweaking ForUtil locally to experiment with different block sizes, I > realized that PForUtil encodes the offset for each "patch" using a single > byte, which implies a strict upper limit of 256 on the BLOCK_SIZE defined > in ForUtil. This essentially silently failed on me when I was trying to set > up blocks of 512. The unit tests caught it since the results were incorrect > after encoding/decoding with PForUtil (hooray!), but it would have been > nice to have an assert somewhere guarding for this to make matters a little > more explicit. > > While I realize that the likelihood of changing the blockside in ForUtil > may be low for now, it seems like such a small, easy change to toss an > assert in that it seems useful. What do you all think? Worth opening a > minor issue for this and putting in a one-liner? > > Cheers, > -Greg > > On Mon, Mar 1, 2021 at 11:30 AM Greg Miller <gsmil...@gmail.com> wrote: > >> Oh, got it. This is great, thanks! >> >> Cheers, >> -Greg >> >> On Mon, Mar 1, 2021 at 11:28 AM Robert Muir <rcm...@gmail.com> wrote: >> >>> Yeah, have a look at gen_ForUtil.py >>> >>> On Mon, Mar 1, 2021 at 1:05 PM Greg Miller <gsmil...@gmail.com> wrote: >>> >>>> Thanks for the feedback Robert; makes sense to me. I'll tinker with a >>>> forked codec and see if the experimentation produces anything interesting. >>>> >>>> When you mention "autogenerated decompression code", do you mean that >>>> some of this code is actually being generated? >>>> >>>> Cheers, >>>> -Greg >>>> >>>> On Sun, Feb 28, 2021 at 5:05 AM Robert Muir <rcm...@gmail.com> wrote: >>>> >>>>> If you want to test a different block size (say 64 or 256), I really >>>>> recommend to just fork a different codec for the experiment. >>>>> >>>>> There will likely be higher level changes you need to make, not just >>>>> changing a number. For example if you just increased this number to 256 >>>>> without doing anything else, I wouldn't be surprised if you see worse >>>>> performance. More of the postings would be vint-encoded than before with >>>>> 128, which might have some consequences. skipdata layout might be >>>>> inappropriate, these things are optimized for blocks of 128. >>>>> >>>>> Just in general, I recommend making a codec for the benchmarking >>>>> experiments, tools like luceneutil support comparing codecs against each >>>>> other anyway so you can easily compare fairly against the existing codec. >>>>> Also, it should be much easier/faster to just make a new codec and adapt >>>>> it >>>>> to test what you want! >>>>> >>>>> I think it is an antipattern to make stuff within the codec >>>>> "flexible", it is autogenerated decompression code :) I am concerned such >>>>> "flexibility" would create barriers in the future to optimizations. For >>>>> example we should be able to experiment with converting this compression >>>>> code over to explicit vector API in java. >>>>> >>>>> On Sat, Feb 27, 2021 at 4:29 PM Greg Miller <gsmil...@gmail.com> >>>>> wrote: >>>>> >>>>>> Hi folks! >>>>>> >>>>>> I've been a bit curious to test out different block size >>>>>> configurations in the Lucene postings list format, but thought I'd reach >>>>>> out to the community here first to see what work may have gone into this >>>>>> previously. I'm essentially interested in benchmarking different block >>>>>> size >>>>>> configurations on the real-world application of Lucene I'm working on. >>>>>> >>>>>> If my understanding of the code is correct, I know we're currently >>>>>> encoding compressed runs of 128 docs per block, relying on ForUtil for >>>>>> encoding/decoding purposes. It looks like we define this in >>>>>> ForUtil#BLOCK_SIZE (and reference it in a few external classes), but also >>>>>> know that it's not as simple as just changing that one definition. It >>>>>> appears much of the logic in ForUtil relies on the assumption of 128 >>>>>> docs-per-block. >>>>>> >>>>>> I'm toying with the idea of making ForUtil a bit more flexible to >>>>>> allow for different block sizes to be tested in order to run the >>>>>> benchmarking I'd like to run, but the class looks heavily optimized to >>>>>> generate SIMD instructions (I think?), so that might be folly. Before I >>>>>> start hacking on a local branch to see what I can learn, is there any >>>>>> prior >>>>>> work that might be useful to be aware of? Anyone gone down this path and >>>>>> have some learnings to share? Any thoughts would be much appreciated! >>>>>> >>>>>> Cheers, >>>>>> -Greg >>>>>> >>>>>