Thanks Robert. I've created https://issues.apache.org/jira/browse/LUCENE-9822 and will attach a patch shortly.
Cheers, -Greg On Wed, Mar 3, 2021 at 6:21 PM Robert Muir <rcm...@gmail.com> wrote: > I think its a good idea, especially if the assert can be in a good place > (ideally a not-so-hot place, e.g. encoding, patching code). asserts have > some costs for this kind of code even when disabled, bytecode count limits > are used for compiler threshold and stuff. > > On Wed, Mar 3, 2021 at 9:05 PM Greg Miller <gsmil...@gmail.com> wrote: > >> So, slightly different topic, maybe, but related so tacking onto this >> thread... >> >> While tweaking ForUtil locally to experiment with different block sizes, >> I realized that PForUtil encodes the offset for each "patch" using a single >> byte, which implies a strict upper limit of 256 on the BLOCK_SIZE defined >> in ForUtil. This essentially silently failed on me when I was trying to set >> up blocks of 512. The unit tests caught it since the results were incorrect >> after encoding/decoding with PForUtil (hooray!), but it would have been >> nice to have an assert somewhere guarding for this to make matters a little >> more explicit. >> >> While I realize that the likelihood of changing the blockside in ForUtil >> may be low for now, it seems like such a small, easy change to toss an >> assert in that it seems useful. What do you all think? Worth opening a >> minor issue for this and putting in a one-liner? >> >> Cheers, >> -Greg >> >> On Mon, Mar 1, 2021 at 11:30 AM Greg Miller <gsmil...@gmail.com> wrote: >> >>> Oh, got it. This is great, thanks! >>> >>> Cheers, >>> -Greg >>> >>> On Mon, Mar 1, 2021 at 11:28 AM Robert Muir <rcm...@gmail.com> wrote: >>> >>>> Yeah, have a look at gen_ForUtil.py >>>> >>>> On Mon, Mar 1, 2021 at 1:05 PM Greg Miller <gsmil...@gmail.com> wrote: >>>> >>>>> Thanks for the feedback Robert; makes sense to me. I'll tinker with a >>>>> forked codec and see if the experimentation produces anything interesting. >>>>> >>>>> When you mention "autogenerated decompression code", do you mean that >>>>> some of this code is actually being generated? >>>>> >>>>> Cheers, >>>>> -Greg >>>>> >>>>> On Sun, Feb 28, 2021 at 5:05 AM Robert Muir <rcm...@gmail.com> wrote: >>>>> >>>>>> If you want to test a different block size (say 64 or 256), I really >>>>>> recommend to just fork a different codec for the experiment. >>>>>> >>>>>> There will likely be higher level changes you need to make, not just >>>>>> changing a number. For example if you just increased this number to 256 >>>>>> without doing anything else, I wouldn't be surprised if you see worse >>>>>> performance. More of the postings would be vint-encoded than before with >>>>>> 128, which might have some consequences. skipdata layout might be >>>>>> inappropriate, these things are optimized for blocks of 128. >>>>>> >>>>>> Just in general, I recommend making a codec for the benchmarking >>>>>> experiments, tools like luceneutil support comparing codecs against each >>>>>> other anyway so you can easily compare fairly against the existing codec. >>>>>> Also, it should be much easier/faster to just make a new codec and adapt >>>>>> it >>>>>> to test what you want! >>>>>> >>>>>> I think it is an antipattern to make stuff within the codec >>>>>> "flexible", it is autogenerated decompression code :) I am concerned such >>>>>> "flexibility" would create barriers in the future to optimizations. For >>>>>> example we should be able to experiment with converting this compression >>>>>> code over to explicit vector API in java. >>>>>> >>>>>> On Sat, Feb 27, 2021 at 4:29 PM Greg Miller <gsmil...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Hi folks! >>>>>>> >>>>>>> I've been a bit curious to test out different block size >>>>>>> configurations in the Lucene postings list format, but thought I'd reach >>>>>>> out to the community here first to see what work may have gone into this >>>>>>> previously. I'm essentially interested in benchmarking different block >>>>>>> size >>>>>>> configurations on the real-world application of Lucene I'm working on. >>>>>>> >>>>>>> If my understanding of the code is correct, I know we're currently >>>>>>> encoding compressed runs of 128 docs per block, relying on ForUtil for >>>>>>> encoding/decoding purposes. It looks like we define this in >>>>>>> ForUtil#BLOCK_SIZE (and reference it in a few external classes), but >>>>>>> also >>>>>>> know that it's not as simple as just changing that one definition. It >>>>>>> appears much of the logic in ForUtil relies on the assumption of 128 >>>>>>> docs-per-block. >>>>>>> >>>>>>> I'm toying with the idea of making ForUtil a bit more flexible to >>>>>>> allow for different block sizes to be tested in order to run the >>>>>>> benchmarking I'd like to run, but the class looks heavily optimized to >>>>>>> generate SIMD instructions (I think?), so that might be folly. Before I >>>>>>> start hacking on a local branch to see what I can learn, is there any >>>>>>> prior >>>>>>> work that might be useful to be aware of? Anyone gone down this path and >>>>>>> have some learnings to share? Any thoughts would be much appreciated! >>>>>>> >>>>>>> Cheers, >>>>>>> -Greg >>>>>>> >>>>>>