Re: benchmark drop for PrimaryKey

2018-08-24 Thread Michael Sokolov
In fact I see a pronounced effect even with the smallish (10k) index! And I should correct my earlier statement about FST50: my earlier test was flawed. I was confused about how these benchmarks work and updated nightlyBench.py rather than my localrun.py. After correcting that and comparing FST50

Re: benchmark drop for PrimaryKey

2018-08-24 Thread Adrien Grand
I don't think you need an index that is so large that the terms dictionary doesn't fit in the OS cache to reproduce the difference, but you might need a larger index indeed. On my end I use wikimedium10M or wikimediumall (and wikibigall if I need to test phrases) most of the time as I get more nois

Re: benchmark drop for PrimaryKey

2018-08-23 Thread Michael Sokolov
I think the benchmarks need updating after LUCENE-8461. I got them working again by replacing lucene70 with lucene80 everywhere except for the DocValues formats, and adding the backward-codecs.jar to the benchmarks build. I'm not sure that was really the right way to go about this? After that I did
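
The substitution described above (lucene70 replaced by lucene80 everywhere except the DocValues formats) could be sketched roughly as follows. This is only an illustration, not the change that was actually made: the helper name `bump_codec_names` and the rule "skip any line that mentions DocValues" are assumptions, and nothing here is part of luceneutil.

```python
import re


def bump_codec_names(text, old="lucene70", new="lucene80", keep_marker="DocValues"):
    """Hypothetical helper: replace `old` with `new` on every line of `text`,
    except lines that mention `keep_marker`, which are left untouched."""
    pattern = re.compile(re.escape(old), re.IGNORECASE)

    def swap(match):
        # Preserve the capitalisation of the first letter (Lucene70 vs lucene70).
        found = match.group(0)
        return new.capitalize() if found[0].isupper() else new

    out = []
    for line in text.splitlines(keepends=True):
        if keep_marker in line:
            # Assumed rule: DocValues formats stay on the old codec name.
            out.append(line)
        else:
            out.append(pattern.sub(swap, line))
    return "".join(out)
```

Filtering line by line is a deliberately coarse choice: a line that mentions both a DocValues format and another codec name would be skipped entirely, so the result would still need checking by hand.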

Re: benchmark drop for PrimaryKey

2018-08-23 Thread Michael Sokolov
OK thanks. I guess this benchmark must be run on a large-enough index that it doesn't fit entirely in RAM already anyway? When I ran it locally using the vanilla benchmark instructions, I believe the generated index was quite small (wikimedium10k). At any rate, I don't have any specific use case y

Re: benchmark drop for PrimaryKey

2018-08-23 Thread David Smiley
Switching to "FST50" ought to bring back much of the benefit of "Memory".

On Thu, Aug 23, 2018 at 5:15 PM Adrien Grand wrote:
> The commit that caused this slowdown might be
> https://github.com/mikemccand/luceneutil/commit/1d8460f342f269c98047def9f9eb76213acae5d9 .
>
> We don't have anything

Re: benchmark drop for PrimaryKey

2018-08-23 Thread Adrien Grand
The commit that caused this slowdown might be https://github.com/mikemccand/luceneutil/commit/1d8460f342f269c98047def9f9eb76213acae5d9 . We don't have anything that performs as well anymore indeed, but I'm not sure this is a big deal. I would suspect that there were not many users of that postings