Hi All (and especially Robert),

Lucene NumericDocValues seems to operate slower than we would expect. In our application, we're using it to store coordinate values, which we retrieve to compute a distance. While running timings to determine the impact of including a sqrt in the calculation, we noted that the Lucene overhead itself overwhelmed pretty much anything we did in the ValueSource.
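To make the workload concrete, here is a minimal sketch of the kind of per-document computation in question, with and without the sqrt. It is not the attached test and not our production code; the class name, the method names, and the use of plain Euclidean distance over int X,Y,Z coordinates are assumptions made for illustration only.

```java
// Hypothetical illustration only: names and the distance formula are assumptions.
final class DistanceSketch {

    // Squared Euclidean distance between a document point and a query point.
    // Differences are taken in double to avoid int overflow.
    static double squaredDistance(int x, int y, int z, int qx, int qy, int qz) {
        double dx = (double) x - qx;
        double dy = (double) y - qy;
        double dz = (double) z - qz;
        return dx * dx + dy * dy + dz * dz;
    }

    // Same computation, plus the square root whose cost was being measured.
    static double distance(int x, int y, int z, int qx, int qy, int qz) {
        return Math.sqrt(squaredDistance(x, y, z, qx, qy, qz));
    }

    public static void main(String[] args) {
        System.out.println(squaredDistance(3, 4, 0, 0, 0, 0));
        System.out.println(distance(3, 4, 0, 0, 0, 0));
    }
}
```

The only difference between the two variants is the single Math.sqrt call per document, which is why it is notable that the value retrieval dominates the timings.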
One of our engineers did performance testing (code attached, hope it gets through), which shows what we are talking about. Please see the thread below. The question is: why is Lucene 2.5x slower than direct buffer access for this case? And is there anything we can do within the Lucene paradigm to get our performance back closer to the direct buffer case?

Karl

-----Original Message-----
From: Ziech Christian (HERE/Berlin)
Sent: Tuesday, October 08, 2013 9:08 AM
To: Wright Karl (HERE/Cambridge)
Subject: RE: Is there a really performant way to store a full 32-bit int in doc values?

Hi,

I have now tested the approach of using the NumericDocValues directly, and it indeed helps by about 20% compared to the original Lucene numbers. Lucene is still 2.5x slower than using a DirectBuffer alone, but it helps. The funny thing is that with Lucene, adding the square root is almost meaningless. This can be explained well by the CPU calculating the square root while other things are computed; since my micro-benchmark doesn't need the result for a while, the CPU can happily do other things in the meantime. Since we also have a lot of other query aspects, I assume we'd get that gain either way, so calculating about 30-50ms for the square root when scoring 25M documents should be about accurate.

So what is Lucene doing that causes it to be 3 times slower than the naive approach? And why is that impact so big compared to that of a simple square root (which slows things down by ~20%, assuming the 30ms with more complex actions)? I mean, 20% vs. 200% is an order of magnitude!

As a side note: storing the values as an int when using a DirectBuffer doesn't seem helpful, I assume because we have to cast the int to float later either way.

BR
Christian

PS: The new numbers are:

Scoring 25000000 documents with direct float buffers (without square root) took 190
Scoring 25000000 documents with direct float buffers (without square root) took 171
Scoring 25000000 documents with direct float buffers (without square root) took 172
Scoring 25000000 documents with direct float buffers (and a square root) took 281
Scoring 25000000 documents with direct float buffers (and a square root) took 280
Scoring 25000000 documents with direct float buffers (and a square root) took 266
Scoring 25000000 documents with a lucene float value source (without square root) took 1045
Scoring 25000000 documents with a lucene float value source (without square root) took 625
Scoring 25000000 documents with a lucene float value source (without square root) took 630
Scoring 25000000 documents with a lucene float value source (and a square root) took 661
Scoring 25000000 documents with a lucene float value source (and a square root) took 670
Scoring 25000000 documents with a lucene float value source (and a square root) took 665
Scoring 25000000 documents with direct int buffers (without square root) took 218
Scoring 25000000 documents with direct int buffers (without square root) took 219
Scoring 25000000 documents with direct int buffers (without square root) took 204
Scoring 25000000 documents with a lucene numeric values (without square root) source took 1123
Scoring 25000000 documents with a lucene numeric values (without square root) source took 500
Scoring 25000000 documents with a lucene numeric values (without square root) source took 499
Scoring 25000000 documents with a lucene numeric values (and a square root) source took 531
Scoring 25000000 documents with a lucene numeric values (and a square root) source took 531
Scoring 25000000 documents with a lucene numeric values (and a square root) source took 535

________________________________________
From: Wright Karl (HERE/Cambridge)
Sent: Monday, October 7, 2013 09:22
To: Ziech Christian (HERE/Berlin)
Subject: FW: Is there a really performant way to store a full 32-bit int in doc values?

-----Original Message-----
From: ext Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Monday, October 07, 2013 8:28 AM
To: Wright Karl (HERE/Cambridge)
Subject: Re: Is there a really performant way to store a full 32-bit int in doc values?

Well, it is a micro-benchmark ... so it'd be better to test in the wider/full context of the application?

I'm also a little worried that you go through ValueSource instead of interacting directly with the NumericDocValues instance; it's just an additional level of indirection that may confuse hotspot.

But it really ought not be so bad ...

Under the hood we encode a float to an int using Float.floatToRawIntBits; it could be that this doesn't work well w/ the compression we then do on the ints by default? I'm curious which impl the Lucene45DocValuesConsumer is using in your case. Looks like you are using random floats, so I'd expect it's using DELTA_COMPRESSED.

It'd be a simple test to just make your own DVFormat using raw 32 bit ints, to see how much that helps.

But, yes, I would just email the list and see if there are other ideas?

Mike McCandless

http://blog.mikemccandless.com

On Mon, Oct 7, 2013 at 7:14 AM, <karl.wri...@here.com> wrote:
> Hi Mike,
>
> Before I post to the general list, do you see any problem with our
> testing methodology?
>
> Basically, we conclude that by far the most expensive thing is
> retrieving the NumericDocValue value. This currently overwhelms any
> expensive operations we might do in the scoring ourselves, which is
> why we're looking for potential improvements in that area.
>
> Do you agree with the assessment?
>
> Karl
>
> From: Ziech Christian (HERE/Berlin)
> Sent: Friday, October 04, 2013 11:09 PM
> To: Wright Karl (HERE/Cambridge)
> Subject: RE: Is there a really performant way to store a full 32-bit
> int in doc values?
>
> Hi,
>
> maybe it's best if I share where I got my numbers from - I have
> written a small test (which originally was only meant to test the
> Math.sqrt() impact for 10M scorings).
>
> The output is (I looped over the search invocation to give Lucene a
> chance to load everything):
> Scoring 25000000 documents with direct buffers (without square root) took 203
> Scoring 25000000 documents with direct buffers (without square root) took 179
> Scoring 25000000 documents with direct buffers (without square root) took 172
> Scoring 25000000 documents with direct buffers (and a square root) took 292
> Scoring 25000000 documents with direct buffers (and a square root) took 289
> Scoring 25000000 documents with direct buffers (and a square root) took 289
> Scoring 25000000 documents with a lucene value (without square root) source took 1045
> Scoring 25000000 documents with a lucene value (without square root) source took 656
> Scoring 25000000 documents with a lucene value (without square root) source took 660
> Scoring 25000000 documents with a lucene value (without square root) source took 658
> Scoring 25000000 documents with a lucene value (without square root) source took 663
> Scoring 25000000 documents with a lucene value (and a square root) source took 711
> Scoring 25000000 documents with a lucene value (and a square root) source took 710
> Scoring 25000000 documents with a lucene value (and a square root) source took 713
> Scoring 25000000 documents with a lucene value (and a square root) source took 711
> Scoring 25000000 documents with a lucene value (and a square root) source took 714
>
> So the impact of a square root is roughly 110ms while the impact of
> using the Lucene function values is far higher (depending on the run
> between 300-350ms). Interestingly, the square root impact is not as high
> on the Lucene function query for some reason (most likely Java or the
> CPU can just optimize the very simple scorer best).
>
> I did measure the values with a FSDirectory and a RAMDirectory, which
> both essentially yield the same performance. Do you see any problem
> with the attached code?
>
> BR
> Christian
>
> ________________________________
>
> From: Wright Karl (HERE/Cambridge)
> Sent: Friday, October 4, 2013 20:56
> To: Ziech Christian (HERE/Berlin)
> Subject: FW: Is there a really performant way to store a full 32-bit
> int in doc values?
>
> FYI
> Karl
>
> Sent from my Windows Phone
>
> ________________________________
>
> From: ext Michael McCandless
> Sent: 10/4/2013 4:51 PM
> To: Wright Karl (HERE/Cambridge)
> Subject: Re: Is there a really performant way to store a full 32-bit
> int in doc values?
>
> Hmmm, that's interesting that you see decode cost is too high. Are
> you sure?
>
> Can you email the list? I'm sure Rob will have suggestions. The
> worst case is you make a custom DV format that stores things raw.
>
> 4.5 has a new default DocValuesFormat with more compression, but with
> values stored on disk by default (cached by the OS if you have the
> RAM) ... I wonder how that would compare to what you're using now.
>
> I think the simplest thing to do is to instantiate the
> Lucene42DocValuesConsumer (renamed to MemoryDVConsumer in 4.5),
> passing a very high acceptableOverheadRatio? This should cause
> packed ints to be upgraded to a byte[], short[], int[], long[]. If this
> is still not fast enough then I suspect a custom DVFormat that just
> uses int[] directly (avoiding the abstractions of packed ints) is your
> best shot.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Oct 4, 2013 at 8:46 AM, <karl.wri...@here.com> wrote:
>>
>> Hi Mike,
>>
>> We're using docvalues to store geocoordinates in meters in X,Y,Z
>> space, and discovering that they are taking more time to unpack than
>> we'd like.
>> I was surprised to find no raw representation available
>> for docvalues right now - otherwise, a fixed 4-byte representation
>> would have been ideal. Would you have any suggestions?
>>
>> Karl
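For readers without the attachment, the two "raw" baselines discussed in this thread (a direct float buffer, and floats kept as raw 32-bit ints as Mike suggests for a custom format) can be sketched with the JDK alone. This is not the attached LuceneFloatSourceTest.java and it uses no Lucene API; the class name, method names, document count, and timing loop are assumptions for illustration, and as noted above a micro-benchmark like this says little about a full application.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Random;

// Hypothetical sketch: compares scanning a direct float buffer with scanning
// an int[] that holds the same floats as raw bits. No Lucene code involved.
final class RawValueScanSketch {

    // Sum every value (optionally its square root) from a float buffer.
    static double scanFloatBuffer(FloatBuffer values, boolean withSqrt) {
        double sum = 0;
        for (int doc = 0; doc < values.capacity(); doc++) {
            float v = values.get(doc); // absolute read, one value per "document"
            sum += withSqrt ? Math.sqrt(v) : v;
        }
        return sum;
    }

    // Same scan over floats stored as raw 32-bit ints, decoded on read.
    static double scanRawInts(int[] rawBits, boolean withSqrt) {
        double sum = 0;
        for (int doc = 0; doc < rawBits.length; doc++) {
            float v = Float.intBitsToFloat(rawBits[doc]);
            sum += withSqrt ? Math.sqrt(v) : v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int numDocs = 1_000_000; // assumption: far smaller than the 25M in the thread
        FloatBuffer direct = ByteBuffer.allocateDirect(numDocs * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        int[] raw = new int[numDocs];
        Random random = new Random(42);
        for (int doc = 0; doc < numDocs; doc++) {
            float v = random.nextFloat();
            direct.put(doc, v);
            raw[doc] = Float.floatToRawIntBits(v); // same encoding Mike mentions
        }
        // Repeat a few times so the JIT has a chance to warm up.
        for (int round = 0; round < 3; round++) {
            long start = System.nanoTime();
            double fromBuffer = scanFloatBuffer(direct, true);
            long mid = System.nanoTime();
            double fromInts = scanRawInts(raw, true);
            long end = System.nanoTime();
            System.out.println("direct buffer: " + (mid - start) / 1_000_000
                    + " ms, raw int[]: " + (end - mid) / 1_000_000
                    + " ms, same sum: " + (fromBuffer == fromInts));
        }
    }
}
```

The int[] variant pays only a Float.intBitsToFloat per value, which is roughly what a raw, uncompressed doc values format would have to do on read; anything slower than that in a real index would come from the decoding and indirection layers on top.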
LuceneFloatSourceTest.java