Re: BinaryDocValues compression with 8.5.1

2020-05-21 Thread Michael McCandless
Thanks Viral!
Mike McCandless
http://blog.mikemccandless.com
On Thu, May 21, 2020 at 2:21 PM Viral Gandhi wrote:
> Thank you! Opened https://issues.apache.org/jira/browse/LUCENE-9378 to
> address this.
>
> Viral Gandhi
>
> On Wed, 20 May 2020 at 15:27, Michael McCandless <
> luc...@mikemccand

Re: BinaryDocValues compression with 8.5.1

2020-05-21 Thread Viral Gandhi
Thank you! Opened https://issues.apache.org/jira/browse/LUCENE-9378 to address this.
Viral Gandhi
On Wed, 20 May 2020 at 15:27, Michael McCandless wrote:
> I think we could do this at the Codec level?
>
> For example, for stored fields, the current default format
> (Lucene50StoredFieldsFormat)

Re: BinaryDocValues compression with 8.5.1

2020-05-20 Thread Michael McCandless
I think we could do this at the Codec level? For example, for stored fields, the current default format (Lucene50StoredFieldsFormat) has two modes, Mode.BEST_SPEED and Mode.BEST_COMPRESSION, that are easy for the user to pick. Both modes use compression, just at varying levels. I think for the (

Re: BinaryDocValues compression with 8.5.1

2020-05-20 Thread Michael Sokolov
I guess the compression we added to binary doc values, and for postings, seems to have hurt performance in a way that wasn't detected in testing when those changes were made, or if it was detected, I don't recall any discussion about the tradeoff being made. Now that we do see there is a tradeoff,

Re: BinaryDocValues compression with 8.5.1

2020-05-18 Thread David Smiley
I don't have a direct answer for you, but your message causes me to reflect on how Lucene does *not* give users a choice of format on a per-type basis (e.g. BinaryDocValues vs. NumericDocValues, etc.), which is annoying. Ideally the previous simple format would be available for you to choose, but it

BinaryDocValues compression with 8.5.1

2020-05-18 Thread Viral Gandhi
Hi, I tried upgrading to Lucene 8.5.1 from 8.4 and ran our internal benchmarking. We noticed that with this upgrade our QPS dropped by more than 40% and latencies were also affected. After doing some profiling and reverting the LUCENE-9211 commit related to B