Re: Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-22 Thread Alexander Lukyanchikov
Hi Uwe, Thank you for the detailed explanation, that helps a lot. I am still trying to understand and confirm a few details though. > Is my understanding correct that it still makes sense to avoid MMAPing > files with the random access pattern on the most recent Lucene and JVM > versions? > Who

Re: Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-22 Thread Uwe Schindler
Hi Alexander, I understand that NIOFSDirectory also uses the FS cache, but doesn't MMapDirectory tend to fill up the cache with unnecessary data for random access patterns due to sequential read-ahead? Our concern is that it can potentially lead to evicting hot pages used by another process

Re: Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-20 Thread Alexander Lukyanchikov
Hi Robert, thank you for the response. I understand that NIOFSDirectory also uses the FS cache, but doesn't MMapDirectory tend to fill up the cache with unnecessary data for random access patterns due to sequential read-ahead? Our concern is that it can potentially lead to evicting hot pages used

Re: Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-19 Thread Robert Muir
On Thu, Aug 18, 2022 at 1:47 PM Alexander Lukyanchikov wrote: > > Currently we are trying to avoid switching to MMAP because there is another > process running on the same host that extensively utilizes the FS cache. > This makes no sense, NIOFSDirectory uses the FS cache the exact same way as

Re: Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-19 Thread Dawid Weiss
Hi Alex, If you're using NIOFSDirectory then indeed there will be a lot of kernel calls (seeks over the off-heap fst). I'm not sure anything can be done about it. mmap seems to be just about the only option that comes to my mind as then the cost is shifted to the kernel (and data is cached/
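
To make the contrast above concrete, here is a minimal sketch of the two access styles at the JDK level: positional reads through a FileChannel (one kernel call per read) versus reads from a memory-mapped buffer (plain memory access once the page is resident). This is not Lucene's code. The class name ReadPatternSketch and both method names are hypothetical, and the real NIOFSDirectory adds buffering on top of its positional reads, so the per-byte calls here exaggerate the effect for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration only; not part of Lucene.
final class ReadPatternSketch {

    // Positional I/O: every FileChannel.read(dst, position) call enters the kernel.
    static byte[] readWithChannel(Path file, long[] offsets) throws IOException {
        byte[] out = new byte[offsets.length];
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer one = ByteBuffer.allocate(1);
            for (int i = 0; i < offsets.length; i++) {
                one.clear();
                if (ch.read(one, offsets[i]) != 1) {
                    throw new IOException("short read at offset " + offsets[i]);
                }
                out[i] = one.get(0);
            }
        }
        return out;
    }

    // Memory mapping: after map(), each get() is a memory load; the kernel is
    // involved only when a page is not yet resident (page fault).
    static byte[] readWithMmap(Path file, long[] offsets) throws IOException {
        byte[] out = new byte[offsets.length];
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to 2 GB; this sketch assumes a small file.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            for (int i = 0; i < offsets.length; i++) {
                out[i] = map.get((int) offsets[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-pattern", ".bin");
        tmp.toFile().deleteOnExit();
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31);
        }
        Files.write(tmp, data);

        // Scattered offsets stand in for random seeks over an off-heap structure.
        long[] offsets = {65535, 0, 4096, 17, 40000};
        byte[] viaChannel = readWithChannel(tmp, offsets);
        byte[] viaMmap = readWithMmap(tmp, offsets);
        System.out.println("same bytes: " + Arrays.equals(viaChannel, viaMmap));
    }
}
```

Both methods return the same bytes and both are served from the OS page cache; what differs is how many times the process has to cross into the kernel to get them.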

Re: Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-18 Thread Alexander Lukyanchikov
Hello everyone, I did not get a response on this, but wanted to give an update and ask a few more questions. Our profiling shows that Lucene spends 80% of the time reading FSTs for these queries, so the regression seems to be related to the change in Lucene 8.6 (LUCENE-9257) where the FSTs were

Boolean query regression after migrating from Lucene 8.5 to 9.2

2022-08-09 Thread Alexander Lukyanchikov
Hello everyone, We have a use case which shows about 10 times higher latency for boolean queries after migrating to Lucene 9.2. Each query contains 3 filter clauses and up to a thousand single-term should clauses. They usually return fewer than 5 documents with a single stored field and used to