Hi Alex,

If you're using NIOFSDirectory then indeed there will be a lot of kernel
calls (seeks over the off-heap FST). I'm not sure anything can be done
about it. mmap is just about the only option that comes to mind, as the
cost is then shifted to the kernel (and data is cached/released more
efficiently). It would be interesting to write some kind of benchmark
that creates a large-ish index and then uses both directories for the
same lookups - then we'd know:

a) whether it's indeed a problem of nio vs. mmap (very likely),
b) what the actual hotspots are in nio (profiling),
c) whether the problem is OS-specific; I bet the behavior differs between
OSes. MMap will likely be faster on most of them but I wonder if it's
consistent everywhere.
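
As a rough starting point for (a), here is a minimal sketch that does not
use Lucene at all. It only compares the two JDK read paths the directories
are built on: positional FileChannel.read (the nio side) against a
memory-mapped buffer (the mmap side). The class name, method names, file
size and lookup count are all made up for illustration. NIOFSDirectory
buffers its reads, so one read call per byte overstates the real overhead;
treat the numbers as an upper bound, not as a prediction for a real index.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

// Hypothetical micro-benchmark; all names here are illustrative only.
final class ReadPathBenchmark {

    // Writes `size` pseudo-random bytes to a temp file (removed on JVM exit).
    static Path createFile(int size, long seed) throws IOException {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        Path file = Files.createTempFile("readpath", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, data);
        return file;
    }

    // Generates `count` pseudo-random offsets in [0, size).
    static long[] randomOffsets(int count, int size, long seed) {
        return new Random(seed).longs(count, 0, size).toArray();
    }

    // nio path: one positional read call per lookup.
    static long sumViaChannel(Path file, long[] offsets) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer one = ByteBuffer.allocate(1);
            long sum = 0;
            for (long off : offsets) {
                one.clear();
                if (ch.read(one, off) != 1) {
                    throw new IOException("short read at offset " + off);
                }
                sum += one.get(0);
            }
            return sum;
        }
    }

    // mmap path: map the file once, then each lookup is a memory access.
    static long sumViaMmap(Path file, long[] offsets) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            for (long off : offsets) {
                sum += map.get((int) off);
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        int size = 64 << 20; // 64 MiB, an arbitrary choice
        Path file = createFile(size, 42L);
        long[] offsets = randomOffsets(1_000_000, size, 7L);
        // A few rounds so that JIT warm-up does not dominate the timings.
        for (int round = 0; round < 3; round++) {
            long t0 = System.nanoTime();
            long a = sumViaChannel(file, offsets);
            long t1 = System.nanoTime();
            long b = sumViaMmap(file, offsets);
            long t2 = System.nanoTime();
            System.out.printf("round %d: channel %d ms, mmap %d ms, same sum: %b%n",
                round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a == b);
        }
    }
}
```

Running it under a profiler, or under something like strace on Linux,
would also give a first hint for (b), and running it on different
machines for (c). A real test would still need an actual index opened
through NIOFSDirectory and MMapDirectory with the same term lookups.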

Dawid

On Thu, Aug 18, 2022 at 7:47 PM Alexander Lukyanchikov <
alexanderlukyanchi...@gmail.com> wrote:

> Hello everyone, I did not get a response on this, but wanted to give an
> update and ask a few more questions. Our profiling shows that Lucene
> spends 80% of the time reading FSTs for these queries, so the regression
> seems to be related to the change in Lucene 8.6 (LUCENE-9257) where the
> FSTs were moved off-heap. Our understanding is that with this change,
> reading FSTs became less efficient with NIOFSDirectory because Lucene
> spends more time bringing data onto the heap, which also significantly
> increases the number of system calls and kernel CPU usage.
>
> Currently we are trying to avoid switching to mmap because another
> process running on the same host makes extensive use of the FS cache. We
> did try FileSwitchDirectory to mmap only a minimal set of files - the
> Term Dictionary and Term Index. That helped, but only in some use cases.
>
> Is there anything else we are missing? Maybe some other Lucene data
> structures are critical to mmap when FSTs are off-heap? Is there anything
> else we could try? There is probably an option to bring FSTs back on heap,
> but we are trying to avoid it since there is no configuration for this, so
> it would require a lot of code changes.
>
> Thank you,
> Alex
>
>
> On Tue, Aug 9, 2022 at 6:34 AM Alexander Lukyanchikov <
> alexanderlukyanchi...@gmail.com> wrote:
>
>> Hello everyone,
>> We have a use-case which shows about 10 times higher latency for boolean
>> queries after migrating to Lucene 9.2. Each query contains 3 filter clauses
>> and up to a thousand single-term should clauses. They usually return fewer
>> than 5 documents with a single stored field and used to execute in about
>> 500 ms, but now take more than 5 seconds. Throughput remains roughly the
>> same, about 60 qps.
>>
>> A compounding factor is that our disk utilization is quite high and
>> usually in the range of 70% to 100%. That seems like an obvious issue, but
>> it did not affect these queries too much before. After the upgrade, we can
>> also see about twice the kernel CPU utilization. We are currently using
>> NIOFSDirectory.
>>
>> Please let me know if these symptoms mean something to you. It would be
>> great to know if we are doing something wrong and if there is a way to fix
>> the queries without upgrading hardware.
>>
>> Thank you,
>> Alex
>>
>
