[ https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayya Sharipova resolved LUCENE-10592.
--------------------------------------
Fix Version/s: 9.4
Resolution: Fixed
> Should we build HNSW graph on the fly during indexing
> -----------------------------------------------------
>
> Key: LUCENE-10592
> URL: https://issues.apache.org/jira/browse/LUCENE-10592
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Mayya Sharipova
> Assignee: Mayya Sharipova
> Priority: Minor
> Fix For: 9.4
>
> Time Spent: 8h
> Remaining Estimate: 0h
>
> Currently, when we index vectors for KnnVectorField, we buffer those vectors
> in memory and build an HNSW graph on flush, during segment construction.
> Because building an HNSW graph is very expensive, flush operations take
> a long time. This also makes overall indexing performance quite
> unpredictable (the number of flushes is determined by memory usage and the
> presence of concurrent searches): some indexing operations return almost
> instantly, while others that trigger a flush take a long time.
> Building the HNSW graph on the fly as we index vectors avoids this
> problem and spreads the cost of HNSW graph construction evenly across
> indexing.
> This will also supersede LUCENE-10194.
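
To illustrate the trade-off described above, here is a minimal toy model in Java. It is not Lucene code and does not reflect how the change was implemented. The class OnTheFlyGraphSketch, its methods, and the cost function are all hypothetical; in particular, the assumption that inserting into a graph of n nodes costs roughly log2(n) work units is a simplification made only for this sketch. The model counts work units per indexing operation under both strategies.

```java
/** Toy model only: hypothetical names and costs, not Lucene's implementation. */
public class OnTheFlyGraphSketch {

    /** Assumed cost of inserting one vector into a graph that already holds graphSize nodes. */
    public static long insertCost(int graphSize) {
        // Bit length of (graphSize + 1): a rough stand-in for logarithmic insert cost.
        return 32 - Integer.numberOfLeadingZeros(graphSize + 1);
    }

    /** Buffer vectors, build the whole graph at flush. Returns {totalWork, maxWorkInOneOperation}. */
    public static long[] bufferThenBuild(int numVectors, int flushEvery) {
        long total = 0, max = 0;
        int buffered = 0;
        for (int i = 0; i < numVectors; i++) {
            long op = 1; // cost of buffering the vector
            buffered++;
            if (buffered == flushEvery || i == numVectors - 1) {
                // This operation triggers a flush and pays for the entire graph build.
                for (int n = 0; n < buffered; n++) {
                    op += insertCost(n);
                }
                buffered = 0;
            }
            total += op;
            max = Math.max(max, op);
        }
        return new long[] {total, max};
    }

    /** Insert each vector into the graph as it arrives. Returns {totalWork, maxWorkInOneOperation}. */
    public static long[] buildOnTheFly(int numVectors, int flushEvery) {
        long total = 0, max = 0;
        int inGraph = 0;
        for (int i = 0; i < numVectors; i++) {
            long op = 1 + insertCost(inGraph); // buffer and insert right away
            inGraph++;
            if (inGraph == flushEvery || i == numVectors - 1) {
                // Flush only hands off the finished graph (modeled as free); a new segment starts.
                inGraph = 0;
            }
            total += op;
            max = Math.max(max, op);
        }
        return new long[] {total, max};
    }

    public static void main(String[] args) {
        long[] buffered = bufferThenBuild(1000, 100);
        long[] onTheFly = buildOnTheFly(1000, 100);
        System.out.println("buffer-then-build: total=" + buffered[0] + " worst op=" + buffered[1]);
        System.out.println("build-on-the-fly:  total=" + onTheFly[0] + " worst op=" + onTheFly[1]);
    }
}
```

In this model the total work is identical for both strategies; only its distribution changes. With buffering, the operation that triggers a flush absorbs the whole graph build, while the on-the-fly variant pays a small, bounded cost on every operation.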