[ https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570112#comment-17570112 ]
ASF subversion and git services commented on LUCENE-10592: ---------------------------------------------------------- Commit bd06cebfc2815bb508314ed8a4215e9da7f36de6 in lucene's branch refs/heads/main from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=bd06cebfc28 ] Add change log for LUCENE-10592 > Should we build HNSW graph on the fly during indexing > ----------------------------------------------------- > > Key: LUCENE-10592 > URL: https://issues.apache.org/jira/browse/LUCENE-10592 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Mayya Sharipova > Assignee: Mayya Sharipova > Priority: Minor > Time Spent: 8h > Remaining Estimate: 0h > > Currently, when we index vectors for KnnVectorField, we buffer those vectors > in memory and on flush during a segment construction we build an HNSW graph. > As building an HNSW graph is very expensive, this makes flush operation take > a lot of time. This also makes overall indexing performance quite > unpredictable (as the number of flushes are defined by memory used, and the > presence of concurrent searches), e.g. some indexing operations return almost > instantly while others that trigger flush take a lot of time. > Building an HNSW graph on the fly as we index vectors allows to avoid this > problem, and spread a load of HNSW graph construction evenly during indexing. > This will also supersede LUCENE-10194 -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org