[jira] [Commented] (LUCENE-10592) Should we build HNSW graph on the fly during indexing

Mayya Sharipova (Jira) Thu, 14 Jul 2022 06:53:07 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566853#comment-17566853
 ]


Mayya Sharipova commented on LUCENE-10592:
------------------------------------------

[~julietibs] Thanks for studying this PR.  Indeed, the sorting logic got 
complicated, and so far I have not found a better way. As an alternative, I 
considered completely rebuilding the graph from the sorted vector values 
(similar to the merge procedure), but I thought it would take more time than 
just remapping the ordinals. 
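
To illustrate what remapping the ordinals means here, below is a minimal 
sketch. It is not the code from the PR: the class name OrdinalRemap, the 
plain int[][] adjacency representation, and the oldToNew array are all 
hypothetical, and a single graph level is assumed. Re-sorting each neighbor 
list into ascending order is also an assumption made for the example.

```java
import java.util.Arrays;

class OrdinalRemap {
    /**
     * Translates a graph from pre-sort ordinals to post-sort ordinals.
     * neighbors[ord] lists the neighbor ordinals of node ord, and
     * oldToNew[oldOrd] gives the ordinal of that node after index sorting.
     * Both structures are hypothetical stand-ins, not Lucene classes.
     */
    static int[][] remap(int[][] neighbors, int[] oldToNew) {
        int[][] remapped = new int[neighbors.length][];
        for (int oldOrd = 0; oldOrd < neighbors.length; oldOrd++) {
            int[] translated = new int[neighbors[oldOrd].length];
            for (int i = 0; i < translated.length; i++) {
                // Each edge keeps its endpoints; only the labels change.
                translated[i] = oldToNew[neighbors[oldOrd][i]];
            }
            // Assumption: neighbor lists are kept in ascending order.
            Arrays.sort(translated);
            remapped[oldToNew[oldOrd]] = translated;
        }
        return remapped;
    }

    public static void main(String[] args) {
        int[][] graph = { {1, 2}, {0}, {0} };
        int[] oldToNew = {2, 0, 1};
        // Prints [[2], [2], [0, 1]]
        System.out.println(Arrays.deepToString(remap(graph, oldToNew)));
    }
}
```

The pass only touches each edge once and needs no distance computations, 
which is why I expect it to be cheaper than rebuilding the graph from the 
sorted vectors.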

> Should we build HNSW graph on the fly during indexing
> -----------------------------------------------------
>
>                 Key: LUCENE-10592
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10592
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Mayya Sharipova
>            Assignee: Mayya Sharipova
>            Priority: Minor
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Currently, when we index vectors for KnnVectorField, we buffer those vectors 
> in memory and build an HNSW graph on flush, during segment construction.  
> As building an HNSW graph is very expensive, this makes the flush operation 
> take a lot of time. This also makes overall indexing performance quite 
> unpredictable (as the number of flushes is determined by the memory used and 
> the presence of concurrent searches), e.g. some indexing operations return 
> almost instantly while others that trigger a flush take a lot of time. 
> Building an HNSW graph on the fly as we index vectors avoids this problem 
> and spreads the load of HNSW graph construction evenly during indexing.
> This will also supersede LUCENE-10194.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

