Re: Weird HNSW merge performance result

2023-10-11 Thread Patrick Zhai
Hi Ben, Thanks! I think that's the issue! I was using an old local checkout. I will try with the latest commit and report back if the results still look weird. On Wed, Oct 11, 2023, 12:26 Benjamin Trent wrote: > Heya Patrick, > > What version of Lucene Util are you using? There was a bug

Re: Weird HNSW merge performance result

2023-10-11 Thread Benjamin Trent
Heya Patrick, What version of Lucene Util are you using? There was a bug where `forceMerge` was not actually using your configured maxConn & beamWidth. See: https://github.com/mikemccand/luceneutil/pull/232 Do you have that commit, and have you rebuilt the KnnGraphTester? On Wed, Oct 11, 2023 at 10:10 AM

Re: Weird HNSW merge performance result

2023-10-11 Thread Patrick Zhai
Hi Adrien, I'm using the default CMS, but I doubt that a merge is triggered in the background at all. Since I did not change the merge policy, I believe the default TMP will only merge the segments after they reach 10. But the index is about 300M and the buffer size is around 50M, so I
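
The arithmetic behind that doubt can be sketched roughly as follows. This is only an illustration, not Lucene code: the function names are hypothetical, it assumes each flush produces one segment of about the buffer size, and it simplifies the merge policy to "merge once the segment count reaches 10", as described in the message above.

```python
import math


def estimated_flushed_segments(index_mb: float, buffer_mb: float) -> int:
    """Rough segment count, assuming one flushed segment per full buffer."""
    return math.ceil(index_mb / buffer_mb)


def would_trigger_merge(segment_count: int, segments_per_tier: int = 10) -> bool:
    """Simplified stand-in for the merge policy: merge once the count reaches the tier size."""
    return segment_count >= segments_per_tier


# Figures quoted in the thread: ~300M index, ~50M buffer.
segments = estimated_flushed_segments(300, 50)
merge_expected = would_trigger_merge(segments)
```

Under these assumptions the estimate stays below the threshold, which is consistent with no background merge being triggered. Real flushed segments can differ in size from the buffer, so the actual count may vary.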

Re: Weird HNSW merge performance result

2023-10-10 Thread Adrien Grand
Regarding building time, did you configure a SerialMergeScheduler? Otherwise merges run in separate threads, which would explain the speedup, as adding vectors to the graph gets more and more expensive as the size of the graph increases. On Wed, Oct 11, 2023, 05:07, Patrick Zhai wrote: > Hi

Weird HNSW merge performance result

2023-10-10 Thread Patrick Zhai
Hi folks, I was running the HNSW benchmark today and found some weird results. I want to share them here and see whether people have any ideas. The setup is: the 384-dimension vectors available in luceneutil, 100k documents, on the Lucene main branch, with max_conn=64, fanout=0, beam_width=250. I