Hi Ben,

Thanks! I think that's the issue. I was using an old local checkout. I will try with the latest commit and report back if the results still look weird.
On Wed, Oct 11, 2023, 12:26 Benjamin Trent <ben.w.tr...@gmail.com> wrote:

> Heya Patrick,
>
> What version of Lucene Util are you using? There was a bug where
> `forceMerge` was not actually using your configured maxConn & beamWidth.
> See: https://github.com/mikemccand/luceneutil/pull/232
>
> Do you have that commit, and have you rebuilt the KnnGraphTester?
>
> On Wed, Oct 11, 2023 at 10:10 AM Patrick Zhai <zhai7...@gmail.com> wrote:
>
>> Hi Adrien,
>> I'm using the default CMS, but I doubt whether any merge is triggered
>> in the background at all. Since no merge policy is changed, the default
>> TMP will likely only merge segments once they reach 10, I believe. The
>> index is about 300MB and the buffer size is around 50MB, so I don't
>> think we will have enough segments to trigger a merge while I'm
>> building the index.
>>
>> On Wed, Oct 11, 2023, 02:47 Adrien Grand <jpou...@gmail.com> wrote:
>>
>>> Regarding building time, did you configure a SerialMergeScheduler?
>>> Otherwise merges run in separate threads, which would explain the
>>> speedup, as adding vectors to the graph gets more and more expensive
>>> as the size of the graph increases.
>>>
>>> On Wed, Oct 11, 2023, 05:07 Patrick Zhai <zhai7...@gmail.com> wrote:
>>>
>>>> Hi folks,
>>>> I was running the HNSW benchmark today and found some weird results.
>>>> I want to share them here and see whether people have any ideas.
>>>>
>>>> The setup is:
>>>> the 384-dimension vectors available in luceneutil, 100k documents,
>>>> and the Lucene main branch.
>>>> max_conn=64, fanout=0, beam_width=250
>>>>
>>>> I first tried the default setting with a 1994MB writer buffer, so
>>>> with 100k documents no merge happens and I end up with 1 segment.
>>>> This gives me 0.755 recall and a 101113ms index building time.
>>>> Then I tried a 50MB writer buffer followed by a forceMerge at the
>>>> end. With 100k documents I get several segments before the merge
>>>> (the final index is around 300MB, so I guess 5 or 6), which are then
>>>> merged into 1 at the end.
>>>> This gives me 0.692 recall, but it took only 81562ms (including
>>>> 34394ms doing the merge) to index.
>>>> I have also tried disabling the initialize-from-graph feature (such
>>>> that when we merge we always rebuild the whole graph) and changing
>>>> the random seed, but I still get similar results.
>>>>
>>>> I'm wondering:
>>>> 1. Why does recall drop that much in the latter setup?
>>>> 2. Why is index time so much better? I think we still need to
>>>> rebuild the whole graph, or maybe it's just because we're using more
>>>> off-heap memory (and less heap) during the merge (do we?)?
>>>>
>>>> Best,
>>>> Patrick
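[Editor's note: for readers reproducing this outside KnnGraphTester, here is a minimal sketch of the IndexWriterConfig the thread describes: a ~50MB RAM buffer, a SerialMergeScheduler (per Adrien's suggestion, so merge cost shows up on the indexing thread instead of running in the background), and the HNSW parameters maxConn=64 / beamWidth=250. This assumes Lucene 9.x-era APIs where the per-field vectors format is `Lucene95HnswVectorsFormat`; the exact codec and format class names vary by Lucene version.]

```java
import org.apache.lucene.codecs.KnnVectorsFormat;
import org.apache.lucene.codecs.lucene95.Lucene95Codec;
import org.apache.lucene.codecs.lucene95.Lucene95HnswVectorsFormat;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.SerialMergeScheduler;

public class HnswBenchmarkConfig {
  public static IndexWriterConfig newConfig() {
    IndexWriterConfig iwc = new IndexWriterConfig();
    // ~50MB buffer: with 100k 384-dim vectors (~300MB index) this flushes
    // several segments before the final forceMerge, as in the second run.
    iwc.setRAMBufferSizeMB(50);
    // Run merges on the indexing thread so background merging cannot
    // skew the wall-clock indexing time being measured.
    iwc.setMergeScheduler(new SerialMergeScheduler());
    // Use maxConn=64 and beamWidth=250 for every vector field, matching
    // the benchmark settings in the thread.
    iwc.setCodec(new Lucene95Codec() {
      @Override
      public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
        return new Lucene95HnswVectorsFormat(64, 250);
      }
    });
    return iwc;
  }
}
```

Without the SerialMergeScheduler line, the default ConcurrentMergeScheduler merges in separate threads, which is one of the candidate explanations discussed above for the faster multi-segment build.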