Hi all,
I first published a concurrent HNSW PR in April, which turned out to be a
bit premature. There was a lot of code churn as I fixed bugs and improved
performance. Sorry about that!
This code has been available as part of DataStax Astra’s public vector
search preview for almost a month now.
Draft PR is posted here: https://github.com/apache/lucene/pull/12254
This depends on my PR to use HashMap in the non-concurrent OnHeapHnswGraph
(because that PR updates the tests to not assume sorted order of nodes in a
given level): https://github.com/apache/lucene/pull/12248
On Fri, Apr 28, 202
Great, I will work on squashing to get a clean PR.
One thing I am struggling with is the RamUsageTester. Here is the
stacktrace: https://gist.github.com/jbellis/20676b0e23f43751cbe8834a8def0d12
Apparently RamUsageTester tries to flip private fields to public so it can
introspect them, but the JV
That's great! And we were talking about this exactly here:
https://github.com/apache/lucene/pull/12169
It would also help with the new token filter :)
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*
e-mail: a.benede..
+1 for a pull request
Thanks
Michael
On 27.04.23 at 20:53, Ishan Chattopadhyaya wrote:
+1, please contribute to Lucene. Thanks!
On Thu, 27 Apr, 2023, 10:59 pm Jonathan Ellis wrote:
Hi all,
I've created an HNSW index implementation that allows for
concurrent build and querying
+1, please contribute to Lucene. Thanks!
On Thu, 27 Apr, 2023, 10:59 pm Jonathan Ellis wrote:
> Hi all,
>
> I've created an HNSW index implementation that allows for concurrent build
> and querying. On my i9-12900 (8 performance cores and 8 efficiency) I get
> a bit less than 10x speedup of wa
Hi all,
I've created an HNSW index implementation that allows for concurrent build
and querying. On my i9-12900 (8 performance cores and 8 efficiency) I get
a bit less than 10x speedup of wall clock time for building and querying
the "siftsmall" and "sift" datasets from http://corpus-texmex.irisa