Idea about faster vector format merge

2022-10-18 Thread Patrick Zhai
Hi folks, I've talked with Mike Sokolov and learnt some KNN knowledge from him (thank you!) during ApacheCon, and one thing I learnt was that our KNN implementation suffers from long merge times because we currently rebuild the graph from scratch every time we merge. I noticed

Re: call for 9.4.1 release (bug in vectors format)

2022-10-18 Thread Julie Tibshirani
I've uploaded a fix in https://github.com/apache/lucene/pull/11861 (thanks Mike for the review!). If there are no objections, I plan to merge it tomorrow and then get started on a 9.4.1 release candidate. Julie On Tue, Oct 18, 2022 at 2:52 PM Michael Sokolov wrote: > Oh no! Very sorry -- thank

Re: call for 9.4.1 release (bug in vectors format)

2022-10-18 Thread Michael Sokolov
Oh no! Very sorry -- thank you for volunteering to fix (hangs head in shame). I guess I'll see where the bug is soon ... On Tue, Oct 18, 2022 at 2:50 PM Michael Wechner wrote: > > +1 :-) > > Thanks > > Michael > > On 18.10.22 at 19:52, Julie Tibshirani wrote: > > Hi everyone, > > > > We

Re: call for 9.4.1 release (bug in vectors format)

2022-10-18 Thread Mayya Sharipova
+1, thanks Julie for tackling this, and for serving as release manager. On Tue, Oct 18, 2022 at 2:51 PM Michael Wechner wrote: > +1 :-) > > Thanks > > Michael > > On 18.10.22 at 19:52, Julie Tibshirani wrote: > > Hi everyone, > > > > We recently discovered a severe bug in the 9.4 release in the

RE: Backporting of Nori

2022-10-18 Thread Shad Storhaug
Hello, To clarify, this is regarding the GitHub PR and Lucene JIRA ticket where you can read more info: https://github.com/apache/lucenenet/pull/645 https://issues.apache.org/jira/browse/LUCENE-8231 We attempted to port the Nori analysis package to .NET 3 years ago from Lucene 8.2.0 to

Re: call for 9.4.1 release (bug in vectors format)

2022-10-18 Thread Michael Wechner
+1 :-) Thanks Michael On 18.10.22 at 19:52, Julie Tibshirani wrote: Hi everyone, We recently discovered a severe bug in the 9.4 release in the kNN vectors format: https://github.com/apache/lucene/issues/11858. Explaining the problem: when ingesting a lot of data, or when performing a

call for 9.4.1 release (bug in vectors format)

2022-10-18 Thread Julie Tibshirani
Hi everyone, We recently discovered a severe bug in the 9.4 release in the kNN vectors format: https://github.com/apache/lucene/issues/11858. Explaining the problem: when ingesting a lot of data, or when performing a force merge, segments can grow large. The format validation code accidentally