Re: [JENKINS] Lucene-9.x-Linux (64bit/openj9/jdk-17.0.5) - Build # 9891 - Unstable!

2023-04-19 Thread Dawid Weiss
openj9. Does not reproduce for me. On Thu, Apr 20, 2023 at 4:50 AM Policeman Jenkins Server wrote: > > Build: https://jenkins.thetaphi.de/job/Lucene-9.x-Linux/9891/ > Java: 64bit/openj9/jdk-17.0.5 -XX:-UseCompressedOops -Xgcpolicy:metronome > > 1 tests failed. > FAILED:

Should IndexWriter.flush return seqNo?

2023-04-19 Thread Patrick Zhai
Hi folks, I just realized that while "commit" returns the sequence number representing the latest event committed to the index, "flush" still returns nothing. Since the two are essentially the same except for the fsync, I wonder whether there is any specific reason not to do so? Best, Patrick

Re: HNSW questions

2023-04-19 Thread Michael Sokolov
That class is intended for use by the Lucene index writer - it's not designed as a general purpose class for re-use outside that context. And IndexWriter writes documents to disk in bulk. On Wed, Apr 19, 2023 at 3:54 PM Jonathan Ellis wrote: > > Thanks, Michael! > > Looking at the paper by

Re: Lucene 9.6 release

2023-04-19 Thread Michael Sokolov
Yes, thanks Alan! On Wed, Apr 19, 2023 at 3:41 PM Michael Wechner wrote: > > +1 > > Thanks! > > Michael > > Am 19.04.23 um 18:09 schrieb Benjamin Trent: > > +1 ! > > You rock Alan! > > On Wed, Apr 19, 2023, 9:54 AM Ignacio Vera wrote: >> >> +1 >> >> Thanks Alan! >> >> On Wed, Apr 19, 2023 at

Re: HNSW questions

2023-04-19 Thread Jonathan Ellis
Thanks, Michael! Looking at the paper by Malkov and Yashunin, it looks like the algorithm allows for building the hnsw graph incrementally. Why does our implementation require specifying all the vectors up front to HnswGraphBuilder.create? On Wed, Apr 19, 2023 at 3:04 AM Michael Sokolov wrote:

Re: Lucene 9.6 release

2023-04-19 Thread Michael Wechner
+1 Thanks! Michael Am 19.04.23 um 18:09 schrieb Benjamin Trent: +1 ! You rock Alan! On Wed, Apr 19, 2023, 9:54 AM Ignacio Vera wrote: +1 Thanks Alan! On Wed, Apr 19, 2023 at 1:27 PM Alan Woodward wrote: Hi all, It’s been a while since our last release,

Re: Lucene 9.6 release

2023-04-19 Thread Benjamin Trent
+1 ! You rock Alan! On Wed, Apr 19, 2023, 9:54 AM Ignacio Vera wrote: > +1 > > Thanks Alan! > > On Wed, Apr 19, 2023 at 1:27 PM Alan Woodward > wrote: > >> Hi all, >> >> It’s been a while since our last release, and we have a number of nice >> improvements and optimisations sitting in the 9x

Re: Lucene 9.6 release

2023-04-19 Thread Ignacio Vera
+1 Thanks Alan! On Wed, Apr 19, 2023 at 1:27 PM Alan Woodward wrote: > Hi all, > > It’s been a while since our last release, and we have a number of nice > improvements and optimisations sitting in the 9x branch. I propose that we > start the process for a 9.6 release, and I will volunteer to

Lucene 9.6 release

2023-04-19 Thread Alan Woodward
Hi all, It’s been a while since our last release, and we have a number of nice improvements and optimisations sitting in the 9x branch. I propose that we start the process for a 9.6 release, and I will volunteer to be the release manager. If there are no objections, I will cut a release

Re: HNSW questions

2023-04-19 Thread Michael Sokolov
Oh, identical vectors. Basically unsupported. If you create a large index filled with identical vectors, it leads to pathological behavior. Seems to be a weakness in the algorithm. If you have any idea how to improve that, it would be welcome. But in real-world scenarios, it doesn't seem to arise?

Re: HNSW questions

2023-04-19 Thread Michael Sokolov
These vector values have internal buffers they use to return the vectors. In order to compare two vectors we need to use two independent sources so that one doesn't overwrite this internal state when fetching the second vector. Sorry I forgot the second question and can't see it on my phone. Brb
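
To illustrate the buffer point above, here is a minimal sketch in plain Java. It does not use Lucene's classes: ReusingVectorSource, its vectorValue and copy methods, and the distance helpers are hypothetical names made up for this example. The only assumption carried over from the message is that a source returns its vectors through one internal buffer, so a second fetch overwrites the first.

```java
public class BufferReuseDemo {

    /** Hypothetical vector source (not a Lucene class) that returns one shared internal buffer. */
    static final class ReusingVectorSource {
        private final float[][] data;
        private final float[] buffer; // reused on every vectorValue() call

        ReusingVectorSource(float[][] data) {
            this.data = data;
            this.buffer = new float[data[0].length];
        }

        /** Copies vector ord into the internal buffer and returns that buffer. */
        float[] vectorValue(int ord) {
            System.arraycopy(data[ord], 0, buffer, 0, buffer.length);
            return buffer;
        }

        /** Returns an independent source over the same data with its own buffer. */
        ReusingVectorSource copy() {
            return new ReusingVectorSource(data);
        }
    }

    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Broken: both references point at the same buffer, so the result is always 0. */
    static float distanceSingleSource(ReusingVectorSource src, int i, int j) {
        float[] a = src.vectorValue(i);
        float[] b = src.vectorValue(j); // overwrites the contents that a refers to
        return squaredDistance(a, b);
    }

    /** Correct: each vector comes from a source with its own buffer. */
    static float distanceTwoSources(ReusingVectorSource s1, ReusingVectorSource s2, int i, int j) {
        return squaredDistance(s1.vectorValue(i), s2.vectorValue(j));
    }

    public static void main(String[] args) {
        float[][] data = {{0f, 0f}, {3f, 4f}};
        ReusingVectorSource src = new ReusingVectorSource(data);
        System.out.println(distanceSingleSource(src, 0, 1));           // prints 0.0 (aliasing)
        System.out.println(distanceTwoSources(src, src.copy(), 0, 1)); // prints 25.0
    }
}
```

Copying each vector out of the buffer before the next fetch would also avoid the aliasing, at the cost of an allocation per comparison; a second source with its own buffer avoids that cost.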