Hello! I also ran some local vector search benchmarks on branch_9_4. I found that given the same parameters, there is a significant change in recall/ QPS before and after the initial "enable quantization to 8-bit" backport (https://github.com/apache/lucene/pull/1054). Here's an example with M=16, efConst=100, plus fanout=100:
*BEFORE (commit 45d06772cb62a518dab43ca9d7000a2dc707d345)* Algorithm Recall QPS luceneknn dim=100 {'M': 16, 'efConstruction': 100} 0.828 788.259 *AFTER (commit 87a8f7d48f3d25ed008b5930cf83afc01d7dc588, with compile fixes for Java 11)* Algorithm Recall QPS luceneknn dim=100 {'M': 16, 'efConstruction': 100} 0.792 813.534 This is really unexpected, because the commit didn't intend to change anything about the core ANN algorithm. Looking at Mike's results, the recall changes from 0.775 -> 0.755, which is a significant difference -- if we didn't change the algorithm, I don't think the recall should change between runs. It seems important to look into before the release! Looking at the nightly benchmarks, there wasn't a drop in vector search performance. I'm guessing this is because we only report QPS, and after this commit the QPS actually improved a little (for some given parameters). I don't think we have any checks for if the recall has dropped? Julie On Thu, Sep 15, 2022 at 9:32 AM Michael Sokolov <msoko...@gmail.com> wrote: > it looks like a small bug fix, we have had on main (and 9.x?) for a > while now and no test failures showed up, I guess. Should be OK to > port. I plan to cut artifacts this weekend, or Monday at the latest, > but if you can do the backport today or tomorrow, that's fine by me. > > On Thu, Sep 15, 2022 at 10:55 AM Adrien Grand <jpou...@gmail.com> wrote: > > > > Mike, I'm tempted to backport https://github.com/apache/lucene/pull/1068 > to branch_9_4, which is a bugfix that looks pretty safe to me. What do you > think? > > > > On Tue, Sep 13, 2022 at 4:11 PM Mayya Sharipova < > mayya.sharip...@elastic.co.invalid> wrote: > >> > >> Thanks for running more tests, Michael. > >> It is encouraging that you saw a similar performance between 9.3 and > 9.4. I will also run more tests with different parameters. > >> > >> On Tue, Sep 13, 2022 at 9:30 AM Michael Sokolov <msoko...@gmail.com> > wrote: > >>> > >>> As a follow-up, I ran a test using the same parameters as above, only > >>> changing M=200 to M=16. This did result in a single segment in both > >>> cases (9.3, 9.4) and the performance was pretty similar; within noise > >>> I think. The main difference I saw was that the 9.3 index was written > >>> using CFS: > >>> > >>> 9.4: > >>> recall latency nDoc fanout maxConn beamWidth visited index > ms > >>> 0.755 1.36 1000000 100 16 100 200 891402 1.00 > >>> post-filter > >>> -rw-r--r-- 1 sokolovm amazon 382M Sep 13 13:06 > >>> _0_Lucene94HnswVectorsFormat_0.vec > >>> -rw-r--r-- 1 sokolovm amazon 262K Sep 13 13:06 > >>> _0_Lucene94HnswVectorsFormat_0.vem > >>> -rw-r--r-- 1 sokolovm amazon 131M Sep 13 13:06 > >>> _0_Lucene94HnswVectorsFormat_0.vex > >>> > >>> 9.3: > >>> recall latency nDoc fanout maxConn beamWidth visited index > ms > >>> 0.775 1.34 1000000 100 16 100 4033 977043 > >>> rw-r--r-- 1 sokolovm amazon 297 Sep 13 13:26 _0.cfe > >>> -rw-r--r-- 1 sokolovm amazon 516M Sep 13 13:26 _0.cfs > >>> -rw-r--r-- 1 sokolovm amazon 340 Sep 13 13:26 _0.si > >>> > >>> On Tue, Sep 13, 2022 at 8:50 AM Michael Sokolov <msoko...@gmail.com> > wrote: > >>> > > >>> > I ran another test. I thought I had increased the RAM buffer size to > >>> > 8G and heap to 16G. However I still see two segments in the index > that > >>> > was created. And looking at the infostream I see: > >>> > > >>> > dir=MMapDirectory@ > /local/home/sokolovm/workspace/knn-perf/glove-100-angular.hdf5-train-200-200.index > >>> > lockFactory=org\ > >>> > .apache.lucene.store.NativeFSLockFactory@4466af20 > >>> > index= > >>> > version=9.4.0 > >>> > analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer > >>> > ramBufferSizeMB=8000.0 > >>> > maxBufferedDocs=-1 > >>> > ... > >>> > perThreadHardLimitMB=1945 > >>> > ... > >>> > DWPT 0 [2022-09-13T02:42:53.329404950Z; main]: flush postings as > >>> > segment _6 numDocs=555373 > >>> > IW 0 [2022-09-13T02:42:53.330671171Z; main]: 0 msec to write norms > >>> > IW 0 [2022-09-13T02:42:53.331113184Z; main]: 0 msec to write > docValues > >>> > IW 0 [2022-09-13T02:42:53.331320146Z; main]: 0 msec to write points > >>> > IW 0 [2022-09-13T02:42:56.424195657Z; main]: 3092 msec to write > vectors > >>> > IW 0 [2022-09-13T02:42:56.429239944Z; main]: 4 msec to finish stored > fields > >>> > IW 0 [2022-09-13T02:42:56.429593512Z; main]: 0 msec to write postings > >>> > and finish vectors > >>> > IW 0 [2022-09-13T02:42:56.430309031Z; main]: 0 msec to write > fieldInfos > >>> > DWPT 0 [2022-09-13T02:42:56.431721622Z; main]: new segment has 0 > deleted docs > >>> > DWPT 0 [2022-09-13T02:42:56.431921144Z; main]: new segment has 0 > >>> > soft-deleted docs > >>> > DWPT 0 [2022-09-13T02:42:56.435738086Z; main]: new segment has no > >>> > vectors; no norms; no docValues; no prox; freqs > >>> > DWPT 0 [2022-09-13T02:42:56.435952356Z; main]: > >>> > flushedFiles=[_6_Lucene94HnswVectorsFormat_0.vec, _6.fdm, _6.fdt, > _6_\ > >>> > Lucene94HnswVectorsFormat_0.vem, _6.fnm, _6.fdx, > >>> > _6_Lucene94HnswVectorsFormat_0.vex] > >>> > DWPT 0 [2022-09-13T02:42:56.436121861Z; main]: flushed codec=Lucene94 > >>> > DWPT 0 [2022-09-13T02:42:56.437691468Z; main]: flushed: segment=_6 > >>> > ramUsed=1,945.002 MB newFlushedSize=1,065.701 MB \ > >>> > docs/MB=521.134 > >>> > > >>> > so I think it's this perThreadHardLimit that is triggering the > >>> > flushes? TBH this isn't something I had seen before; but the docs > say: > >>> > > >>> > /** > >>> > * Expert: Sets the maximum memory consumption per thread > triggering > >>> > a forced flush if exceeded. A > >>> > * {@link DocumentsWriterPerThread} is forcefully flushed once it > >>> > exceeds this limit even if the > >>> > * {@link #getRAMBufferSizeMB()} has not been exceeded. This is a > >>> > safety limit to prevent a {@link > >>> > * DocumentsWriterPerThread} from address space exhaustion due to > >>> > its internal 32 bit signed > >>> > * integer based memory addressing. The given value must be less > >>> > that 2GB (2048MB) > >>> > * > >>> > * @see #DEFAULT_RAM_PER_THREAD_HARD_LIMIT_MB > >>> > */ > >>> > > >>> > On Mon, Sep 12, 2022 at 6:28 PM Michael Sokolov <msoko...@gmail.com> > wrote: > >>> > > > >>> > > Hi Mayya, thanks for persisting - I think we need to wrestle this > to > >>> > > the ground for sure. In the test I ran, RAM buffer was the default > >>> > > checked in, which is weirdly: 1994MB. I did not specifically set > heap > >>> > > size. I used maxConn/M=200. I'll try with larger buffer to see if > I > >>> > > can get 9.4 to produce a single segment for the same test > settings. I > >>> > > see you used a much smaller M (16), which should have produced > quite > >>> > > small graphs, and I agree, should have been a single segment. Were > you > >>> > > able to verify the number of segments? > >>> > > > >>> > > Agree that decrease in recall is not expected when more segments > are produced. > >>> > > > >>> > > On Mon, Sep 12, 2022 at 1:51 PM Mayya Sharipova > >>> > > <mayya.sharip...@elastic.co.invalid> wrote: > >>> > > > > >>> > > > Hello Michael, > >>> > > > Thanks for checking. > >>> > > > Sorry for bringing this up again. > >>> > > > First of all, I am ok with proceeding with the Lucene 9.4 > release and leaving the performance investigations for later. > >>> > > > > >>> > > > I am interested in what's the maxConn/M value you used for your > tests? What was the heap memory and the size of the RAM buffer for indexing? > >>> > > > Usually, when we have multiple segments, recall should increase, > not decrease. But I agree that with multiple segments we can see a big drop > in QPS. > >>> > > > > >>> > > > Here is my investigation with detailed output of the performance > difference between 9.3 and 9.4 releases. In my tests I used a large > indexing buffer (2Gb) and large heap (5Gb) to end up with a single segment > for both 9.3 and 9.4 tests, but still see a big drop in QPS in 9.4. > >>> > > > > >>> > > > Thank you. > >>> > > > > >>> > > > > >>> > > > > >>> > > > > >>> > > > > >>> > > > On Fri, Sep 9, 2022 at 12:21 PM Alan Woodward < > romseyg...@gmail.com> wrote: > >>> > > >> > >>> > > >> Done. Thanks! > >>> > > >> > >>> > > >> > On 9 Sep 2022, at 16:32, Michael Sokolov <msoko...@gmail.com> > wrote: > >>> > > >> > > >>> > > >> > Hi Alan - I checked out the interval queries patch; seems > pretty safe, > >>> > > >> > please go ahead and port to 9.4. Thanks! > >>> > > >> > > >>> > > >> > Mike > >>> > > >> > > >>> > > >> > On Fri, Sep 9, 2022 at 10:41 AM Alan Woodward < > romseyg...@gmail.com> wrote: > >>> > > >> >> > >>> > > >> >> Hi Mike, > >>> > > >> >> > >>> > > >> >> I’ve opened https://github.com/apache/lucene/pull/11760 as > a small bug fix PR for a problem with interval queries. Am I OK to port > this to the 9.4 branch? > >>> > > >> >> > >>> > > >> >> Thanks, Alan > >>> > > >> >> > >>> > > >> >> On 2 Sep 2022, at 20:42, Michael Sokolov <msoko...@gmail.com> > wrote: > >>> > > >> >> > >>> > > >> >> NOTICE: > >>> > > >> >> > >>> > > >> >> Branch branch_9_4 has been cut and versions updated to 9.5 > on stable branch. > >>> > > >> >> > >>> > > >> >> Please observe the normal rules: > >>> > > >> >> > >>> > > >> >> * No new features may be committed to the branch. > >>> > > >> >> * Documentation patches, build patches and serious bug fixes > may be > >>> > > >> >> committed to the branch. However, you should submit all > patches you > >>> > > >> >> want to commit to Jira first to give others the chance to > review > >>> > > >> >> and possibly vote against the patch. Keep in mind that it is > our > >>> > > >> >> main intention to keep the branch as stable as possible. > >>> > > >> >> * All patches that are intended for the branch should first > be committed > >>> > > >> >> to the unstable branch, merged into the stable branch, and > then into > >>> > > >> >> the current release branch. > >>> > > >> >> * Normal unstable and stable branch development may continue > as usual. > >>> > > >> >> However, if you plan to commit a big change to the unstable > branch > >>> > > >> >> while the branch feature freeze is in effect, think twice: > can't the > >>> > > >> >> addition wait a couple more days? Merges of bug fixes into > the branch > >>> > > >> >> may become more difficult. > >>> > > >> >> * Only Jira issues with Fix version 9.4 and priority > "Blocker" will delay > >>> > > >> >> a release candidate build. > >>> > > >> >> > >>> > > >> >> > --------------------------------------------------------------------- > >>> > > >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >>> > > >> >> For additional commands, e-mail: dev-h...@lucene.apache.org > >>> > > >> >> > >>> > > >> >> > >>> > > >> > > >>> > > >> > > --------------------------------------------------------------------- > >>> > > >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >>> > > >> > For additional commands, e-mail: dev-h...@lucene.apache.org > >>> > > >> > > >>> > > >> > >>> > > >> > >>> > > >> > --------------------------------------------------------------------- > >>> > > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >>> > > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >>> > > >> > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >>> For additional commands, e-mail: dev-h...@lucene.apache.org > >>> > > > > > > -- > > Adrien > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >