date:20230518

Custom SliceExecutor and slices computation in IndexSearcher

2023-05-18 Thread SorabhApache

Hi All, For concurrent segment search, Lucene uses the *slices* method to compute the number of work units which can be processed concurrently. a) It calculates *slices* in the constructor of *IndexSearcher*
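To illustrate what "computing slices" means here, the following is a minimal sketch in plain Java, not Lucene's actual code. It assumes segments are represented only by their document counts, and the class name `SliceSketch` and the parameters `maxDocsPerSlice` and `maxSegmentsPerSlice` are chosen for illustration. It groups segments into work units so that large segments are searched on their own and small ones are batched together.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SliceSketch {
    // Hypothetical helper: groups segment doc counts into slices (work units).
    // A segment larger than maxDocsPerSlice gets a slice of its own; smaller
    // segments are batched until the batch exceeds maxDocsPerSlice documents
    // or reaches maxSegmentsPerSlice segments.
    static List<List<Integer>> slices(List<Integer> segmentDocCounts,
                                      int maxDocsPerSlice,
                                      int maxSegmentsPerSlice) {
        // Process the largest segments first.
        List<Integer> sorted = new ArrayList<>(segmentDocCounts);
        sorted.sort(Comparator.reverseOrder());

        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long docsInCurrent = 0;

        for (int docs : sorted) {
            if (docs > maxDocsPerSlice) {
                // Large segment: one slice containing only this segment.
                result.add(List.of(docs));
                continue;
            }
            current.add(docs);
            docsInCurrent += docs;
            if (docsInCurrent > maxDocsPerSlice || current.size() >= maxSegmentsPerSlice) {
                // Close the current batch and start a new one.
                result.add(current);
                current = new ArrayList<>();
                docsInCurrent = 0;
            }
        }
        if (!current.isEmpty()) {
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        // Four segments, at most 100 docs and 2 segments per slice.
        System.out.println(slices(List.of(10, 300, 20, 30), 100, 2));
    }
}
```

The number of slices returned is the number of tasks that could be handed to an executor; a custom strategy would change only how the grouping is done.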

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Nicholas Knize

Difficult to keep up with this topic when it's spread across issues, PRs, and email lists. My poll response is option 3. -1 to option 2, I think the configuration should be moved to the HNSW specific implementation. At this point of technical maturity, it doesn't make sense (to me) to have the

Re: Allowing tests to use multiple cores

2023-05-18 Thread Michael McCandless

Hmm, I think that setting just tells the JVM to pretend the underlying hardware has only one core? I.e. forcing "Runtime.getRuntime().availableProcessors()" to return 1. But your test is still free to launch multiple threads to test concurrency and they should run on multiple actual CPU cores if
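The point above can be checked with a small standalone sketch. It assumes nothing about which setting the thread refers to; the class name `ThreadsVsReportedCores` is made up for illustration. It prints what the JVM reports as the processor count and then starts a fixed number of threads anyway, showing that the reported value does not cap how many threads a test may launch. (One way to make the JVM report a single core, as far as I know, is the HotSpot flag `-XX:ActiveProcessorCount=1`; whether that is the setting discussed here is an assumption.)

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class ThreadsVsReportedCores {
    // Starts threadCount threads regardless of the reported processor count
    // and returns how many distinct threads actually ran.
    static int runThreads(int threadCount) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(threadCount);
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                // Record this worker's name, then signal completion.
                seen.add(Thread.currentThread().getName());
                done.countDown();
            }, "worker-" + i);
            t.start();
        }
        try {
            // Wait until every worker has run.
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int reported = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors() reports: " + reported);
        System.out.println("threads actually run: " + runThreads(4));
    }
}
```

Whether those threads are scheduled on multiple physical cores is up to the operating system; the sketch only shows that the JVM does not refuse to start them.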

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Michael Wechner

On 18.05.23 at 12:22, Michael McCandless wrote: I love all the energy and passion going into debating all the ways to poke at this limit, but please let's also spend some of this passion on actually improving the scalability of our aKNN implementation! E.g. Robert opened an exciting

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Michael Wechner

It is basically the code which Michael Sokolov posted at https://markmail.org/message/kf4nzoqyhwacb7ri except
- that I have replaced KnnVectorField by KnnFloatVectorField, because KnnVectorField is deprecated.
- that I don't hard code the dimension as 2048 and the metric as EUCLIDEAN, but

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Michael McCandless

This isn't really a VOTE (no specific code change is being proposed), but rather a poll? Anyway, I would prefer Option 3: put the limit check into the HNSW algorithm itself. This is the right place for the limit check, since HNSW has its own scaling behaviour. It might have other limits, like

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Alessandro Benedetti

That's great and a good plan B, but let's try to focus this thread on collecting votes for a week (let's keep discussions on the nice PR opened by David or the discussion thread we have in the mailing list already :) On Thu, 18 May 2023, 10:10 Ishan Chattopadhyaya, wrote: > That sounds

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Ishan Chattopadhyaya

That sounds promising, Michael. Can you share scripts/steps/code to reproduce this? On Thu, 18 May, 2023, 1:16 pm Michael Wechner, wrote: > I just implemented it and tested it with OpenAI's text-embedding-ada-002, > which is using 1536 dimensions and it works very fine :-) > > Thanks > >

Re: [VOTE] Dimension Limit for KNN Vectors

2023-05-18 Thread Michael Wechner

I just implemented it and tested it with OpenAI's text-embedding-ada-002, which is using 1536 dimensions and it works very fine :-) Thanks Michael On 18.05.23 at 00:29, Michael Wechner wrote: IIUC KnnVectorField is deprecated and one is supposed to use KnnFloatVectorField when using

9 matches
