Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-01 Thread Michael Sokolov
I'm also in favor of raising this limit. We do see some datasets with higher than 1024 dims. I also think we need to keep a limit. For example, we currently need to keep all the vectors in RAM while indexing, and we want to be able to support reasonable numbers of vectors in an index segment. Also

Re: [Proposal] Remove max number of dimensions for KNN vectors

2023-04-01 Thread Ishan Chattopadhyaya
+1 to raising the limit. Maybe in the future, performance problems can be mitigated with optimisations or hardware acceleration (GPUs), etc.
On Sat, 1 Apr, 2023, 6:18 pm Michael Sokolov wrote:
> I'm also in favor of raising this limit. We do see some datasets with
> higher than 1024 dims. I also

question about impacts use case

2023-04-01 Thread Michael Sokolov
Hi, I've been working on seeing whether we can make use of impacts in Amazon search and I have some questions. To date, we haven't used Lucene's scoring APIs at all; all of our queries are constant score, we early terminate based on a sorted index rank and then re-rank using custom non-Lucene

Re: question about impacts use case

2023-04-01 Thread Michael Sokolov
Well, digging a little deeper, I can see that skipping behavior is going to depend heavily on the distribution of documents in the index, how many skip levels there are, and so on, and I may be getting hung up on a particular test case that doesn't generalize. In this case all the high-scoring