Hi,

Thanks for the great summary!
Regarding point 3, the ticket has been opened: https://issues.apache.org/jira/browse/SOLR-17843

Regards,

Emeric Bernet-Rollande
France Labs – Your knowledge, now
Datafari Enterprise Search
Meet us at the Big Data & IA trade show on October 1 and 2 in Paris, stand C31

-----Original Message-----
From: David Smiley <[email protected]>
Sent: Friday, August 8, 2025 19:56
To: [email protected]
Subject: Re: Dense Vector Dev Group - Aug 6 Meeting Summary

Fantastic meeting summary! And it's great to see these so well attended.

On Wed, Aug 6, 2025 at 3:38 PM Kevin Liang (BLOOMBERG/ 919 3RD A) <[email protected]> wrote:

> Thanks to all those that attended, here's a brief summary of topics
> discussed:
>
> 1. Lucene 10 PR is unblocked with only a few tests left to fix. Lucene
> 10 is a prerequisite for both the Solr 10 release and some new dense
> vector changes (seeded knn query, patience early termination knn
> query, binary bit quantization, etc.)
>
> 2. Highlighting some other in-flight dense vector changes:
>    * Reciprocal rank fusion (SOLR-17319)
>    * Block join multi vector document (SOLR-17736)
>    * Scalar quantized vector field (SOLR-17780)
>    * Lucene support for GPUs
>
> 3. Question re: atomic updates and text to vector update processor.
> Ticket with use case and details to be filed
>
> 4. Performance of Solr dense vector vs. other available vector DBs
>    * Solr performance is surprisingly good and comparable
>    * No standard community benchmark process for Solr (dense or
>      otherwise). Ishan and Fullstory have created solrbench, but it
>      needs hardware to continuously run on - perhaps there can be
>      sponsorship to enable this?
>
> 5. Areas of Improvement
>    * Dense vector indexing needs more love (performance can quickly
>      drop off) - Ishan / Noble have done some investigation into this
>      area before. They will see what code can be contributed / what
>      JIRA tickets can be created for further investigation
>    * Lucene HNSW graph search is fast, but likely there is room to
>      improve search at Solr level (there is no sharing of information
>      or optimization between segment searches and shards). Perhaps the
>      multi-threaded searching needs further refinement (SOLR-13350)
>    * FAISS integration exists in Lucene - add support in Solr?
>
> 6. Need to get a better understanding of where Solr is behind.
>    * Create a table to list out relevant Lucene changes by version
>      and if there are complementary Solr changes needed to unlock the
>      value
>    * Create a table to list out other popular vector DB feature sets
>      against Solr's and see what is missing
>
> 7. How long do we intend to keep these meetings going? Can we
> eventually merge into regular Solr community meetup?
>    * As long as people have interest in attending
>    * Primary goal is to build up momentum of dense vector
>      contributions through roadmap planning, information sharing, and
>      community support
>
> The next meeting will be on Sept 3rd. We will follow up on the feature
> matrix initiatives in 6, features for Solr 10, and probably further
> discussion on performance benchmarking.
>
> Cheers
>
> -Kevin
