Re:Dense Vector Group - Next Steps

Kevin Liang (BLOOMBERG/ 919 3RD A) Thu, 10 Jul 2025 14:35:09 -0700

Thanks Alessandro!

Additionally, to give a quick recap of the first meeting for those who weren't
able to make it:
* Intros
* JIRA process for organizing dense vector work (as Alessandro summarized)
* Initial areas of interest:
    1. Quantization
    2. HNSW early termination
    3. ACORN based filtering
    4. Solr embedded LLMs? (open discussion about whether this would gain
       traction and how best to do it)
    5. Reciprocal rank fusion support
    6. Chunking support
As Alessandro noted, please do not feel limited by these categories. If you have
any good ideas, please share them and run with them!


We will keep the dense vector meetup going as a recurring monthly meetup on the
first Wednesday of every month @ 12PM ET/4PM GMT.
So the next one will be August 6th. See you all then.

-Kevin

From: [email protected] At: 07/10/25 06:29:05 UTC-4:00
To: Kevin Liang (BLOOMBERG/ 919 3RD A), [email protected]
Subject: Dense Vector Group - Next Steps

Hi guys, thanks for the meeting yesterday, cool stuff!

I spent a bit of JIRA time, and I think I managed to create something that
would make more or less everybody happy (please iterate/suggest other
options if it doesn't!):

Components-based approach as recommended by David and Houston:
https://issues.apache.org/jira/browse/SOLR-17815?jql=project%20%3D%20SOLR%20AND%20component%20%3D%20vector-search

A kanban board to have a quick glance at what's going on, what's in
progress, what's in to-do, etc.
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=633&projectKey=SOLR

Some general guidelines:

   - All Solr work is volunteer-based, so there's no obligation to work on
   the issues created so far first. They are a good start, but if you fancy
   contributing anything else, *it goes without saying that you can create
   other issues*, tag them with the right component 'vector-search', and
   start working on them.
   - It's ok to be in 'stealth mode' until you are confident enough to
   share what you built. We all have limited time to donate, and drafting
   something is better than waiting indefinitely for discussions that may
   happen months later. Just to better coordinate: if you are working on
   something, put the Jira in progress and leave a comment. This way, if
   someone else wants to tackle the same issue, they can reach out to you
   and potentially interact even before a pull request is opened.
   - If you have the luxury of more time and no rush, *feel free to reach
   out on the dev list/Slack to discuss new issues/designs/ideas*. There's
   no guarantee, but someone may be available to jump on a call in a day or
   two, especially if you are unsure where or how to start.
   - Once a Pull Request is opened, link it to the Jira (if you follow
   standard naming conventions, this should happen automatically), keep it
   open for a while, and participate in any discussions. After a bit with
   no interaction, try not to lose momentum and ask a committer to merge.
   Merging something good enough is better than never merging while waiting
   for perfection; later iterations can improve it and add what's missing.
   - Once you start a contribution, don't feel pressured to finish it; it's
   ok to donate something half-baked, as it could be an interesting starting
   point for others who can re-use the initial work. If you feel your
   contribution is not good enough for a pull request, but you also start
   to question how much time you can dedicate to continuing the work, don't
   worry: publish the pull request with a disclaimer. *No-one will judge you,
   and someone can potentially take it from there*!

Feel free to add other issues and tag them appropriately: *it's now
fundamental to use the 'vector-search' component if we want a cohesive view
on the topic.*

In the next few months, hoping this initiative is successful, I'll do the
same with 'LLM-Search'.

We'll keep the meeting rolling. I'm afraid I may not join the next one at
the same time, but I can catch up offline with @Kevin Liang
<[email protected]> on the same day or something.

Cheers
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr Chair of PMC*

e-mail: [email protected]


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io <http://sease.io/>
LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
<https://twitter.com/seaseltd> | Youtube
<https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
<https://github.com/seaseltd>
