Thanks Alessandro!
Additionally, here is a quick recap of the first meeting for those who weren't
able to make it:
* Intros
* JIRA process for organizing dense vector work (as Alessandro summarized)
* Initial areas of interest:
1. Quantization
2. HNSW early termination
3. ACORN-based filtering
4. Solr embedded LLMs? (open discussion about whether this would gain
traction and how best to do it)
5. Reciprocal rank fusion support
6. Chunking support
As Alessandro noted, please do not feel limited by these categories. If you
have any good ideas, please share them and run with them!
We will keep the dense vector meetup going as a recurring monthly meetup on the
first Wednesday of every month @ 12PM ET/4PM GMT.
So the next one will be August 6th. See you all then.
-Kevin
From: [email protected] At: 07/10/25 06:29:05 UTC-4:00
To: Kevin Liang (BLOOMBERG/ 919 3RD A), [email protected]
Subject: Dense Vector Group - Next Steps
Hi guys, thanks for the meeting yesterday, cool stuff!
I spent a bit of JIRA time, and I think I managed to create something that
would make more or less everybody happy (please iterate/suggest other
options if it doesn't!):
Components-based approach as recommended by David and Houston:
https://issues.apache.org/jira/browse/SOLR-17815?jql=project%20%3D%20SOLR%20AND%20component%20%3D%20vector-search
A kanban board to have a quick glance at what's going on, what's in
progress, what's in to-do, etc.
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=633&projectKey=SOLR
Some general guidelines:
- All Solr work is volunteer-based, so there's no obligation to work on
the issues created so far first. They are a good start, but if you fancy
contributing anything else, *it goes without saying that you can create
other issues*, tag them with the right component 'vector-search' and
start working on them.
- It's ok to be in 'stealth mode' until you are confident enough to
share what you built. We all have limited time to donate, and drafting
something is better than waiting indefinitely for discussions that may
happen months later. Just to better coordinate: if you are working on
something, put the Jira in progress and leave a comment. That way, if
someone else wants to tackle the same issue, they can reach out to you and
potentially interact even before a pull request is opened.
- If you have the luxury of more time and no rush, *feel free to reach
out on the dev list/Slack to discuss new issues/designs/ideas*. There's
no guarantee, but someone may be available to jump on a call in a day or
two, especially if you are unsure where or how to start.
- Once a Pull Request is opened, link it to the Jira (if you follow
standard naming conventions, this should happen automatically), keep it
open for a while, and participate in discussions if any. After a bit with
no interaction, try not to lose momentum and ask a committer to merge.
If it's good enough, merging is better than waiting for perfection and
never merging; further iterations can improve it and add what's missing.
- Once you start a contribution, don't feel pressured to finish it; it's
ok to donate something half-baked, as it could be an interesting starting
point for others who can re-use the initial work. If you feel your
contribution is not good enough for a pull request but you are starting
to question how much time you can dedicate to continuing the work, don't
worry: publish the pull request with a disclaimer. *No one will judge you,
and someone can potentially take it from there*!
Feel free to add other issues and tag them appropriately. *It's now
fundamental to use the 'vector-search' component if we want a cohesive view
on the topic.*
In the next few months, hoping this initiative is successful, I'll do the
same with 'LLM-Search'.
We'll keep the meeting rolling. I'm afraid I may not join the next one at
the same time, but I can catch up offline with @Kevin Liang
<[email protected]> on the same day or something.
Cheers
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr Chair of PMC*
e-mail: [email protected]