Hi all,
Any news on the MAX_DIMENSIONS discussion?
https://github.com/apache/lucene/issues/11507
I just implemented support for Cohere.ai embeddings, and Cohere offers the following embedding sizes:
small: 1024
medium: 2048
large: 4096
Cohere also has a nice demo described at
https://txt.cohere.ai/building-a-search-based-discord-bot-with-language-models/
though I am not sure which model they are using for the demo.
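For illustration, here is a minimal sketch of how one of these embeddings would be indexed with Lucene's KnnVectorField. The class name, field name, and zero-filled vector are placeholders, and the exact location of the 1024-dimension check is my assumption based on the 9.x code:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.KnnVectorField;
import org.apache.lucene.index.VectorSimilarityFunction;

public class CohereEmbeddingExample {
    public static void main(String[] args) {
        // Placeholder for a 1024-dimensional vector from Cohere's "small" model.
        float[] embedding = new float[1024];

        Document doc = new Document();
        // The KnnVectorField constructor validates the vector length against
        // Lucene's hard-coded maximum (1024 in 9.x), so the 2048- and
        // 4096-dimensional Cohere models would be rejected at this point.
        doc.add(new KnnVectorField("embedding", embedding, VectorSimilarityFunction.COSINE));
        System.out.println("Created a field with " + embedding.length + " dimensions");
    }
}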
Thanks
Michael
On 09.08.22 at 21:56, Julie Tibshirani wrote:
Thank you, Marcus, for raising this; it's an important topic! On the
issue you filed, Mike pointed to the JIRA ticket where we've been
discussing this (https://issues.apache.org/jira/browse/LUCENE-10471)
and suggested commenting with the embedding models you've heard about
from users. This seems like a good idea to me too -- looking forward
to discussing more on that JIRA issue. (Unless we get caught in the
middle of the migration -- then we'll discuss once it's been moved to
GitHub!)
Julie
On Mon, Aug 8, 2022 at 10:05 PM Michael Wechner
<michael.wech...@wyona.com> wrote:
I agree that Lucene should support vector sizes that depend on the
model one chooses.
For example, Weaviate seems to do this:
https://weaviate.slack.com/archives/C017EG2SL3H/p1659981294040479
Thanks
Michael
On 07.08.22 at 22:48, Marcus Eagan wrote:
Hi Lucene Team,
In general, I have advised very strongly against our team at
MongoDB modifying the Lucene source, except in scenarios where we
have strong needs for a particular customization. Ultimately,
people can do what they would like to do.
That being said, we have a number of customers preparing to use
Lucene for dense vector search. There are many language models
that are optimized for > 1024 dimensions. I remember Michael
Wechner's email
<https://www.mail-archive.com/dev@lucene.apache.org/msg314281.html>
about one instance with OpenAI.
I just tried to test the OpenAI model
"text-similarity-davinci-001" with 12288 dimensions.
It seems that customers who attempt to use these models should
not be turned away. It could be sufficient to explain the issues.
The only ones I have identified are two expected ones, very slow
indexing throughput and high CPU usage, plus a less well-defined
risk of accumulating numerical errors.
I opened an issue <https://github.com/apache/lucene/issues/1060>
and PR <https://github.com/apache/lucene/pull/1061> for the
discussion as well. I would appreciate guidance on where we think
the warning should go. Burying it in a Javadoc feels like a less
than ideal experience; it would be better to emit a warning on
startup. In the PR, I increased the max limit by a factor of
twenty. We should let users use the system based on their needs,
even if it was not designed or optimized for the models they bring,
because we need the feedback and the data from the world.
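To make the idea concrete, here is a minimal, self-contained sketch of the kind of guard and runtime warning I have in mind; the class name, constant values, message text, and logger usage are my own assumptions, not the actual Lucene code or the PR diff:

import java.util.logging.Logger;

public class VectorDimensionGuard {
    private static final Logger LOG =
        Logger.getLogger(VectorDimensionGuard.class.getName());

    // Lucene's current hard limit, and the PR's factor-of-twenty raise.
    static final int CURRENT_MAX = 1024;
    static final int PROPOSED_MAX = CURRENT_MAX * 20; // 20480

    static void checkDimensions(int dims) {
        if (dims > PROPOSED_MAX) {
            throw new IllegalArgumentException(
                "vector dimensions " + dims + " exceed the maximum " + PROPOSED_MAX);
        }
        if (dims > CURRENT_MAX) {
            // Surface the trade-offs at runtime instead of burying them in Javadoc.
            LOG.warning("Indexing " + dims + "-dimensional vectors; expect very"
                + " slow indexing throughput and high CPU usage.");
        }
    }

    public static void main(String[] args) {
        checkDimensions(12288); // davinci-sized embeddings: allowed, but warned about
    }
}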
Is there something I'm overlooking from a risk standpoint?
Best,
--
Marcus Eagan