Thanks Jonathan,

That’s good to know.

Andy


From: Jonathan Ellis <jbel...@gmail.com>
Date: Friday, 16 June 2023 at 18:04
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [VOTE] CEP-30 ANN Vector Search

CAUTION: This email originated from outside the University of Dundee. Do not 
click links or open attachments unless you recognise the sender's email address 
and know the content is safe.
Correct.  They will be ordered closest-first.

Unfortunately, farthest-first won't be possible in the near or medium term.  
The HNSW index gets to log(n) time by keeping only a subset of the closest 
neighbors for each vector.  So you'd need a separate index with an 
inverse-cosine similarity metric, and it's not possible today to use a custom 
metric function.

(This has been GA for over a year in Elastic and Solr and so far nobody has 
needed farthest-first badly enough to add this as an option to the underlying 
Lucene library.)

You can get the distances back today, like this:

SELECT my_text, similarity_cosine(my_embedding, ?)
FROM my_table
ORDER BY my_embedding ANN OF ? LIMIT 2

Then just pass the query vector into both bind variables.
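
To make the closest-first ordering concrete, here is a rough brute-force sketch in Python of what that query computes.  It is an illustration only, not how the HNSW index works internally: the helper names (cosine_similarity, nearest_brute_force) and the sample rows are made up, and the value Cassandra's similarity_cosine returns may be scaled differently from the plain cosine used here.

```python
import math

def cosine_similarity(a, b):
    # Plain cosine of the angle between two vectors (hypothetical helper).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_brute_force(rows, query, limit):
    # Score every row against the query, then sort closest-first
    # (highest similarity first) and keep only `limit` rows.
    scored = [(text, cosine_similarity(vec, query)) for text, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Made-up stand-in for (my_text, my_embedding) rows in my_table.
rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [1.0, 1.0]),
]
query = [1.0, 0.1]

# Roughly analogous to: ORDER BY my_embedding ANN OF ? LIMIT 2
top2 = nearest_brute_force(rows, query, 2)
for text, score in top2:
    print(text, round(score, 3))
```

Reversing top2 on the client only reorders the rows that were already returned; it does not give a true farthest-first result across the whole table.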

On Fri, Jun 16, 2023 at 7:09 AM Andrew Cobley (Staff) 
<a.e.cob...@dundee.ac.uk<mailto:a.e.cob...@dundee.ac.uk>> wrote:
Hi,

I’ve got a question and a request about this CEP.

In the example:


SELECT * FROM test.foo WHERE j ANN OF [3.4, 7.8, 9.1] limit 1;

I presume that limit n will return the n nearest neighbours?

If that’s the case, what order will they be in? Is it possible to reverse the 
order?

Secondly, would it be possible to return the calculated distances?  This might 
be particularly important if there are n returned neighbours.

Andy
________________________________
From: Patrick McFadin <pmcfa...@gmail.com<mailto:pmcfa...@gmail.com>>
Sent: 15 June 2023 01:03
To: dev@cassandra.apache.org<mailto:dev@cassandra.apache.org> 
<dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>>
Subject: Re: [VOTE] CEP-30 ANN Vector Search




Andy,

Good to see you on the ML again! CEP-30 is slated for release with 5.0 later in 
the year. Until then, you'll need to do a local build or try it out in a 
preview in Astra. A few of us have been talking about creating a preview docker 
image since there is some interest in having it run in k8ssandra. In any case, 
this is very alpha code and should be treated as such. Reporting errors or 
unusual results would be greatly appreciated!

Patrick



On Wed, Jun 14, 2023 at 7:10 AM Andrew Cobley (Staff) 
<a.e.cob...@dundee.ac.uk<mailto:a.e.cob...@dundee.ac.uk>> wrote:

Hi All,



Great news that this has gone through. I'm wondering if we have a timescale for 
this making it to beta or release?  I’m asking because we have a project that 
would benefit from this approach.



Andy





From: Jonathan Ellis <jbel...@gmail.com<mailto:jbel...@gmail.com>>
Date: Tuesday, 30 May 2023 at 14:44
To: dev <dev@cassandra.apache.org<mailto:dev@cassandra.apache.org>>
Subject: Re: [VOTE] CEP-30 ANN Vector Search




Thanks, all.  Closing the vote as accepted with 8 binding +1 (including me) and 
11 non-binding votes.



On Thu, May 25, 2023 at 10:45 AM Jonathan Ellis 
<jbel...@gmail.com<mailto:jbel...@gmail.com>> wrote:

Let's make this official.

CEP: 
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes



POC that demonstrates all the big rocks, including distributed queries: 
https://github.com/datastax/cassandra/tree/cep-vsearch

--

Jonathan Ellis
co-founder, http://www.datastax.com
@spyced



The University of Dundee is a registered Scottish Charity, No: SC015096



