Thanks Jonathan, that’s good to know.
Andy

From: Jonathan Ellis <jbel...@gmail.com>
Date: Friday, 16 June 2023 at 18:04
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [VOTE] CEP-30 ANN Vector Search

CAUTION: This email originated from outside the University of Dundee. Do not click links or open attachments unless you recognise the sender's email address and know the content is safe.

Correct. They will be ordered closest-first. Unfortunately, farthest-first ordering is not possible in the near or medium term. An HNSW index gets to log(n) time by only keeping a subset of the closest neighbors for each vector. So you'd need a separate index with an inverse-cosine similarity metric, and it's not possible today to use a custom metric function. (This has been GA for over a year in Elastic and Solr, and so far nobody has needed farthest-first badly enough to add this as an option to the underlying Lucene library.)

You can get the distances back today, like this:

    SELECT my_text, similarity_cosine(my_embedding, ?) FROM my_table ORDER BY my_embedding ANN OF ? LIMIT 2

Then just pass the query vector into both bind variables.

On Fri, Jun 16, 2023 at 7:09 AM Andrew Cobley (Staff) <a.e.cob...@dundee.ac.uk> wrote:

Hi,

I’ve got a question and a request about this CEP. In the example:

    SELECT * FROM test.foo WHERE j ANN OF [3.4, 7.8, 9.1] limit 1;

I presume that limit n will return the n nearest neighbours? If that’s the case, what order will they be in? Is it possible to reverse the order?

Secondly, would it be possible to return the calculated distances? This might be particularly important if there are n returned neighbours.
Andy

________________________________
From: Patrick McFadin <pmcfa...@gmail.com>
Sent: 15 June 2023 01:03
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [VOTE] CEP-30 ANN Vector Search

Andy,

Good to see you on the ML again! CEP-30 is slated for release with 5.0 later in the year. Until then, you'll need to do a local build or try it out in a preview in Astra. A few of us have been talking about creating a preview Docker image since there is some interest in having it run in k8ssandra. In any case, this is very alpha code and should be treated as such. Reporting errors or unusual results would be greatly appreciated!

Patrick

On Wed, Jun 14, 2023 at 7:10 AM Andrew Cobley (Staff) <a.e.cob...@dundee.ac.uk> wrote:

Hi All,

Great news that this has gone through. I’m wondering if we have a timescale for this making it to beta or release? I’m asking because we have a project that would benefit from this approach.

Andy

From: Jonathan Ellis <jbel...@gmail.com>
Date: Tuesday, 30 May 2023 at 14:44
To: dev <dev@cassandra.apache.org>
Subject: Re: [VOTE] CEP-30 ANN Vector Search

Thanks, all. Closing the vote as accepted with 8 binding +1 (including me) and 11 non-binding votes.

On Thu, May 25, 2023 at 10:45 AM Jonathan Ellis <jbel...@gmail.com> wrote:

Let's make this official.
CEP: https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-Attached+Indexes

POC that demonstrates all the big rocks, including distributed queries: https://github.com/datastax/cassandra/tree/cep-vsearch

--
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced

The University of Dundee is a registered Scottish Charity, No: SC015096
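For anyone following the ordering and distance question raised in this thread, here is a rough sketch of what a closest-first query that also returns the cosine similarity produces. It is an exact, brute-force stand-in written in plain Python, not the HNSW index or any Cassandra code. The function names (cosine_similarity, nearest) and the sample rows are made up for illustration, and a real ANN index may return approximate rather than exact neighbours.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(rows, query, limit):
    """Hypothetical brute-force equivalent of ORDER BY ... ANN OF ? LIMIT n.

    rows is a list of (text, embedding) pairs. The result is a list of
    (text, similarity) pairs, most similar first.
    """
    # The query vector is used twice: once to score each row and once
    # (implicitly, via the sort) to order the rows, which mirrors passing
    # it into both bind variables of the CQL statement.
    scored = [(text, cosine_similarity(emb, query)) for text, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Illustrative data only.
rows = [
    ("east", [1.0, 0.0]),
    ("north-east", [1.0, 1.0]),
    ("north", [0.0, 1.0]),
    ("west", [-1.0, 0.0]),
]
query = [1.0, 0.0]
top2 = nearest(rows, query, limit=2)
```

With a full scan like this, farthest-first is only a matter of dropping reverse=True, but it costs a pass over every row. That is the trade-off the thread describes: the index reaches log(n) time by keeping only close neighbours, so it cannot serve the reversed order.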