A > B > C

I don't think that ML is such a niche application that it can't have its
own CQL data type. Also, vectors are mathematical elements that have more
applications than ML.

On Tue, 2 May 2023 at 19:15, Mick Semb Wever <m...@apache.org> wrote:

>
>
> On Tue, 2 May 2023 at 17:14, Jonathan Ellis <jbel...@gmail.com> wrote:
>
>> Should we add a vector type to Cassandra designed to meet the needs of
>> machine learning use cases, specifically feature and embedding vectors for
>> training, inference, and vector search?
>>
>> ML vectors are fixed-dimension (fixed-length) sequences of numeric types,
>> with no nulls allowed, and with no need for random access. The ML industry
>> overwhelmingly uses float32 vectors, to the point that the industry-leading
>> special-purpose vector database ONLY supports that data type.
>>
>> This poll is to gauge consensus subsequent to the recent discussion
>> thread at
>> https://lists.apache.org/thread/0lj1nk9jbhkf1rlgqcvxqzfyntdjrnk0.
>>
>> Please rank the discussed options from most preferred option to least,
>> e.g., A > B > C (A is my preference, followed by B, followed by C) or C > B
>> = A (C is my preference, followed by B or A approximately equally.)
>>
>> (A) I am in favor of adding a vector type for floats; I do not believe we
>> need to tie it to any particular implementation details.
>>
>> (B) I am okay with adding a vector type but I believe we must add array
>> types that compose with all Cassandra types first, and make vectors a
>> special case of arrays-without-null-elements.
>>
>> (C) I am not in favor of adding a built-in vector type.
>>
>
>
>
> A  > B > C
>
> B is stated as "must add array types…".  I think this is a bit loaded.  If
> B were (A + the implementation must be a non-null frozen float32 array,
> with serialisation forward-compatible with other frozen arrays implemented
> later), I would put it before (A), especially because it has already been
> shown that this is easy to implement.
>
>
>
