Hi All - I'd like to share some initial results for vector search on the Cassandra 5.0 beta1: a 3-node cluster running in Kubernetes with fast NetApp storage.

I have a table (doc.embeddings_googleflant5large) with this definition:

CREATE TABLE doc.embeddings_googleflant5large (
    uuid text,
    type text,
    fieldname text,
    offset int,
    sourceurl text,
    textdata text,
    creationdate timestamp,
    embeddings vector<float, 768>,
    metadata boolean,
    source text,
    PRIMARY KEY ((uuid, type), fieldname, offset, sourceurl, textdata)
) WITH CLUSTERING ORDER BY (fieldname ASC, offset ASC, sourceurl ASC, textdata ASC)
    AND additional_write_policy = '99p'
    AND allow_auto_snapshot = true
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND memtable = 'default'
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND incremental_backups = true
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

CREATE CUSTOM INDEX ann_index_googleflant5large ON doc.embeddings_googleflant5large (embeddings) USING 'sai';
CREATE CUSTOM INDEX offset_index_googleflant5large ON doc.embeddings_googleflant5large (offset) USING 'sai';
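
Both indexes were created with the SAI defaults. As I understand it (an assumption on my part - worth double-checking against the 5.0 docs), the vector index also accepts a similarity_function option, along the lines of:

CREATE CUSTOM INDEX ann_index_googleflant5large ON doc.embeddings_googleflant5large (embeddings)
USING 'sai'
WITH OPTIONS = {'similarity_function': 'cosine'};

I've only used the default similarity function so far.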

nodetool status -r

UN  cassandra-1.cassandra5.cassandra5-jos.svc.cluster.local  18.02 GiB  128  100.0%  f2989dea-908b-4c06-9caa-4aacad8ba0e8  rack1
UN  cassandra-2.cassandra5.cassandra5-jos.svc.cluster.local  17.98 GiB  128  100.0%  ec4e506d-5f0d-475a-a3c1-aafe58399412  rack1
UN  cassandra-0.cassandra5.cassandra5-jos.svc.cluster.local  18.16 GiB  128  100.0%  92c6d909-ee01-4124-ae03-3b9e2d5e74c0  rack1

nodetool tablestats doc.embeddings_googleflant5large

Total number of tables: 1
----------------
Keyspace: doc
        Read Count: 0
        Read Latency: NaN ms
        Write Count: 2893108
        Write Latency: 326.3586520174843 ms
        Pending Flushes: 0
                Table: embeddings_googleflant5large
                SSTable count: 6
                Old SSTable count: 0
                Max SSTable size: 5.108GiB
                Space used (live): 19318114423
                Space used (total): 19318114423
                Space used by snapshots (total): 0
                Off heap memory used (total): 4874912
                SSTable Compression Ratio: 0.97448
                Number of partitions (estimate): 58399
                Memtable cell count: 0
                Memtable data size: 0
                Memtable off heap memory used: 0
                Memtable switch count: 16
                Speculative retries: 0
                Local read count: 0
                Local read latency: NaN ms
                Local write count: 2893108
                Local write latency: NaN ms
                Local read/write ratio: 0.00000
                Pending flushes: 0
                Percent repaired: 100.0
                Bytes repaired: 9.066GiB
                Bytes unrepaired: 0B
                Bytes pending repair: 0B
                Bloom filter false positives: 7245
                Bloom filter false ratio: 0.00286
                Bloom filter space used: 87264
                Bloom filter off heap memory used: 87216
                Index summary off heap memory used: 34624
                Compression metadata off heap memory used: 4753072
                Compacted partition minimum bytes: 2760
                Compacted partition maximum bytes: 4866323
                Compacted partition mean bytes: 154523
                Average live cells per slice (last five minutes): NaN
                Maximum live cells per slice (last five minutes): 0
                Average tombstones per slice (last five minutes): NaN
                Maximum tombstones per slice (last five minutes): 0
                Droppable tombstone ratio: 0.00000

nodetool tablehistograms doc.embeddings_googleflant5large

doc/embeddings_googleflant5large histograms
Percentile      Read Latency     Write Latency     SSTables    Partition Size      Cell Count
                    (micros)          (micros)                         (bytes)
50%                     0.00              0.00         0.00            105778             124
75%                     0.00              0.00         0.00            182785             215
95%                     0.00              0.00         0.00            379022             446
98%                     0.00              0.00         0.00            545791             642
99%                     0.00              0.00         0.00            654949             924
Min                     0.00              0.00         0.00              2760               4
Max                     0.00              0.00         0.00           4866323            5722

Running a query such as:

select uuid,offset,type,textdata from doc.embeddings_googleflant5large order by embeddings ANN OF [768 dimension vector] limit 20;

Works fine - typically less than 5 seconds to return, and subsequent queries are even faster.  If I'm actively adding data to the table, the searches can sometimes time out (using cqlsh). If I add something to the WHERE clause, performance drops significantly:

select uuid,offset,type,textdata from doc.embeddings_googleflant5large where offset=1 order by embeddings ANN OF [] limit 20;

That query will time out when run in cqlsh, even with no data being added to the table. We've been running a Weaviate database side-by-side with Cassandra 4 and would love to drop Weaviate if we can do all of the vector searches inside Cassandra.
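
One variant I still want to try is restricting the ANN search to a single partition instead of the indexed offset column (just a sketch - the uuid/type literals below are placeholders, and I'm not certain partition-restricted ANN queries behave any better in beta1):

select uuid, offset, type, textdata
from doc.embeddings_googleflant5large
where uuid = '<some document uuid>' and type = '<some type>'
order by embeddings ANN OF [768 dimension vector]
limit 20;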
What else can I try?  Anything to increase performance?
Thanks all!

-Joe

