Hi,

Clustering columns are used to order the data in a partition. However,
since data is split into SSTables, the rows are ordered by clustering key
only within each SSTable. Cassandra still needs to check all SSTables, and
merge the data if it is found in several SSTables. The only scanario where
I can imagine big performance gain is  super wide paritions, where each
partition is within a single SSTable (time series data, where partition
keys are time-buckets)

Has anybody done benchmarks on that and can share the data mode they have
used?

Cheers,
Eugene

Reply via email to