It sounds like you are misusing/abusing Cassandra.

I've noticed the following Cassandra anti-patterns in your post:

1. Large or uneven partitions
   Putting every row of a table in a single partition is definitely an
   anti-pattern unless the table only has a very small number of rows.
2. "SELECT COUNT(*) FROM ..." without providing a partition key
   In your case, since all rows are in a single partition, providing
   the partition key is equivalent to not providing one at all.
3. Wide table (too many columns)
   91 columns sounds excessive, and may lead to reduced performance and
   heightened JVM GC pressure.
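
One common way to avoid a single huge partition is to add a bucket to the partition key, so the rows are spread over many small partitions and a total count becomes the sum of cheap per-partition counts. The following is only a rough Python sketch of that idea, not driver code: the bucket count of 16, the names `bucket_for` and `partition_key`, and the sample ids are all made up for illustration, and the rows are simulated in memory rather than read from a cluster.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 16  # hypothetical value; pick one that keeps partitions small


def bucket_for(row_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a row id to a stable bucket number in [0, num_buckets)."""
    digest = hashlib.md5(row_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def partition_key(group_id: str, row_id: str) -> tuple:
    """Composite partition key: the original key plus a bucket number."""
    return (group_id, bucket_for(row_id))


# Simulate rows that previously all shared the partition key "group-a".
rows = [("group-a", f"row-{i}") for i in range(10_000)]
rows_per_partition = Counter(partition_key(g, r) for g, r in rows)

# The total is the sum of the (small) per-partition counts.
total = sum(rows_per_partition.values())
```

In a real table each per-partition count would be its own "SELECT COUNT(*)" restricted to one (key, bucket) pair, and those queries can be issued in parallel.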

Cassandra is not a SQL database. You should design your table schema around the queries, not design your queries around the table schema. You may also need to store multiple copies of the same data with different keys to satisfy different queries.
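
To make the "multiple copies with different keys" point concrete, here is a small in-memory Python sketch. Each dict stands in for one Cassandra table keyed for one query. The column names `status` and `region` and all function names are hypothetical, since the post does not say which filters are needed.

```python
from collections import defaultdict

# One "table" per query pattern; every write goes to all of them.
by_status = defaultdict(list)             # query: filter on status
by_region = defaultdict(list)             # query: filter on region
by_status_and_region = defaultdict(list)  # query: filter on both


def write_record(record: dict) -> None:
    """Store the same record under each key that a query will use."""
    by_status[record["status"]].append(record)
    by_region[record["region"]].append(record)
    by_status_and_region[(record["status"], record["region"])].append(record)


def count_by_status(status: str) -> int:
    """Count rows for one status by reading only that partition."""
    return len(by_status.get(status, []))


def count_by_status_and_region(status: str, region: str) -> int:
    """Count rows matching both filters via the combined key."""
    return len(by_status_and_region.get((status, region), []))


write_record({"id": 1, "status": "open", "region": "south"})
write_record({"id": 2, "status": "open", "region": "north"})
write_record({"id": 3, "status": "closed", "region": "south"})
```

The trade-off is more storage and more writes in exchange for reads that touch exactly one partition per query.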

On 28/09/2022 12:44, Karthik K wrote:
Hi,

We have two doubts on cassandra 3.11 features:

1) Need to get counts of row from a cassandra table.
We have 3 node clusters with Apache Cassandra 3.11 version.

We loaded a table in cassandra with 9lakh records. We have around 91 columns in this table. Most of the records have text as datatype.
All these 9lakh records were part of a single partition key.

When we tried a select count(*) query with that partition key, the query was timing out.

However, we were able to retrieve counts through multiple calls by fetching only 1 lakh records in each call. The only disadvantage here is the time taken, which
is around 1 minute and 3 seconds.

Is there any other approach to get the row count faster in cassandra? Do we need to change the data modelling approach to achieve this? Suggestions are welcome.


2) How to data model in cassandra to support usage of multiple filters.
 We may also need the count of rows for this multiple filter query.

Thanks & Regards,
Karthikeyan
