Did you benchmark these two options?

1) Select with IN

2) Select all words and filter in the application

Mohammed

From: Philo Yang [mailto:ud1...@gmail.com]
Sent: Thursday, July 31, 2014 10:45 AM
To: user@cassandra.apache.org
Subject: select many rows one time or select many times?

Hi all,

I have a Cassandra 2.0.6 cluster, and one of my tables looks like this:
CREATE TABLE word (
  user text,
  word text,
  flag double,
  PRIMARY KEY (user, word)
)

Each "user" has about 10000 "word" rows per node. I need to select all rows 
where user='someuser' and word is in a large set of about 1000 values.

The C* documentation does not recommend using "select ... in", as in:

select * from word where user='someuser' and word in ('a','b','aa','ab',...)

So for now I select all rows where user='someuser' and filter them in the 
client rather than in C*. I use the Datastax Java Driver to page the result 
set with setFetchSize(1000). Is this the best way? I found that the system's 
load is high because of the large range query. Should I instead select a 
single row at a time and issue 1000 selects?

For example:
select * from word where user='someuser' and word = 'a';
select * from word where user='someuser' and word = 'b';
select * from word where user='someuser' and word = 'c';
.....

Which method puts less pressure on the Cassandra cluster?
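
To make the three approaches concrete, here is a minimal, driver-agnostic Python sketch of the statement shapes being compared. The function names (in_query, per_word_queries, filter_client_side) are hypothetical and not part of any driver, and the "?" bind markers assume the statements would be prepared before execution. Running them against a cluster still needs a real driver session, which is not shown, so this says nothing about which approach is faster.

```python
from typing import Iterable, List, Tuple

# (user, word, flag), mirroring the columns of the "word" table above
Row = Tuple[str, str, float]


def in_query(user: str, words: List[str]) -> Tuple[str, tuple]:
    """Approach 1: a single statement with one bind marker per word."""
    markers = ", ".join("?" for _ in words)
    cql = f"SELECT * FROM word WHERE user = ? AND word IN ({markers})"
    return cql, (user, *words)


def per_word_queries(user: str, words: List[str]) -> List[Tuple[str, tuple]]:
    """Approach 3: one single-row statement per word."""
    cql = "SELECT * FROM word WHERE user = ? AND word = ?"
    return [(cql, (user, w)) for w in words]


def filter_client_side(rows: Iterable[Row], wanted: Iterable[str]) -> List[Row]:
    """Approach 2: read the whole partition, keep only the wanted words."""
    wanted_set = set(wanted)  # set lookup keeps the filter O(1) per row
    return [row for row in rows if row[1] in wanted_set]
```

With approach 2 the only statement sent is a plain select on user='someuser'; the filtering cost moves to the client, while the cluster still reads the whole partition.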

Thanks,
Philo Yang
