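What partitioner are you using? The default partitioner (Murmur3Partitioner) is not order-preserving: it hashes the partition key, so token order bears no relation to the order of your keys, and tokens will not be ordered even if your PKs are. A token range therefore does not correspond to a range of event times; it returns whatever partitions happen to hash into that range, including rows for other customers. You probably want to use the customer ID as your partition key and the event time as a clustering column; then you can use RDBMS-like WHERE conditions to select a slice of the partition.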
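As a rough sketch of that layout (the table name events_by_customer and the column types are my assumptions, not taken from your schema, and I am guessing that event_id exists and can serve as a tiebreaker for events with the same timestamp):

```cql
-- Hypothetical table: name and column types are assumptions.
CREATE TABLE events_by_customer (
    customer_id text,       -- partition key: all events for a customer live together
    event_time  timestamp,  -- clustering column: rows are sorted by time within the partition
    event_id    text,       -- assumed tiebreaker so same-time events do not overwrite each other
    PRIMARY KEY ((customer_id), event_time, event_id)
);

-- Equality on the partition key plus a range on the first clustering column.
SELECT *
FROM events_by_customer
WHERE customer_id = '289'
  AND event_time >= '2016-03-01 18:45:00+0000'
  AND event_time <= '2016-03-12 19:05:00+0000';
```

With the partition key restricted by equality, the range on event_time is evaluated inside a single partition, so no token() call is needed and the query has the same shape as your Query1.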
-- Jack Krupansky

On Thu, Mar 10, 2016 at 4:45 PM, Rakesh Kumar <dcrunch...@aim.com> wrote:

> typo: the primary key was (customer_id + event_time )
>
> -----Original Message-----
> From: Rakesh Kumar <dcrunch...@aim.com>
> To: user <user@cassandra.apache.org>
> Sent: Thu, Mar 10, 2016 4:44 pm
> Subject: What is wrong in this token function
>
> C* 3.0.3
>
> I have a table table1 which has the primary key on ((customer_id,event_id)).
>
> I loaded 1.03 million rows from a csv file.
>
> Business case: Show me all events for a given customer in a given time frame
>
> In RDBMS it will be
>
> (Query1)
> where customer_id = '289'
> and event_time >= '2016-03-01 18:45:00+0000' and event_time <= '2016-03-12 19:05:00+0000' ;
>
> But C* does not allow >= <= on PKY cols. It suggested token function.
>
> So I did this:
>
> (Query2)
> where token(customer_id,event_time) >= token('289','2016-03-01 18:45:00+0000')
> and token(customer_id,event_time) <= token('289','2016-03-12 19:05:00+0000') ;
>
> I am seeing 75% more rows than what it should be. It should be 99K rows, it shows 163K.
>
> I checked the output with the csv file itself. To double check I loaded the csv in another table
> with modified PKY so that the first query (Query1) can be executed. It also showed 99K rows.
>
> Am I using token function incorrectly ?