Re: token(), limit and wide rows

Keith Freeman Fri, 16 Aug 2013 09:09:36 -0700

I've run into the same problem, and I'm surprised nobody's responded to you. Any time someone asks "how do I page through all the rows of a table in CQL3?", the standard answer is token() and LIMIT. But as you point out, this method will often miss some data from wide rows.


Maybe a Cassandra expert will chime in if we're wrong.

Your suggestion is possible if you know how to find the previous value of the 'name' field (and are willing to filter out repeated rows), but wouldn't that be difficult/impossible with some keys? So then, is there a way to do paging queries that get ALL of the rows, even in wide rows?



On 08/13/2013 02:46 PM, Jan Algermissen wrote:

Hi,

ok, so I found token() [1], and that it is an option for paging through 
randomly partitioned data.

I take it that combining token() and LIMIT is the CQL3 idiom for paging (setting 
aside the fact that one shouldn't really want to page and use C*).

Now, when I page through a CF with wide rows, limiting each 'page' to, for 
example, 100, I end up in situations where not all 'sub'rows that have the same 
result for token() are returned, because LIMIT chops off the result after 100 
'sub'rows, not necessarily at the boundary of the next wide row.

Obvious ... but inconvenient.
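
To make the problem concrete, here is a small in-memory Python sketch. It does not talk to Cassandra at all: fake_token(), build_table(), select_page() and the sample data are made up for illustration, and fake_token() is only a stand-in for whatever the real partitioner computes. It just mimics "token(name) > token(x) LIMIT n" over a sorted list of 'sub'rows.

```python
# Toy in-memory model of a CF with wide rows. Each tuple is one 'sub'row:
# (name, col), where `name` is the partition key and `col` is a made-up
# clustering value. Nothing here talks to a real cluster.

def fake_token(name):
    # Hypothetical stand-in for token(); real tokens come from the partitioner.
    return sum(ord(c) for c in name)

def build_table(partitions):
    rows = [(name, col) for name, cols in partitions.items() for col in cols]
    # Rows come back in token order, then in clustering order within a wide row.
    return sorted(rows, key=lambda r: (fake_token(r[0]), r[1]))

def select_page(table, after_name, limit):
    # Mimics: SELECT * FROM users WHERE token(name) > token(after_name) LIMIT limit
    if after_name is None:
        matching = table
    else:
        matching = [r for r in table if fake_token(r[0]) > fake_token(after_name)]
    return matching[:limit]

def naive_paging(table, limit):
    # Always continue from the last name seen on the previous page.
    fetched, last_name = [], None
    while True:
        page = select_page(table, last_name, limit)
        if not page:
            return fetched
        fetched.extend(page)
        last_name = page[-1][0]

# Three wide rows with five 'sub'rows each, paged three at a time.
table = build_table({"alice": range(5), "bob": range(5), "carol": range(5)})
fetched = naive_paging(table, limit=3)
missed = [row for row in table if row not in fetched]
print(len(table), len(fetched), len(missed))  # prints: 15 9 6
```

In this toy run every wide row loses its last two 'sub'rows, because each page is cut after three entries and the next query skips past that name entirely.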

The solution would be to throw away the last token returned (because its wide 
row could have been chopped off) and do the next query with the token before it.

So instead of doing

      SELECT * FROM users WHERE token(name) > token(last-name-of-prev-result) LIMIT 100;

I'd be doing

      SELECT * FROM users WHERE token(name) > token(one-before-the-last-name-of-prev-result) LIMIT 100;
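
As a rough check of that idea, here is a minimal Python sketch against an in-memory list instead of a real cluster. Everything in it (fake_token(), build_table(), select_page(), overlap_paging() and the sample data) is hypothetical and only imitates the two queries above. It is one possible reading of the workaround, not a confirmed Cassandra idiom.

```python
def fake_token(name):
    # Hypothetical stand-in for token(); real tokens come from the partitioner.
    return sum(ord(c) for c in name)

def build_table(partitions):
    rows = [(name, col) for name, cols in partitions.items() for col in cols]
    # Rows come back in token order, then in clustering order within a wide row.
    return sorted(rows, key=lambda r: (fake_token(r[0]), r[1]))

def select_page(table, after_name, limit):
    # Mimics: SELECT * FROM users WHERE token(name) > token(after_name) LIMIT limit
    if after_name is None:
        matching = table
    else:
        matching = [r for r in table if fake_token(r[0]) > fake_token(after_name)]
    return matching[:limit]

def overlap_paging(table, limit):
    fetched, last_name = [], None
    while True:
        page = select_page(table, last_name, limit)
        if len(page) < limit:
            # Short page: LIMIT did not chop anything off, so we are done.
            fetched.extend(page)
            return fetched
        # Full page: the last wide row may be incomplete, so throw it away ...
        tail_name = page[-1][0]
        complete = [r for r in page if r[0] != tail_name]
        if not complete:
            raise ValueError("a single wide row fills the whole page")
        fetched.extend(complete)
        # ... and restart from the name before it, which re-reads the tail row.
        last_name = complete[-1][0]

# Three wide rows with five 'sub'rows each, paged seven at a time.
table = build_table({"alice": range(5), "bob": range(5), "carol": range(5)})
fetched = overlap_paging(table, limit=7)
print(len(fetched) == len(table))
```

In this sketch the approach only makes progress when the limit is larger than the widest row. With limit=3 and five 'sub'rows per name, the first page holds a single name and there is no earlier name to restart from, so the function raises instead of looping forever.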


Question: Is that what I have to do or is there a way to make token() and limit 
work together to return complete wide rows?


Jan



[1] token() and how it relates to paging is actually quite hard to grasp from 
the docs.
