IMHO: user_name is not a column; it is the row key. Therefore, according to 
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ , the row does not 
contain a relevant column index, which causes the iterator to read every column 
(including its value) of each matching row.
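
To put a rough number on that theory, here is a back-of-envelope sketch. It 
uses only the figures from the original post (50000 rows, about 500K per 
big_json, roughly 90% of rows with status = 2) and assumes that every matching 
row really is read in full, which I have not verified. The function name is 
made up for illustration.

```python
# Back-of-envelope estimate only. The row count, value size and 90% share
# come from the original post; "the whole row is read" is an assumption.

def estimated_read_bytes(total_rows: int, matching_fraction: float,
                         bytes_per_row: int) -> int:
    """Bytes read if every matching row is read in full."""
    matching_rows = round(total_rows * matching_fraction)
    return matching_rows * bytes_per_row


TOTAL_ROWS = 50_000
BIG_JSON_BYTES = 500 * 1024  # "500K or more" per big_json, taken as 500 KiB

status_2 = estimated_read_bytes(TOTAL_ROWS, 0.90, BIG_JSON_BYTES)
print(f"status = 2: about {status_2 / 1024**3:.1f} GiB")
```

If that assumption holds, the status = 2 query would touch on the order of 
20 GiB of big_json data just to return the keys, which would at least be 
consistent with an out-of-memory crash on a single node.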

I believe that instead of referring to user_name as if it were a column, you 
need to refer to it via the reserved word "KEY", e.g.:

Select KEY from users where status = 2;

Always glad to share a theory with a friend....


From: Tamar Rosen [mailto:ta...@correlor.com]
Sent: Thursday, April 25, 2013 11:04 AM
To: user@cassandra.apache.org
Subject: Secondary Index on table with a lot of data crashes Cassandra

Hi,

We have a case of a reproducible crash, probably due to out of memory, but I 
don't understand why.

The installation is currently single node.

We have a column family with approx 50000 rows.

In CQL, the CF definition is:

CREATE TABLE users (
  user_name text PRIMARY KEY,
  big_json text,
  status int
);



Each big_json can have 500K or more of data.



There is also a secondary index on the status column.

Status can have various values, but over 90% of all rows have status = 2.





Calling:

Select user_name from users limit 80000;

is pretty fast.

Calling:

Select user_name from users where status = 1;

is slower, even though much less data is returned.

Calling:

Select user_name from users where status = 2;

always crashes.





What are we doing wrong? Can it be that Cassandra is actually trying to read 
all the CF data rather than just the keys? (Actually, it doesn't need to go to 
the users CF at all: all the data it needs is in the index CF.)





Also, in the code I am doing the same thing using an Astyanax index query with 
pagination, and the behavior is the same.



Please help me:

1. solve the immediate issue

2. understand if there is something in this use case which indicates that we 
are not using Cassandra the way it is meant to be used.





Thanks,

Tamar Rosen
Correlor.com






