Re: Max records per node for a given secondary index value

2012-01-19 Thread aaron morton
Each node stores the rows in its own token range, plus those in the token 
ranges it is a replica for. So it will store roughly rf / num_nodes of the rows.
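
To make that ratio concrete, here is a rough back-of-the-envelope sketch. It assumes evenly balanced token ranges; the function names are made up for illustration and are not part of any Cassandra API.

```python
def fraction_of_rows_per_node(num_nodes: int, rf: int) -> float:
    """Approximate share of all rows held by one node.

    Assumes token ranges are evenly balanced across the ring.
    """
    if num_nodes <= 0 or rf <= 0:
        raise ValueError("num_nodes and rf must be positive")
    # A node cannot hold more than one copy of each row.
    return min(rf, num_nodes) / num_nodes


def rows_per_node(total_rows: int, num_nodes: int, rf: int) -> int:
    """Approximate number of rows stored on one node."""
    return int(total_rows * fraction_of_rows_per_node(num_nodes, rf))
```

With the four node, RF 2 cluster from the original question, `fraction_of_rows_per_node(4, 2)` gives 0.5, so each node holds about half of all rows.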

If you are approaching a situation where the node may store 2 billion rows, and 
so may have 2 billion entries in the secondary index row, you would need to add 
more nodes to reduce the number of rows the node stores. 
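
As a rough illustration of when that point would be reached, the sketch below estimates how long one index row on one node takes to hit 2 billion entries. It assumes writes are spread evenly across nodes and across the indexed values, and that nothing is deleted or overwritten. The 2 billion figure is the one quoted in this thread; the function names are hypothetical.

```python
import math

# Columns-per-row figure quoted in this thread.
MAX_COLUMNS_PER_ROW = 2_000_000_000


def seconds_until_index_row_full(writes_per_sec: float, num_nodes: int,
                                 rf: int, distinct_values: int) -> float:
    """Seconds until one index row on one node reaches the limit."""
    # Each node receives roughly rf / num_nodes of all writes.
    per_node = writes_per_sec * rf / num_nodes
    # Each indexed value has its own index row on that node.
    per_index_row = per_node / distinct_values
    return MAX_COLUMNS_PER_ROW / per_index_row


def min_nodes_for(total_rows: int, rf: int, distinct_values: int) -> int:
    """Smallest node count that keeps every index row under the limit."""
    needed = math.ceil(total_rows * rf / (distinct_values * MAX_COLUMNS_PER_ROW))
    return max(rf, needed)
```

Using the numbers from the original post (5000 records per second, four nodes, RF 2, three product values), `seconds_until_index_row_full(5000, 4, 2, 3) / 86400` comes out at a little under 28 days.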

IMHO it sounds like there are some efficiencies to be found in your data model. 
If you have write-once records it may be more efficient to create a CF to 
support your common queries. Also, the utility of 2 billion things in an index 
is probably questionable; it may be useful to partition by date. 
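
One way to picture partitioning by date is a hand-maintained index CF whose row key combines the product with a day bucket. The sketch below is only a model of that layout, with an in-memory dict standing in for the column family. The key format, class name and method names are assumptions for illustration, not a Cassandra client API.

```python
from collections import defaultdict
from datetime import datetime


def bucket_key(product: str, ts: datetime) -> str:
    """Row key for the index CF: one row per product per day."""
    return f"{product}:{ts.strftime('%Y%m%d')}"


class DateBucketedIndex:
    """In-memory stand-in for a manually maintained index CF."""

    def __init__(self) -> None:
        # row key -> {(timestamp, record key): empty value}
        self._rows = defaultdict(dict)

    def add(self, product: str, ts: datetime, record_key: str) -> None:
        # Column name is (timestamp, record key) so columns sort by time.
        self._rows[bucket_key(product, ts)][(ts, record_key)] = b""

    def keys_for_day(self, product: str, day: datetime) -> list:
        """Record keys written for this product on the given day, oldest first."""
        row = self._rows.get(bucket_key(product, day), {})
        return [record_key for _, record_key in sorted(row)]
```

At 5000 records per second spread evenly over three products, a day bucket would hold about 144 million entries cluster-wide, well below 2 billion. A finer bucket (for example hourly) is the same idea with a different format string.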

Hope that helps.
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/01/2012, at 3:11 PM, Mohit Anchlia wrote:

> You need to shard your rows
> 
> On Wed, Jan 18, 2012 at 5:46 PM, Kamal Bahadur wrote:
>> Anyone?
>> 
>> 
>> On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur wrote:
>>> 
>>> Hi All,
>>> 
>>> It is great to know that Cassandra column family can accommodate 2 billion
>>> columns per row! I was reading about how Cassandra stores the secondary
>>> index info internally. I now understand that the index related data are
>>> stored in hidden CF and each node is responsible to store the keys of data
>>> that reside on that node only.
>>> 
>>> I have been using secondary index for a low cardinality column called
>>> "product". There can only be 3 possible values for this column. I have a
>>> four node cluster and process about 5000 records per second with a RF 2.
>>> 
>>> My question here is, what happens after the number of columns in hidden
>>> index CF exceeds 2 billion? How does Cassandra handle this situation? I
>>> guess, one way to handle this is to add more nodes to the cluster. I am
>>> interested in knowing if any other solutions exist.
>>> 
>>> Thanks,
>>> Kamal
>> 
>> 



Re: Max records per node for a given secondary index value

2012-01-18 Thread Mohit Anchlia
You need to shard your rows

On Wed, Jan 18, 2012 at 5:46 PM, Kamal Bahadur wrote:
> Anyone?
>
>
> On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur wrote:
>>
>> Hi All,
>>
>> It is great to know that Cassandra column family can accommodate 2 billion
>> columns per row! I was reading about how Cassandra stores the secondary
>> index info internally. I now understand that the index related data are
>> stored in hidden CF and each node is responsible to store the keys of data
>> that reside on that node only.
>>
>> I have been using secondary index for a low cardinality column called
>> "product". There can only be 3 possible values for this column. I have a
>> four node cluster and process about 5000 records per second with a RF 2.
>>
>> My question here is, what happens after the number of columns in hidden
>> index CF exceeds 2 billion? How does Cassandra handle this situation? I
>> guess, one way to handle this is to add more nodes to the cluster. I am
>> interested in knowing if any other solutions exist.
>>
>> Thanks,
>> Kamal
>
>


Re: Max records per node for a given secondary index value

2012-01-18 Thread Kamal Bahadur
Anyone?

On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur wrote:

> Hi All,
>
> It is great to know that Cassandra column family can accommodate 2 billion
> columns per row! I was reading about how Cassandra stores the secondary
> index info internally. I now understand that the index related data are
> stored in hidden CF and each node is responsible to store the keys of data
> that reside on that node only.
>
> I have been using secondary index for a low cardinality column called
> "product". There can only be 3 possible values for this column. I have a
> four node cluster and process about 5000 records per second with a RF 2.
>
> My question here is, what happens after the number of columns in hidden
> index CF exceeds 2 billion? How does Cassandra handle this situation? I
> guess, one way to handle this is to add more nodes to the cluster. I am
> interested in knowing if any other solutions exist.
>
> Thanks,
> Kamal
>