> I know the hard limit is 2 billion columns per row. My question is at what
> size it will slow down read/write performance and maintenance. The blog I
> referenced said the row size should be less than 10MB.
A look at read performance with different row sizes…

http://thelastpickle.com/2011/10/03/Reverse-Comparators/
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
> Are there any other ways to model historical data (or time-series data)
> besides wide row column slicing in Cassandra?
Not that I am aware of. You will need to partition the rows.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/02/2012, at 12:41 PM, Data Craftsman wrote:

> Hi Aaron Morton and R. Verlangen,
>
> Thanks for the quick answer. It's good to know Thrift's limit on the amount
> of data it will accept / send.
>
> I know the hard limit is 2 billion columns per row. My question is at what
> size it will slow down read/write performance and maintenance. The blog I
> referenced said the row size should be less than 10MB.
>
> It would be better if Cassandra could transparently shard/split the wide row
> and distribute it across many nodes, to help with load balancing.
>
> Are there any other ways to model historical data (or time-series data)
> besides wide row column slicing in Cassandra?
>
> Thanks,
> Charlie | Data Solution Architect Developer
> http://mujiang.blogspot.com
>
>
> On Thu, Feb 16, 2012 at 12:38 AM, aaron morton <aa...@thelastpickle.com>
> wrote:
> > Based on this blog of Basic Time Series with Cassandra data modeling,
> > http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
> I've not read that one but it sounds right. Matt Dennis knows his stuff:
> http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
>
> > There is a limit on how big the row size can be before slowing down the
> > update and query performance, that is 10MB or less.
> There is no hard limit. Wide rows won't upset writes too much. Some read
> queries can avoid problems but most will not.
>
> Wide rows are a pain when it comes to maintenance. They take longer to
> compact and repair.
>
> > Is this still true in Cassandra's latest version? Or in what release will
> > Cassandra remove this limit?
> There is a limit of 2 billion columns per row. There is not a limit of 10MB
> per row.
> I've seen some rows in the 100s of MB and they are always a pain.
>
> > Manually sharding the wide row will increase the application complexity;
> > it would be better if Cassandra could handle it transparently.
> It's not that hard :)
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16/02/2012, at 7:40 AM, Data Craftsman wrote:
>
> > Hello experts,
> >
> > Based on this blog of Basic Time Series with Cassandra data modeling,
> > http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
> >
> > "This (wide row column slicing) works well enough for a while, but over
> > time, this row will get very large. If you are storing sensor data that
> > updates hundreds of times per second, that row will quickly become
> > gigantic and unusable. The answer to that is to shard the data up in
> > some way."
> >
> > There is a limit on how big the row size can be before slowing down the
> > update and query performance, that is 10MB or less.
> >
> > Is this still true in Cassandra's latest version? Or in what release will
> > Cassandra remove this limit?
> >
> > Manually sharding the wide row will increase the application complexity;
> > it would be better if Cassandra could handle it transparently.
> >
> > Thanks,
> > Charlie | DBA & Developer
> >
> > p.s. Quora link:
> > http://www.quora.com/Cassandra-database/What-are-good-ways-to-design-data-model-in-Cassandra-for-historical-data
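For readers following the thread: the manual row partitioning Aaron suggests is usually done by folding a time bucket into the row key, so each sensor gets one bounded row per time window instead of one ever-growing wide row. The sketch below shows only the key-generation logic; the key format, bucket size, and function names are illustrative assumptions (not from the thread or any Cassandra client library), and the actual column reads/writes are omitted.

```python
from datetime import datetime, timezone

# Illustrative bucket size: one row per sensor per day (an assumption,
# not a recommendation from the thread). Smaller buckets mean smaller
# rows but more rows to touch per range query.
BUCKET_SECONDS = 86400

def row_key(sensor_id: str, ts: datetime) -> str:
    """Row key for an event at time `ts` (naive datetimes treated as UTC)."""
    epoch = int(ts.replace(tzinfo=timezone.utc).timestamp())
    bucket = epoch - (epoch % BUCKET_SECONDS)  # round down to bucket start
    return f"{sensor_id}:{bucket}"

def row_keys_for_range(sensor_id: str, start: datetime, end: datetime):
    """All row keys a range query must slice, oldest bucket first."""
    first = int(start.replace(tzinfo=timezone.utc).timestamp())
    last = int(end.replace(tzinfo=timezone.utc).timestamp())
    first -= first % BUCKET_SECONDS  # align the first key to its bucket
    return [f"{sensor_id}:{b}" for b in range(first, last + 1, BUCKET_SECONDS)]
```

Writes go to `row_key(sensor_id, ts)`; a range read slices columns from each key returned by `row_keys_for_range`, so no single row grows without bound, at the cost of the extra lookups Charlie calls application complexity.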