Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-08 Thread Jean-Christophe Sirot

On 03/07/2011 10:08 PM, Aaron Morton wrote:

You can fill your boots.

So long as your boots have a capacity of 2 billion.

Background ...
http://wiki.apache.org/cassandra/LargeDataSetConsiderations

http://wiki.apache.org/cassandra/CassandraLimitations

http://www.pcworld.idg.com.au/article/373483/new_cassandra_can_pack_two_billion_columns_into_row/



Thanks, I hadn't seen these wiki pages.

--
Jean-Christophe Sirot


Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-07 Thread Jean-Christophe Sirot

Hello,

On 03/06/2011 06:35 PM, Aditya Narayan wrote:

Next, I also need to store the blogComments, which I am planning to
store all together in another single row, one comment per column. Thus the
entire information about a single comment, like commentBody and
commentor, would be serialized (using Google Protocol Buffers) and
stored in a single column.


Is there any limitation/issue in having a single row with a lot of
columns? For instance, can I have millions of columns in a single row?


--
Jean-Christophe Sirot



Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-07 Thread Aaron Morton
You can fill your boots.

So long as your boots have a capacity of 2 billion.

Background ...
http://wiki.apache.org/cassandra/LargeDataSetConsiderations

http://wiki.apache.org/cassandra/CassandraLimitations

http://www.pcworld.idg.com.au/article/373483/new_cassandra_can_pack_two_billion_columns_into_row/

aaron
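
As a rough back-of-the-envelope check on the limit mentioned above, the sketch below compares a comment count against the 2 billion column figure and estimates the row's raw size. Column count is not the only thing to watch, because a row is stored whole on each of its replicas. The comment count and average size here are made-up illustrative numbers, and estimate_row is a hypothetical helper, not part of Cassandra.

```python
MAX_COLUMNS_PER_ROW = 2_000_000_000  # the 2 billion column figure quoted above


def estimate_row(comment_count, avg_comment_bytes):
    """Return (fits_column_limit, approximate raw row size in bytes).

    Ignores per-column overhead (names, timestamps), so the real size is larger.
    """
    fits = comment_count <= MAX_COLUMNS_PER_ROW
    return fits, comment_count * avg_comment_bytes


# Assumed workload: 5 million comments of about 500 bytes each.
fits, size_bytes = estimate_row(comment_count=5_000_000, avg_comment_bytes=500)
print(fits, size_bytes / 1e9, "GB")
```

Under these assumed numbers the column count is far below the limit, so the byte size of the row is the figure worth watching.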



What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-06 Thread Aditya Narayan
What would be a good strategy for storing large text content (blog posts
of around 1500-3000 characters) in Cassandra? I need to store these
blog posts along with their metadata, such as bloggerId and blogTags. I am
planning to store this data in a single row, giving each
attribute its own column, so one blog post per row. Is using a single
column for a large blog post like this a good strategy?

Next, I also need to store the blogComments, which I am planning to
store all together in another single row, one comment per column. Thus the
entire information about a single comment, like commentBody and
commentor, would be serialized (using Google Protocol Buffers) and
stored in a single column.
For storing the number of likes of each comment, I am planning to
keep a counter_column in the same row for each comment, which will
hold the number of 'likes' of that comment.

Any suggestions on the above design would be highly appreciated. Thanks.
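
For illustration only, the layout described in this message could be modelled as below. This is a minimal sketch that uses plain Python dictionaries as stand-ins for the two column families and does not call any Cassandra client. JSON replaces the Protocol Buffers serialization so that the example is self-contained. The function names and the timestamp-based column names are assumptions, and the per-comment likes counter is left out.

```python
import json
import time

# Hypothetical in-memory stand-ins for two column families:
# row key -> {column name -> column value}
blog_posts = {}
blog_comments = {}


def store_post(post_id, blogger_id, tags, body):
    """One row per post, one column per attribute."""
    blog_posts[post_id] = {
        "bloggerId": blogger_id,
        "blogTags": ",".join(tags),
        "body": body,  # the whole 1500-3000 character text in one column
    }


def add_comment(post_id, commentor, comment_body, timestamp=None):
    """One comments row per post, one column per comment."""
    if timestamp is None:
        timestamp = time.time()
    # JSON stands in for the Protocol Buffers serialization described above.
    value = json.dumps({"commentor": commentor, "commentBody": comment_body})
    # Zero-padded microsecond timestamp, so names sort chronologically as strings.
    column_name = "%020d" % int(timestamp * 1_000_000)
    blog_comments.setdefault(post_id, {})[column_name] = value
    return column_name


def read_comments(post_id):
    """Return the deserialized comments in column-name (time) order."""
    row = blog_comments.get(post_id, {})
    return [json.loads(row[name]) for name in sorted(row)]
```

A short usage example: after store_post("post-1", "blogger-42", ["cassandra"], "text ...") and two add_comment("post-1", ...) calls, read_comments("post-1") returns both comments, oldest first.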


Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-06 Thread Aaron Morton
Sounds reasonable: one CF (column family) for the blog posts and one CF for
the comments. You could also use a single CF if you will often read the blog
post and its comments at the same time. The best design is the one that suits
how your app works; try one and be prepared to change.

Note that counters are only in the 0.8 trunk and are still under development;
they are not going to be released for a couple of months.

Your per-column data size is nothing to be concerned about.

Hope that helps.
Aaron 
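
To make the single-CF alternative above concrete, here is a minimal sketch in which a post's attributes and its comments share one row, so a single row read returns both. It is a plain-dictionary model rather than real client code. The "meta:" and "comment:" column-name prefixes and the function name are assumptions chosen for illustration.

```python
# Hypothetical single-column-family layout: row key -> {column name -> value}
posts_with_comments = {
    "post-1": {
        "meta:bloggerId": "blogger-42",
        "meta:body": "Post text ...",
        "comment:0001": "first comment",
        "comment:0002": "second comment",
    }
}


def read_post_and_comments(row_key):
    """Read one row and split its columns by name prefix."""
    row = posts_with_comments[row_key]
    meta = {
        name[len("meta:"):]: value
        for name, value in row.items()
        if name.startswith("meta:")
    }
    # Sorting by column name keeps the comments in insertion-number order.
    comments = [
        value for name, value in sorted(row.items())
        if name.startswith("comment:")
    ]
    return meta, comments
```

With two CFs the same page view would need two reads, one per column family. Which layout is better depends on how often the post and its comments are fetched together.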



Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

2011-03-06 Thread Aditya Narayan
Thanks Aaron!!

I didn't know about the upcoming built-in counters. This
sounds really great for my use case! Could you let me know where
I can read more about this, if it has been blogged about somewhere?

I'll go forward with the one (entire) blog post per column design.

Thanks


