Re: Multiblog Scenario: Schema Design

2010-02-01 Thread William Aue
You can consult a full example of a multiblog schema here: http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model In fact, Cassandra is not an RDBMS, so you'll have to create a CF for everything you want to "index", as Jonathan suggested. On Tue, Feb 2, 2010 at 10:27 AM, Jonathan Ellis w

Re: Multiblog Scenario: Schema Design

2010-02-01 Thread Jonathan Ellis
You would add a columnfamily with a row per month and write blog posts (either ids or entire post data) to that CF. On Mon, Feb 1, 2010 at 9:10 PM, Rockett Williams wrote: > Most people are aware of Evan Weaver's (from Twitter) blog post introducing > Cassandra. > http://blog.evanweaver.com/artic
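
The layout Jonathan describes can be sketched without a running cluster. The snippet below models a column family as a plain Python dictionary; the names `posts_by_month`, `month_key`, `add_post`, and `posts_in_month` are hypothetical and are not part of any Cassandra API. It only illustrates the idea of one row per month with one column per post, assuming post ids (rather than full post data) are stored.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for a column family:
# row key (month) -> {column name (timestamp, post id) -> column value (post id)}
posts_by_month = defaultdict(dict)

def month_key(ts: datetime) -> str:
    """Row key for the month containing ts, e.g. '2010-02'."""
    return ts.strftime("%Y-%m")

def add_post(post_id: str, ts: datetime) -> None:
    """Write the post id as a column in the row for its month."""
    posts_by_month[month_key(ts)][(ts, post_id)] = post_id

def posts_in_month(year: int, month: int) -> list:
    """Read one row and return its post ids in time order."""
    row = posts_by_month.get(f"{year:04d}-{month:02d}", {})
    return [row[name] for name in sorted(row)]

add_post("post-2", datetime(2010, 2, 1, 21, 10, tzinfo=timezone.utc))
add_post("post-1", datetime(2010, 2, 1, 9, 0, tzinfo=timezone.utc))
add_post("post-3", datetime(2010, 3, 5, 12, 0, tzinfo=timezone.utc))
print(posts_in_month(2010, 2))
```

Sorting the column names by timestamp stands in for the ordering a real column family would apply through its comparator; querying a month then becomes a single-row read.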

Multiblog Scenario: Schema Design

2010-02-01 Thread Rockett Williams
Most people are aware of Evan Weaver's (from Twitter) blog post introducing Cassandra. http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/ In the post he uses an example multiblog application -> a blog for multiple users. I was wondering how you would be able to query by

Re: Recommended way to do parallel reads of a large column slice?

2010-02-01 Thread Jonathan Ellis
If you want to parallelize (a good idea in general) you are best served by doing so across rows rather than across columns. (Another possibility if you have a relatively static breakdown of columns that makes sense is to spread them across different CFs w/ the same key.) -Jonathan On Mon, Feb 1,
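
As a rough illustration of parallelizing across rows, the sketch below assumes the data has already been split over several row keys and reads each row on its own worker thread. `FAKE_STORE`, `fetch_row`, and `fetch_rows_in_parallel` are hypothetical names; `fetch_row` is a placeholder for whatever single-row read the client library provides.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: one logical collection split across several row keys.
FAKE_STORE = {
    "items:0": {"a": 1, "b": 2},
    "items:1": {"c": 3},
    "items:2": {"d": 4, "e": 5},
}

def fetch_row(row_key):
    """Placeholder for a single-row read; a real client call would go here."""
    return FAKE_STORE.get(row_key, {})

def fetch_rows_in_parallel(row_keys, max_workers=4):
    """Issue one read per row concurrently and merge the resulting columns."""
    merged = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of row_keys in its results.
        for columns in pool.map(fetch_row, row_keys):
            merged.update(columns)
    return merged

print(sorted(fetch_rows_in_parallel(["items:0", "items:1", "items:2"])))
```

Because each row key is known up front, the work divides cleanly without needing to guess column names to use as slice boundaries.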

Re: Recommended way to do parallel reads of a large column slice?

2010-02-01 Thread Cagatay Kavukcuoglu
A large column slice in my case is tens of thousands of columns, each a few KB in size and processed independently of the others. My plan was to read slices of a few hundred to a thousand columns and process them in a pipeline for reduced overall latency. Regardless of my specific case, though, I

Re: Best design in Cassandra

2010-02-01 Thread Brandon Williams
On Mon, Feb 1, 2010 at 5:20 PM, Erik Holstad wrote: > Hey! > Have a couple of questions about the best way to use Cassandra. > Using the random partitioner + the multi_get calls vs order preservation + > range_slice calls? > When you use an OPP, the distribution of your keys becomes your problem

Best design in Cassandra

2010-02-01 Thread Erik Holstad
Hey! Have a couple of questions about the best way to use Cassandra. Using the random partitioner + the multi_get calls vs order preservation + range_slice calls? What is the benefit of using multiple families vs super column? For example in the case of sorting in different orders. One good thing

Re: Recommended way to do parallel reads of a large column slice?

2010-02-01 Thread Jonathan Ellis
No. Why do you want to do multiple parallel reads instead of one sequential read? On Mon, Feb 1, 2010 at 4:45 PM, Cagatay Kavukcuoglu wrote: > Hi, > > What's the recommended way to do parallel reads of a large slice of > columns when one doesn't know enough about the column names to divide > the

Recommended way to do parallel reads of a large column slice?

2010-02-01 Thread Cagatay Kavukcuoglu
Hi, What's the recommended way to do parallel reads of a large slice of columns when one doesn't know enough about the column names to divide them for parallel reading in a meaningful way? SliceRange allows setting the start and finish column names, but you wouldn't be able to set the start field

Re: Did CASSANDRA-647 get fixed in 0.5?

2010-02-01 Thread Jonathan Ellis
Can you create a ticket for this? Thanks! On Mon, Feb 1, 2010 at 4:11 PM, Omer van der Horst Jansen wrote: > I checked out the 0.5 branch and ran ant release (on my linux box). > Installed the new tar.gz and ran the test on my Windows laptop as before but > got the same result -- the key isn't d

Re: Did CASSANDRA-647 get fixed in 0.5?

2010-02-01 Thread Omer van der Horst Jansen
I checked out the 0.5 branch and ran ant release (on my linux box). Installed the new tar.gz and ran the test on my Windows laptop as before but got the same result -- the key isn't deleted from the perspective of get_range_slice. Omer From: Jonathan Ellis

Sample applications

2010-02-01 Thread Carlos Sanchez
Hi, I am new to Cassandra and I was wondering if someone has developed simple applications (Java) that would serve as a guide to understanding it. Thanks a lot, Carlos

Re: Internal structure of api calls

2010-02-01 Thread Erik Holstad
Thanks a lot Brandon!

Re: Did CASSANDRA-647 get fixed in 0.5?

2010-02-01 Thread Jonathan Ellis
647 was committed for 0.5, yes, but CASSANDRA-703 was not. Can you try the 0.5 branch and see if it is fixed there? On Mon, Feb 1, 2010 at 3:26 PM, Omer van der Horst Jansen wrote: > I'm running > into an issue with Cassandra 0.5 (the current release version) that > sounds exactly like the descr

Re: Internal structure of api calls

2010-02-01 Thread Brandon Williams
On Mon, Feb 1, 2010 at 3:48 PM, Erik Holstad wrote: > Hey guys! > > I'm totally new to Cassandra and have a couple of questions about the > internal structure of some of the calls. > > When using the slicerange(count) for the get calls, is the actual result > truncated on the server > or i

Internal structure of api calls

2010-02-01 Thread Erik Holstad
Hey guys! I'm totally new to Cassandra and have a couple of questions about the internal structure of some of the calls. When using the slicerange(count) for the get calls, is the actual result truncated on the server, or does it happen on the client, i.e., is it more efficient than the regula

Did CASSANDRA-647 get fixed in 0.5?

2010-02-01 Thread Omer van der Horst Jansen
I'm running into an issue with Cassandra 0.5 (the current release version) that sounds exactly like the description of issue CASSANDRA-647. I'm using the Thrift Java API to store a couple of columns in a single row. A few seconds after that my application deletes the entire row. A plain Cassa

Re: Error running chiton GTK

2010-02-01 Thread Brandon Williams
On Sun, Jan 31, 2010 at 3:07 AM, Richard Grossman wrote: > Hi > > Sorry, I managed to launch chiton, but it stays stuck when trying to retrieve > the keyspaces and nothing else happens. > Maybe it's not compatible with Cassandra 0.5? > thanks > Chiton uses twisted, so it requires the framed transport to be us

setLog4jLevel Operation

2010-02-01 Thread Adam Holmberg
Greetings. I'm just getting acquainted with the interfaces exposed by Cassandra via JMX (presently have 0.4.2 installed). I'm curious if someone can explain to me how to use the 'setLog4jLevel' operation in org.apache.cassandra.service StorageServices. I've tried invoking for various values of p1

Re: Cassandra error with large connection

2010-02-01 Thread Jonathan Ellis
On Mon, Feb 1, 2010 at 10:03 AM, Jonathan Ellis wrote: >> I see a lot of CLOSE_WAIT TCP connection. Also, this sounds like you are not properly pooling client connections to Cassandra. You should have one connection per user, not one connection per operation. -Jonathan
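
One way to avoid opening a connection per operation is a small fixed-size pool. The sketch below is a minimal, hypothetical example: `ConnectionPool` and `fake_connect` are illustrative names, and `fake_connect` stands in for whatever call actually opens a client connection. It does not handle broken connections or timeouts.

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Minimal fixed-size pool; `connect` is a hypothetical factory callable."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)     # return it for reuse instead of closing

opened = []

def fake_connect():
    """Stand-in for opening a real client connection."""
    conn = object()
    opened.append(conn)
    return conn

pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # perform one operation with conn

print(len(opened))
```

Ten operations reuse the same two connections, so no sockets are opened and discarded per request.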

Re: Cassandra error with large connection

2010-02-01 Thread Jonathan Ellis
As usual with OOME, you can fix it by giving the JVM a larger max heap. In this case, you can also mitigate it by reducing the per-thread stack to the minimum with -Xss. I believe that in 1.6 the minimum is 64k on 32bit jvm and 2x that for 64bit. -Jonathan On Mon, Feb 1, 2010 at 9:56 AM, JKnigh

Cassandra error with large connection

2010-02-01 Thread JKnight JKnight
Dear all, When working with a large number of users, we get an error: ERROR [main] 2010-02-01 17:12:37,354 CassandraDaemon.java (line 71) Fatal exception in thread Thread[main,5,main] java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method)

Re: How do cassandra clients failover?

2010-02-01 Thread Jonathan Ellis
No. Thrift is just an RPC mechanism. Whether RRDNS, software or hardware load balancing, or client-based failover like Gary describes is best is not a one-size-fits-all answer. 2010/2/1 Noble Paul നോബിള്‍ नोब्ळ् : > is it worth adding this feature to the standard java client? > > On Mon, Feb 1,

Re: How do cassandra clients failover?

2010-02-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
is it worth adding this feature to the standard java client? On Mon, Feb 1, 2010 at 7:28 PM, Gary Dusbabek wrote: > One approach is to discover what other nodes there are before any of > them fail.  Then when you detect failure, you can connect to a > different node that is (hopefully) still resp

Re: How do cassandra clients failover?

2010-02-01 Thread Gary Dusbabek
One approach is to discover what other nodes there are before any of them fail. Then when you detect failure, you can connect to a different node that is (hopefully) still responding. There is an API call that allows you to get a list of all the nodes: client.get_string_property("token map"), wh
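
The client-side failover Gary outlines might look roughly like the sketch below. It assumes the list of nodes has already been discovered and is passed in; `FailoverClient`, `fake_connect`, and the IP addresses are hypothetical, and `connect` stands in for whatever call opens a connection to a single node.

```python
class FailoverClient:
    """Tries known nodes in order until one accepts a connection.

    `connect` is a hypothetical callable that takes a host and returns a
    connection object, raising ConnectionError on failure.
    """

    def __init__(self, hosts, connect):
        self._hosts = list(hosts)
        self._connect = connect

    def open(self):
        last_error = None
        for host in self._hosts:
            try:
                return host, self._connect(host)
            except ConnectionError as exc:
                last_error = exc  # remember the failure and try the next node
        raise ConnectionError("no node reachable") from last_error

DOWN = {"10.0.0.1"}

def fake_connect(host):
    """Stand-in for a real connection attempt; fails for hosts in DOWN."""
    if host in DOWN:
        raise ConnectionError(f"{host} is down")
    return f"connection to {host}"

client = FailoverClient(["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_connect)
host, conn = client.open()
print(host)
```

Trying hosts in a fixed order is the simplest policy; shuffling the list per client would spread load more evenly across nodes.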

How do cassandra clients failover?

2010-02-01 Thread Noble Paul നോബിള്‍ नोब्ळ्
The Cassandra client (Thrift client) is started up with the host:port of a single Cassandra node. * What happens if that node fails? * Does it mean that all the operations go through the same node? --Noble