Re: High CPU usage on all nodes without any read or write

2010-07-09 Thread Peter Schuller
But in Cassandra output log : r...@cassandra-2:~#  tail -f /var/log/cassandra/output.log  INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 reclaimed leaving 1684169392 used; max is 6563430400  INFO 15:32:09,875 GC for ConcurrentMarkSweep: 1363 ms, 4296991416 reclaimed

Re: Why is cassandra named cassandra?

2010-07-09 Thread Daniel Jue
It's in a FAQ somewhere. Based on this: http://en.wikipedia.org/wiki/Cassandra An oracle might also be called a prophet. On Thu, Jul 8, 2010 at 9:43 PM, ChingShen chingshenc...@gmail.com wrote: Hi,   Why is cassandra named cassandra? Thanks. Shen

Re: Why is cassandra named cassandra?

2010-07-09 Thread Pieter Maes
It is explained in the video on the site ;) On 9/07/10 03:43, ChingShen wrote: Hi, Why is cassandra named cassandra? Thanks. Shen

UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
Hi, I am a bit confused about getting an UnavailableException when doing a QUORUM write. I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM write succeeds. When 1 of the 3 nodes is down, the QUORUM write fails with UnavailableException. Shouldn't it be enough with 2
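
For reference, a quorum is a majority of the replicas, i.e. floor(RF / 2) + 1, so with RF=3 two live replicas should in principle satisfy a QUORUM write. A minimal sketch of that arithmetic follows; the function names are illustrative and are not part of Cassandra's or Thrift's API, and the sketch does not explain why the poster's cluster raised the exception.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def quorum_reachable(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas are alive to satisfy a QUORUM request."""
    return live_replicas >= quorum(replication_factor)


if __name__ == "__main__":
    # RF=3 with one node down leaves 2 live replicas.
    print(quorum(3), quorum_reachable(3, 2))
```

If a QUORUM write still fails with two of three replicas alive, the count of live replicas seen by the coordinator is the first thing worth checking.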

total disk space used on a node for a CF is too large than expected

2010-07-09 Thread Sagar Agrawal
Row size is 10 KB and the write count on a node for a CF is 1054451, so ideally the total disk space used on that node by that CF should be around 10 GB, but it's showing 23 GB. What else might be taking up so much space? Thanks
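
The poster's estimate can be reproduced as a quick calculation. This is only a sketch of the arithmetic in the message, assuming 1 KB = 1024 bytes and that every counted write stores one full 10 KB row; the constant names are made up for illustration.

```python
ROW_SIZE_BYTES = 10 * 1024      # "row size is 10 KB" (assuming 1 KB = 1024 bytes)
WRITE_COUNT = 1_054_451         # write count reported for the CF on that node
OBSERVED_GIB = 23               # disk usage reported by the poster

expected_bytes = ROW_SIZE_BYTES * WRITE_COUNT
expected_gib = expected_bytes / 1024 ** 3
ratio = OBSERVED_GIB / expected_gib

if __name__ == "__main__":
    print(f"expected ~{expected_gib:.2f} GiB, observed {OBSERVED_GIB} GiB, ratio ~{ratio:.2f}x")
```

The observed usage is a little over twice the naive estimate, which is the gap the thread is asking about.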

Re: UnavailableException on QUORUM write

2010-07-09 Thread ChingShen
Which client library do you use? Shen On Fri, Jul 9, 2010 at 4:53 PM, Per Olesen p...@trifork.com wrote: Hi, I am a bit confused about getting an UnavailableException when doing a QUORUM write. I have a 3 node cluster, with RF=3. When all 3 nodes are up, the QUORUM write succeeds. When 1

Re: UnavailableException on QUORUM write

2010-07-09 Thread Per Olesen
On Jul 9, 2010, at 11:11 AM, ChingShen wrote: Which client library do you use? Direct on thrift api using thrift.jar, in version 917130.

new node can't find seed node

2010-07-09 Thread Boris Spasojevic
Hi, I am attempting to add a new node to a single node already running. I have set the first node not to bootstrap, and the second node to bootstrap with the first node as its seed. The IP configuration is OK, the machines can ping each other, the seed machine (or should I say cassandra

Re: new node can't find seed node

2010-07-09 Thread Boris Spasojevic
Solved it! Sorry to spam your inbox! BoriS On Fri, 2010-07-09 at 11:50 +0200, Boris Spasojevic wrote: Hi, I am attempting to add a new node to a single node already running. I have set the first node not to bootstrap, and the second node to bootstrap with the first node as its seed.

Re: total disk space used on a node for a CF is too large than expected

2010-07-09 Thread Sagar Agrawal
What does WriteCount actually signify? It should also include writes which are replicas, right? It is the total number of writes on that node for that CF till now, right? On Fri, Jul 9, 2010 at 2:39 PM, Sagar Agrawal sna...@gmail.com wrote: row size is 10 KB and write count on a node for a CF is

Re: new node can't find seed node

2010-07-09 Thread Dimitry Lvovsky
Sounds like maybe you're not binding the 7000 port to the correct interface; maybe you have it set to localhost rather than the IP address. If you want to confirm, try: telnet [machine ip] 7000 If you get a connection refused, then the above is true. Hope this helps. Dimitry Lvovsky
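
The same connectivity check can be scripted instead of using telnet. The sketch below only tests whether a TCP connection to the given host and port can be opened; the function name is illustrative, and port 7000 is taken from the message above.

```python
import socket


def port_is_open(host: str, port: int = 7000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    import sys
    target = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    print(target, "port 7000 open:", port_is_open(target))
```

Run it from the new node against the seed's IP address; a False result from the other machine combined with True on the seed itself would be consistent with the port being bound to localhost only.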

Iterate all keys - doing it as the faq fails for me :(

2010-07-09 Thread Per Olesen
Hi, I was reading http://wiki.apache.org/cassandra/FAQ#iter_world and decided to implement the get_range_slices method for listing all keys of a CF. Only thing is, it doesn't work that well for me :-) I do as it says (I think), and take KeyRanges of size N and use the key of the last call as
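
The paging approach described in the message can be sketched as follows. This is not the Thrift API: `fetch_range(start_key, count)` is a hypothetical callable standing in for a get_range_slices call, assumed to return up to `count` keys in a consistent order, starting at and including `start_key`. The in-memory `make_fake_fetch` helper exists only so the sketch runs on its own.

```python
from typing import Callable, Iterator, List


def iter_all_keys(fetch_range: Callable[[str, int], List[str]],
                  page_size: int = 100) -> Iterator[str]:
    """Yield every key once, paging with the last key of each page."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""          # assumption: empty string means "from the beginning"
    first_page = True
    while True:
        page = fetch_range(start, page_size)
        # Every page after the first begins with the key we already yielded.
        yield from (page if first_page else page[1:])
        if len(page) < page_size:
            return      # short page: nothing left to fetch
        start = page[-1]
        first_page = False


def make_fake_fetch(all_keys: List[str]) -> Callable[[str, int], List[str]]:
    """In-memory stand-in for the real range query (illustration only)."""
    ordered = sorted(all_keys)

    def fetch(start: str, count: int) -> List[str]:
        return [k for k in ordered if k >= start][:count]

    return fetch


if __name__ == "__main__":
    keys = ["k%03d" % i for i in range(25)]
    print(len(list(iter_all_keys(make_fake_fetch(keys), page_size=10))))
```

The detail that is easy to miss is the duplicate at the start of each later page: because the start key is inclusive, it has to be skipped on every call except the first.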

manual InitialToken assignemnt

2010-07-09 Thread Sagar Agrawal
I have a 2 node cluster: node1 - <InitialToken>5</InitialToken>, node2 - <InitialToken>9</InitialToken>. If I insert a row with key=a, which node should it go to and why? It is going to node1, but I think it should go to node2, since the token value of that node is closer to a (using java string compareTo method)

Re: manual InitialToken assignemnt

2010-07-09 Thread Jonathan Ellis
see the beginning of http://wiki.apache.org/cassandra/Operations On Fri, Jul 9, 2010 at 7:16 AM, Sagar Agrawal sna...@gmail.com wrote: I have a 2 node cluster: node1 - <InitialToken>5</InitialToken>, node2 - <InitialToken>9</InitialToken>. If I insert a row with key=a, which node should it go to and why?

Re: manual InitialToken assignemnt

2010-07-09 Thread Per Olesen
Are you using OrderPreservingPartitioner or RandomPartitioner? Because if you are using RandomPartitioner, a hash is calculated from "a", and that hash, not "a" itself, is used to determine where the data for a key goes. On Jul 9, 2010, at 2:16 PM, Sagar Agrawal wrote: I have a 2 node cluster node1 -
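
The placement rule can be sketched under one stated assumption: each node owns the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around the ring. The `owner` and `md5_token` helpers below are illustrative names, not Cassandra code, and `md5_token` only approximates how a hash-based partitioner derives a token from a key.

```python
import bisect
import hashlib


def owner(node_tokens: dict, key_token):
    """Return the node whose token is the first one >= key_token, wrapping."""
    tokens = sorted(node_tokens)
    index = bisect.bisect_left(tokens, key_token)
    return node_tokens[tokens[index % len(tokens)]]


def md5_token(key: str) -> int:
    """Approximation of a hash-derived token: absolute value of the MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


if __name__ == "__main__":
    # Order-preserving case: the key itself is compared with string tokens.
    print(owner({"5": "node1", "9": "node2"}, "a"))
    # Hash-based case: the hash of the key is compared with numeric tokens.
    print(owner({5: "node1", 9: "node2"}, md5_token("a")))
```

Under this rule both cases pick node1: "a" sorts after "9" as a string, and the MD5-derived token is far larger than 9, so the lookup wraps around to the node with the lowest token. Closeness to a token does not matter, only which range the key or its hash falls into.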

Re: Digg 4 Preview on TWiT

2010-07-09 Thread Terje Marthinussen
http://twitter.com/nk/status/17903187277 Another not using joke?

Re: NYC Cassandra training

2010-07-09 Thread S Ahmed
My previous reply seemed to have bounced. Will there be a training day before/after the Cassandra Summit? (in SF on the 10th) On Fri, Jul 2, 2010 at 2:08 PM, Jonathan Ellis jbel...@gmail.com wrote: Riptano's one day Cassandra training is coming to NYC in August, our first public session on the

Re: manual InitialToken assignemnt

2010-07-09 Thread Sagar Agrawal
got it, thanks On Fri, Jul 9, 2010 at 6:21 PM, Per Olesen p...@trifork.com wrote: Are you using OrderPreservingPartitioner or RandomPartitioner? Because if you are using RandomPartitioner, a hash is calculated from "a", and that hash, not "a" itself, is used to determine where the data for a key goes.

Re: NYC Cassandra training

2010-07-09 Thread Jeremy Dunck
On Fri, Jul 2, 2010 at 1:08 PM, Jonathan Ellis jbel...@gmail.com wrote: Riptano's one day Cassandra training is coming to NYC in August, our first public session on the East coast: http://www.eventbrite.com/event/749518831 Is there a calendar where you're listing this stuff, or is it just

Help! Cassandra disk space utilization WAY higher than I would expect

2010-07-09 Thread Julie
Hi guys, I am on the hook to explain why 30GB of data is filling up 106GB of disk space since this is concerning information for my project. We are very excited about the possibility of using Cassandra but need to understand this anomaly in order to feel confident. Does anyone know why this

Last day to submit your Surge 2010 CFP!

2010-07-09 Thread Jason Dixon
Today is your last chance to submit a CFP abstract for the 2010 Surge Scalability Conference. The event is taking place on Sept 30 and Oct 1, 2010 in Baltimore, MD. Surge focuses on case studies that address production failures and the re-engineering efforts that led to victory in Web

RE: Help! Cassandra disk space utilization WAY higher than I would expect

2010-07-09 Thread Stu Hood
Cassandra has a very high constant per-row overhead at the moment of around 40 bytes. Additionally, there is around 12 bytes of overhead per column. Finally, column names are repeated for each row. CASSANDRA-674 and CASSANDRA-1207 will help with these overheads, but they will not be fixed
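
The overheads mentioned above can be turned into a rough size estimate. The sketch uses only the approximate figures quoted in the message (about 40 bytes per row, about 12 bytes per column, column names stored again in every row); the function name and the sample row are made up, and other sources of disk usage are deliberately ignored.

```python
def estimated_row_bytes(columns: dict,
                        row_overhead: int = 40,
                        column_overhead: int = 12) -> int:
    """Rough on-disk size of one row: fixed overheads plus names and values."""
    total = row_overhead
    for name, value in columns.items():
        # The column name is stored with every row, so it counts each time.
        total += column_overhead + len(name) + len(value)
    return total


if __name__ == "__main__":
    # Hypothetical row: many small columns make the overhead dominate.
    row = {b"col%02d" % i: b"12345678" for i in range(10)}
    payload = sum(len(v) for v in row.values())
    print(payload, estimated_row_bytes(row))
```

With small values and many columns, the names and per-column overhead can easily exceed the payload, which is one way raw data can grow well beyond its nominal size on disk.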

Re: get_range_slices

2010-07-09 Thread Jonathan Shook
FYI: https://issues.apache.org/jira/browse/CASSANDRA-1145 Yes, it's a bug. CL.ONE is a reasonable work around. On Thu, Jul 8, 2010 at 11:04 PM, Mike Malone m...@simplegeo.com wrote: I think the answer to your question is no, you shouldn't. I'm feeling far too lazy to do even light research on

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-09 Thread Jonathan Ellis
Then obsolete SSTables are not your culprit. On Thu, Jul 8, 2010 at 8:32 AM, Julie julie.su...@nextcentury.com wrote: Jonathan Ellis jbellis at gmail.com writes: SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs a GC. You can force a GC from jconsole

Re: How to stop Cassandra running in embeded mode

2010-07-09 Thread Jonathan Ellis
there's some support for this in 0.7 (see http://issues.apache.org/jira/browse/CASSANDRA-1018) but fundamentally it's not really designed to be started and stopped multiple times within the same process. On Thu, Jul 8, 2010 at 3:44 AM, Andriy Kopachevsky kopachev...@gmail.com wrote: Hi, we are

Re: Understanding atomicity in Cassandra

2010-07-09 Thread Jonathan Ellis
typically you will update both as part of a batch_mutate, and if it fails, retry the operation. re-writing any part that succeeded will be harmless. On Thu, Jul 8, 2010 at 11:13 AM, Stuart Langridge stuart.langri...@canonical.com wrote: Hi, Cassandra people! We're looking at Cassandra as a
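
The retry pattern suggested above can be sketched as follows. `send_batch` is a hypothetical callable standing in for whatever the client uses to issue a batch_mutate; it is not a real client API. The sketch relies on the point made in the message: re-sending a batch that partly succeeded is harmless, so the whole batch can simply be repeated.

```python
import time
from typing import Any, Callable


def write_with_retry(send_batch: Callable[[Any], None],
                     mutations: Any,
                     attempts: int = 3,
                     delay_seconds: float = 0.0) -> int:
    """Send the whole batch, retrying on failure; return attempts used."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            send_batch(mutations)   # same mutations every time: safe to repeat
            return attempt
        except Exception as exc:    # real code should catch narrower errors
            last_error = exc
            time.sleep(delay_seconds)
    raise last_error


if __name__ == "__main__":
    calls = []

    def flaky(batch):
        calls.append(batch)
        if len(calls) < 3:
            raise RuntimeError("simulated timeout")

    print(write_with_retry(flaky, {"row": "both updates"}))
```

Note that this gives eventual completion of both updates, not isolation: between a failed attempt and a successful retry, a reader may observe only part of the batch.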

Re: InitialToken assignemnt

2010-07-09 Thread Jonathan Ellis
Short answer: yes. Longer answer: http://wiki.apache.org/cassandra/Operations On Fri, Jul 9, 2010 at 1:19 PM, Claire Chang cla...@merchantcircle.com wrote: my keys are sequential integers and i use random partitioner in a multi-node cluster. In this case, do I still have to specify  

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread maneela a
Thanks for your quick reply, Joe. I forgot to mention that we are using PropertyFileEndPointSnitch to tell Cassandra about our network topology, and below is the property file used by that class: cat

Re: How to stop Cassandra running in embeded mode

2010-07-09 Thread Ran Tavory
The workaround I do is fork always. Each test pulls up its own jvm. On Jul 9, 2010 9:51 PM, Jonathan Ellis jbel...@gmail.com wrote: there's some support for this in 0.7 (see http://issues.apache.org/jira/browse/CASSANDRA-1018) but fundamentally it's not really designed to be started and stopped

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread maneela a
ConsistencyLevel.ONE is the default option given inside stress.py, so I am using the default one --- On Fri, 7/9/10, Bill de hÓra b...@dehora.net wrote: From: Bill de hÓra b...@dehora.net Subject: Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud To: user@cassandra.apache.org Date: Friday,

Re: RackAwareStrategy vs RackUnAwareStrategy on AWS EC2 cloud

2010-07-09 Thread Joe Stump
On Jul 9, 2010, at 1:16 PM, maneela a wrote: Is there any way to mark a cassandra node so that it is used just for replication purposes and is not Primary for any data range in the ring? I believe there is. This is what we're doing, but we do all of our writes via a queue. Derek or Mike from

TechCrunch article on Twitter and Cassandra

2010-07-09 Thread Kochheiser,Todd W - TOK-DITT-1
A good read. http://techcrunch.com/2010/07/09/twitter-analytics-mysql/ Todd