Cassandra node is not balanced, RF=2, RandomPartitioner

2011-05-12 Thread Ali Ahsan
My cluster is unbalanced. One node has 99 GB of data and the other has 87 GB. Can anyone explain why this is happening? [root@cassandra2 conf]# /root/cassandra/bin/nodetool -h 10.0.0.4 ring Address Status Load Range Ring
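
The ring output in this snippet is cut off, so the actual tokens are not visible. As a general illustration only, the following sketch computes evenly spaced initial tokens for RandomPartitioner, whose token space runs from 0 to 2**127. The function name `balanced_tokens` is hypothetical and not part of Cassandra or nodetool.

```python
# Sketch: evenly spaced tokens for RandomPartitioner (token space 0 .. 2**127).
# balanced_tokens is an illustrative helper, not a Cassandra API.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return one evenly spaced token per node, starting at 0."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Print the tokens a two-node ring would use if it were evenly split.
    for index, token in enumerate(balanced_tokens(2)):
        print(f"node {index}: {token}")
```

Comparing these values with the Range column of `nodetool ring` would show whether the tokens themselves are uneven or whether the load difference has another cause.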

Re: Commitlog Disk Full

2011-05-12 Thread Sanjeev Kulkarni
Hi Peter, Thanks for the response. I haven't explicitly set a value for the memtable_flush_after_mins parameter. Looks like the default is 60 minutes. I will try to play around with this value to see if that fixes things. Thanks again! On Thu, May 12, 2011 at 11:41 AM, Peter Schuller < peter.schul...@inf

Re: running TPC-C on cassandra clusters

2011-05-12 Thread Xiaowei Wang
Thanks! 2011/5/12 Jonathan Ellis > Not if you want the pausing/marking down fixes that were done more > recently. :) > > On Thu, May 12, 2011 at 8:39 PM, Xiaowei Wang > wrote: > > Oh sorry, we use cassandra-0.7.4 already. Is the version fine? > > > > 2011/5/12 Jonathan Ellis > >> > >> https://

Re: running TPC-C on cassandra clusters

2011-05-12 Thread Jonathan Ellis
Not if you want the pausing/marking down fixes that were done more recently. :) On Thu, May 12, 2011 at 8:39 PM, Xiaowei Wang wrote: > Oh sorry, we use cassandra-0.7.4 already. Is the version fine? > > 2011/5/12 Jonathan Ellis >> >> https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0

assertion error in cassandra when doing nodetool move

2011-05-12 Thread Anurag Gujral
Hi All, I ran the following command on one of my nodes to move the token from 0 to 2: /usr/cassandra/cassandra/bin/nodetool -h 10.170.195.204 -p 8080 move 2. I don't understand why this is happening. I am getting the following assertion error: Exception in thread "main" java.lang.Assertion

Re: running TPC-C on cassandra clusters

2011-05-12 Thread Xiaowei Wang
Oh sorry, we use cassandra-0.7.4 already. Is the version fine? 2011/5/12 Jonathan Ellis > https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7 > > On Thu, May 12, 2011 at 8:33 PM, Xiaowei Wang > wrote: > > Thanks Jonathan, but can you provide some links about 0.7 svn branch? > > >

Re: running TPC-C on cassandra clusters

2011-05-12 Thread Jonathan Ellis
https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7 On Thu, May 12, 2011 at 8:33 PM, Xiaowei Wang wrote: > Thanks Jonathan, but can you provide some links about 0.7 svn branch? > > 2011/5/12 Jonathan Ellis >> >> I'd recommend trying the 0.7 svn branch (soon to be voted on as 0.7.6)

Re: running TPC-C on cassandra clusters

2011-05-12 Thread Xiaowei Wang
Thanks Jonathan, but can you provide some links about 0.7 svn branch? 2011/5/12 Jonathan Ellis > I'd recommend trying the 0.7 svn branch (soon to be voted on as 0.7.6) > > On Thu, May 12, 2011 at 3:09 PM, Xiaowei Wang > wrote: > > Hi all, > > > > My partner and I currently using cassandra clust

Re: Crash when uploading large data sets

2011-05-12 Thread Jeffrey Kesselman
Is this a 64-bit VM? A 32-bit Java VM with default C-heap settings can only actually use about 2 GB of Java heap. On Thu, May 12, 2011 at 8:08 PM, James Cipar wrote: > Oh, forgot this detail: I have no swap configured, so swapping is not the > cause of the crash. Could it be that I'm running out

Re: Crash when uploading large data sets

2011-05-12 Thread Jonathan Ellis
If it's a jvm crash there should be a hs_err_pid.log file left around in the directory you started Cassandra from. On Thu, May 12, 2011 at 6:15 PM, James Cipar wrote: > I'm using Cassandra 0.7.5, and uploading about 200 GB of data total (20 GB > unique data), to a cluster of 10 servers.  I'm usi

nodetool move in cassandra

2011-05-12 Thread Anurag Gujral
Hi All, I want to change the token for my Cassandra node from T to T+1. It appears that in this case Cassandra brings in all the data from the other nodes (which basically means I can only move a token if RF is > 1 for my cluster). Does the Cassandra node whose token I am changing look at the

Re: Crash when uploading large data sets

2011-05-12 Thread James Cipar
Oh, forgot this detail: I have no swap configured, so swapping is not the cause of the crash. Could it be that I'm running out of memory on a 15GB machine? That seems unlikely. I grepped dmesg for "oom" and didn't see anything from the oom killer, and I used the instructions from the followi

Re: Crash when uploading large data sets

2011-05-12 Thread James Cipar
It looks like MAX_HEAP_SIZE is set in cassandra-env.sh to be half of my physical memory. These are 15GB VMs, so that's 7.5GB for Cassandra. I would have expected that to work, but I will override to 13 GB just to see what happens. I've also got the JNA thing set up. Do you think this would c

Re: Unable to add columns to empty row in Column family: Cassandra

2011-05-12 Thread Narendra Sharma
Can u share the code? On Mon, May 2, 2011 at 11:34 PM, anuya joshi wrote: > Hello, > > I am using Cassandra for my application.My Cassandra client uses Thrift > APIs directly. The problem I am facing currently is as follows: > > 1) I added a row and columns in it dynamically via Thrift API Clien

Re: Crash when uploading large data sets

2011-05-12 Thread Sameer Farooqui
The key JVM options for Cassandra are in cassandra.in.sh. What is your min and max heap size? The default setting of max heap size is 1GB. How much RAM do your nodes have? You may want to increase this setting. You can also set the -Xmx and -Xms options to the same value to keep Java from having

Crash when uploading large data sets

2011-05-12 Thread James Cipar
I'm using Cassandra 0.7.5, and uploading about 200 GB of data total (20 GB unique data), to a cluster of 10 servers. I'm using batch_mutate, and breaking the data up into chunks of about 10k records. Each record is about 5KB, so a total of about 50MB per batch. When I upload a smaller 2 GB da
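
The batch arithmetic in this message can be checked with a short sketch. It assumes decimal units (1 KB = 1000 bytes) and takes the rounded figures from the post at face value; the helper names are hypothetical.

```python
# Sketch: size of one batch_mutate call and number of batches for the upload.
# Decimal units are an assumption; the post only gives approximate figures.
KB = 1000
MB = 1000 * KB
GB = 1000 * MB


def batch_bytes(records_per_batch, record_bytes):
    """Payload size of one batch, ignoring protocol overhead."""
    return records_per_batch * record_bytes


def batches_needed(total_bytes, per_batch_bytes):
    """Number of batches needed to send total_bytes, rounded up."""
    return -(-total_bytes // per_batch_bytes)


if __name__ == "__main__":
    per_batch = batch_bytes(10_000, 5 * KB)
    print(f"per batch: {per_batch / MB} MB")
    print(f"batches for 200 GB: {batches_needed(200 * GB, per_batch)}")
```

With these figures each request carries roughly 50 MB, which matches the number quoted in the message.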

Re: running TPC-C on cassandra clusters

2011-05-12 Thread Jonathan Ellis
I'd recommend trying the 0.7 svn branch (soon to be voted on as 0.7.6) On Thu, May 12, 2011 at 3:09 PM, Xiaowei Wang wrote: > Hi all, > > My partner and I currently using cassandra cluster to run TPC-C. We first > use 2 ec2 nodes to load 20 warehouses. One(client node)  has 8 cores,  the > other(

Re: Hinted Handoff

2011-05-12 Thread Sameer Farooqui
I'm not sure about your first question. I believe the internal system keyspace holds the hinted handoff information. In 0.6 and earlier, HintedHandoffManager.sendMessage used to read the entire row into memory and then send the row back to the client in a single message. As of 0.7, Cassandra page

running TPC-C on cassandra clusters

2011-05-12 Thread Xiaowei Wang
Hi all, My partner and I are currently using a Cassandra cluster to run TPC-C. We first use 2 EC2 nodes to load 20 warehouses. One (client node) has 8 cores, the other (worker node) has 4 cores. During the loading time, either the client node or the worker node will go "down" (cannot be detected) randomly a

Hinted Handoff

2011-05-12 Thread Anurag Gujral
Hi All, I have two questions: a) Is there a way to turn hinted handoff on and off per keyspace rather than for multiple keyspaces? b) It looks like Cassandra stores hinted handoff data in one row. Is that true? Does having one row for hinted handoff imply that if nodes are down for longer per

Re: Commitlog Disk Full

2011-05-12 Thread Peter Schuller
> I understand that cassandra periodically cleans up the commitlog directories > by generating sstables in datadir. Is there any way to speed up this > movement from commitog to datadir? commitlog_rotation_threshold_in_mb could cause problems if it was set very very high, but with the default of 1

Commitlog Disk Full

2011-05-12 Thread Sanjeev Kulkarni
Hey guys, I have an EC2 Debian cluster consisting of several nodes running 0.7.5 on ephemeral disks. These are fresh installs and not upgrades. The commitlog is set to the smaller of the disks, which is around 10G in size, and the datadir is set to the bigger disk. The config file is basically the sam

Re: network topology issue

2011-05-12 Thread Anurag Gujral
Thanks everyone for your responses. On Thu, May 12, 2011 at 1:18 AM, Sylvain Lebresne wrote: > On Thu, May 12, 2011 at 1:58 AM, Anurag Gujral > wrote: > > Hi All, > > I am testing network topology strategy in cassandra I am > using > > two nodes , one node each in different data cen

Re: Cassandra causing very high load on CPU's 0.6.3

2011-05-12 Thread Jonathan Ellis
https://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.6.13/CHANGES.txt On Thu, May 12, 2011 at 11:56 AM, Ali Ahsan wrote: > >> It is indeed advised to use sunjdk as openjdk is a bit behind as far >> as bug fixes are >> concerned. >> >> Moreover, 0.6.3 is pretty old now and we do have fixed

Monitoring bytes read per cf

2011-05-12 Thread Daniel Doubleday
Hi all, I've got a question for folks with some code insight again. To better understand where our IO load is coming from, we want to monitor the number of bytes read from disk per CF. (We love stats.) What I have done is wrap the FileDataInput in SSTableReader to sum the bytes read in
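
The wrapping approach described in this message can be illustrated in general terms. The sketch below is a Python stand-in and not Cassandra's Java code: `CountingReader` and the `counters` dictionary are hypothetical names, and an in-memory buffer takes the place of an SSTable file.

```python
# Sketch: a reader wrapper that sums the bytes read per column family.
# CountingReader is illustrative only; it is not FileDataInput or SSTableReader.
import io
from collections import defaultdict


class CountingReader:
    """Delegate read() to the wrapped file object and record the bytes returned."""

    def __init__(self, wrapped, cf_name, counters):
        self._wrapped = wrapped
        self._cf_name = cf_name
        self._counters = counters

    def read(self, size=-1):
        data = self._wrapped.read(size)
        # Count what was actually returned, which may be less than size.
        self._counters[self._cf_name] += len(data)
        return data


if __name__ == "__main__":
    counters = defaultdict(int)
    reader = CountingReader(io.BytesIO(b"x" * 1024), "Users", counters)
    reader.read(100)
    reader.read()
    print(dict(counters))
```

Keeping the counters in a shared mapping keyed by column family name lets several readers for the same column family add to one total.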

Re: Cassandra causing very high load on CPU's 0.6.3

2011-05-12 Thread Ali Ahsan
It is indeed advised to use the Sun JDK, as OpenJDK is a bit behind as far as bug fixes are concerned. Moreover, 0.6.3 is pretty old now and we have fixed a number of issues related to load spikes, so before investigating further the best advice I can give you is to upgrade (either to 0.6.13 if you

Re: Cassandra causing very high load on CPU's 0.6.3

2011-05-12 Thread Sylvain Lebresne
On Thu, May 12, 2011 at 6:04 PM, Ali Ahsan wrote: > On 05/12/2011 04:08 PM, Ali Ahsan wrote: >> >> Hi All >> >> I am experience some problem with me two Cassandra node with RF=2,Both >> node CPU's usage is very high,load average: 9.47, 5.72, 5.11 and this >> causing my application to time out .I h

Re: Cassandra causing very high load on CPU's 0.6.3

2011-05-12 Thread Ali Ahsan
On 05/12/2011 04:08 PM, Ali Ahsan wrote: Hi All, I am experiencing a problem with my two Cassandra nodes with RF=2. CPU usage on both nodes is very high (load average: 9.47, 5.72, 5.11) and this is causing my application to time out. I have a Xeon with 8 processors and 16 GB of RAM, and an LVM setup for Cass

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Gabriel Tataranu
>> What does the TPStats look like on the nodes under pressure ? And how many >> nodes are delivering hints to the nodes when they restart? $nodetool -h 127.0.0.1 tpstats Pool NameActive Pending Completed ReadStage 1 11992475 Requ

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Gabriel Tataranu
> I'm assuming the two nodes are the ones receiving the HH after they were > down. Adjacent, so yes. > > Are there a lot of hints collected while they are down ? you can check the > HintedHandOffManager MBean in JConsole There wasn't any downtime - that's something else that's weird. > >

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Gabriel Tataranu
Greetings, > Doesn't really look abnormal to me for a heavy write load situation > which is what "receiving hints" is. I would agree with you but this raises some questions about write performance. Plus I've only seen this kind of behavior recently and only on 2 adjacent nodes. So I have good rea

Re: CounterColumn increments gone after restart

2011-05-12 Thread Utku Can Topçu
see the ticket https://issues.apache.org/jira/browse/CASSANDRA-2642 please On Thu, May 12, 2011 at 3:28 PM, Utku Can Topçu wrote: > Hi guys, > > I have strange problem with 0.8.0-rc1. I'm not quite sure if this is the > way it should be but: > - I create a ColumnFamily named Counters > - do a fe

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Gabriel Tataranu
> An if you have 10 nodes, do all of them happen to send hints to the two > with GC? The 2 nodes are adjacent in token range. They are replicating to each other. Other nodes have no data to replicate so there's no proof one way or another. Best, Gabriel

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Gabriel Tataranu
Greetings, > Just out of curiosity is this on the receiver or sender side? Looks like sender side, although the 2 nodes were replicating to each other so it's hard to tell. > > I have been wondering a bit if the hint playback could need some > adjustment. > There is potentially quite big diffe

CounterColumn increments gone after restart

2011-05-12 Thread Utku Can Topçu
Hi guys, I have a strange problem with 0.8.0-rc1. I'm not quite sure if this is the way it should be but: - I create a ColumnFamily named Counters - do a few increments on a column. - kill cassandra - start cassandra When I look at the counter column, the value is 1. See the following pastebin ple

Cassandra causing very high load on CPU's 0.6.3

2011-05-12 Thread Ali Ahsan
Hi All, I am experiencing a problem with my two Cassandra nodes with RF=2. CPU usage on both nodes is very high (load average: 9.47, 5.72, 5.11) and this is causing my application to time out. I have a Xeon with 8 processors and 16 GB of RAM, and an LVM setup for Cassandra. How can I trace the main issue of l

Re: Knowing when there is a *real* need to add nodes

2011-05-12 Thread Watanabe Maki
It's an interesting topic for me too. How about adding measurements of static disk utilization (% used) and memory utilization (RSS, JVM heap, JVM GC)? maki From iPhone On 2011/05/12, at 0:49, Tomer B wrote: > Hi > > I'm trying to predict when my cluster would soon be needing new nodes > adde

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Terje Marthinussen
And if you have 10 nodes, do all of them happen to send hints to the two with GC? Terje On Thu, May 12, 2011 at 6:10 PM, Terje Marthinussen wrote: > Just out of curiosity is this on the receiver or sender side? > > I have been wondering a bit if the hint playback could need some > adjustment. >

Re: Excessive allocation during hinted handoff

2011-05-12 Thread Terje Marthinussen
Just out of curiosity, is this on the receiver or sender side? I have been wondering a bit if the hint playback could need some adjustment. There are potentially quite big differences in how much is sent per throttle delay time depending on what your data looks like. Early 0.7 releases also built u

Import/Export of Schema Migrations

2011-05-12 Thread David Boxenhorn
My use case is like this: I have a development cluster, a staging cluster and a production cluster. When I finish a set of migrations (i.e. changes) on the development cluster, I want to apply them to the staging cluster, and eventually the production cluster. I don't want to do it by hand, because

Re: network topology issue

2011-05-12 Thread Sylvain Lebresne
On Thu, May 12, 2011 at 1:58 AM, Anurag Gujral wrote: > Hi All, > I am testing network topology strategy in cassandra I am using > two nodes , one node each in different data center. > Since the nodes are in different dc I assigned token 0 to both the nodes. > I added both the nodes a

Knowing when there is a *real* need to add nodes

2011-05-12 Thread Tomer B
Hi, I'm trying to predict when my cluster will soon need new nodes added. I want a continuous graph of my cluster's health, so that when I see my cluster becoming more and more busy (I want numbers and measurements) I will know that I need to start purchasing more machines and get