cassandra crashed

2010-06-11 Thread hive13 Wong
One of our Cassandra nodes suddenly crashed, then the other 2... Exceptions found in the system.log are attached below. Any ideas? Does it mean that we've got some bad data running around in the cluster? Many thanks. The exception on the node that crashed first was like ERROR [RESPONSE-STAGE:669] 20

Best way of adding new nodes

2010-06-10 Thread hive13 Wong
Hi guys, About the 2 ways of adding new nodes: when we add with bootstrapping, since we've already got lots of data, it will often take many hours to complete the bootstrapping and probably affect the performance of existing nodes. But if we add without bootstrapping, the data load on the new node could be

Re: single node capacity

2010-06-10 Thread hive13 Wong
action is happening. > > My recommendation is to reduce the write traffic to the nodes so that each > node can > keep up with compaction. If reducing the load is not possible, you have to > add nodes > (or get faster hard disks, but that is often not possible). > > Martin > >

single node capacity

2010-06-10 Thread hive13 Wong
Hi, How much data load can a single typical Cassandra instance handle? It seems like we are getting into trouble when the load on one of our nodes grows beyond 200 GB. Both read latency and write latency are increasing, varying from 10 to several thousand milliseconds. Machine config is 16*cpu 32G

Re: Skipping corrupted rows when doing compaction

2010-06-01 Thread hive13 Wong
't need to reload data. > > It's also worth trying 0.6.2 and DiskAccessMode=standard, in case > you've found another similar bug. > > On Tue, Jun 1, 2010 at 7:37 AM, hive13 Wong wrote: > > Hi, > > Is there a way to skip corrupted rows when doing compaction?

Skipping corrupted rows when doing compaction

2010-06-01 Thread hive13 Wong
Hi, Is there a way to skip corrupted rows when doing compaction? We are currently deploying 2 nodes with replicationfactor=2 but one node reports lots of exceptions like java.io.UTFDataFormatException: malformed input around byte 72. My guess is that some of the data in the SSTable is corrupted b