unstable write performance

2014-03-26 Thread Jiaan Zeng
Hi, I am doing some performance benchmarks on a *single*-node Cassandra 1.2.4. BTW, the machine is dedicated to running one Cassandra instance. The workload is 100% writes. The throughput varies dramatically and sometimes even drops to 0. I have tried several things below and still got the same

Re: Cassandra input paging for Hadoop

2013-09-11 Thread Jiaan Zeng
Speaking of the thrift client, i.e. ColumnFamilyInputFormat: yes, ConfigHelper.setRangeBatchSize() can reduce the number of rows requested from Cassandra per call. Depending on how big your columns are, you may also want to increase the thrift message length through setThriftMaxMessageLengthInMb(). Hope that helps. On Tue,
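The two ConfigHelper calls named above can be wired into a Hadoop job roughly as follows. This is a minimal sketch, assuming a Cassandra 1.2-era Hadoop integration (`org.apache.cassandra.hadoop`) on the classpath; the batch size and message limit values are illustrative, not recommendations.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;

public class CassandraJobSetup {
    // Sketch: configure a MapReduce job that reads a column family
    // through the thrift-based ColumnFamilyInputFormat.
    public static void configure(Job job) {
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        Configuration conf = job.getConfiguration();
        // Fetch fewer rows per token-range request, so each thrift
        // response stays small (illustrative value).
        ConfigHelper.setRangeBatchSize(conf, 256);
        // Raise the thrift frame limit to accommodate wide rows or
        // large columns (illustrative value, in MB).
        ConfigHelper.setThriftMaxMessageLengthInMb(conf, 64);
    }
}
```

Lowering the range batch size trades more round trips for smaller responses, which is usually the right trade when individual columns are large.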

what happens if the coordinator node fails during a write

2013-06-25 Thread Jiaan Zeng
Hi there, I am writing data to Cassandra with a thrift client (not Hector) and wonder what happens if the coordinator node fails. The same question applies to the bulk loader, which uses the gossip protocol instead of the thrift protocol. In my understanding, the HintedHandoff only takes care of the replica node
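The question above hinges on a real gap: hinted handoff covers replicas that are down, not the coordinator itself, so a raw thrift client must detect a failed coordinator and retry against another node (a higher-level client like Hector does this for you). The sketch below is hypothetical — the host list and the write callback are illustrative, not part of the Cassandra thrift API — and only shows the failover shape; note the write should be idempotent, since a failed coordinator may already have forwarded the mutation to some replicas.

```java
import java.util.List;
import java.util.function.Consumer;

public class FailoverWriter {
    // Try the same write against each candidate coordinator in turn,
    // returning the host that accepted it. The Consumer stands in for
    // the real thrift call (e.g. open socket + batch_mutate).
    public static String writeWithFailover(List<String> hosts,
                                           Consumer<String> write) {
        RuntimeException last = null;
        for (String host : hosts) {
            try {
                write.accept(host);   // attempt the write on this coordinator
                return host;          // success: this coordinator took it
            } catch (RuntimeException e) {
                last = e;             // coordinator failed; try the next one
            }
        }
        throw last != null ? last : new RuntimeException("no hosts supplied");
    }
}
```

For the bulk loader the situation differs: it streams SSTable data directly to the replicas it discovers via gossip, so there is no single coordinator in the thrift sense.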

Re: how to handle join properly in this case

2013-05-29 Thread Jiaan Zeng
Thanks for all the comments and thoughts! I think Hiller points out a promising direction. I wonder whether the partition and filter are features shipped with Cassandra or features that come from PlayOrm. Any resources about that would be appreciated. Thanks! On Tue, May 28, 2013 at 11:39 AM, Hiller, Dean

how to handle join properly in this case

2013-05-25 Thread Jiaan Zeng
Hi Experts, We have tables (a.k.a. column families) A and B. Each row of a table is simply a key-value pair. Tables A and B are written by clients all the time. We need to transform the row keys of tables A and B according to a set of rules, join the two tables, and save the results to table C for
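Since Cassandra has no server-side join, the transform-then-join step described above is typically done in a periodic client-side or MapReduce job: read rows from A and B, normalize each row key with the rule set, hash-join on the normalized key, and write the matches to C. A minimal in-memory sketch, where the maps stand in for the rows read from A and B, and the lower-case/trim rule is only an example of a key transform:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class JoinJob {
    // Hash join: build a lookup of B under transformed keys, then probe
    // it with transformed keys of A. Matching pairs become rows of C.
    public static Map<String, String> join(Map<String, String> tableA,
                                           Map<String, String> tableB,
                                           Function<String, String> rule) {
        Map<String, String> bByKey = new HashMap<>();
        for (Map.Entry<String, String> e : tableB.entrySet()) {
            bByKey.put(rule.apply(e.getKey()), e.getValue());  // build side
        }
        Map<String, String> tableC = new HashMap<>();
        for (Map.Entry<String, String> e : tableA.entrySet()) {
            String key = rule.apply(e.getKey());               // probe side
            String bValue = bByKey.get(key);
            if (bValue != null) {
                // Combine the two values into the row written to C.
                tableC.put(key, e.getValue() + "|" + bValue);
            }
        }
        return tableC;
    }
}
```

Because A and B are written continuously, a job like this only sees a snapshot; rows arriving after the read begin will be picked up by the next run.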