Hi all,

My partner and I are currently using a Cassandra cluster to run TPC-C. We
first use 2 EC2 nodes to load 20 warehouses. One (the client node) has 8
cores; the other (the worker node) has 4 cores. During loading, either the
client node or the worker node randomly goes "down" (cannot be detected) and
then comes back "up" a short time later. If both nodes go down, the load
fails. If only one of them goes down, we can continue to load data.

The problem is that when we use multiple threads (we write multiprocess
code), say 4 client threads, some of them may stop at the point where one of
the nodes first goes down, and the dead threads never come back. This not
only increases our loading time, but also affects the amount of data we can
load.
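
As a stopgap on the client side, each worker could retry a failed write
instead of dying. Below is a minimal sketch in Python, assuming every write
is wrapped in a callable. The name run_with_retry and its parameters are
hypothetical and not part of any Cassandra driver, and this only works around
the symptom; it does not explain why the nodes go down.

```python
import time


def run_with_retry(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on failure, back off exponentially and try again.

    operation    -- zero-argument callable performing one write (assumption)
    max_attempts -- total number of tries before giving up
    base_delay   -- seconds to wait after the first failure; doubles each time
    sleep        -- injectable so the wait can be replaced in tests
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:  # in real code, catch only the driver's connectivity errors
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # Every attempt failed: surface the last error instead of hanging silently.
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

With the defaults, a worker waits 0.5 s, 1 s, 2 s and 4 s between its five
attempts, so it can survive a short outage and fails with an error rather
than stopping without a trace.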

So we need to figure out why the nodes keep going up and down, and how to
fix this problem.

Thanks for any help!

Best,
Xiaowei
