I'm about to extend my two-node cluster with four dedicated nodes and remove one of the old nodes, leaving a five-node cluster. The cluster is in production, but I can spare it for some stress testing in the meantime, as I'm also interested in my cluster's performance. I can't dedicate the cluster to the test, but the daytime load should be low enough not to skew the results too much. The results might come in within a few days, once we get the nodes up. Hopefully my tests will produce some meaningful data that can be applied to this issue.

I haven't used stress.py yet, any tips on that? Could you, David, send me the stress.py command line which you used?

- Juho Mäkinen

On Mon, Jul 19, 2010 at 10:51 PM, David Schoonover <david.schoono...@gmail.com> wrote:
> Sorry, mixed signals in my response. I was partially replying to suggestions
> that we were limited by the box's NIC or DC's bandwidth (which is gigabit, no
> dice there). I also ran the tests with -t50 on multiple tester machines in
> the cloud with no change in performance; I've now rerun those tests on
> dedicated hardware.
>
>
>          reads/sec @
> nodes    one client    two clients
> 1        53k           73k
> 2        37k           50k
> 4        37k           50k
>
>
> Notes:
> - All notes from the previous dataset apply here.
> - All clients were reading with 50 processes.
> - Test clients were not co-located with the databases or each other.
> - All machines are in the same DC.
> - Servers showed about 20MB/sec in network i/o for the multi-node clusters,
>   which is well under the max for gigabit.
> - Latency was about 2.5ms/req.
>
>
> At this point, we'd really appreciate it if anyone else could attempt to
> replicate our results. Ultimately, our goal is to see an increase in
> throughput given an increase in cluster size.
>
> --
> David Schoonover
>
> On Jul 19, 2010, at 2:25 PM, Stu Hood wrote:
>
>> If you put 25 processes on each of the 2 machines, all you are testing is
>> how fast 50 processes can hit Cassandra... the point of using more machines
>> is that you can use more processes.
>>
>> Presumably, for a single machine, there is some limit (K) to the number of
>> processes that will give you additional gains: above that point, you should
>> use more machines, each running K processes.
>>
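
For reference, here is a rough sketch of the arithmetic behind David's table, in case it helps when comparing new results against his. The figures are copied from the quoted table (in thousands of reads/sec); the function name `scaling_report` and the dictionary layout are my own, purely for illustration, and are not part of stress.py.

```python
# Reads/sec in thousands, copied from the quoted table:
# cluster size -> (one client, two clients)
READS_K = {1: (53, 73), 2: (37, 50), 4: (37, 50)}


def scaling_report(reads_k):
    """For each cluster size, compute throughput relative to the
    1-node baseline and throughput per node (thousands of reads/sec)."""
    base = reads_k[1]
    report = {}
    for nodes, rates in sorted(reads_k.items()):
        report[nodes] = {
            # 1.0 means "same as a single node"; linear scaling would
            # give a value close to the node count.
            "relative": tuple(round(r / b, 2) for r, b in zip(rates, base)),
            # Total throughput divided evenly across the nodes.
            "per_node_k": tuple(round(r / nodes, 2) for r in rates),
        }
    return report


if __name__ == "__main__":
    for nodes, row in scaling_report(READS_K).items():
        print(nodes, row["relative"], row["per_node_k"])
```

With these numbers the 2-node and 4-node clusters come out at roughly 0.7 of the single-node throughput rather than 2x or 4x, which is the gap the thread is trying to explain.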