Hi Branislav,
what is it you would expect?
Some thoughts:
Batches are often misunderstood. They work well only if they contain
a single partition key; think of a batch of different sensor readings
for one key. If you build batches that span many partition keys and/or
are large, this puts a high load on the coordinator node, which then
itself has to talk to the nodes holding the partitions. This could
explain the scaling you see in your second try without batches. Keep in
mind that the driver supports executeAsync and ResultSetFutures.
Second, put the commitlog and data directories on separate disks when
using spindles.
Third, have you monitored I/O and CPU statistics (e.g. with iostat)
while running your tests?
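To illustrate the first point, here is a minimal, driver-independent
sketch of grouping pending writes so that each batch touches only one
partition key. The class PartitionBatcher, the Write type and the
maxBatchSize parameter are hypothetical names made up for this example;
they are not taken from your tool or from the driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionBatcher {

    /** Hypothetical pending write: a partition key plus an opaque payload. */
    public static final class Write {
        public final String partitionKey;
        public final String payload;

        public Write(String partitionKey, String payload) {
            this.partitionKey = partitionKey;
            this.payload = payload;
        }
    }

    /**
     * Splits the writes into batches so that every batch contains exactly
     * one partition key and at most maxBatchSize writes.
     */
    public static List<List<Write>> groupByPartition(List<Write> writes, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        // LinkedHashMap keeps the keys in first-seen order.
        Map<String, List<Write>> byKey = new LinkedHashMap<>();
        for (Write w : writes) {
            byKey.computeIfAbsent(w.partitionKey, k -> new ArrayList<>()).add(w);
        }
        List<List<Write>> batches = new ArrayList<>();
        for (List<Write> group : byKey.values()) {
            // Cut each per-key group into chunks to keep batches small.
            for (int i = 0; i < group.size(); i += maxBatchSize) {
                int end = Math.min(i + maxBatchSize, group.size());
                batches.add(new ArrayList<>(group.subList(i, end)));
            }
        }
        return batches;
    }
}
```

Each returned list could then become one batch statement and be sent
with executeAsync, so that several single-partition batches are in
flight at once instead of one large multi-partition batch.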
Cheers,
Jan
On 08.02.2017 at 16:39, Branislav Janosik -T (bjanosik - AAP3 INC
at Cisco) wrote:
Hi all,
I have a cluster of three nodes and would like to ask some questions
about the performance.
I wrote a small benchmarking tool in Java that mirrors the (read, write)
operations that we do in the real project.
The problem is that it is not scaling like it should. The program runs
two tests: one using a batch statement and one without using a batch.
The operation sequence is: optional select, insert, update, insert. I
run the tool on my server with 128 threads (the number of threads has no
influence on the performance),
usually creating 100K resources for testing purposes.
The average results (operations per second) with the use of batch
statement are:

Replication Factor = 1    with reading    without reading
1-node cluster            37K             46K
2-node cluster            37K             47K
3-node cluster            39K             70K

Replication Factor = 2    with reading    without reading
2-node cluster            21K             40K
3-node cluster            30K             48K

The average results (operations per second) without the use of batch
statement are:

Replication Factor = 1    with reading    without reading
1-node cluster            31K             20K
2-node cluster            38K             39K
3-node cluster            45K             87K

Replication Factor = 2    with reading    without reading
2-node cluster            19K             22K
3-node cluster            26K             36K

The Cassandra VM specs are: 16 CPUs, RAM of 16GB (one VM) and 32GB
(two VMs), at least 30GB of disk space for each node. Non-SSD, each VM
is on a separate physical server.
The code is available here:
https://github.com/bjanosik/CassandraBenchTool.git . It can be built
with Maven, and then you can run the jar in the target directory with
java -jar target/cassandra-test-1.0-SNAPSHOT-jar-with-dependencies.jar .
Thank you for any help.
--
Jan Kesten, mailto:j.kes...@enercast.de