Apache Cassandra performance tuning - call for contribution

2022-02-09 Thread Daniel Seybold

Dear Apache Cassandra community,

we plan to run a large-scale performance study for Apache Cassandra and 
MongoDB where the focus is not to compare both systems directly but to 
answer the question: /how much performance can you get out of each DBMS 
with an optimal configuration compared to the vanilla installation?/


In this study, we use three different configurations of the well-known 
Yahoo Cloud Serving Benchmark (YCSB) to emulate three types of workloads 
(write-heavy, read-heavy, mixed).
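
As an illustration only, the three workload types could be expressed as
YCSB CoreWorkload property sets along the following lines. The property
names (readproportion, updateproportion, insertproportion, recordcount,
operationcount) are standard YCSB settings; the proportions, the record
and operation counts, and the helper name render_properties are
assumptions made for this sketch, not the values used in the study.

```python
# Hypothetical operation mixes; the study's real proportions are not stated.
WORKLOAD_MIXES = {
    "write-heavy": {"readproportion": 0.05, "updateproportion": 0.0, "insertproportion": 0.95},
    "read-heavy": {"readproportion": 0.95, "updateproportion": 0.05, "insertproportion": 0.0},
    "mixed": {"readproportion": 0.5, "updateproportion": 0.5, "insertproportion": 0.0},
}


def render_properties(name, recordcount=1_000_000, operationcount=1_000_000):
    """Render one workload mix as the text of a YCSB properties file."""
    mix = WORKLOAD_MIXES[name]
    # The operation proportions of a workload are expected to sum to 1.
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError(f"proportions of {name!r} do not sum to 1")
    lines = [f"recordcount={recordcount}", f"operationcount={operationcount}"]
    lines += [f"{key}={value}" for key, value in sorted(mix.items())]
    return "\n".join(lines)


if __name__ == "__main__":
    for workload_name in WORKLOAD_MIXES:
        print(f"# {workload_name}")
        print(render_properties(workload_name))
        print()
```

Each rendered text could be saved to a file and passed to YCSB with its
-P option.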


With these workloads, we stress the DBMSs hosted on AWS EC2.

In order to get the optimal answer, we need your support as Apache 
Cassandra experts to find the optimal OS and DBMS configuration for the 
outlined workloads (either one general configuration or 
workload-specific configurations).


We will carry out the benchmarks with our Benchmarking-as-a-Service 
(BaaS) platform and include your configurations in the benchmarking 
process.


And of course, we will release all data as open data sets to the 
community and publish the study on our website and distribute it through 
our marketing channels.
Moreover, we will reference you in this study and give you the 
opportunity to introduce yourself and your company as well as comment on 
the results with your experience and assessment.


If you are interested in contributing, feel free to reach out to me.

Cheers,
Daniel




Re: [EXTERNAL] Availability issues for write/update/read workloads (up to 100s downtime) in case of a Cassandra node failure

2018-11-23 Thread Daniel Seybold

Hi Alexander,

thanks a lot for the pointers, I checked the mentioned issue.

While the reported issue seems to match our problem, it only occurs for 
reads and not for writes (according to the Datastax Jira). But we 
experience downtimes for both writes and reads.



Which version of the Datastax Driver are you using for your tests?

We use version 3.0.0

I have also tried version 3.2.0 to avoid the JAVA-1346 issue you 
mentioned, but the behaviour with respect to the downtime is the same.



How is it configured (load balancing policies, etc...) ?

Besides the write consistency of ONE it uses the default settings.

As we use YCSB as the workload for our experiments, you can have a look 
at the driver settings in the basic class: 
https://github.com/brianfrankcooper/YCSB/blob/master/cassandra/src/main/java/com/yahoo/ycsb/db/CassandraCQLClient.java 




Do you have some debug logs on the client side that could help?

On the client side, the logs show no exceptions or suspicious messages.

I also turned on tracing but didn't find any suspicious messages 
(though I did not spend much time on that, and I am no expert on the 
Cassandra driver).


If more detailed logs or the traces would help to further investigate 
the issue let me know and I will rerun the experiments to create the 
logs and traces.


Many thanks again for your help.

Cheers,

Daniel


On 16.11.2018 at 15:08, Alexander Dejanovski wrote:

Hi Daniel,

it seems like the driver isn't detecting that the node went down, 
which is probably due to the way the node is being killed.
If I remember correctly, in some cases the Netty transport is still up in 
the client, which still allows queries to be sent without them ever 
being answered: https://datastax-oss.atlassian.net/browse/JAVA-1346

Eventually, the node gets discarded when the heartbeat system catches up.
It's also possible that the stuck queries then eat up all the 
available slots in the driver, preventing any other query from being 
sent in that JVM.
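
The slot-exhaustion effect described above can be sketched with a toy
model. This is not the driver's real behaviour or API: the fixed slot
count, the round-robin routing, the tick-based timing and the function
name simulate_throughput are all assumptions chosen for illustration.
Requests sent to the dead node hold a slot until the failure is
detected, so throughput decays to zero and recovers only after
detection.

```python
def simulate_throughput(nodes=5, slots=4, sends_per_tick=4,
                        fail_at=5, detect_at=20, ticks=25, dead_node=0):
    """Toy model: completed requests per tick for one client.

    Requests routed to the dead node never complete and keep their slot
    until the (assumed) heartbeat detection at `detect_at` releases them.
    """
    stuck = 0    # slots held by requests that will never be answered
    cursor = 0   # round-robin position over the nodes
    throughput = []
    for tick in range(ticks):
        node_is_dead = tick >= fail_at
        driver_knows = tick >= detect_at
        if tick == detect_at:
            stuck = 0  # hung requests are failed and their slots released
        completed = 0
        for _ in range(min(sends_per_tick, slots - stuck)):
            target = cursor % nodes
            cursor += 1
            if driver_knows and target == dead_node:
                # Once the node is marked down, route to the next node.
                target = cursor % nodes
                cursor += 1
            if node_is_dead and target == dead_node:
                stuck += 1
            else:
                completed += 1
        throughput.append(completed)
    return throughput


if __name__ == "__main__":
    for tick, ops in enumerate(simulate_throughput()):
        print(tick, ops)
```

In this model the length of the zero-throughput window depends mainly on
how long detection takes, which would be consistent with a downtime
that varies between runs.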


Which version of the Datastax Driver are you using for your tests?
How is it configured (load balancing policies, etc...) ?
Do you have some debug logs on the client side that could help?

Thanks,



Re: [EXTERNAL] Availability issues for write/update/read workloads (up to 100s downtime) in case of a Cassandra node failure

2018-11-16 Thread Daniel Seybold

Hi Sean,

thanks for your comments, find below some more details with respect to 
the (1) VM sizing and (2) the replication factor:


(1) VM sizing:

We selected the small VMs as the initial setup to run our experiments. We 
have also executed the same experiments (5 nodes) on larger VMs with 6 
cores and 12GB memory (where 6GB was allocated to Cassandra).


We use the default CMS garbage collector (with default settings) and the 
debug.log and system.log do not show any suspicious GC messages.


(2) Replication factor

We set the RF to 5 as we want to emulate a scenario which is able to 
survive multiple node failures. We have also tried an RF of 3 (in the 
5-node cluster) but the downtime in case of a node failure persists.
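
For reference, the number of replica failures a request can tolerate
follows from the replication factor and the consistency level. The
sketch below covers only the ONE, QUORUM and ALL levels; the helper
names are made up for this example.

```python
def required_acks(consistency, rf):
    """Replicas that must respond for a request at this consistency level."""
    if consistency == "ONE":
        return 1
    if consistency == "QUORUM":
        return rf // 2 + 1
    if consistency == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {consistency}")


def tolerated_replica_failures(consistency, rf):
    """Replicas of a partition that may be down while requests still succeed."""
    return rf - required_acks(consistency, rf)


if __name__ == "__main__":
    for rf in (3, 5):
        for level in ("ONE", "QUORUM", "ALL"):
            print(f"RF={rf} CL={level}: tolerates "
                  f"{tolerated_replica_failures(level, rf)} replica failure(s)")
```

At consistency level ONE, a single node failure leaves enough replicas
for every partition with either RF, so the replication factor alone
would not be expected to explain a complete stall.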



I also attached two plots which show the results with the downtimes for 
using the larger VMs and setting the RF to 3


Any further comments much appreciated,

Cheers,
Daniel


On 09.11.2018 at 19:04, Durity, Sean R wrote:


The VMs’ memory (4 GB) seems pretty small for Cassandra. What heap 
size are you using? Which garbage collector? Are you seeing long GC 
times on the nodes? The basic rule of thumb is to give the Cassandra 
heap 50% of the RAM on the host. 2 GB isn’t very much.
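
Applied to the VM sizes mentioned in this thread, the 50% rule of thumb
works out as follows. This is only the arithmetic of the rule as stated
here, not Cassandra's own default heap calculation, and the function
name is made up for the example.

```python
def rule_of_thumb_heap_gb(ram_gb, fraction=0.5):
    """Heap size suggested by the '50% of host RAM' rule of thumb."""
    return ram_gb * fraction


if __name__ == "__main__":
    for ram_gb in (4, 12):  # the small and the larger VMs from the thread
        print(f"{ram_gb} GB RAM -> {rule_of_thumb_heap_gb(ram_gb)} GB heap")
```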


Also, I wouldn’t set the replication factor to 5 (the number of 
nodes). If RF is always equal to the number of nodes, you can’t really 
scale beyond the size of the disk on any one node (all data is on each 
node). A replication factor of 3 would be more like a typical 
production set-up.


Sean Durity

*From:*Daniel Seybold 
*Sent:* Friday, November 09, 2018 5:49 AM
*To:* user@cassandra.apache.org
*Subject:* [EXTERNAL] Availability issues for write/update/read 
workloads (up to 100s downtime) in case of a Cassandra node failure


Hi Apache Cassandra experts,

we are running a set of availability evaluations under 
write/read/update workloads with Apache Cassandra and are seeing some 
unexpected results, i.e. 0 ops/s over a period of up to 100s.


In order to provide a clear picture find below the details of (1) the 
setup and (2) the evaluation workflow


*1. Setup:*

Cassandra version: 3.11.2
Cluster size: 5 nodes
Replication Factor: 5
Each node runs in the same private OpenStack-based cloud, within the 
same availability zone, and uses the private network.
Each node runs Ubuntu 16.04 Server as its OS and has 2 cores, 4GB RAM 
and 50GB disk.


Workload:
Yahoo Cloud Serving Benchmark 0.12
W1: 100% write
W2: 100% read
W3: 100% update

*2. Evaluation Workflow: *

1. allocate 5 VMs & deploy DBMS cluster
2. start a YCSB workload (only one of W1-3) which runs up to 30 minutes
3. wait for 200s
4. trigger the selection of a random node in the cluster and delete 
the VM without stopping Cassandra first
5. analyze throughput time series over the evaluation

*3. (Unexpected) Results*

We expected to see a (slight) drop in the throughput as soon as the 
VM was deleted.
But the throughput results show that there are periods of ~10s - 
150s (not deterministic) where no operations are executed (all metrics 
are collected on the client side).
Yet, there are no timeout exceptions on the client side, and the logs 
on the cluster side do not show anything that explains this behaviour.
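
One way to extract such zero-throughput periods from a client-side time
series is sketched below. It assumes the throughput has already been
reduced to one ops/s value per second; the function name and the sample
series are invented for illustration and are not measured results.

```python
def find_downtime_windows(ops_per_second, min_length=1):
    """Return (start_second, duration_seconds) for each run of 0 ops/s."""
    windows = []
    start = None  # first second of the zero-throughput run being tracked
    for second, ops in enumerate(ops_per_second):
        if ops == 0:
            if start is None:
                start = second
        elif start is not None:
            if second - start >= min_length:
                windows.append((start, second - start))
            start = None
    # Close a run that lasts until the end of the series.
    if start is not None and len(ops_per_second) - start >= min_length:
        windows.append((start, len(ops_per_second) - start))
    return windows


if __name__ == "__main__":
    # Synthetic series: steady load, a 40s gap after second 200, then recovery.
    series = [1200] * 200 + [0] * 40 + [1100] * 60
    print(find_downtime_windows(series, min_length=5))
```

The min_length parameter filters out single-second dips so that only
sustained gaps are reported.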


I attached a series of plots which show the throughput and the 
downtimes over the evaluation runs.


Do you have any explanations for this behaviour or recommendations on 
how to reduce the potential "downtime"?


Thanks in advance for any help and recommendations,

Cheers,
Daniel



--
M.Sc. Daniel Seybold
Universität Ulm
Institut Organisation und Management
von Informationssystemen (OMI)
Albert-Einstein-Allee 43
89081 Ulm
Phone: +49 (0)731 50-28 799








cassandra_failures_v2.p