Thanks Bowen,
* "How do you split?"
challenging to answer briefly, but let me try: the physical host has
cores with indices 0 - 11 (6 physical cores plus their 6
hyper-threaded siblings, paired as 0,6 then 1,7 then 2,8 and so on)
What we do is pass --cpu host-passthrough
--cpuset={{virtinst_cpu_set}} --vcpus=6 to the virt-install command,
where {{virtinst_cpu_set}} is
- 0,6,1,7,2,8 - for the Cassandra VM
- 3,9,4,10,5,11 - for the other VM
(we split the physical host into 2 VMs - see the sketch below)
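For illustration, a minimal sketch of the two invocations (the VM
names, memory size and the omitted disk/network arguments are
hypothetical placeholders, not our real values):

    # Cassandra VM: pinned to physical cores 0-2 plus HT siblings 6-8
    virt-install --name cassandra-vm --memory 32768 \
        --cpu host-passthrough --cpuset=0,6,1,7,2,8 --vcpus=6 ...

    # other VM: pinned to physical cores 3-5 plus HT siblings 9-11
    virt-install --name other-vm --memory 32768 \
        --cpu host-passthrough --cpuset=3,9,4,10,5,11 --vcpus=6 ...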
* "do you expose physical disks to the VM or use disk image files"
no images, physical host has 2 spinning disks and 1 SSD drive
CassandraVM gets assigned explicitly 1 of the spinning disks and she
also gets assigned a partition of the SSD (which is used for commit
logs only so that is separated from the data)
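In virt-install terms the disk assignment could look roughly like
this (a hedged sketch; the device names are hypothetical):

    # whole spinning disk passed through for the Cassandra data
    --disk path=/dev/sdb,bus=virtio
    # SSD partition dedicated to the commit log, separate from the data
    --disk path=/dev/sdc1,bus=virtio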
* "A 50-70% utilization of a 1 Gbps network interface on average
doesn't sound good at all."
Yes, this is weird... Especially because e.g. if we bring down a
node, the other 2 nodes (we go with RF=2) are producing ~600Mb hints
files / minute
And assuming hint files is basicall the saved "network traffic"
until node is down this would still just give 10Mb/sec ...
OK, these are just the replicated updates, there is also read and of
course App layer is also reading but even with that in mind it does
not add up... So we will try to do further analysis here
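One option for that analysis is to sample the NIC throughput directly
(a sketch; it assumes the sysstat package is installed and eth0 is a
placeholder for the actual interface name):

    # per-interface throughput, 1-second samples for 60 seconds
    sar -n DEV 1 60

The rxkB/s and txkB/s columns for the interface show what the VM
actually moves, which can be compared against the ~10Mb/sec the hint
files imply.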
Thanks also for the article regarding the Counter tables!
Actually we have known for a while that there are "interesting"
things going on around the Counter tables; it is surprising how
difficult it is to find info on this topic...
I personally have tried to look around several times and always end
up finding the same information in posts over and over...
Moving away from counters would not be bad, especially because of the
difficulties around DELETEing them (we feel those too), however I do
not see any obvious migration strategy here...
But maybe let me ask that in a separate question. Might make more
sense... :-)
Thanks again - and thanks to others as well
It looks like mastering "nodetool tpstats" and the Cassandra thread
pools would be worth some time... :-)
Attila Wind
http://www.linkedin.com/in/attilaw
Mobile: +49 176 43556932
On 06.03.2021 13:03, Bowen Song wrote:
Hi Attila,
Addressing your data modelling issue is definitely important, and this
alone may be enough to solve all the issues you have with Cassandra.
* "Since these are VMs, is there any chance they are competing for
resources on the same physical host?"
We are splitting the physical hardware into 2 VMs - and the
resources (CPU cores, disks, RAM) are all assigned to the VMs in a
dedicated fashion, without intersection
How do you split? Having the number of cores across all VMs sum to
the total number of physical CPU cores is not enough, because context
switches and possible thread contention will waste CPU cycles. Since
you have also said 8-12% of CPU time is spent in sys mode, I think it
warrants an investigation.
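A quick way to check on the VM (a sketch; it assumes the sysstat
package is installed and <cassandra_pid> is a placeholder for the
Cassandra process ID):

    # per-core usr/sys/iowait breakdown, three 5-second samples
    mpstat -P ALL 5 3
    # voluntary/involuntary context switches of the Cassandra process
    pidstat -w -p <cassandra_pid> 5 3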
Also, do you expose physical disks to the VM or use disk image files?
Disk image files can be slow, especially for high IOPS random reads.
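If you want to quantify this, a small random-read benchmark inside
the VM could look like the following (a sketch using fio; the test
file path and sizes are placeholders):

    # 4k random reads with direct I/O, bypassing the page cache
    fio --name=randread --filename=/var/lib/cassandra/fio.test \
        --rw=randread --bs=4k --size=4g --direct=1 \
        --ioengine=libaio --iodepth=32 --runtime=60 --time_based

Running the same benchmark on the bare metal host makes for a direct
comparison.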
Personally, I won't recommend running a database on a VM other than
for dev/testing/etc. purposes. If possible, you should try to add a
node running on a bare metal server of a similar spec to the VMs, and
see if there's any noticeable performance difference between the bare
metal node and the VM nodes.
* The bandwidth limit is 1Gbit/sec (so ~120MB/sec) BUT it is the
   limit of the physical host - so our 2 VMs are competing here.
   Possibly the Cassandra VM has ~50-70% of it...
A 50-70% utilization of a 1 Gbps network interface on average doesn't
sound good at all. That's over 60 MB/s of network traffic,
constantly. Can you investigate why this is happening? Do you really
read/write that much? Or is it something else?
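If you want to see where the traffic is going, something like this
might help (a sketch; iftop must be installed, eth0 is a placeholder
interface name, and 7000 is the default Cassandra inter-node port):

    # per-connection bandwidth, filtered to inter-node traffic
    iftop -i eth0 -f "port 7000"

Comparing the filtered and unfiltered views should tell you whether
the volume comes from inter-node replication or from client traffic.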
* "nodetool tpstats"
whooa, I never used it; we definitely need some learning here to
even understand the output... :-) But I'll copy it here at the
bottom... maybe it clearly shows something to someone who can read it...
I noticed that you are using counters in Cassandra. I have to say that
I haven't had a good experience with Cassandra counters. An article
<https://ably.com/blog/cassandra-counter-columns-nice-in-theory-hazardous-in-practice>
which I read recently may convince you to get rid of them. I also
don't think counters are something the Cassandra developers are
focused on,
because things like CASSANDRA-6506
<https://issues.apache.org/jira/browse/CASSANDRA-6506> have been
sitting there for many years.
Use your database software for its strengths, not its weaknesses.
You have Cassandra, but you don't have to use every feature in it.
Sometimes another technology may be more suitable for something that
Cassandra can do but isn't very good at.
Cheers,
Bowen
On 05/03/2021 18:37, Attila Wind wrote:
Thanks for the answers @Sean and @Bowen !!!
First of all, this article describes a very similar thing to what we
are experiencing - let me share it:
https://www.senticore.com/overcoming-cassandra-write-performance-problems/
we are studying it now
Furthermore
* yes, we have some level of unbalanced data which needs to be
   improved - this is on our backlog, so it should get done
* and yes, we can clearly see that this unbalanced data is slowing
   down everything in Cassandra (there is proof of it in our
   Prometheus+Grafana based monitoring)
* we will definitely do this optimization now (luckily we already
   have a plan)
@Sean:
* "Since these are VMs, is there any chance they are competing for
resources on the same physical host?"
We are splitting the physical hardware into 2 VMs - and the
resources (CPU cores, disks, RAM) are all assigned to the VMs in a
dedicated fashion, without intersection
BUT!!
You are right... There is one thing we are sharing: network
bandwidth... and that one certainly does not show up in the
"iowait" part. We will definitely analyze further in this
direction, because from the monitoring, as far as I can see -
yeppp, we might be hitting the wall here
* consistency level: we are using LOCAL_ONE
* "Does the app use prepared statements that are only prepared once
per app invocation?"
Yes and yes :-)
* "Any LWT/”if exists” in your code?"
No. We go with RF=2, so we can't really use it anyway (LWT goes
through QUORUM, and with RF=2 a quorum is floor(2/2)+1 = 2
replicas, i.e. both of them, so we could not tolerate losing a
node... not good... so no)
@Bowen:
* The bandwidth limit is 1Gbit/sec (so ~120MB/sec) BUT it is the
   limit of the physical host - so our 2 VMs are competing here.
   Possibly the Cassandra VM has ~50-70% of it...
* The CPU's "system" value shows 8-12%
* "nodetool tpstats"
whooa, I never used it; we definitely need some learning here to
even understand the output... :-) But I'll copy it here at the
bottom... maybe it clearly shows something to someone who can read
it...
so, "nodetool tpstats" from one of the nodes
Pool Name                      Active  Pending  Completed  Blocked  All time blocked
ReadStage                           0        0    6248406        0                 0
CompactionExecutor                  0        0     168525        0                 0
MutationStage                       0        0   25116817        0                 0
MemtableReclaimMemory               0        0      17636        0                 0
PendingRangeCalculator              0        0          7        0                 0
GossipStage                         0        0     324388        0                 0
SecondaryIndexManagement            0        0          0        0                 0
HintsDispatcher                     1        0         75        0                 0
Repair-Task                         0        0          1        0                 0
RequestResponseStage                0        0   31186150        0                 0
Native-Transport-Requests           0        0   22827219        0                 0
CounterMutationStage                0        0   12560992        0                 0
MemtablePostFlush                   0        0      19259        0                 0
PerDiskMemtableFlushWriter_0        0        0      17636        0                 0
ValidationExecutor                  0        0         48        0                 0
Sampler                             0        0          0        0                 0
ViewBuildExecutor                   0        0          0        0                 0
MemtableFlushWriter                 0        0      17636        0                 0
InternalResponseStage               0        0      44658        0                 0
AntiEntropyStage                    0        0        161        0                 0
CacheCleanupExecutor                0        0          0        0                 0
Message type            Dropped   Latency waiting in queue (micros)
                                       50%       95%        99%        Max
READ_RSP                     18    1629.72   8409.01  155469.30  386857.37
RANGE_REQ                     0       0.00      0.00       0.00       0.00
PING_REQ                      0       0.00      0.00       0.00       0.00
_SAMPLE                       0       0.00      0.00       0.00       0.00
VALIDATION_RSP                0       0.00      0.00       0.00       0.00
SCHEMA_PULL_RSP               0       0.00      0.00       0.00       0.00
SYNC_RSP                      0       0.00      0.00       0.00       0.00
SCHEMA_VERSION_REQ            0       0.00      0.00       0.00       0.00
HINT_RSP                      0     943.13   3379.39    5839.59   52066.35
BATCH_REMOVE_RSP              0       0.00      0.00       0.00       0.00
PAXOS_COMMIT_REQ              0       0.00      0.00       0.00       0.00
SNAPSHOT_RSP                  0       0.00      0.00       0.00       0.00
COUNTER_MUTATION_REQ         94    1358.10   5839.59   14530.76  464228.84
GOSSIP_DIGEST_SYN             0    1358.10   5839.59   25109.16   25109.16
PAXOS_PREPARE_REQ             0       0.00      0.00       0.00       0.00
PREPARE_MSG                   0       0.00      0.00       0.00       0.00
PAXOS_COMMIT_RSP              0       0.00      0.00       0.00       0.00
HINT_REQ                      0       0.00      0.00       0.00       0.00
BATCH_REMOVE_REQ              0       0.00      0.00       0.00       0.00
STATUS_RSP                    0       0.00      0.00       0.00       0.00
READ_REPAIR_RSP               0       0.00      0.00       0.00       0.00
GOSSIP_DIGEST_ACK2            0    1131.75   5839.59    7007.51    7007.51
CLEANUP_MSG                   0       0.00      0.00       0.00       0.00
REQUEST_RSP                   0       0.00      0.00       0.00       0.00
TRUNCATE_RSP                  0       0.00      0.00       0.00       0.00
REPLICATION_DONE_RSP          0       0.00      0.00       0.00       0.00
SNAPSHOT_REQ                  0       0.00      0.00       0.00       0.00
ECHO_REQ                      0       0.00      0.00       0.00       0.00
PREPARE_CONSISTENT_REQ        0       0.00      0.00       0.00       0.00
FAILURE_RSP                   9       0.00      0.00       0.00       0.00
BATCH_STORE_RSP               0       0.00      0.00       0.00       0.00
SCHEMA_PUSH_RSP               0       0.00      0.00       0.00       0.00
MUTATION_RSP                 17    1131.75   4866.32    8409.01  464228.84
FINALIZE_PROPOSE_MSG          0       0.00      0.00       0.00       0.00
ECHO_RSP                      0       0.00      0.00       0.00       0.00
INTERNAL_RSP                  0       0.00      0.00       0.00       0.00
FAILED_SESSION_MSG            0       0.00      0.00       0.00       0.00
_TRACE                        0       0.00      0.00       0.00       0.00
SCHEMA_VERSION_RSP            0       0.00      0.00       0.00       0.00
FINALIZE_COMMIT_MSG           0       0.00      0.00       0.00       0.00
SNAPSHOT_MSG                  0       0.00      0.00       0.00       0.00
PREPARE_CONSISTENT_RSP        0       0.00      0.00       0.00       0.00
PAXOS_PROPOSE_REQ             0       0.00      0.00       0.00       0.00
PAXOS_PREPARE_RSP             0       0.00      0.00       0.00       0.00
MUTATION_REQ                265    1358.10   5839.59  223875.79  802187.44
READ_REQ                     45    1629.72   5839.59   36157.19  386857.37
PING_RSP                      0       0.00      0.00       0.00       0.00
RANGE_RSP                     0       0.00      0.00       0.00       0.00
VALIDATION_REQ                0       0.00      0.00       0.00       0.00
SYNC_REQ                      0       0.00      0.00       0.00       0.00
_TEST_1                       0       0.00      0.00       0.00       0.00
GOSSIP_SHUTDOWN               0       0.00      0.00       0.00       0.00
TRUNCATE_REQ                  0       0.00      0.00       0.00       0.00
_TEST_2                       0       0.00      0.00       0.00       0.00
GOSSIP_DIGEST_ACK             0    1629.72   5839.59   43388.63   43388.63
SCHEMA_PUSH_REQ               0       0.00      0.00       0.00       0.00
FINALIZE_PROMISE_MSG          0       0.00      0.00       0.00       0.00
BATCH_STORE_REQ               0       0.00      0.00       0.00       0.00
COUNTER_MUTATION_RSP         96    1358.10   4866.32    8409.01  464228.84
REPAIR_RSP                    0       0.00      0.00       0.00       0.00
STATUS_REQ                    0       0.00      0.00       0.00       0.00
SCHEMA_PULL_REQ               0       0.00      0.00       0.00       0.00
READ_REPAIR_REQ               0       0.00      0.00       0.00       0.00
ASYMMETRIC_SYNC_REQ           0       0.00      0.00       0.00       0.00
REPLICATION_DONE_REQ          0       0.00      0.00       0.00       0.00
PAXOS_PROPOSE_RSP             0       0.00      0.00       0.00       0.00
Attila Wind
http://www.linkedin.com/in/attilaw
Mobile: +49 176 43556932
On 05.03.2021 17:45, Bowen Song wrote:
Based on my personal experience, the combination of slow read
queries and low CPU usage is often an indicator of a bad table schema
design (e.g. large partitions) or a bad query (e.g. one without a
partition key). Check the Cassandra logs first: is there any long
stop-the-world GC? Tombstone warnings? Anything else that's out of
the ordinary? Check the output of "nodetool tpstats": are there any
pending or blocked tasks? Which thread pool(s) are they in? Is there
a high number of dropped messages? If you can't find anything useful
in the Cassandra server logs and "nodetool tpstats", try to get a
few slow queries from your application's log, and run them manually
in cqlsh. Are the results very large? How long do they take?
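For example, cqlsh can trace a single query so you can see where the
time goes (a sketch; the keyspace, table and predicate are
hypothetical):

    cqlsh> TRACING ON;
    cqlsh> SELECT * FROM my_keyspace.my_table WHERE partition_key = '...';

cqlsh will then print a per-step trace of the query, including the
elapsed time of each step in microseconds.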
Regarding some of your observations:
> CPU load is around 20-25% - so we have lots of spare capacity
Is it a few threads each using nearly 100% of a CPU core? If so,
what are those threads? (I find the ttop command from the sjk tool
<https://github.com/aragozin/jvm-tools> very helpful)
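For instance (a sketch; the sjk.jar path and <cassandra_pid> are
placeholders):

    # top 20 threads of the Cassandra JVM, ordered by CPU usage
    java -jar sjk.jar ttop -p <cassandra_pid> -o CPU -n 20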
> network load is around 50% of the full available bandwidth
This sounds alarming to me. May I ask what the full available
bandwidth is? Do you have a lot of CPU time spent in sys (vs user) mode?
On 05/03/2021 14:48, Attila Wind wrote:
Hi guys,
I have a DevOps-related question - I hope someone here can give
some ideas/pointers...
We are running a 3-node Cassandra cluster.
Recently we realized we have performance issues. Based on the
investigation we did, it seems our bottleneck is the Cassandra
cluster. The application layer is waiting a lot for Cassandra ops.
So queries are running slow on the Cassandra side, yet according to
our monitoring it looks like the Cassandra servers still have lots
of free resources...
The Cassandra machines are virtual machines (we own the physical
hosts too) built with KVM - with 6 CPU cores (3 physical) and 32GB
RAM dedicated to each.
We are using the Ubuntu Linux 18.04 distro - the same version
everywhere (on the physical and virtual hosts).
We are running Cassandra 4.0-alpha4.
What we see is
* CPU load is around 20-25% - so we have lots of spare capacity
* iowait is around 2-5% - so disk bandwidth should be fine
* network load is around 50% of the full available bandwidth
* loadavg is at most around 4 - 4.5 but typically around 3 (given
   the CPU count, 6 would represent 100% load)
and still, query performance is slow... and we do not understand
what could be holding Cassandra back from fully utilizing the server
resources...
We are clearly missing something!
Anyone any idea / tip?
thanks!
--
Attila Wind
http://www.linkedin.com/in/attilaw
Mobile: +49 176 43556932