The tpstats output you posted shows that the node is dropping reads and writes, which means your disk can't keep up with the load: the disk is the bottleneck. If you haven't already, place the data and commitlog directories on separate disks so they aren't competing for the same IO bandwidth. Note that it's OK to have them on the same disk/volume if you have NVMe SSDs, since those are a lot harder to saturate.
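For example, the split is configured in `cassandra.yaml` with something like the following (the mount points below are placeholders; substitute the paths for your own disks):

```yaml
# cassandra.yaml -- point data and commitlog at separate physical disks
# (example paths; use your actual mount points)
data_file_directories:
    - /mnt/disk1/cassandra/data
commitlog_directory: /mnt/disk2/cassandra/commitlog
```

Restart the node after moving the directories so it picks up the new locations.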
The challenge with monitoring is that it typically samples disk stats only every 5 minutes (for example). But your app traffic is bursty in nature, so stats averaged over a long window are misleading: the only thing that matters is what the disk IO looks like at the moment you hit peak load. The dropped reads and mutations tell you the node is overloaded. Provided your nodes are configured correctly, the only way out of this situation is to correctly size your cluster and add more nodes -- your cluster needs to be sized for peak loads, not average throughput. Cheers!
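To see why averaged stats hide the problem, here's a quick illustration with hypothetical per-second IOPS samples: a single 1-second burst dominates, but the average over the window looks harmless. (In practice you'd capture real numbers with something like `iostat -x 1` while the burst is happening, and confirm the drops with `nodetool tpstats`.)

```shell
# Hypothetical per-second IOPS samples: one burst hits 9500 IOPS
samples="200 180 190 9500 210"

# Average over the window vs. the peak within it
avg=$(echo $samples | tr ' ' '\n' | awk '{s+=$1} END {printf "%d", s/NR}')
peak=$(echo $samples | tr ' ' '\n' | sort -n | tail -1)

echo "avg=$avg peak=$peak"   # the average looks fine; the peak is what drops requests
```

The average here is 2056 IOPS, which would look healthy on a dashboard, while the 9500 IOPS spike is what actually caused the dropped reads and mutations.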