Hi,

Would you share some more context with us?
- What Cassandra version do you use?
- What is the data size per node?
- How much RAM does the hardware have?
- Does your client use paging?

A few ideas to explore:

- Try tracing the query to see what is taking time (and resources).
- From the tracing, the logs, the sstablemetadata tool, or a monitoring dashboard, do you see any tombstones?
- What is the percentage of GC pause time per second? A 128 GB heap seems huge to me, even with G1GC. Do you still have memory left for the page cache? Check the general logs, GC logs, or a dashboard as well. Reallocating 70 GB of eden every minute does not seem right. Maybe a smaller heap (more common) would give more frequent but shorter pauses?
- Any pending/blocked threads (monitoring charts about thread pools, or 'nodetool tpstats')? Running 'watch -d "nodetool tpstats"' will make the evolution and any newly pending/blocked threads obvious to you (a Cassandra restart resets these stats as well).
- What is the number of SSTables touched per read operation on the main tables?
- Are the bloom filters efficient?
- Is the key cache efficient (hit ratio of 0.8, 0.9+)?
- The logs should be reporting something during the 10 minutes the machines were unresponsive; give this a try:

grep -e "WARN" -e "ERROR" /var/log/cassandra/system.log

More than 200 MB per partition is quite big. Explore what can be improved operationally, but you might ultimately have to reduce the partition size. On the other hand, Cassandra tends to evolve toward allowing bigger partitions, as it handles them more efficiently over time. If you can improve things on the operational side, you might be able to keep this model. If it is possible to experiment on a canary node and observe, I would probably go this path after identifying a likely origin of and solution for this issue.

Other tips that might help here:

- Disabling 'dynamic snitching' has proved to improve performance (often clearly visible in the p99 latencies), mostly through better use of the page cache (disk).
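On the GC pause percentage: a rough way to estimate it is to sum the pause durations reported in the GC log over a known wall-clock window. A minimal sketch, assuming pause lines ending in "<seconds> secs" (the sample log lines below are invented for the example; your real gc.log path and format may differ):

```shell
# Invented sample of GC log lines (real files usually live under /var/log/cassandra/).
cat > /tmp/gc.log <<'EOF'
[2018-05-26T14:00:01.000+0000] GC pause (G1 Evacuation Pause) (young), 0.0351 secs
[2018-05-26T14:00:31.000+0000] GC pause (G1 Evacuation Pause) (young), 0.0420 secs
[2018-05-26T14:01:02.000+0000] GC pause (G1 Evacuation Pause) (mixed), 0.1200 secs
EOF

# Sum the pause durations; divided by the wall-clock window (here ~60s),
# that gives you the fraction of time spent paused.
awk '/GC pause/ {sum += $(NF-1)} END {printf "%.4f\n", sum}' /tmp/gc.log
```

With a healthy node this fraction should stay very low; if the sum over a minute approaches whole seconds, GC is a prime suspect for the unresponsiveness you saw.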
- Making sure that most of your partitions fit within the read block size (buffer) you are using can also make reads more efficient. When data is compressed, the chunk size determines the buffer size.

I hope this helps. I am curious about this one, please let us know what you find out :).

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-05-26 14:21 GMT+01:00 onmstester onmstester <onmstes...@zoho.com>:

> By reading 90 partitions concurrently (each having size > 200 MB), my
> single node Apache Cassandra became unresponsive;
> no read and write works for almost 10 minutes.
> I'm using these configs:
> memtable_allocation_type: offheap_buffers
> gc: G1GC
> heap: 128GB
> concurrent_reads: 128 (having more than 12 disks)
>
> There is not much pressure on my resources except for the memory: the
> eden with 70GB is filled and reallocated in less than a minute.
> CPU is about 20% while read is crashed, and iostat shows no significant
> load on disk.
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
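P.S. In case it helps, here is roughly what the two tips above look like in configuration. The values are illustrative examples only, not recommendations, and 'my_ks.my_table' is a made-up name:

```yaml
# cassandra.yaml - disable dynamic snitching
dynamic_snitch: false
```

and for the read buffer, at the table level (Cassandra 3.x option name):

```sql
-- CQL: set the compression chunk size, which determines the read buffer size
ALTER TABLE my_ks.my_table
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 16};
```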