Re: Network / GC / Latency spike

2015-09-11 Thread Otis Gospodnetić
Hi Alain, Nice charts! ;) (attachments came through the list). Since you're using SPM for monitoring Cassandra, you may want to have a look at https://sematext.atlassian.net/wiki/display/PUBSPM/Network+Map which I think would have shown which nodes were talking to which nodes and how much.

Re: Network / GC / Latency spike

2015-09-10 Thread Alain RODRIGUEZ
Hi, just wanted to drop the follow-up here. I finally figured out that the bigdata guys were basically hammering the cluster by reading 2 months of data as fast as possible from one table at boot time to cache it. As this table is storing 12 MB blobs (Bloom Filters), even if the number of reads was not

Re: Network / GC / Latency spike

2015-09-01 Thread Fabien Rousseau
Hi Alain, Maybe it's possible to confirm this by testing on a small cluster: - create a cluster of 2 nodes (using https://github.com/pcmanus/ccm for example) - create a fake wide row of a few MB (using the Python driver for example) - drain and stop one of the two nodes - remove the sstables of

Re: Network / GC / Latency spike

2015-09-01 Thread Alain RODRIGUEZ
Hi Fabien, thanks for your help. I did not mention it, but I did indeed see a correlation between latency and read repair spikes. Though this is like going from 5 RR per second to 10 per second cluster-wide according to OpsCenter: http://img42.com/L6gx1 I do indeed have some wide rows and this explanation

Re: Network / GC / Latency spike

2015-08-31 Thread Fabien Rousseau
Hi Alain, Could it be wide rows + read repair? (Let's suppose the "read repair" repairs the full row, and it may not be subject to the stream throughput limit.) Best Regards, Fabien 2015-08-31 15:56 GMT+02:00 Alain RODRIGUEZ: > I just realised that I have no idea about how this