Hi Alain,

Nice charts! ;) (The attachments came through the list.)
Since you're using SPM for monitoring Cassandra, you may want to have a look
at https://sematext.atlassian.net/wiki/display/PUBSPM/Network+Map which I
think would have shown which nodes were talking to which nodes, and how much.
I don't have a screenshot to share, but it looks a bit like the one at
http://blog.sematext.com/2015/08/06/introducing-appmap/

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Thu, Sep 10, 2015 at 11:43 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hi, just wanted to drop the follow-up here.
>
> I finally figured out that the bigdata guys were basically hammering the
> cluster by reading two months of data as fast as possible from one table
> at boot time, to cache it. As this table stores 12 MB blobs (Bloom
> filters), even though the number of reads was not very high, each row is
> really big, so reads + read repairs were putting too much pressure on
> Cassandra. Those reads were mixed in with much heavier workloads, so I was
> not seeing any burst in reads, which made this harder to troubleshoot.
> Local read metrics (from Sematext / OpsCenter) helped me find this out.
>
> Given the use case (no random reads, write once, no updates) and the size
> of each element, we will basically move this data out of Cassandra to some
> HDFS or S3 storage. We do not need a database for this kind of job.
> Meanwhile, we have simply disabled this feature, as it is not critical.
>
> @Fabien, thank you for your help.
>
> C*heers,
>
> Alain
>
> 2015-09-02 0:43 GMT+02:00 Fabien Rousseau <fabifab...@gmail.com>:
>
>> Hi Alain,
>>
>> Maybe it's possible to confirm this by testing on a small cluster:
>> - create a cluster of 2 nodes (using https://github.com/pcmanus/ccm for
>> example)
>> - create a fake wide row of a few MB (using the python driver, for
>> example; see the sketch below)
>> - drain and stop one of the two nodes
>> - remove the sstables of the stopped node (to provoke inconsistencies)
>> - start it again
>> - select a small portion of the wide row (many times; use nodetool
>> tpstats to know when a read repair has been triggered)
>> - nodetool flush (on the previously stopped node)
>> - check the size of the sstable (if a few KB, then only the selected
>> slice was repaired, but if a few MB, then the whole row was repaired)
>>
>> The wild guess was: if a read repair is triggered when reading a small
>> portion of a wide row, and if it results in streaming the whole wide row,
>> that could explain a network burst. (But on second thought, it makes more
>> sense to only repair the small portion being read...)
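For Fabien's fake-wide-row step, something like this with the DataStax
python driver should do. It's only a rough sketch; the keyspace, table name,
and sizes below are made-up examples:

    from cassandra.cluster import Cluster

    # Connect to the local ccm cluster (ccm nodes usually listen on
    # 127.0.0.1, 127.0.0.2, ...).
    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect()

    # One partition with many clustering keys = one wide row.
    # RF=2 so both nodes own the row and read repair has work to do.
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS rr_test WITH replication =
        {'class': 'SimpleStrategy', 'replication_factor': 2}""")
    session.execute("""
        CREATE TABLE IF NOT EXISTS rr_test.wide (
            pk int, ck int, payload blob,
            PRIMARY KEY (pk, ck))""")

    # ~5 MB in a single partition: 500 rows of 10 KB blobs under pk=1.
    insert = session.prepare(
        "INSERT INTO rr_test.wide (pk, ck, payload) VALUES (?, ?, ?)")
    payload = bytes(bytearray(10 * 1024))
    for ck in range(500):
        session.execute(insert, (1, ck, payload))

    cluster.shutdown()

Selecting individual ck values from pk=1 at QUORUM afterwards should then
trigger the read repairs you want to observe (with RF=2, QUORUM touches
both replicas, so a digest mismatch forces a repair).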
>> 2015-09-01 12:05 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>
>>> Hi Fabien, thanks for your help.
>>>
>>> I did not mention it, but I did indeed see a correlation between latency
>>> and read repair spikes. Though this is like going from 5 RR per second
>>> to 10 per second cluster-wide, according to OpsCenter:
>>> http://img42.com/L6gx1
>>>
>>> I do indeed have some wide rows, and this explanation looks reasonable
>>> to me; I mean, it makes sense. Yet isn't this amount of read repair too
>>> low to induce such a "shitstorm" (even if it spikes x2, my network went
>>> up x10)? Also, the wide rows are on heavily used tables (sadly...), so I
>>> should be using more network all the time. Why only a few spikes per day
>>> (like 2 / 3 max)?
>>>
>>> How could I confirm this, without removing RR and waiting a week, I
>>> mean? Is there a way to see the size of the data being repaired through
>>> this mechanism?
>>>
>>> C*heers,
>>>
>>> Alain
>>>
>>> 2015-09-01 0:11 GMT+02:00 Fabien Rousseau <fabifab...@gmail.com>:
>>>
>>>> Hi Alain,
>>>>
>>>> Could it be wide rows + read repair? (Let's suppose the "read repair"
>>>> repairs the full row; it may not be subject to the stream throughput
>>>> limit.)
>>>>
>>>> Best regards,
>>>> Fabien
>>>>
>>>> 2015-08-31 15:56 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>>>
>>>>> I just realised that I have no idea how this mailing list handles
>>>>> attached files.
>>>>>
>>>>> Please find the screenshots here --> http://img42.com/collection/y2KxS
>>>>>
>>>>> Alain
>>>>>
>>>>> 2015-08-31 15:48 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are running a 2.0.16 C* cluster on AWS (private VPC, 2 DCs).
>>>>>>
>>>>>> I am facing an issue in our EU DC, where I see network bursts
>>>>>> (alongside GC and latency increases).
>>>>>>
>>>>>> My first thought was a sudden application burst; however, I see no
>>>>>> corresponding change in reads / writes, or even in CPU.
>>>>>>
>>>>>> So I thought this might come from the nodes themselves, as IN network
>>>>>> is almost equal to OUT network. I tried lowering the stream
>>>>>> throughput on the whole DC to 1 Mbps; with ~30 nodes --> 30 Mbps -->
>>>>>> ~4 MB/s max. Yet my network went a lot higher, about 30 M in both
>>>>>> directions (see the screenshots attached).
>>>>>>
>>>>>> I have tried to use iftop to see where this traffic is headed, but I
>>>>>> was not able to, because the bursts are very short.
>>>>>>
>>>>>> So, my questions are:
>>>>>>
>>>>>> - Has anyone experienced something similar? If so, any clue would be
>>>>>> appreciated :).
>>>>>> - How can I know (monitor, capture) where this big amount of traffic
>>>>>> is headed, or what it is due to?
>>>>>> - Am I right in trying to figure out what this traffic is, or should
>>>>>> I follow another lead?
>>>>>>
>>>>>> Note: I also noticed that CPU does not spike, nor do reads & writes,
>>>>>> but disk reads do spike too!
>>>>>>
>>>>>> C*heers,
>>>>>>
>>>>>> Alain
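
P.S. Since the bursts were too short for interactive iftop, a periodic
text-mode capture might catch them. A rough sketch only: it assumes iftop
supports the -t (text output) and -s <seconds> batch flags, which may vary
by version; it needs root, and the interface name and log path are just
placeholders:

    # Snapshot per-host traffic every few seconds with iftop's text mode,
    # timestamped, so short bursts can be reviewed after the fact.
    import subprocess
    import time

    while True:
        stamp = time.strftime('%Y-%m-%dT%H:%M:%S')
        # -t: text output, -n: no DNS lookups, -s 5: report after 5 seconds
        out = subprocess.check_output(
            ['iftop', '-t', '-n', '-s', '5', '-i', 'eth0'])
        with open('/var/tmp/iftop-bursts.log', 'a') as log:
            log.write('--- %s ---\n' % stamp)
            log.write(out.decode('utf-8', 'replace'))

Each 5-second report lands timestamped in the log, so the bursts can be
lined up with the SPM / OpsCenter graphs afterwards.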