Blowing out to 1k SSTables seems a bit full on. What args are you passing to repair?
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com

On 31 October 2016 at 09:49, Stefano Ortolani <ostef...@gmail.com> wrote:
> I've collected some more data points, and I still see dropped mutations with compaction_throughput_mb_per_sec set to 8.
> The only notable thing about the current setup is that I have another keyspace (not being repaired, though) with really wide rows (100MB per partition), but that shouldn't have any impact in theory. The nodes do not seem that overloaded either, and I don't see any GC spikes while those mutations are dropped :/
>
> Hitting a dead end here; any further ideas where to look?
>
> Regards,
> Stefano
>
> On Wed, Aug 10, 2016 at 12:41 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
> > That's what I was thinking. Maybe GC pressure?
> > Some more details: during anticompaction I have some CFs exploding to 1K SSTables (back down to ~200 upon completion).
> > HW specs should be quite good (12 cores / 32 GB RAM) but, I admit, still relying on spinning disks, with ~150GB per node.
> > Current version is 3.0.8.
> >
> > On Wed, Aug 10, 2016 at 12:36 PM, Paulo Motta <pauloricard...@gmail.com> wrote:
> >> That's pretty low already, but perhaps you should lower it further to see if it improves the dropped mutations during anticompaction (even if it increases repair time); otherwise the problem might be somewhere else. Generally, dropped mutations are a signal of cluster overload, so if there's nothing else wrong, perhaps you need to increase your capacity. What version are you on?
> >>
> >> 2016-08-10 8:21 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
> >>> Not yet. Right now I have it set at 16.
> >>> Would halving it more or less double the repair time?
> >>>
> >>> On Tue, Aug 9, 2016 at 7:58 PM, Paulo Motta <pauloricard...@gmail.com> wrote:
> >>>> Anticompaction throttling can be done by setting the usual compaction_throughput_mb_per_sec knob in cassandra.yaml, or via nodetool setcompactionthroughput. Did you try lowering that and checking whether it improves the dropped mutations?
> >>>>
> >>>> 2016-08-09 13:32 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
> >>>>> Hi all,
> >>>>>
> >>>>> I am running incremental repairs on a weekly basis (I can't do it every day, as a single run takes 36 hours), and every time I have at least one node dropping mutations as part of the process (almost always during the anticompaction phase). Ironically, this leads to a system where repairing makes some data consistent at the cost of making other data inconsistent.
> >>>>>
> >>>>> Does anybody know why this is happening?
> >>>>>
> >>>>> My feeling is that this might be caused by anticompacting column families with really wide rows and many SSTables. If that is the case, is there any way I can throttle that?
> >>>>>
> >>>>> Thanks!
> >>>>> Stefano
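For reference, the throttling knob discussed in the thread can be adjusted in two ways. A minimal sketch (these commands assume a running Cassandra node and the standard `nodetool` CLI; the specific value of 8 MB/s mirrors the figure mentioned above and is not a recommendation):

```shell
# Option 1: persistent setting in cassandra.yaml (requires a node restart).
# compaction_throughput_mb_per_sec: 8

# Option 2: change it live on a running node via nodetool (reverts on restart).
nodetool setcompactionthroughput 8

# Verify the current value.
nodetool getcompactionthroughput

# While repair/anticompaction runs, dropped mutations show up in the
# "MUTATION" row of the thread-pool stats, under the "Dropped" column.
nodetool tpstats
```

Note that `nodetool setcompactionthroughput` applies only to the node it is run against, so it would need to be issued on each node, whereas the cassandra.yaml setting survives restarts.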