On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
> Also, are you using incremental repairs (not sure about the available
> options in Spotify Reaper)? What command did you run?
>

No.

> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>
>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>
>> What is your current compaction throughput? The current value of
>> 'concurrent_compactors' (cassandra.yaml or through JMX)?
>>

Throughput was initially set to 1024 and I've gradually increased it to
2048, 4K and 16K but haven't seen any changes. I tried to change it both
from `nodetool` and in cassandra.yaml (with a restart after changes).

>> nodetool getcompactionthroughput
>>
>>> How to speed up compaction? Increased compaction throughput and
>>> concurrent compactors but no change. There seem to be plenty of idle
>>> resources but I can't force C* to use them.
>>>
>>
>> You might want to try un-throttling the compaction throughput with:
>>
>> nodetool setcompactionthroughput 0
>>
>> Choose a canary node. Monitor pending compactions and disk throughput
>> (make sure the server is OK too - CPU...).
>>

Yes, I'll try it out, but if increasing it 16 times didn't help I'm a bit
sceptical about it.

>> Some other information could be useful:
>>
>> What is your number of cores per machine and the compaction strategies
>> for the 'most compacting' tables? What are the write/update patterns, any
>> TTL or tombstones? Do you use a high number of vnodes?
>>

I'm using bare-metal boxes, 40 CPUs, 64 GB RAM and 2 SSDs each. num_tokens
is set to 256. Using LCS for all tables. Write/update heavy. No warnings
about a large number of tombstones, but we're removing items frequently.

>> Also, what is your repair routine and your value for gc_grace_seconds?
>> When was your last repair and do you think your cluster is suffering from
>> high entropy?
>>

We've been having problems with repair for months (CASSANDRA-9935).
gc_grace_seconds is set to 345600 now. Yes, as we haven't run it
successfully for a long time, I guess the cluster is suffering from high
entropy.

>> You can lower the stream throughput to make sure nodes can cope with what
>> repairs are feeding them:
>>
>> nodetool getstreamthroughput
>> nodetool setstreamthroughput X
>>

Yes, this sounds interesting. As we've been having problems with repair for
months, it could be that lots of data is being transferred between nodes.
Thanks!

>> C*heers,
>>
>> -----------------
>> Alain Rodriguez
>> France
>>
>> The Last Pickle
>> http://www.thelastpickle.com
>>
>> 2016-02-11 16:55 GMT+01:00 Michał Łowicki <mlowi...@gmail.com>:
>>
>>> Hi,
>>>
>>> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
>>> using Cassandra Reaper, but after a couple of hours nodes are full of
>>> pending compaction tasks (regular ones, not the ones about validation).
>>>
>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>>
>>> How to speed up compaction? Increased compaction throughput and
>>> concurrent compactors but no change. There seem to be plenty of idle
>>> resources but I can't force C* to use them.
>>>
>>> Any clue where there might be a bottleneck?
>>>
>>> --
>>> BR,
>>> Michał Łowicki
>>>
>>

--
BR,
Michał Łowicki
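
[For reference, a minimal canary-node check along the lines suggested above
might look like the sketch below. The host placeholder and the 64 MB/s
revert value are illustrative, not values taken from this thread.]

    # un-throttle compaction on a single canary node (0 = unlimited)
    nodetool -h <canary-host> setcompactionthroughput 0

    # watch the compaction backlog and the CompactionExecutor thread pool
    nodetool -h <canary-host> compactionstats
    nodetool -h <canary-host> tpstats | grep -i compaction

    # put a throttle back once the effect has been measured (value in MB/s)
    nodetool -h <canary-host> setcompactionthroughput 64

[If the pending-compaction count does not drop even when un-throttled, the
bottleneck is likely elsewhere, e.g. streaming from repair feeding new
SSTables faster than LCS can absorb them, which is why lowering
setstreamthroughput was suggested.]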