On Thu, Feb 11, 2016 at 5:38 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Also, are you using incremental repairs (not sure about the available
> options in Spotify Reaper)? What command did you run?
>
>
No.


> 2016-02-11 17:33 GMT+01:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>
>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses
>>
>>
>>
>> What is your current compaction throughput? The current value of
>> 'concurrent_compactors' (cassandra.yaml or through JMX)?
>>
>

Throughput was initially set to 1024 and I've gradually increased it to
2048, 4K and 16K, but haven't seen any change. I tried changing it both via
`nodetool` and in cassandra.yaml (restarting after each change).
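
For reference, here's roughly how I've been verifying the changes (assuming
nodetool can reach the node; the JMX details are from memory, so treat this
as a sketch):

nodetool getcompactionthroughput        # confirm the new value actually took
nodetool setcompactionthroughput 4096   # value is in MB/s

# concurrent_compactors has no nodetool command in 2.1; it has to be changed
# over JMX, e.g. via the CompactionManager MBean
# (org.apache.cassandra.db:type=CompactionManager, attributes
# CoreCompactorThreads / MaximumCompactorThreads - attribute names from
# memory, please double-check).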


>
>> nodetool getcompactionthroughput
>>
>>> How to speed up compaction? Increased compaction throughput and
>>> concurrent compactors but no change. It seems there are plenty of idle
>>> resources, but I can't force C* to use them.
>>>
>>
>> You might want to try un-throttling the compaction throughput:
>>
>> nodetool setcompactionthroughput 0
>>
>> Choose a canary node. Monitor pending compactions and disk throughput
>> (make sure the server is OK too - CPU...)
>>
>

Yes, I'll try it out, but if increasing it 16 times didn't help I'm a bit
sceptical about it.
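
In case it's useful, this is what I plan to watch on the canary node while
un-throttled (standard tools, nothing Reaper-specific):

nodetool setcompactionthroughput 0   # un-throttle on the canary only
nodetool compactionstats             # pending compactions should start draining
nodetool tpstats                     # CompactionExecutor pending/blocked counts
iostat -x 5                          # disk utilization while compactions churn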


>
>> Some other information could be useful:
>>
>> What is your number of cores per machine and the compaction strategies
>> for the 'most compacting' tables? What are the write/update patterns, any
>> TTLs or tombstones? Do you use a high number of vnodes?
>>
>
We're using bare-metal boxes: 40 CPUs, 64 GB RAM and 2 SSDs each. num_tokens
is set to 256.

Using LCS for all tables. Write/update heavy. No warnings about large
numbers of tombstones, but we're removing items frequently.
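
To see whether LCS is keeping up and how many droppable tombstones we carry,
I'm checking roughly this (keyspace/table name and data path are
placeholders; sstablemetadata ships in tools/bin of the distribution):

nodetool cfstats my_ks.my_table   # 'SSTables in each level' shows the LCS backlog
sstablemetadata /var/lib/cassandra/data/my_ks/my_table-*/*-Data.db | grep -i droppable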



>
>> Also, what is your repair routine and your value for gc_grace_seconds?
>> When was your last repair, and do you think your cluster is suffering from
>> high entropy?
>>
>
We've been having problems with repair for months (CASSANDRA-9935).
gc_grace_seconds is set to 345600 now. Yes, as we haven't run it
successfully for a long time, I guess the cluster is suffering from high
entropy.
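
For the record, this is how I double-checked the value (in 2.1 the table
options live in system.schema_columnfamilies; 'my_ks' is a placeholder):

cqlsh -e "SELECT columnfamily_name, gc_grace_seconds FROM system.schema_columnfamilies WHERE keyspace_name = 'my_ks';"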


>
>> You can lower the stream throughput to make sure nodes can cope with what
>> repairs are feeding them.
>>
>> nodetool getstreamthroughput
>> nodetool setstreamthroughput X
>>
>
Yes, this sounds interesting. As we've been having problems with repair for
months, it could be that lots of data is being transferred between nodes.
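
I'll start conservatively and raise it gradually; the value is in Mb/s and,
if I read cassandra.yaml correctly, the 2.1 default is 200 (50 below is just
an arbitrary starting point):

nodetool getstreamthroughput
nodetool setstreamthroughput 50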

Thanks!


>
>> C*heers,
>>
>> -----------------
>> Alain Rodriguez
>> France
>>
>> The Last Pickle
>> http://www.thelastpickle.com
>>
>> 2016-02-11 16:55 GMT+01:00 Michał Łowicki <mlowi...@gmail.com>:
>>
>>> Hi,
>>>
>>> Using 2.1.12 across 3 DCs. Each DC has 8 nodes. Trying to run repair
>>> using Cassandra Reaper, but after a couple of hours the nodes are full of
>>> pending compaction tasks (regular ones, not validation compactions).
>>>
>>> CPU load is fine, SSD disks below 30% utilization, no long GC pauses.
>>>
>>> How to speed up compaction? Increased compaction throughput and
>>> concurrent compactors but no change. It seems there are plenty of idle
>>> resources, but I can't force C* to use them.
>>>
>>> Any clue where there might be a bottleneck?
>>>
>>>
>>> --
>>> BR,
>>> Michał Łowicki
>>>
>>>
>>
>


-- 
BR,
Michał Łowicki