> Ideally we would like to collect maximum garbage from ParNew itself, during
> compactions. What are the steps to take towards achieving this?

I'm not sure what you are asking.
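If the goal is for compaction garbage to die in ParNew rather than be
promoted, a rough back-of-the-envelope sketch of the young-gen sizing under
the flags quoted further down (assuming HotSpot's standard layout):

    -Xmn800M with -XX:SurvivorRatio=8 gives an 8:1:1 split:
    eden          = 800 MB * 8/10 = 640 MB
    each survivor = 800 MB * 1/10 =  80 MB

With -XX:MaxTenuringThreshold=5 an object must survive up to 5 ParNew cycles
before promotion, but whenever a compaction keeps more than ~80 MB of
objects alive across a ParNew cycle, the survivor space overflows and
objects are promoted into the CMS old gen early regardless of the threshold.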
Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 6:56 PM, Ravikumar Govindarajan wrote:

> We have modified maxTenuringThreshold from 1 to 5. Maybe it is causing
> problems. We will change it back to 1 and see how the system behaves.
>
> concurrent_compactors=8. We will reduce this, as our system won't be able
> to handle that many compactions at the same time anyway. That should also
> ease GC to some extent.
>
> Ideally we would like to collect maximum garbage from ParNew itself, during
> compactions. What are the steps to take towards achieving this?
>
> On Wed, Jul 4, 2012 at 4:07 PM, aaron morton <aa...@thelastpickle.com> wrote:
> It *may* have been compaction from the repair, but it's not a big CF.
>
> I would look at the logs to see how much data was transferred to the node.
> Was there a compaction going on while the GC storm was happening? Do you
> have a lot of secondary indexes?
>
> If you think it is correlated with compaction, you can try reducing
> concurrent_compactors.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/07/2012, at 6:33 PM, Ravikumar Govindarajan wrote:
>
>> Recently, we faced a severe freeze [around 30-40 mins] on one of our
>> servers. Many mutations/reads were dropped. The issue happened just after
>> a routine nodetool repair for the below CF completed [1.0.7, NTS,
>> DC1:3, DC2:2]
>>
>> Column Family: MsgIrtConv
>> SSTable count: 12
>> Space used (live): 17426379140
>> Space used (total): 17426379140
>> Number of Keys (estimate): 122624
>> Memtable Columns Count: 31180
>> Memtable Data Size: 81950175
>> Memtable Switch Count: 31
>> Read Count: 8074156
>> Read Latency: 15.743 ms.
>> Write Count: 2172404
>> Write Latency: 0.037 ms.
>> Pending Tasks: 0
>> Bloom Filter False Postives: 1258
>> Bloom Filter False Ratio: 0.03598
>> Bloom Filter Space Used: 498672
>> Key cache capacity: 200000
>> Key cache size: 200000
>> Key cache hit rate: 0.9965579513062582
>> Row cache: disabled
>> Compacted row minimum size: 51
>> Compacted row maximum size: 89970660
>> Compacted row mean size: 226626
>>
>> Our heap config is as follows:
>>
>> -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=5 -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly
>>
>> from cassandra.yaml:
>> in_memory_compaction_limit=64
>> compaction_throughput_mb_sec=8
>> multi_threaded_compaction=false
>>
>> INFO [AntiEntropyStage:1] 2012-06-29 09:21:26,085 AntiEntropyService.java
>> (line 762) [repair #2b6fcbf0-c1f9-11e1-0000-2ea8811bfbff] MsgIrtConv is
>> fully synced
>> INFO [AntiEntropySessions:8] 2012-06-29 09:21:26,085
>> AntiEntropyService.java (line 698) [repair
>> #2b6fcbf0-c1f9-11e1-0000-2ea8811bfbff] session completed successfully
>> INFO [CompactionExecutor:857] 2012-06-29 09:21:31,219 CompactionTask.java
>> (line 221) Compacted to
>> [/home/sas/system/data/ZMail/MsgIrtConv-hc-858-Data.db,]. 47,907,012 to
>> 40,554,059 (~84% of original) bytes for 4,564 keys at 6.252080MB/s. Time:
>> 6,186ms.
>>
>> After this, the logs were completely filled with GC [ParNew/CMS] messages.
>> ParNew ran every 3 seconds and CMS roughly every 30 seconds, continuously
>> for 40 minutes.
>>
>> INFO [ScheduledTasks:1] 2012-06-29 09:23:39,921 GCInspector.java (line 122)
>> GC for ParNew: 776 ms for 2 collections, 2901990208 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 09:23:42,265 GCInspector.java (line 122)
>> GC for ParNew: 2028 ms for 2 collections, 3831282056 used; max is 8506048512
>>
>> .........................................
>>
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line 122)
>> GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:55,632 GCInspector.java (line 122)
>> GC for ParNew: 1165 ms for 3 collections, 3264696776 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:57,773 GCInspector.java (line 122)
>> GC for ParNew: 1444 ms for 3 collections, 4234372296 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:59,387 GCInspector.java (line 122)
>> GC for ParNew: 1153 ms for 2 collections, 4910279080 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:08:00,389 GCInspector.java (line 122)
>> GC for ParNew: 697 ms for 2 collections, 4873857072 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:08:01,443 GCInspector.java (line 122)
>> GC for ParNew: 726 ms for 2 collections, 4941511184 used; max is 8506048512
>>
>> After this, the node became stable again and was back up and running. Any
>> pointers will be greatly appreciated.
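Reading the GCInspector lines above, a rough interpretation (assuming the
"used" figure is total heap occupancy after the collection):

    max heap    = 8506048512 bytes ~ 7.9 GB  (the -Xmx8G heap)
    old gen     ~ 8 GB - 800 MB young       ~ 7.2 GB
    CMS trigger = 75% of the old gen        ~ 5.4 GB
                  (-XX:CMSInitiatingOccupancyFraction=75)

Heap use climbing from ~2.9 GB to ~4.9 GB across ParNew cycles only a second
or two apart suggests heavy promotion out of the young gen, which would keep
re-arming the CMS occupancy trigger and match CMS running roughly every 30
seconds for the whole 40-minute window.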
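For the two changes discussed up-thread, minimal sketches of where they live
on a 1.0.x install (paths and the concurrent_compactors value are
illustrative; adjust for your setup):

    # conf/cassandra-env.sh -- revert the tenuring change
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"

    # conf/cassandra.yaml -- fewer simultaneous compactions
    concurrent_compactors: 2                # illustrative; down from 8
    compaction_throughput_mb_per_sec: 8     # matches the throttle quoted above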