> Ideally we would like to collect maximum garbage from ParNew itself, during
> compactions. What are the steps to take towards achieving this?

I'm not sure what you are asking.
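If the goal is for compaction garbage to die in ParNew rather than be
promoted, a rough back-of-the-envelope sketch of the young-gen sizing under
the flags quoted further down (assuming HotSpot's standard layout):

    -Xmn800M with -XX:SurvivorRatio=8 gives an 8:1:1 split:
    eden          = 800 MB * 8/10 = 640 MB
    each survivor = 800 MB * 1/10 =  80 MB

With -XX:MaxTenuringThreshold=5 an object must survive up to 5 ParNew cycles
before promotion, but whenever a compaction keeps more than ~80 MB of
objects alive across a ParNew cycle, the survivor space overflows and
objects are promoted into the CMS old gen early regardless of the threshold.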
Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/07/2012, at 6:56 PM, Ravikumar Govindarajan wrote:

> We have modified maxTenuringThreshold from 1 to 5. Maybe it is causing
> problems. We will change it back to 1 and see how the system behaves.
>
> concurrent_compactors=8. We will reduce this, as our system won't be able
> to handle that many compactions at the same time anyway. That should also
> ease GC to some extent.
>
> Ideally we would like to collect maximum garbage from ParNew itself, during
> compactions. What are the steps to take towards achieving this?
>
> On Wed, Jul 4, 2012 at 4:07 PM, aaron morton <aa...@thelastpickle.com> wrote:
> It *may* have been compaction from the repair, but it's not a big CF.
>
> I would look at the logs to see how much data was transferred to the node.
> Was there a compaction going on while the GC storm was happening? Do you
> have a lot of secondary indexes?
>
> If you think it is correlated with compaction, you can try reducing
> concurrent_compactors.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/07/2012, at 6:33 PM, Ravikumar Govindarajan wrote:
>
>> Recently, we faced a severe freeze [around 30-40 mins] on one of our
>> servers. Many mutations/reads were dropped. The issue happened just after
>> a routine nodetool repair for the below CF completed [1.0.7, NTS,
>> DC1:3, DC2:2]
>>
>> Column Family: MsgIrtConv
>> SSTable count: 12
>> Space used (live): 17426379140
>> Space used (total): 17426379140
>> Number of Keys (estimate): 122624
>> Memtable Columns Count: 31180
>> Memtable Data Size: 81950175
>> Memtable Switch Count: 31
>> Read Count: 8074156
>> Read Latency: 15.743 ms.
>> Write Count: 2172404
>> Write Latency: 0.037 ms.
>> Pending Tasks: 0
>> Bloom Filter False Postives: 1258
>> Bloom Filter False Ratio: 0.03598
>> Bloom Filter Space Used: 498672
>> Key cache capacity: 200000
>> Key cache size: 200000
>> Key cache hit rate: 0.9965579513062582
>> Row cache: disabled
>> Compacted row minimum size: 51
>> Compacted row maximum size: 89970660
>> Compacted row mean size: 226626
>>
>> Our heap config is as follows:
>>
>> -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=5 -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly
>>
>> from cassandra.yaml:
>> in_memory_compaction_limit=64
>> compaction_throughput_mb_sec=8
>> multi_threaded_compaction=false
>>
>> INFO [AntiEntropyStage:1] 2012-06-29 09:21:26,085 AntiEntropyService.java
>> (line 762) [repair #2b6fcbf0-c1f9-11e1-0000-2ea8811bfbff] MsgIrtConv is
>> fully synced
>> INFO [AntiEntropySessions:8] 2012-06-29 09:21:26,085
>> AntiEntropyService.java (line 698) [repair
>> #2b6fcbf0-c1f9-11e1-0000-2ea8811bfbff] session completed successfully
>> INFO [CompactionExecutor:857] 2012-06-29 09:21:31,219 CompactionTask.java
>> (line 221) Compacted to
>> [/home/sas/system/data/ZMail/MsgIrtConv-hc-858-Data.db,]. 47,907,012 to
>> 40,554,059 (~84% of original) bytes for 4,564 keys at 6.252080MB/s. Time:
>> 6,186ms.
>>
>> After this, the logs were completely filled with GC [ParNew/CMS] messages.
>> ParNew ran every 3 seconds and CMS roughly every 30 seconds, continuously
>> for 40 minutes.
>>
>> INFO [ScheduledTasks:1] 2012-06-29 09:23:39,921 GCInspector.java (line 122)
>> GC for ParNew: 776 ms for 2 collections, 2901990208 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 09:23:42,265 GCInspector.java (line 122)
>> GC for ParNew: 2028 ms for 2 collections, 3831282056 used; max is 8506048512
>>
>> .........................................
>>
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line 122)
>> GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:55,632 GCInspector.java (line 122)
>> GC for ParNew: 1165 ms for 3 collections, 3264696776 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:57,773 GCInspector.java (line 122)
>> GC for ParNew: 1444 ms for 3 collections, 4234372296 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:07:59,387 GCInspector.java (line 122)
>> GC for ParNew: 1153 ms for 2 collections, 4910279080 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:08:00,389 GCInspector.java (line 122)
>> GC for ParNew: 697 ms for 2 collections, 4873857072 used; max is 8506048512
>> INFO [ScheduledTasks:1] 2012-06-29 10:08:01,443 GCInspector.java (line 122)
>> GC for ParNew: 726 ms for 2 collections, 4941511184 used; max is 8506048512
>>
>> After this, the node became stable again and was back up and running. Any
>> pointers will be greatly appreciated.
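Reading the GCInspector lines above, a rough interpretation (assuming the
"used" figure is total heap occupancy after the collection):

    max heap    = 8506048512 bytes ~ 7.9 GB  (the -Xmx8G heap)
    old gen     ~ 8 GB - 800 MB young       ~ 7.2 GB
    CMS trigger = 75% of the old gen        ~ 5.4 GB
                  (-XX:CMSInitiatingOccupancyFraction=75)

Heap use climbing from ~2.9 GB to ~4.9 GB across ParNew cycles only a second
or two apart suggests heavy promotion out of the young gen, which would keep
re-arming the CMS occupancy trigger and match CMS running roughly every 30
seconds for the whole 40-minute window.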
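For the two changes discussed up-thread, minimal sketches of where they live
on a 1.0.x install (paths and the concurrent_compactors value are
illustrative; adjust for your setup):

    # conf/cassandra-env.sh -- revert the tenuring change
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"

    # conf/cassandra.yaml -- fewer simultaneous compactions
    concurrent_compactors: 2                # illustrative; down from 8
    compaction_throughput_mb_per_sec: 8     # matches the throttle quoted above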