[ https://issues.apache.org/jira/browse/HBASE-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896758#action_12896758 ]
stack commented on HBASE-2902:
------------------------------

Sure, but see the cited slide deck above. It's not the size of the YG but the amount of copying done (not sure which kind of copying -- to be elicited). Might be worth playing w/ a bigger YG, making it so stuff is aged, having made it through 4 or 5 or even 10 YGGCs, before it gets tenured.

St.Ack

> Improve our default shipping GC config. and doc -- along the way do a bit of
> GC myth-busting
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2902
>                 URL: https://issues.apache.org/jira/browse/HBASE-2902
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>         Attachments: Fragger.java
>
>
> This issue is about improving the near-term story, working with our current
> lot, the slowly evolving (?) 1.6.x JVMs and CMS (longer-term, another issue in
> hbase tracks the G1 story, and Todd is making a bit of traction over on the
> GC hotspot list).
> At the moment we ship with CMS and i-CMS enabled by default. At a minimum,
> i-CMS does not apply on most hw hbase is deployed on -- i-CMS is for hw w/ 2
> or fewer processors -- and it seems as though we do not use multiple threads
> doing YG collections; i.e. -XX:+UseParNewGC "Use parallel threads in the new
> generation" (Here's what I see...it seems to be off in jdk6 according to
> http://www.md.pp.ru/~eu/jdk6options.html#UseParNewGC but then this says it's
> on by default when using CMS ->
> http://blogs.sun.com/jonthecollector/category/Java ... but then this says
> enable it http://www.austinjug.org/presentations/JDK6PerfUpdate_Dec2009.pdf.
> I see this when it's enabled: [Rescan (parallel) ... so it seems like it's off.
> Need to review the src code).
> We should make the above changes or at least doc them.
> We should consider enabling GC logging by default. It's low cost apparently
> (citation below). We'd just need to do something about the log management.
> Not sure you can roll them -- investigate -- and anyway we should roll on
> startup at least so we don't lose GC logs across restarts.
> We should play with initiating ratios; maybe starting CMS earlier will push
> out the fragmented heap that brings on the killer stop-the-world collection.
> I read somewhere recently that invoking System.gc will run a CMS GC if CMS is
> enabled. We should investigate. If it ran the serial collector, we could at
> least doc. that users could run a defragmenting stop-the-world serial
> collection at 'off' times, or at least make it so the stop-the-world happened
> when expected instead of at some random time.
> While here, let's do a bit of myth-busting. Here are a few postulates:
> + Keep the young generation small, or at least cap its size, else it grows to
> occupy a large part of the heap
> The above is a Ryanism. Doing the above -- along w/ massive heap size -- has
> put off the fragmentation that others run into, at SU at least.
> Interestingly, this document --
> http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBcQFjAA&url=http%3A%2F%2Fmediacast.sun.com%2Fusers%2FLudovic%2Fmedia%2FGCTuningPresentationFISL10.pdf&ei=ZPtaTOiLL5bcsAa7gsl1&usg=AFQjCNHP691SIIE-6NSKccM4mZtm1U6Ahw&sig2=2cjvcaeyn1aISL2THEENjQ
> -- would seem to recommend near the opposite in that it suggests that when
> using CMS, do all you can to keep stuff in the YG. Avoid having stuff age up
> to the tenured heap if you can. This would seem to imply using a larger YG.
> Chatting w/ Ryan, the reason to keep the YG small is so we don't have long
> pauses doing YG collections. According to the above citation, it's not big
> YGs that cause long YG pauses but the copying of data (not sure if it's
> copying of data inside the YG or if it meant copying up to tenured --
> chatting w/ Ryan we thought there'd be no difference -- but we should
> investigate).
> I took a look at a running upload, with a small heap admittedly.
> What I was seeing was that, using our defaults, rare was anything in YG of
> age > 1 GC; i.e. near everything in YG was being promoted. This may have been
> a symptom of my small (default) heap, but we should look into this and try to
> ensure objects are promoted because they are old, not because there is not
> enough space in YG.
> + We should write a slab allocator or allocate memory outside of the JVM heap
> Thinking on this, a slab allocator, while a lot of work, I can see helping
> us w/ block cache, but what if memstore is the fragmented-heap maker? In
> this case, a slab allocator is only part of the fix. It should be easy to see
> which is the fragmented-heap maker since we can turn off the cache easily
> enough (though it seems like it's accessed anyway even if disabled -- need to
> make sure it's not doing allocations to the cache in this case).
> Other things while on this topic. We need to come up w/ a loading that
> brings on the CMS fault that comes of a fragmented heap (CMS is
> non-compacting but apparently it will join together free blocks to make
> bigger ones, so there is some anti-fragmenting behavior going on). Apparently
> lots of large, irregularly sized items is the ticket.
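
The loading described at the end of the issue (lots of large, irregularly sized items) might be sketched roughly as below. This is not the attached Fragger.java, whose contents are not shown in this message; the class name, method, slot count and size range are all made up for illustration, and whether this pattern actually brings on the CMS fragmentation fault is exactly what would need testing.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Random;

/** Hypothetical fragmentation workload; NOT the attached Fragger.java. */
public class IrregularAllocator {

    /**
     * Keeps a fixed table of slots and repeatedly overwrites a random slot
     * with a byte array of random size in [minBytes, maxBytes], so that
     * differently sized arrays die in no particular order.
     * Returns the number of bytes still referenced at the end.
     */
    public static long churn(int slots, int iterations,
                             int minBytes, int maxBytes, long seed) {
        Random rnd = new Random(seed);
        byte[][] live = new byte[slots][];
        long liveBytes = 0;
        for (int i = 0; i < iterations; i++) {
            int slot = rnd.nextInt(slots);
            int size = minBytes + rnd.nextInt(maxBytes - minBytes + 1);
            if (live[slot] != null) {
                liveBytes -= live[slot].length; // old array becomes garbage
            }
            live[slot] = new byte[size];
            liveBytes += size;
        }
        return liveBytes;
    }

    public static void main(String[] args) {
        // Illustrative numbers only; scale them to the heap under test.
        long live = churn(256, 200000, 16 * 1024, 256 * 1024, 42L);
        System.out.println("live bytes at end: " + live);

        // Report which collectors ran and how often, using the standard
        // java.lang.management beans.
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": "
                    + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
    }
}
```

On a JVM that still ships CMS, something like `java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -verbose:gc IrregularAllocator` would show which collectors ran. The collector names printed at the end are another way to check whether ParNew is actually in use for the young generation.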