Charles~

I settled on that after running experiments varying the survivor ratio and tenuring threshold. In the end, I found that >99.9% of the young garbage was collected with this setting, and each extra young gen pass only reclaimed about 1% of the previously surviving objects. So the trade-off just wasn't winning me anything.
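(For anyone who wants to repeat that measurement: one way to see how much survives each extra young gen pass is to run with a non-zero tenuring threshold and -XX:+PrintTenuringDistribution, which prints the survivor-space age histogram at every minor collection. This is only a sketch; the threshold and survivor ratio values are illustrative rather than the exact ones from my experiments, and app.jar is a placeholder:

  # illustrative run that logs the survivor-space age histogram per minor GC
  java -Xms32g -Xmx32g -Xmn4g \
       -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
       -XX:MaxTenuringThreshold=4 \
       -XX:SurvivorRatio=8 \
       -XX:+PrintTenuringDistribution \
       -Xloggc:gc-tenuring.log \
       -jar app.jar

Once objects stop disappearing between ages in that histogram, raising the threshold further mostly just adds survivor-copying cost.)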
Matt

On Tue, May 4, 2010 at 6:50 PM, Charles Oliver Nutter <[email protected]> wrote:
> Why are you using -XX:MaxTenuringThreshold=0? That's basically forcing
> all objects that survive one collection to be promoted immediately,
> even if they just happen to be slightly longer-lived young objects.
> Using 0 seems like a bad idea.
>
> On Tue, May 4, 2010 at 4:46 PM, Matt Fowles <[email protected]> wrote:
>> All~
>>
>> I have a large app that produces ~4g of garbage every minute, and I am
>> trying to reduce the size of GC outliers. About 99% of this data is
>> garbage, but almost anything that survives one collection survives for
>> an indeterminately long time. We are currently using the following VM
>> and options:
>>
>> java version "1.6.0_16"
>> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
>> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
>>
>> -verbose:gc
>> -Xms32g -Xmx32g -Xmn4g
>> -XX:+UseParNewGC
>> -XX:ParallelGCThreads=4
>> -XX:+UseConcMarkSweepGC
>> -XX:ParallelCMSThreads=4
>> -XX:MaxTenuringThreshold=0
>> -XX:SurvivorRatio=20000
>> -XX:CMSInitiatingOccupancyFraction=60
>> -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:+CMSParallelRemarkEnabled
>> -XX:MaxGCPauseMillis=50
>> -Xloggc:gc.log
>>
>> As you can see from the GC log, we never actually reach the point
>> where CMS kicks in (after app startup). But our young gens seem to
>> take increasingly long to collect as time goes by.
>>
>> One of the major metrics we have for measuring this system is latency,
>> as measured from connected clients and as measured internally. You can
>> see an attached graph of latency vs. time for the clients. It is not
>> surprising that the internal latency (green, labeled 'sb') is lower
>> than the network latency. I assume this is in part because VM
>> safepoints are less likely to occur within our internal timing
>> markers. But one can easily see that the external latency measurements
>> (blue, labeled 'network') show the same steady increase over time.
>>
>> My hope is to tweak the young gen size and trade off GC frequency
>> against pause length; however, the steadily increasing GC times
>> persist regardless of how large I make the young generation.
>>
>> Has anyone seen this sort of behavior before? Are there more switches
>> I should try running with?
>>
>> Obviously, I am working in parallel to profile the app and reduce the
>> garbage load. But if I still see this sort of problem, it is only a
>> question of how long the app must run before I hit unacceptable
>> latency spikes.
>>
>> Matt
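For completeness, the settings quoted above assemble into a single launch command like the following. This is only a sketch: the flags and values are exactly those from the original mail, but app.jar stands in for the real application entry point.

  # the quoted flag set as one launch line; app.jar is a placeholder
  java -verbose:gc \
       -Xms32g -Xmx32g -Xmn4g \
       -XX:+UseParNewGC -XX:ParallelGCThreads=4 \
       -XX:+UseConcMarkSweepGC -XX:ParallelCMSThreads=4 \
       -XX:MaxTenuringThreshold=0 -XX:SurvivorRatio=20000 \
       -XX:CMSInitiatingOccupancyFraction=60 \
       -XX:+UseCMSInitiatingOccupancyOnly \
       -XX:+CMSParallelRemarkEnabled \
       -XX:MaxGCPauseMillis=50 \
       -Xloggc:gc.log \
       -jar app.jar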
