I will also mention that a lot of the time, when we benchmark object-heavy algorithms in JRuby (which is most algorithms), the bottleneck is not GC so much as the allocation rate itself. Kirk Pepperdine ran some numbers for me a few months back and calculated that for numeric benchmarks we were basically saturating the memory pipeline just to *acquire* objects, even though GC times were negligible. It sounds like you're cranking through a lot of data... could you simply be asking too much of memory? What sort of application is it? Is it using one of the object-heavy languages like JRuby or Clojure?
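If you want to put a number on that, HotSpot exposes per-thread allocation counters you can sample around a chunk of work. Here's a minimal sketch, assuming a JDK whose ThreadMXBean implements the HotSpot-specific com.sun.management.ThreadMXBean; runWorkload is a hypothetical boxing-heavy stand-in, not your app:

    import java.lang.management.ManagementFactory;

    public class AllocRate {
        public static void main(String[] args) {
            // HotSpot-specific extension of the standard ThreadMXBean; the cast
            // fails on VMs that don't provide com.sun.management.ThreadMXBean.
            com.sun.management.ThreadMXBean mx =
                    (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
            long tid = Thread.currentThread().getId();

            long bytesBefore = mx.getThreadAllocatedBytes(tid);
            long t0 = System.nanoTime();

            runWorkload(); // hypothetical stand-in for the code under test

            double secs = (System.nanoTime() - t0) / 1e9;
            long bytes = mx.getThreadAllocatedBytes(tid) - bytesBefore;
            System.out.printf("%.1f MB allocated, %.1f MB/s%n",
                    bytes / 1e6, bytes / 1e6 / secs);
        }

        // Boxing-heavy loop: roughly what a dynamic language's numeric code does.
        private static void runWorkload() {
            long sum = 0;
            for (int i = 0; i < 50000000; i++) {
                Long boxed = Long.valueOf(i); // allocates for values outside the small cache
                sum += boxed;
            }
            // Read sum so the loop isn't eliminated as dead code.
            if (sum == 0) System.out.println("unreachable for this loop");
        }
    }

If the MB/s figure lands anywhere near what your memory subsystem can actually sustain, you're in the situation Kirk measured: the collector is fine, and the allocation itself is the wall.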
On Wed, May 5, 2010 at 1:28 PM, Charles Oliver Nutter <[email protected]> wrote:
> Ok, my next thought would be that your young generation is perhaps too
> big? I'm sure you've probably tried choking it down and letting GC run
> more often against a smaller young heap?
>
> If GC times for the young gen are getting longer, something has to be
> changing. Finalizers? Weak/SoftReferences? You say you're not getting
> to the point of CMS running, but a large young gen can still take a
> long time to collect. Do less, more often?
>
> On Wed, May 5, 2010 at 9:25 AM, Matt Fowles <[email protected]> wrote:
>> Charles~
>>
>> I settled on that after having run experiments varying the survivor
>> ratio and tenuring threshold. In the end, I discovered that >99.9% of
>> the young garbage got caught with this, and each extra young-gen run
>> only reclaimed about 1% of the previously surviving objects. So it
>> seemed like the trade-off just wasn't winning me anything.
>>
>> Matt
>>
>> On Tue, May 4, 2010 at 6:50 PM, Charles Oliver Nutter
>> <[email protected]> wrote:
>>> Why are you using -XX:MaxTenuringThreshold=0? That's basically forcing
>>> all objects that survive one collection to be promoted immediately,
>>> even if they just happen to be slightly longer-lived young objects.
>>> Using 0 seems like a bad idea.
>>>
>>> On Tue, May 4, 2010 at 4:46 PM, Matt Fowles <[email protected]> wrote:
>>>> All~
>>>>
>>>> I have a large app that produces ~4g of garbage every minute and am
>>>> trying to reduce the size of GC outliers. About 99% of this data is
>>>> garbage, but almost anything that survives one collection survives
>>>> for an indeterminately long amount of time. We are currently using
>>>> the following VM and options:
>>>>
>>>> java version "1.6.0_16"
>>>> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
>>>> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
>>>>
>>>> -verbose:gc
>>>> -Xms32g -Xmx32g -Xmn4g
>>>> -XX:+UseParNewGC
>>>> -XX:ParallelGCThreads=4
>>>> -XX:+UseConcMarkSweepGC
>>>> -XX:ParallelCMSThreads=4
>>>> -XX:MaxTenuringThreshold=0
>>>> -XX:SurvivorRatio=20000
>>>> -XX:CMSInitiatingOccupancyFraction=60
>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>> -XX:+CMSParallelRemarkEnabled
>>>> -XX:MaxGCPauseMillis=50
>>>> -Xloggc:gc.log
>>>>
>>>> As you can see from the GC log, we never actually reach the point
>>>> where CMS kicks in (after app startup). But our young gens seem to
>>>> take increasingly long to collect as time goes by.
>>>>
>>>> One of the major metrics we have for measuring this system is
>>>> latency as measured from connected clients and as measured
>>>> internally. You can see an attached graph of latency vs. time for
>>>> the clients. It is not surprising that the internal latency (green,
>>>> labeled 'sb') is not as large as the network latency. I assume this
>>>> is in part because VM safepoints are less likely to occur within
>>>> our internal timing markers. But one can easily see how the
>>>> external latency measurements (blue, labeled 'network') display the
>>>> same steady increase in times.
>>>>
>>>> My hope is to be able to tweak the young gen size and trade off GC
>>>> frequency against pause length; however, the steadily increasing GC
>>>> times persist regardless of the size I make the young generation.
>>>>
>>>> Has anyone seen this sort of behavior before? Are there more
>>>> switches that I should try running with?
>>>>
>>>> Obviously, I am working to profile the app and reduce the garbage
>>>> load in parallel. But if I still see this sort of problem, it is
>>>> only a question of how long the app must run before I see
>>>> unacceptable latency spikes.
>>>>
>>>> Matt
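One more thought on the flag set quoted above: before resizing anything, it may be worth making the young collections tell you what they're doing. Here's a sketch of the diagnostic flags I'd reach for (annotated with # comments for readability here; drop those on a real command line). These are all standard HotSpot 6 options, and the MaxTenuringThreshold/SurvivorRatio values are hypothetical starting points for an experiment, not a recommendation:

    -verbose:gc
    -Xloggc:gc.log
    -XX:+PrintGCDetails              # per-phase pause breakdown, occupancy before/after
    -XX:+PrintGCTimeStamps           # lets you plot pause length against uptime
    -XX:+PrintTenuringDistribution   # age histogram of survivors at each young collection
    -XX:MaxTenuringThreshold=4       # hypothetical: let objects age a few cycles before promotion
    -XX:SurvivorRatio=8              # hypothetical: aging only works if survivor spaces can hold a cycle's survivors

If the tenuring histogram shows nearly everything dying at age 1, your current setup is already right and the growing pauses are coming from somewhere else. Since young collections have to scan dirty cards in the old generation, an old gen that keeps accumulating references back into the young gen is one suspect worth ruling out.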
