> Also, why is there so much garbage collection to begin with? Memcache
> uses a slab allocator to reuse blocks to prevent allocation/deallocation
> of blocks from consuming all the cpu time. Are there any plans to reuse
> blocks so the garbage collector doesn't have to work so hard?

And to address this, by the way, although it has nothing to do with the problem being investigated in this thread:

It's not about how *much* time is spent on memory management. That is of course relevant, but the issue here is avoiding long stop-the-world pauses. Even if you avoid most allocation, as long as you're not doing *zero* allocation you need to collect what you *do* allocate in a way that avoids long stop-the-world pauses.

How this is accomplished in garbage collector implementations is beyond the scope of this e-mail, but a major point is: yes, there are workloads under which a given garbage collector, like CMS, will not be able to avoid stop-the-world full GCs forever. However, the most common problems are dead simple ones like "oops, row cache too big", where the *SYMPTOM* is prolonged GC pauses but the actual root cause is not a broken or inadequate GC. It is sub-optimal that this is so (it would clearly be more user-friendly if one were simply told that there is too much live data), but for various reasons it is non-trivial, from a JVM perspective, to provide that information to the operator in a way that is useful and won't trigger false positives.

Also, all memory management techniques have trade-offs. If you believe memcached is invulnerable, try this: populate a memcached with a bunch of data of varying size but with a given average size. Wait until it's entirely full. Then adjust your data so that the distribution looks the same but is displaced significantly (e.g., move from a 150 byte average blob size to a 1000 byte average). Unless you were lucky in exactly how memcached ended up sizing its slabs and what your data sizes happen to be, you can then watch memcached crash and burn your application (which relied on the cache having a good hit ratio) as you suddenly start seeing data evicted within seconds of insertion.

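The memcached experiment can be sketched as a toy simulation. This is not memcached's actual code: the class `ToySlabCache`, the 1 MB page size, the 96 byte smallest chunk, the 1.25 growth factor and the 8 page memory limit are all assumptions chosen for illustration. The only property the sketch tries to capture is that a page, once handed to a size class, is never given to another class.

```python
import random
from collections import OrderedDict

PAGE_SIZE = 1024 * 1024  # bytes per page (assumed toy value)


class ToySlabCache:
    """Toy slab cache: a page assigned to a size class stays there forever."""

    def __init__(self, total_pages, min_chunk=96, growth=1.25):
        # Chunk sizes grow geometrically: 96, 120, 150, 187, 234, ...
        self.chunk_sizes = []
        size = min_chunk
        while size <= PAGE_SIZE:
            self.chunk_sizes.append(int(size))
            size *= growth
        self.free_pages = total_pages
        self.capacity = [0] * len(self.chunk_sizes)           # chunks owned per class
        self.lru = [OrderedDict() for _ in self.chunk_sizes]  # per-class LRU of keys
        self.where = {}                                       # key -> class index
        self.rejected = 0                                     # sets that found no chunk

    def _class_for(self, nbytes):
        for i, chunk in enumerate(self.chunk_sizes):
            if nbytes <= chunk:
                return i
        raise ValueError("item larger than a page")

    def set(self, key, nbytes):
        old_class = self.where.pop(key, None)
        if old_class is not None:
            del self.lru[old_class][key]
        c = self._class_for(nbytes)
        if len(self.lru[c]) >= self.capacity[c]:
            if self.free_pages > 0:
                # Grow this class by one page; the page is never returned.
                self.free_pages -= 1
                self.capacity[c] += PAGE_SIZE // self.chunk_sizes[c]
            elif self.lru[c]:
                # No free pages: evict the least recently used key of THIS class only.
                victim, _ = self.lru[c].popitem(last=False)
                del self.where[victim]
            else:
                # The class owns no pages at all and none are left to give it.
                self.rejected += 1
                return False
        self.lru[c][key] = None
        self.where[key] = c
        return True

    def get(self, key):
        c = self.where.get(key)
        if c is None:
            return False
        self.lru[c].move_to_end(key)
        return True


def run_phase(cache, rng, prefix, n_items, lo, hi, window):
    """Insert n_items with sizes uniform in [lo, hi]; return the fraction of
    the last `window` inserted keys that can still be read back."""
    for i in range(n_items):
        cache.set(f"{prefix}{i}", rng.randint(lo, hi))
    recent = [f"{prefix}{i}" for i in range(n_items - window, n_items)]
    return sum(cache.get(k) for k in recent) / window


def simulate(seed=1):
    rng = random.Random(seed)
    cache = ToySlabCache(total_pages=8)
    # Phase 1: about 150 bytes on average, enough items to fill every page.
    small = run_phase(cache, rng, "small", 100_000, 100, 200, 10_000)
    # Phase 2: same shape of distribution, displaced to about 1000 bytes.
    large = run_phase(cache, rng, "large", 100_000, 800, 1200, 10_000)
    return small, large, cache


if __name__ == "__main__":
    small, large, cache = simulate()
    print("recent-key hit ratio, small items:", small)
    print("recent-key hit ratio, large items:", large)
    print("free pages:", cache.free_pages, "rejected sets:", cache.rejected)
```

With these toy parameters the first phase keeps all of its recent keys, while in the second phase every page is already owned by the small size classes. The sketch is the extreme version of the scenario: the large items are not merely evicted quickly, they find no chunk at all, even though the cache is full of small items nobody asks for any more.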
This can happen because memcached makes trade-offs in how it does memory management in order to achieve its performance. One of those trade-offs involves not being able to re-allocate memory to slabs for different object sizes, when the object size distribution changes.

--
/ Peter Schuller