Ok, I figured it was something like that. After switching to ConcurrentLinkedHashCacheProvider it is a lot better, but instead of the 25-30ms response times I enjoyed with no caching, I'm still seeing 500ms at a 100% cache hit rate. No old gen pressure at all, just ParNew going crazy.
More info on my use case: I am picking 50 columns out of the 70k. Since the whole row is in the cache, with no copying from off-heap nor from disk buffers, it seems it should be faster than non-cache mode. More thoughts? :)

On 11/18/11 6:39 AM, "Sylvain Lebresne" <sylv...@datastax.com> wrote:

>On Fri, Nov 18, 2011 at 1:53 AM, Todd Burruss <bburr...@expedia.com> wrote:
>> I'm using Cassandra 1.0 and have been doing some testing with Cassandra's cache.
>> When I turn it on (using the CLI) I see ParNew jump from 3-4ms to
>> 200-300ms. This really hurts response times, which jump from ~25-30ms
>> to 1300+ms. I've increased new gen and that helps, but this is still
>> surprising to me, especially since 1.0 defaults to the
>> SerializingCacheProvider off heap.
>> The interesting tidbit is that I have wide rows: 70k+ columns per row, ~50
>> bytes per column value. The cache only needs to hold about 400 rows to capture
>> all the data per node, and JMX is reporting 100% cache hits. Nodetool ring
>> reports < 2gb per node, my heap is 6gb and total RAM is 16gb.
>> Thoughts?
>
>Your problem is the mix of wide rows and the serializing cache.
>With the serializing cache, your data is stored off the heap. But
>that means that for each read of a row, we 'deserialize' the row
>from the off-heap memory into the heap to return it. The thing is,
>when we do that, we deserialize the full row each time. In other
>words, for each query we deserialize 70k+ columns even if we only
>return one. I'm willing to bet this is what is killing your response
>time. If you want to cache wide rows, I really suggest you use the
>ConcurrentLinkedHashCacheProvider instead.
>
>I'll also note that this explains the ParNew times too. Deserializing
>all those columns from off-heap creates lots of short-lived objects,
>and since you deserialize 70k+ of them on each query, that's quite some
>pressure on the new gen. Note that the serializing cache actually
>minimizes the use of the old gen, because that is the generation that
>can cause huge GC pauses with a big heap, but it does put more pressure
>on the new gen. This is by design, because the new gen is much less of
>a problem than the old gen.
>
>--
>Sylvain
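For anyone following along, here is a rough sketch of the difference Sylvain describes. This is not Cassandra's actual code; the class names, the serialization format and the use of ConcurrentHashMap are all made up for illustration. It just shows why an on-heap cache can return a hit by reference with no per-read allocation, while a serializing (off-heap style) cache has to rebuild every column of a wide row on the heap for each read, which is where the new-gen churn and ParNew times come from.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch only -- not Cassandra's real classes.
    public class RowCacheSketch {

        static class Column {
            final String name;
            final byte[] value;
            Column(String name, byte[] value) { this.name = name; this.value = value; }
        }

        static class Row {
            final List<Column> columns;
            Row(List<Column> columns) { this.columns = columns; }
        }

        // On-heap cache (ConcurrentLinkedHashCacheProvider-style behavior):
        // a hit returns a reference to the already-materialized row,
        // so there is essentially no per-read allocation.
        static final Map<String, Row> ON_HEAP_CACHE = new ConcurrentHashMap<String, Row>();

        static Row readOnHeap(String key) {
            return ON_HEAP_CACHE.get(key);
        }

        // Serializing cache (SerializingCacheProvider-style behavior): the row
        // is kept as one serialized blob, so a hit must rebuild *every* column
        // object on the heap, even if the query only needs 50 of 70k+ columns.
        static final Map<String, ByteBuffer> SERIALIZED_CACHE = new ConcurrentHashMap<String, ByteBuffer>();

        static Row readSerialized(String key) {
            ByteBuffer blob = SERIALIZED_CACHE.get(key);
            if (blob == null)
                return null;
            ByteBuffer buf = blob.duplicate();
            List<Column> columns = new ArrayList<Column>();
            while (buf.hasRemaining()) {
                // Each iteration allocates a name string, a value array and a
                // Column object; with 70k+ columns per row that is a lot of
                // short-lived new-gen garbage on every single read.
                int nameLen = buf.getInt();
                byte[] name = new byte[nameLen];
                buf.get(name);
                int valueLen = buf.getInt();
                byte[] value = new byte[valueLen];
                buf.get(value);
                columns.add(new Column(new String(name, StandardCharsets.UTF_8), value));
            }
            return new Row(columns);
        }
    }

With ~70k columns per row and reads at a 100% hit rate, the second path allocates hundreds of thousands of tiny objects per request, which matches the ParNew behavior described above.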