The memory overhead issue is not directly related to GC: by the time the JVM
ran out of memory, the GC had already been very busy for quite a while. In my
case the JVM consumed all of its 6GB when the row cache size hit 1.4 million
entries.
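
As a rough back-of-the-envelope estimate (assuming most of that 6GB went to
the row cache, which is a guess on my part): 6GB / 1.4 million rows is about
4.3KB per cached row, versus roughly 1KB of raw row data, so the per-row
overhead looks like 3KB or more, presumably spent on map entries, keys and
per-column object headers.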

I haven't started testing the row cache feature yet. But I think data
compression would be useful for reducing memory consumption, because my
impression is that disk I/O is always the bottleneck for Cassandra while its
CPU usage is usually low. In addition, compression should also dramatically
reduce the number of Java objects (correct me if I'm wrong), especially in
the case where we need to cache most of the data to achieve decent read
latency.
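
To make the idea concrete, here is a minimal sketch of what I have in mind,
assuming the cache stores the compressed serialized form of a row instead of
the live ColumnFamily object graph. The class and method names below are just
placeholders I made up, not anything that exists in the code today:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class CompressedRowCacheSketch
{
    // Compress an already-serialized row into a single byte[] so the cache
    // holds one object per row instead of a deep graph of column objects.
    static byte[] compress(byte[] serializedRow) throws IOException
    {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DeflaterOutputStream dos =
            new DeflaterOutputStream(bos, new Deflater(Deflater.BEST_SPEED));
        dos.write(serializedRow);
        dos.close(); // finishes the deflate stream
        return bos.toByteArray();
    }

    // Decompress on a cache hit; the caller would deserialize the bytes back
    // into a ColumnFamily before handing it to the read path.
    static byte[] decompress(byte[] compressed) throws IOException
    {
        InflaterInputStream iis =
            new InflaterInputStream(new ByteArrayInputStream(compressed));
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = iis.read(buf)) != -1)
            bos.write(buf, 0, n);
        return bos.toByteArray();
    }
}

The trade is CPU for memory, of course, but since our CPU usage is low that
seems like the right direction to try.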

If ColumnFamily is serializable, it shouldn't be that hard to implement the
compression feature, controlled by an option (again :-) in storage-conf.xml.
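
For the option-processing side, this is roughly what I am picturing; the
RowCacheCompression attribute name is made up for illustration, and the real
code would presumably read it in DatabaseDescriptor next to the other
per-ColumnFamily attributes:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class RowCacheCompressionOption
{
    // Read a hypothetical RowCacheCompression="true" attribute from a
    // ColumnFamily element in storage-conf.xml; default to off when absent.
    static boolean parseRowCacheCompression(String storageConfPath,
                                            String columnFamilyXPath)
        throws Exception
    {
        Document doc = DocumentBuilderFactory.newInstance()
                           .newDocumentBuilder()
                           .parse(storageConfPath);
        XPath xpath = XPathFactory.newInstance().newXPath();
        // e.g. columnFamilyXPath =
        //   "/Storage/Keyspaces/Keyspace/ColumnFamily[@Name='Standard1']"
        // evaluate() returns "" when the attribute is absent, so this
        // defaults to compression off.
        String value =
            xpath.evaluate(columnFamilyXPath + "/@RowCacheCompression", doc);
        return value.equalsIgnoreCase("true");
    }
}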

When I get to that point you can instruct me on how to implement this feature
along with the row-cache write-through. Our goal is straightforward: to
support low read latency in a high-volume web application with a write/read
ratio of 1:1.

-Weijun

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Thursday, February 18, 2010 12:04 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Testing row cache feature in trunk: write should put record in cache

Did you force a GC from jconsole to make sure you weren't just
measuring uncollected garbage?

On Wed, Feb 17, 2010 at 2:51 PM, Weijun Li <weiju...@gmail.com> wrote:
> OK I'll work on the change later because there's another problem to solve:
> the overhead for the cache is so big that 1.4mil records (1k each) consumed
> all of the 6gb memory of JVM (I guess 4gb are consumed by the row cache). I'm
> thinking that ConcurrentHashMap is not a good choice for LRU and the row
> cache needs to store compressed key data to reduce memory usage. I'll do
> more investigation on this and let you know.
>
> -Weijun
>
> On Tue, Feb 16, 2010 at 9:22 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>
>> ... tell you what, if you write the option-processing part in
>> DatabaseDescriptor I will do the actual cache part. :)
>>
>> On Tue, Feb 16, 2010 at 11:07 PM, Jonathan Ellis <jbel...@gmail.com>
>> wrote:
>> > https://issues.apache.org/jira/secure/CreateIssue!default.jspa, but
>> > this is pretty low priority for me.
>> >
>> > On Tue, Feb 16, 2010 at 8:37 PM, Weijun Li <weiju...@gmail.com> wrote:
>> >> Just tried to make a quick change to enable it but it didn't work out :-(
>> >>
>> >>                ColumnFamily cachedRow =
>> >> cfs.getRawCachedRow(mutation.key());
>> >>
>> >>                 // What I modified
>> >>                 if( cachedRow == null ) {
>> >>                     cfs.cacheRow(mutation.key());
>> >>                     cachedRow = cfs.getRawCachedRow(mutation.key());
>> >>                 }
>> >>
>> >>                 if (cachedRow != null)
>> >>                     cachedRow.addAll(columnFamily);
>> >>
>> >> How can I open a ticket for you to make the change (enable row cache
>> >> write
>> >> through with an option)?
>> >>
>> >> Thanks,
>> >> -Weijun
>> >>
>> >> On Tue, Feb 16, 2010 at 5:20 PM, Jonathan Ellis <jbel...@gmail.com>
>> >> wrote:
>> >>>
>> >>> On Tue, Feb 16, 2010 at 7:17 PM, Jonathan Ellis <jbel...@gmail.com>
>> >>> wrote:
>> >>> > On Tue, Feb 16, 2010 at 7:11 PM, Weijun Li <weiju...@gmail.com>
>> >>> > wrote:
>> >>> >> Just started to play with the row cache feature in trunk: it seems
>> >>> >> to
>> >>> >> be
>> >>> >> working fine so far except that for RowsCached parameter you need
>> >>> >> to
>> >>> >> specify
>> >>> >> number of rows rather than a percentage (e.g., "20%" doesn't work).
>> >>> >
>> >>> > 20% works, but it's 20% of the rows at server startup.  So on a
>> >>> > fresh
>> >>> > start that is zero.
>> >>> >
>> >>> > Maybe we should just get rid of the % feature...
>> >>>
>> >>> (Actually, it shouldn't be hard to update this on flush, if you want
>> >>> to open a ticket.)
>> >>
>> >>
>> >
>
>
