[jira] [Commented] (CASSANDRA-2864) Alternative Row Cache Implementation

Jonathan Ellis (Commented) (JIRA) Wed, 18 Apr 2012 12:01:06 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256832#comment-13256832
 ]


Jonathan Ellis commented on CASSANDRA-2864:
-------------------------------------------

If so, how do you avoid scanning the sstables?  Does this only work on 
named-column queries?  That is, if I ask for a slice from X to Y and you have 
data in your cache for X1 and X2, how do you know there is not also an X3 on 
disk somewhere?
                
> Alternative Row Cache Implementation
> ------------------------------------
>
>                 Key: CASSANDRA-2864
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2864
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Daniel Doubleday
>            Assignee: Daniel Doubleday
>            Priority: Minor
>
> We have been working on an alternative implementation to the existing row 
> cache(s).
> We have 2 main goals:
> - Decrease memory -> get more rows in the cache without suffering a huge 
> performance penalty
> - Reduce gc pressure
> This sounds a lot like we should be using the new serializing cache in 0.8. 
> Unfortunately, our workload consists of loads of updates, which would 
> invalidate that cache all the time.
> The second unfortunate thing is that the idea we came up with doesn't fit the 
> new cache provider API.
> It looks like this:
> Like the serializing cache, we basically only cache the serialized byte 
> buffer. We don't serialize the bloom filter, and we try to do some other 
> minor compression tricks (var ints etc.; not done yet). The main difference 
> is that we don't deserialize but use the normal sstable iterators and filters 
> as in the regular uncached case.
> So the read path looks like this:
> return filter.collectCollatedColumns(memtable iter, cached row iter)
> The write path is not affected: it does not update the cache.
> During flush, we merge all memtable updates with the cached rows.
> The attached patch is based on 0.8 branch r1143352.
> It does not replace the existing row cache but sits alongside it. There's an 
> environment switch to choose the implementation. This way it is easy to 
> benchmark performance differences.
> -DuseSSTableCache=true enables the alternative cache. It shares its 
> configuration with the standard row cache. So the cache capacity is shared. 
> We have duplicated a fair amount of code. First we actually refactored the 
> existing sstable filter / reader but then decided to minimize dependencies. 
> Also this way it is easy to customize serialization for in memory sstable 
> rows. 
> We have also experimented a little with compression but since this task at 
> this stage is mainly to kick off discussion we wanted to keep things simple. 
> But there is certainly room for optimizations.
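
The read path quoted above (collating a memtable iterator with a cached-row iterator) could be sketched roughly as below. This is only an illustration, not the code in the attached patch and not Cassandra's actual collectCollatedColumns: the names CollateSketch, Column, and collate are hypothetical, columns are assumed to arrive sorted by name, and the rule that the memtable column wins on equal timestamps is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class CollateSketch {
    /** Hypothetical column: a name, a value, and a write timestamp. */
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /**
     * Merges two iterators that are each sorted by column name.
     * When both sides hold the same name, the column with the newer
     * timestamp is kept (assumption: the memtable side wins ties).
     */
    static List<Column> collate(Iterator<Column> memtable, Iterator<Column> cached) {
        List<Column> out = new ArrayList<>();
        Column m = memtable.hasNext() ? memtable.next() : null;
        Column c = cached.hasNext() ? cached.next() : null;
        while (m != null || c != null) {
            // An exhausted side sorts after everything on the other side.
            int cmp = (m == null) ? 1 : (c == null) ? -1 : m.name.compareTo(c.name);
            if (cmp < 0) {
                out.add(m);
                m = memtable.hasNext() ? memtable.next() : null;
            } else if (cmp > 0) {
                out.add(c);
                c = cached.hasNext() ? cached.next() : null;
            } else {
                // Same column on both sides: reconcile by timestamp.
                out.add(m.timestamp >= c.timestamp ? m : c);
                m = memtable.hasNext() ? memtable.next() : null;
                c = cached.hasNext() ? cached.next() : null;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Column> memtable = List.of(new Column("b", "new", 2), new Column("d", "x", 3));
        List<Column> cached = List.of(
                new Column("a", "1", 1), new Column("b", "old", 1), new Column("c", "3", 1));
        for (Column col : collate(memtable.iterator(), cached.iterator())) {
            System.out.println(col.name + "=" + col.value);
        }
    }
}
```

The same merge would also describe the flush step under these assumptions: the flushed memtable columns take the role of the first iterator, and the merged result replaces the cached row.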
