[jira] [Issue Comment Edited] (CASSANDRA-1956) Convert row cache to row+filter cache

Daniel Doubleday (Issue Comment Edited) (JIRA) Wed, 02 Nov 2011 09:41:57 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142264#comment-13142264 ]


Daniel Doubleday edited comment on CASSANDRA-1956 at 11/2/11 4:40 PM:
----------------------------------------------------------------------

As I wrote earlier, I'm a little sceptical that a query cache like this will be 
useful in many cases, but since there is something going on here and Jonathan 
asked for a wish list:

Would you guys consider making the row cache a little more pluggable? This 
would allow us to maintain custom implementations more easily. I also think 
the core code could benefit, since it would move some ifs out of CFS.

Instead of implementing the control flow in CFS and using the cache as a map, 
you could introduce a RowCache instance that would act more like a service 
layer, for example:

{noformat}
interface RowCache {

    // Returns filtered rows, ready to serve. Reads the row from CFS if
    // necessary.
    ColumnFamily getRow(CFS store, QueryFilter filter, int gcBefore);

    // Notifies the cache of a mutation. The cache can update or invalidate.
    // Most implementations will not need the store param, but it might come
    // in handy in special cases.
    void apply(CFS store, DK key, CF cf);
}
{noformat}

This way CFS would need no knowledge of whether a cache is able to update or 
only invalidate, or, when it invalidates, whether it has to invalidate the 
whole row or just portions of it. There would also be no expectation about the 
internal caching format. The row cache could do whatever it likes. 

In CFS there would be only the cache reference. No distinction between old row 
cache, query cache, off-heap-cache, my-awesome-very-specialized-cache would be 
necessary.
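
To make the idea concrete, here is a minimal, self-contained sketch of the proposed shape. It is not Cassandra code: Store stands in for CFS, HeadFilter for QueryFilter, a sorted map of column name to value for ColumnFamily, and gcBefore is left out. All of these names are made up for illustration. It only shows that the store holds a single cache reference while two implementations react to the same apply() call differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical, heavily simplified stand-in for QueryFilter:
// "the first `count` columns of row `key`".
final class HeadFilter {
    final String key;
    final int count;
    HeadFilter(String key, int count) { this.key = key; this.count = count; }
}

interface RowCache {
    // Returns the filtered row, reading it from the store on a miss.
    SortedMap<String, String> getRow(Store store, HeadFilter filter);
    // Notified of every mutation; the cache decides whether to update or invalidate.
    void apply(Store store, String key, SortedMap<String, String> mutation);
}

// Stands in for CFS: holds only a cache reference, no cache-specific branches.
final class Store {
    private final Map<String, SortedMap<String, String>> disk = new HashMap<>();
    private final RowCache cache;
    int diskReads = 0; // counts uncached reads, for illustration only

    Store(RowCache cache) { this.cache = cache; }

    SortedMap<String, String> read(HeadFilter filter) {
        return cache.getRow(this, filter);
    }

    void write(String key, SortedMap<String, String> mutation) {
        disk.computeIfAbsent(key, k -> new TreeMap<String, String>()).putAll(mutation);
        cache.apply(this, key, mutation);
    }

    // Uncached read path, called by cache implementations on a miss.
    SortedMap<String, String> readFromDisk(String key) {
        diskReads++;
        SortedMap<String, String> stored = disk.get(key);
        return stored == null ? new TreeMap<String, String>()
                              : new TreeMap<String, String>(stored);
    }
}

// Shared read path for both examples: cache whole rows, filter on the way out.
abstract class WholeRowCache implements RowCache {
    protected final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    public SortedMap<String, String> getRow(Store store, HeadFilter filter) {
        SortedMap<String, String> row =
            rows.computeIfAbsent(filter.key, store::readFromDisk);
        SortedMap<String, String> head = new TreeMap<String, String>();
        for (Map.Entry<String, String> e : row.entrySet()) {
            if (head.size() == filter.count) break;
            head.put(e.getKey(), e.getValue());
        }
        return head;
    }
}

// Drops the cached row on every mutation.
final class InvalidatingCache extends WholeRowCache {
    public void apply(Store store, String key, SortedMap<String, String> mutation) {
        rows.remove(key);
    }
}

// Merges the mutation into the cached row, if the row is cached.
final class UpdatingCache extends WholeRowCache {
    public void apply(Store store, String key, SortedMap<String, String> mutation) {
        SortedMap<String, String> row = rows.get(key);
        if (row != null) row.putAll(mutation);
    }
}
```

Swapping new Store(new InvalidatingCache()) for new Store(new UpdatingCache()) changes how writes affect the cache without touching Store at all, which is the point of the proposal. A cache that stores only filtered portions of a row could implement the same two methods.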
 
                
> Convert row cache to row+filter cache
> -------------------------------------
>
>                 Key: CASSANDRA-1956
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: 0001-1956-cache-updates-v0.patch, 
> 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch
>
>
> Changing the row cache to a row+filter cache would make it much more useful. 
> We currently have to warn against using the row cache with wide rows, where 
> the read pattern is typically a peek at the head, but this use case would be 
> perfectly supported by a cache that stored only columns matching the filter.
> Possible implementations:
> * (copout) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is 
> likely to have some gotchas for weird usage patterns, and it requires the 
> list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a 
> secondary index to look up cache entries by rowkey so that you can keep them 
> in sync with the memtable
> * others?
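
As an illustration of the "rowkey+filterid" option in the quoted description, the following is a rough sketch only, not taken from any attached patch. FilterCacheKey, RowFilterCache and the string filter ids are hypothetical names. The sketch keeps a secondary index from row key to cache keys and, for simplicity, invalidates all entries of a row on mutation instead of updating them in place.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical composite cache key: row key plus an identifier for the filter.
final class FilterCacheKey {
    final String rowKey;
    final String filterId;

    FilterCacheKey(String rowKey, String filterId) {
        this.rowKey = rowKey;
        this.filterId = filterId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FilterCacheKey)) return false;
        FilterCacheKey other = (FilterCacheKey) o;
        return rowKey.equals(other.rowKey) && filterId.equals(other.filterId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(rowKey, filterId);
    }
}

final class RowFilterCache<V> {
    // Primary map: (row key, filter id) -> cached filtered result.
    private final Map<FilterCacheKey, V> entries = new HashMap<>();
    // Secondary index: row key -> every cache key that belongs to that row.
    private final Map<String, Set<FilterCacheKey>> byRow = new HashMap<>();

    V get(String rowKey, String filterId) {
        return entries.get(new FilterCacheKey(rowKey, filterId));
    }

    void put(String rowKey, String filterId, V value) {
        FilterCacheKey key = new FilterCacheKey(rowKey, filterId);
        entries.put(key, value);
        byRow.computeIfAbsent(rowKey, r -> new HashSet<FilterCacheKey>()).add(key);
    }

    // Called when a row is mutated: drops every cached filter result for it.
    void invalidateRow(String rowKey) {
        Set<FilterCacheKey> keys = byRow.remove(rowKey);
        if (keys == null) return;
        for (FilterCacheKey key : keys) {
            entries.remove(key);
        }
    }

    int size() {
        return entries.size();
    }
}
```

The secondary index is what makes a write to one row cheap to propagate: without it, finding all cached filters for a row would require scanning the whole cache. Eviction is left out here, and a real cache would also have to remove evicted keys from the index.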

