[ 
https://issues.apache.org/jira/browse/CASSANDRA-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066513#comment-13066513
 ] 

Jean-Francois Im commented on CASSANDRA-2904:
---------------------------------------------

I forgot to mention that I am interested in writing a patch for this. I 
implemented something quick and dirty on my end to get an idea of the 
performance improvement, but it assumes that nothing else is going on at the 
same time (i.e. nobody else is writing, the consistency level is always ONE, 
no compaction is running, only one client is issuing this kind of query, 
etc.).

Writing something more general-purpose would be trickier, and I would probably 
need some pointers on a few things (mostly how to handle compaction, query 
cursors, and consistency levels other than ONE), but it sounds really fun. Is 
there any interest in this?

> get_range_slices with no columns could be made faster by scanning the index 
> file
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2904
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2904
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.6
>            Reporter: Jean-Francois Im
>
> When scanning a column family using get_range_slices() with a predicate that 
> contains no columns, the scan operates on the actual data, not the index file.
> Our use case is that we have a column family with relatively wide rows 
> (varying from 10 KB to over 100 KB of data per row) and we need to iterate 
> through all the keys to figure out which rows we are interested in. 
> Obviously, going through the index file rather than the data is faster in 
> this case (on the order of minutes versus hours).
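
To illustrate why a key-only scan over an index can be so much cheaper than one
over wide rows, here is a minimal sketch in Python. It is a toy model only: the
two byte layouts, the function names, and the row sizes are assumptions made up
for this example and are not Cassandra's actual SSTable or index file format.
The data-file scan deliberately reads every row payload, which mirrors the
behaviour described in the issue rather than any particular implementation.

```python
import io
import struct


def write_toy_sstable(rows):
    """Build a toy data file and index file (hypothetical layouts).

    Data entry:  [key_len: u16][key][row_len: u32][row bytes]
    Index entry: [key_len: u16][key][data_offset: u64]
    """
    data, index = io.BytesIO(), io.BytesIO()
    for key, row in rows:
        offset = data.tell()
        data.write(struct.pack(">H", len(key)) + key)
        data.write(struct.pack(">I", len(row)) + row)
        index.write(struct.pack(">H", len(key)) + key)
        index.write(struct.pack(">Q", offset))
    return data.getvalue(), index.getvalue()


def keys_from_data(data):
    """Collect keys by walking the data file; returns (keys, bytes_read)."""
    buf, keys = io.BytesIO(data), []
    while True:
        header = buf.read(2)
        if not header:
            break
        (key_len,) = struct.unpack(">H", header)
        keys.append(buf.read(key_len))
        (row_len,) = struct.unpack(">I", buf.read(4))
        buf.read(row_len)  # the whole row payload is read, then discarded
    return keys, buf.tell()


def keys_from_index(index):
    """Collect keys by walking the index file; returns (keys, bytes_read)."""
    buf, keys = io.BytesIO(index), []
    while True:
        header = buf.read(2)
        if not header:
            break
        (key_len,) = struct.unpack(">H", header)
        keys.append(buf.read(key_len))
        buf.read(8)  # skip the stored data-file offset
    return keys, buf.tell()


# 100 rows of 50,000 bytes each, within the 10 KB to 100 KB range from the issue.
rows = [(b"key%04d" % i, b"x" * 50_000) for i in range(100)]
data, index = write_toy_sstable(rows)
data_keys, data_bytes = keys_from_data(data)
index_keys, index_bytes = keys_from_index(index)
print(len(index_keys), data_bytes, index_bytes)
```

Both scans return the same keys, but in this toy model the index scan touches
only a few bytes per row while the data scan touches the full row, so the gap
grows with row width.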

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
