[ 
https://issues.apache.org/jira/browse/CASSANDRA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007977#comment-13007977
 ] 

Sylvain Lebresne commented on CASSANDRA-2319:
---------------------------------------------

bq. The important thing to remember is that the distinction between columns and 
keys should be very fuzzy: columns are a suffix on keys, and treating them 
otherwise leads to complications. In this case, we shouldn't be holding every 
128th "key" in memory, but instead every 128th-512th tuple: that way wide rows 
are handled naturally.

I'm not saying otherwise. What I'm saying is that there are potentially 
orders of magnitude more 'tuples' than there are keys. So far I doubt many 
people have ever changed the index_interval value. I suppose this will change 
if we do this (I mean, we have advertised that you can have 2 billion columns 
per row after all :)), and we may even want to make index_interval 
configurable per-CF. In turn, this will be more things for the user to 
consider and a bigger chance of hitting an OOM if they are not careful. 
Probably nothing horrible, but let's just make sure we understand as many of 
the pros and cons as possible.
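
To put rough numbers on the keys-versus-tuples gap, here is a 
back-of-the-envelope sketch. The function name and the workload figures are 
made up for illustration, and it assumes the worst case where every column 
counts as one sampled 'tuple'; it is not how the index summary is actually 
built.

```python
def sampled_entries(total_entries: int, index_interval: int) -> int:
    """Entries held in memory when every index_interval-th entry is sampled."""
    # Ceiling division: a partial final interval still contributes one sample.
    return -(-total_entries // index_interval)


# Hypothetical workload (assumption): 1 million rows, 10,000 columns each.
rows = 1_000_000
columns_per_row = 10_000
interval = 128

# Sampling row keys only, as the index does today.
key_samples = sampled_entries(rows, interval)

# Sampling (key, column) tuples with the same interval.
tuple_samples = sampled_entries(rows * columns_per_row, interval)

print(key_samples, tuple_samples)
```

With the same interval, the in-memory sample count grows by roughly the 
number of columns per row, which is why the interval would likely need tuning 
for wide rows.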

Also, as tjake remarked, it is unclear how to update the key cache with that 
proposal. You could cache column positions ('tuples') directly, but that would 
be much less useful. Maybe the key cache could be kept but limited to skinny 
rows. Something to consider anyway.
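
As a minimal sketch of the "key cache limited to skinny rows" idea: the class 
below is purely hypothetical (the name, the max_columns threshold, and the LRU 
eviction policy are all assumptions, not Cassandra's key cache). It simply 
refuses to cache positions for rows above a column-count threshold.

```python
from collections import OrderedDict


class SkinnyRowKeyCache:
    """Hypothetical LRU cache of row key -> data file position, skinny rows only."""

    def __init__(self, capacity: int, max_columns: int):
        self.capacity = capacity        # maximum number of cached keys
        self.max_columns = max_columns  # rows wider than this are never cached
        self._positions = OrderedDict()

    def put(self, key, position: int, column_count: int) -> bool:
        # Wide rows are skipped: they would be served by the promoted index.
        if column_count > self.max_columns:
            return False
        self._positions[key] = position
        self._positions.move_to_end(key)
        # Evict the least recently used entry once over capacity.
        if len(self._positions) > self.capacity:
            self._positions.popitem(last=False)
        return True

    def get(self, key):
        position = self._positions.get(key)
        if position is not None:
            self._positions.move_to_end(key)  # mark as recently used
        return position
```

The open question from the comment remains: this only helps reads of skinny 
rows, and choosing the threshold is one more setting for the user to consider.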


> Promote row index
> -----------------
>
>                 Key: CASSANDRA-2319
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2319
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>              Labels: index, timeseries
>             Fix For: 0.8
>
>
> The row index contains entries for configurably sized blocks of a wide row. 
> For a row of appreciable size, the row index ends up directing the third seek 
> (1. index, 2. row index, 3. content) to nearby the first column of a scan.
> Since the row index is always used for wide rows, and since it contains 
> information that tells us whether or not the 3rd seek is necessary (the 
> column range or name we are trying to slice may not exist in a given 
> sstable), promoting the row index into the sstable index would allow us to 
> drop the maximum number of seeks for wide rows back to 2, and, more 
> importantly, would allow sstables to be eliminated using only the index.
> An example usecase that benefits greatly from this change is time series data 
> in wide rows, where data is appended to the beginning or end of the row. Our 
> existing compaction strategy gets lucky and clusters the oldest data in the 
> oldest sstables: for queries to recently appended data, we would be able to 
> eliminate wide rows using only the sstable index, rather than needing to seek 
> into the data file to determine that it isn't interesting. For narrow rows, 
> this change would have no effect, as they will not reach the threshold for 
> indexing anyway.
> A first cut design for this change would look very similar to the file format 
> design proposed on #674: 
> http://wiki.apache.org/cassandra/FileFormatDesignDoc: row keys clustered, 
> column names clustered, and offsets clustered and delta encoded.
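
For readers unfamiliar with the "offsets clustered and delta encoded" part of 
the description above, here is a minimal sketch of delta encoding a sorted 
list of file offsets. The function names and sample offsets are illustrative 
assumptions; the actual on-disk format from the design doc is not reproduced 
here.

```python
def delta_encode(offsets):
    """Store each offset as the difference from the previous one."""
    deltas = []
    previous = 0
    for offset in offsets:
        deltas.append(offset - previous)
        previous = offset
    return deltas


def delta_decode(deltas):
    """Rebuild absolute offsets by accumulating the differences."""
    offsets = []
    total = 0
    for delta in deltas:
        total += delta
        offsets.append(total)
    return offsets


# Hypothetical block offsets within a data file (assumption).
block_offsets = [4096, 8192, 12400]
encoded = delta_encode(block_offsets)
decoded = delta_decode(encoded)
```

Because clustered offsets are increasing and close together, the deltas are 
small numbers, which is what makes them cheaper to store with a 
variable-length integer encoding.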

