[ https://issues.apache.org/jira/browse/CASSANDRA-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163978#comment-13163978 ]
Rick Branson edited comment on CASSANDRA-3581 at 12/6/11 11:47 PM:
-------------------------------------------------------------------

{quote}
I don't see any drawback to putting this [the minimum/maximum column names] in the metadata/statistics component, which would keep backwards compatibility headaches down.
{quote}

This is just my naiveté showing through, as I wasn't aware of that component. From a conceptual perspective, any metadata storage for the SSTable would work, and since this is purely an optional optimization, that makes sense.

{quote}
Right, and the problem with that is you can't know if the row has a tombstone without looking up the row and reading its header, which is a large part of the overhead of reading the entire row. So unless we also add a "sstable contains row tombstones" flag to our metadata we're screwed. Tracking that flag is not a problem per se, but it would narrow the usefulness of the optimization significantly if it can only be applied when there have been no row deletes in the entire sstable.
{quote}

Nullifying the minimum and maximum column name fields has the effect of flagging the SSTable as containing row tombstones.

> Optimize RangeSlice operations for append-mostly use cases
> ----------------------------------------------------------
>
>          Key: CASSANDRA-3581
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-3581
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Rick Branson
>     Assignee: Rick Branson
>     Priority: Minor
>      Fix For: 1.1
>
>
> Currently, to perform a slice or count with a SliceRange, all of the SSTables containing the requested row must be interrogated to determine whether they contain matching column names. SliceRange operations on wide rows whose columns are distributed across many SSTable files can turn into a relatively expensive operation involving many disk seeks. In time-series use cases such as the one highlighted below, most of this I/O ends up merely ruling SSTables out.
>
> This optimization would require two values to be added to the SSTable header: the minimum and maximum column names (according to the CF comparator) across all rows (including tombstones) within the SSTable. For SliceRange operations, SSTables containing rows with column names entirely outside of the SliceRange would be eliminated without a single disk operation.
>
> Rationale: a very common use case for Cassandra is to use a column family to store time-series data, with a row for each metric and a column for each data point whose column name is a TimeUUID. Data is typically read with a bounded time range using a SliceRange.
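A minimal sketch of the two ideas above (illustrative only, not Cassandra's actual code; the class and method names are invented, real Cassandra orders column names via the CF comparator rather than raw byte comparison, and the bounds would live in the metadata/statistics component):

```python
class SSTableStats:
    """Hypothetical per-SSTable metadata holding column-name bounds."""

    def __init__(self, min_column=None, max_column=None):
        # Smallest/largest column names (per the comparator) across all
        # rows in the SSTable, or None when the bounds are unknown.
        self.min_column = min_column
        self.max_column = max_column

    def nullify_bounds(self):
        # Written when a row tombstone lands in the SSTable: losing the
        # bounds doubles as the "contains row tombstones" flag, so no
        # separate flag field is needed.
        self.min_column = None
        self.max_column = None

    def can_skip_for_slice(self, slice_start, slice_finish):
        # Nullified bounds mean the SSTable may hold row tombstones and
        # must always be read.
        if self.min_column is None or self.max_column is None:
            return False
        # Skip only when the SSTable's column-name range lies entirely
        # before or entirely after the requested slice.
        return self.max_column < slice_start or self.min_column > slice_finish


# Time-series style column names (ISO dates sort correctly as bytes).
october = SSTableStats(b"2011-10-01", b"2011-10-31")
november = SSTableStats(b"2011-11-03", b"2011-11-28")

# Slicing November: the all-October SSTable is eliminated without I/O.
assert october.can_skip_for_slice(b"2011-11-01", b"2011-11-30")
assert not november.can_skip_for_slice(b"2011-11-01", b"2011-11-30")

# A row tombstone nullifies the bounds, disabling the optimization.
october.nullify_bounds()
assert not october.can_skip_for_slice(b"2011-11-01", b"2011-11-30")
```

This mirrors the trade-off discussed above: one row tombstone anywhere in the SSTable forces every future slice to read it, which is why the optimization mainly pays off for append-mostly, delete-free workloads.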
> For the described use case, any given SSTable within this ColumnFamily will have a tightly bounded range of minimum and maximum column names across all rows, and there will be little overlap between these column name ranges across different SSTable files. Append-mostly column families with serial column names (as ordered by the comparator) on which SliceRange operations are used can benefit from this optimization, and the cost to use cases that do not fall within this group ranges from negligible to non-existent.
>
> Caveat: even a single row tombstone would throw this off completely. From what I can tell, there's no way to skip an SSTable that contains a row tombstone, and there is also no current way to segregate tombstones. Stu had some interesting ideas in CASSANDRA-2498 about segregating tombstones into separate SSTables, but that's for a later time. The light at the end of the tunnel is that the users who benefit from this optimization either do not perform deletes or do them in large batches. These same users would also be able to use slice tombstones instead of row tombstones to preserve the optimized behavior. A full row tombstone would nullify the minimum/maximum values, indicating that the optimization can't be used.
>
> Question for the audience: should there be some kind of cap on the size of the min/max column names kept in the header, to keep the internal bearings greased and everyone honest? Something like 256 bytes seems reasonable to me, and we would just disable the optimization if the column name size exceeds this limit. Is there a way we could, say, store only the most significant 32 bytes for each end of the name range? I can't think of one.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira