[jira] Commented: (CASSANDRA-674) New SSTable Format

T Jake Luciani (JIRA) Wed, 05 Jan 2011 06:28:15 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977776#action_12977776 ]


T Jake Luciani commented on CASSANDRA-674:
------------------------------------------

bq. the metadata is useless on its own. It only becomes useful when it is 
attached to data (a column or to a range), so there is no reason to cache the 
meta- independently of the data.

But above you mention:
{code}
Indexes for individual rows are gone, since the global index allows random 
access to the middle of column families that span Blocks, and Slices allow 
batches of columns to be skipped within a Block.
{code}

^ Wouldn't this be useful to cache in the situation where you only want a 
small range of columns?

----- More questions ----
Roughly how large would the actual chunk be? This is the unit of 
deserialization, right? Or can Avro deserialize only part of a structure?

So if you are doing a range query on a very wide row, how do you know when to 
stop processing chunks? Do you keep going until you hit the sentinel value 
<empty>?
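
To make the range query question concrete, here is a rough sketch of the scan 
loop I am picturing. Everything in it is hypothetical: the Slice class, the 
empty-slice sentinel, and the scan method are made up for illustration and are 
not taken from the patch. It assumes column names are sorted, so the reader 
can stop as soon as it sees a name past the end of the range instead of always 
reading through to the sentinel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: none of these types come from the 674 patch.
final class RangeScanSketch {

    /** Made-up stand-in for a Slice: a sorted run of column names.
     *  An empty run plays the role of the <empty> sentinel. */
    static final class Slice {
        final List<String> columns;

        Slice(List<String> columns) {
            this.columns = columns;
        }

        boolean isSentinel() {
            return columns.isEmpty();
        }
    }

    /** Collects column names in [start, end] (inclusive), stopping at the
     *  first name past 'end' or at the sentinel, whichever comes first. */
    static List<String> scan(List<Slice> slices, String start, String end) {
        List<String> result = new ArrayList<String>();
        for (Slice slice : slices) {
            if (slice.isSentinel()) {
                break; // end of the row: nothing left to read
            }
            for (String name : slice.columns) {
                if (name.compareTo(end) > 0) {
                    return result; // past the range: skip all later slices
                }
                if (name.compareTo(start) >= 0) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Slice> row = Arrays.asList(
            new Slice(Arrays.asList("a", "b", "c")),
            new Slice(Arrays.asList("d", "e", "f")),
            new Slice(Arrays.asList("g", "h")),
            new Slice(Collections.<String>emptyList())); // sentinel
        System.out.println(scan(row, "b", "e")); // prints [b, c, d, e]
    }
}
```

In this sketch the third slice (g-h) is never touched for the range b-e. If 
the real reader can only stop at the sentinel, a range query near the start of 
a very wide row would end up reading the whole row.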





> New SSTable Format
> ------------------
>
>                 Key: CASSANDRA-674
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-674
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 674-v1.diff, perf-674-v1.txt, 
> perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt
>
>
> Various tickets exist due to limitations in the SSTable file format, 
> including #16, #47 and #328. Attached is a proposed design/implementation of 
> a new file format for SSTables that addresses a few of these limitations. The 
> implementation has a bunch of issues/fixmes, which I'll describe in the 
> comments.
> The file format is described in the javadoc for the o.a.c.io.SSTableWriter 
> class, but briefly:
>  * Blocks are opaque (except for their header) so that they can be 
> compressed. The index file contains an entry for the first key in every 
> Block. Blocks contain Slices.
>  * Slices are series of columns with the same parents and (deletion) 
> metadata. They can be used to represent ColumnFamilies or SuperColumns (or a 
> slice of columns at any other depth). A single CF can be split across 
> multiple Slices, which can be split across multiple blocks.
>  * Neither Slices nor Blocks have a fixed size or maximum length, but they 
> each have target lengths which can be stretched and broken by very large 
> columns.
> The most interesting concepts from this patch are:
>  * Block compression is possible (currently using GZIP, which has one bug 
> mentioned in the comments),
>  * Compaction involves merging intersecting Slices from input SSTables. Since 
> large rows will be broken down into multiple slices, only the portions of 
> rows that intersect between tables need to be 
> deserialized/merged/held-in-memory,
>  * Indexes for individual rows are gone, since the global index allows random 
> access to the middle of column families that span Blocks, and Slices allow 
> batches of columns to be skipped within a Block.
>  * Bloom filters for individual rows are gone, and the global filter contains 
> ColumnKeys instead, meaning that a query for a column that doesn't exist in a 
> row that does will often not need to seek to the row.
>  * Metadata (deletion/gc time) and ColumnKeys (key, colname1, colname2...) 
> for columns are defined recursively, so deeply nested slices are possible,
>  * Slices representing a single parent (CF, SC, etc) can have different 
> Metadata, meaning that a tombstone Slice from d-f could sit between Slices 
> containing columns a-c and g-h. This allows for eventually consistent range 
> deletes of columns.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
