[Cassandra Wiki] Update of "Cassandra2474" by JonathanEllis

Apache Wiki Thu, 29 Dec 2011 10:01:53 -0800

The "Cassandra2474" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/Cassandra2474?action=diff&rev1=5&rev2=6

  
  <<TableOfContents(100)>>
  
+ == Background ==
+ 
+ === Supercolumns ===
+ 
+ Cassandra has supported limited nesting of data within a row via SuperColumns 
since its initial release.  A supercolumn is a named container of subcolumns, 
with no other metadata attached to it: unlike data columns, it cannot have a 
timestamp or TTL associated with it.  (The one exception is that supercolumns 
CAN be deleted as a unit, and thus SuperColumns CAN be tombstones.)  Thus, a 
row using supercolumns looks like this:
+ 
+ ||key||||'''supercolumn1'''  ||||||'''supercolumn2'''             ||
+ ||   ||subcolumn1||subcolumn2||subcolumn3||subcolumn4||subcolumn5||
+ 
+ The most common use case for SuperColumns is to represent "materialized 
views" or "precomputed resultsets": each object in the resultset maps to a 
single supercolumn.  This usually takes advantage of the sorting-by-column-name 
to give very performant "slice" lookups for this resultset.  To use a more 
concrete example, we could represent the Twitter timeline as a single 
supercolumn row per user, with the tweets made by that user's friends 
represented as supercolumns within that row.  The supercolumn names will be the 
posted_at information, so this lets us get "most recent tweets, in [reverse] 
chronological order" easily:
+ 
+ ||tjefferson||||'''1763'''||||'''1790'''||||'''1818'''||
+ ||          ||body||posted_by||body||posted_by||body||posted_by||
+ ||          ||Democracy will soon degenerate into an anarchy||jadams||To be prepared for war is one of the most effectual means of preserving peace||gwashington||Revolution was effected before the war commenced||jadams||
+ ||bfranklin ||||'''1781'''|| || || || ||
+ ||          ||body||posted_by|| || || || ||
+ ||          ||Every government degenerates when trusted to the rulers of the people alone||tjefferson|| || || || ||
+ 
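The slice lookup described above can be sketched in Python. This is only an in-memory illustration, assuming a row is modelled as a dictionary of supercolumn name to subcolumns; `tjefferson_row` and `recent_tweets` are hypothetical names, not part of any Cassandra API.

```python
# Hypothetical in-memory model of one supercolumn row (not Cassandra's API):
# supercolumn name (posted_at) -> {subcolumn name -> value}
tjefferson_row = {
    1763: {"body": "Democracy will soon degenerate into an anarchy",
           "posted_by": "jadams"},
    1790: {"body": "To be prepared for war is one of the most effectual "
                   "means of preserving peace",
           "posted_by": "gwashington"},
    1818: {"body": "Revolution was effected before the war commenced",
           "posted_by": "jadams"},
}


def recent_tweets(row, limit):
    """Return up to `limit` (posted_at, subcolumns) pairs, newest first."""
    # Cassandra keeps column names sorted on disk; this sketch sorts on read.
    names = sorted(row, reverse=True)
    return [(name, row[name]) for name in names[:limit]]


latest = recent_tweets(tjefferson_row, 2)
```

Because the supercolumn names are the posted_at values, "most recent first" is just a reversed walk over the sorted names.
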
+ === Composite columns ===
+ 
+ SuperColumns have a number of limitations, most notably
+ 
+  1. there can only ever be a single level of nesting
+  1. to read any subcolumn from supercolumn X, all of X is read into memory
+  1. they add a lot of complexity to the Cassandra implementation and cause a 
fair number of bugs
+ 
+ To address these problems, Cassandra added the CompositeType, which encodes a 
multi-value column name into a single column -- essentially the column name 
becomes a Tuple, for those with a background in Python.  I will use Python 
tuple representation (x, y, z) to denote a composite column with components x, 
y, and z.
+ 
+ Composite columns are flexible enough that there are multiple ways to encode 
the same data.  The most natural ways to encode the above timeline data are, 
first, an encoding where each object becomes a single column:
+ 
+ ||tjefferson||(1763, 'Democracy will soon degenerate into an anarchy', 'jadams')||(1790, 'To be prepared for war is one of the most effectual means of preserving peace', 'gwashington')||(1818, 'Revolution was effected before the war commenced', 'jadams')||
+ ||bfranklin ||(1781, 'Every government degenerates when trusted to the rulers of the people alone', 'tjefferson')|| || ||
+ 
+ Note that a CompositeType definition includes type information -- (int, utf8, 
utf8) here -- but there is no column name information; this is governed purely 
by application convention.
+ 
+ The main drawback to this representation is that like row keys, column names 
are necessarily immutable in Cassandra.  So there is no way to update an object 
using this representation other than by deleting the old and adding the new.  
More subtly, this exposes us to some of the drawbacks of a pure key/value 
approach that normal Cassandra columns avoid: if one client updates field X in 
a result, while another client updates field Y, there will be no race when X 
and Y are distinct columns.  But if these fields are stored as part of the same 
composite column then there is a race.
+ 
+ Another way to encode this data addresses these drawbacks by splitting 
updateable fields into separate composite columns:
+ 
+ ||tjefferson||(1763, body)||(1763, posted_by)||(1790, body)||(1790, posted_by)||(1818, body)||(1818, posted_by)||
+ ||          ||Democracy will soon degenerate into an anarchy||jadams||To be prepared for war is one of the most effectual means of preserving peace||gwashington||Revolution was effected before the war commenced||jadams||
+ ||bfranklin ||(1781, body)||(1781, posted_by)|| || || || ||
+ ||          ||Every government degenerates when trusted to the rulers of the people alone||tjefferson|| || || || ||
+ 
+ For lack of better terms, we have been calling these "dense" and "sparse" 
composite column encodings.
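To make the difference concrete, here is a minimal Python sketch of the two encodings, assuming a row is modelled as a dictionary keyed by composite column name (a tuple). The helpers `update_dense` and `update_sparse` are hypothetical and only illustrate why the dense form needs delete-and-reinsert while the sparse form can overwrite a single field.

```python
# Hypothetical model: a row maps composite column names (tuples) to values.
# Dense: the whole object lives in the column name; the value is unused.
dense_row = {
    (1763, "Democracy will soon degenerate into an anarchy", "jadams"): b"",
}
# Sparse: one column per updateable field, named (posted_at, field).
sparse_row = {
    (1763, "body"): "Democracy will soon degenerate into an anarchy",
    (1763, "posted_by"): "jadams",
}


def update_dense(row, posted_at, field_index, new_value):
    """Column names are immutable: delete the old column, add a new one."""
    old_name = next(name for name in row if name[0] == posted_at)
    new_name = (old_name[:field_index] + (new_value,)
                + old_name[field_index + 1:])
    row[new_name] = row.pop(old_name)
    return new_name


def update_sparse(row, posted_at, field, new_value):
    """Only the named field is overwritten; sibling fields are untouched."""
    row[(posted_at, field)] = new_value


new_name = update_dense(dense_row, 1763, 2, "tjefferson")
update_sparse(sparse_row, 1763, "posted_by", "tjefferson")
```

In the dense case two clients changing different fields both replace the same column, which is the race described above; in the sparse case they write to different columns.
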
+ 
+ === DynamicCompositeType ===
+ 
+ DynamicCompositeType (DCT) has no fixed type information or field count: each 
component of the composite column name also carries its type name (encoded as 
a UTF-8 string).  Currently, this allows rows within the same ColumnFamily to 
have different kinds of data in them.  In the future, this will also allow 
different kinds of data within the same row 
(https://issues.apache.org/jira/browse/CASSANDRA-3625).
+ 
  == Goals ==
  
  Primary: provide a CQL syntax for updating and querying composite column 
families.
  
  Secondary goal: proposed syntax should be implementable by the Hive driver 
with the minimum of changes from mainline Hive.  In particular, changes to the 
Hive parser are too difficult to maintain long-term and are Right Out.  We 
would prefer to avoid changes to the Hive metastore but this is doable if 
necessary.
  
- Tertiary goal: it would be nice to also support supercolumns
+ Tertiary goal: it would be nice to support supercolumns as well as composite 
columns
  
  == Non-goals ==
  
- Supporting arbitrarily-and-non-uniformly nested "document" data is a 
non-goal.  https://issues.apache.org/jira/browse/CASSANDRA-3647 is created to 
follow up on this related problem.
+ Supporting DynamicCompositeType or other arbitrarily-and-non-uniformly nested 
"document" data is a non-goal.  
https://issues.apache.org/jira/browse/CASSANDRA-3647 is created to follow up on 
this related problem.
  
  == Alpha ==
  
@@ -95, +145 @@

  
  Discussion starts 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13171304&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171304|here]]
  
+ Gamma can represent both dense and sparse composite types; fields included in 
the PRIMARY KEY definition will be represented as part of the composite column 
"prefix" with a dense encoding:
  {{{
+ -- the "dense" encoding shown above in the Background section
- -- "column" and "value" are sparse; a transposed row will be stored as
- -- two columns of (posted_at, 'column': string) and (posted_at, 'value': 
blob),
- -- with C* row key of user_id
  CREATE TABLE timeline (
      user_id int,
      posted_at uuid,
-     column string,
+     body string,
-     value blob,    
+     posted_by string,    
+     PRIMARY KEY(user_id, posted_at, body, posted_by)
+ ) TRANSPOSED;
+ 
+ -- the "sparse" encoding
+ CREATE TABLE timeline (
+     user_id int,
+     posted_at uuid,
+     body string,
+     posted_by string,    
      PRIMARY KEY(user_id, posted_at)
  ) TRANSPOSED;
  }}}
  
+ Non-String column names are also accounted for:
  {{{
- -- entire transposed row is stored as a single dense composite column
- -- (ts1, cat, subcat, 1337, 92d21d0a-...: []) with a C* row key of series.  
- -- Note that the composite column's value is unused in this case.
  CREATE TABLE events (
      series text,
      ts1 int,
@@ -125, +181 @@

  }}}
  === Examples ===
  
- `SELECT`, `INSERT`, and `UPDATE` syntax require no changes.  Some examples:
+ `SELECT`, `INSERT`, and `UPDATE` syntax requires no changes.  Some examples, 
using the timeline data from the Background section above:
  
  {{{
  INSERT INTO timeline (user_id, posted_at, posted_by, body)
@@ -141, +197 @@

  VALUES ('bfranklin', '1781', 'tjefferson', 'Every government degenerates when 
trusted to the rulers of the people alone');
  }}}
  
- The corresponding data model would look like:
- 
- ||'''user_id'''||'''posted_at'''||'''posted_by'''||'''body'''||
- ||tjefferson||1818||jadams||Revolution was effected before the war commenced||
- ||tjefferson||1763||jadams||Democracy will soon degenerate into an anarchy||
- ||bfranklin||1781||tjefferson||Every government degenerates when trusted to 
the rulers of the people alone||
- ||tjefferson||1790||gwashington||To be prepared for war is one of the most 
effectual means of preserving peace||
- 
- In "raw" form this would look like:
- 
- ||tjefferson||(1763, 'body'): Democracy will soon degenerate into an 
anarchy||(1763, 'posted_by'): jadams||(1790, 'body'): To be prepared for war is 
one of the most effectual means of preserving peace||(1790, 'posted_by'): 
gwashington||(1818, 'body'): Revolution was effected before the war 
commenced||(1818, 'posted_by'): jadams||
- ||bfranklin||(1781, 'body'): Every government degenerates when trusted to the 
rulers of the people alone||(1781, 'posted_by'): tjefferson||||||||||
- 
- And an example `SELECT`:
+ An example `SELECT`:
  
  {{{
  SELECT * FROM timeline WHERE user_id = 'tjefferson' AND posted_at > 1770;
@@ -168, +211 @@

  
  Only minimal CQL changes are required.  The Hive metastore would need to be 
updated to understand the TRANSPOSED syntax.  Normal SELECTs and UPDATEs are 
supported, including "SELECT *," a weakness of the Beta proposals.
  
- With the addition of the PRIMARY KEY syntax, this allows for specifying both 
"sparse" and "dense" data layouts, without the SPARSE keyword that some found 
unappealing.  It also improves conceptual integrity with existing C* practice, 
namely, that row keys are not update-able.  So, the tradeoff is 
straightforward: include a column in the PRIMARY KEY if you want it to be part 
of the positional CompositeType tuple (and be more space efficient); leave it 
out if you want to update it.
+ The PRIMARY KEY syntax allows for specifying both "sparse" and "dense" data 
layouts, without the SPARSE keyword that some found unappealing.  It also 
improves conceptual integrity with existing C* practice, namely, that row keys 
are not update-able.  So, the tradeoff is straightforward: include a column in 
the PRIMARY KEY if you want it to be part of the positional CompositeType tuple 
(and be more space efficient); leave it out if you want to update it.
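The tradeoff can be illustrated with a short Python sketch of how a logical row might map onto storage cells under this proposal. It assumes the first PRIMARY KEY field is the C* row key and the remaining PRIMARY KEY fields form the composite prefix; `to_cells` is a hypothetical helper, not an existing Cassandra function.

```python
def to_cells(primary_key, row):
    """Map one logical row to (row_key, composite_name, value) cells.

    Assumption: primary_key[0] is the row key and the other primary-key
    fields form the composite prefix. Each field outside the primary key
    becomes its own column named prefix + (field,). If every field is in
    the primary key, a single column with an unused value is produced.
    """
    row_key = row[primary_key[0]]
    prefix = tuple(row[field] for field in primary_key[1:])
    rest = sorted(field for field in row if field not in primary_key)
    if not rest:
        return [(row_key, prefix, b"")]
    return [(row_key, prefix + (field,), row[field]) for field in rest]


tweet = {
    "user_id": "tjefferson",
    "posted_at": 1763,
    "body": "Democracy will soon degenerate into an anarchy",
    "posted_by": "jadams",
}
dense = to_cells(["user_id", "posted_at", "body", "posted_by"], tweet)
sparse = to_cells(["user_id", "posted_at"], tweet)
```

Here `dense` holds one cell whose name carries every field, while `sparse` holds one cell per non-key field, each of which can be overwritten independently.
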
  
  This also allows supporting SuperColumns, should we choose to do so.
