Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "CassandraLimitations" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/CassandraLimitations?action=diff&rev1=29&rev2=30

Comment:
link composite columns

  = Limitations =
- 
  == Stuff that isn't likely to change ==
   * All data for a single row must fit (on disk) on a single machine in the 
cluster. Because row keys alone are used to determine the nodes responsible for 
replicating their data, the amount of data associated with a single key has 
this upper bound.
   * A single column value may not be larger than 2GB.  (However, a value is 
read into memory in its entirety when requested, so in practice a limit of a 
few MB is more appropriate.)
@@ -9, +8 @@

   * Keys and column names must each be under 64KB.
  
  == Artifacts of the current code base ==
-  * Cassandra has two levels of indexes: key and column.  But in super 
columnfamilies there is a third level of subcolumns; these are not indexed, and 
any request for a subcolumn deserializes _all_ the subcolumns in that 
supercolumn.  So you want to avoid a data model that requires large numbers of 
subcolumns.  https://issues.apache.org/jira/browse/CASSANDRA-598 is open to 
remove this limitation.
 +  * Cassandra has two levels of indexes: key and column.  But in super 
 columnfamilies there is a third level of subcolumns; these are not indexed, and 
 any request for a subcolumn deserializes _all_ the subcolumns in that 
 supercolumn.  So you want to avoid a data model that requires large numbers of 
 subcolumns. 
 [[http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1|Composite columns]] 
 do not have this limitation (see the first sketch after this list).
   * <<Anchor(streaming)>>Cassandra's public API is based on Thrift, which 
offers no streaming abilities -- any value written or fetched has to fit in 
memory.  This is inherent to Thrift's design and is therefore unlikely to 
change.  So adding large object support to Cassandra would need a special API 
that manually splits large objects into pieces.  A potential approach is 
described in http://issues.apache.org/jira/browse/CASSANDRA-265.  As a 
workaround in the meantime, you can manually split files into chunks of 
whatever size you are comfortable with -- at least one person is using 64MB -- 
and make each file correspond to a row, with the chunks as column values (see 
the second sketch after this list).
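 
 As a rough illustration of the composite columns alternative above: instead of 
 row -> supercolumn -> subcolumn, the two name levels can be packed into a 
 single composite column name in a regular column family.  This sketch assumes 
 the CompositeType byte layout described in the linked post (2-byte big-endian 
 length, component bytes, end-of-component byte); the helper names here are 
 illustrative only, not part of any client API.
 
 {{{#!python
 import struct
 
 def pack_component(data):
     # One CompositeType component: a 2-byte big-endian length,
     # the raw bytes, then a zero end-of-component byte.
     return struct.pack(">H", len(data)) + data + b"\x00"
 
 def composite_name(*parts):
     # Pack UTF-8 string parts into one composite column name.
     return b"".join(pack_component(p.encode("utf-8")) for p in parts)
 
 # row -> supercolumn -> subcolumn becomes a single column whose name
 # encodes both levels, stored in a regular (non-super) column family:
 column_name = composite_name("user123", "email")
 }}}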
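 
 And a minimal sketch of the chunking workaround, assuming a pycassa-style 
 client; the keyspace and column family names ('MyKeyspace', 'Files') are 
 hypothetical.  Zero-padded chunk names keep the columns sorted in the 
 original order.
 
 {{{#!python
 import pycassa
 
 CHUNK_SIZE = 64 * 1024 * 1024  # 64MB chunks, as mentioned above
 
 pool = pycassa.ConnectionPool('MyKeyspace')   # hypothetical keyspace
 files = pycassa.ColumnFamily(pool, 'Files')   # hypothetical standard CF
 
 def store_file(path):
     # One row per file; one column per chunk.  Zero-padded column
     # names sort correctly under a bytes or UTF-8 comparator.
     f = open(path, 'rb')
     try:
         index = 0
         while True:
             chunk = f.read(CHUNK_SIZE)
             if not chunk:
                 break
             files.insert(path, {'chunk-%08d' % index: chunk})
             index += 1
     finally:
         f.close()
 
 def read_file(path):
     # xget() pages lazily through all columns in comparator order.
     return ''.join(value for _, value in files.xget(path))
 }}}
 
 Note that each chunk still has to fit in memory on both client and server, and 
 the server's Thrift message size limits must be large enough to accept it.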
  
  == Obsolete Limitations ==
