Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "CassandraLimitations" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/CassandraLimitations?action=diff&rev1=32&rev2=33

  == Artifacts of the current code base ==
   * <<Anchor(streaming)>>Cassandra's public API is based on Thrift, which 
offers no streaming abilities -- any value written or fetched has to fit in 
memory.  This is inherent to Thrift's design and is therefore unlikely to 
change.  Adding large object support to Cassandra would therefore require a 
special API that manually splits large objects into pieces.  A potential 
approach is described in http://issues.apache.org/jira/browse/CASSANDRA-265.  
As a workaround in the meantime, you can manually split files into chunks of 
whatever size you are comfortable with -- at least one person is using 64MB 
-- and make each file correspond to a row, with the chunks as column values 
(a minimal sketch follows below).
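A minimal sketch of this chunking workaround, assuming a hypothetical 
store_chunk(row_key, column_name, value) helper that wraps the Thrift insert 
call for whichever Cassandra version you run (the actual Thrift signatures 
differ between versions):

{{{#!python
# Sketch of the large-file workaround: one row per file, one column per chunk.
CHUNK_SIZE = 64 * 1024 * 1024  # 64MB, the chunk size mentioned above

def store_file(store_chunk, path):
    """Split the file at `path` into CHUNK_SIZE pieces and write each piece
    as a separate column in a single row keyed by the file name.

    `store_chunk(row_key, column_name, value)` is a hypothetical helper that
    wraps the Thrift insert call; it is not part of Cassandra's API."""
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            # Zero-padded column names keep the chunks ordered when read back.
            store_chunk(path, "chunk-%08d" % index, chunk)
            index += 1
}}}

Reading the file back is the reverse: slice the row's columns in order and 
concatenate the chunk values.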
  
- == Obsolete Limitations ==
-  * Prior to version 0.7, Cassandra's compaction code deserialized an entire 
row (per columnfamily) at a time.  So all the data from a given 
columnfamily/key pair had to fit in memory, or 2GB, whichever was smaller 
(since the length of the row was serialized as a Java int).
-  * Prior to version 0.7, Thrift would crash Cassandra if you sent random or 
malicious data to it.  This made exposing the Cassandra port directly to the 
outside internet a Bad Idea.
-  * Prior to version 0.4, Cassandra did not fsync the commitlog before acking 
a write.  Most of the time this was Good Enough when writing to multiple 
replicas, since the odds of all replicas dying before the data actually hits 
the disk are slim, but the truly paranoid will want real fsync-before-ack.  
This is now an option.
- 
