Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.
The "CassandraLimitations" page has been changed by StuHood. http://wiki.apache.org/cassandra/CassandraLimitations?action=diff&rev1=9&rev2=10 -------------------------------------------------- == Artifacts of the current code base == * The byte[] size of a value can't be more than 2^31-1. * Cassandra's compaction code currently deserializes an entire row (per columnfamily) at a time. So all the data from a given columnfamily/key pair must fit in memory. Fixing this is relatively easy since columns are stored in-order on disk so there is really no reason you have to deserialize row-at-a-time except that that is easier with the current encapsulation of functionality. This will be fixed in https://issues.apache.org/jira/browse/CASSANDRA-16 + * A related limitation is that an entire row cannot be larger than 2^31-1 bytes, since the length of rows is serialized to disk using an integer. * Cassandra has two levels of indexes: key and column. But in super columnfamilies there is a third level of subcolumns; these are not indexed, and any request for a subcolumn deserializes _all_ the subcolumns in that supercolumn. So you want to avoid a data model that requires large numbers of subcolumns. https://issues.apache.org/jira/browse/CASSANDRA-598 is open to remove this limitation. * <<Anchor(streaming)>>Cassandra's public API is based on Thrift, which offers no streaming abilities -- any value written or fetched has to fit in memory. This is inherent to Thrift's design; I don't see it changing. So adding large object support to Cassandra would need a special API that manually split the large objects up into pieces. Jonathan Ellis sketched out one approach in http://issues.apache.org/jira/browse/CASSANDRA-265. As a workaround in the meantime, you can manually split files into chunks of whatever size you are comfortable with -- at least one person is using 64MB -- and making a file correspond to a row, with the chunks as column values. * Thrift will crash Cassandra if you send random or malicious data to it. This makes exposing the Cassandra port directly to the outside internet a Bad Idea. See http://issues.apache.org/jira/browse/CASSANDRA-475 and http://issues.apache.org/jira/browse/THRIFT-601 for details.