Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "FileFormatDesignDoc" page has been changed by StuHood.
The comment on this change is: Remove field-ordered section: not coming any time soon, and arguably not beneficial in a column-family oriented store.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=35&rev2=36

--------------------------------------------------

  || 4.9 || 1 ||
  || china || 0 ||

+ The parent change flags can be represented compactly using a bitmap, and type information can be stored using a byte per value.
- The parent change flag and type information can be represented compactly using a bitmap.
-
- === Field reordering ===
-
- ** NB: field reordering will likely not be implemented in initial versions of the format **
-
- One weakness of the implementation so far is that it preserves the order of tuples within a level. This approach performs well for wide rows with high field cardinality, since adding compression is unlikely to remove data.
-
- But since we have domain knowledge that a compression algorithm would not, it will often be more efficient to perform reordering by ourselves, particularly when a chunk has low cardinality: for example at the "name2" level above. By assigning the chunk an ordering of ''self'' (as opposed to ''parent''), we can store the fields in sorted order (rather than in ''parent''-sorted order) and remove duplicates.
-
- || ''name2'' ||
- || flavor ||
- || origin ||
-
- More importantly, a ''self''-ordered chunk should influence the order of tuples in child chunks. When we encounter an ''self''-ordered chunk at level "name2", we should expect its children in level "value" to be arranged as follows:
-
- || ''value'' || ''parent_change'' ||
- || 3.4 || 1 ||
- || 5.6 || 1 ||
- || 2.6 || 1 ||
- || 4.2 || 1 ||
- || 4.9 || 1 ||
- || || 0 ||
- || france || 1 ||
- || || 0 ||
- || || 0 ||
- || china || 1 ||
-
- The ''parent_change'' field is now a bitmap representing nulls: it indicates that all parents have a 'flavor' tuple, but only the second and fifth parents have an 'origin' tuple. This representation is ripe for compression.
-
- === Summary ===
-
- A (simplified) representation of the span so far (without metadata) is:
-
- ''(parent-ordered)''
- || ''row key'' || ''parent_change'' ||
- || cheese || 0 ||
- || fruit || 0 ||
- ''(parent-ordered)''
- || ''name1'' || ''parent_change'' ||
- || brie || 0 ||
- || gouda || 0 ||
- || swiss || 0 ||
- || apple || 1 ||
- || pear || 1 ||
- ''(self-ordered)''
- || ''name2'' ||
- || flavor ||
- || origin ||
- ''(parent-ordered)''
- || ''value'' || ''parent_change'' ||
- || 3.4 || 1 ||
- || 5.6 || 1 ||
- || 2.6 || 1 ||
- || 4.2 || 1 ||
- || 4.9 || 1 ||
- || || 0 ||
- || france || 1 ||
- || || 0 ||
- || || 0 ||
- || china || 1 ||

  == Metadata ==