Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "FileFormatDesignDoc" page has been changed by StuHood.
The comment on this change is: Remove field-ordered section: not coming any 
time soon, and arguably not beneficial in a column-family oriented store.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=35&rev2=36

--------------------------------------------------

  || 4.9 || 1 ||
  || china || 0 ||
  
+ The parent change flags can be represented compactly using a bitmap, and type 
information can be stored using a byte per value.
- The parent change flag and type information can be represented compactly 
using a bitmap.
- 
- === Field reordering ===
- 
- ** NB: field reordering will likely not be implemented in initial versions of 
the format **
- 
- One weakness of the implementation so far is that it preserves the order of 
tuples within a level. This approach performs well for wide rows with high 
field cardinality, since adding compression is unlikely to remove data.
- 
- But since we have domain knowledge that a compression algorithm would not, it 
will often be more efficient to perform reordering by ourselves, particularly 
when a chunk has low cardinality: for example at the "name2" level above. By 
assigning the chunk an ordering of ''self'' (as opposed to ''parent''), we can 
store the fields in sorted order (rather than in ''parent''-sorted order) and 
remove duplicates.
- 
- || ''name2'' ||
- || flavor ||
- || origin  ||
- 
- More importantly, a ''self''-ordered chunk should influence the order of 
tuples in child chunks. When we encounter a ''self''-ordered chunk at level 
"name2", we should expect its children in level "value" to be arranged as 
follows:
- 
- || ''value'' || ''parent_change'' ||
- || 3.4 || 1 ||
- || 5.6 || 1 ||
- || 2.6 || 1 ||
- || 4.2 || 1 ||
- || 4.9 || 1 ||
- || || 0 ||
- || france || 1 ||
- || || 0 ||
- || || 0 ||
- || china || 1 ||
- 
- The ''parent_change'' field is now a bitmap representing nulls: it indicates 
that all parents have a 'flavor' tuple, but only the second and fifth parents 
have an 'origin' tuple. This representation is ripe for compression.
- 
- === Summary ===
- 
- A (simplified) representation of the span so far (without metadata) is:
- 
- ''(parent-ordered)''
- || ''row key'' || ''parent_change'' ||
- || cheese  || 0 ||
- || fruit   || 0 ||
- ''(parent-ordered)''
- || ''name1''  || ''parent_change'' ||
- || brie || 0 ||
- || gouda || 0 ||
- || swiss || 0 ||
- || apple || 1 ||
- || pear  || 1 ||
- ''(self-ordered)''
- || ''name2'' ||
- || flavor ||
- || origin  ||
- ''(parent-ordered)''
- || ''value'' || ''parent_change'' ||
- || 3.4 || 1 ||
- || 5.6 || 1 ||
- || 2.6 || 1 ||
- || 4.2 || 1 ||
- || 4.9 || 1 ||
- || || 0 ||
- || france || 1 ||
- || || 0 ||
- || || 0 ||
- || china || 1 ||
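
The sentence added in this revision says that parent change flags fit in a bitmap and that type information takes one byte per value. A minimal sketch of that layout follows. The type codes, the function names and the MSB-first bit order are assumptions made for illustration; the design doc does not specify them.

```python
from typing import List, Optional, Union

# Hypothetical one-byte type tags; the design doc does not define these codes.
TYPE_NULL, TYPE_DOUBLE, TYPE_STRING = 0, 1, 2

Value = Optional[Union[float, str]]


def pack_flags(flags: List[int]) -> bytes:
    """Pack 0/1 parent_change flags into a bitmap.

    Assumed bit order: the first flag goes into the most significant bit
    of the first byte; unused trailing bits are left as zero.
    """
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def unpack_flags(data: bytes, count: int) -> List[int]:
    """Recover the first `count` flags from a bitmap built by pack_flags."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(count)]


def type_tags(values: List[Value]) -> bytes:
    """Emit one type byte per value, using the hypothetical codes above."""
    tags = bytearray()
    for value in values:
        if value is None:
            tags.append(TYPE_NULL)
        elif isinstance(value, float):
            tags.append(TYPE_DOUBLE)
        else:
            tags.append(TYPE_STRING)
    return bytes(tags)


# The two rows visible at the top of this diff: (4.9, 1) and (china, 0).
values = [4.9, "china"]
flags = [1, 0]
bitmap = pack_flags(flags)   # one byte holds both flags
tags = type_tags(values)     # two bytes, one per value
```

With this layout the flags cost one bit per tuple, rounded up to a whole byte, while the type tags cost one byte per tuple regardless of the value's size.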
  
  == Metadata ==
  
