[ https://issues.apache.org/jira/browse/CASSANDRA-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853711#action_12853711 ]
Stu Hood edited comment on CASSANDRA-847 at 4/6/10 5:12 AM:
------------------------------------------------------------

Alright, after the hiatus to implement byte[] keys, I'm back on this horse.

> 2. Replace ColumnFamily and SuperColumn with ColumnGroup, implementing IColumn,
> and deleting IColumnContainer.

I don't think that nested structures, each with its own iterator, are a good idea... especially when they may hide the fact that they are fetching columns from disk. And if they are not fetching transparently from disk, how do we make this any more memory-efficient than the current approach? The beauty of the Slice approach is that a List<Slice> can represent any arbitrarily nested structure you can think of, and yet the Slices are still autonomous.

EDIT: Erased an offtopic point.

> 3. Implement new disk format, read + write, but no compaction yet.

I'm not sure how this is supposed to work: is the idea that we would break backwards compatibility in trunk, and then restore it later on in your steps 4, 5, and 6?

was (Author: stuhood):
Alright, after the hiatus to implement byte[] keys, I'm back on this horse.

> 2. Replace ColumnFamily and SuperColumn with ColumnGroup, implementing IColumn,
> and deleting IColumnContainer.

I don't think that nested structures, each with its own iterator, are a good idea... especially when they may hide the fact that they are fetching columns from disk. And if they are not fetching transparently from disk, how do we make this any more memory-efficient than the current approach? The beauty of the Slice approach is that a List<Slice> can represent any arbitrarily nested structure you can think of, and yet the Slices are still autonomous. In the very long term I could imagine a Memtable being implemented as a SortedMap<ColumnKey,Slice>, where mutations are resolved into existing Slices, and each is then atomically swapped in order.

> 3. Implement new disk format, read + write, but no compaction yet.
I'm not sure how this is supposed to work: is the idea that we would break backwards compatibility in trunk, and then restore it later on in your steps 4, 5, and 6?

> Make the reading half of compactions memory-efficient
> -----------------------------------------------------
>
>                 Key: CASSANDRA-847
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-847
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Priority: Critical
>             Fix For: 0.7
>
>         Attachments: 0001-Add-structures-that-were-important-to-the-SSTableSca.patch,
>                      0002-Implement-most-of-the-new-SSTableScanner-interface.patch,
>                      0003-Rename-RowIndexedReader-specific-test.patch,
>                      0004-Improve-Scanner-tests-and-separate-SuperCF-handling-.patch,
>                      0005-Add-Scanner-interface-and-a-Filtered-implementation-.patch,
>                      0006-Add-support-for-compaction-of-super-CFs-and-some-tes.patch,
>                      0007-Remove-ColumnKey-bloom-filter-maintenance.patch,
>                      0008-Make-Scanner-extend-Iterator-again.patch,
>                      0009-Make-CompactionIterator-a-ReducingIterator-subclass-.patch,
>                      0010-Alternative-to-ReducingIterator-that-can-return-mult.patch,
>                      compaction-bench-847.txt, compaction-bench-trunk.txt, compaction-bench.patch
>
> This issue is the next on the road to finally fixing CASSANDRA-16. To make compactions memory-efficient, we have to be able to perform the compaction process on the smallest possible chunks that might intersect and contend with one another, meaning that we need a better abstraction for reading from SSTables.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
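The comment's central claim, that a flat List<Slice> can represent any arbitrarily nested column structure while each Slice stays autonomous, and that a Memtable could eventually be a SortedMap<ColumnKey,Slice>, can be sketched in Java. The ColumnKey and Slice layouts below are illustrative assumptions only, not the code from the attached patches:

```java
import java.util.*;

// A ColumnKey is the full path to a column: row key plus nested names
// (e.g. [superColumn, subColumn] for a super CF, [column] for a standard CF).
// Sketch only; the field layout here is an assumption.
final class ColumnKey implements Comparable<ColumnKey> {
    final String row;
    final String[] names;
    ColumnKey(String row, String... names) { this.row = row; this.names = names; }
    @Override public int compareTo(ColumnKey o) {
        int c = row.compareTo(o.row);
        if (c != 0) return c;
        int n = Math.min(names.length, o.names.length);
        for (int i = 0; i < n; i++) {
            c = names[i].compareTo(o.names[i]);
            if (c != 0) return c;
        }
        return Integer.compare(names.length, o.names.length); // prefix sorts first
    }
}

// A Slice is an autonomous, contiguous run of columns under one parent path.
final class Slice {
    final ColumnKey parent;                                    // e.g. row + super column
    final SortedMap<String, byte[]> columns = new TreeMap<>(); // subcolumn name -> value
    Slice(ColumnKey parent) { this.parent = parent; }
}

public class SliceSketch {
    public static void main(String[] args) {
        // Two super columns in one row, flattened into two autonomous slices.
        Slice s1 = new Slice(new ColumnKey("row1", "sc1"));
        s1.columns.put("a", new byte[]{1});
        s1.columns.put("b", new byte[]{2});
        Slice s2 = new Slice(new ColumnKey("row1", "sc2"));
        s2.columns.put("a", new byte[]{3});

        // The flat list is globally ordered by parent key, so a scanner can
        // stream one slice at a time without materializing the whole row.
        List<Slice> slices = Arrays.asList(s1, s2);
        for (int i = 1; i < slices.size(); i++)
            assert slices.get(i - 1).parent.compareTo(slices.get(i).parent) < 0;

        // The long-term memtable idea from the comment: mutations are resolved
        // into the existing Slice for a key, and the result swapped in atomically.
        SortedMap<ColumnKey, Slice> memtable = new TreeMap<>();
        for (Slice s : slices)
            memtable.put(s.parent, s);
        System.out.println(memtable.size()); // prints 2
    }
}
```

Each Slice is self-contained, so a compaction merge can operate on one slice at a time instead of holding an entire (super) column family in memory, which is the memory-efficiency argument made above.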