[ https://issues.apache.org/jira/browse/CASSANDRA-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853711#action_12853711 ]
Stu Hood edited comment on CASSANDRA-847 at 4/6/10 5:12 AM:
------------------------------------------------------------

Alright, after the hiatus to implement byte[] keys, I'm back on this horse.

> 2. Replace ColumnFamily and SuperColumn with ColumnGroup, implementing IColumn,
> and deleting IColumnContainer.

I don't think that nested structures, each with its own iterator, are a good idea... especially when they may hide the fact that they are fetching columns from disk. And if they are not fetching transparently from disk, how do we make this any more memory-efficient than the current approach? The beauty of the Slice approach is that a List<Slice> can represent any arbitrarily nested structure you can think of, and yet the Slices are still autonomous.

EDIT: Erased an offtopic point.

> 3. Implement new disk format, read + write, but no compaction yet.

I'm not sure how this is supposed to work: is the idea that we would break backwards compatibility in trunk, and then restore it later on in your steps 4, 5, and 6?

was (Author: stuhood):
Alright, after the hiatus to implement byte[] keys, I'm back on this horse.

> 2. Replace ColumnFamily and SuperColumn with ColumnGroup, implementing IColumn,
> and deleting IColumnContainer.

I don't think that nested structures, each with its own iterator, are a good idea... especially when they may hide the fact that they are fetching columns from disk. And if they are not fetching transparently from disk, how do we make this any more memory-efficient than the current approach? The beauty of the Slice approach is that a List<Slice> can represent any arbitrarily nested structure you can think of, and yet the Slices are still autonomous. In the very long term I could imagine a Memtable being implemented as a SortedMap<ColumnKey,Slice>, where mutations are resolved into existing Slices, and each is then atomically swapped in order.

> 3. Implement new disk format, read + write, but no compaction yet.
I'm not sure how this is supposed to work: is the idea that we would break backwards compatibility in trunk, and then restore it later on in your steps 4, 5, and 6?

> Make the reading half of compactions memory-efficient
> -----------------------------------------------------
>
>                 Key: CASSANDRA-847
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-847
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Priority: Critical
>             Fix For: 0.7
>
>         Attachments: 0001-Add-structures-that-were-important-to-the-SSTableSca.patch,
>                      0002-Implement-most-of-the-new-SSTableScanner-interface.patch,
>                      0003-Rename-RowIndexedReader-specific-test.patch,
>                      0004-Improve-Scanner-tests-and-separate-SuperCF-handling-.patch,
>                      0005-Add-Scanner-interface-and-a-Filtered-implementation-.patch,
>                      0006-Add-support-for-compaction-of-super-CFs-and-some-tes.patch,
>                      0007-Remove-ColumnKey-bloom-filter-maintenance.patch,
>                      0008-Make-Scanner-extend-Iterator-again.patch,
>                      0009-Make-CompactionIterator-a-ReducingIterator-subclass-.patch,
>                      0010-Alternative-to-ReducingIterator-that-can-return-mult.patch,
>                      compaction-bench-847.txt, compaction-bench-trunk.txt, compaction-bench.patch
>
> This issue is the next on the road to finally fixing CASSANDRA-16. To make compactions memory-efficient, we have to be able to perform the compaction process on the smallest possible chunks that might intersect and contend with one another, meaning that we need a better abstraction for reading from SSTables.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
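The comment's central claim, that a flat List<Slice> can represent any arbitrarily nested column structure while each Slice stays autonomous, and that a Memtable could eventually be a SortedMap<ColumnKey,Slice>, can be sketched in Java. The ColumnKey and Slice layouts below are illustrative assumptions only, not the code from the attached patches:

```java
import java.util.*;

// A ColumnKey is the full path to a column: row key plus nested names
// (e.g. [superColumn, subColumn] for a super CF, [column] for a standard CF).
// Sketch only; the field layout here is an assumption.
final class ColumnKey implements Comparable<ColumnKey> {
    final String row;
    final String[] names;
    ColumnKey(String row, String... names) { this.row = row; this.names = names; }
    @Override public int compareTo(ColumnKey o) {
        int c = row.compareTo(o.row);
        if (c != 0) return c;
        int n = Math.min(names.length, o.names.length);
        for (int i = 0; i < n; i++) {
            c = names[i].compareTo(o.names[i]);
            if (c != 0) return c;
        }
        return Integer.compare(names.length, o.names.length); // prefix sorts first
    }
}

// A Slice is an autonomous, contiguous run of columns under one parent path.
final class Slice {
    final ColumnKey parent;                                    // e.g. row + super column
    final SortedMap<String, byte[]> columns = new TreeMap<>(); // subcolumn name -> value
    Slice(ColumnKey parent) { this.parent = parent; }
}

public class SliceSketch {
    public static void main(String[] args) {
        // Two super columns in one row, flattened into two autonomous slices.
        Slice s1 = new Slice(new ColumnKey("row1", "sc1"));
        s1.columns.put("a", new byte[]{1});
        s1.columns.put("b", new byte[]{2});
        Slice s2 = new Slice(new ColumnKey("row1", "sc2"));
        s2.columns.put("a", new byte[]{3});

        // The flat list is globally ordered by parent key, so a scanner can
        // stream one slice at a time without materializing the whole row.
        List<Slice> slices = Arrays.asList(s1, s2);
        for (int i = 1; i < slices.size(); i++)
            assert slices.get(i - 1).parent.compareTo(slices.get(i).parent) < 0;

        // The long-term memtable idea from the comment: mutations are resolved
        // into the existing Slice for a key, and the result swapped in atomically.
        SortedMap<ColumnKey, Slice> memtable = new TreeMap<>();
        for (Slice s : slices)
            memtable.put(s.parent, s);
        System.out.println(memtable.size()); // prints 2
    }
}
```

Each Slice is self-contained, so a compaction merge can operate on one slice at a time instead of holding an entire (super) column family in memory, which is the memory-efficiency argument made above.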