[jira] [Issue Comment Edited] (CASSANDRA-2901) Allow taking advantage of multiple cores while compacting a single CF

Jonathan Ellis (JIRA) Mon, 01 Aug 2011 12:55:34 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073682#comment-13073682
 ]


Jonathan Ellis edited comment on CASSANDRA-2901 at 8/1/11 7:54 PM:
-------------------------------------------------------------------

bq. PCI.Reducer.getCompactedRow unwraps NotifyingSSTableIterator

fixed

bq. it doesn't seem like we ever call close() on the SSTableIdentityIterator

added ACR.close to fix

bq. we need to reuse the trick of DebuggableThreadPoolExecutor

fixed

bq. we already queue up rows in the MergeTask executor, so it feels like it 
would be simple to use direct handoff here

Simpler, but then any deserializer that can't deserialize the next row by the 
time the next merge slot is open, will stall the merge.  My thinking was that 
allowing per-deserializer buffers will keep the pipeline full better.

bq. the memory blow up is (potentially) much more than the 2x (compared to 
mono-threaded) in the description of this ticket

It's not so bad, because we can assume N >= 2 sstables, and we restrict each 
deserializer to 1/N of in-memory limit.  So I think we come close to 2x 
overall.  (And if we don't, I'd rather adjust our estimate, than make it less 
performant/more complex. :)

bq. we should probably remove the occurrence in PreCompactedRow as it's still 
multi-threaded while in the MergeTask

refactored PCR constructors to do this.

bq. we create one Comparator<IColumnIterator> each time instead of having a 
private static final one

added RowContainer.comparator

bq. there is a "race" when updating the bytesRead

fixed, and changed bytesRead to volatile


      was (Author: jbellis):
    bq. PCI.Reducer.getCompactedRow unwraps NotifyingSSTableIterator

fixed

bq. it doesn't seem like we ever call close() on the SSTableIdentityIterator

fixed

bq. we need to reuse the trick of DebuggableThreadPoolExecutor

added ACR.close to fix

bq. we already queue up rows in the MergeTask executor, so it feels like it 
would be simple to use direct handoff here

Simpler, but then any deserializer that can't deserialize the next row by the 
time the next merge slot is open, will stall the merge.  My thinking was that 
allowing per-deserializer buffers will keep the pipeline full better.

bq. the memory blow up is (potentially) much more than the 2x (compared to 
mono-threaded) in the description of this ticket

It's not so bad, because we can assume N >= 2 sstables, and we restrict each 
deserializer to 1/N of in-memory limit.  So I think we come close to 2x 
overall.  (And if we don't, I'd rather adjust our estimate, than make it less 
performant/more complex. :)

bq. we should probably remove the occurrence in PreCompactedRow as it's still 
multi-threaded while in the MergeTask

refactored PCR constructors to do this.

bq. we create one Comparator<IColumnIterator> each time instead of having a 
private static final one

added RowContainer.comparator

bq. there is a "race" when updating the bytesRead

fixed, and changed bytesRead to volatile

  
> Allow taking advantage of multiple cores while compacting a single CF
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-2901
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2901
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.8.3
>
>         Attachments: 2901-v2.txt, 2901-v3.txt, 2901.patch
>
>
> Moved from CASSANDRA-1876:
> There are five stages: read, deserialize, merge, serialize, and write. We 
> probably want to continue doing read+deserialize and serialize+write 
> together, or you waste a lot copying to/from buffers.
> So, what I would suggest is: one thread per input sstable doing read + 
> deserialize (a row at a time). A thread pool (one per core?) merging 
> corresponding rows from each input sstable. One thread doing serialize + 
> writing the output (this has to wait for the merge threads to complete 
> in-order, obviously). This should take us from being CPU bound on SSDs (since 
> only one core is compacting) to being I/O bound.
> This will require roughly 2x the memory, to allow the reader threads to work 
> ahead of the merge stage. (I.e. for each input sstable you will have up to 
> one row in a queue waiting to be merged, and the reader thread working on the 
> next.) Seems quite reasonable on that front.  You'll also want a small queue 
> size for the serialize-merged-rows executor.
> Multithreaded compaction should be either on or off. It doesn't make sense to 
> try to do things halfway (by doing the reads with a
> threadpool whose size you can grow/shrink, for instance): we still have 
> compaction threads tuned to low priority, by default, so the impact on the 
> rest of the system won't be very different. Nor do we expect to have so many 
> input sstables that we lose a lot in context switching between reader threads.
> IMO it's acceptable to punt completely on rows that are larger than memory, 
> and fall back to the old non-parallel code there. I don't see any sane way to 
> parallelize large-row compactions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-2901) Allow taking advantage of multiple cores while compacting a single CF

Reply via email to