[ https://issues.apache.org/jira/browse/CASSANDRA-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2901: -------------------------------------- Reviewer: slebresne Assignee: Jonathan Ellis > Allow taking advantage of multiple cores while compacting a single CF > --------------------------------------------------------------------- > > Key: CASSANDRA-2901 > URL: https://issues.apache.org/jira/browse/CASSANDRA-2901 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Jonathan Ellis > Assignee: Jonathan Ellis > Priority: Minor > Fix For: 0.8.3 > > Attachments: 2901-v2.txt, 2901.patch > > > Moved from CASSANDRA-1876: > There are five stages: read, deserialize, merge, serialize, and write. We > probably want to continue doing read+deserialize and serialize+write > together, or you waste a lot copying to/from buffers. > So, what I would suggest is: one thread per input sstable doing read + > deserialize (a row at a time). A thread pool (one per core?) merging > corresponding rows from each input sstable. One thread doing serialize + > writing the output (this has to wait for the merge threads to complete > in-order, obviously). This should take us from being CPU bound on SSDs (since > only one core is compacting) to being I/O bound. > This will require roughly 2x the memory, to allow the reader threads to work > ahead of the merge stage. (I.e. for each input sstable you will have up to > one row in a queue waiting to be merged, and the reader thread working on the > next.) Seems quite reasonable on that front. You'll also want a small queue > size for the serialize-merged-rows executor. > Multithreaded compaction should be either on or off. It doesn't make sense to > try to do things halfway (by doing the reads with a > threadpool whose size you can grow/shrink, for instance): we still have > compaction threads tuned to low priority, by default, so the impact on the > rest of the system won't be very different. Nor do we expect to have so many > input sstables that we lose a lot in context switching between reader threads. > IMO it's acceptable to punt completely on rows that are larger than memory, > and fall back to the old non-parallel code there. I don't see any sane way to > parallelize large-row compactions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira