[ https://issues.apache.org/jira/browse/CASSANDRA-12269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
T Jake Luciani updated CASSANDRA-12269:
---------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Nits addressed and CI runs clean.

[testall|http://cassci.datastax.com/view/Dev/view/tjake/job/tjake-write-perf-testall/lastCompletedBuild/testReport/]
[dtest|http://cassci.datastax.com/view/Dev/view/tjake/job/tjake-write-perf2-dtest/lastCompletedBuild/testReport/]

Committed to trunk as {{dc9ed463417aa8028e77e91718e4f3d6ea563210}}.

> Faster write path
> -----------------
>
>                 Key: CASSANDRA-12269
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12269
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>             Fix For: 3.10
>
>
> The new storage engine (CASSANDRA-8099) has caused a regression in write
> performance. This ticket is to address it and bring 3.0 as close to 2.2 as
> possible. After much toil, I have found four main reasons for the
> regression:
>
> 1. The cost of calculating the size of a serialized row is higher now, since
> we no longer have the cell name and value managed as ByteBuffers as we did
> pre-3.0. That means we currently serialize the row twice: once to calculate
> the size and once to write the data. This happens during the SSTable writes
> and was addressed in CASSANDRA-9766.
> Double serialization is also happening in the CommitLog and the
> MessagingService. We need to apply the same techniques to these as we did to
> the SSTable serialization.
>
> 2. Even after fixing (1), there is still more GC pressure and CPU usage in
> 3.0, because we encode everything from the {{Column}} to the {{Row}} to the
> {{Partition}} as a {{BTree}}. Specifically, the {{BTreeSearchIterator}} is
> used for all iterator() methods. Both these classes are useful for efficient
> removal and searching of the trees, but in the case of SerDe we almost always
> want to simply walk the entire tree forwards or in reverse and apply a
> function to each element. To that end, we can use lambdas and do this
> without any extra classes.
>
> 3. We use a lot of thread locals and check them constantly on the read/write
> paths, for client warnings, tracing, temp buffers, etc. We should move all
> thread locals to FastThreadLocals and threads to FastThreadLocalThreads.
>
> 4. We changed the memtable flusher defaults in 3.2, which caused a
> regression; see CASSANDRA-12228.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
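
Point 2 of the quoted description suggests walking a tree forwards or in reverse and applying a function to each element, rather than going through a search iterator. The following is only a minimal sketch of that general idea, not the committed patch and not Cassandra's {{BTree}}: the class {{SimpleTree}} and its {{apply}} method are hypothetical names invented for illustration, and a plain binary tree stands in for the real structure.

```java
import java.util.function.Consumer;

// Hypothetical stand-in for a sorted tree; this is NOT Cassandra's BTree.
final class SimpleTree<T> {
    private final SimpleTree<T> left;
    private final T value;
    private final SimpleTree<T> right;

    SimpleTree(SimpleTree<T> left, T value, SimpleTree<T> right) {
        this.left = left;
        this.value = value;
        this.right = right;
    }

    static <T> SimpleTree<T> leaf(T value) {
        return new SimpleTree<>(null, value, null);
    }

    // Visit every element in order (or in reverse order) and hand it to fn.
    // The traversal itself creates no iterator or cursor object; the only
    // state lives on the call stack.
    void apply(Consumer<? super T> fn, boolean reversed) {
        SimpleTree<T> first = reversed ? right : left;
        SimpleTree<T> last = reversed ? left : right;
        if (first != null) {
            first.apply(fn, reversed);
        }
        fn.accept(value);
        if (last != null) {
            last.apply(fn, reversed);
        }
    }
}
```

A caller that serializes elements would pass the write step as the lambda, for example {{tree.apply(v -> out.add(v), false)}} for a forward walk, and pass {{true}} for a reverse walk. Whether this saves allocations in practice depends on the lambda being non-capturing or cheap to capture, which is an assumption here and not something this message states.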