[ 
https://issues.apache.org/jira/browse/CASSANDRA-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034214#comment-15034214
 ] 

Ariel Weisberg commented on CASSANDRA-8639:
-------------------------------------------

bq. This isn't related to the ticket but maybe we should fix it as well; I 
don't see anyplace we wait for the replay futures to complete before we finish 
recover().
Both 2.1 code and your patch will exit early before the futures have all 
finished. It looks like the old version only waited when there were more than 
max outstanding mutations. Which is also wrong and racy. We should always wait 
for the queue to drain completely before the method exits.
Are you talking about 
{{[CommitLogReplayer.blockForWrites()|https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L98]}}
 which is invoked from 
{{[CommitLog.recover()|https://github.com/apache/cassandra/blob/cassandra-2.1.3/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L145]}}?
 I think that is already covered. I can refactor it if you want and have 
{{CommitLogReplayer.recover()}} return until everything is done anyways. Seems 
like a safe change unless it turns out to deadlock some weird test code 
somewhere.

bq. I'm not sure why futures was changed to a deque. looks like you only use 
queue methods, but maybe I missed it?
It's just a habit for this kind of asynchronous result queue. Muscle memory 
writes a loop that prioritizes consuming result futures over submitting new 
work in order to minimize latency, working set size, and temporal cache 
locality. To do that you need to be able to {{poll()}} a queue and {{List}} 
doesn't let you do that. 

The other issue with doing 1k and then draining 1k is that there is a period 
where the number of tasks in flight is small because the producer is waiting 
for the stragglers from the last 1k to arrive before issuing new work. This can 
starve consumers. This version keeps a targeted number of pending 
mutations/mutation bytes in flight at any given time.

bq. The only other thing I noticed was in the test you should validate the data 
test data is not found after you clear the CF in-case the replay isn't working.
[Does clearing the data before 
replay|https://github.com/apache/cassandra/compare/cassandra-2.1...aweisberg:CASSANDRA-8639-2.1?expand=1#diff-92ffe896212dc94b91ad86349f0647abR141]
 and [then checking for it afterwards accomplish 
that?|https://github.com/apache/cassandra/compare/cassandra-2.1...aweisberg:CASSANDRA-8639-2.1?expand=1#diff-92ffe896212dc94b91ad86349f0647abR183]
 Unless you think that {{clearUnsafe()}} might not work it seems sufficient.

bq. You also have a 2.1 utest failure related to CL not sure if that's related. 
org.apache.cassandra.cql3.DropKeyspaceCommitLogRecycleTest.testRecycle
[That is a pretty flakey 
test.|http://cassci.datastax.com/view/cassandra-2.1/job/cassandra-2.1_testall/271/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/history/]

bq. And one dtest failure in 2.1 commitlog_test.TestCommitLog.test_bad_crc
[Looks like another failing/unreliable 
test.|http://cassci.datastax.com/view/cassandra-2.1/job/cassandra-2.1_dtest/361/#showFailuresLink]



> Can OOM on CL replay with dense mutations
> -----------------------------------------
>
>                 Key: CASSANDRA-8639
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8639
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: T Jake Luciani
>            Assignee: Ariel Weisberg
>            Priority: Minor
>             Fix For: 2.1.x
>
>
> If you write dense mutations with many clustering keys, the replay of the CL 
> can quickly overwhelm a node on startup.  This looks to be caused by the fact 
> we only ensure there are 1000 mutations in flight at a time. but those 
> mutations could have thousands of cells in them.
> A better approach would be to limit the CL replay to the amount of memory in 
> flight using cell.unsharedHeapSize()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to