[ 
https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556837#comment-13556837
 ] 

Robert Coli commented on CASSANDRA-4446:
----------------------------------------

If only unflushed system changes are replayed, how do you account for:

"I upgraded from 1.0.12 to 1.1.8, using drain, and I noticed overcounting 
counters" ?

It's quite possible that upgrading from 1.1.x to a later 1.1.y does not in 
fact replay anything other than the system keyspace and does not incur double 
counting of counters. I am, however, pretty confident, based on multiple 
reports of the issue quoted above, that counter increments may be over-replayed 
if one uses drain (as NEWS.txt suggests) while upgrading from 1.0.x to a 
version later than 1.1.6.

If this is being dealt with as a "known limitation of 1.0.x", then I continue 
to suggest the above change to NEWS.txt, as otherwise people using counters in 
1.0.x WILL incur double increments while upgrading per the instructions in 
NEWS.txt.
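
To illustrate why an unnecessary replay matters for counters in particular: 
re-applying a regular write leaves the same state, while re-applying a counter 
increment adds the delta a second time. The following is a minimal toy sketch 
of that difference, not Cassandra code; `apply_write`, `apply_increment`, and 
`replay` are hypothetical names invented for the illustration.

```python
# Toy model only: these names are hypothetical and are not Cassandra APIs.

def apply_write(store, key, value):
    """Regular write: applying it twice leaves the same state."""
    store[key] = value


def apply_increment(counters, key, delta):
    """Counter increment: every application adds delta again."""
    counters[key] = counters.get(key, 0) + delta


def replay(log, store, counters):
    """Re-apply every logged mutation, the way a log replay would."""
    for kind, key, value in log:
        if kind == "write":
            apply_write(store, key, value)
        else:
            apply_increment(counters, key, value)


log = [("write", "name", "a"), ("incr", "hits", 5)]
store, counters = {}, {}
replay(log, store, counters)  # original application, already flushed
replay(log, store, counters)  # spurious replay after restart

print(store)     # the regular write is unchanged by the second pass
print(counters)  # the counter has been incremented twice
```

In this sketch the second pass is harmless for `store` but doubles the value 
held in `counters`, which is the over-counting described in the quoted report.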
                
> nodetool drain sometimes doesn't mark commitlog fully flushed
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-4446
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4446
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core, Tools
>         Environment: ubuntu 10.04 64bit
> Linux HOSTNAME 2.6.32-345-ec2 #48-Ubuntu SMP Wed May 2 19:29:55 UTC 2012 
> x86_64 GNU/Linux
> sun JVM
> cassandra 1.0.10 installed from apache deb
>            Reporter: Robert Coli
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.2.1
>
>         Attachments: 4446.txt, 
> cassandra.1.0.10.replaying.log.after.exception.during.drain.txt
>
>
> I recently wiped a customer's QA cluster. I drained each node and verified 
> that they were drained. When I restarted the nodes, I saw the commitlog 
> replay create a memtable and then flush it. I have attached a sanitized log 
> snippet from a representative node at the time. 
> It appears to show the following:
> 1) Drain begins
> 2) Drain triggers flush
> 3) Flush triggers compaction
> 4) StorageService logs DRAINED message
> 5) Compaction thread throws an exception
> 6) On restart, the same CF creates a memtable
> 7) and then flushes it [1]
> The columnfamily involved in the replay in 7) is the CF for which the 
> compaction thread threw the exception in 5). This seems to suggest a timing 
> issue whereby the exception in 5) prevents the flush in 3) from marking all 
> the segments flushed, causing them to replay after restart.
> In case it might be relevant, I did an online change of compaction strategy 
> from Leveled to SizeTiered during the uptime period preceding this drain.
> [1] Isn't commitlog replay supposed to not trigger a flush automatically in 
> modern Cassandra?
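
The hypothesis in the description above (an exception during the drain's 
flush/compaction sequence leaves commitlog segments unmarked, so they replay 
on restart) can be sketched with a small toy model. This is an assumption-laden 
illustration, not Cassandra's actual flush path: the `Segment` class, the 
`flush`, `compact`, and `restart` functions, and the strictly sequential 
ordering inside `flush` are all hypothetical simplifications of what the 
description calls a timing issue.

```python
# Toy model only: hypothetical names, not Cassandra's actual flush path.

class Segment:
    def __init__(self, mutations):
        self.mutations = list(mutations)
        self.dirty = True  # dirty segments are replayed on restart


def compact(fail):
    """Stand-in for the compaction that the flush triggers."""
    if fail:
        raise RuntimeError("compaction failed")


def flush(segment, sstables, compaction_fails):
    """Assumed ordering: write data, run compaction, then mark clean."""
    sstables.append(list(segment.mutations))  # the data is on disk here
    compact(compaction_fails)                 # an exception skips the next line
    segment.dirty = False


def restart(segment, sstables):
    """Replay any segment still marked dirty into a new flushed table."""
    if segment.dirty:
        sstables.append(list(segment.mutations))


seg, tables = Segment(["m1", "m2"]), []
try:
    flush(seg, tables, compaction_fails=True)
except RuntimeError:
    pass  # in this toy model the drain still completes
restart(seg, tables)

print(seg.dirty)  # the segment was never marked clean
print(tables)     # the same mutations were written twice
```

Under these assumptions the data reaches disk during the drain, yet the 
segment stays dirty, so the restart replays and flushes it again, matching 
steps 6) and 7).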

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
