[ https://issues.apache.org/jira/browse/CASSANDRA-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448066#comment-13448066 ]

Robert Coli commented on CASSANDRA-3564:
----------------------------------------

> That sounds reasonable but I think we should have some kind of ceiling (10 
> minutes or something) where we kill -9 it, just in case we ever have a bug 
> that causes us not to exit (we've had them before), so we don't hang the 
> shutdown of the entire machine forever.
> ...
> Unless I'm missing something, you can't do anything about a kill -9, you're 
> cooked.
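
An init-script watchdog implementing the ceiling described above could be as
simple as the following sketch. To be clear, stop_with_ceiling and the
600-second default are illustrative names, not anything in the actual patch:

```shell
# Ask for a clean stop, but fall back to SIGKILL after a ceiling, in case
# the shutdown hook (e.g. a flush against a read-only data dir) hangs.
stop_with_ceiling() {
    pid="$1"
    ceiling="${2:-600}"              # 10-minute ceiling by default
    kill -INT "$pid" 2>/dev/null     # run the shutdown hook (clean stop)
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$ceiling" ]; then
            kill -9 "$pid" 2>/dev/null   # hook hung; give up and SIGKILL
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```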

The trivial case of this is a node where the data directory has been marked 
read-only due to errors but the commitlog is on a different device which is 
still writable.

In the status quo, stopping such a node does not result in the sstable flush 
blocking forever; the node just stops. On restart it replays the commitlog and, 
per CASSANDRA-1967, the replayed memtables are then flushed. That flush blocks 
forever, but the node otherwise serves reads and can take writes until it OOMs. 
It also never needs to be sent a SIGKILL.

If the node flushes on shutdown in such a way that it is effectively "drain"ing 
the node, then to avoid data loss you merely need to wait for the commitlog 
sync. The relevant thing seems to be that the *commitlog* is synced before you 
SIGKILL the node, not whether the *flush* succeeds. In practice this window is 
likely only a few seconds even with the most lenient commitlog sync settings, 
and is therefore likely irrelevant.
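
For reference, in the "lenient" mode the window above is bounded by the sync 
period in cassandra.yaml; the long-standing shipped default is:

```yaml
# Periodic mode: fsync the commitlog every N ms. A SIGKILL can lose at
# most the writes from the last unsynced window.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
```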

However, with flush-on-shutdown you *have* to send the process SIGKILL in this 
case, because the flush can hang indefinitely. I get worried any time I *have* 
to send SIGKILL to a database, even if I understand logically that it is safe. 
Adding flush to the shutdown path creates a new case in which I *have* to do 
this uncomfortable thing.
 
> That is to say, with the status quo, if you want to flush before shutdown, 
> you call nodetool flush. Not a big deal. But if we made it 
> flush-everything-by-default then to make it NOT flush our options include.

I don't understand why calling nodetool flush/drain is not a big deal here, but 
is enough of a big deal to justify special-casing shutdown when durable_writes 
is off in CASSANDRA-2958.

In my opinion, the sane default here is the pre-2958 status quo: no flushing on 
shutdown, ever, including when durable_writes is off. Operators who want to 
drain nodes before stopping them can do so via nodetool.
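
A drain-before-stop wrapper in an init script could be as small as this 
sketch; NODETOOL, PIDFILE, and drain_and_stop are placeholder names, not 
actual packaging defaults:

```shell
# Flush memtables and stop accepting writes via nodetool, then stop the
# process. Assumes nodetool is on PATH and a pid file exists.
NODETOOL="${NODETOOL:-nodetool}"
PIDFILE="${PIDFILE:-/var/run/cassandra.pid}"

drain_and_stop() {
    sig="${1:-INT}"                  # stop signal, SIGINT by default
    "$NODETOOL" drain || return 1    # flush memtables, stop taking writes
    kill -s "$sig" "$(cat "$PIDFILE")"   # then the normal fast shutdown
}
```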
                
> flush before shutdown so restart is faster
> ------------------------------------------
>
>                 Key: CASSANDRA-3564
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3564
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Packaging
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 1.2.0
>
>         Attachments: 3564.patch, 3564.patch
>
>
> Cassandra handles flush in its shutdown hook for durable_writes=false CFs 
> (otherwise we're *guaranteed* to lose data) but leaves it up to the operator 
> otherwise.  I'd rather leave it that way to offer these semantics:
> - cassandra stop = shutdown nicely [explicit flush, then kill -int]
> - kill -INT = shutdown faster but don't lose any updates [current behavior]
> - kill -KILL = lose most recent writes unless durable_writes=true and batch 
> commits are on [also current behavior]
> But if it's not reasonable to use nodetool from the init script then I guess 
> we can just make the shutdown hook flush everything.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
