Hello everyone,

I have been working with Cassandra for some time, but every time I need to
shut down a node (for any reason, such as upgrading the version or moving the
instance to another host) I see several errors in the client applications
(yes, I'm using the official Java driver).

By the way, I'm starting C* as a stand-alone process
<https://docs.datastax.com/en/cassandra/3.0/cassandra/initialize/referenceStartCprocess.html?hl=start>,
and C* version is 3.11.0.

The way I have implemented the shutdown process is something like the
following:

# Drain all information from commitlog into sstables

bin/nodetool drain


cassandra_pid=`ps -ef|grep "java.*apache-cassandra"|grep -v "grep"|awk '{print $2}'`
if [ ! -z "$cassandra_pid" ] && [ "$cassandra_pid" -ne "1" ]; then
        echo "Asking Cassandra to shut down (nodetool drain doesn't stop cassandra)"
        kill $cassandra_pid

        echo -n "+ Checking it is down. "
        counter=10
        # Keep polling while the process is still alive, for at most 10 seconds
        while [ "$counter" -ne 0 ] && kill -0 $cassandra_pid > /dev/null 2>&1
        do
                echo -n ". "
                ((counter--))
                sleep 1s
        done
        echo ""
        if ! kill -0 $cassandra_pid > /dev/null 2>&1; then
                echo "+ It's down."
        else
                echo "- Killing Cassandra."
                kill -9 $cassandra_pid
        fi
else
        echo "Warning: there was a problem finding the Cassandra PID"
fi
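
The wait loop could also be pulled out into a small helper, which makes the timeout easier to change. This is only a minimal sketch: the function name `wait_for_exit` and its default timeout of 10 seconds are my own placeholders, not anything that ships with Cassandra.

```shell
#!/usr/bin/env bash
# wait_for_exit PID [TIMEOUT_SECONDS]
# Hypothetical helper: polls once per second with `kill -0` until the
# process is gone. Returns 0 if the process exited, 1 if it is still
# alive once the timeout has elapsed.
wait_for_exit() {
    local pid=$1
    local remaining=${2:-10}
    while kill -0 "$pid" > /dev/null 2>&1; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

With it, the end of the script would reduce to something like `wait_for_exit "$cassandra_pid" 10 || kill -9 "$cassandra_pid"`.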

Should I add at the beginning the following lines?

echo "Shutting down Cassandra gracefully with: nodetool disablegossip"
$CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablegossip
echo "Shutting down Cassandra gracefully with: nodetool disablebinary"
$CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablebinary
echo "Shutting down Cassandra gracefully with: nodetool disablethrift"
$CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablethrift

The shutdown log is the following:

WARN  [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,343 StorageService.java:321 - Stopping gossip by operator request
INFO  [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,344 Gossiper.java:1532 - Announcing shutdown
INFO  [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,355 StorageService.java:2268 - Node /10.254.169.36 state jump to shutdown
INFO  [RMI TCP Connection(12)-127.0.0.1] 2017-10-12 14:20:56,141 Server.java:176 - Stop listening for CQL clients
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,472 StorageService.java:1442 - DRAINING: starting drain process
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,474 HintsService.java:220 - Paused hints dispatch
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,477 Gossiper.java:1532 - Announcing shutdown
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,480 StorageService.java:2268 - Node /127.0.0.1 state jump to shutdown
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:01,483 MessagingService.java:984 - Waiting for messaging service to quiesce
INFO  [ACCEPT-/192.168.6.174] 2017-10-12 14:21:01,485 MessagingService.java:1338 - MessagingService has terminated the accept() thread
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,095 HintsService.java:220 - Paused hints dispatch
INFO  [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,111 StorageService.java:1442 - DRAINED

Disabling gossip seemed like a good idea, but judging from the logs, Cassandra
may use gossip to tell the other nodes gracefully that it is going down, so I
don't know whether disabling it first is a good or a bad idea.

Disabling Thrift and the binary protocol should only prevent new connections,
while the established, running ones should be given the chance to finish.
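
To make the question concrete, here is a rough sketch of one ordering I am considering: stop client traffic first, then drain, and leave gossip to `nodetool drain`, since the log above shows the drain announcing the shutdown itself. The `NODETOOL` variable and the `prepare_shutdown` function name are placeholders of mine, and whether this order is actually the right one is exactly what I am asking.

```shell
#!/usr/bin/env bash
# NODETOOL is a placeholder; point it at the real nodetool binary.
NODETOOL=${NODETOOL:-bin/nodetool}

# Hypothetical helper: runs the pre-shutdown nodetool steps in order and
# stops at the first one that fails. Gossip is deliberately not disabled
# here; that choice is an assumption based on my reading of the drain log.
prepare_shutdown() {
    local step
    for step in disablebinary disablethrift drain; do
        echo "Running: nodetool $step"
        "$NODETOOL" "$step" || return 1
    done
}
```

The existing kill-and-wait logic would then run only if `prepare_shutdown` returns successfully.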

Any thoughts or comments?

Thanks

Javier.
