ZK has been built around the "fail fast" approach. In order to
maintain high availability we want to ensure that restarting a server
will result in it attempting to rejoin the quorum. IMO we would not
want to change this (kill -9).

Patrick

On Tue, Jul 26, 2011 at 2:02 AM, Laxman <lakshman...@huawei.com> wrote:
> Hi Everyone,
>
> Any thoughts?
> Do we need to consider changing the abrupt shutdown?
>
> Implementations in some other Hadoop ecosystem projects, for reference:
> Hadoop - kill [SIGTERM]
> HBase - kill [SIGTERM] and then "kill -9" [SIGKILL] if the process hangs
> ZooKeeper - "kill -9" [SIGKILL]
>
>
> -----Original Message-----
> From: Laxman [mailto:lakshman...@huawei.com]
> Sent: Wednesday, July 13, 2011 12:36 PM
> To: 'dev@zookeeper.apache.org'
> Subject: RE: Does abrupt kill corrupts the datadir?
>
> Hi Mahadev,
>
> A shutdown hook is just a quick thought. Another approach would be to send a
> plain kill [SIGTERM], which the process can catch and interpret.
>
> A first look at the "kill -9" raised the following scenario:
>>In the worst case, if the latest snapshots on all ZooKeeper nodes get
>>corrupted, there is a chance of data loss.
>
> How can ZooKeeper deal with this scenario gracefully?
>
> Also, I feel we should give the application a chance to shut down gracefully
> before an abrupt shutdown.
>
> http://en.wikipedia.org/wiki/SIGKILL
>
> Because SIGKILL gives the process no opportunity to do cleanup operations on
> terminating, in most system shutdown procedures an attempt is first made to
> terminate processes using SIGTERM, before resorting to SIGKILL.
>
> http://rackerhacker.com/2010/03/18/sigterm-vs-sigkill/
>
> The application can determine what it wants to do once a SIGTERM is
> received. While most applications will clean up their resources and stop,
> some may not. An application may be configured to do something completely
> different when a SIGTERM is received. Also, if the application is in a bad
> state, such as waiting for disk I/O, it may not be able to act on the signal
> that was sent.
>
> Most system administrators will usually resort to the more abrupt signal
> when an application doesn't respond to a SIGTERM.
>
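To make the quoted point concrete, here is a minimal sketch of a process that catches SIGTERM and runs its own cleanup. The names on_term and state are hypothetical, chosen only for illustration; nothing here is taken from ZooKeeper. The script signals itself so the handler can be seen running.

```shell
#!/bin/sh
# Sketch only: a process that interprets SIGTERM instead of dying at once.
# on_term and state are hypothetical names used for illustration.
state=running

on_term() {
    # Real cleanup (flush buffers, close files, remove a pid file) would go here.
    state=stopping
}

# Register the handler. SIGKILL ("kill -9") cannot be trapped this way.
trap on_term TERM

# Send SIGTERM to this shell itself to exercise the handler.
kill -TERM $$

echo "state after SIGTERM: $state"
```

With "kill -9" the handler would never get a chance to run, which is the difference the quoted passages describe.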
> -----Original Message-----
> From: Mahadev Konar [mailto:maha...@hortonworks.com]
> Sent: Wednesday, July 13, 2011 12:02 PM
> To: dev@zookeeper.apache.org
> Subject: Re: Does abrupt kill corrupts the datadir?
>
> Hi Laxman,
>  The servers take care of all the issues with data integrity, so a kill
> -9 is OK. Shutdown hooks are tricky. Also, the best way to make sure
> everything works reliably is to use kill -9 :).
>
> Thanks
> mahadev
>
> On 7/12/11 11:16 PM, "Laxman" <lakshman...@huawei.com> wrote:
>
>>When we stop zookeeper through zkServer.sh stop, we are aborting the
>>zookeeper process using "kill -9".
>>
>>
>>
>>stop)
>>    echo -n "Stopping zookeeper ... "
>>    if [ ! -f "$ZOOPIDFILE" ]
>>    then
>>      echo "error: could not find file $ZOOPIDFILE"
>>      exit 1
>>    else
>>      $KILL -9 $(cat "$ZOOPIDFILE")
>>      rm "$ZOOPIDFILE"
>>      echo STOPPED
>>      exit 0
>>    fi
>>    ;;
>>
>>
>>
>>
>>
>>This may corrupt the snapshot and transaction logs. Also, it's not
>>recommended to use "kill -9".
>>
>>In the worst case, if the latest snapshots on all ZooKeeper nodes get
>>corrupted, there is a chance of data loss.
>>
>>
>>
>>How about introducing a shutdown hook which will ensure ZooKeeper is
>>shut down gracefully when we call stop?
>>
>>
>>
>>Note: This is just an observation; it was not found in a test.
>>
>>
>>
>>--
>>
>>Thanks,
>>
>>Laxman
>>
>
>
>
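For reference, the "SIGTERM first, SIGKILL only if the process hangs" pattern mentioned for HBase could be sketched roughly as below. This is not what zkServer.sh does, and not HBase's actual script: graceful_stop, its arguments, and the 10-second default grace period are all assumptions made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of zkServer.sh: try SIGTERM, then SIGKILL.
# Usage: graceful_stop PID [TIMEOUT_SECONDS]
graceful_stop() {
    pid=$1
    timeout=${2:-10}   # assumed default grace period, in seconds

    # Ask the process to terminate; it may catch this signal and clean up.
    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # Poll until the process exits or the grace period runs out.
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Still alive after the grace period: fall back to the abrupt signal.
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In a stop) branch this might be called as graceful_stop "$(cat "$ZOOPIDFILE")" 10. The return status is 0 when the process exited on SIGTERM (or was already gone) and 1 when SIGKILL was needed. Whether this is desirable for ZooKeeper is exactly the question in this thread; under the "fail fast" view above, the server has to survive a kill -9 in any case.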
