[ https://issues.apache.org/jira/browse/HBASE-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506993#comment-13506993 ]
Kannan Muthukkaruppan commented on HBASE-7242:
----------------------------------------------

Currently, don't the shutdown hooks also try to flush/close the regions before closing the ZK connection?

> Use Runtime.exit() instead of Runtime.halt() upon HLog Sync failures
> --------------------------------------------------------------------
>
>                 Key: HBASE-7242
>                 URL: https://issues.apache.org/jira/browse/HBASE-7242
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Amitanand Aiyer
>            Priority: Minor
>
> Hey Guys,
> Should we use Runtime.exit() instead of Runtime.halt() when we fail an HLog
> sync?
> The key difference is that Runtime.exit() invokes the shutdown hooks, while
> Runtime.halt() does not.
> Why we might need this:
> We had an HDFS name node reboot today on one of our cells, and this caused
> multiple region servers to abort because they could not sync the HLog.
> However, since multiple RS died simultaneously, this looked like a
> correlated failure to the master. The master waits for the Znode to expire,
> but this can take up to a few minutes after RS death (this setting is in
> place so that we can withstand rack switch reboots, lasting a couple of
> minutes, without region movement).
> If the shutdown hooks are called, the RS will close the ZK connection,
> causing an immediate Znode expiry. This might help cut down the
> unavailability, as regions can begin to get assigned sooner.
> While we do want to abort on HLog failure, I do not think it would hurt to
> give the JVM a few seconds to shut down gracefully. Please let me know if I
> am missing something.
> Thanks,
> -Amit
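
To make the "give the JVM a few seconds" idea concrete, here is a rough sketch, not HBase code. It calls an exit action (which, with Runtime.exit(), would run the shutdown hooks) and arms a daemon watchdog that falls back to a halt action if the hooks have not finished within a grace period. The class name GracefulAbort, the abort() helper, and the grace period values are all hypothetical; the exit and halt actions are passed in so the logic can be tried without killing the JVM.

```java
import java.util.function.IntConsumer;

// Hypothetical helper, not part of HBase: exit gracefully, but never hang.
class GracefulAbort {

    /**
     * Calls exit (expected to run shutdown hooks) and, if it has not
     * completed within graceMillis, calls halt from a daemon watchdog.
     * Returns the watchdog thread; with the real Runtime.exit() this
     * method never returns.
     */
    static Thread abort(int status, long graceMillis,
                        IntConsumer exit, IntConsumer halt) {
        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(graceMillis);
                // Grace period elapsed: stop waiting for the hooks.
                halt.accept(status);
            } catch (InterruptedException e) {
                // exit returned in time (only possible with a stand-in).
            }
        }, "abort-watchdog");
        // Daemon threads keep running while shutdown hooks execute.
        watchdog.setDaemon(true);
        watchdog.start();

        exit.accept(status);
        watchdog.interrupt();
        return watchdog;
    }

    public static void main(String[] args) {
        // Stand-in actions so the demo does not terminate the JVM.
        // In a real process one would pass Runtime.getRuntime()::exit
        // and Runtime.getRuntime()::halt instead.
        abort(1, 5000L,
              s -> System.out.println("exit(" + s + ") would run shutdown hooks"),
              s -> System.out.println("halt(" + s + ") would skip them"));
    }
}
```

With the real Runtime methods, a hook that closes the ZK connection gets a chance to run, while a hook that blocks (for example, a region flush against an unavailable HDFS) cannot keep the process alive past the grace period.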