[ https://issues.apache.org/jira/browse/STORM-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992053#comment-14992053 ]
ASF GitHub Bot commented on STORM-1155:
---------------------------------------

Github user zhuoliu commented on the pull request:

    https://github.com/apache/storm/pull/849#issuecomment-154130951

    Hi @revans2 and @tgravescs,

    To cover both relative and absolute healthcheck directories, I would
    suggest adding the following function to config.clj; healthcheck.clj
    can then call this function to get the directory.

    (defn absolute-healthcheck-dir [conf]
      (let [storm-home (System/getProperty "storm.home")
            path (conf STORM-HEALTH-CHECK-DIR)]
        (if path
          ;; Configured: use an absolute path as-is, resolve a relative one
          ;; against storm.home.
          (if (is-absolute-path? path)
            path
            (str storm-home file-path-separator path))
          ;; Not configured: fall back to <storm.home>/healthcheck.
          (str storm-home file-path-separator "healthcheck"))))

> Supervisor recurring health checks
> ----------------------------------
>
>                 Key: STORM-1155
>                 URL: https://issues.apache.org/jira/browse/STORM-1155
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>
> Add the ability for the supervisor to call out to health check scripts to
> allow some validation of the health of the node the supervisor is running on.
> It could regularly run scripts in a directory provided by the cluster admin.
> If any scripts fail, it should kill the workers and stop itself.
> This could work very much like the Hadoop scripts and if ERROR is returned on
> stdout it means the node has some issue and we should shut down.
> If a non-zero exit code is returned it indicates that the scripts failed to
> execute properly so you don't want to mark the node as unhealthy.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
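
The outcome rules in the issue description above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Storm or from the pull request: classify-script-result and node-unhealthy? are hypothetical names, and treating any occurrence of ERROR in stdout as a failure (checked only when the exit code is zero) is one possible reading of the description.

```clojure
;; Hypothetical helper (not part of Storm): classify one script run
;; from its exit code and captured stdout.
(defn classify-script-result [exit-code stdout]
  (cond
    ;; Non-zero exit: the script itself did not execute properly,
    ;; so the node is NOT marked unhealthy.
    (not (zero? exit-code)) :script-failed
    ;; Assumption: any "ERROR" in stdout signals a node problem.
    (re-find #"ERROR" stdout) :unhealthy
    :else :healthy))

;; Hypothetical helper: results is a sequence of [exit-code stdout] pairs,
;; one per script. The node is unhealthy if any script reports :unhealthy.
(defn node-unhealthy? [results]
  (boolean
    (some #(= :unhealthy (apply classify-script-result %)) results)))
```

Under these assumptions, (node-unhealthy? [[0 "ok"] [0 "ERROR disk full"]]) is true, while a run consisting only of scripts that exit non-zero leaves the node marked healthy.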