[ https://issues.apache.org/jira/browse/HADOOP-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-6473.
------------------------------------
    Resolution: Won't Fix

We are adding specific diags for specific problems; a generic one is unrealistic. I know that now :)

> Add hadoop health check/diagnostics to run from command line, JSP pages, other tools
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6473
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6473
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Steve Loughran
>            Priority: Minor
>              Labels: ipv6
>
> If the lifecycle ping() is for short-duration "are we still alive" checks,
> Hadoop still needs something bigger to check the overall system health. This
> would be for end users, but also for automated cluster deployment: a complete
> validation of the cluster.
> It could be a command line tool, and something that runs on different nodes,
> checked via IPC or JSP. The idea would be to do thorough checks with good
> diagnostics. Oh, and they should be executable through JUnit too.
> For example:
> - if running on Windows, check that Cygwin is on the path; fail with a
>   pointer to a wiki issue if not
> - datanodes should check that they can create locks on the filesystem and
>   create files, and that timestamps are (roughly) aligned with local time
> - namenodes should try to create files/locks in the filesystem
> - task trackers should try to exec() something
> - run through the classpath and look for problems: duplicate JARs,
>   unsupported Java, Xerces versions, etc.
> * The set of tests should be extensible; rather than one single class with
>   all the tests, there'd be something separate for name, task, data, and job
>   tracker nodes
> * They can't be in the nodes themselves, as they should be executable even if
>   the nodes don't come up.
> * Output could be in human-readable text or HTML, and a form that could be
>   processed through Hadoop itself in future
> * These tests could have side effects, such as actually trying to submit work
>   to a cluster
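
For anyone reading this in the archive, here is a rough sketch of what the extensible, standalone checks described in the issue might have looked like. It is only an illustration under stated assumptions: HealthChecks, HealthCheck, Result, canCreateFile, duplicateJars, noDuplicateJars and runAll are all hypothetical names invented here, and nothing like them exists in Hadoop. It covers just two of the listed checks (file creation and duplicate JARs on the classpath) and runs outside any node process.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are part of Hadoop.
public class HealthChecks {

    /** Outcome of one check: pass/fail plus a diagnostic message. */
    public static final class Result {
        public final String check;
        public final boolean ok;
        public final String message;

        Result(String check, boolean ok, String message) {
            this.check = check;
            this.ok = ok;
            this.message = message;
        }

        @Override
        public String toString() {
            return (ok ? "OK   " : "FAIL ") + check + ": " + message;
        }
    }

    /** One self-contained check; new checks are added by implementing this. */
    public interface HealthCheck {
        Result run();
    }

    /** Checks that a probe file can be created, written and deleted in dir. */
    public static HealthCheck canCreateFile(Path dir) {
        return () -> {
            try {
                Path probe = Files.createTempFile(dir, "healthcheck", ".tmp");
                Files.write(probe, new byte[] {1});
                Files.delete(probe);
                return new Result("create-file", true, "probe file written in " + dir);
            } catch (IOException | RuntimeException e) {
                return new Result("create-file", false, "cannot create files in " + dir + ": " + e);
            }
        };
    }

    /** Returns JAR file names that occur more than once in a classpath string. */
    public static List<String> duplicateJars(String classpath) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            String name = new File(entry).getName();
            if (name.endsWith(".jar") && !seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    /** Wraps duplicateJars as a check that fails if any duplicate is found. */
    public static HealthCheck noDuplicateJars(String classpath) {
        return () -> {
            List<String> dups = duplicateJars(classpath);
            return new Result("classpath", dups.isEmpty(),
                    dups.isEmpty() ? "no duplicate JAR names" : "duplicate JARs: " + dups);
        };
    }

    /** Runs every check and collects the results, never stopping at a failure. */
    public static List<Result> runAll(List<HealthCheck> checks) {
        List<Result> results = new ArrayList<>();
        for (HealthCheck check : checks) {
            results.add(check.run());
        }
        return results;
    }

    public static void main(String[] args) {
        List<HealthCheck> checks = Arrays.asList(
                canCreateFile(Paths.get(System.getProperty("java.io.tmpdir"))),
                noDuplicateJars(System.getProperty("java.class.path")));
        for (Result r : runAll(checks)) {
            System.out.println(r);
        }
    }
}
```

Because each check is just an object with a run() method, the same list could in principle be driven from a command line, a JUnit test, or a web page. Note that duplicateJars only compares file names, so it would miss the same library present under two different versioned names.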