[ http://issues.apache.org/jira/browse/HADOOP-291?page=all ]
Sameer Paranjpye updated HADOOP-291:
------------------------------------
Fix Version/s: 0.6.0
> Hadoop Log Archiver/Analyzer utility
> ------------------------------------
>
> Key: HADOOP-291
> URL: http://issues.apache.org/jira/browse/HADOOP-291
> Project: Hadoop
> Issue Type: New Feature
> Components: util
> Reporter: Arun C Murthy
> Fix For: 0.6.0
>
>
> Overview of the log archiver/analyzer utility...
> 1. Input
> The tool takes as input a list of directory URLs. Each URL can also be
> associated with a file pattern that specifies which files in that
> directory are to be used.
> e.g. http://g1015:50030/logs/hadoop-sameer-jobtracker-*
>
> file:///export/crawlspace/sanjay/hadoop/trunk/run/logs/haddop-sanjay-namenode-*
> (local disk on the machine on which the job was submitted)
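
A minimal sketch of how such an input spec might be split into a directory URL and a file pattern. The function names (split_input_spec, select_files) and the rule that a spec without wildcards means "every file in the directory" are assumptions made for illustration, not part of the proposal.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit
import posixpath


def split_input_spec(spec):
    """Split 'scheme://host/dir/pattern' into (directory URL, file pattern)."""
    parts = urlsplit(spec)
    directory, pattern = posixpath.split(parts.path)
    if not any(ch in pattern for ch in "*?["):
        # Assumption: no wildcard in the last component means the whole
        # path is a directory and every file in it is wanted.
        directory, pattern = parts.path.rstrip("/"), "*"
    return parts._replace(path=directory).geturl(), pattern


def select_files(names, pattern):
    """Keep only the file names that match the shell-style pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]
```

For example, split_input_spec("http://g1015:50030/logs/hadoop-sameer-jobtracker-*") yields the directory URL "http://g1015:50030/logs" and the pattern "hadoop-sameer-jobtracker-*", which select_files can then apply to a directory listing.
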
> 2. The tool supports 2 main functions:
> a) Archival
> Archive the logs in the DFS under the following hierarchy by default:
> /users/<username>/log-archive/YYYY/mm/dd/HHMMSS.log
> or, if the user specifies a directory, under:
> <input-dir>/YYYY/mm/dd/HHMMSS.log
> b) Processing with simple sort/grep primitives
> Archive the logs as above, then grep for lines matching a given pattern
> (e.g. INFO), and then sort them by a given spec, e.g. <logger><level><date>.
> (Note: this is proposed with the current log4j-based logging in mind. Do we
> need anything more generic?) The sort/grep specs are provided by the user
> along with the directory URLs.
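
The archive layout in 2a could be computed roughly as follows. This is only a sketch: the function name archive_path and its parameters are hypothetical, and the proposal does not say which timestamp (archival time or a time taken from the log itself) fills in YYYY/mm/dd/HHMMSS, so the caller supplies it here.

```python
from datetime import datetime
import posixpath


def archive_path(timestamp, username=None, base_dir=None):
    """Build the DFS path <root>/YYYY/mm/dd/HHMMSS.log for one archived log.

    The root is base_dir when the user gives one, otherwise the default
    /users/<username>/log-archive.
    """
    if base_dir is None:
        if username is None:
            raise ValueError("either username or base_dir is required")
        base_dir = posixpath.join("/users", username, "log-archive")
    return posixpath.join(
        base_dir,
        timestamp.strftime("%Y/%m/%d"),
        timestamp.strftime("%H%M%S") + ".log",
    )
```

With username "arun" and a timestamp of 2006-06-09 14:05:30, this gives /users/arun/log-archive/2006/06/09/140530.log.
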
> 3. Thoughts on implementation...
> a) Archival
> The current idea is to put a .jsp page (src/webapps) on each of the nodes,
> which then does a *copyFromLocal* of the log file into the DFS. The
> jobtracker will fire n map tasks, which only hit the .jsp page as per the
> directory URLs. The reduce task is a no-op and only collects statistics on
> failures (if any).
> b) Processing with sort/grep
> Here, the tool first archives the files as above; another set of
> map-reduce tasks then does the sort/grep on the files in the DFS with the
> given specs.
>
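
The sort/grep step could look roughly like the sketch below, written as a plain single-process function rather than as map-reduce tasks. It assumes each log line starts with "<date> <level> <logger>: " (for example "2006-06-09 14:05:30,123 INFO org.apache.hadoop.mapred.JobTracker: ..."); the proposal does not fix that layout, so LINE_RE, grep_sort and the tuple form of the sort spec are all illustrative assumptions.

```python
import re

# Assumed log4j line layout: "<date> <level> <logger>: <message>".
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<logger>[^:\s]+): "
)


def grep_sort(lines, pattern, sort_spec):
    """Keep lines matching `pattern`, then sort them by the fields in `sort_spec`.

    sort_spec is a tuple of field names such as ("logger", "level", "date").
    """
    matcher = re.compile(pattern)
    keyed = []
    for line in lines:
        if not matcher.search(line):
            continue  # the "grep" part
        fields = LINE_RE.match(line)
        if fields is None:
            continue  # skip lines not in the assumed layout (e.g. stack traces)
        keyed.append((tuple(fields.group(name) for name in sort_spec), line))
    keyed.sort(key=lambda pair: pair[0])  # the "sort" part
    return [line for _, line in keyed]
```

In a map-reduce setting, the loop body would correspond to a map task that emits (sort key, line) pairs, with the framework's sort between map and reduce providing the ordering.
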
> - * - * -
> Suggestions/corrections welcome...
> thanks,
> Arun
--
This message is automatically generated by JIRA.