[ http://issues.apache.org/jira/browse/HADOOP-342?page=all ]
Arun C Murthy updated HADOOP-342:
---------------------------------
Attachment: logalyzer.patch
Here's the 'logalyzer' tool.
Doug: I felt that it made sense to create an org.apache.hadoop.tools package for
logalyzer and other such tools in the future. Let me know if you would prefer it
to be in some other package and I'll update it accordingly.
thanks,
Arun
> Design/Implement a tool to support archival and analysis of logfiles.
> ---------------------------------------------------------------------
>
> Key: HADOOP-342
> URL: http://issues.apache.org/jira/browse/HADOOP-342
> Project: Hadoop
> Type: New Feature
> Reporter: Arun C Murthy
> Attachments: logalyzer.patch
>
> Requirements:
> a) Create a tool to support archival of logfiles (from diverse sources) in
> Hadoop's DFS.
> b) The tool should also support analysis of the logfiles via grep/sort
> primitives. The tool should allow for fairly generic pattern 'grep's and let
> users 'sort' the matching lines (from grep) on 'columns' of their choice.
> E.g. from hadoop logs: Look for all log-lines with 'FATAL' and sort them
> based on timestamps (column x) and then on column y (column x, followed by
> column y).
>
> Design/Implementation:
> a) Log Archival
> Archival of logs from diverse sources can be accomplished using the
> *distcp* tool (HADOOP-341).
>
> b) Log analysis
> The idea is to enable users of the tool to perform analysis of logs via
> grep/sort primitives.
> This can be accomplished via a relatively simple Map-Reduce task where
> the map does the *grep* for the given pattern via RegexMapper and then the
> implicit *sort* (reducer) is used with a custom Comparator which performs the
> user-specified comparison (columns).
> The sort/grep specs can be fairly powerful by letting the user of the
> tool use Java's built-in regex patterns (java.util.regex).
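
The grep-then-sort idea in the description above can be sketched in plain Java,
without any Map-Reduce or Hadoop classes. This is only an illustration and not
the code in logalyzer.patch: the class name LogGrepSort, its method names, the
0-based column numbering, the single-space column separator and the sample log
lines are all assumptions made up for the example. It uses java.util.regex for
the grep and a Comparator that compares the chosen columns in order, roughly
what the custom Comparator in the sort phase would need to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not taken from logalyzer.patch.
final class LogGrepSort {

    // Builds a comparator that orders lines on the given 0-based columns,
    // in the order listed. Columns come from splitting each line on the
    // separator regex; a missing column compares as the empty string.
    static Comparator<String> columnComparator(String separatorRegex, int... columns) {
        Pattern separator = Pattern.compile(separatorRegex);
        return (a, b) -> {
            String[] fieldsA = separator.split(a);
            String[] fieldsB = separator.split(b);
            for (int c : columns) {
                String x = c < fieldsA.length ? fieldsA[c] : "";
                String y = c < fieldsB.length ? fieldsB[c] : "";
                int cmp = x.compareTo(y);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return 0;
        };
    }

    // "grep" step: keep lines that contain a match for grepRegex.
    // "sort" step: order the matching lines with the column comparator.
    static List<String> grepAndSort(List<String> lines, String grepRegex,
                                    String separatorRegex, int... columns) {
        Pattern grep = Pattern.compile(grepRegex);
        List<String> matches = new ArrayList<>();
        for (String line : lines) {
            if (grep.matcher(line).find()) {
                matches.add(line);
            }
        }
        matches.sort(columnComparator(separatorRegex, columns));
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample lines in a "date time level logger message" layout.
        List<String> log = Arrays.asList(
            "2006-07-05 12:00:02 FATAL dfs.DataNode disk failure",
            "2006-07-05 12:00:01 INFO mapred.JobTracker job started",
            "2006-07-05 12:00:01 FATAL mapred.TaskTracker lost heartbeat",
            "2006-07-05 12:00:01 FATAL dfs.NameNode out of memory");
        // All 'FATAL' lines, sorted on column 1 (time), then column 3 (logger).
        for (String line : grepAndSort(log, "FATAL", " ", 1, 3)) {
            System.out.println(line);
        }
    }
}
```

In a Map-Reduce version the grep would run in the map and the comparator would
be plugged into the framework's sort, so the in-memory list here only stands in
for that shuffle.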