[
https://issues.apache.org/jira/browse/HADOOP-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
[EMAIL PROTECTED] updated HADOOP-1181:
--------------------------------------
Attachment: hadoop1181.patch
Attached is a patch that changes TaskLog$Reader so it uses URLs instead of the
file system. It also:
+ Adds a constructor that takes a userlog subdirectory URL.
+ Adds a public getInputStream method that streams over all userlog parts.
+ Makes TaskLog and TaskLog$Reader public rather than default access.
+ Adds a main that takes a URL and prints the concatenated logs to stdout.
I'll not mark this issue as 'patch ready' until others have had a gander.
It would be great if Arun C Murthy could review since he wrote the original.
In particular, it would be nice to know whether the calculation of
totalLogSize in the TaskLog$Reader#fetchAll method -- around line 384 in
r523437 -- is important. If not, then some near-duplicate code could be
replaced with a call to the new getInputStream in a version 2 of this patch.
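The streaming-over-parts idea behind the new getInputStream can be sketched in plain Java. The class and method names below are illustrative only (the actual patch works against Hadoop's TaskLog$Reader and userlog part files, which are not reproduced here); the point is simply that chaining the per-part streams with SequenceInputStream presents them as one continuous log:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UserlogConcat {

    // Hypothetical analogue of the patch's getInputStream: chain the
    // streams for each userlog "part" into one continuous stream.
    static InputStream concat(List<InputStream> parts) {
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the per-part streams a userlog directory would yield.
        List<InputStream> parts = new ArrayList<>();
        parts.add(new ByteArrayInputStream("part0 line\n".getBytes()));
        parts.add(new ByteArrayInputStream("part1 line\n".getBytes()));

        // Read the concatenation and print it, as the patch's main does
        // for the real userlogs.
        try (InputStream in = concat(parts)) {
            System.out.print(new String(in.readAllBytes()));
        }
    }
}
```

With URL-backed streams (e.g. url.openStream()) substituted for the ByteArrayInputStreams, the same chaining would work over remote userlog parts rather than the local filesystem.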
> userlogs reader
> ---------------
>
> Key: HADOOP-1181
> URL: https://issues.apache.org/jira/browse/HADOOP-1181
> Project: Hadoop
> Issue Type: Improvement
> Reporter: [EMAIL PROTECTED]
> Attachments: hadoop1181.patch
>
>
> My jobs output lots of logging. I want to be able to quickly parse the logs
> across the cluster for anomalies. org.apache.hadoop.tool.Logalyzer looks
> promising at first, but it does not know how to deal with the userlog format
> and it wants to first copy all logs local. Digging around, there does not
> currently seem to be a reader for the hadoop userlog format. TaskLog$Reader
> is not generally accessible, and it too expects logs to be on the local
> filesystem (the latter is of little use if I want to run the analysis as a
> mapreduce job).
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.