[jira] Commented: (MAPREDUCE-1918) Add documentation to Rumen

2010-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926069#action_12926069
 ] 

Hudson commented on MAPREDUCE-1918:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #523 (See 
[https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/523/])


> Add documentation to Rumen
> --
>
> Key: MAPREDUCE-1918
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1918
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: tools/rumen
>Affects Versions: 0.22.0
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.22.0
>
> Attachments: mapreduce-1918-v1.10.patch, mapreduce-1918-v1.3.patch, 
> mapreduce-1918-v1.4.patch, mapreduce-1918-v1.7.patch, 
> mapreduce-1918-v1.8.patch, rumen.pdf, rumen.pdf
>
>
> Add forrest documentation to Rumen tool.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1918) Add documentation to Rumen

2010-09-12 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908609#action_12908609
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1918:


There is a javadoc warning with the patch. Can you fix it?
{code}
  [javadoc] /home/amarsri/workspace/mapreduce/src/tools/org/apache/hadoop/tools/rumen/TaskAttemptInfo.java:45: warning - Tag @link: reference not found: TaskStatus.State
{code}




[jira] Commented: (MAPREDUCE-1918) Add documentation to Rumen

2010-09-03 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905998#action_12905998
 ] 

Hong Tang commented on MAPREDUCE-1918:
--

A few minor nits:
* "Incase" => "in case"
* For TraceBuilder, does it descend recursively into the input folder, or do 
we need to specify the immediate parent directory that contains the files? 
* Can we add a bit more detail on "demuxer"? How about the following?
bq. Demuxer decides how the input file maps to jobhistory file(s). [insert]Job 
history logs and job conf files are typically small files, and can be stored 
more efficiently if we embed them in a container file format such as 
SequenceFile or TFile. To support such use cases, one can specify a 
customized Demuxer class that can extract individual job history logs and job 
conf files from source files. [/insert]
* There is no need to do canParse() check if you know which parser to use 
(hence no need to use ris). The parser will (or should) simply abort if the 
source is not of the expected version.
* VersionDetector seems rather internal, getParser() is probably what users 
should care about.
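
To make the demuxer suggestion above concrete, here is a minimal sketch of the idea, not Rumen's actual interface. The `ZipDemuxer` and `Extracted` types and the file names are hypothetical, and a zip archive stands in for a container format like SequenceFile or TFile so that the example needs only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class DemuxerSketch {

    /** One extracted file: its name plus its bytes. Hypothetical type. */
    static final class Extracted {
        final String name;
        final byte[] content;

        Extracted(String name, byte[] content) {
            this.name = name;
            this.content = content;
        }
    }

    /** Hypothetical demuxer: walks a zip container and hands back one file per call. */
    static final class ZipDemuxer implements AutoCloseable {
        private final ZipInputStream zip;

        ZipDemuxer(InputStream container) {
            this.zip = new ZipInputStream(container);
        }

        /** Returns the next embedded file, or null once the container is exhausted. */
        Extracted getNext() throws IOException {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Copy the entry so the caller gets data independent of the container stream.
                ByteArrayOutputStream copy = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = zip.read(buf)) != -1) {
                    copy.write(buf, 0, n);
                }
                return new Extracted(entry.getName(), copy.toByteArray());
            }
            return null;
        }

        @Override
        public void close() throws IOException {
            zip.close();
        }
    }

    /** Builds an in-memory container from {name, content} pairs. */
    static byte[] pack(String[][] files) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            for (String[] f : files) {
                zos.putNextEntry(new ZipEntry(f[0]));
                zos.write(f[1].getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Demuxes a container and returns "name=content" for each embedded file, in order. */
    static List<String> demux(byte[] container) {
        List<String> result = new ArrayList<>();
        try (ZipDemuxer demuxer = new ZipDemuxer(new ByteArrayInputStream(container))) {
            Extracted e;
            while ((e = demuxer.getNext()) != null) {
                result.add(e.name + "=" + new String(e.content, StandardCharsets.UTF_8));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical file names: one job conf and one job history log in a single container.
        byte[] container = pack(new String[][] {
            {"job_1_conf.xml", "<configuration/>"},
            {"job_1_history", "events"}
        });
        System.out.println(demux(container));
    }
}
```

The point of the sketch is only the shape of the contract: one source file goes in, and individual job history logs and job conf files come out one at a time until the demuxer returns null.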






[jira] Commented: (MAPREDUCE-1918) Add documentation to Rumen

2010-08-17 Thread Ranjit Mathew (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899378#action_12899378
 ] 

Ranjit Mathew commented on MAPREDUCE-1918:
--

I would suggest keeping the API information in the package-level JavaDoc 
documentation and the user-guide information in the document being written 
under this ticket.
A user looking to run Rumen, to feed its output to GridMix3 for example, would 
look at the Forrest documentation, while a developer looking to integrate 
directly or indirectly with Rumen would look at the JavaDoc documentation. We 
should definitely not mirror the information in both places, as that would add 
to the maintenance burden and lead to stale documentation.




[jira] Commented: (MAPREDUCE-1918) Add documentation to Rumen

2010-07-21 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890795#action_12890795
 ] 

Hong Tang commented on MAPREDUCE-1918:
--

I think we should also describe (1) how the JSON objects are created from 
LoggedXXX classes through Jackson's ObjectMapper, and (2) the API for building 
LoggedXXX objects and for reading them.

The basic API flow for creating parsed Rumen objects is as follows (creating 
the input streams for the job conf xml and job history logs is the user's 
responsibility):
- JobConfigurationParser: parser that parses job conf xml. One instance can be 
reused to parse many job conf xml files.
{code}
// interestedProperties is a list of keys to be extracted from the job conf xml file.
JobConfigurationParser jcp = new JobConfigurationParser(interestedProperties);
// inputStream is the file input stream for the job conf xml file.
Properties parsedProperties = jcp.parse(inputStream);
{code}

- JobHistoryParser: parser that parses job history files. It is an interface, 
and the actual implementations are defined as enums in JobHistoryParserFactory. 
One can directly use the parser matching the version of the job history logs, 
or use the "canParse()" method to detect which parser is suitable for parsing 
them (following the pattern in TraceBuilder). Create one instance to parse a 
job history log and close it after use.
{code}
// inputStream is the file input stream for the job history file.
JobHistoryParser parser = new Hadoop20JHParser(inputStream);
// The JobHistoryParser APIs are used when feeding events into JobBuilder (below).
parser.close();
{code}
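
The "canParse()" detection mentioned above can be sketched roughly as follows. This is an illustration of the general pattern only, assuming a stream that supports mark/reset; the `ParserVersion` enum and its `FORMAT-A` / `FORMAT-B` header strings are hypothetical placeholders, not the real Rumen classes or the real job history headers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ParserSelectionSketch {

    /** Hypothetical parser versions; the header strings are placeholders, not real log headers. */
    enum ParserVersion {
        FORMAT_A("FORMAT-A"),
        FORMAT_B("FORMAT-B");

        private final byte[] header;

        ParserVersion(String header) {
            this.header = header.getBytes(StandardCharsets.US_ASCII);
        }

        /** Peeks at the start of the stream, then rewinds so the chosen parser sees every byte. */
        boolean canParse(InputStream in) throws IOException {
            if (!in.markSupported()) {
                throw new IllegalArgumentException("stream must support mark/reset");
            }
            in.mark(header.length);
            try {
                byte[] actual = new byte[header.length];
                int read = 0;
                while (read < actual.length) {
                    int n = in.read(actual, read, actual.length - read);
                    if (n < 0) {
                        break; // stream is shorter than the header
                    }
                    read += n;
                }
                return read == header.length && Arrays.equals(actual, header);
            } finally {
                in.reset();
            }
        }
    }

    /** Returns the first version whose canParse() accepts the stream, or null if none does. */
    static ParserVersion select(InputStream rewindable) {
        try {
            for (ParserVersion v : ParserVersion.values()) {
                if (v.canParse(rewindable)) {
                    return v;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] log = "FORMAT-B rest of the log".getBytes(StandardCharsets.US_ASCII);
        System.out.println(select(new ByteArrayInputStream(log)));
    }
}
```

Because each probe rewinds the stream, the same stream can be handed to the selected parser afterwards without reopening the file.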

- JobBuilder: builder for LoggedJobs. Create one instance per job to process 
the paired job history log and job conf. The order in which the conf file and 
the job history file are processed is not important.
{code}
// You will need to extract the job ID from the file name: _job__
JobBuilder jb = new JobBuilder(jobID);
// Feed the parsed job conf properties to the builder.
jb.process(jcp.parse(jobConfInputStream));
JobHistoryParser parser = new Hadoop20JHParser(jobHistoryInputStream);
try {
  HistoryEvent e;
  // Feed every job history event to the same builder.
  while ((e = parser.nextEvent()) != null) {
    jb.process(e);
  }
} finally {
  parser.close();
}
LoggedJob job = jb.build();
{code}

On the reading side, the output produced by TraceBuilder or Folder can be 
read through JobTraceReader or ClusterTopologyReader. One can also use 
Jackson's ObjectMapper to parse the JSON-formatted data into LoggedJob or 
LoggedTopology objects.
