[jira] Commented: (MAPREDUCE-358) Change org.apache.hadoop.examples.AggregateWordCount and org.apache.hadoop.examples.AggregateWordHistogram to use new mapreduce api.
[ https://issues.apache.org/jira/browse/MAPREDUCE-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831087#action_12831087 ]

Meng Mao commented on MAPREDUCE-358:

Would it be possible to backport this into 0.20? Our current codebase is stuck running our aggregate-dependent classes on the deprecated API until the new mapreduce aggregate lib appears.

Change org.apache.hadoop.examples.AggregateWordCount and org.apache.hadoop.examples.AggregateWordHistogram to use new mapreduce api.

Key: MAPREDUCE-358
URL: https://issues.apache.org/jira/browse/MAPREDUCE-358
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Fix For: 0.21.0
Attachments: patch-358-1.txt, patch-358.txt, patch-5689.txt

Change org.apache.hadoop.examples.AggregateWordCount and org.apache.hadoop.examples.AggregateWordHistogram to use new mapreduce api.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-994) bin/hadoop job -counter help options do not give information on permissible values.
[ https://issues.apache.org/jira/browse/MAPREDUCE-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828928#action_12828928 ]

Meng Mao commented on MAPREDUCE-994:

I agree, though currently the usage output is simply:

$ hadoop job -counter
Usage: JobClient [-counter job-id group-name counter-name]

I couldn't get the example of

hadoop job -counter org.apache.hadoop.mapreduce.TaskCounter REDUCE_INPUT_RECORDS

to work. Is it working for others? Access to the various MapReduce counters would be highly desirable.

bin/hadoop job -counter help options do not give information on permissible values.

Key: MAPREDUCE-994
URL: https://issues.apache.org/jira/browse/MAPREDUCE-994
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Iyappan Srinivasan
Priority: Minor

Right now, bin/hadoop job -counter gives this output:

bin/hadoop job -counter
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.
Usage: CLI [-counter job-id group-name counter-name]

The help does not explain what the group names and counter names are. All permissible values of group-name and counter-name should be specified.

Group name example: org.apache.hadoop.mapreduce.TaskCounter
Counter name example: REDUCE_INPUT_RECORDS
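For reference, the usage string quoted above takes three positional arguments in the order job-id, group-name, counter-name. The following is only a minimal sketch of a wrapper that assembles such a command line in that order. The function names, the `hadoop_bin` parameter, and the job id used in the usage note are hypothetical and not taken from the issue; whether a given group or counter name is accepted depends on the Hadoop version.

```python
import subprocess


def counter_command(job_id, group_name, counter_name, hadoop_bin="hadoop"):
    """Build the argv for: hadoop job -counter job-id group-name counter-name.

    The argument order follows the usage string quoted in the thread.
    hadoop_bin is a hypothetical parameter for the path to the launcher.
    """
    fields = (("job_id", job_id),
              ("group_name", group_name),
              ("counter_name", counter_name))
    for name, value in fields:
        if not value:
            raise ValueError(name + " must be a non-empty string")
    return [hadoop_bin, "job", "-counter", job_id, group_name, counter_name]


def read_counter(job_id, group_name, counter_name, hadoop_bin="hadoop"):
    """Run the command and return its raw standard output (not parsed)."""
    argv = counter_command(job_id, group_name, counter_name, hadoop_bin)
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

With a made-up job id, `counter_command("job_200912010000_0001", "org.apache.hadoop.mapreduce.TaskCounter", "REDUCE_INPUT_RECORDS")` yields the argument list with the job id in first position after `-counter`.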
[jira] Updated: (MAPREDUCE-1298) better access/organization of userlogs
[ https://issues.apache.org/jira/browse/MAPREDUCE-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Meng Mao updated MAPREDUCE-1298:

Attachment: fido.py

Attached is a script that illustrates a typical debugging approach. The script goes out to all the worker nodes and grabs any userlogs for attempts for a given job. If there were a page that brought all these userlogs together for a given job, this script wouldn't be necessary.

better access/organization of userlogs

Key: MAPREDUCE-1298
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1298
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tasktracker
Reporter: Meng Mao
Priority: Minor
Attachments: fido.py

Right now, it is quite a chore to browse to all userlogs generated during a given map or reduce phase. It is quite easy to browse to a job and look at either the map or reduce tasks, like so:

/jobtasks.jsp?jobid=job_myid&type=map&pagenum=1
/jobtasks.jsp?jobid=job_myid&type=reduce&pagenum=1

However, it is not easy to look at the stderr output across all the attempts. Currently, the best technique I know of is to browse into each task:

/taskdetails.jsp?jobid=job_myid&tipid=task_taskid

And from there, jump to the slave node's task log for that taskid:

slavenode/tasklog?taskid=attempt_for the taskid&all=true

I'm not suggesting that there needs to be a really sophisticated way to present all the task userlogs in one place, especially with the expected size of the logs. However, it would be nice to be presented with a list of clickable URLs to all the log files. From there, it would be easy to copy/paste that list elsewhere, where I could wget the set of log files and grep through them. What has prevented me from scripting it is the lack of a foolproof way to branch down from a job id to all the constituent task ids and logs.

One more thing -- the task detail page:

/taskdetails.jsp?jobid=job_myid&tipid=task_taskid

gives links to see 4kb, 8kb, and all logs. I think it'd be nice to be able to get a link to just the stdout, stderr, and syslog portions. Most of our debugging is done by examining all of the stderr logs. Maybe it's possible to request that via URL? But I haven't found out how to in the documentation.
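The three page addresses mentioned in the description can be assembled mechanically once the ids are known. The following is a minimal sketch of that step only; it is not the attached fido.py, and it does not solve the harder problem of discovering task and attempt ids from a job id. The helper names and the base URLs in the usage note are hypothetical, and the `&` separators between query parameters are assumed, since they appear to have been lost in the archived text.

```python
from urllib.parse import urlencode


def job_tasks_url(jobtracker, job_id, task_type, page=1):
    """URL of the task list for one phase ("map" or "reduce") of a job."""
    if task_type not in ("map", "reduce"):
        raise ValueError("task_type must be 'map' or 'reduce'")
    query = urlencode({"jobid": job_id, "type": task_type, "pagenum": page})
    return jobtracker + "/jobtasks.jsp?" + query


def task_details_url(jobtracker, job_id, tip_id):
    """URL of the detail page for a single task."""
    query = urlencode({"jobid": job_id, "tipid": tip_id})
    return jobtracker + "/taskdetails.jsp?" + query


def task_log_url(node, attempt_id):
    """URL of the full task log for one attempt on a worker node."""
    query = urlencode({"taskid": attempt_id, "all": "true"})
    return node + "/tasklog?" + query


def log_url_list(node_attempts):
    """Turn (node base URL, attempt id) pairs into a list of log URLs."""
    return [task_log_url(node, attempt) for node, attempt in node_attempts]
```

For example, `log_url_list([("http://worker1.example", "attempt_a"), ("http://worker2.example", "attempt_b")])` produces one log URL per attempt, which could then be written to a file and fetched with wget as the description suggests.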
[jira] Created: (MAPREDUCE-1298) better access/organization of userlogs
better access/organization of userlogs

Key: MAPREDUCE-1298
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1298
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tasktracker
Reporter: Meng Mao
Priority: Minor

Right now, it is quite a chore to browse to all userlogs generated during a given map or reduce phase. It is quite easy to browse to a job and look at either the map or reduce tasks, like so:

/jobtasks.jsp?jobid=job_myid&type=map&pagenum=1
/jobtasks.jsp?jobid=job_myid&type=reduce&pagenum=1

However, it is not easy to look at the stderr output across all the attempts. Currently, the best technique I know of is to browse into each task:

/taskdetails.jsp?jobid=job_myid&tipid=task_taskid

And from there, jump to the slave node's task log for that taskid:

slavenode/tasklog?taskid=attempt_for the taskid&all=true

I'm not suggesting that there needs to be a really sophisticated way to present all the task userlogs in one place, especially with the expected size of the logs. However, it would be nice to be presented with a list of clickable URLs to all the log files. From there, it would be easy to copy/paste that list elsewhere, where I could wget the set of log files and grep through them. What has prevented me from scripting it is the lack of a foolproof way to branch down from a job id to all the constituent task ids and logs.

One more thing -- the task detail page:

/taskdetails.jsp?jobid=job_myid&tipid=task_taskid

gives links to see 4kb, 8kb, and all logs. I think it'd be nice to be able to get a link to just the stdout, stderr, and syslog portions. Most of our debugging is done by examining all of the stderr logs. Maybe it's possible to request that via URL? But I haven't found out how to in the documentation.