[ https://issues.apache.org/jira/browse/MAPREDUCE-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Meng Mao updated MAPREDUCE-1298:
--------------------------------

    Attachment: fido.py

Attached is a script that illustrates a typical debugging approach. The script goes out to all the worker nodes and grabs any userlogs for the attempts of a given job. If there were a page that brought all these userlogs together for a given job, this script wouldn't be necessary.

> better access/organization of userlogs
> --------------------------------------
>
>                 Key: MAPREDUCE-1298
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1298
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>            Reporter: Meng Mao
>            Priority: Minor
>         Attachments: fido.py
>
>
> Right now, it is quite a chore to browse through all the userlogs generated
> during a given map or reduce phase.
> It is quite easy to browse to a job and look at either the map or reduce
> tasks, like so:
> /jobtasks.jsp?jobid=job_<myid>&type=map&pagenum=1
> /jobtasks.jsp?jobid=job_<myid>&type=reduce&pagenum=1
> However, it is not easy to look at the stderr output across all the attempts.
> Currently, the best technique I know of is to browse into each task:
> /taskdetails.jsp?jobid=job_<myid>&tipid=task_<taskid>
> And from there, jump to the slave node's task log for that taskid:
> slavenode/tasklog?taskid=attempt_<for the taskid>&all=true
> I'm not suggesting that there needs to be a really sophisticated way to
> present all the task userlogs in one place, especially given the expected
> size of the logs. However, it would be nice to be presented with a list of
> clickable URLs to all the log files. From there, it would be easy to
> copy/paste that list elsewhere, where I could wget the set of log files and
> grep through them. What has prevented me from scripting it is the lack of a
> foolproof way to branch down from a job id to all the constituent task ids
> and logs.
> One more thing -- the task detail page:
> /taskdetails.jsp?jobid=job_<myid>&tipid=task_<taskid>
> gives links to see 4kb, 8kb, and all logs. I think it'd be nice to be able to
> get a link to just the stdout, stderr, and syslog portions. Most of our
> debugging is done by examining all of the stderr logs. Maybe it's possible to
> request that via URL? But I haven't found out how in the documentation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
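
The URL patterns quoted in the issue can be assembled mechanically once the job, task, and attempt ids are known. The following is a minimal Python sketch of that step only; it is not taken from fido.py. The host names, ports, and ids in the usage lines are placeholders, and the function names are hypothetical. It does not solve the harder problem the issue describes, namely discovering the task and attempt ids for a job in the first place.

```python
from urllib.parse import urlencode


def jobtasks_url(jobtracker, job_id, phase, page=1):
    """List page for the map or reduce tasks of a job (phase is 'map' or 'reduce')."""
    query = urlencode({"jobid": job_id, "type": phase, "pagenum": page})
    return f"{jobtracker}/jobtasks.jsp?{query}"


def taskdetails_url(jobtracker, job_id, task_id):
    """Detail page for a single task, which lists its attempts."""
    query = urlencode({"jobid": job_id, "tipid": task_id})
    return f"{jobtracker}/taskdetails.jsp?{query}"


def tasklog_url(worker, attempt_id):
    """Full task log for one attempt, served by the worker node that ran it."""
    query = urlencode({"taskid": attempt_id, "all": "true"})
    return f"{worker}/tasklog?{query}"


def all_tasklog_urls(attempts):
    """attempts: iterable of (worker_base_url, attempt_id) pairs, assumed known."""
    return [tasklog_url(worker, attempt_id) for worker, attempt_id in attempts]


if __name__ == "__main__":
    # Placeholder hosts, ports, and ids, for illustration only.
    attempts = [
        ("http://worker1:50060", "attempt_200912140001_0042_m_000000_0"),
        ("http://worker2:50060", "attempt_200912140001_0042_m_000001_0"),
    ]
    for url in all_tasklog_urls(attempts):
        print(url)
```

The printed list could then be passed to wget and the downloaded files searched with grep, as the issue suggests.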