[jira] Commented: (HADOOP-1857) Ability to run a script when a task fails to capture stack traces

Amareshwari Sri Ramadasu (JIRA) Mon, 01 Oct 2007 02:55:12 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531456
 ]


Amareshwari Sri Ramadasu commented on HADOOP-1857:
--------------------------------------------------

Usage Documentation :

A facility is provided, via user-provided scripts, for post-processing task 
logs and the task's stdout, stderr, syslog and core files. There is a default 
script which processes core dumps under gdb and prints the stack trace. The 
last five lines of the debug script's stdout and stderr are printed in the 
diagnostics. These outputs are displayed on the job UI on demand.

How to submit a debug command:
A quick way to set the debug command is to set the properties 
"mapred.map.task.debug.command" and "mapred.reduce.task.debug.command" for 
debugging map tasks and reduce tasks respectively. These properties can also 
be set through the APIs conf.setMapDebugCommand(String cmd) and 
conf.setReduceDebugCommand(String cmd). The debug command can contain 
@stdout@, @stderr@, @syslog@ and @core@ to access the task's stdout, stderr, 
syslog and core files respectively. For streaming, the debug command can be 
submitted with the command-line options -mapdebug and -reducedebug for 
debugging the mapper and reducer respectively.
For example, the debug command can be 'myScript @stderr@'. Here myScript is 
the executable, and it processes the failed task's stderr.
The debug command can also be a gdb command, in which case the user can supply 
a command file to execute with the -x option. The debug command can then look 
like 'gdb <program-name> -c @core@ -x <gdb-cmd-file>'. This command processes 
the core file of the failed task <program-name> and executes the commands in 
<gdb-cmd-file>. Please make sure the gdb command file has 'quit' as its last 
line.
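
To make the placeholder syntax above concrete, here is a minimal sketch of how 
a command such as 'myScript @stderr@' could be expanded into a runnable command 
line. It only illustrates the idea and is not the code in the patch: the class 
name DebugCommandSketch, the expand method and the file paths are all 
hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the implementation from the patch.
class DebugCommandSketch {

    // Replace every @name@ token in the command with the path given for "name".
    static String expand(String command, Map<String, String> paths) {
        String result = command;
        for (Map.Entry<String, String> e : paths.entrySet()) {
            result = result.replace("@" + e.getKey() + "@", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed locations of the failed task's files (made up for the example).
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("stdout", "/tmp/task_0001/stdout");
        paths.put("stderr", "/tmp/task_0001/stderr");
        paths.put("syslog", "/tmp/task_0001/syslog");
        paths.put("core", "/tmp/task_0001/core");

        System.out.println(expand("myScript @stderr@", paths));
        System.out.println(expand("gdb myProgram -c @core@ -x myCmdFile", paths));
    }
}
```

Tokens that have no entry in the map are left untouched in this sketch, so a 
command that only uses @stderr@ does not need the other paths to be supplied.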

How to submit a debug script:
To submit the debug script file, first put the file in dfs.
The executable can then be added by setting the property 
"mapred.cache.executables" with the value <path>#<executable-name>. More than 
one executable can be added as a comma-separated list of executable paths. The 
property can also be set through the APIs 
DistributedCache.addCacheExecutable(URI,conf) and 
DistributedCache.setCacheExecutables(URI[],conf), where URI is of the form 
"hdfs://host:port/<path>#<executable-name>". For Streaming, the executable can 
be added through -cacheExecutable URI.
For gdb, the gdb command file need not be executable, but it does need to be 
in dfs. It can be added to the cache by setting the property 
"mapred.cache.files" with the value <path>#<cmd-file> or through the API 
DistributedCache.addCacheFile(URI,conf). Please make sure the property 
"mapred.create.symlink" is set to "yes".
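
The <path>#<name> values and the URI form described above can be sketched with 
plain java.net.URI, without any Hadoop classes. The host name, port, paths and 
the helper class CacheUriSketch are assumptions made up for this example; only 
the overall shape of the strings comes from the description above.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it only builds and inspects strings of the documented shape.
class CacheUriSketch {

    // One cache entry: the dfs path, then '#', then the name to use for it.
    static String entry(String path, String name) {
        return path + "#" + name;
    }

    // Several entries joined with commas, as described for more than one executable.
    static String propertyValue(List<String> entries) {
        return String.join(",", entries);
    }

    // Full URI of the form hdfs://host:port/<path>#<name>.
    static URI toUri(String host, int port, String entry) {
        return URI.create("hdfs://" + host + ":" + port + entry);
    }

    public static void main(String[] args) {
        String script = entry("/user/me/myScript", "myScript");
        String helper = entry("/user/me/tools/helper.sh", "helper");

        System.out.println(propertyValue(Arrays.asList(script, helper)));

        // Assumed namenode address, for illustration only.
        URI uri = toUri("namenode.example.com", 9000, script);
        System.out.println(uri.getPath());     // the part before '#'
        System.out.println(uri.getFragment()); // the name after '#'
    }
}
```

The URI fragment (the part after '#') carries the name, which is why the path 
and the name can differ, as with helper.sh and helper above.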


All of this documentation is also incorporated in the Javadoc.


> Ability to run a script when a task fails to capture stack traces
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1857
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1857
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Amareshwari Sri Ramadasu
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.15.0
>
>         Attachments: patch-1857.txt, patch1857.txt
>
>
> This basically is for providing a better user interface for debugging failed
> jobs. Today we see stack traces for failed tasks on the job ui if the job
> happened to be a Java MR job. For non-Java jobs like Streaming, Pipes, the
> diagnostic info on the job UI is not helpful enough to debug what might have
> gone wrong. They are usually framework traces and not app traces.
> We want to be able to provide a facility, via user-provided scripts, for doing
> post-processing on task logs, input, output, etc. There should be some default
> scripts like running core dumps under gdb for locating illegal instructions,
> the last few lines from stderr, etc.  These outputs could be sent to the
> tasktracker and in turn to the jobtracker which would then display it on the
> job UI on demand.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
