[ https://issues.apache.org/jira/browse/MAPREDUCE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated MAPREDUCE-3112:
---------------------------------

    Attachment: MAPREDUCE-3112-trunk-2.patch

Updated configuration to have HADOOP_CLIENT_OPTS override.

> Calling hadoop cli inside mapreduce job leads to errors
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-3112
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3112
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.205.0, 0.23.0
>         Environment: Java, Linux
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>             Fix For: 0.20.205.0, 0.23.0
>
>         Attachments: HAPREDUCE-3112-1.patch, MAPREDUCE-3112-trunk-2.patch,
>                      MAPREDUCE-3112-trunk.patch, MAPREDUCE-3112.patch
>
>
> When running a streaming job with mapper
> bin/hadoop --config /etc/hadoop/ jar contrib/streaming/hadoop-streaming-0.20.205.0.jar -mapper "hadoop --config /etc/hadoop/ dfs -help" -reducer NONE -input "/tmp/input.txt" -output NONE
> Task log shows:
> {noformat}
> Exception in thread "main" java.lang.ExceptionInInitializerError
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:57)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895)
> Caused by: org.apache.commons.logging.LogConfigurationException: User-specified log class 'org.apache.commons.logging.impl.Log4JLogger' cannot be found or is not useable.
>     at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:874)
>     at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
>     at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
>     at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
>     at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
>     at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:142)
>     ... 3 more
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>     at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
>     at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
>     at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>     at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:261)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:255)
> {noformat}
> Upon inspection, there are two problems in the inherited environment which
> prevent the logger from initializing properly. First, in hadoop-env.sh,
> HADOOP_OPTS is inherited from the parent process. This configuration was
> requested by a user as a way to override the Hadoop environment in the
> configuration template:
> {noformat}
> export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_OPTS"
> {noformat}
> -Dhadoop.log.dir=$HADOOP_LOG_DIR/task_tracker_user is injected into
> HADOOP_OPTS in the tasktracker environment. Hence, the running task inherits
> the wrong logging directory, which the end user might not have sufficient
> access to write to. Second, $HADOOP_ROOT_LOGGER is overridden to
> -Dhadoop.root.logger=INFO,TLA by the task controller; the bin/hadoop script
> therefore attempts to use hadoop.root.logger=INFO,TLA, but fails to
> initialize.
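
The two inherited settings described in the issue can be illustrated with a small shell sketch. This is not the fix in the attached patches (which is described only as adding a HADOOP_CLIENT_OPTS override); it is a hypothetical workaround a mapper wrapper script might apply before calling the hadoop CLI. The function name sanitize_hadoop_env is an assumption made for illustration, and the sketch assumes that bin/hadoop chooses its own root logger when HADOOP_ROOT_LOGGER is unset.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the MAPREDUCE-3112 patches: drop the
# logging settings that a task inherits from the tasktracker environment.
sanitize_hadoop_env() {
    # The task controller sets the root logger to INFO,TLA. Unset it so the
    # child bin/hadoop can pick its own logger (assumed behavior).
    unset HADOOP_ROOT_LOGGER

    # Rebuild HADOOP_OPTS without any inherited -Dhadoop.log.dir=... token,
    # keeping the remaining options in their original order.
    # Limitation: relies on word splitting, so option values containing
    # spaces or glob characters are not handled. POSIX sh has no "local",
    # so "cleaned" and "opt" remain set in the calling shell.
    cleaned=""
    for opt in $HADOOP_OPTS; do
        case "$opt" in
            -Dhadoop.log.dir=*) ;;   # skip the inherited log directory
            *) cleaned="${cleaned:+$cleaned }$opt" ;;
        esac
    done
    HADOOP_OPTS="$cleaned"
    export HADOOP_OPTS
}
```

A wrapper used as the streaming mapper would call sanitize_hadoop_env first and then run the command from the report, for example hadoop --config /etc/hadoop/ dfs -help.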