[ http://issues.apache.org/jira/browse/HADOOP-406?page=all ]

Chris Schneider updated HADOOP-406:
-----------------------------------

    Attachment: HADOOP-406.patch

Here's a patch that provides one solution to the problem.

I modified TaskRunner.java so that it supports elements like 
@property[hadoop.log.dir]@ within the mapred.child.java.opts Hadoop 
configuration property (replacing @property[hadoop.log.dir]@ with the value 
of that property).
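
The substitution described above can be sketched roughly as follows. This is 
only a minimal illustration, not the code in HADOOP-406.patch: the class name 
ChildOptsSketch and the method expand() are made up, and the sketch assumes 
that values come from System.getProperty() and that a token naming an unset 
property is left untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop or of the attached patch.
class ChildOptsSketch {
    // Matches tokens of the form @property[some.name]@ and captures the name.
    private static final Pattern TOKEN =
        Pattern.compile("@property\\[([^\\]]+)\\]@");

    /** Replaces each @property[name]@ token with System.getProperty(name). */
    static String expand(String opts) {
        Matcher m = TOKEN.matcher(opts);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = System.getProperty(m.group(1));
            // Assumption: if the property is not set, keep the token as-is.
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in values from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Arbitrary example value standing in for the parent JVM's setting.
        System.setProperty("hadoop.log.dir", "/tmp/hadoop/logs");
        String opts = "-Xmx200m -Dhadoop.log.dir=@property[hadoop.log.dir]@";
        System.out.println(expand(opts));
    }
}
```

With the example value set in main(), this prints 
"-Xmx200m -Dhadoop.log.dir=/tmp/hadoop/logs".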

If others agree that this is the appropriate way to solve the problem, then we 
should probably modify mapred.child.java.opts in hadoop-default.xml to include 
all 5 of the items in the default $HADOOP_OPTS (hadoop.log.dir, 
hadoop.log.file, hadoop.home.dir, hadoop.id.str, and hadoop.root.logger), using 
this method. A few notes, though:

1) This creates yet another dependency on the contents of $HADOOP_OPTS (i.e., 
anyone who adds items to it should probably add them to 
mapred.child.java.opts as well). This is less than elegant.

2) hadoop.home.dir, hadoop.id.str, and hadoop.root.logger don't appear to be 
used by the code at present. Both hadoop.home.dir and hadoop.id.str are 
read by LogFormatter.initFileHandler(), but nothing seems to call that 
method, and nothing seems to read hadoop.root.logger at all.

> Tasks launched by tasktracker in separate JVM can't generate log output
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-406
>                 URL: http://issues.apache.org/jira/browse/HADOOP-406
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.4.0
>            Reporter: Chris Schneider
>         Attachments: HADOOP-406.patch
>
>
> Child JVMs don't have access to logging config system properties. When the 
> child JVM gets launched, it doesn't inherit the Java system properties 
> hadoop.log.dir and hadoop.log.file (which are actually based on the Bash 
> environment variables $HADOOP_LOG_DIR and $HADOOP_LOGFILE). This means that 
> you get no log messages from the actual map/reduce tasks that are executing.
> Stefan Groschupf reported this problem a while back:
> -------------------------------------------------------------------------
> To: [email protected]
> From: Stefan Groschupf <[EMAIL PROTECTED]>
> Subject: tasks can't log bug?
> Date: Tue, 25 Jul 2006 19:26:17 -0700
> X-Virus-Checked: Checked by ClamAV on apache.org
> Hi Hadoop developers,
> I'm confused about the way logging works within map or reduce tasks.
> Since tasks are launched in a new JVM, the Java system properties 
> "hadoop.log.dir" and "hadoop.log.file" are not passed to the new JVM.
> This prevents the child process from logging properly. In fact you get:
>  java.io.FileNotFoundException: / (Is a directory)
>   at java.io.FileOutputStream.openAppend(Native Method)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
>   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
>   at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:165)
>   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
>   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
>   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
>   at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
>   at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
>   at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
>   at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.j
> 2006-07-25 15:59:07,553 INFO  mapred.TaskTracker (TaskTracker.java:main(993)) - Child
>   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
>   at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
>   at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:4
>   at org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
>   at org.apache.log4j.Logger.getLogger(Logger.java:104)
>   at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
>   at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImp
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcc
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
>   at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529
>   at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235
>   at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
>   at org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:44)
>   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:993)
> We see several ways to solve this problem. First, retrieve the properties 
> "hadoop.log.dir" and "hadoop.log.file" from the parent JVM and pass them 
> to the child JVM within the args parameter.
> Second, access the environment variables "$HADOOP_LOG_DIR" and 
> "$HADOOP_LOGFILE" using System.getenv() (Java 1.5).
> Third, a more general solution: TaskRunner would resolve any 
> environment variables it finds in "mapred.child.java.opts" by looking up 
> their values using System.getenv().
> E.g.:
> unix:
> export MAX_MEMORY=200m
> hadoop-site.xml:
> <name>mapred.child.java.opts</name>
> <value>-Xmx${MAX_MEMORY}</value>
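
The third option in the quoted message (resolving environment variables 
inside mapred.child.java.opts) could look roughly like the sketch below. It 
is an illustration only, not an existing Hadoop API: EnvOptsSketch and 
resolve() are hypothetical names, the ${NAME} syntax is taken from the 
example in the message, and unknown variables are assumed to be left 
untouched. The environment is passed in as a map (System.getenv() in real 
use) so that the example is reproducible, and "200m" is an assumed value 
carrying the unit suffix that -Xmx needs for megabytes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop.
class EnvOptsSketch {
    // Matches ${NAME} where NAME is a shell-style variable name.
    private static final Pattern VAR =
        Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces each ${NAME} with env.get(NAME), if present. */
    static String resolve(String opts, Map<String, String> env) {
        Matcher m = VAR.matcher(opts);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = env.get(m.group(1));
            // Assumption: a variable missing from env is left as written.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // A fixed map stands in for System.getenv() to keep output stable.
        Map<String, String> env = new HashMap<String, String>();
        env.put("MAX_MEMORY", "200m");
        System.out.println(resolve("-Xmx${MAX_MEMORY}", env));
    }
}
```

With the map above, this prints "-Xmx200m". Passing System.getenv() instead 
would pick up whatever MAX_MEMORY is exported in the tasktracker's shell.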

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
