Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on larger jobs
Hi!

I've faced the same issue a couple of times, and I found nothing in the
logs that led me to the source of the error. However, I've found that
careful container and block configuration can prevent these issues.

First of all, check the ResourceManager (RM) logs for any problematic
container, since the same task is failing every time (that split may be
violating the container's resource limits, which should be reflected in
the RM log). For instance, in my particular case I was running a
memory-intensive map, and in large test cases some records needed more
memory than others; I observed the behaviour you describe because the
containers were getting killed.

I usually find the application log files under userlogs; just go to the
directory of the container that triggers the error, as pointed to by the
RM logs.

Hope it helps.

Regards,
Silvina

On 11 April 2014 09:15, Phan, Truong Q wrote:
> I could not find the "attempt_1395628276810_0062_m_000149_0 attemp*"
> logs in the HDFS "/tmp" directory. Where can I find these log files?
>
> Thanks and Regards,
> Truong Phan
>
> P + 61 2 8576 5771
> M + 61 4 1463 7424
> E troung.p...@team.telstra.com
> W www.telstra.com
>
> -----Original Message-----
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Thursday, 10 April 2014 4:32 PM
> To:
> Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming -
> Failed to run on larger jobs
>
> It appears to me that whatever chunk of the input CSV files your map
> task 000149 gets, the program is unable to process it; it throws an
> error and exits.
>
> Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log
> to see if there's any stdout/stderr printed that may help. The syslog
> in the attempt's task log will also carry a "Processing split ..."
> message that may help you know which file, and what offset+length
> within that file, was being processed.
>
> On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q
> <troung.p...@team.telstra.com> wrote:
> > Hi,
> >
> > My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> > Streaming job.
> >
> > I have no issue running the job with a single input CSV file of around
> > 400 MB. However, it fails when I run it with 11 input data files of
> > around 400 MB each.
> >
> > The job failed with the following error. I'd appreciate any hints or
> > suggestions to fix this issue.
> >
> > [stack trace snipped; it is quoted in full in Harsh J's message below]
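Silvina's point about container limits can be acted on when submitting the streaming job by raising the map-task memory settings. A minimal sketch of such a submission; the streaming-jar path, input/output paths, script names, and the 2048/1536 MB values are illustrative assumptions for your cluster, not recommendations:

```shell
# Submit the streaming job with explicit map-task memory limits.
# The container size (mapreduce.map.memory.mb) must be large enough for
# the JVM heap (mapreduce.map.java.opts) PLUS the streaming subprocess
# itself; otherwise YARN kills the container. Values are illustrative.
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
  -D mapreduce.map.memory.mb=2048 \
  -D mapreduce.map.java.opts=-Xmx1536m \
  -input /data/input \
  -output /data/output \
  -mapper mapper.py \
  -reducer reducer.py \
  -file mapper.py \
  -file reducer.py
```

If the RM log shows containers being killed for exceeding limits, raising these two properties together (keeping the heap comfortably below the container size) is the usual first adjustment.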
RE: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on larger jobs
I could not find the "attempt_1395628276810_0062_m_000149_0 attemp*" logs
in the HDFS "/tmp" directory. Where can I find these log files?

Thanks and Regards,
Truong Phan

P + 61 2 8576 5771
M + 61 4 1463 7424
E troung.p...@team.telstra.com
W www.telstra.com

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Thursday, 10 April 2014 4:32 PM
To:
Subject: Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed
to run on larger jobs

It appears to me that whatever chunk of the input CSV files your map task
000149 gets, the program is unable to process it; it throws an error and
exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to
see if there's any stdout/stderr printed that may help. The syslog in the
attempt's task log will also carry a "Processing split ..." message that
may help you know which file, and what offset+length within that file,
was being processed.

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q wrote:
> Hi,
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> Streaming job.
>
> I have no issue running the job with a single input CSV file of around
> 400 MB. However, it fails when I run it with 11 input data files of
> around 400 MB each.
>
> The job failed with the following error. I'd appreciate any hints or
> suggestions to fix this issue.
>
> [stack trace snipped; it is quoted in full in Harsh J's message below]
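On the question of where these attempt logs live: under YARN they are kept on the NodeManagers' local disks (and optionally aggregated after the job ends), not in HDFS "/tmp". A sketch of the two usual ways to reach them; the application ID is taken from the trace in this thread, while the local directory is a placeholder to verify against the yarn.nodemanager.log-dirs property:

```shell
# Fetch aggregated logs for the whole application (requires log
# aggregation to be enabled and the application to have finished):
yarn logs -applicationId application_1395628276810_0062

# Or, on the NodeManager that ran the failing attempt, look under the
# local log directory (placeholder path; check yarn.nodemanager.log-dirs
# for the real location on your cluster):
cd <yarn.nodemanager.log-dirs>/application_1395628276810_0062
# Each container_* subdirectory holds stdout, stderr and syslog; grep
# the syslog for the split that was being processed when it failed:
grep "Processing split" container_*/syslog
```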
Re: Hadoop 2.2.0-cdh5.0.0-beta-1 - MapReduce Streaming - Failed to run on larger jobs
It appears to me that whatever chunk of the input CSV files your map task
000149 gets, the program is unable to process it; it throws an error and
exits.

Look into the attempt_1395628276810_0062_m_000149_0 attempt's task log to
see if there's any stdout/stderr printed that may help. The syslog in the
attempt's task log will also carry a "Processing split ..." message that
may help you know which file, and what offset+length within that file,
was being processed.

On Thu, Apr 10, 2014 at 10:55 AM, Phan, Truong Q wrote:
> Hi,
>
> My Hadoop 2.2.0-cdh5.0.0-beta-1 cluster fails to run a larger MapReduce
> Streaming job.
>
> I have no issue running the job with a single input CSV file of around
> 400 MB. However, it fails when I run it with 11 input data files of
> around 400 MB each.
>
> The job failed with the following error. I'd appreciate any hints or
> suggestions to fix this issue.
>
> +
> 2014-04-10 10:28:10,498 FATAL [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
> attempt_1395628276810_0062_m_000149_0 - exited : java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
>         at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
>         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>
> 2014-04-10 10:28:10,498 INFO [IPC Server handler 2 on 52179]
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from
> attempt_1395628276810_0062_m_000149_0: Error: java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>         [same stack trace as above]
>
> 2014-04-10 10:28:10,499 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
> report from attempt_1395628276810_0062_m_000149_0: Error:
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>         [same stack trace as above]
>
> +
> MAPREDUCE SCRIPT:
> $ cat