Hi Shuja,

First, thank you for using CDH3. (By OOM I meant the java.lang.OutOfMemoryError in your stack trace.) Can you also check what mapred.child.ulimit you are using? Try adding "-D mapred.child.ulimit=3145728" to the command line.
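Something like the following, as a rough sketch reconstructed from the command line in your ps output below (the paths, patterns, and other options are all yours; only the mapred.child.ulimit flag is new, and the -inputreader value needs shell quoting when typed by hand):

  hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
      -D mapred.child.java.opts=-Xmx2000M \
      -D mapred.child.ulimit=3145728 \
      -inputformat StreamInputFormat \
      -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
      -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
      -jobconf mapred.map.tasks=1 \
      -jobconf mapred.reduce.tasks=0 \
      -output RNC11 \
      -mapper /home/ftpuser1/Nodemapper5.groovy \
      -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
      -file /home/ftpuser1/Nodemapper5.groovy

Note that mapred.child.ulimit is in kilobytes, so 3145728 is roughly 3GB of virtual memory, which leaves headroom above a 2GB child heap.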
I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum, which you can download from the Java SE Homepage <http://java.sun.com/javase/downloads/index.jsp>.

Let me know how it goes.

Alex K

On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamug...@gmail.com> wrote:

> Hi Alex
>
> Yeah, I am running the job on a cluster of 2 machines, using the Cloudera
> distribution of Hadoop. Here is the output of that command:
>
> root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
>   -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
>   -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str=
>   -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml
>   -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
>   org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
>   -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
>   -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
>   -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
>   -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
>   -mapper /home/ftpuser1/Nodemapper5.groovy
>   -reducer org.apache.hadoop.mapred.lib.IdentityReducer
>   -file /home/ftpuser1/Nodemapper5.groovy
> root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy
>
> And what is meant by OOM? Thanks for helping.
>
> Best Regards
>
> On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <ale...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > It looks like the OOM is happening in your code. Are you running MapReduce
> > in a cluster?
> > If so, can you send the exact command line your code is invoked with? You
> > can get it with a 'ps -Af | grep Nodemapper5.groovy' command on one of the
> > nodes that is running the task.
> >
> > Thanks,
> >
> > Alex K
> >
> > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamug...@gmail.com> wrote:
> >
> > > Hi All
> > >
> > > I am facing a hard problem. I am running a MapReduce job using streaming,
> > > but it fails with the following error:
> > >
> > > Caught: java.lang.OutOfMemoryError: Java heap space
> > >         at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > >
> > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > >         at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > >         at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > >         at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > >         at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > >         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > >
> > > I have increased the heap size in hadoop-env.sh to 2000M. I also set it
> > > for the job manually with the following line:
> > >
> > > -D mapred.child.java.opts=-Xmx2000M \
> > >
> > > but it still gives the error. The same job runs fine if I run it on the
> > > shell with a 1024M heap:
> > >
> > > cat file.xml | /root/Nodemapper5.groovy
> > >
> > > Any clue?
> > >
> > > Thanks in advance.
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445