Hi Alex,

Yes, I am running the job on a cluster of 2 machines, using the Cloudera distribution of Hadoop. Here is the output of that command:

root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy

----------------------------------------------------------------------

Also, what is meant by OOM? Thanks for helping.

Best Regards

On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <ale...@cloudera.com> wrote:

> Hi Shuja,
>
> It looks like the OOM is happening in your code. Are you running MapReduce
> in a cluster? If so, can you send the exact command line your code is
> invoked with -- you can get it with a 'ps -Af | grep Nodemapper5.groovy'
> command on one of the nodes which is running the task?
>
> Thanks,
>
> Alex K
>
> On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamug...@gmail.com> wrote:
>
> > Hi All
> >
> > I am facing a hard problem.
> > I am running a map reduce job using streaming
> > but it fails and it gives the following error.
> >
> > Caught: java.lang.OutOfMemoryError: Java heap space
> >     at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> >
> > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> >     at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> >     at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> >     at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> >     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> >     at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> > I have increased the heap size in hadoop-env.sh and make it 2000M. Also I
> > tell the job manually by following line.
> >
> > -D mapred.child.java.opts=-Xmx2000M \
> >
> > but it still gives the error. The same job runs fine if i run on shell
> > using 1024M heap size like
> >
> > cat file.xml | /root/Nodemapper5.groovy
> >
> > Any clue?????????
> >
> > Thanks in advance.
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >

--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
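
The heap settings are hard to spot in a 'ps' line as long as the one above, so a small shell helper along these lines could pull them out. This is only a sketch: the function name heap_flags is made up for illustration (it is not part of Hadoop or Groovy), and it assumes a grep that supports -o, as GNU and BSD grep do.

```shell
# Hypothetical helper (not part of Hadoop): read text on stdin and print
# every JVM -Xmx heap flag found in it, one per line.
heap_flags() {
  # -o prints only the matched text; -e lets the pattern start with '-'.
  grep -oE -e '-Xmx[0-9]+[kKmMgG]?'
}
```

Run as 'ps -Af | grep Nodemapper5.groovy | heap_flags' against a listing like the one above, it would print -Xmx1023m (the heap of the java process running RunJar) and -Xmx2000M (the value passed through mapred.child.java.opts).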