Hi Shuja,

I think you need to enclose the invocation string in quotes.  Try:

-mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
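
To illustrate why the quotes matter, here is a minimal sketch. It assumes a POSIX shell, and count_args is a made-up helper, not part of Hadoop. Without quotes the shell splits the path and the heap setting into two words, so the second word never reaches -mapper as part of its value:

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

# Unquoted: the script path and the heap setting arrive as two separate words.
count_args /home/ftpuser1/Nodemapper5.groovy Xmx2000m

# Quoted: both travel together as a single word, the form -mapper takes.
count_args "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
```

Whether groovy then honors the heap setting is a separate question; this only shows what the shell hands over.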

Also, it would be nice to see exactly how groovy is invoked.  Does groovy
start and then give you the OOM, or does the OOM error occur during startup?
Can you see the new process with "ps -aef"?
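
As a rough sketch of that check (find_mapper is a hypothetical helper name), the bracket in the pattern keeps grep from matching its own command line, so only a real mapper process shows up:

```shell
#!/bin/sh
# find_mapper is a hypothetical helper: it filters a process listing for the
# mapper script. The pattern [N]odemapper5 matches "Nodemapper5" but not the
# literal text "[N]odemapper5" that appears in grep's own command line.
find_mapper() {
    grep '[N]odemapper5.groovy'
}

# Run this on a task node while the job is running.
ps -aef | find_mapper || echo "no mapper process found"
```

If nothing is listed while the task is running, the failure is probably happening before or during startup of the groovy process.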

Can you run groovy in local mode?  Try "-jt local" option.
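
A sketch of what a local-mode run could look like, under some assumptions: the jar and mapper paths are the ones from your ps output, the input and output names are placeholders, and I have left out your -inputformat and -inputreader options for brevity. The function only prints the command so you can review it first; remove the leading "echo" to actually run it.

```shell
#!/bin/sh
# Paths are the ones quoted in this thread.
STREAMING_JAR=/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
MAPPER=/home/ftpuser1/Nodemapper5.groovy

# local_streaming_job is a hypothetical helper. It prints the command instead
# of running it. Generic options such as -jt go before the streaming options.
local_streaming_job() {
    echo hadoop jar "$STREAMING_JAR" \
        -jt local \
        -input "$1" \
        -output "$2" \
        -mapper "$MAPPER" \
        -file "$MAPPER"
}

# /path/to/input and RNC_local are placeholder names, not real paths.
local_streaming_job /path/to/input RNC_local
```

With "-jt local" the whole job runs in a single local process, which should make it easier to see which process runs out of memory.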

Thanks,

Alex K

On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamug...@gmail.com> wrote:

> Hi Patrick,
> Thanks for the explanation. I have supplied the heap size in the mapper in
> the following way:
>
> -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
>
> but I still get the same error. Any other ideas?
> Thanks
>
> On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patr...@cloudera.com
> >wrote:
>
> > Shuja,
> >
> > Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> > used for child JVMs that get forked by the TaskTracker. You are using
> > Hadoop streaming, which means the TaskTracker is forking a JVM for
> > streaming, which is then forking a shell process that runs your groovy
> > code (in another JVM).
> >
> > I'm not much of a groovy expert, but if there's a way you can wrap your
> > code around the MapReduce API that would work best. Otherwise, you can
> > just pass the heapsize in the '-mapper' argument.
> >
> > Regards,
> >
> > - Patrick
> >
> > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamug...@gmail.com>
> > wrote:
> >
> > > Hi Alex,
> > >
> > > I have updated Java to the latest available version on all machines in
> > > the cluster, and now I run the job by adding this line:
> > >
> > > -D mapred.child.ulimit=3145728 \
> > >
> > > but still same error. Here is the output of this job.
> > >
> > >
> > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > org.apache.hadoop.util.RunJar
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > -inputformat StreamInputFormat -inputreader
> > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> > > /home/ftpuser1/Nodemapper5.groovy
> > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > >
> > >
> > > Any clue?
> > > Thanks
> > >
> > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <ale...@cloudera.com>
> > wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > First, thank you for using CDH3.  Can you also check what
> > > > mapred.child.ulimit you are using?  Try adding
> > > > "-D mapred.child.ulimit=3145728" to the command line.
> > > >
> > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > minimum, which you can download from the Java SE Homepage
> > > > <http://java.sun.com/javase/downloads/index.jsp>.
> > > >
> > > > Let me know how it goes.
> > > >
> > > > Alex K
> > > >
> > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> shujamug...@gmail.com
> > > > >wrote:
> > > >
> > > > > Hi Alex
> > > > >
> > > > > Yeah, I am running the job on a cluster of 2 machines, using the
> > > > > Cloudera distribution of Hadoop. Here is the output of this command.
> > > > >
> > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > org.apache.hadoop.util.RunJar
> > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > -inputreader
> > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> > > > > /home/ftpuser1/Nodemapper5.groovy
> > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > And what is meant by OOM? Thanks for helping.
> > > > >
> > > > > Best Regards
> > > > >
> > > > >
> > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <ale...@cloudera.com
> >
> > > > wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > It looks like the OOM is happening in your code.  Are you running
> > > > > > MapReduce in a cluster?  If so, can you send the exact command line
> > > > > > your code is invoked with -- you can get it with a
> > > > > > 'ps -Af | grep Nodemapper5.groovy' command on one of the nodes
> > > > > > which is running the task?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > shujamug...@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Hi All
> > > > > > >
> > > > > > > I am facing a hard problem. I am running a map reduce job using
> > > > > > > streaming, but it fails with the following error.
> > > > > > >
> > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > >
> > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > >        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > >        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > >
> > > > > > >
> > > > > > > I have increased the heap size in hadoop-env.sh to 2000M. I also
> > > > > > > tell the job manually with the following line.
> > > > > > >
> > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > >
> > > > > > > but it still gives the error. The same job runs fine if I run it
> > > > > > > from the shell using a 1024M heap size, like
> > > > > > >
> > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > > Any clue?????????
> > > > > > >
> > > > > > > Thanks in advance.
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>
