Running the hadoop fs -put command as a Java action seems to work much better. I'll stick to that for now. Thanks for the help.
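
For anyone finding this thread later, here is a minimal sketch of what such a Java action might look like. It assumes org.apache.hadoop.fs.FsShell (the class behind the `hadoop fs` command) as the main class; the thread does not say which class was actually used. The `${localFile}` and `${hdfsTargetDir}` parameters are hypothetical placeholders, and the transitions reuse the `end` and `fail` nodes from the workflow quoted below.

```xml
<!-- Sketch only: a Java action that runs the equivalent of "hadoop fs -put". -->
<action name="hdfs-put">
  <java>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <property>
        <name>mapred.job.queue.name</name>
        <value>${queueName}</value>
      </property>
    </configuration>
    <!-- Assumption: FsShell as the main class; not confirmed by the thread. -->
    <main-class>org.apache.hadoop.fs.FsShell</main-class>
    <arg>-put</arg>
    <!-- Hypothetical parameters: local source file and HDFS destination. -->
    <arg>${localFile}</arg>
    <arg>${hdfsTargetDir}</arg>
  </java>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

Because the filesystem client then runs inside the launcher's own JVM, no second JVM has to be forked from a shell, which may be why the heap reservation error does not appear with this approach.
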

On Fri, Jul 6, 2012 at 12:00 AM, Harsh J <[email protected]> wrote:
> In addition to what Mohammad has already suggested, you may also try
> to bump the mapred.child.ulimit value, since this script runs as a
> forked process if I am right:
>
> Set oozie.launcher.mapred.child.ulimit to 2.5 GB in KB, so '2621440'.
>
> On Fri, Jul 6, 2012 at 12:21 PM, Mohammad Islam <[email protected]> wrote:
>>
>> What about this?
>>
>> <property>
>>   <name>oozie.launcher.mapred.child.java.opts</name>
>>   <value>-server -Xmx1G -Djava.net.preferIPv4Stack=true</value>
>>   <description>setting memory usage to 1024MB</description>
>> </property>
>>
>> More details can be found at:
>> http://incubator.apache.org/oozie/pig-cookbook.html
>>
>> Also, can you try the command line "hadoop fs -ls"?
>>
>> ----- Original Message -----
>> From: Tim Chan <[email protected]>
>> To: [email protected]; Mohammad Islam <[email protected]>
>> Cc:
>> Sent: Thursday, July 5, 2012 9:47 PM
>> Subject: Re: Running python script using Oozie
>>
>> Hi Mohammad,
>>
>> That didn't seem to help.
>>
>> Here is my action:
>>
>> <action name="hdfs-put">
>>   <shell xmlns="uri:oozie:shell-action:0.1">
>>     <job-tracker>${jobTracker}</job-tracker>
>>     <name-node>${nameNode}</name-node>
>>     <configuration>
>>       <property>
>>         <name>mapred.job.queue.name</name>
>>         <value>${queueName}</value>
>>       </property>
>>       <property>
>>         <name>oozie.launcher.mapred.child.java.opts</name>
>>         <value>-Xmx1G</value>
>>       </property>
>>     </configuration>
>>
>>     <exec>/usr/bin/hadoop</exec>
>>     <argument>fs</argument>
>>     <argument>-ls</argument>
>>
>>     <capture-output/>
>>   </shell>
>>
>> On Thu, Jul 5, 2012 at 8:18 PM, Mohammad Islam <[email protected]> wrote:
>>> Hi Tim,
>>> Could you try by adding this into shell action definition:
>>> <name>oozie.launcher.mapred.child.java.opts</name>
>>> <value>-Xmx1G </value>
>>>
>>> Regards,
>>> Mohammad
>>>
>>> ----- Original Message -----
>>> From: Tim Chan <[email protected]>
>>> To: [email protected]
>>> Cc:
>>> Sent: Thursday, July 5, 2012 8:04 PM
>>> Subject: Re: Running python script using Oozie
>>>
>>> I can run my python script now.
>>>
>>> But I am trying to use the shell action to run:
>>>
>>> hadoop fs -put fileName
>>>
>>> I get the following error in the logs:
>>>
>>> Error occurred during initialization of VM
>>> Could not reserve enough space for object heap
>>> Exit code of the Shell command 1
>>>
>>> What might be the problem?
>>>
>>> On Thu, Jul 5, 2012 at 3:21 PM, Harish Krishnan
>>> <[email protected]> wrote:
>>>> Hi Tim,
>>>>
>>>> Is <shell xmlns="uri:oozie:shell-action:0.1"> correct?
>>>> I thought this should be <shell xmlns="uri:oozie:workflow:0.2">
>>>>
>>>> Thanks & Regards,
>>>> Harish.T.K
>>>>
>>>> On Thu, Jul 5, 2012 at 2:53 PM, Tim Chan <[email protected]> wrote:
>>>>
>>>>> Alejandro,
>>>>>
>>>>> We're running Oozie server 2.3.2-cdh3u4.
>>>>> The shell action appears to be supported based on the documentation,
>>>>> but when I run my workflow, I get the following error in the oozie
>>>>> logs:
>>>>>
>>>>> E0701: XML schema error, cvc-complex-type.2.4.c: The matching
>>>>> wildcard is strict, but no declaration can be found for element
>>>>> 'shell'.
>>>>>
>>>>> When I use xmlns="uri:oozie:workflow:0.3" I get the following error:
>>>>>
>>>>> XException, org.apache.oozie.command.CommandException: E0701: XML
>>>>> schema error, cvc-elt.1: Cannot find the declaration of element
>>>>> 'workflow-app'.
>>>>> org.apache.oozie.command.CommandException: E0701: XML schema error,
>>>>> cvc-elt.1: Cannot find the declaration of element 'workflow-app'.
>>>>>
>>>>> Here is my workflow.xml:
>>>>>
>>>>> <workflow-app xmlns="uri:oozie:workflow:0.2"
>>>>>               name="dlx-mapping-processor-main">
>>>>>
>>>>>   <start to="shell-test"/>
>>>>>
>>>>>   <action name="shell-test">
>>>>>     <shell xmlns="uri:oozie:shell-action:0.1">
>>>>>       <job-tracker>${jobTracker}</job-tracker>
>>>>>       <name-node>${nameNode}</name-node>
>>>>>       <configuration>
>>>>>         <property>
>>>>>           <name>mapred.job.queue.name</name>
>>>>>           <value>${queueName}</value>
>>>>>         </property>
>>>>>       </configuration>
>>>>>
>>>>>       <exec>pwd</exec>
>>>>>
>>>>>       <capture-output/>
>>>>>
>>>>>     </shell>
>>>>>
>>>>>     <ok to="end"/>
>>>>>     <error to="fail"/>
>>>>>   </action>
>>>>>
>>>>>   <kill name="fail">
>>>>>     <message>Node failed, error
>>>>>     message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>>>>>   </kill>
>>>>>
>>>>>   <end name="end"/>
>>>>> </workflow-app>
>>>>>
>>>>> On Thu, Jul 5, 2012 at 9:59 AM, Alejandro Abdelnur <[email protected]>
>>>>> wrote:
>>>>> > Hi Tim,
>>>>> >
>>>>> > I think the Shell action would be better suited to run a python script.
>>>>> > And keep in mind python and all the libs you need should be available
>>>>> > on all nodes in the cluster.
>>>>> >
>>>>> > Thanks
>>>>> >
>>>>> > Alejandro
>>>>> >
>>>>> > On Tue, Jul 3, 2012 at 11:09 PM, Tim Chan <[email protected]> wrote:
>>>>> >
>>>>> >> I would like to use Oozie to run a python script on a worker node.
>>>>> >>
>>>>> >> I've been looking at the documentation located here:
>>>>> >>
>>>>> >> https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
>>>>> >>
>>>>> >> under the heading: Java-Main Action with Script support
>>>>> >>
>>>>> >> Is ReadErrorStream some custom class? It is not a part of the Java IO
>>>>> >> API.
>>>>> >>
>>>>> >> Is there updated documentation on running scripts (ruby, python, perl,
>>>>> >> etc) using Oozie?
>>>>> >>
>>>>> >
>>>>> > --
>>>>> > Alejandro
>>>>>
>>>>> --
>>>>> Tim Chan // [email protected] // 213.784.2523
>>>
>>> --
>>> Tim Chan // [email protected] // 213.784.2523
>>
>> --
>> Tim Chan // [email protected] // 213.784.2523
>
> --
> Harsh J

--
Tim Chan // [email protected] // 213.784.2523
