Yes, thanks everyone. I have it working now as a Java action. That works because Oozie (nicely) puts all the Hadoop jars on the classpath before running the Java code.
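
For anyone finding this thread later, here is a minimal sketch of the token propagation step mentioned in the quoted thread below. It is not my actual job code. It assumes a secured cluster where the launcher exposes the delegation token file through the HADOOP_TOKEN_FILE_LOCATION environment variable, and that the child MR jobs pick it up from the mapreduce.job.credentials.binary property. The class name CascadingMainSketch and the method buildFlowProperties are made up for illustration, and the step of handing the properties to a Cascading flow connector is left as a comment so the sketch compiles without the Cascading or Hadoop jars.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical main class for an Oozie Java action; names are illustrative only. */
class CascadingMainSketch {

    // Assumption: set in the launcher's environment on secured clusters.
    static final String TOKEN_ENV = "HADOOP_TOKEN_FILE_LOCATION";
    // Assumption: job property that points child MR jobs at the same token file.
    static final String TOKEN_PROP = "mapreduce.job.credentials.binary";

    /** Builds the properties to hand to the flow, copying the token file location if present. */
    static Properties buildFlowProperties(Map<String, String> env) {
        Properties props = new Properties();
        String tokenFile = env.get(TOKEN_ENV);
        if (tokenFile != null && !tokenFile.isEmpty()) {
            props.setProperty(TOKEN_PROP, tokenFile);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildFlowProperties(System.getenv());
        // A real job would pass props to its Cascading flow connector here and
        // forward args (arg1 ... argN) to the flow class; this sketch only prints them.
        System.out.println("flow properties: " + props);
        System.out.println("flow arguments: " + String.join(" ", args));
    }
}
```

On an unsecured cluster the environment variable is absent, so the properties stay empty and the flow runs as it would under "hadoop jar".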

On Thu, Oct 17, 2013 at 1:37 AM, Som Satpathy <somsatpa...@gmail.com> wrote:

> I have been running crunch/cascading jobs as oozie java actions, no
> problems so far.
>
>
> On Tue, Oct 8, 2013 at 4:47 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
>
> > I would suggest looking at how the Pig/Hive/Sqoop/Distcp actions work if
> > you want to have a custom <cascading> action. Which, BTW, would be a
> > great contribution to Oozie.
> >
> > If you are going down this path, you'll have to write a
> > CascadingActionExecutor class that runs in the Oozie server and you'll
> > have to write a CascadingMain class that runs in the Launcher job. Plus
> > an XSD defining the cascading XML syntax.
> >
> > If you want to start simpler, you could do it via the Java action. You
> > will only need the CascadingMain for this. You can cannibalize the
> > Pig/Hive/Sqoop/Distcp Oozie main classes for this. The most important
> > thing here is to ensure the tokens are propagated to the cascading MR
> > jobs.
> >
> > hope this helps.
> >
> >
> > On Tue, Oct 8, 2013 at 3:19 PM, <mpeters...@gmail.com> wrote:
> >
> > > Follow up. I've tried to run a Cascading job in oozie a couple of
> > > ways, but they all fail for various reasons.
> > >
> > > I tried to put it in a map-reduce action with
> > > oozie.launcher.action.main.class defined pointing to my Cascading
> > > class, but I can't see any way to pass all the arguments to it that
> > > it needs.
> > >
> > > I also tried to use a shell action using
> > > oozie.launcher.action.main.class. That launches my class but doesn't
> > > pass any arguments to it even though I specified arguments in the
> > > shell action.
> > >
> > > Finally, I tried to do it with a shell command where I don't specify
> > > oozie.launcher.action.main.class and instead put '/usr/bin/hadoop' as
> > > the exec action and then put all the rest of the invocation as
> > > commands. This invokes my Cascading class with the right arguments,
> > > but then dies for no apparent reason that I can tell from the Hadoop
> > > logs (it never launches the Cascading MR jobs).
> > >
> > > If anyone has an example of a working oozie workflow where they wrap
> > > a Cascading job, I'd love to see it.
> > >
> > > -Michael
> > >
> > >
> > > On Tue, Oct 8, 2013 at 4:08 PM, <mpeters...@gmail.com> wrote:
> > >
> > > > Apologies if this has been asked before, but I can't figure out how
> > > > to search the archives of this mailing list and 20 minutes of
> > > > googling yielded no useful results.
> > > >
> > > > I'm on a team that uses Cascading to do our MapReduce flows.
> > > > However, we are investigating using Oozie to do additional types of
> > > > actions (hive, shell, etc.) and use its scheduler. For this to
> > > > work, we'll need to be able to run a Cascading job as an oozie
> > > > action. Which is what I can't figure out how to do.
> > > >
> > > > Typically to run a Cascading job, we'll do this:
> > > >
> > > > hadoop jar mycascading_uberjar.jar com.company.MyCascadingFlow arg1 arg2 arg3 argN
> > > >
> > > > My first thought was to use an oozie map-reduce action, since I run
> > > > this with "hadoop jar" and Cascading creates MRs under the hood,
> > > > but the oozie map-reduce action wants things like
> > > > mapred.mapper.class and mapred.reducer.class. Well, MyCascadingFlow
> > > > runs two dozen different mappers and a few different reducers!
> > > >
> > > > What is the best way to do this? The java action seems wrong since
> > > > it won't run it with "hadoop jar". Which leaves me with just a
> > > > shell action and putting the "hadoop jar ...." line in a shell
> > > > script and invoking it.
> > > >
> > > > Other ideas?
> > > >
> > > > -Michael
> > > >
> >
> > --
> > Alejandro