On Mon, Jul 1, 2013 at 5:31 AM, Ahmet Emre Aladağ <emre.ala...@agmlab.com> wrote:
> Hi,
>
> I'd like to add a new stage called "updatescore" after "updatedb" to Nutch
> 2.1.
>
> I tried two ways for this:
> 1) public class ScoreUpdaterJob extends NutchTool implements Tool;
>
> Nutch requires me to define the InputFormat, OutputFormat etc. to perform
> Map-reduce calculations.
>
> I don't want to perform map-reduce but call a Giraph job to run on Hadoop.
> When it's finished, Nutch can go on its way.
>
> 2) public class ScoreUpdaterJob implements Tool;
> or public class ScoreUpdaterJob;
>
> Then I can't use setJarClass of NutchTool, so hadoop job fails:
> Caused by: java.lang.ClassNotFoundException:
> org.apache.giraph.examples.LinkRank.LinkRankComputation
>

Isn't setJarByClass a method provided by Hadoop itself (on Job), rather
than something provided by NutchTool?
https://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/Job.html#setJarByClass%28java.lang.Class%29

> How can I fix this? What's the best way to add a giraph job as a Nutch
> stage?
>

My feeling is that #2 should work.

> Thanks,
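
P.S. If it helps with debugging the ClassNotFoundException: as far as I understand, setJarByClass works by asking the class loader where the given class file lives and using the jar it finds there as the job jar. Below is a minimal stdlib-only sketch of that kind of lookup, so you can check on the client whether the class actually resolves to a jar. JarLocator and its method names are hypothetical (not part of Nutch, Hadoop or Giraph), and the real Hadoop code may differ in detail.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.Enumeration;

/** Hypothetical debugging helper; not part of Nutch, Hadoop or Giraph. */
public final class JarLocator {

    private JarLocator() {}

    /** Turns a class into the resource path a class loader can look up. */
    public static String resourcePath(Class<?> clazz) {
        return clazz.getName().replace('.', '/') + ".class";
    }

    /**
     * For a URL such as "jar:file:/x/a.jar!/pkg/C.class" returns
     * "file:/x/a.jar"; returns null for anything that is not a jar URL.
     */
    public static String jarFileOf(String url) {
        if (!url.startsWith("jar:")) {
            return null;
        }
        int bang = url.indexOf('!');
        return bang < 0 ? null : url.substring("jar:".length(), bang);
    }

    /**
     * Returns the first jar on the class path that contains the class,
     * or null if the class was loaded from a plain directory or elsewhere.
     */
    public static String findContainingJar(Class<?> clazz) throws IOException {
        ClassLoader loader = clazz.getClassLoader();
        if (loader == null) {
            // Bootstrap classes report a null loader; fall back to the system one.
            loader = ClassLoader.getSystemClassLoader();
        }
        Enumeration<URL> urls = loader.getResources(resourcePath(clazz));
        for (URL url : Collections.list(urls)) {
            String jar = jarFileOf(url.toString());
            if (jar != null) {
                return jar;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Pass a fully qualified class name, or default to this class.
        Class<?> clazz = args.length > 0 ? Class.forName(args[0]) : JarLocator.class;
        System.out.println(clazz.getName() + " -> " + findContainingJar(clazz));
    }
}
```

Run it with the same class path you use to submit the job and pass the computation class name as the argument. If it prints null, the class is not coming from a jar, which would fit with the tasks not finding it on the cluster.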