On Mon, Jul 1, 2013 at 5:31 AM, Ahmet Emre Aladağ <emre.ala...@agmlab.com> wrote:

> Hi,
>
> I'd like to add a new stage called "updatescore" after "updatedb" to Nutch
> 2.1.
>
> I tried two ways for this:
> 1) public class ScoreUpdaterJob extends NutchTool implements Tool;
>
> Nutch requires me to define the InputFormat, OutputFormat etc. to perform
> Map-reduce calculations.
>
> I don't want to perform map-reduce but call a Giraph job to run on Hadoop.
> When it's finished, Nutch can go on its way.
>

> 2) public class ScoreUpdaterJob implements Tool;
> or public class ScoreUpdaterJob;
>
> Then I can't use setJarClass of NutchTool, so hadoop job fails:
> Caused by: java.lang.ClassNotFoundException:
> org.apache.giraph.examples.LinkRank.LinkRankComputation
>

Isn't the method you mean setJarByClass, which is provided by Hadoop's own
Job class rather than by NutchTool?
https://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/Job.html#setJarByClass%28java.lang.Class%29
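
Per its Javadoc, setJarByClass sets the job jar by finding where the given
class came from, so it may help to check which jar (if any) the Giraph
class resolves to on the client before submitting. Below is a minimal
diagnostic sketch using only the JDK; JarLocator and locate are
hypothetical names for illustration and are not part of Nutch, Giraph, or
Hadoop.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Nutch, Giraph, or Hadoop): reports where
// a class was loaded from, which is the kind of lookup a "set jar by class"
// call depends on.
public class JarLocator {

    /** Returns the jar or directory a class was loaded from, or null if unknown. */
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String)
        // have no code source, so there is nothing to report for them.
        if (src == null || src.getLocation() == null) {
            return null;
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass a fully qualified class name as the first argument, for
        // example the Giraph computation class from the stack trace.
        String name = args.length > 0 ? args[0] : JarLocator.class.getName();
        Class<?> cls = Class.forName(name);
        System.out.println(name + " -> " + locate(cls));
    }
}
```

Run it with the same classpath you use to launch the job. If Class.forName
already fails here, the Giraph jar is missing on the client side as well;
if it resolves to a jar other than the one submitted with the job, that
would be consistent with the tasks not finding the class.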

>
> How can I fix this? What's the best way to add a giraph job as a Nutch
> stage?
>

My feeling is that #2 should work.


> Thanks,
>
>
>
