Hi Adil,

Why don't you simply SSH to the master node, install Nutch there, and run the crawl script in runtime/deploy? You can then monitor your crawl in the usual way using the MapReduce UI.
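A rough sketch of that workflow (the host name, key path, Nutch version, and crawl arguments below are placeholders, not taken from this thread -- adapt them to your cluster):

```shell
# 1. SSH to the EMR master node (hypothetical key and host)
ssh -i ~/mykey.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com

# 2. On the master: fetch and build Nutch (1.9 shown as an example)
wget https://archive.apache.org/dist/nutch/1.9/apache-nutch-1.9-src.tar.gz
tar xzf apache-nutch-1.9-src.tar.gz
cd apache-nutch-1.9
ant runtime

# 3. Put your seed URLs on HDFS
hadoop fs -mkdir -p /nutch/urls
hadoop fs -put seeds.txt /nutch/urls/

# 4. Run the crawl script from runtime/deploy so the jobs are submitted
#    to the cluster as MapReduce jobs rather than run locally.
#    The argument list varies across Nutch versions -- run bin/crawl
#    with no arguments to see the usage for yours.
cd runtime/deploy
bin/crawl /nutch/urls /nutch/crawl 2
```

Running from runtime/deploy (rather than a script pulled from HDFS by script-runner) means bin/nutch is a real local executable, and the jobs show up in the MapReduce UI as usual.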
HTH,
Julien

On 1 January 2015 at 17:03, Adil Ishaque Abbasi <aiabb...@gmail.com> wrote:

> I tried to run it through a custom JAR step using the script-runner JAR, i.e.
> s3://elasticmapreduce/libs/script-runner/script-runner.jar
>
> Regards
> Adil I. Abbasi
>
> On Thu, Jan 1, 2015 at 8:51 PM, Meraj A. Khan <mera...@gmail.com> wrote:
>
>> Can you give us the command that you use to start the crawl?
>>
>> On Jan 1, 2015 10:28 AM, "Adil Ishaque Abbasi" <aiabb...@gmail.com> wrote:
>>
>>> When I try to run the Nutch crawl script on Amazon EMR, it gives me this error:
>>>
>>> /mnt/var/lib/hadoop/steps/s-3VT1QRVSURPSH/./crawl: line 81:
>>> hdfs:///nutch/bin/nutch: No such file or directory
>>> Command exiting with ret '0'
>>>
>>> Though the nutch script is located at hdfs:///nutch/bin/, it still gives this error.
>>>
>>> Any idea what it is that I'm doing wrong?
>>>
>>> Regards
>>> Adil

-- 
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble