Hi Adil,

Why don't you simply SSH to the master node, install Nutch there, and run the crawl script in runtime/deploy? You can then monitor your crawl in the usual way using the MapReduce UI.
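A rough sketch of that workflow (the host name, key path, Nutch version, and crawl arguments below are placeholders, not taken from this thread -- adapt them to your cluster):

```shell
# 1. SSH to the EMR master node (hypothetical key and host)
ssh -i ~/mykey.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com

# 2. On the master: fetch and build Nutch (1.9 shown as an example)
wget https://archive.apache.org/dist/nutch/1.9/apache-nutch-1.9-src.tar.gz
tar xzf apache-nutch-1.9-src.tar.gz
cd apache-nutch-1.9
ant runtime

# 3. Put your seed URLs on HDFS
hadoop fs -mkdir -p /nutch/urls
hadoop fs -put seeds.txt /nutch/urls/

# 4. Run the crawl script from runtime/deploy so the jobs are submitted
#    to the cluster as MapReduce jobs rather than run locally.
#    The argument list varies across Nutch versions -- run bin/crawl
#    with no arguments to see the usage for yours.
cd runtime/deploy
bin/crawl /nutch/urls /nutch/crawl 2
```

Running from runtime/deploy (rather than a script pulled from HDFS by script-runner) means bin/nutch is a real local executable, and the jobs show up in the MapReduce UI as usual.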
HTH,
Julien

On 1 January 2015 at 17:03, Adil Ishaque Abbasi <aiabb...@gmail.com> wrote:

> I tried to run it through a custom JAR step using the script-runner JAR, i.e.
> s3://elasticmapreduce/libs/script-runner/script-runner.jar
>
> Regards
> Adil I. Abbasi
>
> On Thu, Jan 1, 2015 at 8:51 PM, Meraj A. Khan <mera...@gmail.com> wrote:
>
>> Can you give us the command that you use to start the crawl?
>>
>> On Jan 1, 2015 10:28 AM, "Adil Ishaque Abbasi" <aiabb...@gmail.com> wrote:
>>
>>> When I try to run the Nutch crawl script on Amazon EMR, it gives me this error:
>>>
>>> /mnt/var/lib/hadoop/steps/s-3VT1QRVSURPSH/./crawl: line 81:
>>> hdfs:///nutch/bin/nutch: No such file or directory
>>> Command exiting with ret '0'
>>>
>>> Though the nutch script is located at hdfs:///nutch/bin/, it still gives this error.
>>>
>>> Any idea what it is that I'm doing wrong?
>>>
>>> Regards
>>> Adil

-- 
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble