Re: Nutch - Hadoop Help

Talat Uyarer Mon, 03 Feb 2014 21:18:18 -0800

Hi,

Actually, the shell script just runs Hadoop commands. Each job is submitted
to the JobTracker, and the JobTracker coordinates splitting the job into
tasks and handing them out to the TaskTrackers.
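
To make that concrete, here is a rough dry-run sketch of the kind of loop such a script performs. It only prints the commands rather than running them. The class names and their arguments are my assumptions based on the Nutch 2.x layout, HADOOP_HOME's default is a made-up path, and ROUNDS is only a rough stand-in for -depth, so check everything against the bin/nutch and bin/crawl scripts in your own release.

```shell
#!/bin/sh
# Dry-run sketch: prints the Hadoop commands instead of executing them.
# Class names and arguments below are assumptions based on Nutch 2.x;
# check them against the bin/nutch and bin/crawl scripts in your release.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"    # assumed install path
JOB_FILE="${JOB_FILE:-/usr/local/nutch/nutch.job}" # path from the original command
SEED_DIR="${SEED_DIR:-dmoz}"                       # seed directory from the original command
ROUNDS="${ROUNDS:-3}"                              # rough stand-in for -depth 3
TOPN="${TOPN:-5000}"

# One call here stands for one MapReduce job submitted to the JobTracker.
submit() {
    echo "$HADOOP_HOME/bin/hadoop jar $JOB_FILE $*"
}

# Seed the crawl once, then repeat generate/fetch/parse/update per round.
submit org.apache.nutch.crawl.InjectorJob "$SEED_DIR"

round=1
while [ "$round" -le "$ROUNDS" ]; do
    submit org.apache.nutch.crawl.GeneratorJob -topN "$TOPN"
    submit org.apache.nutch.fetcher.FetcherJob -all
    submit org.apache.nutch.parse.ParserJob -all
    submit org.apache.nutch.crawl.DbUpdaterJob
    round=$((round + 1))
done
```

Every printed line is a separate job. The script itself runs only on the machine where you start it; it is the JobTracker that spreads each job's tasks over the cluster.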


Talat

On Monday, 3 February 2014, d_k <[email protected]> wrote:

> How exactly can you make Hadoop execute the shell script? Or do you mean to
> execute the script from the master node and each task will be sent to all
> the nodes and the results sent back?
>
>
> On Mon, Feb 3, 2014 at 9:58 PM, Lewis John Mcgibbney <[email protected]>
> wrote:
>
> > Hi Manikandan,
> >
> > On Mon, Feb 3, 2014 at 3:45 PM, <[email protected]> wrote:
> >
> > > And then, I'm running this:
> > > $HADOOP_HOME/bin/hadoop jar /usr/local/nutch/nutch.job
> > > org.apache.nutch.crawl.Crawler dmoz -dir /user/hduser/crawl -depth 3
> > > -topN 5000
> > >
> >
> > You're using the Crawler class. This is not advised at all and is now
> > deprecated. There is no point in downloading the crawl script if you are
> > going to use the Crawler class. I would suggest using the crawl script
> > instead.
> >
> >
> > >
> > > org.apache.gora.memory.store.MemStore as the Gora storage class.
> > >
> >
> > Please don't use MemStore. Its implementation in Gora 0.3 is not
> > thread-safe and is only used for trivial tests. Please see the 2.x
> > tutorial on the Nutch wiki for details of how to configure the supported
> > Gora persistent data stores.
> >
> >
> > Once you've used the crawl script and configured your Nutch deployment
> > job file, please get back to us with your results.
> > Remember you will always need to regenerate your Nutch job file if you
> > make configuration changes to your Nutch deployment.
> > hth
> > Thanks
> >
>


-- 
Talat UYARER
Website: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304
