On Tuesday 15 May 2012 17:39:31 Vikas Hazrati wrote:
> So once the crawl (which abstracts iterative crawls until the depth is
> reached) is finished, is there a command-line option to trigger a recrawl
> so that Nutch keeps running as a daemon, or is a shell script the way to
> go?

Shell scripting is the way to go. On each new crawl cycle, Nutch will
automatically recrawl pages that are due to be refetched.
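
A minimal sketch of such a script, assuming Nutch 1.x run in local mode from
the install directory. The paths, the topN value, and the sleep interval are
placeholders, and the function names crawl_cycle and recrawl_loop are made up
for illustration:

```shell
#!/bin/sh
# Sketch of a recrawl loop for Nutch 1.x. All paths and defaults below are
# assumptions; adjust them to your installation.
NUTCH="${NUTCH:-bin/nutch}"   # assumed location of the Nutch launcher script

# Run one generate/fetch/parse/updatedb cycle.
# Usage: crawl_cycle CRAWLDB SEGMENTS_DIR TOPN
crawl_cycle() {
    crawldb=$1
    segments=$2
    topn=$3
    # Select URLs that are due for (re)fetching into a new segment.
    "$NUTCH" generate "$crawldb" "$segments" -topN "$topn" || return 1
    # Segment directories are named by timestamp, so the last one is newest.
    segment=$(ls -d "$segments"/2* | tail -1)
    "$NUTCH" fetch "$segment" || return 1
    "$NUTCH" parse "$segment" || return 1
    # Merge the fetch results back into the crawldb.
    "$NUTCH" updatedb "$crawldb" "$segment" || return 1
}

# Repeat cycles forever, sleeping between them.
# Usage: recrawl_loop CRAWLDB SEGMENTS_DIR TOPN [SLEEP_SECONDS]
recrawl_loop() {
    while true; do
        # A failed cycle is only logged here; whether generate fails when
        # nothing is due is an assumption to verify on your version.
        crawl_cycle "$1" "$2" "$3" || echo "cycle skipped or failed" >&2
        sleep "${4:-3600}"
    done
}
```

Sourcing the file and calling "recrawl_loop crawl/crawldb crawl/segments 1000
3600" would then run roughly one cycle per hour. The crawldb is assumed to have
been seeded with "bin/nutch inject" beforehand, and indexing is left out.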

> 
> Regards | Vikas
> 
> On Fri, May 11, 2012 at 8:26 PM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:
> > If you would like I could add you to the moderators group and you can
> > word it how you wish.
> > 
> > Please sign up to Jira, give me your Jira username on this page, and I
> > will happily add you to the group.
> > 
> > On the other hand, if you don't wish to do this, then please reply
> > here with your suggestion and I'll make sure something gets changed to
> > accommodate your suggestions.
> > 
> > Thanks
> > 
> > On Fri, May 11, 2012 at 2:52 PM, Matthias Paul <magethle.nu...@gmail.com> wrote:
> > > I was confused by this tutorial:
> > > http://wiki.apache.org/nutch/NutchTutorial
> > > 
> > > Reading this page one might get to the conclusion that the crawl tool
> > > can't do iterative crawling, because under "3.2 Using Individual
> > > Commands for Whole-Web Crawling" there's the sentence "This also
> > > permits ... incremental crawling", as if the crawl command described
> > > before (3.1 Using the Crawl Command) couldn't do that.
> > > 
> > > Could someone perhaps improve this part of the tutorial?
> > > 
> > > Matthias
> > > 
> > > On Thu, May 10, 2012 at 8:39 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> > >> By default each crawl is iterative. The crawl command is nothing more
> > >> than a wrapper around the individual crawl cycle commands. The depth
> > >> parameter is nothing more than executing a single crawl cycle multiple
> > >> times. This is, if I am not mistaken, also true for older releases,
> > >> certainly 1.2 and above.
> > 
> > >> On Thu, 10 May 2012 19:31:27 +0100, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:
> > >>> For the record, there is a patch pending review for Nutchgora which
> > >>> will sort part of this for you as well.
> > >>> 
> > >>> https://issues.apache.org/jira/browse/NUTCH-1301
> > >>> 
> > >>> Susam Pal also contributed a patch for Nutchgora regarding incremental
> > >>> indexing but I can't find it just now sorry.
> > >>> 
> > >>> Lewis
> > >>> 
> > >>> 
> > >>> On Thu, May 10, 2012 at 5:18 PM, Matthias Paul <magethle.nu...@gmail.com> wrote:
> > >>>> Hi all,
> > >>>> 
> > >>>> can the crawl command also be used for iterative crawls?
> > >>>> In older Nutch versions this was not possible, but in 1.5 it seems to
> > >>>> work?
> > >>>> 
> > >>>> Thanks
> > >>>> Matthias
> > >> 
> > >> --
> > >> Markus Jelsma - CTO - Openindex
> > 
> > --
> > Lewis
-- 
Markus Jelsma - CTO - Openindex
