suggestions.
Thanks
On Fri, May 11, 2012 at 2:52 PM, Matthias Paul magethle.nu...@gmail.com
wrote:
I was confused by this tutorial:
http://wiki.apache.org/nutch/NutchTutorial
Reading this page one might get to the conclusion that the crawl tool
can't do iterative crawling, because under "3.2 Using Individual
Commands for Whole-Web Crawling" there's the sentence "This also
permits ... incremental crawling", as if the crawl command described
before ("3.1 Using the Crawl Command") couldn't do that.
Could someone perhaps improve
For the record, there is a patch pending review for Nutchgora which
will sort part of this for you as well.
https://issues.apache.org/jira/browse/NUTCH-1301
Susam Pal also contributed a patch for Nutchgora regarding incremental
indexing, but I can't find it just now, sorry.
Lewis
On Thu, May
By default each crawl is iterative. The crawl command is nothing more
than a wrapper around the individual crawl cycle commands. The depth
parameter is nothing more than executing a single crawl cycle multiple
times. This is, if I am not mistaken, also true for older releases,
certainly 1.2
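
To illustrate the point that the depth parameter just repeats one crawl cycle, here is a rough sketch in Python that only builds the list of individual commands a crawl of a given depth would correspond to. It assumes the Nutch 1.x command names from the tutorial (inject, generate, fetch, parse, updatedb). The directory names, the function name crawl_commands and the <segment-N> placeholders are made up for illustration; in a real run the segment directory is created by generate and has to be looked up afterwards. The actual crawl tool runs these steps in-process rather than through a shell, and link inversion and indexing are left out here.

```python
def crawl_commands(url_dir="urls", crawl_dir="crawl", depth=3, nutch="bin/nutch"):
    """Return the individual commands that roughly match a crawl of `depth` cycles.

    All paths are hypothetical defaults, not values taken from the thread.
    """
    crawldb = f"{crawl_dir}/crawldb"
    segments = f"{crawl_dir}/segments"

    # Seed URLs are injected once, before the first cycle.
    commands = [[nutch, "inject", crawldb, url_dir]]

    # One crawl cycle per unit of depth: generate, fetch, parse, updatedb.
    for cycle in range(1, depth + 1):
        # Placeholder only: the real segment name is chosen by `generate`.
        segment = f"{segments}/<segment-{cycle}>"
        commands += [
            [nutch, "generate", crawldb, segments],
            [nutch, "fetch", segment],
            [nutch, "parse", segment],
            [nutch, "updatedb", crawldb, segment],
        ]
    return commands


if __name__ == "__main__":
    # Print the command sequence for a depth of 2 without executing anything.
    for argv in crawl_commands(depth=2):
        print(" ".join(argv))
```

Running the same four cycle commands again later against the same crawldb is what the tutorial calls incremental crawling, so under these assumptions the two approaches differ mainly in who drives the loop.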