On Thu, Jun 27, 2013 at 3:38 AM, Sznajder ForMailingList <
bs4mailingl...@gmail.com> wrote:

> Hi
>
> I do not see the usage of "Segments" in nutch 2.x
>
> In addition, I do not see a DB path.
>

"segments" and "crawldb" are 1.x notions: they are the directories on the
filesystem that hold the crawler's data (nothing more than Hadoop MapFiles
and SequenceFiles).
2.x uses a datastore to hold the crawled data instead. A table is created in
the datastore, and all the information goes into that table.

>
> In that case, how can we run two separate crawls, one starting from url1
> and the second from another seed, for example?
>

You could specify a different crawlID for each crawl. To be honest, I have
never tried running multiple crawls at the same time with 2.x.
It's not considered a good thing to do, as Julien mentioned in this thread:
http://lucene.472066.n3.nabble.com/Concurrently-running-multiple-nutch-crawls-td3166207.html
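
To make the crawlID idea concrete, here is a rough dry-run sketch that only
prints the command lines for two separate crawls, one after the other rather
than concurrently. The seed directories, the crawl ID names and the topN
value are made up for illustration, and the flag spellings are from memory
of the 2.x command usage (the updatedb arguments in particular differ between
2.x releases), so check them against what bin/nutch prints for each command.
I believe the crawl ID ends up as a prefix of the table name, so each crawl
should get its own table, but I have not verified that on every backend.

```python
# Dry-run sketch: builds the Nutch 2.x command lines for one crawl cycle per
# crawl ID. Nothing is executed; the commands are only printed.
# Assumptions (not from this thread): the seed directories, the crawl ID
# names, and the exact flag spellings. Verify against your bin/nutch usage.

NUTCH = "bin/nutch"  # assumed path to the Nutch launcher script


def crawl_cycle(crawl_id, seed_dir, top_n=1000):
    """Return argv lists for one inject/generate/fetch/parse/updatedb cycle."""
    tag = ["-crawlId", crawl_id]
    return [
        [NUTCH, "inject", seed_dir] + tag,
        [NUTCH, "generate", "-topN", str(top_n)] + tag,
        [NUTCH, "fetch", "-all"] + tag,
        [NUTCH, "parse", "-all"] + tag,
        [NUTCH, "updatedb"] + tag,
    ]


if __name__ == "__main__":
    # Two independent crawls, each with its own hypothetical seed dir and ID.
    crawls = {"crawl_url1": "urls/seed1", "crawl_url2": "urls/seed2"}
    for crawl_id, seed_dir in crawls.items():
        for argv in crawl_cycle(crawl_id, seed_dir):
            print(" ".join(argv))
```

Every step carries the same -crawlId, so the data of the two crawls should
stay apart even though both go through the same datastore.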

>
> Benjamin
>
