Re: Index a dynamic list of urls

Julien Nioche Mon, 08 Jun 2009 02:29:52 -0700

Bonjour Fabrice,

You could simply inject your list of URLs into the crawlDB and use the
"-noAdditions" option of the updatedb command. This way, URLs found during
parsing are not added to the crawlDB and hence are never fetched.
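
A minimal sketch of that cycle as a shell function, assuming a Nutch 1.x
layout. The paths (crawl/crawldb, crawl/segments, urls/), the NUTCH variable
and the function name refresh_urls are placeholders for illustration, not
anything prescribed by Nutch. It also assumes the fetcher parses pages while
fetching; if parsing is disabled in the fetcher in your configuration, a
separate parse step on the segment would be needed before updatedb. The
indexing commands are left out.

```shell
#!/bin/sh
# Sketch only: all paths below are assumed defaults, adjust to your setup.
NUTCH="${NUTCH:-bin/nutch}"             # Nutch launcher script
CRAWLDB="${CRAWLDB:-crawl/crawldb}"     # crawlDB location
SEGMENTS="${SEGMENTS:-crawl/segments}"  # parent directory of segments
URLDIR="${URLDIR:-urls}"                # seed files, one URL per line

refresh_urls() {
    # 1. Add the current list of URLs to the crawlDB.
    "$NUTCH" inject "$CRAWLDB" "$URLDIR" || return 1

    # 2. Generate a fetch list; this creates a new timestamped segment.
    "$NUTCH" generate "$CRAWLDB" "$SEGMENTS" || return 1

    # Segment names are timestamps, so the last one in sort order is the newest.
    segment=$(ls -d "$SEGMENTS"/* | tail -1)

    # 3. Fetch the pages listed in that segment.
    "$NUTCH" fetch "$segment" || return 1

    # 4. Update the crawlDB with the fetch results, but do not add the
    #    outlinks discovered during parsing (-noAdditions).
    "$NUTCH" updatedb "$CRAWLDB" "$segment" -noAdditions || return 1
}
```

Calling refresh_urls after each change to the seed files should pick up the
new entries through inject. How often already known URLs are re-fetched is
still governed by the fetch interval settings, so those may need tuning for
a frequently changing list.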


HTH

Julien

-- 
DigitalPebble Ltd
http://www.digitalpebble.com

2009/6/8 Fabrice Estiévenart <[email protected]>

> Hello,
>
> I'd like to use Nutch to index a dynamic (i.e. constantly changing) list of
> urls but without crawling (i.e. not following the links contained in the
> pages). The process should be able to index new urls and update the index
> for changed ones.
>
> What's the best way to do this with Nutch commands? Thanks,
>
> --
> Fabrice Estiévenart, Ingénieur R&D, CETIC
> Web : http://www.cetic.be
>
>
