Hello Fabrice,

You could simply inject your list of URLs into the crawlDB and use the "-noAdditions" option of the updatedb command. This way, the URLs found during parsing will not be added to the crawlDB and hence will not be fetched.
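As a rough sketch of what one round of that could look like from the command line: the directory names (crawl/crawldb, crawl/segments, urls), the environment variables and the function name refresh_round are placeholders of mine, not anything Nutch prescribes, so adjust them to your installation.

```shell
#!/bin/sh
# Sketch only. The paths, variable names and function name below are
# assumptions for illustration; they do not come from Nutch itself.
NUTCH="${NUTCH:-bin/nutch}"             # path to the Nutch launcher script
CRAWLDB="${CRAWLDB:-crawl/crawldb}"     # crawlDB location
SEGMENTS="${SEGMENTS:-crawl/segments}"  # parent directory of the segments
URLDIR="${URLDIR:-urls}"                # directory of text files, one URL per line

refresh_round() {
    # 1. Add the current list of URLs to the crawlDB.
    $NUTCH inject "$CRAWLDB" "$URLDIR" || return 1

    # 2. Generate a fetch list in a new segment.
    $NUTCH generate "$CRAWLDB" "$SEGMENTS" || return 1

    # 3. Pick the newest segment. Segment names are timestamps,
    #    so the last one in sorted order is the one just generated.
    segment=$(ls -d "$SEGMENTS"/* | sort | tail -n 1)

    # 4. Fetch the pages. If your configuration does not parse while
    #    fetching, also run: $NUTCH parse "$segment"
    $NUTCH fetch "$segment" || return 1

    # 5. Update the crawlDB without adding the newly discovered links.
    $NUTCH updatedb "$CRAWLDB" "$segment" -noAdditions
}
```

Calling refresh_round after each change to the files in the urls directory should pick up the new entries, and because of -noAdditions the crawlDB only ever contains the URLs you injected. Indexing the resulting segment is a separate step that I have left out here.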
HTH

Julien

--
DigitalPebble Ltd
http://www.digitalpebble.com

2009/6/8 Fabrice Estiévenart <[email protected]>

> Hello,
>
> I'd like using Nutch to index a dynamic (i.e. constantly changing) list of
> urls but without crawling (i.e. not following the links contained in the
> urls). The process should be able to index new urls and update the index
> with changing ones.
>
> What's the best way to do this with Nutch commands ? Thanks,
>
> --
> Fabrice Estiévenart, R&D Engineer, CETIC
> Web : http://www.cetic.be
