Hi - I'm running Nutch and want to ensure that a given URL is crawled
just once (and never again).

To achieve this, I tried passing a large negative value (-9999) for
adddays, so that the URL would not be due for re-crawl until (30 + 9999)
days had passed. But it isn't working for me. In fact, I found that URLs
keep getting re-crawled in different cycles of the same run.
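For reference, this is roughly how I'm invoking the generator (the crawldb and segments paths are just from my local setup, so treat them as placeholders):

```sh
# Generate a fetch list with the fetch-due check shifted by -9999 days.
# My assumption: with db.fetch.interval.default at its 30-day default,
# a fetched page should then not be selected again for (30 + 9999) days.
bin/nutch generate crawl/crawldb crawl/segments -adddays -9999
```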

Any ideas?
-- 
View this message in context: 
http://www.nabble.com/How-to-ensure-that-a-particular-URL-is-not-crawled-%28ever%29-again-tp23071795p23071795.html
Sent from the Nutch - User mailing list archive at Nabble.com.
