Hi - I'm running Nutch and want to ensure that a given URL is crawled just once (and never again).
To achieve this, I tried setting a large negative value for adddays (-9999), so that the URL would not be due for re-crawl until (30 + 9999) days had passed. But it isn't working for me: I found that URLs keep getting re-crawled in different cycles of the same run. Any ideas?

--
View this message in context: http://www.nabble.com/How-to-ensure-that-a-particular-URL-is-not-crawled-%28ever%29-again-tp23071795p23071795.html
Sent from the Nutch - User mailing list archive at Nabble.com.
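P.S. In case it helps, the other approach I was considering is raising the default fetch interval itself rather than playing with adddays. A sketch of what I mean is below; note this is just my guess at the right override, and the exact property name depends on the Nutch version (older releases use `db.default.fetch.interval` in days, while Nutch 1.0 uses `db.fetch.interval.default` in seconds):

```xml
<!-- nutch-site.xml (sketch): override the default re-fetch interval.
     On my version the default is 30 days; the idea is to push it out
     so far that a fetched URL is effectively never due again.
     Property name varies by version, as noted above. -->
<property>
  <name>db.default.fetch.interval</name>
  <!-- value is in days on older releases; ~100 years here -->
  <value>36500</value>
</property>
```

I haven't verified whether this actually stops the re-crawls I'm seeing within a single run, since those URLs seem to come back before the interval should even matter.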
