Hi,

I'm using Nutch 0.8.1 and noticed the following: when pageA redirects
to pageB (HTTP 3xx), pageA remains unfetched in the crawlDB, while
pageB is fetched.

Hence, pageA shows up in each generate/fetch/updatedb iteration.
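
To illustrate the behaviour I am seeing, here is a toy model in Python.
It is not Nutch code: the function names, the status strings and the
example.com URLs are made up for illustration, and it simply assumes that
only the redirect target gets recorded as fetched, never the source.

```python
# Toy model of the generate/fetch/updatedb cycle; NOT Nutch code.
# All names and status strings below are hypothetical.
UNFETCHED, FETCHED = "db_unfetched", "db_fetched"

# Hypothetical site: pageA answers with a 3xx redirect to pageB.
REDIRECTS = {"http://example.com/pageA": "http://example.com/pageB"}


def generate(crawldb):
    """Select every URL that is still marked unfetched."""
    return sorted(url for url, status in crawldb.items() if status == UNFETCHED)


def fetch(urls):
    """Follow redirects and report only the URL that was finally fetched."""
    return {REDIRECTS.get(url, url) for url in urls}


def updatedb(crawldb, fetched_urls):
    """Mark fetched URLs; the redirecting source URL is never touched."""
    for url in fetched_urls:
        crawldb[url] = FETCHED


def run_cycles(rounds):
    """Run several iterations and record the fetchlist of each one."""
    crawldb = {"http://example.com/pageA": UNFETCHED}
    history = []
    for _ in range(rounds):
        fetchlist = generate(crawldb)
        history.append(fetchlist)
        updatedb(crawldb, fetch(fetchlist))
    return crawldb, history
```

Under that assumption, calling run_cycles(3) puts pageA on the fetchlist
in all three rounds, because nothing ever changes its status.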

Is this a bug? I found a previous thread on this list which describes 
this issue too:
http://www.mail-archive.com/[email protected]/msg04599.html

Mathijs

