Hi, as I wrote before, it seems that I am not the only one who cannot crawl all the URLs in seed.txt. I have not been able to find a solution. I collected 450 domains, and Nutch will not or cannot crawl approximately 200 of them. Why does this happen, and is there a way to force Nutch to crawl these sites?
It would be great to get a satisfying answer that explains the cause and, if possible, how to solve it. Thanks in advance, Ayhan