Hi Bob,

it's impossible to make any diagnosis without the full log files,
the complete configuration, and a detailed description of what is missing.

It could be a bug, of course. But it's more likely a configuration issue;
you should check the log files. Also have a look at:
- the robots.txt of the crawled sites
- your URL filters
- http.content.limit

These are often the reason why links are not found or not fetched.
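
For example, pages larger than http.content.limit (64 kB by default) are
truncated, and links beyond the cut-off are silently lost. If that's what
happens in your crawl, you can raise or disable the limit in
conf/nutch-site.xml:

  <property>
    <name>http.content.limit</name>
    <value>-1</value>  <!-- -1 disables the limit -->
  </property>

Similarly, the default conf/regex-urlfilter.txt skips URLs containing
certain characters ('?', '=', etc.), so pages reachable only via
query-string links are never fetched. Comment out or relax this rule if
your site needs those URLs:

  # skip URLs containing certain characters as probable queries, etc.
  -[?*!@=]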


> even when I use the sitemap.xml as a seed url.

You need to use the SitemapProcessor:
  bin/nutch sitemap
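
For example (paths are placeholders, and the exact options depend on your
Nutch version - run bin/nutch sitemap without arguments to see the usage
message):

  bin/nutch sitemap crawl/crawldb -sitemap urls/sitemaps -threads 4

This parses the sitemaps and injects the listed URLs into the CrawlDb, so
the following generate/fetch rounds can pick them up. Using sitemap.xml as
a plain seed URL is not enough.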

Best,
Sebastian

On 05/09/2018 07:08 PM, Robert Scavilla wrote:
>  Hello and thank you for your help. I'm confused why nutch 1.14 (I've had
> the same issues with earlier versions) is not crawling full websites. I set
> the number of rounds to a generous number and the crawl quits without
> crawling the whole site with the message "No New Links Found". This happens
> even when I use the sitemap.xml as a seed url.
> 
> Any help is greatly appreciated.
> 
> Best,
> ...bob
> 
