Hi, hoping for some help getting sitemap.xml crawling to work with Nutch 1.18. I am crawling with this command:

    $NUTCH_HOME/bin/crawl -i -D solr.server.url=http://localhost:8983/solr/nutch --sitemaps-from-hostdb always -s $NUTCH_HOME/urls/ $NUTCH_HOME/Crawl 10

When the flag --sitemaps-from-hostdb always is used, this error occurs:

    Generator: number of items rejected during selection:
    Generator: 201  SCHEDULE_REJECTED
    Generator: 0 records selected for fetching, exiting ...

Without this flag present it crawls the site without issue. In nutch-default.xml I changed the fetch interval from the default 30 days to 2 seconds:

    <name>db.fetch.interval.default</name>
    <value>2</value>

I also don't understand why the crawldb is automatically deleted after each crawl, so I cannot run any commands against the URLs that were not crawled.
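For example, the kind of check I would like to run after the crawl is something along these lines (a sketch, assuming the crawldb ends up under $NUTCH_HOME/Crawl/crawldb as per the crawl command above):

    $NUTCH_HOME/bin/nutch readdb $NUTCH_HOME/Crawl/crawldb -stats

but by the time the crawl script finishes, that crawldb directory is already gone.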
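For reference, the interval override mentioned above is just the standard property block in the conf file, roughly this (the description text is from memory and only approximate):

    <property>
      <name>db.fetch.interval.default</name>
      <value>2</value>
      <description>The default number of seconds between re-fetches of a page.</description>
    </property>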

Any help is appreciated.

--
Andrew MacKay