Hello,

We have a sitemap.xml pointing to further sitemaps. The XML seems fine, but Nutch thinks those two sitemap URLs are actually one, consisting of both concatenated.

Here is https://www.saxion.nl/sitemap.xml

<?xml version="1.0" encoding="UTF-8"?>
<ns2:sitemapindex xmlns:ns2="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.saxion.nl/opleidingen-sitemap.xml</loc>
    <loc>https://www.saxion.nl/content-sitemap.xml</loc>
  </sitemap>
</ns2:sitemapindex>

This seems fine, but Nutch attempts, and obviously fails to load:

2018-05-25 16:27:50,515 ERROR [Thread-30] org.apache.nutch.util.SitemapProcessor: Error while fetching the sitemap. Status code: 14 for https://www.saxion.nl/opleidingen-sitemap.xmlhttps://www.saxion.nl/content-sitemap.xml

What is going on here? Why does Nutch, or CC's sitemap util behave like this?

Thanks,
Markus
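
For anyone wanting to reproduce this outside Nutch, here is a minimal Python sketch that inspects the same index. It shows that both <loc> elements sit inside a single <sitemap> entry, whereas the sitemaps.org protocol uses one <loc> per <sitemap>. The helper name locs_per_sitemap is hypothetical, and the idea that a parser might accumulate the text of every <loc> inside one entry is only an assumption; it has not been checked against the Nutch or crawler-commons source.

```python
import xml.etree.ElementTree as ET

# The sitemap index exactly as served (copied from the post above).
SITEMAP_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<ns2:sitemapindex xmlns:ns2="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.saxion.nl/opleidingen-sitemap.xml</loc>
    <loc>https://www.saxion.nl/content-sitemap.xml</loc>
  </sitemap>
</ns2:sitemapindex>"""


def locs_per_sitemap(xml_text):
    """Return one list of <loc> values per <sitemap> entry (hypothetical helper)."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    # <sitemap> and <loc> carry no prefix and there is no default namespace,
    # so they are matched here by their plain, unqualified names.
    for sitemap in root.iter("sitemap"):
        entries.append([loc.text.strip() for loc in sitemap.iter("loc")])
    return entries


if __name__ == "__main__":
    entries = locs_per_sitemap(SITEMAP_INDEX)
    print(len(entries), "sitemap entry with", len(entries[0]), "loc elements")
    # Assumption: a parser that appends the text of every <loc> inside one
    # entry would end up with a string like the one in the log line.
    print("".join(entries[0]))
```

If that assumption holds, giving each URL its own <sitemap> element would be the thing to try on the site side.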