The robots.txt parsing does not recognize the "Sitemap" directive, which was
likely not part of the robots.txt conventions when this connector was written.
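For illustration, a minimal sketch of what recognizing the directive could look like. This is not the actual ManifoldCF connector code; the class and method names are made up, and real handling of User-agent groups is omitted:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only, not ManifoldCF source: a robots.txt line
// handler that treats "Sitemap:" as a known directive instead of
// reporting "Unknown robots.txt line".
public class RobotsLineParser {
    private final List<String> sitemapUrls = new ArrayList<>();

    // Returns true if the line was recognized.
    public boolean parseLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return true; // blank lines and comments are ignored
        }
        int colon = trimmed.indexOf(':');
        if (colon < 0) {
            return false; // malformed line
        }
        // Field name is everything before the FIRST colon, so URLs in
        // the value (which contain "://") are not split incorrectly.
        String field = trimmed.substring(0, colon).trim();
        String value = trimmed.substring(colon + 1).trim();
        if (field.equalsIgnoreCase("Sitemap")) {
            sitemapUrls.add(value); // per the sitemaps.org protocol, an absolute URL
            return true;
        }
        // Real User-agent / Disallow / Allow logic would go here.
        return field.equalsIgnoreCase("User-agent")
            || field.equalsIgnoreCase("Disallow")
            || field.equalsIgnoreCase("Allow");
    }

    public List<String> getSitemapUrls() {
        return sitemapUrls;
    }
}
```

With a parser like this, the two lines from the example robots.txt below would be collected rather than logged as unknown.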

Karl


On Wed, Jul 7, 2021 at 3:31 AM h0444xk8 <h0444...@posteo.de> wrote:

> Hi,
>
> I have a general question. Is the Web connector supporting sitemap files
> referenced by the robots.txt? In my use case the robots.txt is stored in
> the root of the website and is referencing two compressed sitemaps.
>
> Example of robots.txt
> ------------------------
> User-Agent: *
> Disallow:
> Sitemap: https://www.example.de/sitemap/de-sitemap.xml.gz
> Sitemap: https://www.example.de/sitemap/en-sitemap.xml.gz
>
> When crawling starts, there is an error log entry in "Simple History" as
> follows:
>
> Unknown robots.txt line: 'Sitemap:
> https://www.example.de/sitemap/en-sitemap.xml.gz'
>
> Is there a general problem with sitemaps at all or with sitemaps
> referenced in robots.txt or with compressed sitemaps?
>
> Best regards
>
> Sebastian
>
