[ 
https://issues.apache.org/jira/browse/CONNECTORS-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376611#comment-17376611
 ] 

Sebastian Bölling commented on CONNECTORS-1657:
-----------------------------------------------

I would also appreciate this feature. The use of a sitemap reference in 
robots.txt is specified in [https://www.sitemaps.org/protocol.html#submit_robots].

This feature would be an easy and standard way for webmasters to inform the Web 
connector about the pages available for crawling.
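
To make the request concrete, here is a rough sketch of how Sitemap lines could be picked out of a robots.txt body. It is only an illustration in Python, not ManifoldCF code: the function name extract_sitemaps and the sample robots.txt content are assumptions made up for this example, and the real Web connector parser may be structured quite differently.

```python
from typing import List


def extract_sitemaps(robots_txt: str) -> List[str]:
    """Return the URLs named by Sitemap lines in a robots.txt body.

    Hypothetical helper for illustration only; not part of ManifoldCF.
    """
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, because the URL itself contains colons.
        field, sep, value = line.partition(":")
        # The field name is matched case-insensitively.
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Assumed sample content, modelled on the line quoted in the issue.
sample = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.persee.fr/sitemap.xml
"""
print(extract_sitemaps(sample))
```

The Sitemap line is independent of any User-agent group, so a sketch like this can collect it without tracking which group it appears under. The collected URLs could then be used as additional seeds for the crawl.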

> Web connector - Handle sitemap instruction in robot.txt
> -------------------------------------------------------
>
>                 Key: CONNECTORS-1657
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1657
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Web connector
>    Affects Versions: ManifoldCF 2.17
>            Reporter: Julien Massiera
>            Priority: Major
>
> Currently the web connector does not understand when the robots.txt file 
> points to a sitemap. As an example, for the site https://www.persee.fr, 
> the simple history shows the following error:
> Unknown robots.txt line: 'Sitemap: https://www.persee.fr/sitemap.xml'
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
