[ https://issues.apache.org/jira/browse/CONNECTORS-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481393#comment-17481393 ]
DK commented on CONNECTORS-1695:
--------------------------------

The server returns valid sitemap XML with MIME type text/xml. As per another defect, text/xml is in 'interestingMimeType' and should be supported. I also exclude it in the Solr output connector. However, I just get an error in the job history indicating that text/xml is restricted, and the web connector still tries to process sitemap.xml as one full XML file.

I would appreciate any pointers or help fixing it.

> Sitemap xml not detected in version 2.17 webconnector
> -----------------------------------------------------
>
>                 Key: CONNECTORS-1695
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1695
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Web connector
>    Affects Versions: ManifoldCF 2.17
>            Reporter: DK
>            Priority: Major
>
> When I try to index a sitemap XML, the web connector indexes the whole XML into Solr.
> Please fix this in version 2.17.
> If any special configuration is needed, please describe it here and
> add it to the documentation to make it clear.
>
> Sitemap.xml:
> <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
>   <sitemap>
>     <loc>https://<url>/sitemap_1.xml</loc>
>     <lastmod>2022-01-21T16:04:45Z</lastmod>
>   </sitemap>
> </sitemapindex>
>
> sitemap_1.xml:
> <urlset>
>   <url>
>     <loc>https://<docurl></loc>
>     <lastmod>2018-10-31T11:25:27Z</lastmod>
>   </url>
> </urlset>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)