Hi David,

On Tue, Sep 7, 2021 at 7:32 PM David Jencks <[email protected]> wrote:
>
> I investigated the patch-sitemap.js question a bit.
>
> Using my `issue-16854-jsonpath-options` camel-website branch, I built the
> site twice, with site:url set to `https://camel.apache.org` and set to `/`.
> I didn’t look at every page, but diffing the generated sites seems to
> consistently show two differences between the results:
>
> - with `https://camel.apache.org` a head <link> element is included such as
>
> <link rel="canonical"
> href="https://camel.apache.org/components/latest/ironmq-component.html">
>
> It’s omitted as expected with `/`. My understanding is that this needs to be
> an absolute URI and that it’s function is to help search engines. However,
> if we don’t want it, it’s trivial to modify the UI to not generate it.

The Hugo built bits also contain the `<link rel="canonical">`[1]. When it is
used, the recommendation is to use absolute URLs[2]. Now, since we use 301
redirects and have sitemap(s), I'm not entirely convinced we need
`<link rel="canonical">` at all. I suggest we remove it (from both the Hugo
layout and the Antora UI).

> - with `https://camel.apache.org` a footer micro data script is plausible
> rather than meaningless (the `url:` entry):

I think the URLs in JSON-LD microdata need to be absolute. I can't find a
definite reference on that, but if I test[3] with a relative URL I get
"http://example-test.site/" for the URL, which might be a placeholder...

> In both cases I’d expect that to be usable the logo should be an absolute URI?

Yeah, I think all URLs need to be absolute in JSON-LD, but I'm not 100% sure
on that...

> I note that the next bit of micro data, BreadcrumbList, is always generated
> with absolute URIs with https://camel.apache.org. Shouldn’t this be generated
> from the site:url?

Well, I'm guessing that was a pragmatic choice when we did that initially. I
do remember some back and forth on that, but the context escapes me.

> My conclusions are:
>
> - There is no need for patch-sitemap.js and that the site needs to be
> generated with the correct site:url.

Let's first check that the sitemaps and the JSON-LD microdata are all
generated and contain absolute URLs. Otherwise, what Dan suggested on #772
could be a good way to go...

> - If inclusion of the <link> element causes a problem it can be removed from
> the UI.

+1, not sure if we need it at all...

> - The Organization micro data needs it’s logo URL fixed to be absolute based
> on site:url

+1

> - The BreadcrumbList micro data needs to be generated based on site:url.

+1

> Have I missed something?

We need to double check the XML sitemaps...
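
For that double check, something along these lines might do. It is only a
rough sketch: it assumes the generated sitemaps use the standard
sitemaps.org namespace, and the `relative_locs` helper and the sample XML are
made up for illustration, they are not part of camel-website.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol (assumed to be what the build emits).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def relative_locs(sitemap_xml):
    """Return every <loc> value that is not an absolute http(s) URL."""
    root = ET.fromstring(sitemap_xml)
    bad = []
    # iter() walks the whole tree, so this covers <urlset> and <sitemapindex> files.
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(url)
    return bad


# Hypothetical sample: one absolute entry, one relative entry.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://camel.apache.org/components/latest/ironmq-component.html</loc></url>
  <url><loc>/components/latest/ironmq-component.html</loc></url>
</urlset>"""

print(relative_locs(sample))
```

An empty list would mean every entry is absolute; anything printed is a
`<loc>` that still needs fixing via site:url.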

zoran

[1] https://github.com/apache/camel-website/blob/ad0b8c6efcea9943ae8e690ec020f5589c227a54/layouts/partials/header.html#L28
[2] https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls#rel-canonical-link-method
[3] https://search.google.com/test/rich-results?id=CaNl08DpmzTpYqlM7Cg1kA

--
Zoran Regvart
