Hi Zoran!

I’m going to leave checking the generated sitemaps to you because AFAICT they 
are correct when generated with the site:url.
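
For the sitemap check itself, something like the following might help. It is 
only a minimal sketch in Python, assuming the generated sitemaps follow the 
standard sitemaps.org schema; the function name `relative_locs` and the sample 
URLs are hypothetical and not part of the camel-website build.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def relative_locs(sitemap_xml):
    """Return every <loc> value that is not an absolute http(s) URL."""
    root = ET.fromstring(sitemap_xml)
    bad = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        parts = urlparse(url)
        # An absolute URL needs both a scheme and a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(url)
    return bad


# Hypothetical sample: one absolute entry, one relative entry.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://camel.apache.org/manual/index.html</loc></url>"
    "<url><loc>/manual/faq.html</loc></url>"
    "</urlset>"
)
print(relative_locs(sample))
```

An empty list for every generated sitemap would suggest that site:url was 
applied everywhere; any relative entries would show up by name.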

Would you like to make the UI changes you suggest or shall I write a PR?

Thanks!
David Jencks

> On Sep 8, 2021, at 2:26 AM, Zoran Regvart <zo...@regvart.com> wrote:
> 
> Hi David,
> 
> On Tue, Sep 7, 2021 at 7:32 PM David Jencks <david.a.jen...@gmail.com> wrote:
>> 
>> I investigated the patch-sitemap.js question a bit.
>> 
>> Using my `issue-16854-jsonpath-options` camel-website branch, I built the 
>> site twice, with site:url set to `https://camel.apache.org` and set to `/`.  
>> I didn’t look at every page, but diffing the generated sites seems to 
>> consistently show two differences between the results:
>> 
>> - with `https://camel.apache.org` a head <link> element is included such as
>> 
>>    <link rel="canonical" 
>> href="https://camel.apache.org/components/latest/ironmq-component.html">
>> 
>> It’s omitted as expected with `/`.  My understanding is that this needs to 
>> be an absolute URI and that its function is to help search engines.  
>> However, if we don’t want it, it’s trivial to modify the UI to not generate 
>> it.
> 
> The Hugo built bits also contain the `<link rel="canonical">`[1]. When
> used, the recommendation is to use absolute URLs[2]. Since we use
> 301 redirects and have sitemap(s), I'm not entirely convinced we need
> `<link rel="canonical">` at all. I suggest we remove it (from the Hugo
> layout and the Antora UI).
> 
>> - with  `https://camel.apache.org` a footer micro data script is plausible 
>> rather than meaningless (the `url:`  entry):
> 
> I think the URLs in JSON-LD microdata need to be absolute. I can't
> find a definitive reference on that, but if I test[3] with a relative URL
> I get "http://example-test.site/" for the URL, which might be a placeholder...
> 
>> In both cases I’d expect that to be usable the logo should be an absolute 
>> URI?
> 
> Yeah, I think all URLs need to be absolute in JSON-LD, but I'm not
> 100% on that...
> 
>> I note that the next bit of micro data, BreadcrumbList, is always generated 
>> with absolute URIs with https://camel.apache.org. Shouldn’t this be 
>> generated from the site:url?
> 
> Well, I'm guessing that was a pragmatic choice when we did that
> initially. I do remember some back and forth on that, but the context
> escapes me.
> 
>> My conclusions are:
>> 
>> - There is no need for patch-sitemap.js, and the site needs to be 
>> generated with the correct site:url.
> 
> Let's first check that the sitemaps and JSON-LD microdata are all
> generated and contain absolute URLs. Otherwise, what Dan suggested on #772
> could be a good way to go...
> 
>> - If inclusion of the <link> element causes a problem it can be removed from 
>> the UI.
> 
> +1, not sure if we need it at all...
> 
>> - The Organization micro data needs its logo URL fixed to be absolute based 
>> on site:url
> 
> +1
> 
>> - The BreadcrumbList micro data needs to be generated based on site:url.
> 
> +1
> 
>> Have I missed something?
> 
> We need to double check the XML sitemaps...
> 
> zoran
> 
> [1] 
> https://github.com/apache/camel-website/blob/ad0b8c6efcea9943ae8e690ec020f5589c227a54/layouts/partials/header.html#L28
> [2] 
> https://developers.google.com/search/docs/advanced/crawling/consolidate-duplicate-urls#rel-canonical-link-method
> [3] https://search.google.com/test/rich-results?id=CaNl08DpmzTpYqlM7Cg1kA
> -- 
> Zoran Regvart
