Hi,

Sometime last year I was surprised to see (not on a public list
unfortunately) that bookindex.html is 657kB, with > 200kB just being
repetitions of
  xmlns="http://www.w3.org/1999/xhtml" xmlns:xlink="http://www.w3.org/1999/xlink"
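A rough way to estimate this kind of overhead is to count the bytes taken up
by the namespace declarations themselves. The following is only a minimal
Python sketch: the helper name xmlns_overhead and the sample string are
assumptions made up for illustration, not the method used to arrive at the
numbers above.

```python
import re

# Matches xmlns="..." and xmlns:prefix="..." attribute text, including the
# leading whitespace that separates it from whatever precedes it.
XMLNS_RE = re.compile(r'\s+xmlns(?::[A-Za-z_][\w.-]*)?="[^"]*"')


def xmlns_overhead(html: str) -> tuple[int, int]:
    """Return (number of declarations, bytes they occupy as UTF-8)."""
    matches = XMLNS_RE.findall(html)
    return len(matches), sum(len(m.encode("utf-8")) for m in matches)


if __name__ == "__main__":
    # Hypothetical sample; for a real check, read html/bookindex.html instead.
    sample = (
        '<dt xmlns="http://www.w3.org/1999/xhtml" '
        'xmlns:xlink="http://www.w3.org/1999/xlink">foo</dt>'
    )
    count, size = xmlns_overhead(sample)
    print(count, size)
```

Running it over the generated file before and after a stylesheet change
shows how much of the size difference comes from the declarations alone.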
Reminded of this by a proposal to automatically generate docs as part of
cfbot runs (which'd be fairly likely to update bookindex.html), I spent a
few painful hours last night trying to track this down.

The two xmlns= attributes have different causes.

The xmlns="http://www.w3.org/1999/xhtml" is afaict caused by confusion on
our part. Some of our stylesheets use
  xmlns="http://www.w3.org/TR/xhtml1/transitional"
others use
  xmlns="http://www.w3.org/1999/xhtml"

It's noteworthy that the docbook xsl stylesheets end up with
  <html xmlns="http://www.w3.org/1999/xhtml">
so it's a bit pointless to reference
http://www.w3.org/TR/xhtml1/transitional afaict.

Adding xmlns="http://www.w3.org/1999/xhtml" to stylesheet-html-common.xsl
gets rid of xmlns="http://www.w3.org/TR/xhtml1/transitional" in bookindex
specific content.

Changing stylesheet.xsl from transitional to http://www.w3.org/1999/xhtml
gets rid of xmlns="http://www.w3.org/TR/xhtml1/transitional" in
navigation/footer.

Of course we should likely change all
http://www.w3.org/TR/xhtml1/transitional references, rather than just the
ones necessary to get rid of the xmlns= spam.

So far, so easy.

It took me way longer to understand what's causing all the xmlns:xlink=
appearances. For a long time I was misdirected, because if I remove the
<xsl:template name="generate-basic-index"> in stylesheet-html-common.xsl,
the number of xmlns:xlink drops drastically, to a handful. Which made me
think that their existence is somehow our fault. And I tried and tried to
find the cause.

But it turns out that this is originally caused by a still existing buglet
in the docbook xsl stylesheets, specifically autoidx.xsl. It doesn't list
xlink in exclude-result-prefixes, but uses ids etc from xlink.

The reason that we end up with so many more xmlns:xlink is just that
without our customization there ends up being a single
  <div xmlns:xlink="http://www.w3.org/1999/xlink" class="index">
and then everything below that doesn't need the xmlns:xlink anymore. But
because stylesheet-html-common.xsl emits the div itself, the xmlns:xlink is
instead emitted for each element that autoidx.xsl has "control" over.

Waiting for docbook to fix this seems a bit futile. I eventually found a
bug report about this, from 2016:
https://sourceforge.net/p/docbook/bugs/1384/

But we can easily reduce the "impact" of the issue by just adding a single
xmlns:xlink to <div class="index">, which is sufficient to convince
xsltproc to not repeat it.

Before:
-rw-r--r-- 1 andres andres 683139 Feb 13 04:31 html-broken/bookindex.html
After:
-rw-r--r-- 1 andres andres 442923 Feb 13 12:03 html/bookindex.html

While most of the savings are in bookindex, the rest of the files are
reduced by another ~100kB.

WIP patch attached. For now I just adjusted the minimal set of
xmlns="http://www.w3.org/TR/xhtml1/transitional", but I think we should
update all.

Greetings,

Andres Freund

diff --git i/doc/src/sgml/stylesheet-html-common.xsl w/doc/src/sgml/stylesheet-html-common.xsl
index d9961089c65..9f69af40a94 100644
--- i/doc/src/sgml/stylesheet-html-common.xsl
+++ w/doc/src/sgml/stylesheet-html-common.xsl
@@ -4,6 +4,7 @@
 %common.entities;
 ]>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+                xmlns="http://www.w3.org/1999/xhtml"
                 version="1.0">
 
 <!--
@@ -126,7 +127,11 @@ set toc,title
                                               &uppercase;),
                                     substring(&primary;, 1, 1)))]"/>
 
-  <div class="index">
+  <!-- pgsql-docs: added xmlns:xlink, autoidx.xsl doesn't include xlink in
+       exclude-result-prefixes. Without our customization that just leads to a
+       single xmlns:xlink in this div, but because we emit it it otherwise
+       gets pushed down to the elements output by autoidx.xsl -->
+  <div class="index" xmlns:xlink="http://www.w3.org/1999/xlink">
     <!-- pgsql-docs: begin added stuff -->
     <p class="indexdiv-quicklinks">
       <a href="#indexdiv-Symbols">
diff --git i/doc/src/sgml/stylesheet.xsl w/doc/src/sgml/stylesheet.xsl
index 0eac594f0cc..24a9481fd49 100644
--- i/doc/src/sgml/stylesheet.xsl
+++ w/doc/src/sgml/stylesheet.xsl
@@ -1,7 +1,7 @@
 <?xml version='1.0'?>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version='1.0'
-                xmlns="http://www.w3.org/TR/xhtml1/transitional"
+                xmlns="http://www.w3.org/1999/xhtml"
                 exclude-result-prefixes="#default">
 
 <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"/>