Re: Getting our tables to render better in PDF output

2020-02-11 Thread Alexander Lakhin
Hello Tom,
12.02.2020 00:51, Tom Lane wrote:
> The crummy formatting of our tables of functions and operators has
> been an issue for a long time.  To my mind, there are several things
> that need to be addressed:
>
> * The layout is completely unfriendly to function descriptions that
> run to more than a few words.
>
> * It's not very practical to have more than one example per function
> (or at least, we seldom do so).
>
> * The results look completely awful in PDF format, because of the
> narrow effectively-available space, plus the fact that the toolchain
> will prefer to overprint following columns instead of breaking text
> where there's no whitespace.
Please look at a less invasive approach that we have used at Postgres Pro
for some time (mainly to improve the translated documentation, but it
works for the original too). The idea is to add zero-width spaces
after/before some characters ('(', ',', '[', etc.) to let fop split lines
where desired. It has one disadvantage: it's not search-friendly
(though maybe that is application-dependent).
But if it's feasible, I think this approach can at least complement a
manual reformatting of the tables. Decreasing the font size in the tables
seems appropriate to me too.
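
A minimal sketch of the zero-width-space idea, written in Python rather than the XSLT used by the attached patch. The set of break characters, the choice to insert only after them, and the name add_break_hints are assumptions made for illustration, not taken from the patch.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE: invisible, but a permitted line-break point

# Assumption: break only *after* these characters; the real patch may also
# insert the hint before some characters.
BREAK_AFTER = "(,["


def add_break_hints(text: str) -> str:
    """Return text with a zero-width space inserted after each break character."""
    pieces = []
    for ch in text:
        pieces.append(ch)
        if ch in BREAK_AFTER:
            pieces.append(ZWSP)
    return "".join(pieces)


hinted = add_break_hints("make_interval(years int,months int)")
# Make the hints visible by swapping them for a pipe character.
print(hinted.replace(ZWSP, "|"))
# The search problem mentioned above: the plain substring no longer matches.
print("make_interval(years" in hinted)
```

The second print shows the disadvantage noted above: once the hint is inserted, a literal search for the original text fails unless the viewer ignores zero-width spaces.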

Best regards,
Alexander
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 0401a515df8..3be29dba9f1 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -169,10 +169,14 @@ XSLTPROC_FO_FLAGS += --stringparam img.src.path '$(srcdir)/'
 %-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) -o $@.tmp $(@D)/pg-customize-fo.xsl $@
+	mv $@.tmp $@
 
 %-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) -o $@.tmp $(@D)/pg-customize-fo.xsl $@
+	mv $@.tmp $@
 
 %.pdf: %.fo $(ALL_IMAGES)
 	$(FOP) -fo $< -pdf $@
diff --git a/doc/src/sgml/pg-customize-fo.xsl b/doc/src/sgml/pg-customize-fo.xsl
new file mode 100644
index 000..48978880126
--- /dev/null
+++ b/doc/src/sgml/pg-customize-fo.xsl
@@ -0,0 +1,118 @@
+[XSLT 1.0 stylesheet; its markup was stripped in the archive. The root element declares the namespaces http://www.w3.org/1999/XSL/Format and http://www.w3.org/1999/XSL/Transform.]


Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Jonathan S. Katz
On 2/11/20 3:49 PM, Jonathan S. Katz wrote:
> On 2/11/20 3:41 PM, Peter Geoghegan wrote:
>> On Tue, Feb 11, 2020 at 11:40 AM Jonathan S. Katz  
>> wrote:
>>> Anyway, attached is a first attempt at a patch. I tried a few different
>>> variations but in my quick review of it, I could not figure out how to
>>> make a XSLT respect having multiple stylesheets (likely due to my lack
>>> of familiarity with XSLT).
>>
>> I tried this patch out.
> 
> Thanks!
> 
>> The alignment is a little off, since the docs
>> don't appear in the website's frame, and lack the website's header. It
>> would be nice if the same margins appeared to the left and to the
>> right. 
> 
> Yup, that's a direct result of not having the Bootstrap base.
> 
>> But even still, it's a vast improvement.
> 
> Cool.

I played around with this for a bit longer, became a bit more familiar
with DocBook[1] (and a lot of other pages, but this one seemed
relevant), and here is what I came up with:

As I mentioned, the way pgweb works is that it wraps a root element (the
...) around the imported HTML from the
generation, which allows it to apply the various website styles. This is
important, because it allows us to apply some general style rules, but
namespace them specifically to the documentation. Hold this thought for
a moment.

When calling "make STYLE=website html", this turns on a flag that embeds
the URL to the old "docs.css" content that we generated. I did an
experiment where I overloaded the "dynamic CSS generator" we have in our
code to include the bootstrap.css files (as well as some others) in
addition to our new base CSS. This demonstrated a marked improvement in
the output from the above command, but it was still not perfect: the CSS
rules still expect there to be the #docContent namespace.
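
To illustrate why rules scoped to #docContent do not apply to a bare "make html" page, here is a small Python sketch that checks a fragment for the wrapper and adds one. The use of a div as the wrapper element and the helper names are assumptions; only the docContent id comes from the discussion above.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Record whether any start tag carries the id we are looking for."""

    def __init__(self, wanted_id: str) -> None:
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def has_wrapper(html: str, wanted_id: str = "docContent") -> bool:
    """True if some element in the fragment has the given id."""
    finder = IdFinder(wanted_id)
    finder.feed(html)
    return finder.found


def wrap(html: str, wanted_id: str = "docContent") -> str:
    # Assumption: a <div> serves as the wrapper element.
    return f'<div id="{wanted_id}">{html}</div>'


fragment = "<h1>Page Layout</h1><p>Some text</p>"
print(has_wrapper(fragment))
print(has_wrapper(wrap(fragment)))
```

A selector such as "#docContent table" can only match once the second check is true, which is why the locally built pages miss those rules.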

I thought this would be a good area to explore to see if I could get the
#docContent ID wrapped around the content body. As I was writing this
note (where actually I was about to throw in the towel), on a hunch I
improved my Googling and found a solution (attached).

This works with pgweb as pgweb extracts the content from the  tag
that is generated by "make html" so this is unaffected.

For this solution to fully work, I also need to make a patch to pgweb. I
have it 80% done, where the final 20% is getting rid of some annoying
errors about files it is looking for (the Bootstrap minification expects a
CSS map file; I believe I can silence that).

It's not perfect: we don't have a full container around the generated
documentation so you can't see it exactly as it's rendered on
the website, but it's way closer to the look and feel. I might be able
to add a few more attributes to make it look closer to the website in
that regard, though only after there is consensus that this approach is ok.

That said, I think this is a happy compromise that allows said mode to
appear mostly like what you would find on the website.

Thanks,

Jonathan

[1] http://docbook.sourceforge.net/release/xsl/current/doc/html/index.html
diff --git a/doc/src/sgml/stylesheet-html-common.xsl b/doc/src/sgml/stylesheet-html-common.xsl
index 9edce52a10..8c2c759c81 100644
--- a/doc/src/sgml/stylesheet-html-common.xsl
+++ b/doc/src/sgml/stylesheet-html-common.xsl
@@ -18,6 +18,13 @@
 pgsql-docs@lists.postgresql.org
 2
 
+
+
+  docContent
+
 
 
 
diff --git a/doc/src/sgml/stylesheet.xsl b/doc/src/sgml/stylesheet.xsl
index 4ff6e8ed24..bd27b8c1c9 100644
--- a/doc/src/sgml/stylesheet.xsl
+++ b/doc/src/sgml/stylesheet.xsl
@@ -23,7 +23,7 @@
 
   
 stylesheet.css
-https://www.postgresql.org/media/css/docs.css
+https://www.postgresql.org/dyncss/docs.css
   
 
 


signature.asc
Description: OpenPGP digital signature


Getting our tables to render better in PDF output

2020-02-11 Thread Tom Lane
The crummy formatting of our tables of functions and operators has
been an issue for a long time.  To my mind, there are several things
that need to be addressed:

* The layout is completely unfriendly to function descriptions that
run to more than a few words.

* It's not very practical to have more than one example per function
(or at least, we seldom do so).

* The results look completely awful in PDF format, because of the
narrow effectively-available space, plus the fact that the toolchain
will prefer to overprint following columns instead of breaking text
where there's no whitespace.

In [1], Alvaro suggested that we might be able to improve matters by
taking advantage of DocBook's features for column and row spanning.
I did some concrete experimentation in that line, and attached are
two alternative patches that show a couple of things we might do.
Both patches change tables 9.31 (Date/Time Functions) and 9.33
(Enum Support Functions), which I chose somewhat at random, but of
course there would be a lot more to be done if we choose to go this way.

The first patch uses only one row for each function example, while
the second patch uses two rows (i.e., example and result in separate
table rows).  Otherwise they're the same.

I initially did the enum-support table, and what I tried there included
getting rid of the separate table column for function result type by
writing the functions in the form "func(argtypes) returns resulttype".
(Note that this table failed to specify the result types at all before,
which doesn't seem great.)  The layout idea is

function name, args, result   description
 example                      example result

where we can repeat the "example / example result" row if we want more
examples per function.  Alternatively, in the second patch, it's

function name, args, result  description
 example
 example result

To my eyes, the first alternative is preferable in HTML, unless maybe you
want to read the manual in a *very* narrow browser window.  But some of
the examples/results still overrun the available space when looking
at it in PDF A4 format.  The second patch fixes that problem, but seems
not very pretty in a normal-width browser window.
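
As a rough sketch of the two alternatives, the following Python builds the DocBook rows for one function in either layout. It is not the markup from the attached patches: the two-column table, the column names c1 and c2, the choice to span full-width rows with namest/nameend, and the helper name function_rows are all assumptions.

```python
import xml.etree.ElementTree as ET


def function_rows(signature, description, examples, two_rows=False):
    """Build DocBook <row> elements for one function in a two-column table."""
    rows = []
    head = ET.Element("row")
    ET.SubElement(head, "entry").text = signature
    ET.SubElement(head, "entry").text = description
    rows.append(head)
    for example, result in examples:
        if two_rows:
            # Second layout: example and result each get their own row.
            for text in (example, result):
                row = ET.Element("row")
                # namest/nameend span the entry across the colspecs named
                # c1 through c2 (column names are assumed here).
                ET.SubElement(row, "entry", namest="c1", nameend="c2").text = text
                rows.append(row)
        else:
            # First layout: example and result share one row.
            row = ET.Element("row")
            ET.SubElement(row, "entry").text = example
            ET.SubElement(row, "entry").text = result
            rows.append(row)
    return rows


rows = function_rows(
    "enum_first(anyenum) returns anyenum",
    "Returns the first value of the input enum type",
    [("enum_first(null::rainbow)", "red")],
    two_rows=True,
)
for row in rows:
    print(ET.tostring(row, encoding="unicode"))
```

Adding more examples per function is then just a matter of passing more (example, result) pairs, which is the property both layouts are meant to provide.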

When I tried to apply the same idea to the date/time functions table,
it didn't really work well at all, mainly because of a few beasts like
make_interval() --- that caused the left column to be so wide that the
right-hand columns were horrid.  (At least with the toolchain version
I'm using, it seems like the colwidth specifications are respected
rigidly in PDF output but just plain ignored in HTML output.  What
seems to happen in HTML is that earlier columns get their preferred
width and later ones get squeezed.)

So the layout idea that the patches show for that table is

function name  arg types  result type
   description
   example     example result

or

function name  arg types  result type
   description
   example
   example result

(Even with that, I had to savage make_interval's arg-types list a bit
to keep that column from eating too much space...)

I'm not especially wedded to any of these ideas, but I hope to provoke
some discussion about what we might do in this area.  DocBook tables
aren't the greatest layout tool in the world, but they do have abilities
we're not exploiting.

Even with these changes, the amount of space available for examples
and results in PDF format is pretty tiny.  With examples and results
in the same row, it seems that you can only have a couple of dozen
consecutive non-whitespace characters without running into overwrite
issues, whereas in HTML format the trouble threshold is a good deal
higher.  I wonder if we could improve matters by switching to some
narrower font for  text in PDF?
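
A hedged sketch of how one might scan example strings for the overwrite risk described above. The limit of 24 characters is only one reading of "a couple of dozen"; the real threshold depends on font, paper size and column widths, and the helper names are hypothetical.

```python
import re

# Assumption: "a couple of dozen" is taken as 24 characters.
PDF_RUN_LIMIT = 24


def longest_unbroken_run(text: str) -> int:
    """Length of the longest run of consecutive non-whitespace characters."""
    runs = re.findall(r"\S+", text)
    return max((len(run) for run in runs), default=0)


def risky_for_pdf(text: str, limit: int = PDF_RUN_LIMIT) -> bool:
    """True if the text contains a run the PDF toolchain may be unable to break."""
    return longest_unbroken_run(text) > limit


examples = [
    "age(timestamp '1957-06-13')",
    "make_interval(days => 10)",
    # Hypothetical spacing-free form, to show a run that exceeds the limit.
    "date_trunc('hour',timestamp'2001-02-16 20:38:40')",
]
for example in examples:
    print(longest_unbroken_run(example), risky_for_pdf(example), example)
```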

regards, tom lane

[1] https://www.postgresql.org/message-id/2020011618.GA25792%40alvherre.pgsql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ceda48e..385fdc0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6798,471 +6798,618 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 
 
  Date/Time Functions
- 
+ 
+  
+  
+  
+  
+  
+  
+  
+  
+  
   

-Function
-Return Type
-Description
-Example
-Result
+Function
+Argument Types
+Result Type
+   
+   
+Description
+   
+   
+Example
+Example Result

   
 
   

-
+
  
   age
  
- age(timestamp, timestamp)
+ age
 
-interval
-Subtract arguments, producing a symbolic 

Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Jonathan S. Katz
On 2/11/20 3:41 PM, Peter Geoghegan wrote:
> On Tue, Feb 11, 2020 at 11:40 AM Jonathan S. Katz  
> wrote:
>> Anyway, attached is a first attempt at a patch. I tried a few different
>> variations but in my quick review of it, I could not figure out how to
>> make a XSLT respect having multiple stylesheets (likely due to my lack
>> of familiarity with XSLT).
> 
> I tried this patch out.

Thanks!

> The alignment is a little off, since the docs
> don't appear in the website's frame, and lack the website's header. It
> would be nice if the same margins appeared to the left and to the
> right. 

Yup, that's a direct result of not having the Bootstrap base.

> But even still, it's a vast improvement.

Cool.

>>
> There are a couple of inconsistencies in the tables and diagrams that
> appear on this documentation page (on my local build that uses your
> patch):
> 
> https://www.postgresql.org/docs/devel/storage-page-layout.html
> 
> The tables look different, which isn't too bad. The "Figure 68.1. Page
> Layout" diagram is massive, though. IIRC that was an issue that had to be
> addressed on the website a little after the introduction of images
> into the docs. It seems as if my local build of the docs needs that
> same fix.

Ditto on missing the Bootstrap base. The tables rely directly on that
base for the style and formatting. For the images, the CSS classes are:

"figure col-xl-8 col-lg-10 col-md-12"

"figure" is one of our custom defined classes, but the rest are
Bootstrap and are designed to size to the particular browser window
resolution.

(For the history of the figure sizing, it was two fixes:

1. One with the SVG generation to allow for it to scale (the "S" in SVG
:) and then
2. Applying the CSS classes shown above.

Without the CSS classes, the image will scale without limit.)
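
To make the effect of those classes concrete, here is a small Python sketch that computes what fraction of its container the figure occupies at a given viewport width. It assumes Bootstrap 4's default breakpoints and 12-column grid; figure_width_fraction is a hypothetical helper, not part of pgweb.

```python
from fractions import Fraction

# Assumption: Bootstrap 4 default breakpoints (minimum viewport width in px).
BREAKPOINTS = {"md": 768, "lg": 992, "xl": 1200}
GRID_COLUMNS = 12


def figure_width_fraction(viewport_px: int, classes: str) -> Fraction:
    """Fraction of the container width that a column with these classes occupies."""
    span = GRID_COLUMNS  # below every breakpoint the column is full width
    # Walk breakpoints from narrowest to widest; the widest one that applies wins.
    for name, min_width in sorted(BREAKPOINTS.items(), key=lambda item: item[1]):
        prefix = f"col-{name}-"
        for cls in classes.split():
            if cls.startswith(prefix) and viewport_px >= min_width:
                span = int(cls[len(prefix):])
    return Fraction(span, GRID_COLUMNS)


FIGURE_CLASSES = "figure col-xl-8 col-lg-10 col-md-12"
for width in (500, 800, 1000, 1400):
    print(width, figure_width_fraction(width, FIGURE_CLASSES))
```

On wide screens the figure is held to two thirds of the container, which is the cap that a local build without Bootstrap does not get.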

Jonathan





Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Peter Geoghegan
On Tue, Feb 11, 2020 at 11:40 AM Jonathan S. Katz  wrote:
> Anyway, attached is a first attempt at a patch. I tried a few different
> variations but in my quick review of it, I could not figure out how to
> make a XSLT respect having multiple stylesheets (likely due to my lack
> of familiarity with XSLT).

I tried this patch out. The alignment is a little off, since the docs
don't appear in the website's frame, and lack the website's header. It
would be nice if the same margins appeared to the left and to the
right. But even still, it's a vast improvement.

There are a couple of inconsistencies in the tables and diagrams that
appear on this documentation page (on my local build that uses your
patch):

https://www.postgresql.org/docs/devel/storage-page-layout.html

The tables look different, which isn't too bad. The "Figure 68.1. Page
Layout" diagram is massive, though. IIRC that was an issue that had to be
addressed on the website a little after the introduction of images
into the docs. It seems as if my local build of the docs needs that
same fix.

-- 
Peter Geoghegan




Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Jonathan S. Katz
On 2/11/20 2:32 PM, Tom Lane wrote:
> "Jonathan S. Katz"  writes:
>> On 2/11/20 1:37 PM, Tom Lane wrote:
>>> I also wonder why duplicating the website's style isn't the default.
>>> Doesn't seem like having authors optimize for some other style is
>>> what we really want.
> 
>> Oh, and specifically for this, my guess is because it requires one to
>> make a call over a network to load the stylesheet. :)
> 
> Surely we could provide directions about how to store that locally.

I have a little doubt about that, but per mention in the original email,
it means storing a lot more stylesheets and ones that may change with
more frequency than the project. It may not be too much of an issue, but
I do want to note that. I'm somewhat ambivalent myself, but my
preference is to have the single source of truth.

Anyway, attached is a first attempt at a patch. I tried a few different
variations but in my quick review of it, I could not figure out how to
make a XSLT respect having multiple stylesheets (likely due to my lack
of familiarity with XSLT).

This just swaps out the link. A better approach would be to find a way
to include multiple CSS stylesheets. After searching over a bunch of
different terms, I could not figure out how to get to this result, but
as mentioned, I'm close to clueless on writing XSLT at this point.

Another way we could get to the desired result is to add something to pgweb
similar to the old "docs.css" that is being referenced that combines the
multiple stylesheets into one. It's a bit of an anti-pattern on the modern
web, so I'm not thrilled to go down that route.
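
For reference, the "combine into one" route could look roughly like the Python sketch below. It is not pgweb's actual code; combine_css and the sample sheet contents are hypothetical, and the only point is that sheets are concatenated in a fixed order so later rules override earlier ones.

```python
def combine_css(sheets: dict) -> str:
    """Concatenate stylesheets in insertion order, labelling each part."""
    parts = []
    for name, css in sheets.items():
        parts.append(f"/* --- {name} --- */")
        parts.append(css.strip())
    return "\n".join(parts) + "\n"


combined = combine_css(
    {
        # Hypothetical contents; order matters for the cascade.
        "bootstrap.min.css": "table{border-collapse:collapse}",
        "base.css": "#docContent table { width: 100%; }",
    }
)
print(combined)
```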

Jonathan
diff --git a/doc/src/sgml/stylesheet.xsl b/doc/src/sgml/stylesheet.xsl
index 4ff6e8ed24..38434367f1 100644
--- a/doc/src/sgml/stylesheet.xsl
+++ b/doc/src/sgml/stylesheet.xsl
@@ -23,7 +23,7 @@
 
   
 stylesheet.css
-https://www.postgresql.org/media/css/docs.css
+https://www.postgresql.org/media/css/base.css
   
 
 




Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Peter Geoghegan
On Tue, Feb 11, 2020 at 10:37 AM Tom Lane  wrote:
> I also wonder why duplicating the website's style isn't the default.
> Doesn't seem like having authors optimize for some other style is
> what we really want.

FWIW, I've often wondered about it myself.

-- 
Peter Geoghegan




Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Tom Lane
"Jonathan S. Katz"  writes:
> On 2/11/20 1:37 PM, Tom Lane wrote:
>> I also wonder why duplicating the website's style isn't the default.
>> Doesn't seem like having authors optimize for some other style is
>> what we really want.

> Oh, and specifically for this, my guess is because it requires one to
> make a call over a network to load the stylesheet. :)

Surely we could provide directions about how to store that locally.

regards, tom lane




Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Jonathan S. Katz
On 2/11/20 1:37 PM, Tom Lane wrote:

> I also wonder why duplicating the website's style isn't the default.
> Doesn't seem like having authors optimize for some other style is
> what we really want.

Oh, and specifically for this, my guess is because it requires one to
make a call over a network to load the stylesheet. :)

Jonathan





Re: Duplicating website's formatting in local doc builds

2020-02-11 Thread Jonathan S. Katz
On 2/11/20 1:37 PM, Tom Lane wrote:
> I'm wondering how to do $SUBJECT.  The fine manual suggests
> 
>   make STYLE=website html
> 
> but what I'm getting here with that is not a very close approximation
> of what I see at postgresql.org.  It's closer than the default,
> but it's not the same font, margins, etc.
> 
> I also wonder why duplicating the website's style isn't the default.
> Doesn't seem like having authors optimize for some other style is
> what we really want.

It looks like it's pulling from the wrong source[1]. It should be:


https://www.postgresql.org/dyncss/base.css

There are a few more dependencies now as well to get the Bootstrap
structure and the font:

https://www.postgresql.org/media/css/fontawesome.css
https://www.postgresql.org/media/css/bootstrap.min.css

(And one for another font...which I see we should import the dependency on).
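
A small Python sketch of how the stylesheets listed above could be referenced either remotely or from local copies. The URLs are the ones given in this message; the load order, the "css" directory name and the helper names are assumptions, and nothing here downloads anything.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# URLs as listed above; the order is kept as written and is an assumption
# as far as the cascade is concerned.
STYLESHEETS = [
    "https://www.postgresql.org/dyncss/base.css",
    "https://www.postgresql.org/media/css/fontawesome.css",
    "https://www.postgresql.org/media/css/bootstrap.min.css",
]


def local_name(url: str) -> str:
    """File name to use for a locally stored copy of the stylesheet."""
    return PurePosixPath(urlparse(url).path).name


def link_tags(urls, local_dir=None):
    """One <link> tag per stylesheet, pointing at local copies if local_dir is set."""
    tags = []
    for url in urls:
        href = f"{local_dir}/{local_name(url)}" if local_dir else url
        tags.append(f'<link rel="stylesheet" type="text/css" href="{href}">')
    return tags


for tag in link_tags(STYLESHEETS, local_dir="css"):
    print(tag)
```

With local_dir left unset the tags point at the website, which keeps a single source of truth at the cost of a network call when viewing the docs.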

This should likely be a small, quick change. I was going to say
"after the release", but given I'm in both codebases at the
moment, I'll do a quick test and see how it looks.

(FWIW, I test the appearance a bit differently. I actually import built
documentation into my local copy of pgweb and tinker from there, as I'll
have all the dependencies available. That likely is not a viable option
for most people working on the documentation [unless we make it easier
to get pgweb up and running]).

Jonathan

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=doc/src/sgml/stylesheet.xsl;hb=HEAD#l26





Duplicating website's formatting in local doc builds

2020-02-11 Thread Tom Lane
I'm wondering how to do $SUBJECT.  The fine manual suggests

make STYLE=website html

but what I'm getting here with that is not a very close approximation
of what I see at postgresql.org.  It's closer than the default,
but it's not the same font, margins, etc.

I also wonder why duplicating the website's style isn't the default.
Doesn't seem like having authors optimize for some other style is
what we really want.

regards, tom lane