Re: Getting our tables to render better in PDF output

2020-04-13 Thread Corey Huinker
On Mon, Apr 13, 2020 at 12:28 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > I was thinking that there were references that included parameters, but I'm
> > not finding any with actual parameter values, so at most we'd lose the "()"
> > of a reference.
>
> We could possibly stick the parens into the indexterm text.  Arguably
> that's an improvement on its own merits, since it'd become clearer which
> index entries are function names.  If you don't want that, another idea is
> to put xreflabel options that include the parens into the indexterm tags.
> Or we can just standardize on not having parens, but personally I like
> them.  Without parens, for clarity you really have to write "function
> foo" which is redundant-looking in the XML and hence
> easy to get wrong.
>

That makes sense to me. There may be some hope for the font via the
xrefstyle attribute, but I'm not educated well enough on docbook to know
for sure.


> > Assuming we want to make the anchors visible, we need a way for people to
> > discover the anchors we've made, and my thought there is that we make the
> > first definition a non-xref link to the indexterm just above it. Any
> > thoughts on what the best way to do that is?
>
> I'm not really buying into that as a requirement.  For one thing, the
> anchor name will be 100% predictable.
>

The anchor name is deterministic (or I intend it to be) but
the existence of the link is not predictable. So while having no visible
link is fine for internal links which we create, I'm envisioning a
not-very-experienced reader wanting to help an even-less-experienced
person. If they find the date_part function, and they see that the word
"date_part" is itself clickable, they'll probably click it once, see that
it's a link, and send the less-experienced person the anchored link instead
of the broader page link. They're very unlikely to try to forge their own
anchor link in the hopes that it already exists.

> One thing that I noticed while playing with this last night is that
> even though  or  links will take you right to the exact
> table entry, the index entries generated from the indexterms only
> point to the page.  That seems pretty sad, why isn't it better?
>

As you've described it, it does seem very odd, but maybe I'm just
misunderstanding.


Re: Getting our tables to render better in PDF output

2020-04-13 Thread Tom Lane
Corey Huinker  writes:
> I was thinking that there were references that included parameters, but I'm
> not finding any with actual parameter values, so at most we'd lose the "()"
> of a reference.

We could possibly stick the parens into the indexterm text.  Arguably
that's an improvement on its own merits, since it'd become clearer which
index entries are function names.  If you don't want that, another idea is
to put xreflabel options that include the parens into the indexterm tags.
Or we can just standardize on not having parens, but personally I like
them.  Without parens, for clarity you really have to write "function
<function>foo</function>" which is redundant-looking in the XML and hence
easy to get wrong.

> Assuming we want to make the anchors visible, we need a way for people to
> discover the anchors we've made, and my thought there is that we make the
> first definition a non-xref link to the indexterm just above it. Any
> thoughts on what the best way to do that is?

I'm not really buying into that as a requirement.  For one thing, the
anchor name will be 100% predictable.

One thing that I noticed while playing with this last night is that
even though  or  links will take you right to the exact
table entry, the index entries generated from the indexterms only
point to the page.  That seems pretty sad, why isn't it better?

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-04-13 Thread Corey Huinker
>
>
> I did a quick check by adding id tags to all 700-or-so s in
> func.sgml (don't get excited, it was a perl one-liner that just added
> random id strings).


I did, actually, get excited for a second.


> The runtime difference for building the HTML docs
> seems to be under 1%, and negligible for PDF output.  So it looks like
> we don't have to worry about scalability of tagging all the functions.
>

Ok, so that's the function anchors.

So some references to functions are just the name, and xrefs will work fine
for those.

I was thinking that there were references that included parameters, but I'm
not finding any with actual parameter values, so at most we'd lose the "()"
of a reference.

Assuming we want to make the anchors visible, we need a way for people to
discover the anchors we've made, and my thought there is that we make the
first definition a non-xref link to the indexterm just above it. Any
thoughts on what the best way to do that is?


Re: Getting our tables to render better in PDF output

2020-04-12 Thread Tom Lane
Corey Huinker  writes:
> On Sat, Apr 11, 2020 at 6:41 PM Tom Lane  wrote:
>> Don't have a strong opinion about that, but it'd sure be a lot of new
>> anchors.

> So I can't speak to any scalability issues for adding a bunch of refs,

I did a quick check by adding id tags to all 700-or-so s in
func.sgml (don't get excited, it was a perl one-liner that just added
random id strings).  The runtime difference for building the HTML docs
seems to be under 1%, and negligible for PDF output.  So it looks like
we don't have to worry about scalability of tagging all the functions.
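
A throwaway tagging pass like the one described above can be sketched in Python. This is only an illustration and not the one-liner that was actually run: the function name `add_random_ids`, the `tmp-` prefix, and the choice of element passed as `tag` are all assumptions made for the example.

```python
import re
import uuid


def add_random_ids(sgml: str, tag: str) -> str:
    """Give every opening <tag ...> that has no id attribute a random one.

    Hypothetical helper: it only exists to produce a document with many
    extra anchors so that build times can be compared before and after.
    """
    # Match "<tag" as a whole element name, and skip tags that already
    # carry an id= attribute somewhere before the closing ">".
    pattern = re.compile(r"<%s(?![\w-])(?![^>]*\bid=)" % re.escape(tag))

    def tagger(match: "re.Match[str]") -> str:
        return f'{match.group(0)} id="tmp-{uuid.uuid4().hex[:12]}"'

    return pattern.sub(tagger, sgml)
```

The ids are meaningless by design; the point of such a run is only to measure whether hundreds of extra anchors slow down the HTML or PDF build.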

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-04-12 Thread Corey Huinker
On Sun, Apr 12, 2020 at 8:38 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > On Sat, Apr 11, 2020 at 6:41 PM Tom Lane  wrote:
> >> Is that going to be a problem for the docs toolchain?  If
> >> the anchors are attached to individual function names rather than
> >> sections or paragraphs, do they actually work well as link references?
> >> (I'm particularly wondering how an <xref> would render.)
>
> > So I can't speak to any scalability issues for adding a bunch of refs, but
> > I did try this out for justify_days() (diff attached) and here's what I
> > found:
> > * <link linkend="function-justify-days">justify_days</link>
> >    This made a link, in the same font as any other link ref.
> > * <xref linkend="function-justify-days"/>
> >    This made a link that looks exactly like the previous one, with the text
> > "justify_days", so if we're fine with the font change, we could use that
> > * <link linkend="function-justify-days"><function>justify_days</function></link>
> >    This made the link we want in the function font.
>
> Hm.  Attaching the link ID to an <indexterm> is an interesting hack.
>

It worked for glossterms, so I figured an indexterm is just another term.


> My inclination is to standardize on using <xref> for references and
> just accept the lack of a special font.  It's not worth the notational
> pain to use both <link> and <function>, especially not in HTML output
> where links will probably get rendered specially anyway.  We
> previously made the same tradeoff with respect to GUC variables,
> and I've not seen many complaints.  (I experimented with putting
> <function> into the indexterm text, but that did not help.)
>
> I'd be a bit inclined to shorten the ID prefix to "func-", just
> in the interests of carpal tunnel avoidance.
>

xref it is. I'll take a shot at scripting the necessary changes.


Re: Getting our tables to render better in PDF output

2020-04-12 Thread Tom Lane
Corey Huinker  writes:
> On Sat, Apr 11, 2020 at 6:41 PM Tom Lane  wrote:
>> Is that going to be a problem for the docs toolchain?  If
>> the anchors are attached to individual function names rather than
>> sections or paragraphs, do they actually work well as link references?
>> (I'm particularly wondering how an <xref> would render.)

> So I can't speak to any scalability issues for adding a bunch of refs, but
> I did try this out for justify_days() (diff attached) and here's what I
> found:
> * <link linkend="function-justify-days">justify_days</link>
>    This made a link, in the same font as any other link ref.
> * <xref linkend="function-justify-days"/>
>    This made a link that looks exactly like the previous one, with the text
> "justify_days", so if we're fine with the font change, we could use that
> * <link linkend="function-justify-days"><function>justify_days</function></link>
>    This made the link we want in the function font.

Hm.  Attaching the link ID to an <indexterm> is an interesting hack.
It makes me nervous, because it's not immediately obvious that that
won't cause links to lead to someplace in the index.  Still, it does
seem to work the way we want in both HTML and PDF output, so maybe
we can get away with it.  We've previously found that attaching an
ID to a  does *not* work, at least not in PDF --- see the
existing attempts for function-encode and function-decode, which
give rise to PDF build warnings and no functioning links.  I checked
just now and attaching the ID to the  acts the same, so it
seems it's <indexterm> or nothing.

My inclination is to standardize on using <xref> for references and
just accept the lack of a special font.  It's not worth the notational
pain to use both <link> and <function>, especially not in HTML output
where links will probably get rendered specially anyway.  We
previously made the same tradeoff with respect to GUC variables,
and I've not seen many complaints.  (I experimented with putting
<function> into the indexterm text, but that did not help.)

I'd be a bit inclined to shorten the ID prefix to "func-", just
in the interests of carpal tunnel avoidance.
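
For readers following along, the "predictable anchor name" idea can be sketched in a few lines of Python. This is a hypothetical illustration, not anything from the patch: the helper names are invented, the underscore-to-hyphen rule is inferred from the "function-justify-days" id and the FUNCTION-DATE-PART URL quoted in this thread, and the prefix is left as a parameter because "function-" versus "func-" was still under discussion.

```python
import re

# Base URL taken from the link quoted elsewhere in this thread.
DOCS_BASE = "https://www.postgresql.org/docs/devel/"


def function_anchor_id(name: str, prefix: str = "function-") -> str:
    """Derive an anchor id such as 'function-justify-days' from a function name.

    Assumed convention: lower-case the name, drop any trailing "(...)",
    and turn every run of non-alphanumeric characters into one hyphen.
    """
    cleaned = re.sub(r"\(.*\)$", "", name.strip())
    slug = re.sub(r"[^a-z0-9]+", "-", cleaned.lower()).strip("-")
    return prefix + slug


def anchored_url(page: str, name: str, prefix: str = "function-") -> str:
    """Build a deep link; the URL quoted in the thread upper-cases the fragment."""
    return f"{DOCS_BASE}{page}#{function_anchor_id(name, prefix).upper()}"
```

With these assumptions, `function_anchor_id("date_part", prefix="func-")` gives the shortened form suggested above, so switching prefixes would be a one-argument change in any script that generates the ids.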

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-04-12 Thread Tom Lane
Alexander Lakhin  writes:
> 12.04.2020 20:33, Tom Lane wrote:
>> I educated myself a teensy bit about XSL, and unless I'm missing
>> something, this is really pretty darn trivial; the attached seems
>> to do the trick.

> I've come to almost the same solution simultaneously. I think this
> should work for us.

Thanks for looking at it!  I did some more polishing on the first
batch of tables and pushed it --- see what you think.

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-04-12 Thread Alexander Lakhin
Hello Tom,
12.04.2020 20:33, Tom Lane wrote:
> I wrote:
>> So if we can get  to both insert a right arrow and switch the
>> font to match 's choice, this would work more or less decently, and
>> it's probably cleaner than the bare-entity-reference approach I posted
>> before.  I don't have the XSL skills to get that to work though.
>> Anyone want to help out?
> I educated myself a teensy bit about XSL, and unless I'm missing
> something, this is really pretty darn trivial; the attached seems
> to do the trick.
I've come to almost the same solution simultaneously. I think this
should work for us.

Best regards,
Alexander





Re: Getting our tables to render better in PDF output

2020-04-12 Thread Tom Lane
I wrote:
> So if we can get  to both insert a right arrow and switch the
> font to match 's choice, this would work more or less decently, and
> it's probably cleaner than the bare-entity-reference approach I posted
> before.  I don't have the XSL skills to get that to work though.
> Anyone want to help out?

I educated myself a teensy bit about XSL, and unless I'm missing
something, this is really pretty darn trivial; the attached seems
to do the trick.

I experimented with the markup from  and decided that
I didn't like their choice of a smaller font size in this context;
it looks better to me to leave the arrow full-size.  The important
thing to learn from that precedent seems to be that we have to
specify the font correctly, as indeed is mentioned in the docbook
documentation.  So it seems to work well to just use

 

(The extra space seems to be necessary, else the arrow ends up
adjacent to the type name.)

So I'm pretty happy with this implementation and will push forward.

regards, tom lane

diff --git a/doc/src/sgml/stylesheet-common.xsl b/doc/src/sgml/stylesheet-common.xsl
index e148c90..5936d9a 100644
--- a/doc/src/sgml/stylesheet-common.xsl
+++ b/doc/src/sgml/stylesheet-common.xsl
@@ -49,6 +49,11 @@
   
 
 
+
+  
+  
+
+
 
   
 
diff --git a/doc/src/sgml/stylesheet-fo.xsl b/doc/src/sgml/stylesheet-fo.xsl
index ea75408..a3b6463 100644
--- a/doc/src/sgml/stylesheet-fo.xsl
+++ b/doc/src/sgml/stylesheet-fo.xsl
@@ -63,6 +63,12 @@
   
 
 
+
+
+   
+  
+
+
 
 
 


Re: Getting our tables to render better in PDF output

2020-04-12 Thread Jürgen Purtz


On 11.04.20 22:51, Tom Lane wrote:

>> Yet another possibility is to use the docbook tags:
>> func()
>> int.
>> Then we can define the desired formatting for such markup (similar to
>> ..).
>
> I looked into this.  It appears that <funcsynopsis> is fairly tightly tied
> to C function declaration syntax, plus it sounds like it might get
> deprecated in future docbook versions.


funcsynopsis, funcdef, function, ... remain valid in DocBook 5, see:
https://tdg.docbook.org/tdg/5.1/funcsynopsis.html . There is even an
option to distinguish between K&R and ANSI style during rendering:



Kind regards, Jürgen Purtz



Re: Getting our tables to render better in PDF output

2020-04-11 Thread Corey Huinker
On Sat, Apr 11, 2020 at 6:41 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > If it's ok to work on doc patches during the feature freeze, and if we're
> > already tweaking function documentation, would it be possible to add in
> > anchor ids to function definitions so that we could reference specific
> > functions (or rather the family of functions that share a name like this:
> >
> https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTION-DATE-PART
> > or similar. I tried it out just now, and the anchoring works, but there's
> > no obvious place to acquire the anchored link, so presumably we'd
> > anchor-ize the function name itself.
>
> Don't have a strong opinion about that, but it'd sure be a lot of new
> anchors.


True, but it would be a lot better than pointing a person to a page that
has 20+ functions defined on it.


> Is that going to be a problem for the docs toolchain?  If
> the anchors are attached to individual function names rather than
> sections or paragraphs, do they actually work well as link references?
> (I'm particularly wondering how an <xref> would render.)
>

So I can't speak to any scalability issues for adding a bunch of refs, but
I did try this out for justify_days() (diff attached) and here's what I
found:
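* <link linkend="function-justify-days">justify_days</link>
   This made a link, in the same font as any other link ref.
* <xref linkend="function-justify-days"/>
   This made a link that looks exactly like the previous one, with the text
"justify_days", so if we're fine with the font change, we could use that
* <link linkend="function-justify-days"><function>justify_days</function></link>
   This made the link we want in the function font.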

The docbook spec doesn't allow an xref inside a function tag, and no tags
at all can be inside an xref.
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index c2e42f31c0..0b33d32a1b 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2766,7 +2766,9 @@ SELECT EXTRACT(days from '80 hours'::interval);
  0
 
 
- Functions justify_days and
+ Functions  and
+ Functions justify_days and
+ Functions justify_days and
  justify_hours are available for adjusting days
  and hours that overflow their normal ranges.
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 12d75b476f..fd8ba334f8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -7079,7 +7079,7 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 

 
- 
+ 
   justify_days
  
  justify_days(interval)


Re: Getting our tables to render better in PDF output

2020-04-11 Thread Tom Lane
Corey Huinker  writes:
> If it's ok to work on doc patches during the feature freeze, and if we're
> already tweaking function documentation, would it be possible to add in
> anchor ids to function definitions so that we could reference specific
> functions (or rather the family of functions that share a name like this:
> https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTION-DATE-PART
> or similar. I tried it out just now, and the anchoring works, but there's
> no obvious place to acquire the anchored link, so presumably we'd
> anchor-ize the function name itself.

Don't have a strong opinion about that, but it'd sure be a lot of new
anchors.  Is that going to be a problem for the docs toolchain?  If
the anchors are attached to individual function names rather than
sections or paragraphs, do they actually work well as link references?
(I'm particularly wondering how an <xref> would render.)

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-04-11 Thread Corey Huinker
On Sat, Apr 11, 2020 at 4:51 PM Tom Lane  wrote:

> I set this idea aside during the final v13 commitfest, but I figure that
> it's fine to work on documentation improvements during feature freeze,
> so I'm going to try to push it forward over the next few weeks.


If it's ok to work on doc patches during the feature freeze, and if we're
already tweaking function documentation, would it be possible to add
anchor ids to function definitions so that we could reference specific
functions (or rather the family of functions that share a name), like this:
https://www.postgresql.org/docs/devel/functions-datetime.html#FUNCTION-DATE-PART
or similar? I tried it out just now, and the anchoring works, but there's
no obvious place to acquire the anchored link, so presumably we'd
anchor-ize the function name itself.


Re: Getting our tables to render better in PDF output

2020-04-11 Thread Tom Lane
I set this idea aside during the final v13 commitfest, but I figure that
it's fine to work on documentation improvements during feature freeze,
so I'm going to try to push it forward over the next few weeks.

Barring objections, I want to commit more or less what I posted at [1],
verify that it looks decent on the website, and then incrementally
convert the rest of our function/operator tables to the new style.
It's too big a job to get done in one commit, but a table or two at
a time seems like a reasonable approach.  After the table format
conversion is finished we can take a look at how much of a
bad-line-breaks issue we still have, and decide what to do about that.

First though, we need to nail down exactly what markup to use.

Alexander Lakhin  writes:
> Maybe it's better to use the same formatting as in the docbook xsl
> template (see docbook/stylesheet/docbook-xsl/xhtml-1_1/inline.xsl).
> There "$menuchoice.menu.separator" is enclosed in  font-size=".75em" font-family="{$symbol.font.family}">...
> and you can see the effect on page 536 (IPC parameters can be set in the
> System Administration Manager (SAM) under Kernel Configuration →
> Configurable Parameters.)

Yeah, I see that that uses a right-arrow and it looks quite decent in
both HTML and PDF renderings.  So we ought to borrow those markup details
rather than solving the problem from scratch.

> Yet another possibility is to use the docbook tags:
> func()
> int.
> Then we can define the desired formatting for such markup (similar to
> ..).

I looked into this.  It appears that <funcsynopsis> is fairly tightly tied
to C function declaration syntax, plus it sounds like it might get
deprecated in future docbook versions.  So I don't want to use that.
But we could use  which seems to be defined independently
of , and isn't being used in our docs at present.  I found
by experimentation that this doesn't work:

  date

(it complains that these two tag types can't be nested); but this does:

  date

So if we can get  to both insert a right arrow and switch the
font to match 's choice, this would work more or less decently, and
it's probably cleaner than the bare-entity-reference approach I posted
before.  I don't have the XSL skills to get that to work though.
Anyone want to help out?

regards, tom lane

[1] https://www.postgresql.org/message-id/23574.1581555393%40sss.pgh.pa.us




Re: Getting our tables to render better in PDF output

2020-02-16 Thread Alexander Lakhin
Hello Tom,
> 16.02.2020 23:07, Tom Lane wrote:
>
>
> I poked at this a little bit, and found that I could get a pretty
> decent-looking result if I hacked the .fo file to contain
> "→" rather than a bare
> right arrow.  (See attached screenshot, wherein the last rightarrow
> was fixed this way but the others weren't.)  However, I do not
> have much of a clue as to how such a fix might be injected into
> our stylesheets --- anybody have a suggestion?
Please look at the XSLT template for processing .fo before calling fop.
Maybe this can be done with just the existing stylesheet-fo.xsl, I'll
try to research this later.

Best regards,
Alexander

diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 0401a515df8..3be29dba9f1 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -169,10 +169,14 @@ XSLTPROC_FO_FLAGS += --stringparam img.src.path '$(srcdir)/'
 %-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) -o $@.tmp $(@D)/pg-customize-fo.xsl $@
+	mv $@.tmp $@
 
 %-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) -o $@.tmp $(@D)/pg-customize-fo.xsl $@
+	mv $@.tmp $@
 
 %.pdf: %.fo $(ALL_IMAGES)
 	$(FOP) -fo $< -pdf $@
diff --git a/doc/src/sgml/pg-customize-fo.xsl b/doc/src/sgml/pg-customize-fo.xsl
new file mode 100644
index 000..ba3138e2a3d
--- /dev/null
+++ b/doc/src/sgml/pg-customize-fo.xsl
@@ -0,0 +1,42 @@
+
+http://www.w3.org/1999/XSL/Format; version="1.0"
+xmlns:xsl="http://www.w3.org/1999/XSL/Transform;>
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+  
+
+  
+
+  
+
+  
+
+
+  
+
+  
+
+
+
+  
+
+  
+  
+
+  
+
+  
+
+


Re: Getting our tables to render better in PDF output

2020-02-14 Thread Tom Lane
Alvaro Herrera  writes:
> On 2020-Feb-13, Alexander Lakhin wrote:
>> Third (minor) issue is with translation - when I will see some break in
>> the English source, e.g. "split_part('abc~@~def~@~ghi', '~@~',
>> 2)", should I leave the break in the same place, or it's better to move
>> it because adjacent text has different length and the table columns have
>> different width?

> If the English version is warning-clean, then it should be possible to
> keep the zwsps in the same location in the translation, and then tweak
> the translation according to any new warnings that appear there.
> My guess is that the majority of zwsps are going to want to stay in the
> same place.

So far as I've seen, the majority of places where we'll still need to
insert break opportunities are in examples and example results, which
don't seem like they'd be subject to translation.  I'm really not eager to
turn loose an automatic-zwsp-inserter for a problem that might be mostly
hypothetical once we have a more forgiving table layout in place.

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-02-14 Thread Alvaro Herrera
On 2020-Feb-13, Alexander Lakhin wrote:

> Yes, I was starting with manual  insertions into the translation,
> but later I reduced such insertions just to several dozens. (For
> example, we still have "3.14159265358979323846" in the translation.)
> The main issue of the manual approach was that I needed to recheck that
> zwsp placement on updates, and I can't see where it's desired until I
> generate pdf. Fortunately, fop prints warning like that:
> [WARN] FOUserAgent - The contents of fo:block line 2 exceed the
> available area in the inline-progression direction by 22725 millipoints.
> (See position 127769:983)
> It's not very user-friendly, but still useful when we have a pair or two
> of them.

It seems to me that a productive way forward would be to fix the layout
to make these warnings disappear. Then it will be relatively easy to find
what to fix if new ones appear.

Now I suppose you're complaining about the "position 127769:983" part of
the error message which tells you with zero clarity where the problem
is.  Maybe what we need is to figure out what the numbers mean, and how
to use them; for example if they are byte offsets into the file, then it
should be possible to tell your editor to go to that byte in the
complete XML file.
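
As a starting point for that kind of investigation, here is a small Python sketch that pulls the numbers out of a warning of the form quoted above. It is only an assumption-laden helper: the names `Overflow`, `parse_overflow` and `worst_overflows` are invented for this example, the message layout is taken solely from the one warning shown in this thread, and the two "position" numbers are returned as-is without claiming what they mean.

```python
import re
from typing import NamedTuple, Optional


class Overflow(NamedTuple):
    millipoints: int  # how far the content overran the available width
    pos_a: int        # first number of "position A:B" (meaning not established here)
    pos_b: int        # second number of "position A:B"

    @property
    def millimetres(self) -> float:
        # 1000 millipoints = 1 point; 72 points = 1 inch = 25.4 mm
        return self.millipoints / 1000 / 72 * 25.4


_OVERFLOW_RE = re.compile(
    r"exceed the available area in the inline-progression direction "
    r"by (\d+) millipoints\. \(See position (\d+):(\d+)\)"
)


def parse_overflow(message: str) -> Optional[Overflow]:
    """Return the overflow details from one warning, or None if it is not one."""
    flat = " ".join(message.split())  # tolerate wrapped or indented log text
    match = _OVERFLOW_RE.search(flat)
    if match is None:
        return None
    return Overflow(*(int(group) for group in match.groups()))


def worst_overflows(messages, limit=10):
    """Sort the parsed warnings so the largest overruns come first."""
    found = [o for o in map(parse_overflow, messages) if o is not None]
    return sorted(found, key=lambda o: o.millipoints, reverse=True)[:limit]
```

For the warning quoted above, the overrun of 22725 millipoints works out to roughly 8 mm, which gives a feel for how much narrower a cell's content would need to be.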

> Second issue is that the placement can depend on the page size and in
> fact most of that zwsps are not needed for html or other formats
> (moreover, some formats can require different placements (if we're not
> just implementing some common rules)).

I suppose A4 page size is going to show slightly different warnings than
Letter page size in the PDF output.  Perhaps we can say that we only
care about warnings in one of them, for these purposes.

Having to touch 500+ places does not sound very appetizing, for sure.

> Third (minor) issue is with translation - when I will see some break in
> the English source, e.g. "split_part('abc~@~def~@~ghi', '~@~',
> 2)", should I leave the break in the same place, or it's better to move
> it because adjacent text has different length and the table columns have
> different width?

If the English version is warning-clean, then it should be possible to
keep the zwsps in the same location in the translation, and then tweak
the translation according to any new warnings that appear there.
My guess is that the majority of zwsps are going to want to stay in the
same place.

> Maybe some of the rules can be implemented explicitly in the DocBook
> source, just to reduce tons of zwsp in the generated output, or the
> "fo:table-cell/fo:block//text()" condition can be improved to filter
> some (text-only?) tables out, but I think that the idea of our specific
> line breaking rules could work.

Maybe we can mark-up specific table cells/columns as being subject to
the special line breaking rules.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Getting our tables to render better in PDF output

2020-02-12 Thread Alexander Lakhin
12.02.2020 23:58, Tom Lane wrote:
> Alexander Lakhin  writes:
>> Please look at a less invasive approach that we use at Postgres Pro for
>> some time (mainly for improving the translated documentation, but it
>> works for the original one too). The idea is to add zero-width spaces
>> after/before some chars ('(', ',', '[', etc) to let fop split lines
>> where desired. It has one disadvantage - it's not search-friendly
>> (though maybe that is application-dependent).
>> But if it's feasible, I think this approach can at least complement a
>> manual tables reformatting. Decreasing a font size in the tables seems
>> appropriate to me too.
> Hmm, interesting proposal.  I experimented and verified that injecting
> zero-width space () does allow line breaking to occur in both
> HTML and PDF output, so this could be a route to improving the situation
> for overlength example texts.  I do not think I like the idea of
> automatically injecting tons of them, though.  As you say, it might
> hinder searching; and it would allow some silly breaks; and there are
> cases where it still wouldn't find a break, such as the examples for
> sha256() et al.  I'd be happier about manually inserting breaks just
> in the places we really need them.  To keep the source readable, I'd
> want to write something like "&zwsp;" not a numeric entity code,
> but it looks like we can define custom entities if we want.
Yes, I started with manual  insertions into the translation,
but later I reduced such insertions to just several dozen. (For
example, we still have "3.14159265358979323846" in the translation.)
The main issue with the manual approach was that I needed to recheck the
zwsp placement on updates, and I can't see where a break is needed until I
generate the PDF. Fortunately, fop prints warnings like this:
[WARN] FOUserAgent - The contents of fo:block line 2 exceed the
available area in the inline-progression direction by 22725 millipoints.
(See position 127769:983)
It's not very user-friendly, but still useful when we have only a pair or two
of them. (For now, I see 559 such warnings in REL_12_STABLE.)
The second issue is that the placement can depend on the page size, and in
fact most of those zwsps are not needed for HTML or other formats
(moreover, some formats can require different placements, if we're not
just implementing some common rules).
The third (minor) issue is with translation: when I see a break in
the English source, e.g. "split_part('abc~@~def~@~ghi', '~@~',
2)", should I leave the break in the same place, or is it better to move
it because the adjacent text has a different length and the table columns have
different widths?

For me this approach expresses a belief that the line breaking rules
should be slightly different in our context. For example, having line
break after an opening bracket is feasible and common in function calls
and declarations. Maybe the rules in the proposed xslt could be
improved/restricted, but I think that if fop would allow us to enable an
imaginary 'programming language line breaking rules' mode, we would use
it for our tables (some or all).
Maybe some of the rules can be implemented explicitly in the DocBook
source, just to reduce tons of zwsp in the generated output, or the
"fo:table-cell/fo:block//text()" condition can be improved to filter
some (text-only?) tables out, but I think that the idea of our specific
line breaking rules could work.
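
The rule itself is easy to show outside XSLT. The Python sketch below is not the stylesheet discussed in this thread; it only illustrates the idea of adding a zero-width space (U+200B) after a few punctuation characters, with the character set "(", "," and "[" taken from the description above and the function names invented for the example.

```python
ZWSP = "\u200b"      # zero-width space: invisible, but a legal line-break point
BREAK_AFTER = "(,["  # characters after which a break is considered acceptable


def add_break_opportunities(text: str, break_after: str = BREAK_AFTER) -> str:
    """Insert a zero-width space after each listed character, except at the end."""
    pieces = []
    last = len(text) - 1
    for index, char in enumerate(text):
        pieces.append(char)
        if char in break_after and index < last:
            pieces.append(ZWSP)
    return "".join(pieces)


def strip_break_opportunities(text: str) -> str:
    """Undo the insertion, e.g. before comparing or searching text."""
    return text.replace(ZWSP, "")
```

Applied to "split_part('abc~@~def~@~ghi', '~@~', 2)" this adds three break points. A long unbroken literal such as a hash value contains none of the listed characters, so it gains no break point at all, which matches the limitation noted for the sha256() examples.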

Best regards,
Alexander




Re: Getting our tables to render better in PDF output

2020-02-12 Thread Tom Lane
Alvaro Herrera  writes:
> On 2020-Feb-12, Tom Lane wrote:
>> I also attached a screenshot of a segment of Table 9-31, to show
>> what that layout proposal looks like.  It's a little busier, but
>> it does have the advantage that it's clearer how to apply that
>> format to operator tables.  The "returns " notation isn't used
>> anywhere in SQL for operators, so I am not in love with the idea of
>> writing the operator tables that way.

> Yeah, that's a little less obvious.  I just noticed that the operators
> tables show the operator names but not the input datatypes except in the
> examples.  Perhaps we could use a layout with a cell labelled
> "signature" (namest=col2 nameend=col3) instead of input types + return
> types and separate them using  which would look like this:
>date + integer → date

Oh, that's a thought.  We could do the same for functions:

function name   type1, type2, type3 → rettype
                description ...
                example                   example result

which'd relieve the column-width pressure for functions with several
arguments.  On the other hand, that would look a little funny
for functions with no arguments ... not but what they're going to
look funny no matter what.  I used "none" in my conversion of
table 9.31, but wasn't satisfied with that, because it relies
completely on font choice to be distinguishable from a data type
named "none".  With a separate argument-types cell it'd likely be
better to just leave the cell empty, but do we want to write
just "→ rettype" in a signature cell?

The other thing I was struggling with was how to distinguish
normal zero-argument functions (written with parens) from those
SQL abominations that are function calls with no parens.  I think
we need to show that somehow, so that it's clear that the examples
are correct and not typos.  It doesn't have to be *totally* obvious,
perhaps, if we have an example to back it up ... but the example
can't be the only thing.

Maybe don't take out the parens?  So it'd work like

Function           Signature

age                (timestamp) → interval

now                () → timestamp with time zone

current_timestamp  → timestamp with time zone

Also, I think we're both imagining that we'd use the operator name
in operator signatures:

Operator   Signature

+  integer + integer → integer

+  + integer → integer

so being consistent with that might suggest including the function name
in function signatures:

Function            Signature

age                 age(timestamp) → interval

now                 now() → timestamp with time zone

current_timestamp   current_timestamp → timestamp with time zone

I'm a bit suspicious of how much horizontal space that would eat, but
if we're able to get rid of the separate cell for result type, it
might work out OK.
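
As a rough illustration of the signature notation discussed above, here is a
minimal Python sketch that builds the strings for a signature cell. The helper
names (format_signature, format_operator_signature) and the parens flag are
hypothetical and exist only for this example; nothing like them is part of the
documentation toolchain.

```python
ARROW = "\u2192"  # the "→" character used in the signature examples


def format_signature(name, arg_types, result_type, parens=True):
    """Build a function signature such as 'age(timestamp) → interval'.

    parens=False models the SQL special forms that are called without
    parentheses, such as current_timestamp.
    """
    if parens:
        call = f"{name}({', '.join(arg_types)})"
    else:
        if arg_types:
            raise ValueError("a paren-less function cannot take arguments")
        call = name
    return f"{call} {ARROW} {result_type}"


def format_operator_signature(op, left_type, right_type, result_type):
    """Build an operator signature; left_type=None means a prefix operator."""
    parts = [p for p in (left_type, op, right_type) if p is not None]
    return f"{' '.join(parts)} {ARROW} {result_type}"


if __name__ == "__main__":
    print(format_signature("age", ["timestamp"], "interval"))
    print(format_signature("now", [], "timestamp with time zone"))
    print(format_signature("current_timestamp", [], "timestamp with time zone",
                           parens=False))
    print(format_operator_signature("+", "integer", "integer", "integer"))
    print(format_operator_signature("+", None, "integer", "integer"))
```

Keeping the parentheses in the generated string is what lets a reader tell
now() apart from current_timestamp without relying on font choice.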

regards, tom lane




Re: Getting our tables to render better in PDF output

2020-02-12 Thread Alvaro Herrera
On 2020-Feb-12, Tom Lane wrote:

> For amusement's sake, attached is a screenshot of what Table 9-33
> looks like in A4 format, with my one-row-per-example patch of
> yesterday plus a few manually-added zero-width spaces to break up
> the examples.  This is the first PDF rendering of that table that
> I've seen that I actually like.

I like this.  The trick of making the first cell take up two or three
rows makes this much clearer and more sensible than what I had obtained.

> I also attached a screenshot of a segment of Table 9-31, to show
> what that layout proposal looks like.  It's a little busier, but
> it does have the advantage that it's clearer how to apply that
> format to operator tables.  The "returns " notation isn't used
> anywhere in SQL for operators, so I am not in love with the idea of
> writing the operator tables that way.

Yeah, that's a little less obvious.  I just noticed that the operators
tables show the operator names but not the input datatypes except in the
examples.  Perhaps we could use a layout with a cell labelled
"signature" (namest=col2 nameend=col3) instead of input types + return
types and separate them using → which would look like this:
   date + integer → date

> Also worth noting is that in most function tables, and certainly
> in the operator tables, we could make the first column narrower.
> The same table with the first column half as wide as the others
> is depicted in the last screenshot.  (For this particular table,
> doing that would require breaking some of the longer function
> names such as transaction_timestamp.  Not sure whether that's
> a net win, but we do have the option.)

I like making that column narrower.

> One issue that I've found is that the toolchain has no idea that
> the table rows are in groups, so it's happy to split a table
> across pages with a function's description and/or examples on
> a new page.  No idea if there's any way around that.  Fortunately
> it's not an issue in HTML, so maybe we don't have to fix it.

My vote goes to postponing a solution to this problem :-)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Getting our tables to render better in PDF output

2020-02-11 Thread Alexander Lakhin
Hello Tom,
12.02.2020 00:51, Tom Lane wrote:
> The crummy formatting of our tables of functions and operators has
> been an issue for a long time.  To my mind, there are several things
> that need to be addressed:
>
> * The layout is completely unfriendly to function descriptions that
> run to more than a few words.
>
> * It's not very practical to have more than one example per function
> (or at least, we seldom do so).
>
> * The results look completely awful in PDF format, because of the
> narrow effectively-available space, plus the fact that the toolchain
> will prefer to overprint following columns instead of breaking text
> where there's no whitespace.
Please look at a less invasive approach that we have used at Postgres Pro
for some time (mainly for improving the translated documentation, but it
works for the original one too). The idea is to add zero-width spaces
after/before some chars ('(', ',', '[', etc.) to let fop split lines
where desired. It has one disadvantage: it's not search-friendly
(though maybe that is application-dependent).
But if it's feasible, I think this approach can at least complement
manual reformatting of the tables. Decreasing the font size in the tables
seems appropriate to me too.
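
To make the zero-width-space idea concrete, here is a minimal Python sketch.
It is only an illustration: the attached patch does the equivalent work in an
XSLT pass over the .fo file, and the function name add_break_hints and its
default character sets are assumptions made for this example.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, but a legal line-break point


def add_break_hints(text, after="(,[", before=""):
    """Insert a zero-width space after (or before) the given characters."""
    out = []
    for ch in text:
        if ch in before:
            out.append(ZWSP)
        out.append(ch)
        if ch in after:
            out.append(ZWSP)
    return "".join(out)


if __name__ == "__main__":
    hinted = add_break_hints("make_interval(years,months)")
    # The visible text is unchanged once the hints are stripped again.
    print(hinted.replace(ZWSP, "") == "make_interval(years,months)")
    # The search-unfriendliness mentioned above: a plain substring search
    # that crosses an inserted break point no longer matches.
    print("interval(years" in hinted)
```

The second check shows the disadvantage directly: a literal search for text
that crosses an inserted break point fails unless the viewer ignores
zero-width spaces.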

Best regards,
Alexander
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 0401a515df8..3be29dba9f1 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -169,10 +169,14 @@ XSLTPROC_FO_FLAGS += --stringparam img.src.path '$(srcdir)/'
 %-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) -o $@.tmp $(@D)/pg-customize-fo.xsl $@
+	mv $@.tmp $@
 
 %-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) -o $@.tmp $(@D)/pg-customize-fo.xsl $@
+	mv $@.tmp $@
 
 %.pdf: %.fo $(ALL_IMAGES)
 	$(FOP) -fo $< -pdf $@
diff --git a/doc/src/sgml/pg-customize-fo.xsl b/doc/src/sgml/pg-customize-fo.xsl
new file mode 100644
index 000..48978880126
--- /dev/null
+++ b/doc/src/sgml/pg-customize-fo.xsl
@@ -0,0 +1,118 @@
+[The body of pg-customize-fo.xsl was stripped by the list archive.  The
+ surviving fragments reference the namespaces
+ http://www.w3.org/1999/XSL/Format and http://www.w3.org/1999/XSL/Transform,
+ with version="1.0".]


Getting our tables to render better in PDF output

2020-02-11 Thread Tom Lane
The crummy formatting of our tables of functions and operators has
been an issue for a long time.  To my mind, there are several things
that need to be addressed:

* The layout is completely unfriendly to function descriptions that
run to more than a few words.

* It's not very practical to have more than one example per function
(or at least, we seldom do so).

* The results look completely awful in PDF format, because of the
narrow effectively-available space, plus the fact that the toolchain
will prefer to overprint following columns instead of breaking text
where there's no whitespace.

In [1], Alvaro suggested that we might be able to improve matters by
taking advantage of DocBook's features for column and row spanning.
I did some concrete experimentation in that line, and attached are
two alternative patches that show a couple of things we might do.
Both patches change tables 9.31 (Date/Time Functions) and 9.33
(Enum Support Functions), which I chose somewhat at random, but of
course there would be a lot more to be done if we choose to go this way.

The first patch uses only one row for each function example, while
the second patch uses two rows (i.e., example and result in separate
table rows).  Otherwise they're the same.

I initially did the enum-support table, and what I tried there included
getting rid of the separate table column for function result type by
writing the functions in the form "func(argtypes) returns resulttype".
(Note that this table failed to specify the result types at all before,
which doesn't seem great.)  The layout idea is

function name, args, result       description
                                  example         example result

where we can repeat the "example / example result" row if we want more
examples per function.  Alternatively, in the second patch, it's

function name, args, result       description
                                  example
                                  example result

To my eyes, the first alternative is preferable in HTML, unless maybe you
want to read the manual in a *very* narrow browser window.  But some of
the examples/results still overrun the available space when looking
at it in PDF A4 format.  The second patch fixes that problem, but seems
not very pretty in a normal-width browser window.

When I tried to apply the same idea to the date/time functions table,
it didn't really work well at all, mainly because of a few beasts like
make_interval() --- that caused the left column to be so wide that the
right-hand columns were horrid.  (At least with the toolchain version
I'm using, it seems like the colwidth specifications are respected
rigidly in PDF output but just plain ignored in HTML output.  What
seems to happen in HTML is that earlier columns get their preferred
width and later ones get squeezed.)

So the layout idea that the patches show for that table is

function name      arg types       result type
                   description
                   example         example result

or

function name      arg types       result type
                   description
                   example
                   example result

(Even with that, I had to savage make_interval's arg-types list a bit
to keep that column from eating too much space...)
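
For readers who want to see how the first of these layouts maps onto table
markup, here is a small Python sketch that emits the rows for one function.
It assumes the table declares three columns named col1 to col3 in its
colspec elements; the helper function_rows and the placeholder cell text are
hypothetical, while morerows, namest and nameend are the standard DocBook
(CALS) spanning attributes.

```python
import xml.etree.ElementTree as ET


def function_rows(name, arg_types, result_type, description, examples):
    """Return the <row> elements for one function.

    examples is a list of (example, example_result) pairs; each pair
    becomes one extra row under the description.
    """
    rows = []

    # Row 1: the name cell spans this row plus every row that follows it,
    # so the later rows simply omit a first-column entry.
    head = ET.Element("row")
    name_cell = ET.SubElement(head, "entry", morerows=str(1 + len(examples)))
    name_cell.text = name
    ET.SubElement(head, "entry").text = arg_types
    ET.SubElement(head, "entry").text = result_type
    rows.append(head)

    # Row 2: the description spans the second and third columns.
    desc = ET.Element("row")
    desc_cell = ET.SubElement(desc, "entry", namest="col2", nameend="col3")
    desc_cell.text = description
    rows.append(desc)

    # Rows 3..n: one row per example, with the result beside it.
    for example, result in examples:
        row = ET.Element("row")
        ET.SubElement(row, "entry").text = example
        ET.SubElement(row, "entry").text = result
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in function_rows("function name", "arg types", "result type",
                             "description", [("example", "example result")]):
        print(ET.tostring(row, encoding="unicode"))
```

The two-row variant would differ only in the loop: each example and its
result would go into separate rows spanning col2 to col3, and morerows would
grow accordingly.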

I'm not especially wedded to any of these ideas, but I hope to provoke
some discussion about what we might do in this area.  DocBook tables
aren't the greatest layout tool in the world, but they do have abilities
we're not exploiting.

Even with these changes, the amount of space available for examples
and results in PDF format is pretty tiny.  With examples and results
in the same row, it seems that you can only have a couple of dozen
consecutive non-whitespace characters without running into overwrite
issues, whereas in HTML format the trouble threshold is a good deal
higher.  I wonder if we could improve matters by switching to some
narrower font for  text in PDF?
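
One way to find the cells at risk is to measure the longest run of
non-whitespace characters in each example.  The sketch below does that in
Python; the limit of 24 is only a stand-in for "a couple of dozen" and is an
assumption, as are the function names.

```python
import re


def longest_unbroken_run(text):
    """Length of the longest run of consecutive non-whitespace characters."""
    runs = re.findall(r"\S+", text)
    return max((len(run) for run in runs), default=0)


def flag_overlong(examples, limit=24):
    """Return the examples whose longest unbroken run exceeds the limit."""
    return [e for e in examples if longest_unbroken_run(e) > limit]


if __name__ == "__main__":
    samples = [
        "age(timestamp, timestamp)",
        "x" * 30,  # an artificial 30-character run with no break point
    ]
    for sample in flag_overlong(samples):
        print("may overprint in PDF:", sample)
```

A check like this only points at candidates; whether a given cell actually
overprints still depends on the column width and font in the PDF build.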

regards, tom lane

[1] https://www.postgresql.org/message-id/2020011618.GA25792%40alvherre.pgsql

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ceda48e..385fdc0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6798,471 +6798,618 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 
 
  Date/Time Functions
- 
+ 
+  
+  
+  
+  
+  
+  
+  
+  
+  
   

-Function
-Return Type
-Description
-Example
-Result
+Function
+Argument Types
+Result Type
+   
+   
+Description
+   
+   
+Example
+Example Result

   
 
   

-
+
  
   age
  
- age(timestamp, timestamp)
+ age
 
-interval
-Subtract arguments, producing a symbolic