Re: documentation structure

2024-04-28 Thread Corey Huinker
>
> I've split it into 7 patches.
> Each patch splits one <sect1> into a separate new file.
>

Seems like a good start. Looking at the diffs of these, I wonder if we
would be better off with a func/ directory, each function gets its own file
in that dir, and either these files above include the individual files, or
the original func.sgml just becomes the organizer of all the functions.
That would allow us to do future reorganizations with minimal churn, make
validation of this patch a bit more straightforward, and make it easier for
future editors to find the function they need to edit.


Re: documentation structure

2024-04-19 Thread jian he
On Wed, Apr 17, 2024 at 7:07 PM Dagfinn Ilmari Mannsåker
 wrote:
>
>
> > It'd also be quite useful if clients could render more of the documentation
> > for functions. People are used to language servers providing full
> > documentation for functions etc...
>
> A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
> with \h for commands?) would certainly be nice.
>

I think `\hf` would be useful.
Otherwise people first need to google for the function's HTML page,
then use Ctrl+F to locate the specific function entry.

For \hf we may need to offer a doc URL link,
but currently many functions cannot be linked to directly in the docs.
Also, one section can have many sections.
I guess just linking to a nearby position in the HTML page
should be fine.


We could also add a URL anchor for each function, using an
underscore-decorated anchor name like MySQL does
(https://dev.mysql.com/doc/refman/8.3/en/string-functions.html#function_concat).
I am not sure it is an elegant solution.




Re: documentation structure

2024-04-18 Thread Corey Huinker
>
> Yeah, we can't expect everyone wanting to call a built-in function to
> know how they would define an equivalent one themselves. In that case I
> propose marking it up like this:
>
> format (
> formatstr text
> , formatarg "any"
> , ...  )
> text
>

Looks good, but I guess I have to ask: is there a parameter-list tag out
there instead of (, and should we be using that?



> The requisite nesting when there are multiple optional parameters makes
> it annoying to wrap and indent it "properly" per XML convention, but how
> about something like this, with each parameter on a line of its own, and
> all the closing </optional> tags on one line?
>
> regexp_substr (
> string text,
> pattern text
> , start integer
> , N integer
> , flags text
> , subexpr integer
> )
> text
>

Yes, that has an easy count-the-vertical, count-the-horizontal,
do-they-match flow to it.


> A lot of functions mostly follow this style, except they tend to put the
> first parameter on the same line as the function name, even when that
> makes the line overly long. I propose going the other way, with each
> parameter on a line of its own, even if the first one would fit after
> the function name, unless the whole parameter list fits after the
> function name.
>

+1


>
> Also, when there's only one optional argument, or they're independently
> optional, not nested, the <optional> tag should go on the same line as
> the parameter.
>
> substring (
> bits bit
>  FROM start
> integer 
>  FOR count
> integer  )
> bit
>

+1


Re: documentation structure

2024-04-18 Thread Dagfinn Ilmari Mannsåker
Corey Huinker  writes:

>>
>> I haven't dealt with variadic yet, since the two styles are visually
>> different, not just markup (<optional>...</optional> renders as [...]).
>>
>> The two styles for variadic are what I call caller-style:
>>
>>concat ( val1 "any" [, val2 "any" [, ...] ] )
>>format(formatstr text [, formatarg "any" [, ...] ])
>>
>
> While this style is obviously clumsier for us to compose, it does avoid
> relying on the user understanding what the word variadic means. Searching
> through online documentation of the python *args parameter, the word
> variadic never comes up, the closest they get is "variable length
> argument". I realize that python is not SQL, but I think it's a good point
> of reference for what concepts the average reader is likely to know.

Yeah, we can't expect everyone wanting to call a built-in function to
know how they would define an equivalent one themselves. In that case I
propose marking it up like this:

format (
formatstr text
, formatarg "any"
, ...  )
text


> Looking at the patch, I think it is good, though I'd consider doing some
> indentation for the nested <optional>s to allow the author to do more
> visual tag-matching. The ']'s were sufficiently visually distinct that we
> didn't really need or want nesting, but <optional> is just another tag to
> my eyes in a sea of tags.

The requisite nesting when there are multiple optional parameters makes
it annoying to wrap and indent it "properly" per XML convention, but how
about something like this, with each parameter on a line of its own, and
all the closing </optional> tags on one line?

regexp_substr (
string text,
pattern text
, start integer
, N integer
, flags text
, subexpr integer
)
text

A lot of functions mostly follow this style, except they tend to put the
first parameter on the same line as the function name, even when that
makes the line overly long. I propose going the other way, with each
parameter on a line of its own, even if the first one would fit after
the function name, unless the whole parameter list fits after the
function name.

Also, when there's only one optional argument, or they're independently
optional, not nested, the <optional> tag should go on the same line as
the parameter.

substring (
bits bit
 FROM start 
integer 
 FOR count 
integer  )
bit


I'm not quite sure what to do with things like json_object which have even
more complex nesting of optional parameters, but I do think the current
200+ character lines are too long.

- ilmari




Re: documentation structure

2024-04-18 Thread Corey Huinker
>
> I haven't dealt with variadic yet, since the two styles are visually
> different, not just markup (<optional>...</optional> renders as [...]).
>
> The two styles for variadic are what I call caller-style:
>
>concat ( val1 "any" [, val2 "any" [, ...] ] )
>format(formatstr text [, formatarg "any" [, ...] ])
>

While this style is obviously clumsier for us to compose, it does avoid
relying on the user understanding what the word variadic means. Searching
through online documentation of the python *args parameter, the word
variadic never comes up, the closest they get is "variable length
argument". I realize that python is not SQL, but I think it's a good point
of reference for what concepts the average reader is likely to know.

Looking at the patch, I think it is good, though I'd consider doing some
indentation for the nested <optional>s to allow the author to do more
visual tag-matching. The ']'s were sufficiently visually distinct that we
didn't really need or want nesting, but <optional> is just another tag to
my eyes in a sea of tags.


Re: documentation structure

2024-04-18 Thread jian he
On Thu, Apr 18, 2024 at 2:37 AM Dagfinn Ilmari Mannsåker
 wrote:
>
> Andres Freund  writes:
>
> > Hi,
> >
> > On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
> >> Andres Freund  writes:
> >> > I think the manual work for writing signatures in sgml is not
> >> > insignificant, nor is the volume of sgml for them. Manually
> >> > maintaining the signatures makes it impractical to significantly
> >> > improve the presentation - which I don't think is all that great today.
> >>
> >> And it's very inconsistent.  For example, some functions use <optional>
> >> tags for optional parameters, others use square brackets, and some use
> >> VARIADIC to indicate variadic parameters, others use
> >> ellipses (sometimes in <optional> tags or brackets).
> >
> > That seems almost inevitably the outcome of many people having to manually
> > infer the recommended semantics, for writing something boring but
> > nontrivial, from a 30k line file.
>
> As Corey mentioned elsethread, having a markup style guide (maybe a
> comment at the top of the file?) would be nice.
>
> >> > And the lack of argument names in the pg_proc entries is occasionally
> >> > fairly annoying, because a \df+ doesn't provide enough information to
> >> > use functions.
> >>
> >> I was also annoyed by this the other day (specifically wrt. the boolean
> >> arguments to pg_ls_dir),
> >
> > My bane is regexp_match et al, I have given up on remembering the argument
> > order.
>
> There's a thread elsewhere about those specifically, but I can't be
> bothered to find the link right now.
>
> >> and started whipping up a Perl script to parse func.sgml and generate
> >> missing proargnames values for pg_proc.dat, which is how I discovered the
> >> above.
> >
> > Nice.
> >
> >> The script currently has a pile of hacky regexes to cope with that,
> >> so I'd be happy to submit a doc patch to turn it into actual markup to get
> >> rid of that, if people think that's a worthwhile use of time and won't
> >> clash with any other plans for the documentation.
> >
> > I guess it's a bit hard to say without knowing how voluminous the changes
> > would be. If we end up rewriting the whole file the tradeoff is less clear
> > than if it's a dozen inconsistent entries.
>
> It turned out to not be that many that used [] for optional parameters,
> see the attached patch.
>

hi.
I manually checked the html output. It looks good to me.




Re: documentation structure

2024-04-17 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Andres Freund  writes:
>> > I think the manual work for writing signatures in sgml is not
>> > insignificant, nor is the volume of sgml for them. Manually
>> > maintaining the signatures makes it impractical to significantly
>> > improve the presentation - which I don't think is all that great today.
>>
>> And it's very inconsistent.  For example, some functions use <optional>
>> tags for optional parameters, others use square brackets, and some use
>> VARIADIC to indicate variadic parameters, others use
>> ellipses (sometimes in <optional> tags or brackets).
>
> That seems almost inevitably the outcome of many people having to manually
> infer the recommended semantics, for writing something boring but nontrivial,
> from a 30k line file.

As Corey mentioned elsethread, having a markup style guide (maybe a
comment at the top of the file?) would be nice.

>> > And the lack of argument names in the pg_proc entries is occasionally
>> > fairly annoying, because a \df+ doesn't provide enough information to
>> > use functions.
>> 
>> I was also annoyed by this the other day (specifically wrt. the boolean
>> arguments to pg_ls_dir),
>
> My bane is regexp_match et al, I have given up on remembering the argument
> order.

There's a thread elsewhere about those specifically, but I can't be
bothered to find the link right now.

>> and started whipping up a Perl script to parse func.sgml and generate
>> missing proargnames values for pg_proc.dat, which is how I discovered the
>> above.
>
> Nice.
>
>> The script currently has a pile of hacky regexes to cope with that,
>> so I'd be happy to submit a doc patch to turn it into actual markup to get
>> rid of that, if people think that's a worthwhile use of time and won't clash
>> with any other plans for the documentation.
>
> I guess it's a bit hard to say without knowing how voluminous the changes
> would be. If we end up rewriting the whole file the tradeoff is less clear
> than if it's a dozen inconsistent entries.

It turned out to not be that many that used [] for optional parameters,
see the attached patch. 

I haven't dealt with variadic yet, since the two styles are visually
different, not just markup (<optional>...</optional> renders as [...]).

The two styles for variadic are what I call caller-style:

   concat ( val1 "any" [, val2 "any" [, ...] ] )
   format(formatstr text [, formatarg "any" [, ...] ])

which shows more clearly how you'd call it, versus definition-style:

   num_nonnulls ( VARIADIC "any" )
   jsonb_extract_path ( from_json jsonb, VARIADIC path_elems text[] )

which matches the CREATE FUNCTION statement.  I don't have a strong
opinion on which we should use, but we should be consistent.
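
To make the consistency point concrete, here is a minimal Python sketch
(not part of the attached patch) that classifies a plain-text signature as
caller-style or definition-style. The helper name variadic_style and its
detection rules are assumptions made for illustration; real func.sgml
entries carry markup that would have to be stripped first.

```python
def variadic_style(signature: str) -> str:
    """Classify how a plain-text signature marks its variadic parameter.

    Hypothetical helper: it only looks at the text form of a signature
    as quoted in this thread, not at SGML markup.
    """
    if "VARIADIC" in signature:
        return "definition-style"
    if "..." in signature:
        return "caller-style"
    return "not variadic"


# Signatures copied from the examples in this message and the patch.
for sig in (
    'concat ( val1 "any" [, val2 "any" [, ...] ] )',
    'num_nonnulls ( VARIADIC "any" )',
    "regexp_like ( string text, pattern text [, flags text ] )",
):
    print(f"{variadic_style(sig):17} {sig}")
```

A check like this could flag a file that mixes both styles, whichever one
ends up being chosen.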

> Greetings,
>
> Andres Freund

- ilmari

>From f71e0669eb25b205bd5065f15657ba6d749261f3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 17 Apr 2024 16:00:52 +0100
Subject: [PATCH] func.sgml: Consistently use <optional> to indicate optional
 parameters

Some functions were using square brackets instead.
---
 doc/src/sgml/func.sgml | 54 +-
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8dfb42ad4d..afaaf61d69 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3036,7 +3036,7 @@
  concat
 
 concat ( val1 "any"
- [, val2 "any" [, ...] ] )
+ , val2 "any" [, ...]  )
 text


@@ -3056,7 +3056,7 @@
 
 concat_ws ( sep text,
 val1 "any"
-[, val2 "any" [, ...] ] )
+, val2 "any" [, ...]  )
 text


@@ -3076,7 +3076,7 @@
  format
 
 format ( formatstr text
-[, formatarg "any" [, ...] ] )
+, formatarg "any" [, ...]  )
 text


@@ -3170,7 +3170,7 @@
  parse_ident
 
 parse_ident ( qualified_identifier text
-[, strict_mode boolean DEFAULT true ] )
+, strict_mode boolean DEFAULT true  )
 text[]


@@ -3309,8 +3309,8 @@
  regexp_count
 
 regexp_count ( string text, pattern text
- [, start integer
- [, flags text ] ] )
+ , start integer
+ , flags text   )
 integer


@@ -3331,11 +3331,11 @@
  regexp_instr
 
 regexp_instr ( string text, pattern text
- [, start integer
- [, N integer
- [, endoption integer
- [, flags text
- [, subexpr integer ] ] ] ] ] )
+ , start integer
+ , N integer
+ , endoption integer
+ , flags text
+ , subexpr integer  )
 integer


@@ -3360,7 +3360,7 @@
  regexp_like
 
 regexp_like ( string text, pattern text
- [, flags text ] )

Re: documentation structure

2024-04-17 Thread Andres Freund
Hi,

On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
> Andres Freund  writes:
> > I think the manual work for writing signatures in sgml is not insignificant,
> > nor is the volume of sgml for them. Manually maintaining the signatures
> > makes it impractical to significantly improve the presentation - which I
> > don't think is all that great today.
>
> And it's very inconsistent.  For example, some functions use <optional>
> tags for optional parameters, others use square brackets, and some use
> VARIADIC to indicate variadic parameters, others use
> ellipses (sometimes in <optional> tags or brackets).

That seems almost inevitably the outcome of many people having to manually
infer the recommended semantics, for writing something boring but nontrivial,
from a 30k line file.


> > And the lack of argument names in the pg_proc entries is occasionally fairly
> > annoying, because a \df+ doesn't provide enough information to use
> > functions.
> 
> I was also annoyed by this the other day (specifically wrt. the boolean
> arguments to pg_ls_dir),

My bane is regexp_match et al, I have given up on remembering the argument
order.


> and started whipping up a Perl script to parse func.sgml and generate
> missing proargnames values for pg_proc.dat, which is how I discovered the
> above.

Nice.


> The script currently has a pile of hacky regexes to cope with that,
> so I'd be happy to submit a doc patch to turn it into actual markup to get
> rid of that, if people think that's a worthwhile use of time and won't clash
> with any other plans for the documentation.

I guess it's a bit hard to say without knowing how voluminous the changes
would be. If we end up rewriting the whole file the tradeoff is less clear
than if it's a dozen inconsistent entries.


> > It'd also be quite useful if clients could render more of the documentation
> > for functions. People are used to language servers providing full
> > documentation for functions etc...
> 
> A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
> with \h for commands?) would certainly be nice.

Indeed.

Greetings,

Andres Freund




Re: documentation structure

2024-04-17 Thread Andres Freund
Hi,

On 2024-04-17 02:46:53 -0400, Corey Huinker wrote:
> > > This sounds to me like it would be a painful exercise with not a
> > > lot of benefit in the end.
> >
> > Maybe we could _verify_ the contents of func.sgml against pg_proc.
> >
> 
> All of the functions redefined in catalog/system_functions.sql complicate
> using pg_proc.dat as a doc generator or source of validation. We'd probably
> do better to validate against a live instance, and even then the benefit
> wouldn't be great.

There are 80 'CREATE OR REPLACE's in system_functions.sql, 1016 occurrences of
func_table_entry in func.sgml and 3.3k functions in pg_proc. I'm not saying
that differences due to system_functions.sql wouldn't be annoying to deal
with, but it'd also be far from the end of the world.
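
Whichever source is used (pg_proc.dat or a live instance), the verification
itself comes down to a set comparison. The following Python sketch shows
that core step under stated assumptions: the function name
compare_function_names is hypothetical, and the two input lists are made-up
samples, not the actual contents of func.sgml or pg_proc.

```python
def compare_function_names(documented, catalog):
    """Report names present on only one side of the comparison.

    Hypothetical helper: `documented` would come from parsing func.sgml,
    `catalog` from pg_proc (or a query against a live instance).
    """
    documented, catalog = set(documented), set(catalog)
    return {
        "undocumented": sorted(catalog - documented),
        "unknown_in_catalog": sorted(documented - catalog),
    }


# Made-up sample input, for illustration only.
report = compare_function_names(
    documented=["concat", "format", "no_such_function"],
    catalog=["concat", "format", "regr_slope"],
)
print(report)
```

Functions redefined in system_functions.sql would need special handling
before a comparison of signatures, rather than just names, could be trusted.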

Greetings,

Andres Freund




Re: documentation structure

2024-04-17 Thread Corey Huinker
>
> And it's very inconsistent.  For example, some functions use <optional>
> tags for optional parameters, others use square brackets, and some use
> VARIADIC to indicate variadic parameters, others use
> ellipses (sometimes in <optional> tags or brackets).


Having just written a couple of those functions, I wasn't able to find any
guidance on how to document them with regard to <optional> vs [], etc.
Having such a thing would be helpful.

While we're throwing out ideas, does it make sense to have function
parameters and return values be things that can accept COMMENTs? Like so:

COMMENT ON FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [,
...] ] ) ] ARGUMENT argname IS '';
COMMENT ON FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [,
...] ] ) ] RETURN VALUE IS '';

I don't think this is a great idea, but if we're going to auto-generate
documentation then we've got to store the metadata somewhere, and
pg_proc.dat is already lacking relevant details.


Re: documentation structure

2024-04-17 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Definitely shouldn't be the same in all cases, but I think there's a decent
> number of cases where they can be the same. The difference between the two is
> often minimal today.
>
> Entirely randomly chosen example:
>
> { oid => '2825',
>   descr => 'slope of the least-squares-fit linear equation determined by the (X, Y) pairs',
>   proname => 'regr_slope', prokind => 'a', proisstrict => 'f',
>   prorettype => 'float8', proargtypes => 'float8 float8',
>   prosrc => 'aggregate_dummy' },
>
> and
>
>   
>
> 
>  regression slope
> 
> 
>  regr_slope
> 
> regr_slope ( Y double precision, X double precision )
> double precision
>
>
> Computes the slope of the least-squares-fit linear equation determined
> by the (X, Y)
> pairs.
>
>Yes
>   
>
>
> The description is quite similar, the pg_proc entry lacks argument names. 
>
>
>> This sounds to me like it would be a painful exercise with not a
>> lot of benefit in the end.
>
> I think the manual work for writing signatures in sgml is not insignificant,
> nor is the volume of sgml for them. Manually maintaining the signatures makes
> it impractical to significantly improve the presentation - which I don't think
> is all that great today.

And it's very inconsistent.  For example, some functions use <optional>
tags for optional parameters, others use square brackets, and some use
VARIADIC to indicate variadic parameters, others use
ellipses (sometimes in <optional> tags or brackets).

> And the lack of argument names in the pg_proc entries is occasionally fairly
> annoying, because a \df+ doesn't provide enough information to use functions.

I was also annoyed by this the other day (specifically wrt. the boolean
arguments to pg_ls_dir), and started whipping up a Perl script to parse
func.sgml and generate missing proargnames values for pg_proc.dat, which
is how I discovered the above.  The script currently has a pile of hacky
regexes to cope with that, so I'd be happy to submit a doc patch to turn
it into actual markup to get rid of that, if people think that's a
worthwhile use of time and won't clash with any other plans for the
documentation.
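
The Perl script itself is not shown in this thread. As a rough
illustration of the idea only, here is a Python sketch that pulls argument
names out of a plain-text, comma-separated signature. The function name
arg_names and its rules are assumptions; it does not handle markup or
keyword-separated forms such as substring ( bits FROM start FOR count ).

```python
import re


def arg_names(signature: str) -> list:
    """Return the argument names found in a plain-text signature.

    Unnamed arguments (for example VARIADIC "any") yield None.
    """
    start = signature.index("(")
    end = signature.rindex(")")
    # Drop optional-parameter brackets and array brackets alike.
    inner = re.sub(r"[\[\]]", " ", signature[start + 1:end])
    names = []
    for piece in inner.split(","):
        tokens = [t for t in piece.split() if t != "VARIADIC"]
        if not tokens or tokens == ["..."]:
            continue  # empty argument list or a bare ellipsis
        # With at least two tokens the first is the name, the rest the type.
        names.append(tokens[0] if len(tokens) > 1 else None)
    return names


print(arg_names("regexp_count ( string text, pattern text [, start integer [, flags text ] ] )"))
print(arg_names('num_nonnulls ( VARIADIC "any" )'))
```

The need to special-case brackets here is the kind of hack that consistent
markup would remove.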

> It'd also be quite useful if clients could render more of the documentation
> for functions. People are used to language servers providing full
> documentation for functions etc...

A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
with \h for commands?) would certainly be nice.

> Greetings,
>
> Andres Freund

- ilmari




Re: documentation structure

2024-04-17 Thread Corey Huinker
>
> > This sounds to me like it would be a painful exercise with not a
> > lot of benefit in the end.
>
> Maybe we could _verify_ the contents of func.sgml against pg_proc.
>

All of the functions redefined in catalog/system_functions.sql complicate
using pg_proc.dat as a doc generator or source of validation. We'd probably
do better to validate against a live instance, and even then the benefit
wouldn't be great.


Re: documentation structure

2024-04-16 Thread Andres Freund
Hi,

On 2024-04-16 15:05:32 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I think we should work on generating a lot of func.sgml.  Particularly the
> > signature etc should just come from pg_proc.dat, it's pointlessly painful to
> > generate that by hand. And for a lot of the functions we should probably 
> > move
> > the existing func.sgml comments to the description in pg_proc.dat.
>
> Where are you going to get the examples and text descriptions from?

I think there are a few different ways to do that, e.g. having long_desc and
example fields in pg_proc.dat. Or having examples and description in a separate
file and "enriching" that with auto-generated function signatures.


> (And no, I don't agree that the pg_description string should match
> what's in the docs.  The description string has to be a short
> one-liner in just about every case.)

Definitely shouldn't be the same in all cases, but I think there's a decent
number of cases where they can be the same. The difference between the two is
often minimal today.

Entirely randomly chosen example:

{ oid => '2825',
  descr => 'slope of the least-squares-fit linear equation determined by the (X, Y) pairs',
  proname => 'regr_slope', prokind => 'a', proisstrict => 'f',
  prorettype => 'float8', proargtypes => 'float8 float8',
  prosrc => 'aggregate_dummy' },

and

  
   

 regression slope


 regr_slope

regr_slope ( Y double precision, X double precision )
double precision
   
   
Computes the slope of the least-squares-fit linear equation determined
by the (X, Y)
pairs.
   
   Yes
  


The description is quite similar, the pg_proc entry lacks argument names. 
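
As a rough sketch of what an auto-generated signature could look like,
the Python snippet below renders one from a pg_proc.dat-style entry. The
function render_signature and its output format are assumptions for
illustration, and the proargnames value '{Y,X}' is hypothetical: it is
taken from the documented signature, since the entry above has none.

```python
def render_signature(entry: dict) -> str:
    """Render 'name ( args ) returns type' from a pg_proc.dat-like dict.

    Simplified sketch: only input arguments listed in proargtypes are
    considered, and type names are printed as stored.
    """
    types = entry.get("proargtypes", "").split()
    raw_names = entry.get("proargnames", "")
    names = [n.strip() for n in raw_names.strip("{}").split(",")] if raw_names else []
    args = []
    for i, typ in enumerate(types):
        name = names[i] if i < len(names) else ""
        args.append(f"{name} {typ}" if name else typ)
    arglist = f" {', '.join(args)} " if args else ""
    return f"{entry['proname']} ({arglist}) returns {entry['prorettype']}"


regr_slope = {
    "proname": "regr_slope",
    "prorettype": "float8",
    "proargtypes": "float8 float8",
}
print(render_signature(regr_slope))
# Same entry with hypothetical argument names added.
print(render_signature({**regr_slope, "proargnames": "{Y,X}"}))
```

Without argument names the generated signature is noticeably less useful
than the hand-written one.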


> This sounds to me like it would be a painful exercise with not a
> lot of benefit in the end.

I think the manual work for writing signatures in sgml is not insignificant,
nor is the volume of sgml for them. Manually maintaining the signatures makes
it impractical to significantly improve the presentation - which I don't think
is all that great today.

And the lack of argument names in the pg_proc entries is occasionally fairly
annoying, because a \df+ doesn't provide enough information to use functions.

It'd also be quite useful if clients could render more of the documentation
for functions. People are used to language servers providing full
documentation for functions etc...

Greetings,

Andres Freund




Re: documentation structure

2024-04-16 Thread Bruce Momjian
On Tue, Apr 16, 2024 at 03:05:32PM -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I think we should work on generating a lot of func.sgml.  Particularly the
> > signature etc should just come from pg_proc.dat, it's pointlessly painful to
> > generate that by hand. And for a lot of the functions we should probably 
> > move
> > the existing func.sgml comments to the description in pg_proc.dat.
> 
> Where are you going to get the examples and text descriptions from?
> (And no, I don't agree that the pg_description string should match
> what's in the docs.  The description string has to be a short
> one-liner in just about every case.)
> 
> This sounds to me like it would be a painful exercise with not a
> lot of benefit in the end.

Maybe we could _verify_ the contents of func.sgml against pg_proc.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: documentation structure

2024-04-16 Thread Tom Lane
Andres Freund  writes:
> I think we should work on generating a lot of func.sgml.  Particularly the
> signature etc should just come from pg_proc.dat, it's pointlessly painful to
> generate that by hand. And for a lot of the functions we should probably move
> the existing func.sgml comments to the description in pg_proc.dat.

Where are you going to get the examples and text descriptions from?
(And no, I don't agree that the pg_description string should match
what's in the docs.  The description string has to be a short
one-liner in just about every case.)

This sounds to me like it would be a painful exercise with not a
lot of benefit in the end.

I do agree with Andrew that splitting func.sgml into multiple files
would be beneficial.

regards, tom lane




Re: documentation structure

2024-04-16 Thread Andres Freund
Hi,

On 2024-03-19 17:39:39 -0400, Andrew Dunstan wrote:
> My own pet docs peeve is a purely editorial one: func.sgml is a 30k line
> beast, and I think there's a good case for splitting out at least the
> larger chunks of it.

I think we should work on generating a lot of func.sgml.  Particularly the
signature etc should just come from pg_proc.dat, it's pointlessly painful to
generate that by hand. And for a lot of the functions we should probably move
the existing func.sgml comments to the description in pg_proc.dat.

I suspect that we can't just generate all the documentation from pg_proc,
because of xrefs etc.  Although perhaps we could just strip those out for
pg_proc.

We'd need to add some more metadata to pg_proc, for grouping kinds of
functions together. But that seems doable.

Greetings,

Andres Freund




Re: documentation structure

2024-04-15 Thread Robert Haas
On Mon, Apr 8, 2024 at 10:15 AM Peter Eisentraut  wrote:
> > Here is a new version of this patch. I think this is v18 material at
> > this point, absent an outcry to the contrary. Sometimes we're flexible
> > about doc patches.
>
> Looks good to me.  I think this could go into PG17.

Hearing no objections, done.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-04-14 Thread jian he
On Wed, Mar 20, 2024 at 5:40 AM Andrew Dunstan  wrote:
>
>
> +many for improving the index.
>
> My own pet docs peeve is a purely editorial one: func.sgml is a 30k line 
> beast, and I think there's a good case for splitting out at least the larger 
> chunks of it.
>

I think I successfully reduced func.sgml from 311322 lines to 13167 lines.
(base-commit: 93582974315174d544592185d797a2b44696d1e5)

A single patch for this would be unreviewable.
The key gotcha is to put the contents between the opening `<sect1>` and closing
`</sect1>` (both inclusive)
into a new file.
In func.sgml, use an entity reference to include the new file.
Also update filelist.sgml.

Here is how I did it:
I found that these built HTML files are the biggest:
doc/src/sgml/html/functions-string.html
doc/src/sgml/html/functions-matching.html
doc/src/sgml/html/functions-datetime.html
doc/src/sgml/html/functions-json.html
doc/src/sgml/html/functions-aggregate.html
doc/src/sgml/html/functions-info.html
doc/src/sgml/html/functions-admin.html

So I created these new sgml files to hold the corresponding content:
func-string.sgml
func-matching.sgml
func-datetime.sgml
func-json.sgml
func-aggregate.sgml
func-info.sgml
func-admin.sgml

based on the func.sgml structure pattern:

next section1 line number:



next section1 line number:



next section1 line number:



next section1 line number:



next section1 line number:



next section1 line number:



next section1 line number:


step1: write the relevant line ranges to the new sgml files.
(example: lines 2407 to 4177 contain all the content corresponding to
functions-string.html)

sed -n '2407,4177 p' func.sgml > func-string.sgml
sed -n '5328,7756 p' func.sgml >  func-matching.sgml
sed -n '8939,11122 p' func.sgml > func-datetime.sgml
sed -n '15498,19348 p' func.sgml > func-json.sgml
sed -n '21479,22896 p' func.sgml > func-aggregate.sgml
sed -n '24257,27896 p' func.sgml > func-info.sgml
sed -n '27898,30579 p' func.sgml > func-admin.sgml

step2:
delete these line ranges from func.sgml in place:
sed --in-place \
  "2407,4177d ; 5328,7756d ; 8939,11122d ; 15498,19348d ; 21479,22896d ; 24257,27896d ; 27898,30579d" \
  func.sgml
references:
https://unix.stackexchange.com/questions/676210/matching-multiple-ranges-with-sed-range-expressions
https://www.gnu.org/software/sed/manual/sed.html#Command_002dLine-Options

step3:
put the following lines at the corresponding positions in func.sgml:
(based on the structure pattern above, the positions are quick to locate)

`







`

step4: update filelist.sgml:
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 3fb0709f..0b78a361 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -18,6 +18,13 @@
 
 
 
+
+
+
+
+
+
+
 
 
 

 doc/src/sgml/filelist.sgml   | 7 +
 doc/src/sgml/func-admin.sgml |  2682 +
 doc/src/sgml/func-aggregate.sgml |  1418 +++
 doc/src/sgml/func-datetime.sgml  |  2184 
 doc/src/sgml/func-info.sgml  |  3640 ++
 doc/src/sgml/func-json.sgml  |  3851 ++
 doc/src/sgml/func-matching.sgml  |  2429 
 doc/src/sgml/func-string.sgml|  1771 +++
 doc/src/sgml/func.sgml   | 17979 +

we can do it one by one, but it's still worth it.
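
Steps 1 to 3 above can also be done in a single pass. The Python sketch
below is one way to do that, under stated assumptions: split_ranges is a
hypothetical helper, and the placeholder it leaves behind (an entity
reference named after the new file) is illustrative, since the actual
entity names are not shown in this message.

```python
def split_ranges(lines, ranges):
    """Cut 1-indexed inclusive line ranges out of `lines`.

    `ranges` is a list of (start, end, name) tuples. Each range is
    replaced by one placeholder line "&name;" (hypothetical entity name).
    Returns (remaining_lines, {name: extracted_lines}).
    """
    extracted = {}
    remaining = []
    pos = 1  # next original line (1-indexed) still to be copied
    for start, end, name in sorted(ranges):
        if start < pos or end < start or end > len(lines):
            raise ValueError(f"bad or overlapping range: {start},{end}")
        remaining.extend(lines[pos - 1:start - 1])
        extracted[name] = lines[start - 1:end]
        remaining.append(f"&{name};")
        pos = end + 1
    remaining.extend(lines[pos - 1:])
    return remaining, extracted


# Tiny stand-in for func.sgml; real use would read the file's lines.
demo = ["a", "b", "c", "d", "e", "f"]
rest, parts = split_ranges(demo, [(2, 3, "func-x"), (5, 5, "func-y")])
print(rest)
print(parts)
```

Because all ranges refer to the original line numbers, as in the single sed
invocation of step 2, the order in which they are listed does not matter.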




Re: documentation structure

2024-04-08 Thread Peter Eisentraut

On 05.04.24 17:11, Robert Haas wrote:

4. Consolidate the "Generic WAL Records" and "Custom WAL Resource
Managers" chapters, which cover related topics, into a single one. I
didn't see anyone object to this, but David Johnston pointed out that
the patch I posted was a few bricks short of a load, because it really
needed to put some introductory text into the new chapter. I'll study
this a bit more and propose a new patch that does the same thing a bit
more carefully than my previous version did.


Here is a new version of this patch. I think this is v18 material at
this point, absent an outcry to the contrary. Sometimes we're flexible
about doc patches.


Looks good to me.  I think this could go into PG17.




Re: documentation structure

2024-04-05 Thread David G. Johnston
On Fri, Apr 5, 2024 at 9:18 AM Robert Haas  wrote:

> On Fri, Apr 5, 2024 at 12:15 PM David G. Johnston
>  wrote:
> > Here is a link to my attempt at this a couple of years ago.  It
> > basically "abuses" refentry.
> >
> > https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com
> >
> > I never did dive into the man page or PDF dynamics of this particular
> > change but it seemed to solve HTML pagination without negative consequences
> > and with minimal risk of unintended consequences since only the markup on
> > the pages we want to alter is changed, not global configuration.
>
> Hmm, but it seems like that might have generated some man page entries
> that we don't want?
>

If so (didn't check) maybe just remove them in post?

David J.


Re: documentation structure

2024-04-05 Thread Robert Haas
On Fri, Apr 5, 2024 at 12:15 PM David G. Johnston
 wrote:
> Here is a link to my attempt at this a couple of years ago.  It basically 
> "abuses" refentry.
>
> https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com
>
> I never did dive into the man page or PDF dynamics of this particular change 
> but it seemed to solve HTML pagination without negative consequences and with 
> minimal risk of unintended consequences since only the markup on the pages we 
> want to alter is changed, not global configuration.

Hmm, but it seems like that might have generated some man page entries
that we don't want?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-04-05 Thread David G. Johnston
On Fri, Apr 5, 2024 at 9:01 AM Robert Haas  wrote:

>
> > The rendering can be adjusted to some degree, but then we also need to
> > make sure any new chunking makes sense in other chapters.  (And it might
> > also change a bunch of externally known HTML links.)
>
> I looked into this and I'm unclear how much customization is possible.
>
>
Here is a link to my attempt at this a couple of years ago.  It basically
"abuses" refentry.

https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com

I never did dive into the man page or PDF dynamics of this
particular change but it seemed to solve HTML pagination without negative
consequences and with minimal risk of unintended consequences since only
the markup on the pages we want to alter is changed, not global
configuration.

David J.


Re: documentation structure

2024-04-05 Thread Robert Haas
On Mon, Mar 25, 2024 at 11:40 AM Peter Eisentraut  wrote:
> I think a possible problem we need to consider with these proposals to
> combine chapters is that they could make the chapters themselves too
> deep and harder to navigate.

I looked into various options for further combining chapters and/or
appendixes and found that this is indeed a huge problem. For example,
I had thought of creating a Developer Information chapter in the
appendix and moving various existing chapters and appendixes inside of
it, but that means that the  elements in those chapters get
demoted to , and what used to be a whole chapter or appendix
becomes a . And since you get one HTML page per , that
means that instead of a bunch of individual HTML pages of very
pleasant length, you suddenly get one very long HTML page that is,
exactly as you say, hard to navigate.

> The rendering can be adjusted to some degree, but then we also need to
> make sure any new chunking makes sense in other chapters.  (And it might
> also change a bunch of externally known HTML links.)

I looked into this and I'm unclear how much customization is possible.
I gather that the current scheme comes from having chunk.section.depth
of 1, and I guess you can change that to 2 to get an HTML page per
, but it seems like it would take a LOT of restructuring to
make that work. It would be much easier if you could vary this across
different parts of the documentation; for instance, if you could say,
well, in this particular chapter or appendix, I want
chunk.section.depth of 2, but elsewhere 1, that would be quite handy,
but after several hours reading various things about DocBook on the
Internet, I was still unable to determine conclusively whether this
was possible. There's an interesting comment in
stylesheet-speedup-xhtml.xsl that says "Since we set a fixed
$chunk.section.depth, we can do away with a bunch of complicated XPath
searches for the previous and next sections at various levels." That
sounds like it's suggesting that it is in fact possible for this
setting to vary, but I don't know if that's true, or how to do it, and
it sounds like there might be performance consequences, too.
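
To make the "vary it per chapter" idea concrete, here is a rough Python
model of it. This is only a sketch of the arithmetic, not DocBook XSL: the
function names, the overrides mapping, and the example titles are all
hypothetical, and it assumes the simplified rule that a chapter always gets
a page and a section gets its own page only when its nesting depth is at
most the chunk depth in effect for that chapter.

```python
def count_section_pages(sections, depth_limit, depth=1):
    """Count sections that get their own page (depth <= depth_limit)."""
    if depth > depth_limit:
        return 0
    return sum(
        1 + count_section_pages(children, depth_limit, depth + 1)
        for _title, children in sections
    )


def pages_per_chapter(book, default_depth=1, overrides=None):
    """Return {chapter title: number of HTML pages} for a list of chapters.

    `overrides` maps a chapter title to a chunk depth that replaces
    `default_depth` for that chapter only (a hypothetical capability).
    """
    overrides = overrides or {}
    return {
        title: 1 + count_section_pages(sections, overrides.get(title, default_depth))
        for title, sections in book
    }


# Hypothetical combined chapter: two former chapters demoted to sections.
BOOK = [
    (
        "Developer Information",
        [
            ("Former Chapter A", [("A.1", []), ("A.2", [])]),
            ("Former Chapter B", [("B.1", [])]),
        ],
    ),
    ("Some Other Chapter", [("Only Section", [])]),
]

if __name__ == "__main__":
    print(pages_per_chapter(BOOK))
    print(pages_per_chapter(BOOK, overrides={"Developer Information": 2}))
```

With a uniform depth of 1 the combined chapter collapses to three pages;
with the override it gets six, while the other chapter is unaffected.
Whether the real stylesheets can be made to behave this way is exactly the
part I could not determine.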

> I think maybe more could also be done at the top-level structure, too.
> Right now, we have  ->  -> .  We could add  on
> top of that.

Does this let you create structures of non-uniform depth? i.e. is
there a way that we can group some chapters into sets while leaving
others as standalone chapters, or somesuch?

I'm not 100% confident that non-uniform depth (either via  or via
chunk.section.depth or via some other mechanism) is a good idea.
There's a sort of uniformity to our navigation right now that does
have some real appeal. The downside, though, is that if you want
something to be a single HTML page, it's got to either be a chapter
(or appendix) by itself with no sections inside of it, or it's got to
be a  inside of a chapter, and so anything that's long enough
that it should be an HTML page by itself can never be more than one
level below the index. And that seems to make it quite difficult to
keep the index small.

Without some kind of variable-depth structure, the only other ways
that I can see to improve things are:

1. Make chunk.section.depth 2 and restructure the entire documentation
until the results look reasonable. This might be possible but I bet
it's difficult. We have, at present, chapters of *wildly* varying
length, from a few sentences to many, many pages. That is perhaps a
bad thing; you most likely wouldn't do that in a printed book. But
fixing it is a huge project. We don't necessarily have the same amount
of content about each topic, and there isn't necessarily a way of
grouping related topics together that produces units of relatively
uniform length. I think it's sensible to try to make improvements
where we can, by pushing stuff down that's short and not that
important, but finding our way to a chunk.section.depth=2 world that
feels good to most people compared to what we have today seems like
it's going to be challenging.

2. Replace the current index with a custom index or landing page of
some kind. Or keep the current index and add a new landing page
alongside it. Something that isn't derived automatically from the
documentation structure but is created by hand.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-04-05 Thread Robert Haas
On Fri, Mar 29, 2024 at 9:40 AM Robert Haas  wrote:
> 2. Demote "Monitoring Disk Usage" from a chapter on its own to a
> section of the "Monitoring Database Activity" chapter. I haven't seen
> any objections to this, and I'd like to move ahead with it.
>
> 3. Merge the separate chapters on various built-in index AMs into one.
> Peter didn't think this was a good idea, but Tom and Alvaro's comments
> focused on how to do it mechanically, and on whether the chapters
> needed to be reordered afterwards, which I took to mean that they were
> OK with the basic concept. David Johnston was also clearly in favor of
> it. So I'd like to move ahead with this one, too.

I committed these two patches.

> 4. Consolidate the "Generic WAL Records" and "Custom WAL Resource
> Managers" chapters, which cover related topics, into a single one. I
> didn't see anyone object to this, but David Johnston pointed out that
> the patch I posted was a few bricks short of a load, because it really
> needed to put some introductory text into the new chapter. I'll study
> this a bit more and propose a new patch that does the same thing a bit
> more carefully than my previous version did.

Here is a new version of this patch. I think this is v18 material at
this point, absent an outcry to the contrary. Sometimes we're flexible
about doc patches.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v3-0001-docs-Consolidate-into-new-WAL-for-Extensions-chap.patch
Description: Binary data


Re: documentation structure

2024-03-29 Thread Robert Haas
OK, so I'm coming back to this thread after giving it a few days to
cool off. My last series of patches proposed to do five things:

1. Merge the four-sentence "Installation from Binaries" chapter back
into "Installation from Source". I thought this was a slam-dunk, but
Peter pointed out that exactly the opposite of this was done a few
years ago to create the "Installation from Binaries" chapter in the
first place. Based on subsequent discussion, what I'm now inclined to
do is come up with a new proposal that involves moving the information
about compiling from source to an appendix. So never mind about this
one for now.

2. Demote "Monitoring Disk Usage" from a chapter on its own to a
section of the "Monitoring Database Activity" chapter. I haven't seen
any objections to this, and I'd like to move ahead with it.

3. Merge the separate chapters on various built-in index AMs into one.
Peter didn't think this was a good idea, but Tom and Alvaro's comments
focused on how to do it mechanically, and on whether the chapters
needed to be reordered afterwards, which I took to mean that they were
OK with the basic concept. David Johnston was also clearly in favor of
it. So I'd like to move ahead with this one, too.

4. Consolidate the "Generic WAL Records" and "Custom WAL Resource
Managers" chapters, which cover related topics, into a single one. I
didn't see anyone object to this, but David Johnston pointed out that
the patch I posted was a few bricks short of a load, because it really
needed to put some introductory text into the new chapter. I'll study
this a bit more and propose a new patch that does the same thing a bit
more carefully than my previous version did.

5. Consolidate all of the procedural language chapters into one. This
was clearly the most controversial part of the proposal. I'm going to
lay this one aside for now and possibly come back to it at a later
time.

I hope that this way of proceeding makes sense to people.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-25 Thread Robert Haas
On Mon, Mar 25, 2024 at 11:40 AM Peter Eisentraut  wrote:
> I think a possible problem we need to consider with these proposals to
> combine chapters is that they could make the chapters themselves too
> deep and harder to navigate.  For example, if we combined the
> installation from source and binaries chapters, the structure of the new
> chapter would presumably be

I agree with this in theory, but in practice I think the patches that
I posted don't have this issue to a degree that is problematic, and I
posted some specific proposals on adjustments that we could make to
ameliorate the problem if other people feel differently.

> I think maybe more could also be done at the top-level structure, too.
> Right now, we have  ->  -> .  We could add  on
> top of that.
>
> We could also play with CSS or JavaScript to make the top-level table of
> contents more navigable, with collapsing subsections or whatever.
>
> We could also render additional tables of contents or indexes, so there
> is more than one way to navigate into the content from the top.
>
> We could also build better search.

These are all reasonable ideas. I think some better CSS and JavaScript
could definitely help, and I also wondered whether the entrypoint to
the documentation has to be the index page, or whether it could maybe
be a page we've crafted specifically for that purpose, that might
include some text as well as a bunch of links.

But that having been said, I don't believe that any of those ideas (or
anything else we do) will obviate the need for some curation of the
toplevel index. If you're going to add another level, as you propose
in the first point, you still need to make decisions about which
things properly go at which levels. If you're going to allow for
collapsing subsections, you still want the overall tree in which
subsections are expanded and collapsed to make logical sense. If
you have multiple ways to navigate to the content, one of them will
probably still be the index, and it should be good. And good search is
good, but it shouldn't be the only convenient way to find the content.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-25 Thread Peter Eisentraut

On 22.03.24 15:10, Robert Haas wrote:

Sorry. I didn't mean to dispute the point that the section was added a
few years ago, nor the point that most people just want to read about
the binaries. I am confident that both of those things are true. What
I do want to dispute is that having a four-sentence chapter in the
documentation index that tells people something they can find much
more easily without using the documentation at all is a good plan.


I think a possible problem we need to consider with these proposals to 
combine chapters is that they could make the chapters themselves too 
deep and harder to navigate.  For example, if we combined the 
installation from source and binaries chapters, the structure of the new 
chapter would presumably be


 Installation
Installation from Binaries
Installation from Source
 Requirements
 Getting the Source
 Building and Installation with Autoconf and Make
 Building and Installation with Meson
etc.

This would mean that the entire "Installation from Source" part would be 
rendered on a single HTML page.
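
As a rough illustration of that effect, the following Python sketch assigns
each heading in the outline above to an HTML page. It is a simplified model
under stated assumptions, not the actual DocBook chunking code: the chapter
is depth 0, a section gets its own page only if its depth is at most
chunk_depth, and anything deeper is rendered on the page of its nearest
chunked ancestor. The function name assign_pages is made up for this
example.

```python
# The proposed chapter as nested (title, children) tuples.
INSTALLATION = (
    "Installation",
    [
        ("Installation from Binaries", []),
        (
            "Installation from Source",
            [
                ("Requirements", []),
                ("Getting the Source", []),
                ("Building and Installation with Autoconf and Make", []),
                ("Building and Installation with Meson", []),
            ],
        ),
    ],
)


def assign_pages(chapter, chunk_depth):
    """Map each page title to the titles of the sections folded into it."""
    pages = {}

    def walk(node, depth, current_page):
        title, children = node
        if depth <= chunk_depth:
            # This node starts a new HTML page of its own.
            current_page = title
            pages[current_page] = []
        else:
            # Too deep to be chunked: rendered inside the ancestor's page.
            pages[current_page].append(title)
        for child in children:
            walk(child, depth + 1, current_page)

    walk(chapter, 0, None)
    return pages


if __name__ == "__main__":
    for page, folded in assign_pages(INSTALLATION, chunk_depth=1).items():
        print(page, "<-", folded)
```

With chunk_depth=1 the model yields three pages, and all four source-build
sections land on the "Installation from Source" page; with chunk_depth=2
each of them would get a page of its own.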


The rendering can be adjusted to some degree, but then we also need to 
make sure any new chunking makes sense in other chapters.  (And it might 
also change a bunch of externally known HTML links.)


I think maybe more could also be done at the top-level structure, too. 
Right now, we have  ->  -> .  We could add  on 
top of that.


We could also play with CSS or JavaScript to make the top-level table of 
contents more navigable, with collapsing subsections or whatever.


We could also render additional tables of contents or indexes, so there 
is more than one way to navigate into the content from the top.


We could also build better search.





Re: documentation structure

2024-03-25 Thread Peter Eisentraut

On 22.03.24 14:59, Robert Haas wrote:

And I don't believe that if someone were writing a physical book about
PostgreSQL from scratch, they'd ever end up with a top-level chapter
that looks anything like our GiST chapter. All of the index AM
chapters are quite obviously clones of each other, and they're all
quite short. Surely you'd make them sections within a chapter, not
entire chapters.

I do agree that PL/pgsql is more arguable. I can imagine somebody
writing a book about PostgreSQL and choosing to make that topic into a
whole chapter.


Yeah, I think there is probably a range of things from pretty obvious 
to mostly controversial.





Re: documentation structure

2024-03-22 Thread Robert Haas
On Fri, Mar 22, 2024 at 3:17 PM Bruce Momjian  wrote:
> I agree and they should be with the other views.  I was just explaining
> why, at the time, I didn't touch them.

Ah, OK. That makes total sense.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread Bruce Momjian
On Fri, Mar 22, 2024 at 03:13:29PM -0400, Robert Haas wrote:
> On Fri, Mar 22, 2024 at 2:59 PM Bruce Momjian  wrote:
> > I assume statistics collector views are in "Monitoring Database
> > Activity" because that is their purpose.
> 
> Well, yes. :-)
> 
> But the point is that all other statistics views are in a single
> section regardless of their purpose. We don't document pg_roles in the
> "Database Roles" chapter, for example.
> 
> And on the flip side, pg_locks and pg_replication_origin_status are
> also for monitoring database activity, but they're in the "System
> Views" chapter anyway. The only system views that are in "Monitoring
> Database Activity" rather than "System Views" are the ones where the
> name starts with "pg_stat_".
> 
> So the reason you state is why these views are under "Monitoring
> Database Activity" rather than a chapter chosen at random. But it
> doesn't really explain why they're separate from the other system
> views at all. That seems to be a pretty much random choice, AFAICT.

I agree and they should be with the other views.  I was just explaining
why, at the time, I didn't touch them.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: documentation structure

2024-03-22 Thread Robert Haas
On Fri, Mar 22, 2024 at 2:59 PM Bruce Momjian  wrote:
> I assume statistics collector views are in "Monitoring Database
> Activity" because that is their purpose.

Well, yes. :-)

But the point is that all other statistics views are in a single
section regardless of their purpose. We don't document pg_roles in the
"Database Roles" chapter, for example.

And on the flip side, pg_locks and pg_replication_origin_status are
also for monitoring database activity, but they're in the "System
Views" chapter anyway. The only system views that are in "Monitoring
Database Activity" rather than "System Views" are the ones where the
name starts with "pg_stat_".

So the reason you state is why these views are under "Monitoring
Database Activity" rather than a chapter chosen at random. But it
doesn't really explain why they're separate from the other system
views at all. That seems to be a pretty much random choice, AFAICT.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread David G. Johnston
On Fri, Mar 22, 2024 at 11:19 AM Robert Haas  wrote:

> On Fri, Mar 22, 2024 at 1:35 PM Bruce Momjian  wrote:
>
> But that all seems like a separate question from why we have the
> statistics collector views in a completely different part of the
> documentation from the rest of the system views. My guess is that it's
> just kind of a historical accident, but maybe there was some other
> logic to it.
>
>
The details underpinning the cumulative statistics subsystem are
definitely large enough to warrant their own subsection. And it isn't like
placing them into the monitoring chapter is wrong; aside from a couple
of views, those under System Views don't fit into what we've defined as
monitoring.  I don't have any desire to lump them under the generic system
views; which itself could probably use a level of categorization since the
nature of pg_locks and pg_cursors is decidedly different than pg_indexes
and pg_config.  This all becomes more appealing to work on once we solve
the problem of all sect2 entries being placed on a single page.

I struggled for a long while where I'd always look for pg_stat_activity
under system views instead of monitoring.  Amending my prior suggestion in
light of this I would suggest we move the Cumulative Statistics Views into
Reference but as its own Chapter, not part of System Views, and change its
name to "Monitoring Views" (going more generalized here feels like a win to
me). I'd move pg_locks, pg_cursors, pg_backend_memory_contexts,
pg_prepared_*, pg_shmem_allocations, and pg_replication_*.  Those all have
the same general monitoring nature to them compared to the others that
basically provide details regarding schema and static or session
configuration.

The original server admin monitoring section can go into detail regarding
Cumulative Statistics versus other kinds of monitoring.  We can use
section ordering to fulfill logical grouping desires until we are able to
make section3 entries appear on their own pages.

David J.


Re: documentation structure

2024-03-22 Thread Bruce Momjian
On Fri, Mar 22, 2024 at 02:19:29PM -0400, Robert Haas wrote:
> If you were actually looking for the section called "System Views",
> you weren't likely to see it here unless you already knew it was
> there, because it was 64 items into a 97-item list. Having one of
> these two sections inside the other just doesn't work at all. We could
> have alternatively chosen to have one chapter with two  tags
> inside of it, but I think what you actually did was perfectly fine.
> IMHO, "System Views" is important enough (and big enough) that giving
> it its own chapter is perfectly reasonable.
> 
> But that all seems like a separate question from why we have the
> statistics collector views in a completely different part of the
> documentation from the rest of the system views. My guess is that it's
> just kind of a historical accident, but maybe there was some other
> logic to it.

I assume statistics collector views are in "Monitoring Database
Activity" because that is their purpose.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: documentation structure

2024-03-22 Thread Robert Haas
On Fri, Mar 22, 2024 at 1:35 PM Bruce Momjian  wrote:
> > One question I have is why all of these views are documented here
> > rather than in chapter 53, "System Views," because surely they are
> > system views. I feel like if our documentation index weren't a mile
> > long and if you could easily find the entry for "System Views," that's
> > where you would naturally look for these details. I don't think it's
> > natural for a user to expect that most of the system views are going
> > to be documented in section VII, chapter 53 but one particular kind is
> > going to be documented in section III, chapter 27, under a chapter
>
> Well, until this commit in 2022, the system views were _under_ the
> system catalogs chapter:

Even before that commit, the statistics collector views were
documented in a completely separate part of the documentation from all
of the other system views.

I think that commit was a good idea, even though it made the top-level
documentation index bigger, because in v14, the "System Catalogs"
chapter looks like this:

...
52.61. pg_ts_template
52.62. pg_type
52.63. pg_user_mapping
52.64. System Views
52.65. pg_available_extensions
52.66. pg_available_extension_versions
52.67. pg_backend_memory_contexts
...

If you were actually looking for the section called "System Views",
you weren't likely to see it here unless you already knew it was
there, because it was 64 items into a 97-item list. Having one of
these two sections inside the other just doesn't work at all. We could
have alternatively chosen to have one chapter with two  tags
inside of it, but I think what you actually did was perfectly fine.
IMHO, "System Views" is important enough (and big enough) that giving
it its own chapter is perfectly reasonable.

But that all seems like a separate question from why we have the
statistics collector views in a completely different part of the
documentation from the rest of the system views. My guess is that it's
just kind of a historical accident, but maybe there was some other
logic to it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread Bruce Momjian
On Fri, Mar 22, 2024 at 08:32:14AM -0400, Robert Haas wrote:
> On Thu, Mar 21, 2024 at 6:32 PM David G. Johnston
>  wrote:
> > Just going to note that the section on the cumulative statistics views 
> > being a single page is still a strongly bothersome issue here.  Though the 
> > quick fix actually requires upgrading the section to chapter status...
> 
> Yeah, I've been bothered by this in the past, too. I'm not very keen
> to start promoting things to the top-level, though. I think we need a
> more thoughtful fix than that.
> 
> One question I have is why all of these views are documented here
> rather than in chapter 53, "System Views," because surely they are
> system views. I feel like if our documentation index weren't a mile
> long and if you could easily find the entry for "System Views," that's
> where you would naturally look for these details. I don't think it's
> natural for a user to expect that most of the system views are going
> to be documented in section VII, chapter 53 but one particular kind is
> going to be documented in section III, chapter 27, under a chapter

Well, until this commit in 2022, the system views were _under_ the
system catalogs chapter:

commit 64d364bb39c
Author: Bruce Momjian 
Date:   Thu Jul 14 16:07:12 2022 -0400

doc:  move system views section to its own chapter

Previously it was inside the system catalogs chapter.

Reported-by: Peter Smith

Discussion: 
https://postgr.es/m/cahut+psmc18qp60d+l0hjboxrlqt5m88yvacdyxlq34gfph...@mail.gmail.com

Backpatch-through: 15

The thread contains more discussion of the issue, and I think it still needs help:


https://www.postgresql.org/message-id/flat/CAHut%2BPsMc18QP60D%2BL0hJBOXrLQT5m88yVaCDyxLq34gfPHsow%40mail.gmail.com

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: documentation structure

2024-03-22 Thread David G. Johnston
On Fri, Mar 22, 2024, 09:32 Robert Haas  wrote:

>
>
> I notice that you say that the "Installation" section should "cover
> the architectural overview and point people to where they can find the
> stuff they need to install PostgreSQL in the various ways available to
> them" so maybe you're not imagining a four-sentence chapter, either.
>

Fair point but I posit that new users are looking for a chapter named
Installation in the documentation.  At least the ones willing to read
documentation.  Having two of them isn't needed but having zero doesn't
make sense either.

The current proposal does that so I'm ok as-is but it can be further
improved by moving source install talk elsewhere and having the
installation chapter redirect the reader there for details.  I'm not
concerned with how long or short the resultant installation chapter is.

David J.



Re: documentation structure

2024-03-22 Thread Robert Haas
On Fri, Mar 22, 2024 at 11:50 AM David G. Johnston
 wrote:
> On Fri, Mar 22, 2024 at 7:10 AM Robert Haas  wrote:
>> That's actually what we had in chapter
>> 18, "Installation from Source Code on Windows", since removed. But for
>> some reason we decided that on non-Windows platforms, it needed a
>> whole new chapter rather than an extra sentence in the existing one. I
>> think that's massively overkill.
>
> I agree with the premise that we should have a single chapter, in the main 
> documentation flow, named "Installation".  It should cover the architectural 
> overview and point people to where they can find the stuff they need to 
> install PostgreSQL in the various ways available to them.  I agree with 
> moving the source installation material to the appendix.  None of the 
> sections under Installation would then actually detail how to install the 
> software since that isn't something the project itself handles but has 
> delegated to packagers for the vast majority of cases and the source install 
> details are in the appendix for the one "supported" mechanism that most 
> people do not use.

Hmm, that's not quite the same as my position. I'm fine with either
moving the installation from source material to an appendix, or
leaving it where it is. But I'm strongly against continuing to have a
chapter with four sentences in it that says to use the same download
link that is on the main navigation bar of every page on the
postgresql.org web site. We're never going to get the chapter index
down to a reasonable size if we insist on having chapters that have a
totally de minimis amount of content.

So my feeling is that if we keep the installation from source material
where it is, then we can make it also mention the download link, just
as we used to do in the installation-on-windows chapter. But if we
banish installation from source to the appendixes, then we shouldn't
keep a whole chapter in the main documentation to tell people
something that is anyway obvious. I don't really think that material
needs to be there at all, but if we want to have it, surely we can
find someplace to put it such that it doesn't require a whole chapter
to say that and nothing else. It could, for example, go at the beginning
of the "Server Setup and Operation" chapter; if that
were the first chapter of section III, I think that would be natural
enough.

I notice that you say that the "Installation" section should "cover
the architectural overview and point people to where they can find the
stuff they need to install PostgreSQL in the various ways available to
them" so maybe you're not imagining a four-sentence chapter, either.
But this project is going to be impossible unless we stick to limited
goals. We can, and should, rewrite some sections of the documentation
to be more useful; but if we try to do that as part of the same
project that aims to tidy up the index, the chances of us getting
stuck in an endless bikeshedding loop go from "high" to "certain". So
I don't want to hypothesize the existence of an installation chapter
that isn't any of the things we have today. Let's try to get the
things we have into places that make sense, and then consider other
improvements separately.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread David G. Johnston
On Fri, Mar 22, 2024 at 7:10 AM Robert Haas  wrote:

>
> That's actually what we had in chapter
> 18, "Installation from Source Code on Windows", since removed. But for
> some reason we decided that on non-Windows platforms, it needed a
> whole new chapter rather than an extra sentence in the existing one. I
> think that's massively overkill.
>
>
I agree with the premise that we should have a single chapter, in the main
documentation flow, named "Installation".  It should cover the
architectural overview and point people to where they can find the stuff
they need to install PostgreSQL in the various ways available to them.  I
agree with moving the source installation material to the appendix.  None
of the sections under Installation would then actually detail how to
install the software since that isn't something the project itself handles
but has delegated to packagers for the vast majority of cases and the
source install details are in the appendix for the one "supported"
mechanism that most people do not use.

David J.


Re: documentation structure

2024-03-22 Thread Robert Haas
On Fri, Mar 22, 2024 at 9:35 AM Peter Eisentraut  wrote:
> >> But this separation was explicitly added a few years ago, because most
> >> people just want to read about the binaries.
> >
> > I really doubt that this is true.
>
> Here is the thread:
> https://www.postgresql.org/message-id/flat/CABUevExRCf8waYOsrCO-QxQL50XGapMf5dnWScOXj7X%3DMXW--g%40mail.gmail.com

Sorry. I didn't mean to dispute the point that the section was added a
few years ago, nor the point that most people just want to read about
the binaries. I am confident that both of those things are true. What
I do want to dispute is that having a four-sentence chapter in the
documentation index that tells people something they can find much
more easily without using the documentation at all is a good plan. I
agree with the concern that Magnus expressed on the thread, i.e:

> It's kind of strange that if you start your PostgreSQL journey by reading our 
> instructions, you get nothing useful about installing PostgreSQL from binary 
> packages other than "go ask somebody else about it".

But I don't agree that this was the right way to address that problem.
I think it would have been better to just add the download link to the
existing installation chapter. That's actually what we had in chapter
18, "Installation from Source Code on Windows", since removed. But for
some reason we decided that on non-Windows platforms, it needed a
whole new chapter rather than an extra sentence in the existing one. I
think that's massively overkill.

Alternately, I think it would be reasonable to address the concern by
just moving all the stuff about building from source code to an
appendix, and assume people can figure out how to download the
software without us needing to say anything in the documentation at
all. What was weird about the state before that patch, IMHO, was that
we both talked about building from source code and didn't talk about
binary packages. That can be addressed either by adding a mention of
binary packages, or by deemphasizing the idea of installing from
source code.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread Robert Haas
On Thu, Mar 21, 2024 at 7:40 PM Peter Eisentraut  wrote:
> I'm highly against this.  If I want to read about PL/Python, why should
> I have to wade through PL/Perl and PL/Tcl?
>
> I think, abstractly, in a book, PL/Python should be a chapter of its
> own.  Just like GiST should be a chapter of its own.  Because they are
> self-contained topics.

On the other hand, in a book, chapters tend to be of relatively
uniform length. People don't usually write a book with some chapters
that are 100+ pages long, and others that are a single page, or even
just a couple of sentences. I mean, I'm sure it's been done, but it's
not a normal way to write a book.

And I don't believe that if someone were writing a physical book about
PostgreSQL from scratch, they'd ever end up with a top-level chapter
that looks anything like our GiST chapter. All of the index AM
chapters are quite obviously clones of each other, and they're all
quite short. Surely you'd make them sections within a chapter, not
entire chapters.

I do agree that PL/pgsql is more arguable. I can imagine somebody
writing a book about PostgreSQL and choosing to make that topic into a
whole chapter.

However, I also think that people don't make decisions about what
should be a chapter in a vacuum. If you've got 100 people writing a
book together, which is essentially what we actually do have, and each
of those people makes decisions in isolation about what is worthy of
being a chapter, then you end up with exactly the kind of mess that we
now have. Some chapters are long and some are short. Some are
well-written and some are poorly written. Some are updated regularly
and others have hardly been touched in a decade. Books have editors to
straighten out those kinds of inconsistencies so that there's some
uniformity to the product as a whole.

The problem with that, of course, is that it invites bike-shedding. As
you say, every decision that is reflected in our documentation was
made for some reason, and most of them will have been made by
prominent, active committers. So discussions about how to improve
things can easily bog down even when people agree on the overall
goals, simply because few individual changes find consensus. I hope
that doesn't happen here, because I think most people who have
commented so far agree that there is a problem here and that we should
try to fix it. Let's not let the perfect be the enemy of the good.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread Peter Eisentraut

On 22.03.24 13:50, Robert Haas wrote:
> On Thu, Mar 21, 2024 at 7:37 PM Peter Eisentraut  wrote:
>> On 20.03.24 17:43, Robert Haas wrote:
>>> 0001 removes the "Installation from Binaries" chapter. The whole thing
>>> is four sentences. I moved the most important information into the
>>> "Installation from Source Code" chapter and retitled it
>>> "Installation".
>>
>> But this separation was explicitly added a few years ago, because most
>> people just want to read about the binaries.
>
> I really doubt that this is true.

Here is the thread:
https://www.postgresql.org/message-id/flat/CABUevExRCf8waYOsrCO-QxQL50XGapMf5dnWScOXj7X%3DMXW--g%40mail.gmail.com






Re: documentation structure

2024-03-22 Thread Robert Haas
On Thu, Mar 21, 2024 at 7:37 PM Peter Eisentraut  wrote:
> On 20.03.24 17:43, Robert Haas wrote:
> > 0001 removes the "Installation from Binaries" chapter. The whole thing
> > is four sentences. I moved the most important information into the
> > "Installation from Source Code" chapter and retitled it
> > "Installation".
>
> But this separation was explicitly added a few years ago, because most
> people just want to read about the binaries.

I really doubt that this is true. I've been installing software on
UNIX-like operating systems for more than 30 years now, and I don't
think there's been a single time when I have ever consulted the
documentation for a software package to find the download location for
that package. When I first started out, everything was ftp rather than
www, so you went to ftp.whatever.{com,org,net,gov,edu} and tried to
download the distribution bundle, and then you untarred it and ran
configure and make. Then you read the README or the documentation or
whatever afterward. These days, I think what people do is either (a)
use their package manager to install PostgreSQL and then come to the
documentation afterward to find out how to use it or (b) do a search
for "PostgreSQL download" and click on whatever comes up. I'm not
saying there's never been a user who made use of this section of the
documentation to find the download location, but surely the normal
thing to do if you come to www.postgresql.org and you want to download
the software is to click "Download" on the nav bar, not
"Documentation," then a specific version, then chapter 16, then the
exact same download link that's already there on the nav bar.

I do agree that it is very questionable whether "Installation from
Source Code" is of sufficient interest to ordinary users to justify
including it in "III. Server Administration." Most people, probably
including many extension developers, are only going to install the
binary packages. But the solution to that isn't to have a
four-sentence chapter telling me about a download location that I
likely found long before I looked at the documentation, and that I can
certainly find very easily without needing the documentation. Rather,
what we should do if we think that installing from source code is of
marginal interest is move it to an appendix. As I said to Alvaro
yesterday, I think that a "Developer Guide" appendix could be a good
place to house a number of things that currently have top-level
chapters but don't really need them because they're only of interest
to a small minority of users. This might be another thing that could
go there.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-22 Thread Robert Haas
On Thu, Mar 21, 2024 at 6:32 PM David G. Johnston
 wrote:
> Just going to note that the section on the cumulative statistics views being 
> a single page is still a strongly bothersome issue here.  Though the quick 
> fix actually requires upgrading the section to chapter status...

Yeah, I've been bothered by this in the past, too. I'm not very keen
to start promoting things to the top-level, though. I think we need a
more thoughtful fix than that.

One question I have is why all of these views are documented here
rather than in chapter 53, "System Views," because surely they are
system views. I feel like if our documentation index weren't a mile
long and if you could easily find the entry for "System Views," that's
where you would naturally look for these details. I don't think it's
natural for a user to expect that most of the system views are going
to be documented in section VII, chapter 53 but one particular kind is
going to be documented in section III, chapter 27, under a chapter
title that gives no hint that it will document any views.

> Maybe "pl/pgsql and Other Procedural Languages" as the title?

I guess I have a hard time seeing this as an improvement. It would
help someone who knows that plpgsql exists but doesn't know that it
falls into the general category called procedural languages, but I
suspect that's not a very common confusion. I think it's better to
keep the chapter titles short and to the point.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-21 Thread Bruce Momjian
On Fri, Mar 22, 2024 at 01:12:30AM +0100, Daniel Gustafsson wrote:
> > On 22 Mar 2024, at 00:33, Peter Eisentraut  wrote:
> > 
> > On 19.03.24 14:50, Tom Lane wrote:
> >> Daniel Gustafsson  writes:
> >>> It's actually not very odd, the reference section is using  elements
> >>> and we had missed the arabic numerals setting on those.  The attached fixes
> >>> that for me.  That being said, we've had roman numerals for the reference
> >>> section since forever (all the way down to the 7.2 docs online has it) so maybe
> >>> it was intentional?
> >> I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
> >> it's not that way simply because nobody thought about it.
> > 
> > Looks to me it was just that way because it's the default setting of the 
> > stylesheets.
> 
> That's quite possible.  I don't have strong opinions on whether we should
> change, or keep it the way it is.

If we can't justify why it should be different, it should be like the
surrounding sections.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: documentation structure

2024-03-21 Thread Daniel Gustafsson
> On 22 Mar 2024, at 00:33, Peter Eisentraut  wrote:
> 
> On 19.03.24 14:50, Tom Lane wrote:
>> Daniel Gustafsson  writes:
>>> It's actually not very odd, the reference section is using  elements
>>> and we had missed the arabic numerals setting on those.  The attached fixes
>>> that for me.  That being said, we've had roman numerals for the reference
>>> section since forever (all the way down to the 7.2 docs online has it) so maybe
>>> it was intentional?
>> I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
>> it's not that way simply because nobody thought about it.
> 
> Looks to me it was just that way because it's the default setting of the 
> stylesheets.

That's quite possible.  I don't have strong opinions on whether we should
change, or keep it the way it is.

--
Daniel Gustafsson





Re: documentation structure

2024-03-21 Thread Peter Eisentraut

On 21.03.24 15:31, Robert Haas wrote:
> On Thu, Mar 21, 2024 at 9:38 AM Tom Lane  wrote:
>> I'd follow the extend.sgml precedent: have a file corresponding to the
>> chapter and containing any top-level text we need, then that includes
>> a file per sect1.
>
> OK, here's a new patch set. I've revised 0003 and 0004 to use this
> approach, and I've added a new 0005 that does essentially the same
> thing for the PL chapters.

I'm highly against this.  If I want to read about PL/Python, why should
I have to wade through PL/Perl and PL/Tcl?

I think, abstractly, in a book, PL/Python should be a chapter of its
own.  Just like GiST should be a chapter of its own.  Because they are
self-contained topics.






Re: documentation structure

2024-03-21 Thread Peter Eisentraut

On 20.03.24 17:43, Robert Haas wrote:
> 0001 removes the "Installation from Binaries" chapter. The whole thing
> is four sentences. I moved the most important information into the
> "Installation from Source Code" chapter and retitled it
> "Installation".

But this separation was explicitly added a few years ago, because most
people just want to read about the binaries.






Re: documentation structure

2024-03-21 Thread Peter Eisentraut

On 19.03.24 14:50, Tom Lane wrote:
> Daniel Gustafsson  writes:
>> It's actually not very odd, the reference section is using  elements
>> and we had missed the arabic numerals setting on those.  The attached fixes
>> that for me.  That being said, we've had roman numerals for the reference
>> section since forever (all the way down to the 7.2 docs online has it) so maybe
>> it was intentional?
>
> I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
> it's not that way simply because nobody thought about it.

Looks to me it was just that way because it's the default setting of the
stylesheets.






Re: documentation structure

2024-03-21 Thread David G. Johnston
On Thu, Mar 21, 2024 at 11:30 AM Robert Haas  wrote:

>
> My second thought is that the stuff from "VII. Internals" that I
> categorized as reference material should move into section "VI.
> Reference". I think we should also consider moving appendix F,
> "Additional Supplied Modules and Extensions," and appendix G,
> "Additional Supplied Programs" to the reference section.
>
>
For "VI. Reference" I propose the following Chapters:

SQL Commands
PL/pgSQL
Cumulative Statistics Views
System Views
System Catalogs
Client Applications
Server Applications
Modules and Extensions

-- Remove Appendix G (Programs) altogether and just note for the two that
are listed that they are in contrib as opposed to core.

-- The PostgreSQL qualifier doesn't seem helpful and once you add the
additional chapters its unusual presence stands out even more.

-- PL/pgSQL gets its own reference chapter since we wrote it.  Stuff like
Perl and Python have entire books that the user can consult as reference
material for those languages.

David J.


Re: documentation structure

2024-03-21 Thread David G. Johnston
On Wed, Mar 20, 2024 at 9:43 AM Robert Haas  wrote:

> On Tue, Mar 19, 2024 at 5:39 PM Andrew Dunstan 
> wrote:
> > +many for improving the index.
>
> Here's a series of four patches.


I reviewed the most recent set of 5 patches.


> Taken together, they cut down the
> number of numbered chapters from 76 to 68. I think we could easily
> save that much again if I wrote a few more patches along similar
> lines, but I'm posting these first to see what people think.
>
> 0001 removes the "Installation from Binaries" chapter. The whole thing
> is four sentences. I moved the most important information into the
> "Installation from Source Code" chapter and retitled it
> "Installation".
>

Makes sense


> 0002 removes the "Monitoring Disk Usage" chapter by folding it into
> the immediately-preceding "Monitoring Database Activity" chapter. I
> kind of feel like the "Monitoring Disk Usage" chapter might be in need
> of a bigger rewrite or just outright removal, but there's surely not
> enough content here to justify making it a top-level chapter.
>

Just going to note that the section on the cumulative statistics views
being a single page is still a strongly bothersome issue here.  Though the
quick fix actually requires upgrading the section to chapter status...

Maybe we can stub out that section in the "Monitoring Database Activity"
chapter and move that entire section after "System Views" in the Internals
part?

I agree with subordinating Monitoring Disk Usage.


> 0003 merges all of the "Internals" chapters whose names are the names
> of built-in index access methods (Btree, Gin, etc.) into a single
> chapter called "Built-In Index Access Methods". All of these chapters
> have a very similar structure and none of them are very long, so it
> makes a lot of sense, at least in my mind, to consolidate them into
> one.
>

One of the more impactful and wanted improvements, IMO.


> 0004 merges the "Generic WAL Records" and "Custom WAL Resource
> Managers" chapter together, creating a new chapter called "Write Ahead
> Logging for Extensions".
>
>
The positioning of this and the preceding Built-in Index Access Methods
chapter seem like they should be switched.

If this sticks we should add an introductory paragraph for the chapter.

> and I've added a new 0005 that does essentially the same
> thing for the PL chapters.
>

The following page needs to be reworded to take the new structure into
account:

https://www.postgresql.org/docs/current/xfunc-pl.html

Not having pl/pgsql appear on the main ToC seems like a loss but the others
make sense and a special exception for it probably isn't warranted.

Maybe "pl/pgsql and Other Procedural Languages" as the title?

David J.


Re: documentation structure

2024-03-21 Thread Robert Haas
On Thu, Mar 21, 2024 at 12:43 PM Alvaro Herrera  wrote:
> which is a bit odd: why are the two WAL chapters in the middle of the
> chapters 62 and 63 talking about AMs?  Maybe put 66 right after 63
> instead.  Also, is it really better to have 62/63 first and 66
> later?  It sounds to me like 66 is more user-oriented and the other two
> are developer-oriented, so I'm inclined to suggest putting them the
> other way around, but I'm not really sure about this.  (Also, starting
> chapter 66 straight with 66.1 BTree without any intro text looks a bit
> odd; maybe one short introductory paragraph is sufficient?)

I had similar thoughts. I think that we should consider some changes
to the chapter ordering, but I didn't want to try to change too many
things all at once, since somebody only has to hate one thing about
the patch to sink the whole thing.

But since you brought it up, what I've been thinking about is that the
whole division into parts might need to be rethought a bit. I feel
like "VII. Internals" is a mix of about four different kinds of
content. First, the biggest portion of it is information about
developing certain kinds of C extensions -- all the "Writing a
Whatever" chapters, the "Whatever Access Method Interface Definition"
chapters, "Generic WAL Records", "Custom WAL Resource Managers", and
all the index-related chapters. Second, we've got some information
that I think is mainly of interest to people developing PostgreSQL
itself, namely, "PostgreSQL Coding Conventions", "Native Language
Support", and "System Catalog Declarations and Initial Contents". You
*might* care about these if you're developing extensions, or even if
you're not a developer at all, but then again you might not. Third,
we've got some reference material, namely "System Catalogs", "System
Views", and perhaps "Frontend/Backend Protocol". I distinguish these
from the previous two categories because I think you could care about
this stuff as a random user, or a developer of products that
interoperate with PostgreSQL but don't link with it or share any
common code. Finally, there's just a bunch of random bits and bobs
that we've decided to document here for one reason or another, either
because somebody else did a bunch of the work, like "Overview of
PostgreSQL Internals", or because some developer did something and
someone said "hey, that should be documented!", like "Backup Manifest
Format."

So my first thought is to pull out the stuff that's mainly for
PostgreSQL core developers and move it to an appendix. I propose we
create an appendix called "Developer Guide" and that it absorb the
existing appendix I, "The Source Code Repository", possibly all or
part of appendix J, "Documentation", and the chapters from "VII.
Internals" that are mostly of developer interest. I think that
possibly some of what's in "J. Documentation" should actually be moved
into the "Installation" chapter where we talk about building the
source code, because it doesn't make much sense to document the build
tool chain in one part of the documentation and the documentation
toolchain someplace else entirely, but "J.6. Style Guide" is developer
information, not build instructions.

My second thought is that the stuff from "VII. Internals" that I
categorized as reference material should move into section "VI.
Reference". I think we should also consider moving appendix F,
"Additional Supplied Modules and Extensions," and appendix G,
"Additional Supplied Programs" to the reference section. However,
prior to doing that, I think that appendix G needs some cleanup or
possibly we should just find a way to remove it outright. We're
shipping an appendix G with two major subsections, one of which is
completely empty and has been since v14, and the other of which
contains only two things. I think we should just remove the empty
sub-section entirely. I'm not sure what to do about the one with only
2 things in it (vacuumlo and oid2name). Would it be a bad idea to just
merge those bits into the client applications reference section?

My third thought is about what to do with the material in "VII.
Internals" that is about developing specific kinds of extensions, like,
say, "Writing a Foreign Data Wrapper." If you look at "V. Server
Programming", you see that we actually have some very similar sections
there, like chapter 47, "Background Worker Processes" and chapter 50,
"Archive Modules". I think it's not very clear in the current
structure whether topics that are relevant for extension developers
should go under "Server Programming" or under "Internals", and it
looks to me like different people have just done different things and
it's all a bit haphazard. One idea is to decide that the correct
answer is "Server Programming" and move all of the internals chapters
that fall into this category over to there. I don't think this is the
right answer, because that section also contains information about a
bunch of stuff that's strictly SQL-level, like rules and triggers. So

Re: documentation structure

2024-03-21 Thread Alvaro Herrera
On 2024-Mar-21, Robert Haas wrote:

> On Thu, Mar 21, 2024 at 9:38 AM Tom Lane  wrote:
> > I'd follow the extend.sgml precedent: have a file corresponding to the
> > chapter and containing any top-level text we need, then that includes
> > a file per sect1.
> 
> OK, here's a new patch set. I've revised 0003 and 0004 to use this
> approach, 

Great, thanks.  Looking at the index in the PDF after (only) 0003, we
now have this structure

62. Table Access Method Interface Definition
63. Index Access Method Interface Definition
    63.1. Basic API Structure for Indexes
    63.2. Index Access Method Functions
    63.3. Index Scanning
    63.4. Index Locking Considerations
    63.5. Index Uniqueness Checks
    63.6. Index Cost Estimation Functions
64. Generic WAL Records
65. Custom WAL Resource Managers
66. Built-in Index Access Methods

which is a bit odd: why are the two WAL chapters in the middle of the
chapters 62 and 63 talking about AMs?  Maybe put 66 right after 63
instead.  Also, is it really better to have 62/63 first and 66
later?  It sounds to me like 66 is more user-oriented and the other two
are developer-oriented, so I'm inclined to suggest putting them the
other way around, but I'm not really sure about this.  (Also, starting
chapter 66 straight with 66.1 BTree without any intro text looks a bit
odd; maybe one short introductory paragraph is sufficient?)

> and I've added a new 0005 that does essentially the same
> thing for the PL chapters.

I was looking at the PL chapters earlier today too, wondering whether
this would be valuable; but I worry that there are too many
sub-sub-sections there, so it could end up being a bit messy.  I didn't
look at the resulting output though.

> 0001 and 0002 are [un]changed. Should 0002 use the include-an-entity
> approach as well?

Shrug, I wouldn't, doesn't look worth it.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"




Re: documentation structure

2024-03-21 Thread Robert Haas
On Thu, Mar 21, 2024 at 10:31 AM Robert Haas  wrote:
> 0001 and 0002 are changed. Should 0002 use the include-an-entity
> approach as well?

Woops. I meant to say that 0001 and 0002 are *unchanged*.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-21 Thread Robert Haas
On Thu, Mar 21, 2024 at 9:38 AM Tom Lane  wrote:
> I'd follow the extend.sgml precedent: have a file corresponding to the
> chapter and containing any top-level text we need, then that includes
> a file per sect1.

OK, here's a new patch set. I've revised 0003 and 0004 to use this
approach, and I've added a new 0005 that does essentially the same
thing for the PL chapters.

0001 and 0002 are changed. Should 0002 use the include-an-entity
approach as well?

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v2-0004-docs-Consolidate-into-new-WAL-for-Extensions-chap.patch
v2-0002-docs-Demote-Monitoring-Disk-Usage-from-chapter-to.patch
v2-0003-docs-Merge-separate-chapters-on-built-in-index-AM.patch
v2-0001-docs-Remove-the-Installation-from-Binaries-chapte.patch
v2-0005-docs-Merge-all-procedural-language-documentation-.patch

Re: documentation structure

2024-03-21 Thread Tom Lane
Robert Haas  writes:
> Well, I suppose I thought it was a good idea because (1) we don't seem
> to have any existing precedent for file-per-sect1 rather than
> file-per-chapter and (2) all of the per-AM files combined are less
> than 20% of the size of func.sgml.

We have done (1) in places, e.g. json.sgml, array.sgml,
rangetypes.sgml, rowtypes.sgml, and the bulk of extend.sgml is split
out into xaggr, xfunc, xindex, xoper, xtypes.  I'd be the first to
concede it's a bit haphazard, but it's not like there's no precedent.

As for (2), func.sgml likely should have been split years ago.

> But, OK, if you want to establish a new paradigm here, sure. I see two
> ways to do it. We can either put the  tag directly in
> postgres.sgml, or I can still create a new indextypes.sgml and put
>  etc. inside of it. Which way do you prefer?

I'd follow the extend.sgml precedent: have a file corresponding to the
chapter and containing any top-level text we need, then that includes
a file per sect1.

regards, tom lane
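
For readers unfamiliar with the layout being proposed here, the following is
a rough sketch (not taken from any of the posted patches) of what "a file
corresponding to the chapter ... that includes a file per sect1" could look
like. It is written as a small Python generator so the output can be
inspected. The chapter id "indextypes" echoes the indextypes.sgml file name
floated earlier in the thread; the entity names, the per-AM file names, and
the intro text are assumptions made purely for illustration.

```python
def entity_declarations(sect1_entities):
    """Return one <!ENTITY ...> declaration per sect1 file.

    Assumes (hypothetically) that each entity "name" lives in "name.sgml".
    """
    return "".join(
        f'<!ENTITY {name} SYSTEM "{name}.sgml">\n' for name in sect1_entities
    )


def chapter_wrapper(chapter_id, title, intro, sect1_entities):
    """Return the text of a chapter file holding only the top-level
    material, followed by one entity reference per sect1 file."""
    lines = [
        f'<chapter id="{chapter_id}">',
        f' <title>{title}</title>',
        '',
        ' <para>',
        f'  {intro}',
        ' </para>',
        '',
    ]
    # Each sect1 stays in its own file and is pulled in by entity reference.
    lines += [f' &{name};' for name in sect1_entities]
    lines.append('</chapter>')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical entity names for a few per-AM files.
    ams = ["btree", "gist", "gin"]
    print(entity_declarations(ams))
    print(chapter_wrapper(
        "indextypes",
        "Built-In Index Access Methods",
        "Short introductory paragraph for the chapter.",
        ams,
    ))
```

With this arrangement each access method keeps its own file, and reordering
or regrouping sections only touches the wrapper file and the entity list.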




Re: documentation structure

2024-03-21 Thread Robert Haas
On Wed, Mar 20, 2024 at 5:25 PM Tom Lane  wrote:
> I'd say that a separate file per AM is a good thing regardless.
> Elsewhere in this same thread are grumblings about how big func.sgml
> is; why would you think it good to start down that same path for the
> AM documentation?

Well, I suppose I thought it was a good idea because (1) we don't seem
to have any existing precedent for file-per-sect1 rather than
file-per-chapter and (2) all of the per-AM files combined are less
than 20% of the size of func.sgml.

But, OK, if you want to establish a new paradigm here, sure. I see two
ways to do it. We can either put the  tag directly in
postgres.sgml, or I can still create a new indextypes.sgml and put
 etc. inside of it. Which way do you prefer?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-20 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 20, 2024 at 5:05 PM Alvaro Herrera  
> wrote:
>> I think you can achieve this with a much smaller patch that just changes
>> the outer tag in each file so that each file is a , then create a
>> single file that includes all of these plus an additional outer tag for
>> the  (or maybe just add the  in postgres.sgml).  This
>> has the advantage that each AM continues to be a separate single file,
>> and you still have your desired structure.

> Right, that could also be done, and not just for 0003. I just wasn't
> sure that was the right approach. It would mean that the division of
> the SGML into files continues to reflect the original chapter
> divisions rather than the current ones forever. In the short run
> that's less churn, less back-patching pain, etc.; but in the long term
> it means you've got relics of a structure that doesn't exist any more
> sticking around forever.

I'd say that a separate file per AM is a good thing regardless.
Elsewhere in this same thread are grumblings about how big func.sgml
is; why would you think it good to start down that same path for the
AM documentation?

regards, tom lane




Re: documentation structure

2024-03-20 Thread Robert Haas
On Wed, Mar 20, 2024 at 5:05 PM Alvaro Herrera  wrote:
> I think you can achieve this with a much smaller patch that just changes
> the outer tag in each file so that each file is a , then create a
> single file that includes all of these plus an additional outer tag for
> the  (or maybe just add the  in postgres.sgml).  This
> has the advantage that each AM continues to be a separate single file,
> and you still have your desired structure.

Right, that could also be done, and not just for 0003. I just wasn't
sure that was the right approach. It would mean that the division of
the SGML into files continues to reflect the original chapter
divisions rather than the current ones forever. In the short run
that's less churn, less back-patching pain, etc.; but in the long term
it means you've got relics of a structure that doesn't exist any more
sticking around forever.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-20 Thread Alvaro Herrera
On 2024-Mar-20, Robert Haas wrote:

> 0003 merges all of the "Internals" chapters whose names are the names
> of built-in index access methods (Btree, Gin, etc.) into a single
> chapter called "Built-In Index Access Methods". All of these chapters
> have a very similar structure and none of them are very long, so it
> makes a lot of sense, at least in my mind, to consolidate them into
> one.

I think you can achieve this with a much smaller patch that just changes
the outer tag in each file so that each file is a , then create a
single file that includes all of these plus an additional outer tag for
the  (or maybe just add the  in postgres.sgml).  This
has the advantage that each AM continues to be a separate single file,
and you still have your desired structure.

-- 
Álvaro Herrera   Breisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: documentation structure

2024-03-20 Thread Robert Haas
On Wed, Mar 20, 2024 at 1:35 PM Bruce Momjian  wrote:
> On Wed, Mar 20, 2024 at 12:43:08PM -0400, Robert Haas wrote:
> > Overall, I think this achieves a minor but pleasant level of
> > de-cluttering of the index. It's going to take a lot more than one
> > morning's work to produce a major improvement, but at least this is
> > something.
>
> I think this kind of doc structure review is long overdue.

Thanks, Bruce!

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-20 Thread Bruce Momjian
On Wed, Mar 20, 2024 at 12:43:08PM -0400, Robert Haas wrote:
> Overall, I think this achieves a minor but pleasant level of
> de-cluttering of the index. It's going to take a lot more than one
> morning's work to produce a major improvement, but at least this is
> something.

I think this kind of doc structure review is long overdue.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: documentation structure

2024-03-20 Thread Robert Haas
On Mon, Mar 18, 2024 at 5:40 PM Laurenz Albe  wrote:
> I also disagree that chapters 4 to 6 are a continuation of the tutorial.
> Or at least, they shouldn't be.
> When I am looking for a documentation reference on something like
> security considerations of SECURITY DEFINER functions, my first
> impulse is to look in chapter 5 (Data Definition) or in chapter 38
> (Extending SQL), and I am surprised to find it discussed in the
> SQL reference of CREATE FUNCTION.

I looked at this a bit more closely. There's actually a lot of
detailed technical information in chapters 4 and 5, but chapter 6 is
extremely short and mostly recapitulates chapter 2.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-19 Thread Andrew Dunstan
On Mon, Mar 18, 2024 at 10:12 AM Robert Haas  wrote:

> I was looking at the documentation index this morning[1], and I can't
> help feeling like there are some parts of it that are over-emphasized
> and some parts that are under-emphasized. I'm not sure what we can do
> about this exactly, but I thought it worth writing an email and seeing
> what other people think.
>
> The two sections of the documentation that seem really
> under-emphasized to me are the GUC documentation and the SQL
> reference. The GUC documentation is all buried under "20. Server
> Configuration" and the SQL command reference is under "I. SQL
> commands". For reasons that I don't understand, all chapters except
> for those in "VI. Reference" are numbered, but the chapters in that
> section have Roman numerals instead.
>
> I don't know what other people's experience is, but for me, wanting to
> know what a command does or what a setting does is extremely common.
> Therefore, I think these chapters are disproportionately important and
> should be emphasized more. In the case of the GUC reference, one idea
> I have is to split up "III. Server Administration". My proposal is
> that we divide it into three sections. The first would be called "III.
> Server Installation" and would cover chapters 16 (installation from
> binaries) through 19 (server setup and operation). The second would be
> called "IV. Server Configuration" -- so every section that's currently
> a subsection of "server configuration" would become a top-level
> chapter. The third division would be "V. Server Administration," and
> would cover the current chapters 21-33. This is probably far from
> perfect, but it seems like a relatively simple change and better than
> what we have now.
>
> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.
>
> The stuff that I think is over-emphasized is as follows: (a) chapters
> 1-3, the tutorial; (b) chapters 4-6, which are essentially a
> continuation of the tutorial, and not at all similar to chapters 8-11
> which are chock-full of detailed technical information; (c) chapters
> 43-46, one per procedural language; perhaps these could just be
> demoted to sub-sections of chapter 42 on procedural languages; (d)
> chapters 47 (server programming interface), 50 (replication progress
> tracking), and 51 (archive modules), all of which are important to
> document but none of which seem important enough to put them in the
> top-level documentation index; and (e) large parts of section "VII.
> Internals," which again contain tons of stuff of very marginal
> interest. The first ~4 chapters of the internals section seem like
> they might be mainstream enough to justify the level of prominence
> that we give them, but the rest has got to be of interest to a tiny
> minority of readers.
>
> I think it might be possible to consolidate the internals section by
> grouping a bunch of existing entries together by category. Basically,
> after the first few chapters, you've got stuff that is of interest to
> C programmers writing core or extension code; and you've got
> explainers on things like GEQO and index op-classes and support
> functions which might be of interest even to non-programmers. I think
> for example that we don't need separate top-level chapters on writing
> procedural language handlers, FDWs, tablesample methods, custom scan
> providers, table access methods, index access methods, and WAL
> resource managers. Some or all of those could be grouped under a
> single chapter, perhaps, e.g. Using PostgreSQL Extensibility
> Interfaces.
>
> Thoughts? I realize that this topic is HIGHLY prone to ENDLESS
> bikeshedding, and it's inevitable that not everybody is going to
> agree. But I hope we can agree that it's completely silly that it's
> vastly easier to find the documentation about the backup manifest
> format than it is to find the documentation on CREATE TABLE or
> shared_buffers, and if we can agree on that, then perhaps we can agree
> on some way to make things better.
>
>
>
+many for improving the index.

My own pet docs peeve is a purely editorial one: func.sgml is a 30k line
beast, and I think there's a good case for splitting out at least the
larger chunks of it.

cheers

andrew


Re: documentation structure

2024-03-19 Thread Tom Lane
Daniel Gustafsson  writes:
> It's actually not very odd, the reference section is using  elements
> and we had missed the arabic numerals setting on those.  The attached fixes
> that for me.  That being said, we've had roman numerals for the reference
> section since forever (all the way down to the 7.2 docs online has it) so maybe
> it was intentional?

I'm quite sure it *was* intentional.  Maybe it was a bad idea, but
it's not that way simply because nobody thought about it.

regards, tom lane




Re: documentation structure

2024-03-19 Thread Daniel Gustafsson
> On 18 Mar 2024, at 22:40, Laurenz Albe  wrote:
> On Mon, 2024-03-18 at 10:11 -0400, Robert Haas wrote:

>> For reasons that I don't understand, all chapters except
>> for those in "VI. Reference" are numbered, but the chapters in that
>> section have Roman numerals instead.
> 
> That last fact is very odd indeed and could be easily fixed.

It's actually not very odd, the reference section is using  elements
and we had missed the arabic numerals setting on those.  The attached fixes
that for me.  That being said, we've had roman numerals for the reference
section since forever (all the way down to the 7.2 docs online has it) so maybe
it was intentional?  Or no one managed to see it until Robert did, I've
certainly never noticed it until now.
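
For readers without the attachment, a numbering setting of this kind
might look like the sketch below. It is not the contents of
reference-autolabel.diff; it assumes the HTML build uses the DocBook XSL
stylesheets, whose reference.autolabel parameter defaults to upper-case
Roman numerals (I) and accepts 1 for arabic numerals. Where such a line
would live in the PostgreSQL stylesheets is likewise an assumption.

```xml
<!-- Hypothetical stylesheet customization, not the attached patch. -->
<!-- reference.autolabel is a DocBook XSL parameter; its default is "I". -->
<xsl:param name="reference.autolabel">1</xsl:param>
```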

--
Daniel Gustafsson



reference-autolabel.diff
Description: Binary data


Re: documentation structure

2024-03-18 Thread Robert Haas
On Mon, Mar 18, 2024 at 6:51 PM Tom Lane  wrote:
> This might be a silly suggestion, but: could we just render the
> "most important" chapter titles in a larger font?

It's not the silliest suggestion ever -- you could have proposed
! -- but I also suspect it's not the right answer. Of course,
varying the font size can be a great way of emphasizing some things
more than others, but it doesn't usually work out well to just take a
document that was designed to be displayed in a uniform font size and
enlarge bits of text here and there. You usually want to have some
kind of overall plan of which font size is a single component.

For example, on a corporate home page, it's quite common to have two
nav bars, the larger of which has entries that correspond to the
company's product offerings and/or marketing materials, and the
smaller of which has "utility functions" like "login", "contact us",
and "search". Font size can be an effective tool for emphasizing the
relative importance of one nav bar versus the other, but you don't
start by deciding which things are going to get displayed in a larger
font. You start with an overall idea of the layout and then the font
size flows out of that.

Just riffing a bit, you could imagine adding a nav bar to our
documentation, either across the top or along the side, that is always
there on every page of the documentation and contains those links that
we want to make sure are always visible. Necessarily, these must be
limited in number. Then on the home page you could have the whole
table of contents as we do today, and you use that to navigate to
everything that isn't one of the quick links.

Or you can imagine that the home page of our documentation isn't just
a tree view like it is today; it might instead be written in paragraph
form. "Welcome to the PostgreSQL documentation! If you're new here,
check out our tutorial! Otherwise, you might be
interested in our SQL reference, our configuration
reference, or our banana plantation. If none of
those sound like what you want, check out the documentation
index." Obviously in order to actually work, something like
this would need to be expanded into enough paragraphs to actually
cover all of the important sections of the documentation, and probably
not mention banana plantations. Or maybe it wouldn't be just
paragraphs, but a two-column table, with each row of the table having
a main title and link in the narrower lefthand column and a blurb with
more links in the wider righthand column.

I'm sure there are a lot of other ways to do this, too. Our main
documentation page is very old-school, and there are probably a bunch
of ways to do better.

But I'm not sure how easy it would be to get agreement on something
specific, and I don't know how well our toolchain can support anything
other than what we've already got. I've also learned from painful
experience that you can't fix bad content with good markup. I think it
is worth spending some effort on trying to beat the existing format
into submission, promoting things that seem to deserve it and demoting
those that seem to deserve that. At some point, we'll probably reach a
point of diminishing returns, either because we all agree we've done
as well as we can, or because we can't agree on what else to do, and
maybe at that point the only way to improve further is with better web
design and/or a different documentation toolchain. But I think it's
fairly clear that we're not at that point now.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-18 Thread Tom Lane
Laurenz Albe  writes:
> On Mon, 2024-03-18 at 10:11 -0400, Robert Haas wrote:
>> I don't know what to do about "I. SQL commands". It's obviously
>> impractical to promote that to a top-level section, because it's got a
>> zillion sub-pages which I don't think we want in the top-level
>> documentation index. But having it as one of several unnumbered
>> chapters interposed between 51 and 52 doesn't seem great either.

> I think that both the GUCs and the SQL reference could be top-level
> sections.  For the GUCs there is an obvious split in sub-chapters,
> and the SQL reference could be a top-level section without any chapters
> under it.

I'd be in favor of promoting all three of the "Reference" things to
the top level, except that as Robert says, it seems likely that that
would end in having a hundred individual command reference pages
visible in the topmost table of contents.  Also, if we manage to
suppress that, did we really make it any more prominent?  Not sure.

Making "SQL commands" top-level with half a dozen subsections would
solve the visibility problem, but I'm not real eager to go there,
because I foresee endless arguments about which subsection a given
command goes in.  Robert's point about wanting a single alphabetized
list is valid too (although you could imagine that being a list in an
introductory section, similar to what we have for system catalogs).

This might be a silly suggestion, but: could we just render the
"most important" chapter titles in a larger font?

regards, tom lane




Re: documentation structure

2024-03-18 Thread Laurenz Albe
On Mon, 2024-03-18 at 10:11 -0400, Robert Haas wrote:
> The two sections of the documentation that seem really
> under-emphasized to me are the GUC documentation and the SQL
> reference. The GUC documentation is all buried under "20. Server
> Configuration" and the SQL command reference is under "I. SQL
> commands". For reasons that I don't understand, all chapters except
> for those in "VI. Reference" are numbered, but the chapters in that
> section have Roman numerals instead.

That last fact is very odd indeed and could be easily fixed.

> I don't know what other people's experience is, but for me, wanting to
> know what a command does or what a setting does is extremely common.
> Therefore, I think these chapters are disproportionately important and
> should be emphasized more. In the case of the GUC reference, one idea
> I have is to split up "III. Server Administration". My proposal is
> that we divide it into three sections. The first would be called "III.
> Server Installation" and would cover chapters 16 (installation from
> binaries) through 19 (server setup and operation). The second would be
> called "IV. Server Configuration" -- so every section that's currently
> a subsection of "server configuration" would become a top-level
> chapter. The third division would be "V. Server Administration," and
> would cover the current chapters 21-33. This is probably far from
> perfect, but it seems like a relatively simple change and better than
> what we have now.

I'm fine with splitting up "Server Administration" into three sections
like you propose.

> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.

I think that both the GUCs and the SQL reference could be top-level
sections.  For the GUCs there is an obvious split in sub-chapters,
and the SQL reference could be a top-level section without any chapters
under it.

> The stuff that I think is over-emphasized is as follows: (a) chapters
> 1-3, the tutorial; (b) chapters 4-6, which are essentially a
> continuation of the tutorial, and not at all similar to chapters 8-11
> which are chock-full of detailed technical information; (c) chapters
> 43-46, one per procedural language; perhaps these could just be
> demoted to sub-sections of chapter 42 on procedural languages; (d)
> chapters 47 (server programming interface), 50 (replication progress
> tracking), and 51 (archive modules), all of which are important to
> document but none of which seem important enough to put them in the
> top-level documentation index; and (e) large parts of section "VII.
> Internals," which again contain tons of stuff of very marginal
> interest. The first ~4 chapters of the internals section seem like
> they might be mainstream enough to justify the level of prominence
> that we give them, but the rest has got to be of interest to a tiny
> minority of readers.

I disagree that the tutorial is over-emphasized.

I also disagree that chapters 4 to 6 are a continuation of the tutorial.
Or at least, they shouldn't be.
When I am looking for a documentation reference on something like
security considerations of SECURITY DEFINER functions, my first
impulse is to look in chapter 5 (Data Definition) or in chapter 38
(Extending SQL), and I am surprised to find it discussed in the
SQL reference of CREATE FUNCTION.

Another case in point is the "Notes" section for CREATE VIEW.  Why is
that not somewhere under "Data Definition"?

For me, the reference should be terse and focused on the syntax.

Changing that is probably a lost cause by now, but I feel that we need
not encourage that development any more by playing down the earlier
chapters.

> I think it might be possible to consolidate the internals section by
> grouping a bunch of existing entries together by category. Basically,
> after the first few chapters, you've got stuff that is of interest to
> C programmers writing core or extension code; and you've got
> explainers on things like GEQO and index op-classes and support
> functions which might be of interest even to non-programmers. I think
> for example that we don't need separate top-level chapters on writing
> procedural language handlers, FDWs, tablesample methods, custom scan
> providers, table access methods, index access methods, and WAL
> resource managers. Some or all of those could be grouped under a
> single chapter, perhaps, e.g. Using PostgreSQL Extensibility
> Interfaces.

I have no strong feelings about that.

Yours,
Laurenz Albe




Re: documentation structure

2024-03-18 Thread Roberto Mello
On Mon, Mar 18, 2024 at 10:12 AM Robert Haas  wrote:

> I was looking at the documentation index this morning[1], and I can't
> help feeling like there are some parts of it that are over-emphasized
> and some parts that are under-emphasized. I'm not sure what we can do
> about this exactly, but I thought it worth writing an email and seeing
> what other people think.
>

I agree, and my usage patterns of the docs are similar.

As the project progresses and more features are added and tacked on to
existing docs, things can get murky or buried. I imagine that web access
and search logs could paint a picture of documentation usage.

> I don't know what other people's experience is, but for me, wanting to
> know what a command does or what a setting does is extremely common.
> Therefore, I think these chapters are disproportionately important and
> should be emphasized more. In the case of the GUC reference, one idea
>

+1

> I have is to split up "III. Server Administration". My proposal is
> that we divide it into three sections. The first would be called "III.
> Server Installation" and would cover chapters 16 (installation from
> binaries) through 19 (server setup and operation). The second would be
> called "IV. Server Configuration" -- so every section that's currently
> a subsection of "server configuration" would become a top-level
> chapter. The third division would be "V. Server Administration," and
> would cover the current chapters 21-33. This is probably far from


I like all of those.


> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.
>

I think it'd be easier to read if current "VI. Reference" came right after
"Server Administration", ahead of "Client Interfaces" and "Server
Programming", which are of interest to a much smaller subset of users.

It would also help if the subchapters were numbered like the rest of them.
I don't think the roman numerals are particularly helpful.

> The stuff that I think is over-emphasized is as follows: (a) chapters
> 1-3, the tutorial; (b) chapters 4-6, which are essentially a

...

Also +1

> Thoughts? I realize that this topic is HIGHLY prone to ENDLESS
> bikeshedding, and it's inevitable that not everybody is going to
> agree. But I hope we can agree that it's completely silly that it's
> vastly easier to find the documentation about the backup manifest
> format than it is to find the documentation on CREATE TABLE or
> shared_buffers, and if we can agree on that, then perhaps we can agree
> on some way to make things better.
>

Impossible to please everyone, but I'm sure we can improve things.

I've contributed to different parts of the docs over the years, and would
be happy to help with this work.

Roberto


Re: documentation structure

2024-03-18 Thread Robert Haas
On Mon, Mar 18, 2024 at 10:55 AM Matthias van de Meent
 wrote:
> > I don't know what to do about "I. SQL commands". It's obviously
> > impractical to promote that to a top-level section, because it's got a
> > zillion sub-pages which I don't think we want in the top-level
> > documentation index. But having it as one of several unnumbered
> > chapters interposed between 51 and 52 doesn't seem great either.
>
> Could "SQL Commands" be a top-level construct, with subsections for
> SQL/DML, SQL/DDL, SQL/Transaction management, and PG's
> extensions/administrative/misc features? I sometimes find myself
> trying to mentally organize what SQL commands users can use vs those
> accessible to database owners and administrators, which is not
> currently organized as such in the SQL Commands section.

Yeah, I wondered about that, too. Or for example you could group all
CREATE commands together, all ALTER commands together, all DROP
commands together, etc. But I can't really see a future in such
schemes, because having a single page that links to the reference
documentation for every single command we have in alphabetical order
is incredibly handy, or at least I have found it so. So my feeling -
at least at present - is that it's more fruitful to look into cutting
down the amount of clutter that appears in the top-level documentation
index, and maybe finding ways to make important sections like the SQL
reference more prominent.

Given how much documentation we have, it's just not going to be
possible to make everything that matters conveniently visible at the
top level. I think if people have to click down a level for the SQL
reference, that's fine, as long as the link they need to click on is
reasonably visible. What annoys me about the present structure is that
it isn't. You don't get any visual clue that the "SQL Commands" page
with ~100 subpages is more important than "51. Archive Modules" or
"33. Regression Tests" or "58. Writing a Procedural Language Handler,"
but it totally is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: documentation structure

2024-03-18 Thread Matthias van de Meent
On Mon, 18 Mar 2024 at 15:12, Robert Haas  wrote:

I'm not going into detail about the other docs comments, I don't have
much of an opinion either way on the mentioned sections. You make good
arguments; yet I don't usually use those sections of the docs but
rather do code searches.

> I don't know what to do about "I. SQL commands". It's obviously
> impractical to promote that to a top-level section, because it's got a
> zillion sub-pages which I don't think we want in the top-level
> documentation index. But having it as one of several unnumbered
> chapters interposed between 51 and 52 doesn't seem great either.

Could "SQL Commands" be a top-level construct, with subsections for
SQL/DML, SQL/DDL, SQL/Transaction management, and PG's
extensions/administrative/misc features? I sometimes find myself
trying to mentally organize what SQL commands users can use vs those
accessible to database owners and administrators, which is not
currently organized as such in the SQL Commands section.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




documentation structure

2024-03-18 Thread Robert Haas
I was looking at the documentation index this morning[1], and I can't
help feeling like there are some parts of it that are over-emphasized
and some parts that are under-emphasized. I'm not sure what we can do
about this exactly, but I thought it worth writing an email and seeing
what other people think.

The two sections of the documentation that seem really
under-emphasized to me are the GUC documentation and the SQL
reference. The GUC documentation is all buried under "20. Server
Configuration" and the SQL command reference is under "I. SQL
commands". For reasons that I don't understand, all chapters except
for those in "VI. Reference" are numbered, but the chapters in that
section have Roman numerals instead.

I don't know what other people's experience is, but for me, wanting to
know what a command does or what a setting does is extremely common.
Therefore, I think these chapters are disproportionately important and
should be emphasized more. In the case of the GUC reference, one idea
I have is to split up "III. Server Administration". My proposal is
that we divide it into three sections. The first would be called "III.
Server Installation" and would cover chapters 16 (installation from
binaries) through 19 (server setup and operation). The second would be
called "IV. Server Configuration" -- so every section that's currently
a subsection of "server configuration" would become a top-level
chapter. The third division would be "V. Server Administration," and
would cover the current chapters 21-33. This is probably far from
perfect, but it seems like a relatively simple change and better than
what we have now.

I don't know what to do about "I. SQL commands". It's obviously
impractical to promote that to a top-level section, because it's got a
zillion sub-pages which I don't think we want in the top-level
documentation index. But having it as one of several unnumbered
chapters interposed between 51 and 52 doesn't seem great either.

The stuff that I think is over-emphasized is as follows: (a) chapters
1-3, the tutorial; (b) chapters 4-6, which are essentially a
continuation of the tutorial, and not at all similar to chapters 8-11
which are chock-full of detailed technical information; (c) chapters
43-46, one per procedural language; perhaps these could just be
demoted to sub-sections of chapter 42 on procedural languages; (d)
chapters 47 (server programming interface), 50 (replication progress
tracking), and 51 (archive modules), all of which are important to
document but none of which seem important enough to put them in the
top-level documentation index; and (e) large parts of section "VII.
Internals," which again contain tons of stuff of very marginal
interest. The first ~4 chapters of the internals section seem like
they might be mainstream enough to justify the level of prominence
that we give them, but the rest has got to be of interest to a tiny
minority of readers.

I think it might be possible to consolidate the internals section by
grouping a bunch of existing entries together by category. Basically,
after the first few chapters, you've got stuff that is of interest to
C programmers writing core or extension code; and you've got
explainers on things like GEQO and index op-classes and support
functions which might be of interest even to non-programmers. I think
for example that we don't need separate top-level chapters on writing
procedural language handlers, FDWs, tablesample methods, custom scan
providers, table access methods, index access methods, and WAL
resource managers. Some or all of those could be grouped under a
single chapter, perhaps, e.g. Using PostgreSQL Extensibility
Interfaces.

Thoughts? I realize that this topic is HIGHLY prone to ENDLESS
bikeshedding, and it's inevitable that not everybody is going to
agree. But I hope we can agree that it's completely silly that it's
vastly easier to find the documentation about the backup manifest
format than it is to find the documentation on CREATE TABLE or
shared_buffers, and if we can agree on that, then perhaps we can agree
on some way to make things better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/docs/16/index.html