SP-GiST confusing introductory paragraph

2023-10-16 Thread PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/16/spgist-intro.html
Description:

I'm confused by this paragraph:

> These popular data structures were originally developed for in-memory
usage. In main memory, they are usually designed as a set of dynamically
allocated nodes linked by pointers. This is not suitable for direct storing
on disk, since these chains of pointers can be rather long which would
require too many disk accesses. In contrast, disk-based data structures
should have a high fanout to minimize I/O. The challenge addressed by
SP-GiST is to map search tree nodes to disk pages in such a way that a
search need access only a few disk pages, even if it traverses many nodes.

In particular, "These popular datastructures" is ambiguous -- based on how
the previous paragraph ends, it sounds like the "popular datastructures" or
SP-GiSTs, but then it goes on to say they were designed for in-memory, and
then it mentions that SP-GiST's space partitioning (with high fanout) is
more appropriate to minimize disk access. I think maybe the solution here
would be to replace "These popular data structures" with something like
"Classic Postgres indexes such as B-tree indexes..." or something like
that.

Also, I think a short parenthetical definition of "fanout" would be useful
here, something like "high fanout (i.e. where each node has a potentially
large number of child nodes that reside in the same disk page)". Took me
some googling to realize what fanout meant in this context. 

Best,
Alex


Re: SP-GiST confusing introductory paragraph

2023-10-16 Thread Alvaro Herrera
On 2023-Oct-16, PG Doc comments form wrote:

> I'm confused by this paragraph:
> 
> > These popular data structures were originally developed for in-memory
> > usage. In main memory, they are usually designed as a set of dynamically
> > allocated nodes linked by pointers. This is not suitable for direct storing
> > on disk, since these chains of pointers can be rather long which would
> > require too many disk accesses. In contrast, disk-based data structures
> > should have a high fanout to minimize I/O. The challenge addressed by
> > SP-GiST is to map search tree nodes to disk pages in such a way that a
> > search need access only a few disk pages, even if it traverses many nodes.
> 
> In particular, "These popular datastructures" is ambiguous -- based on how
> the previous paragraph ends, it sounds like the "popular datastructures" or
> SP-GiSTs, but then it goes on to say they were designed for in-memory, and
> then it mentions that SP-GiST's space partitioning (with high fanout) is
> more appropriate to minimize disk access. I think maybe the solution here
> would be to replace "These popular data structures" with something like
> "Classic Postgres indexes such as B-tree indexes..." or something like
> that.

Yeah, this paragraph is a rewording of Oleg's[1]

   SP-GiST is an abbreviation of space-partitioned GiST - the search
   tree, which allows to implement a wide range of different
   non-balanced disk-based data structures, such as quadtree, kd-tree,
   tries - popular data structures, originally developed for memory
   storage. Main memory access structures usually designed as a set of
   dynamically allocated nodes linked by pointers, which is not suitable
   for direct storing on disk, since these chains of pointers can be
   rather long and require too many disk accesses. In opposite, disk
   based data structures have a high fanout to minimize I/O. The
   challenge is to map nodes of tree to disk pages in such a way, so
   search algorithm accesses only a few disk pages, even if it traverse
   many nodes.

where the point is (IMO) much clearer.

[1] http://www.sai.msu.su/~megera/wiki/spgist_dev

> Also, I think a short parenthetical definition of "fanout" would be useful
> here, something like "high fanout (i.e. where each node has a potentially
> large number of child nodes that reside in the same disk page)". Took me
> some googling to realize what fanout meant in this context. 

Hmm.  We also use the term (hypenated as fan-out) in the reference to
recursive_worktable_factor.  Maybe we should list it in the glossary in
a way that works for both.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)




"20.16. Customized Options" – cannot be set by `ALTER SYSTEM`

2023-10-16 Thread PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/16/runtime-config-custom.html
Description:

As far as I can tell, the following statement:

> PostgreSQL will accept a setting for any two-part parameter name

does not hold when creating a *new* setting with `ALTER SYSTEM`, e.g.

  ALTER SYSTEM SET foo.bar TO 'baz';

will elicit an error.

However, if `foo.bar` is defined in `postgresql.conf` or
`postgresql.auto.conf` – put there by hand – then it can be altered, i.e.
the `ALTER SYSTEM` command above will succeed.

I don't know if this is something that should be mentioned in the
documentation or if it's an inconsistency in the implementation.


Re: "20.16. Customized Options" – cannot be set by `ALTER SYSTEM`

2023-10-16 Thread Tom Lane
PG Doc comments form  writes:
> As far as I can tell, the following statement:

>> PostgreSQL will accept a setting for any two-part parameter name

> does not hold when creating a *new* setting with `ALTER SYSTEM`, e.g.

>   ALTER SYSTEM SET foo.bar TO 'baz';

> will elicit an error.

ALTER SYSTEM requires the variable to be known, so that (a) it can
figure out whether you have permissions to set it at system level,
and (b) it can check the validity of the value.  It does not seem
like a good idea to allow unchecked values to be pushed into the
config file, because a mistake would prevent future server restarts
from succeeding.

If you just do "SET foo.bar = whatever", the action is transiently
allowed because nothing very interesting will happen until/unless some
extension loads a definition of the variable into your session, and we
can figure out at that point whether your setting should be accepted.
It would be too much of a mess to make that work for ALTER SYSTEM
though, not least because the config files don't record who set the
variable.

I do see an issue here:

regression=# ALTER SYSTEM SET foo.bar TO 'baz';
ERROR:  unrecognized configuration parameter "foo.bar"
regression=# SET foo.bar TO 'baz';
SET
regression=# ALTER SYSTEM SET foo.bar TO 'baz';
ALTER SYSTEM

and now we have

$ cat $PGDATA/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
foo.bar = 'baz'

So that feels like a bug: we should not allow ALTER SYSTEM to execute
against a placeholder GUC definition, because the placeholder can't
tell us whether the value is valid.  I wonder though if forbidding
this would break any legitimate usage patterns.

regards, tom lane




Re: "20.16. Customized Options" – cannot be set by `ALTER SYSTEM`

2023-10-16 Thread Laurenz Albe
On Mon, 2023-10-16 at 12:29 -0400, Tom Lane wrote:
> I do see an issue here:
> 
> regression=# ALTER SYSTEM SET foo.bar TO 'baz';
> ERROR:  unrecognized configuration parameter "foo.bar"
> regression=# SET foo.bar TO 'baz';
> SET
> regression=# ALTER SYSTEM SET foo.bar TO 'baz';
> ALTER SYSTEM
> 
> and now we have
> 
> $ cat $PGDATA/postgresql.auto.conf
> # Do not edit this file manually!
> # It will be overwritten by the ALTER SYSTEM command.
> foo.bar = 'baz'
> 
> So that feels like a bug: we should not allow ALTER SYSTEM to execute
> against a placeholder GUC definition, because the placeholder can't
> tell us whether the value is valid.  I wonder though if forbidding
> this would break any legitimate usage patterns.

I feel the same.  However, the lack of any "variables" in SQL (as proposed
in [1]) leads a lot of people to abuse placeholder parameters as variables
to hold application state.  I am sure that that is where this complaint
comes from.  We maintain that doing so is not a valid use case, but that claim
sounds increasingly like a grammarian declaring that sentences should not
end with a preposition, when everybody does it all the time.

Yours,
Laurenz Albe

 [1]: 
https://postgr.es/m/CAFj8pRDY%2Bm9OOxfO10R7J0PAkCCauM-TweaTrdsrsLGMb1VbEQ%40mail.gmail.com




Re: "20.16. Customized Options" – cannot be set by `ALTER SYSTEM`

2023-10-16 Thread Tom Lane
Laurenz Albe  writes:
> On Mon, 2023-10-16 at 12:29 -0400, Tom Lane wrote:
>> So that feels like a bug: we should not allow ALTER SYSTEM to execute
>> against a placeholder GUC definition, because the placeholder can't
>> tell us whether the value is valid.  I wonder though if forbidding
>> this would break any legitimate usage patterns.

> I feel the same.  However, the lack of any "variables" in SQL (as proposed
> in [1]) leads a lot of people to abuse placeholder parameters as variables
> to hold application state.  I am sure that that is where this complaint
> comes from.  We maintain that doing so is not a valid use case, but that claim
> sounds increasingly like a grammarian declaring that sentences should not
> end with a preposition, when everybody does it all the time.

Yeah, and we have been slowly removing the issues that made us not want
to recommend using them like that.

Anyway, I realized that I was wrong to claim that we need ALTER SYSTEM
to defend us against bogus values of extension parameters in the config
file.  Checking is an important thing to do for core parameters, but
a faulty extension parameter doesn't stop the system from booting.
That's because we'll apply all the config file entries before we load
any extensions, even ones listed in shared_preload_libraries.  When
we do load an extension, if it doesn't like what it finds in a placeholder
then you get a WARNING and the parameter's default value is substituted.
So there's no risk of an unstartable system.

So maybe we should allow ALTER SYSTEM for unrecognized parameters,
as long as the parameter name is syntactically legit and you're a
superuser.

regards, tom lane




Re: "20.16. Customized Options" – cannot be set by `ALTER SYSTEM`

2023-10-16 Thread Laurenz Albe
On Mon, 2023-10-16 at 16:21 -0400, Tom Lane wrote:
> So maybe we should allow ALTER SYSTEM for unrecognized parameters,
> as long as the parameter name is syntactically legit and you're a
> superuser.

That seems more consistent than the current behavior, so +1.

Yours,
Laurenz Albe