Re: Patch bug: Fix jsonpath .* on Arrays

2024-06-04 Thread David G. Johnston
On Tuesday, June 4, 2024, David E. Wheeler  wrote:

> Hackers,
>
> The behavior of the .* jpiAnyKey jsonpath selector seems incorrect.
>
> ```
> select jsonb_path_query('[1,2,3]', '$.*');
> jsonb_path_query
> --
> (0 rows)
>
> select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', '$.*');
> jsonb_path_query
> --
> [3, 4, 5]
> ```
>
> The first example might be expected, since .* is intended for object keys,
> but the handing of `jpiAnyKey` has a branch for unwrapping arrays. The
> second example, however, just seems weird: this is .*, not .**.
>

This seems to be working correctly. Lax mode causes the first array level
to unwrap and produce new context item values.  Then the wildcard member
accessor is applied to each.  Numbers don’t have members so no matches
exist in the first example.  The object in the second indeed has a single
member and so matches the wildcard and its value, the array, is returned.

David J.


Re: Explicit specification of index ensuring uniqueness of foreign columns

2024-05-31 Thread David G. Johnston
On Friday, May 31, 2024, Tom Lane  wrote:

> Kaiting Chen  writes:
> > I'd like to resurrect a subset of my proposal in [1], specifically that:
> >   The FOREIGN KEY constraint syntax gains a [ USING INDEX index_name ]
> clause
> >   optionally following the referenced column list.
> > ...
> > While, in this minimal reproduction, the two indexes are
> interchangeable, there
> > are situations that may reasonably occur in the course of ordinary use
> in which
> > they aren't. For example, a redundant unique index with different storage
> > parameters may exist during the migration of an application schema.
>
> I agree that there's a hazard there, but I question if the case is
> sufficiently real-world to justify the costs of introducing a
> non-SQL-standard clause in foreign key constraints.
>
> One such cost is that pg_dump output would become less able to be
> loaded into other DBMSes, or even into older PG versions.
>
> I also wonder if this wouldn't just trade one fragility for another.
> Specifically, I am not sure that we guarantee that the names of
> indexes underlying constraints remain the same across dump/reload.
> If they don't, the USING INDEX clause might fail unnecessarily.
>
> As against that, I'm not sure I've ever seen a real-world case with
> intentionally-duplicate unique indexes.
>
> So on the whole I'm unconvinced that this is worth changing.


Seems like most of those issues could be avoided if we only supply “alter
table” syntax (or a function…).  i.e., give the dba a tool to modify their
system when our default choices fail them.  But continue on with the
defaults as they exist today.

David J.


Re: pgsql: Add more SQL/JSON constructor functions

2024-05-28 Thread David G. Johnston
On Monday, May 27, 2024, Alvaro Herrera  wrote:

> On 2024-May-27, Alvaro Herrera wrote:
>
> > > JSON_SERIALIZE()
>
> I just noticed this behavior, which looks like a bug to me:
>
> select json_serialize('{"a":1, "a":2}' returning varchar(5));
>  json_serialize
> 
>  {"a":
>
> I think this function should throw an error if the destination type
> doesn't have room for the output json.  Otherwise, what good is the
> serialization function?
>
>
It’s not a self-evident bug given that this is exactly how casting data to
varchar(n) behaves as directed by the SQL Standard.

I'd probably leave the internal consistency and take the opportunity to
educate the reader that text is the preferred type in PostgreSQL and,
especially here, there is little good reason to use anything else.

David J.


Re: PG catalog

2024-05-24 Thread David G. Johnston
On Thursday, May 23, 2024, Karki, Sanjay  wrote:
>
> I need to grant select on privilege in pg_catalog to user so I can connect
> via Toad Data point ,
>
> Users can already select from the tables in pg_catalog, grant able
privileges not required or allowed.  Of course, some specific data is
restricted from non-superusers.

David J.


Re: doc regexp_replace replacement string \n does not explained properly

2024-05-20 Thread David G. Johnston
On Monday, May 20, 2024, jian he  wrote:

> hi.
>
> https://www.postgresql.org/docs/current/functions-
> matching.html#FUNCTIONS-POSIX-REGEXP
>
>
>  If there is a match,
> the source string is returned with the replacement string substituted
> for the matching substring.


>
This happens regardless of the presence of parentheses.


>
>  The replacement string can contain \n,
> where n is 1 through 9, to indicate that the source substring matching
> the n'th parenthesized subexpression of the pattern should be
> inserted, and it can contain \& to indicate that the substring
> matching the entire pattern should be inserted.


 Then if the replacement text contains “\n” expressions those are replaced
with text captured from the corresponding parentheses group.


> <<
> i think it explained example like:
> SELECT regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g');


global - find two matches to process.

foobarbaz
fooX\1YX\1Y
fooXarYXazY


>
> but it does not seem to explain cases like:
> SELECT regexp_replace('foobarbaz', 'b(..)', 'X\2Y', 'g');
>
>
foobarbaz
fooX\2YX\2Y
fooX{empty string, no second capture group}YX{empty}Y
fooXYXY

The docs are correct, though I suppose being explicit that a missing
capture group results in an empty string substitution instead of an error
is probably warranted.

David J.


Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-05-17 Thread David G. Johnston
On Fri, May 17, 2024 at 4:57 PM Erik Wienhold  wrote:

> On 2024-05-16 17:47 +0200, David G. Johnston wrote:
> > On Wed, May 15, 2024 at 8:46 AM Robert Haas 
> wrote:
> >
> > > On Thu, Apr 4, 2024 at 12:41 AM Erik Wienhold  wrote:
> > > > Thanks, fixed in v4.  Looks like American English prefers that comma
> and
> > > > it's also more common in our docs.
> > >
> > > Reviewing this patch:
> > >
> > > -  Creates a typed table, which takes its
> > > -  structure from the specified composite type (name optionally
> > > -  schema-qualified).  A typed table is tied to its type; for
> > > -  example the table will be dropped if the type is dropped
> > > -  (with DROP TYPE ... CASCADE).
> > > +  Creates a typed table, which takes its
> > > structure
> > > +  from an existing (name optionally schema-qualified) stand-alone
> > > composite
> > > +  type (i.e., created using )
> though
> > > it
> > > +  still produces a new composite type as well.  The table will
> have
> > > +  a dependency on the referenced type such that cascaded alter and
> > > drop
> > > +  actions on the type will propagate to the table.
> > >
> > > It would be better if this diff didn't reflow the unchanged portions
> > > of the paragraph.
>
> Right.  I now reformatted it so that first line remains unchanged.  But
> the rest of the para is still a complete rewrite.
>
> > > I agree that it's a good idea to mention that the table must have been
> > > created using CREATE TYPE .. AS here, but I disagree with the rest of
> > > the rewording in this hunk. I think we could just add "creating using
> > > CREATE TYPE" to the end of the first sentence, with an xref, and leave
> > > the rest as it is.
> >
> >
> >
> > > I don't see a reason to mention that the typed
> > > table also spawns a rowtype; that's just standard CREATE TABLE
> > > behavior and not really relevant here.
> >
> >
> > I figured it wouldn't be immediately obvious that the system would
> create a
> > second type with identical structure.  Of course, in order for SELECT tbl
> > FROM tbl; to work it must indeed do so.  I'm not married to pointing out
> > this dynamic explicitly though.
> >
> >
> > > And I don't understand what the
> > > rest of the rewording does for us.
> > >
> >
> > It calls out the explicit behavior that the table's columns can change
> due
> > to actions on the underlying type.  Mentioning this unique behavior seems
> > worth a sentence.
> >
> >
> > >   
> > > -  When a typed table is created, then the data types of the
> > > -  columns are determined by the underlying composite type and are
> > > -  not specified by the CREATE TABLE command.
> > > +  A typed table always has the same column names and data types
> as the
> > > +  type it is derived from, and you cannot specify additional
> columns.
> > >But the CREATE TABLE command can add defaults
> > > -  and constraints to the table and can specify storage parameters.
> > > +  and constraints to the table, as well as specify storage
> parameters.
> > >   
> > >
> > > I don't see how this is better.
> > >
> >
> > I'll agree this is more of a stylistic change, but mainly because the
> talk
> > about data types reasonably implies the other items the patch explicitly
> > mentions - names and additional columns.
>
> I prefer David's changes to both paras because right now the details of
> typed tables are spread over the respective CREATE and ALTER commands
> for types and tables.  Or maybe we should add those details to the
> existing "Typed Tables" section at the very end of CREATE TABLE?
>
> > > - errmsg("type %s is not a composite type",
> > > + errmsg("type %s is not a stand-alone composite type",
> > >
> > > I agree with Peter's complaint that people aren't going to understand
> > > what a stand-alone composite type means when they see the revised
> > > error message; to really help people, we're going to need to do better
> > > than this, I think.
> > >
> > >
> > We have a glossary.
>

If sticking with stand-alone composite type as a formal term we should
document it in the glossary.  It's unclear whether this will survive review
though.  With the wording provided in

Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread David G. Johnston
On Friday, May 17, 2024, Joe Conway  wrote:

>
> I wrote:
>
>> Namely, the week before commitfest I don't actually know if I will have
>> the time during that month, but I will make sure my patch is in the
>> commitfest just in case I get a few clear days to work on it. Because if it
>> isn't there, I can't take advantage of those "found" hours.
>>
>
> A solution to both of these issues (yours and mine) would be to allow
> things to be added *during* the CF month. What is the point of having a
> "freeze" before every CF anyway? Especially if they start out clean. If
> something is ready for review on day 8 of the CF, why not let it be added
> for review?
>

In conjunction with WIP removing this limitation on the bimonthlies makes
sense to me.

David J.


Re: Postgres and --config-file option

2024-05-16 Thread David G. Johnston
On Thu, May 16, 2024 at 4:11 PM Michael Paquier  wrote:

> On Thu, May 16, 2024 at 11:57:10AM +0300, Aleksander Alekseev wrote:
> > I propose my original v1 patch for correcting the --help output of
> > 'postgres' too. I agree with the above comments that corresponding
> > changes in v4 became somewhat unwieldy.
>
> Thanks for compiling the rest.
>
> -printf(_("  --NAME=VALUE   set run-time parameter\n"));
> +printf(_("  --NAME=VALUE   set run-time parameter, a shorter form
> of -c\n"));
>
> This part with cross-references in the output is still meh to me, for
> same reason as for the doc changes I've argued to discard upthread.
>

I'm fine with leaving these alone.  If we did change this I'd want to
change both, not just --NAME.


>  write_stderr("%s does not know where to find the server
> configuration file.\n"
> - "You must specify the --config-file or -D invocation
> "
> + "You must specify the --config-file (or equivalent
> -c) or -D invocation "
>
>
I would rather just do this:

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 3fb6803998..f827086489 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1828,8 +1828,8 @@ SelectConfigFiles(const char *userDoption, const char
*progname)
else
{
write_stderr("%s does not know where to find the server
configuration file.\n"
-"You must specify the
--config-file or -D invocation "
-"option or set the PGDATA
environment variable.\n",
+"You must specify either the
config_file run-time parameter or "
+"provide the database directory\n",
 progname);
return false;
}

Both "run-time parameter" and "database directory" are words present in the
help and the user can find the correct argument if that is the option they
want to use.  The removal of the mention of the PGDATA environment variable
doesn't seem to be a great loss here.  This error message doesn't seem to
be the correct place to teach the user about all of their options so long
as they choose to read the documentation they learn about them there; and
we need not prescribe the specific means by which they supply either of
those pieces of information - which is the norm.  If someone simply runs
"postgres" at the command line this message and --help gives them
sufficient information to proceed.

David J.


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread David G. Johnston
On Thu, May 16, 2024 at 1:46 PM Melanie Plageman 
wrote:

>
> I should probably simply
> withdraw and re-register them. My justification was that I'll lose
> them if I don't keep them in the commitfest app. But, I could just,
> you know, save them somewhere myself instead of polluting the
> commitfest app with them. I don't know if others are in this
> situation. Anyway, I'm definitely currently guilty of parking.
>
>
I use a personal JIRA to track the stuff that I hope makes it into the
codebase, as well as just starring the corresponding emails in the
short-term.  Every patch ever submitted sits in the mailing list archive so
I have no real need to preserve git branches with my submitted work on
them.  At lot of my work comes down to lucky timing so I'm mostly content
with just pinging my draft patches on the email thread once in a while
hoping there is interest and time from someone.  For stuff that I would be
OK committing as submitted I'll add it to the commitfest and wait for
someone to either agree or point out where I can improve things.

Adding both these kinds to WIP appeals to me, particularly with something
akin to a "Collaboration Wanted" category in addition to "Needs Review" for
when I think it is ready, and "Waiting on Author" for stuff that has
pending feedback to resolve - or the author isn't currently fishing for
reviewer time for whatever reason.  Ideally there would be no rejections,
only constructive feedback that convinces the author that, whether for now
or forever, the proposed patch should be withdrawn pending some change in
circumstances that suggests the world is ready for it.

David J.


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread David G. Johnston
On Thu, May 16, 2024 at 11:30 AM Robert Haas  wrote:

> Hi,
>
> The original intent of CommitFests, and of commitfest.postgresql.org
> by extension, was to provide a place where patches could be registered
> to indicate that they needed to be reviewed, thus enabling patch
> authors and patch reviewers to find each other in a reasonably
> efficient way. I don't think it's working any more. I spent a good
> deal of time going through the CommitFest this week, and I didn't get
> through a very large percentage of it, and what I found is that the
> status of the patches registered there is often much messier than can
> be captured by a simple "Needs Review" or "Waiting on Author," and the
> number of patches that are actually in need of review is not all that
> large. For example, there are:
>
> - patches parked there by a committer who will almost certainly do
> something about them after we branch
> - patches parked there by a committer who probably won't do something
> about them after we branch, but maybe they will, or maybe somebody
> else will, and anyway this way we at least run CI
> - patches parked there by a committer who may well do something about
> them before we even branch, because they're not actually subject to
> the feature freeze
>

If a committer has a patch in the CF that is going to be committed in the
future unless there is outside action those should be put under a "Pending
Commit" status.

- patches that we've said we don't want but the author thinks we do
> (sometimes i agree with the author, sometimes not)
> - patches that have long-unresolved difficulties which the author
> either doesn't know how to solve or is in no hurry to solve
> - patches that have already been reviewed by multiple people, often
> including several committers, and which have been updated multiple
> times, but for one reason or another, not committed
>

Use the same software but a different endpoint - Collaboration.  Or even
the same endpoint just add an always open slot named "Work In Process"
(WIP).  If items can be moved from there to another open or future
commitfest slot so much the better.

- patches that actually do need to be reviewed
>

If we can distinguish between needs to be reviewed by a committer
(commitfest dated slots - bimonthlies) and reviewed by someone other than
the author (work in process slot) that should help everyone.  Of course,
anyone is welcome to review a patch that has been marked ready to commit.
I suppose "ready to commit" already sorta does this without the need for
WIP but a quick sanity check would be that ready to commit shouldn't (not
mustn't) be seen in WIP and needs review shouldn't be seen in the
bimonthlies.  A needs review in WIP means that the patch has been seen by a
committer and sent back for more work but that the scope and intent are
such that it will make it into the upcoming major release.  Or is something
like a bug fix that just goes right into the bimonthly instead of starting
out as a WIP item.


> I think there are a couple of things that have led to this state of
> affairs. First, we got tired of making people mad by booting their
> stuff out of the CommitFest, so we mostly just stopped doing it, even
> if it had 0% chance of being committed this CommitFest, and really
> even if it had a 0% chance of being committed ever.


Those likely never get out of the new WIP slot discussed above.  Your patch
tracker basically.  And there should be less angst in moving something in
the bimonthly into WIP rather than dropping it outright.  There is
discussion to be had regarding WIP/patch tracking should we go this
direction but even if it is just movIng clutter from one room to another
there seems like a clear benefit and need to tighten up what it means to be
in the bimonthly slot - to have qualifications laid out for a patch to earn
its way there, either by effort (authored and reviewed) or need (i.e., bug
fixes), or, realistically, being authored by a committer and being mostly
trivial in nature.


> Second, we added
> CI, which means that there is positive value to registering the patch
> in the CommitFest even if committing it is not in the cards.


The new slot retains this benefit.

And those
> things together have created a third problem, which is that the list
> is now so long and so messy that even the CommitFest managers probably
> don't manage to go through the whole thing thoroughly in a month.
>

The new slot wouldn't be subject to this.

We'll still have a problem with too many WIP patches and not enough ability
or desire to resolve them.  But setting a higher bar to get onto the
bimonthly slot while still providing a place for collaboration is a step
forward that configuring technology can help with.  As for WIP, maybe
adding thumbs-up and thumbs-down support tracking widgets will help draw
attention to more desired things.

David J.


Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-05-16 Thread David G. Johnston
On Wed, May 15, 2024 at 8:46 AM Robert Haas  wrote:

> On Thu, Apr 4, 2024 at 12:41 AM Erik Wienhold  wrote:
> > Thanks, fixed in v4.  Looks like American English prefers that comma and
> > it's also more common in our docs.
>
> Reviewing this patch:
>
> -  Creates a typed table, which takes its
> -  structure from the specified composite type (name optionally
> -  schema-qualified).  A typed table is tied to its type; for
> -  example the table will be dropped if the type is dropped
> -  (with DROP TYPE ... CASCADE).
> +  Creates a typed table, which takes its
> structure
> +  from an existing (name optionally schema-qualified) stand-alone
> composite
> +  type (i.e., created using ) though
> it
> +  still produces a new composite type as well.  The table will have
> +  a dependency on the referenced type such that cascaded alter and
> drop
> +  actions on the type will propagate to the table.
>
> It would be better if this diff didn't reflow the unchanged portions
> of the paragraph.
>
> I agree that it's a good idea to mention that the table must have been
> created using CREATE TYPE .. AS here, but I disagree with the rest of
> the rewording in this hunk. I think we could just add "creating using
> CREATE TYPE" to the end of the first sentence, with an xref, and leave
> the rest as it is.



> I don't see a reason to mention that the typed
> table also spawns a rowtype; that's just standard CREATE TABLE
> behavior and not really relevant here.


I figured it wouldn't be immediately obvious that the system would create a
second type with identical structure.  Of course, in order for SELECT tbl
FROM tbl; to work it must indeed do so.  I'm not married to pointing out
this dynamic explicitly though.


> And I don't understand what the
> rest of the rewording does for us.
>

It calls out the explicit behavior that the table's columns can change due
to actions on the underlying type.  Mentioning this unique behavior seems
worth a sentence.


>   
> -  When a typed table is created, then the data types of the
> -  columns are determined by the underlying composite type and are
> -  not specified by the CREATE TABLE command.
> +  A typed table always has the same column names and data types as the
> +  type it is derived from, and you cannot specify additional columns.
>But the CREATE TABLE command can add defaults
> -  and constraints to the table and can specify storage parameters.
> +  and constraints to the table, as well as specify storage parameters.
>   
>
> I don't see how this is better.
>

I'll agree this is more of a stylistic change, but mainly because the talk
about data types reasonably implies the other items the patch explicitly
mentions - names and additional columns.


> - errmsg("type %s is not a composite type",
> + errmsg("type %s is not a stand-alone composite type",
>
> I agree with Peter's complaint that people aren't going to understand
> what a stand-alone composite type means when they see the revised
> error message; to really help people, we're going to need to do better
> than this, I think.
>
>
We have a glossary.

That said, leave the wording as-is and add a conditional hint: The
composite type must not also be a table.

David J.


Re: First draft of PG 17 release notes

2024-05-15 Thread David G. Johnston
On Wednesday, May 15, 2024, jian he  wrote:

> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> >
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
> >
>
> in section: E.1.2. Migration to Version 17
>
> >> Rename I/O block read/write timing statistics columns of
> pg_stat_statement (Nazir Bilal Yavuz)
> >> This renames "blk_read_time" to "shared_blk_read_time", and
> "blk_write_time" to "shared_blk_write_time".
>
> we only mentioned, pg_stat_statements some columns name changed
> in "E.1.2. Migration to Version 17"
> but if you look at the release note pg_stat_statements section,
> we added a bunch of columns, which are far more incompatibile than
> just colunm name changes.
>
> not sure we need add these in section "E.1.2. Migration to Version 17"
>
>
New columns are not a migration issue since nothing being migrated forward
ever referenced them.  Its the ones that existing code knows about that
we’ve removed (including renames) that matter from a migration perspective.

David J.


Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 6:35 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

>
> If in core I would still want to expose this as say a contrib module
> binary instead of hacking it into postgres.  It would be our first server
> program entry there.
>
>
Sorry for self-reply but:

Maybe name it "pg_script_check" with a, for now mandatory, "--syntax-only"
option that enables this raw parse mode.  Leaving room for executing this
in an environment where there is, or it can launch, a running instance that
then performs post-parse analysis as well.

David J.


Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 1:00 PM Robert Haas  wrote:

> On Wed, May 15, 2024 at 3:28 PM Tom Lane  wrote:
> > Sorry: "make sense" was a poorly chosen phrase there.  What I was
> > doubting, and continue to doubt, is that 100% checking of what
> > you can check without catalog access and 0% checking of the rest
> > is a behavior that end users will find especially useful.
>
> You might be right, but my guess is that you're wrong. Syntax
> highlighting is very popular, and seems like a more sophisticated
> version of that same concept.


The proposed seems distinctly less sophisticated though would be a starting
point.

I think the documentation for --syntax check would read something like this:

postgres --syntax-check={filename | -}

Performs a pass over the lexical structure of the script supplied in
filename or, if - is specified, standard input, then exits.  The exit code
is zero if no errors were found, otherwise it is 1, and the errors, at full
verbosity, are printed to standard error.  This does not involve reading
the configuration file and, by extension, will not detect errors that
require knowledge of a database schema, including the system catalogs, to
manifest.  There will be no false positives, but due to the prior rule,
false negatives must be factored into its usage.  Thus this option is most
useful as an initial triage point, quickly rejecting SQL code without
requiring a running PostgreSQL service.

Note: This is exposed as a convenient way to get access to the outcome of
performing a raw parse within the specific version of the postgres binary
being executed.  The specific implementation of that parse is still
non-public.  Likewise, PostgreSQL doesn't itself have a use for a raw parse
output beyond sending it to post-parse analysis.  All of the catalog
required checks, and potentially some that don't obviously need the
catalogs, happen in this post-parse step; which the syntax checking API
does not expose.  In short, the API here doesn't include any guarantees
regarding the specific errors one should expect to see output, only the no
false positive test result of performing the first stage raw parse.

David J.

If in core I would still want to expose this as say a contrib module binary
instead of hacking it into postgres.  It would be our first server program
entry there.


Re: add function argument names to regex* functions.

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 1:19 PM Robert Haas  wrote:

>
> So my point was: to me, N is more self-documenting than replace_at,
> and less self-documenting than count or occurrence.
>
> If your mileage varies on that point, so be it!
>
>
Maybe just "match" instead of "replace_match".

Reading this it strikes me that any of these parameter names can and
probably should be read as having "replace" in front of them:

replace N
replace count
replace occurrence
replace match

Saying replace becomes redundant:
replace replace at
replace replace match

David J.


Re: add function argument names to regex* functions.

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 12:07 PM Robert Haas  wrote:

> On Wed, May 15, 2024 at 3:01 PM David G. Johnston
>  wrote:
> > I think this confusion goes to show that replacing N with count doesn't
> work.
> >
> > "replace_at" comes to mind as a better name.
> I'd expect replace_at to be a
> character position or something, not an occurrence count.
>
>
I'll amend the name to:  "replace_match"

I do now see that since the immediately preceding parameter, "start", deals
with characters instead of matches that making it clear this parameter
deals in matches in the name work.  The singular 'match' has all the same
benefits as 'at' plus this point of clarity.


David J.


Re: add function argument names to regex* functions.

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 12:52 PM Robert Haas  wrote:

> On Wed, May 15, 2024 at 3:25 PM David G. Johnston
>  wrote:
> > The function replaces matches, not random characters.  And if you are
> reading the documentation I find it implausible that the wording I
> suggested would cause one to think in terms of characters instead of
> matches.
>
> I mean I just told you what my reaction to it was. If you find that
> reaction "implausible" then I guess you think I was lying when I said
> that?
>
>
You just broke my brain when you say that you read:

By default, only the first match of the pattern is replaced.  If replace_at
is specified and greater than zero, then the first "replace_at - 1" matches
are skipped before making a single replacement (i.e., the g flag is ignored
when replace_at is specified.)

And then say:

I'd expect replace_at to be a character position or something, not an
occurrence count.

I guess it isn't a claim you are lying, rather I simply don't follow your
mental model of all this and in my mental model behind the proposal I don't
believe the typical reader will become confused on that point.  I guess
that means I don't find you to be the typical reader, at least so far as
this specific topic goes.  But hey, maybe I'm the one in the minority.  In
either case we disagree and that was my main point.

David J.


Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 12:35 PM Josef Šimánek 
wrote:

> st 15. 5. 2024 v 21:33 odesílatel David G. Johnston
>  napsal:
>
> > Now, in my ideal world something like this could be made as an extension
> so that it can work on older versions and not have to be maintained by
> core.  And likely grow more features over time.  Is there anything
> fundamental about this that prevents it being implemented in an extension
> and, if so, what can we add to core in v18 to alleviate that limitation?
>
> Like extension providing additional binary? Or what kind of extension
> do you mean? One of the original ideas was to be able to do so (parse
> query) without running postgres itself. Could extension be useful
> without running postgres backend?
>
>
Pushing beyond my experience level here...but yes a separately installed
binary (extension is being used conceptually here, this doesn't involve
"create extension") that can inspect pg_config to find out where
backend/postmaster library objects are installed and link to them.

David J.


Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 12:18 PM  wrote:

> Tom Lane:
> >> This is really what is missing for the ecosystem. A libpqparser for
> >> tools to use: Formatters, linters, query rewriters, simple syntax
> >> checkers... they are all missing access to postgres' own parser.
> >
> > To get to that, you'd need some kind of agreement on what the syntax
> > tree is.  I doubt our existing implementation would be directly useful
> > to very many tools, and even if it is, do they want to track constant
> > version-to-version changes?
>
> Correct, on top of what the syntax tree currently has, one would
> probably need:
> - comments
> - locations (line number / character) for everything, including those of
> comments
>
>
I'm with the original patch idea at this point.  A utility that simply runs
text through the parser, not parse analysis, and answers the question:
"Were you able to parse this?" has both value and seems like something that
can be patched into core in a couple of hundred lines, not thousands, as
has already been demonstrated.

Sure, other questions are valid and other goals exist in the ecosystem, but
that doesn't diminish this one which is sufficiently justified for my +1 on
the idea.

Now, in my ideal world something like this could be made as an extension so
that it can work on older versions and not have to be maintained by core.
And likely grow more features over time.  Is there anything fundamental
about this that prevents it being implemented in an extension and, if so,
what can we add to core in v18 to alleviate that limitation?

David J.


Re: add function argument names to regex* functions.

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 12:07 PM Robert Haas  wrote:

> On Wed, May 15, 2024 at 3:01 PM David G. Johnston
>  wrote:
> > I think this confusion goes to show that replacing N with count doesn't
> work.
> >
> > "replace_at" comes to mind as a better name.
>
> I do not agree with that at all. It shows that a literal
> search-and-replace changing N to count does not work, but it does not
> show that count is a bad name for the concept, and I don't think it
> is. I believe that if I were reading the documentation, count would be
> clearer to me than N, N would probably still be clear enough, and
> replace_at wouldn't be clear at all. I'd expect replace_at to be a
> character position or something, not an occurrence count.
>
>
The function replaces matches, not random characters.  And if you are
reading the documentation I find it implausible that the wording I
suggested would cause one to think in terms of characters instead of
matches.

If I choose not to read the documentation "count" seems like it behaves as
a qualified "g".  I don't want all matches replaced, I want the first
"count" matches only replaced.

"occurrence" probably is the best choice but I agree the spelling issues
are a big negative.

count - how many things there are.  This isn't a count.  I'd rather stick
with N, at least it actually has the desired meaning as a pointer to an
item in a list.

N - The label provides zero context as to what the number you place there
is going to be used for.  Labels ideally do more work than this especially
if someone takes the time to spell them out.  Otherwise why use "pattern"
instead of "p".

David J.


Re: add function argument names to regex* functions.

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 11:46 AM Robert Haas  wrote:

> On Thu, Apr 4, 2024 at 9:55 AM jian he 
> wrote:
> > in the regexp_replace explanation section.
> > changing "N" to lower-case would be misleading for regexp_replace?
> > so I choose "count".
>
> I don't see why that would be confusing for regexp_replace
> specifically, but I think N => count is a reasonable change to make.
> However, I don't think this quite works:
>
> + then the count'th match of the pattern
>
> An English speaker is more likely to understand what is meant by
> "N'th" than what is meant by "count'th". Even if they can guess, it's
> kinda strange-looking. I think it needs to be rephrased somehow, but
> I'm not sure exactly how.
>
>
I think this confusion goes to show that replacing N with count doesn't
work.

"replace_at" comes to mind as a better name.

By default, only the first match of the pattern is replaced.  If replace_at
is specified and greater than zero, then the first "replace_at - 1" matches
are skipped before making a single replacement (i.e., the g flag is ignored
when replace_at is specified.)

David J.


Re: Postgres and --config-file option

2024-05-15 Thread David G. Johnston
On Wed, May 15, 2024 at 2:49 AM Peter Eisentraut 
wrote:

> On 15.05.24 04:07, Michael Paquier wrote:
> > Not sure that these additions in --help or the docs are necessary.
> > The rest looks OK.
> >
> > -"You must specify the --config-file or -D invocation "
> > +"You must specify the --config-file (or equivalent -c) or -D
> invocation "
> >
> > How about "You must specify the --config-file, -c
> > \"config_file=VALUE\" or -D invocation"?  There is some practice for
> > --opt=VALUE in .po files.
>
> Yeah, some of this is becoming quite unwieldy, if we document and
> mention each spelling variant of each option everywhere.
>

Where else would this need to be added that was missed?  Largely we don't
discuss how to bring a setting into effect - rather there is a single
reference area that discusses how, and everywhere else just assumes you
have read it and goes on to name the setting.  On this grounds the
proper fix here is probably to not put the how into the message:

"You must specify the config_file option, the -D argument, or the PGDATA
environment variable."

And this is only unwieldy because while -D and --config-file both can get
to the same result they are not substitutes for each other.  Namely if the
configuration file is not in the data directory, as is the case on Debian,
the choice to use -D is not going to work.

This isn't an error message, I'm not all that worried if we output a wall
of text in lieu of pointing the user to the reference page.


> Maybe if the original problem is that the option --config-file is not
> explicitly in the --help output, let's add it to the --help output?
>
>
I'm not opposed to this.  Though maybe it is sufficient to do:

--NAME=VALUE (e.g., --config-file='...')

I would do this in addition to removing the explicit how of setting
config_file above.

We also don't mention environment variables in the help but that message
refers to PGDATA...so the complaint and fix if done on that basis seems a
bit selective.

David J.


Re: Things I don't like about \du's "Attributes" column

2024-05-14 Thread David G. Johnston
On Tue, May 14, 2024 at 9:03 AM Robert Haas  wrote:

> On Tue, Apr 16, 2024 at 3:06 AM Pavel Luzanov 
> wrote:
> > As for the Login column and its values.
> > I'm not sure about using "Can" instead of "yes" to represent true.
> > In other psql commands, boolean values are always shown as yes/no.
> > NULL instead of false might be possible, but I'd rather check if this
> approach
> > has been used elsewhere. I prefer consistency everywhere.
>
> I don't think we can use "Can" to mean "yes". That's going to be
> really confusing.
>

Agreed


> If I see that the connection limit is labelled as (irrelevant)
> I don't know why it's labelled that way and, if it were me, I'd likely
> end up looking at the source code to figure out why it's showing it
> that way.
>

Or we'd document what we've done and users that don't want to go looking at
source code can just read our specification.


> I think we should go back to the v4 version of this patch, minus the
> (ignored) stuff.
>
>
Agreed, I'm past the point of wanting to have this behave more
intelligently rather than a way for people to avoid having to go write a
catalog using query themselves.

David J.


Re: Document NULL

2024-05-11 Thread David G. Johnston
On Saturday, May 11, 2024, Thom Brown  wrote:

>
>  Sat, May 11, 2024, 16:34 David G. Johnston 
> wrote:
>


> My plan is to have a v4 out next week, without or without a review of this
>> draft, but then the subsequent few weeks will probably be a bit quiet.
>>
>
> +   The cardinal rule, a given null value is never
> +   equal or unequal
> +   to any other non-null.
>
> Again, doesn't this imply it tends to be equal to another null by its
> omission?
>
>
I still agree, it’s just a typo now…

…is never equal or unequal to any value.

Though I haven’t settled on a phrasing I really like.  But I’m trying to
avoid a parenthetical.

David J.


Re: Document NULL

2024-05-11 Thread David G. Johnston
On Fri, May 3, 2024 at 9:00 AM David G. Johnston 
wrote:

> On Fri, May 3, 2024 at 8:44 AM Tom Lane  wrote:
>
>> Having said that, I reiterate my proposal that we make it a new
>>
>  under DDL, before 5.2 Default Values which is the first
>> place in ddl.sgml that assumes you have heard of nulls.
>
>
> I will go with this and remove the "Data Basics" section I wrote, leaving
> it to be just a discussion about null values.  The tutorial is the only
> section that really needs unique wording to fit in.  No matter where we
> decide to place it otherwise the core content will be the same, with maybe
> a different section preface to tie it in.
>
>
v3 Attached.

Probably at the 90% complete mark.  Minimal index entries, not as thorough
a look-about of the existing documentation as I'd like.  Probably some
wording and style choices to tweak.  Figured better to get feedback now
before I go into polish mode.  In particular, tweaking and re-running the
examples.

Yes, I am aware of my improper indentation for programlisting and screen. I
wanted to be able to use the code folding features of my editor.  Those can
be readily un-indented in the final version.

The changes to func.sgml is basically one change repeated something like 20
times with tweaks for true/false.  Plus moving the discussion regarding the
SQL specification into the new null handling section.

It took me doing this to really understand the difference between row
constructors and composite typed values, especially since array
constructors produce array typed values and the constructor is just an
unimportant implementation option while row constructors introduce
meaningfully different behaviors when used.

My plan is to have a v4 out next week, without or without a review of this
draft, but then the subsequent few weeks will probably be a bit quiet.

David J.
From bea784bd683f7e022dbfb3d72832d09fc7754913 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Wed, 1 May 2024 07:45:48 -0700
Subject: [PATCH] Document NULL

---
 doc/src/sgml/ddl.sgml|   2 +
 doc/src/sgml/filelist.sgml   |   1 +
 doc/src/sgml/func.sgml   | 268 ++---
 doc/src/sgml/nullvalues.sgml | 719 +++
 4 files changed, 837 insertions(+), 153 deletions(-)
 create mode 100644 doc/src/sgml/nullvalues.sgml

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 026bfff70f..68a0fe698d 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -168,6 +168,8 @@ DROP TABLE products;
   
  
 
+ 
+
  
   Default Values
 
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 38ec362d8f..882752e88f 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -21,6 +21,7 @@
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 17c44bc338..98fba7742c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -23295,7 +23295,8 @@ MERGE INTO products p
This section describes the SQL-compliant subquery
expressions available in PostgreSQL.
All of the expression forms documented in this section return
-   Boolean (true/false) results.
+   three-valued typed
+   results (true, false, or null).
   
 
   
@@ -23357,19 +23358,17 @@ WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2);
 
   
The right-hand side is a parenthesized
-   subquery, which must return exactly one column.  The left-hand expression
+   subquery, which must return exactly one column.  The result of IN
+   is false if the subquery returns no rows, otherwise the left-hand expression
is evaluated and compared to each row of the subquery result.
-   The result of IN is true if any equal subquery row is found.
-   The result is false if no equal row is found (including the
-   case where the subquery returns no rows).
+   The result is true if any equal subquery row is found.
+   The result is false if no equal row is found.
   
 
   
-   Note that if the left-hand expression yields null, or if there are
-   no equal right-hand values and at least one right-hand row yields
-   null, the result of the IN construct will be null, not false.
-   This is in accordance with SQL's normal rules for Boolean combinations
-   of null values.
+   As explained in , it is not possible to see
+   a false result in the presence of both rows and null values since the multiple equality
+   tests are AND'd together.
   
 
   
@@ -23386,21 +23385,18 @@ WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2);
as described in .
The right-hand side is a parenthesized
subquery, which must return exactly as many columns as there are
-   expressions in the left-hand row.  The left-hand expressions are
+   expressions in the left-hand row.
+   The result of IN is false if the subquery returns no rows,
+   otherwise the left-hand expressions are
evaluated and compared row-wise to each row of the subque

Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread David G. Johnston
On Thu, May 9, 2024 at 1:16 PM Andres Freund  wrote:

> Hi,
>
> On 2024-05-09 09:23:37 -0700, David G. Johnston wrote:
> > This needs updating:
> > https://www.postgresql.org/docs/current/docguide-build-meson.html
>
> You mean it should have a syntax target? Or that something else is out of
> date?
>
>
v17 looks good, I like the auto-generation.  I failed to notice I was
looking at v16 when searching for a check docs target.


> Also, as a sanity check, running that command takes my system 1 minute.
> Any
> > idea what percentile that falls into?
>
> I think that's on the longer end - what OS/environment is this on? Even on
> ~5yo CPU with turbo boost disabled it's 48s for me.  FWIW, the single-page
> html is a good bit faster, 29s on the same system.
>

Ubuntu 22.04 running in AWS Workspaces Power.

Amazon EC2 t3.xlarge
Intel® Xeon(R) Platinum 8259CL CPU @ 2.50GHz × 4
16GiB Ram

David J.


Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread David G. Johnston
On Thu, May 9, 2024 at 12:12 PM Dagfinn Ilmari Mannsåker 
wrote:

> "David G. Johnston"  writes:
>
> > I've been using "ninja html" which isn't shown here.
>
> The /devel/ version has a link to the full list of doc targets:
>
>
> https://www.postgresql.org/docs/devel/install-meson.html#TARGETS-MESON-DOCUMENTATION


I knew I learned about the html target from the docs.  Forgot to use the
devel ones this time around.


> Attached is a patch which adds a check-docs target for meson, which
> takes 0.3s on my laptop.
>

Thanks.

David J.


Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread David G. Johnston
$subject

Make has one:
https://www.postgresql.org/docs/current/docguide-build.html#DOCGUIDE-BUILD-SYNTAX-CHECK

This needs updating:
https://www.postgresql.org/docs/current/docguide-build-meson.html

I've been using "ninja html" which isn't shown here.  Also, as a sanity
check, running that command takes my system 1 minute.  Any idea what
percentile that falls into?

David J.


Re: request for database identifier in the startup packet

2024-05-09 Thread David G. Johnston
On Thursday, May 9, 2024, Dave Cramer  wrote:

> Greetings,
>
> The JDBC driver is currently keeping a per connection cache of types in
> the driver. We are seeing cases where the number of columns is quite high.
> In one case Prevent fetchFieldMetaData() from being run when unnecessary.
> · Issue #3241 · pgjdbc/pgjdbc (github.com)
>  2.6 Million columns.
>
> If we knew that we were connecting to the same database we could use a
> single cache across connections.
>
> I think we would require a server/database identifier in the startup
> message.
>

I feel like pgbouncer ruins this plan.

But maybe you can construct a lookup key from some combination of data
provided by these functions:
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-SESSION

David J.


Re: PERIOD foreign key feature

2024-05-07 Thread David G. Johnston
On Tue, May 7, 2024 at 7:54 AM Bruce Momjian  wrote:

> In this commit:
>
> commit 34768ee3616
> Author: Peter Eisentraut 
> Date:   Sun Mar 24 07:37:13 2024 +0100
>
> Add temporal FOREIGN KEY contraints
>
> Add PERIOD clause to foreign key constraint definitions.  This
> is
> supported for range and multirange types.  Temporal foreign
> keys check
> for range containment instead of equality.
>
> This feature matches the behavior of the SQL standard temporal
> foreign
> keys, but it works on PostgreSQL's native ranges instead of
> SQL's
> "periods", which don't exist in PostgreSQL (yet).
>
> Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET
> DEFAULT}
> are not supported yet.
>
> Author: Paul A. Jungwirth 
> Reviewed-by: Peter Eisentraut 
> Reviewed-by: jian he 
> Discussion:
> https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mdhcy4_qq0+noc...@mail.gmail.com
>
> this text was added to create_table.sgml:
>
> In addition, the referenced table must have a primary
> key or unique constraint declared with WITHOUT
> --> OVERLAPS.  Finally, if one side of the foreign key
> --> uses PERIOD, the other side must too.  If the
> refcolumn list is
> omitted, the WITHOUT OVERLAPS part of the
> primary key is treated as if marked with PERIOD.
>
> In the two marked lines, it says "if one side of the foreign key uses
> PERIOD, the other side must too."  However, looking at the example
> queries, it seems like if the foreign side has PERIOD, the primary side
> must have WITHOUT OVERLAPS, not PERIOD.
>
> Does this doc text need correcting?
>
>
The text is factually correct, though a bit hard to parse.

"the other side" refers to the part after "REFERENCES":

FOREIGN KEY ( column_name [, ... ] [, PERIOD column_name ] ) REFERENCES
reftable [ ( refcolumn [, ... ] [, PERIOD column_name ] ) ]

***(shouldn't the second occurrence be [, PERIOD refcolum] ?)

The text is pointing out that since the refcolumn specification is optional
you may very well not see a second PERIOD keyword in the clause.  Instead
it will be inferred from the PK.

Maybe:

Finally, if the foreign key has a PERIOD column_name specification the
corresponding refcolumn, if present, must also be marked PERIOD.  If the
refcolumn clause is omitted, and thus the reftable's primary key constraint
chosen, the primary key must have its final column marked WITHOUT OVERLAPS.

David J.


Re: Increase the length of identifers from 63 characters to 128 characters or more

2024-05-06 Thread David G. Johnston
On Monday, May 6, 2024, Peter Burbery  wrote:

>
> Business Use-case: I want to create a table named things_that_take_up_a_
> lot_of_storage_and_space_on_a_computer_and_hard_drive of 75 characters. I
> also want to create a column named thing_that_takes_up_a_
> lot_of_storage_and_space_on_a_computer_and_hard_drive_id of 78
> characters. People have asked for this feature before.
>
>
We have a mailing list archive.  You should do the same research of past
requests and discussions on it since that is where you will find the
developers involved in the discussion and can find out the past reasoning
as to why this hasn’t been changed.

 David J.


Re: Proposal for CREATE OR REPLACE EVENT TRIGGER in PostgreSQL

2024-05-03 Thread David G. Johnston
On Friday, May 3, 2024, Peter Burbery  wrote:

> Dear pgsql-hackers,
>
> One-line Summary:
> Proposal to introduce the CREATE OR REPLACE syntax for EVENT TRIGGER in
> PostgreSQL.
>
> Business Use-case:
> Currently, to modify an EVENT TRIGGER, one must drop and recreate it. This
> proposal aims to introduce a CREATE OR REPLACE syntax for EVENT TRIGGER,
> similar to other database objects in PostgreSQL, to simplify this process
> and improve usability.
>
> Contact Information:
> Peter Burbery
> peter.cullen.burb...@gmail.com
>
> I look forward to your feedback on this proposal.
>

Its doesn’t seem like that big an omission, especially since unlike views
and functions you can’t actually affect dependencies on an event trigger.
But the same can be said of normal triggers and they already have an “or
replace” option.  So, sure, if you come along and offer a patch for this it
seems probable it would be accepted.  I’d just be a bit sad it probably
would take away from time that could spent on more desirable features.

As for your proposal format, it is a quite confusing presentation to send
to an open source project.  Also, your examples are overly complex, and
crammed together, for the purpose and not enough effort is spent actually
trying to demonstrate why this is needed when “drop if exists” already is
present.  A much better flow is to conditionally drop the existing object
and create the new one all in the same transaction.

David J.


Re: Document NULL

2024-05-03 Thread David G. Johnston
On Fri, May 3, 2024 at 8:44 AM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Fri, May 3, 2024 at 7:10 AM Peter Eisentraut 
> > wrote:
> >> On 02.05.24 17:23, David G. Johnston wrote:
> >>> I chose to add a new sect1 in the user guide (The SQL Language)
> chapter,
> >>> "Data".
>
> >> Please, let's not.
>
> > If a committer wants to state the single place in the documentation to
> put
> > this I'm content to put it there while leaving my reasoning of choices in
> > place for future bike-shedding.  My next options to decide between are
> the
> > appendix or the lead chapter in Data Types. It really doesn't fit inside
> > DDL IMO which is the only other suggestion I've seen (and an uncertain,
> or
> > at least unsubstantiated, one) and a new chapter meets both criteria Tom
> > laid out, so long as this is framed as more than just having to document
> > null values.
>
> I could see going that route if we actually had a chapter's worth of
> material to put into "Data".  But we don't, there's really only one
> not-very-long section.  Robert has justifiably complained about that
> sort of thing elsewhere in the docs, and I don't want to argue with
> him about why it'd be OK here.
>

OK.  I was hopeful that once the Chapter existed the annoyance of it being
short would be solved by making it longer.  If we ever do that, moving this
section under there at that point would be an option.


> Having said that, I reiterate my proposal that we make it a new
>  under DDL, before 5.2 Default Values which is the first
> place in ddl.sgml that assumes you have heard of nulls.


I will go with this and remove the "Data Basics" section I wrote, leaving
it to be just a discussion about null values.  The tutorial is the only
section that really needs unique wording to fit in.  No matter where we
decide to place it otherwise the core content will be the same, with maybe
a different section preface to tie it in.

Putting it in an appendix is similarly throwing
> to the wind any idea that you can read the documentation in order.
>

I think we can keep the entire camel out of the tent while letting it get a
whiff of what is inside.  It would be a summary reference linked to from
the various places that mention null values.
https://en.wikipedia.org/wiki/Camel%27s_nose


> I suppose we could address the nonlinearity gripe with a bunch
> of cross-reference links, in which case maybe something under
> Data Types is the least bad approach.
>
>
Yeah, there is circularity here that is probably impossible to
completely resolve.

David J.


Re: Document NULL

2024-05-03 Thread David G. Johnston
On Fri, May 3, 2024 at 7:10 AM Peter Eisentraut 
wrote:

> On 02.05.24 17:23, David G. Johnston wrote:
> > Version 2 attached.  Still a draft, focused on topic picking and overall
> > structure.  Examples and links planned plus the usual semantic markup
> stuff.
> >
> > I chose to add a new sect1 in the user guide (The SQL Language) chapter,
> > "Data".
>
> Please, let's not.
>

If a committer wants to state the single place in the documentation to put
this I'm content to put it there while leaving my reasoning of choices in
place for future bike-shedding.  My next options to decide between are the
appendix or the lead chapter in Data Types. It really doesn't fit inside
DDL IMO which is the only other suggestion I've seen (and an uncertain, or
at least unsubstantiated, one) and a new chapter meets both criteria Tom
laid out, so long as this is framed as more than just having to document
null values.


> A stylistic note: "null" is an adjective.  You can talk about a "null
> value" or a value "is null".
>

Will do.

David J.


Re: Document NULL

2024-05-03 Thread David G. Johnston
On Fri, May 3, 2024 at 1:14 AM jian he  wrote:

> On Fri, May 3, 2024 at 2:47 PM Laurenz Albe 
> wrote:
> >
> > On Thu, 2024-05-02 at 08:23 -0700, David G. Johnston wrote:
> > > Version 2 attached.  Still a draft, focused on topic picking and
> overall structure.
> >
> > I'm fine with most of the material (ignoring ellipses and typos), except
> this:
> >
> > +The NOT NULL column constraint is largely syntax sugar for the
> corresponding
> > +column IS NOT NULL check constraint, though there are metadata
> differences
> > +described in create table.
> >
>
> the system does not translate (check constraint column IS NOT NULL)
> to NOT NULL constraint,
> at least in domain.
>
>
I'll change this but I was focusing on the fact you get identical
user-visible behavior with not null and a check(col is not null).  Chain of
thought being we discuss the is not null operator (indirectly) already and
so not null, which is syntax as opposed to an operation/expression, can
leverage that explanation as opposed to getting its own special case.  I'll
consider this some more and maybe mention the catalog dynamics a bit as
well, or at least point to them.


> drop domain connotnull cascade;
> create domain connotnull integer;
> alter domain connotnull add check (value is not null);
> \dD
>

This reminds me, I forgot to add commentary regarding defining a not null
constraint on a domain but the domain type surviving a left join but having
a null value.

David J.


Re: Document NULL

2024-05-02 Thread David G. Johnston
On Wed, May 1, 2024 at 9:47 PM Tom Lane  wrote:

> David Rowley  writes:
> > Let's bash it into shape a bit more before going any further on actual
> wording.
>
> FWIW, I want to push back on the idea of making it a tutorial section.
> I too considered that, but in the end I think it's a better idea to
> put it into the "main" docs, for two reasons:
>
>
Version 2 attached.  Still a draft, focused on topic picking and overall
structure.  Examples and links planned plus the usual semantic markup stuff.

I chose to add a new sect1 in the user guide (The SQL Language) chapter,
"Data".  Don't tell Robert.

The "Data Basics" sub-section lets us readily slide this Chapter into the
main flow and here the NULL discussion feels like a natural fit.  In
hindsight, the lack of a Data chapter in a Database manual seems like an
oversight.  One easily made because we assume if you are here you "know"
what data is, but there is still stuff to be discussed, if nothing else to
establish a common understanding between us and our users.

David J.
From 7798121992154edab4768d7eab5a89be04730b2f Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Wed, 1 May 2024 07:45:48 -0700
Subject: [PATCH] Document NULL

---
 doc/src/sgml/data.sgml | 169 +
 doc/src/sgml/filelist.sgml |   1 +
 doc/src/sgml/postgres.sgml |   1 +
 3 files changed, 171 insertions(+)
 create mode 100644 doc/src/sgml/data.sgml

diff --git a/doc/src/sgml/data.sgml b/doc/src/sgml/data.sgml
new file mode 100644
index 00..2b09382494
--- /dev/null
+++ b/doc/src/sgml/data.sgml
@@ -0,0 +1,169 @@
+
+ Data
+
+ 
+  This chapter provides definitions for, and an overview of, data.
+  It discusses the basic design of the metadata related to values
+  and then goes on to describe the special value NULL which typically
+  represents unknown
+ 
+
+ 
+  Data Basics
+  
+   All literals, columns, variables, and expression results in PostgreSQL
+   are typed, which are listed in the next chapter.  Literals and columns
+   must only use one of the concrete types while variables can use
+   either a concrete type or a pseudo-type.  Expression results
+   are limited to concrete types and the pseudo-type record described below.
+  
+  
+   The pseudo-types prefixed with any implement polymorphism
+   in PostgreSQL.  Polymorphism allows a single function specification
+   to act on multiple concrete types.  At runtime, the function body
+   associates concrete types to all polymorphic types based upon the
+   conrete argument of its inputs. See ... for more details.
+  
+  
+   The record pseudo-type is also polymorphic in nature but allows
+   the caller of the function to specify the row-like structure of
+   output within the containing query. See ... for more details.
+   The ROW(...) expression (see ...) will also produce a record
+   result comprised of the named columns.
+  
+ 
+
+  
+   Unknown Values (NULL)
+   
+This section first introduces the meaning of NULL and then goes
+on to explain how different parts of the system behave when faced
+with NULL input.
+   
+
+  
+   NULL in Data Models
+   
+Generally NULL is assumed to mean "unknown".  However,
+in practice meaning comes from context and so a model design may state that
+NULL is to be used to represent "not applicable" - i.e., that a value is not
+even possible.  SQL has only the single value NULL while there are multiple
+concepts that people have chosen to apply it to.  In any case the behavior
+of the system when dealing with NULL is the same regardless of the meaning
+the given to it in the surrounding context.
+   
+   
+NULL also takes on a literal meaning of "not found" when produced as the
+result of an outer join.
+   
+  
+
+  
+   NULL Usage
+   
+As NULL is treated as a data value it, like all values, must have
+a data type.  NULL is a valid value for all data types.
+   
+
+   
+A NULL literal is written as unquoted NULL.  Its type is unknown but
+can be cast to any concrete data type.  The [type 'string'] syntax
+however will not work as there is no way to express NULL using single
+quotes and unlike.
+   
+
+   
+The presence of NULL in the system results in three-valued logic.
+In binary logic every outcome is either true or false.  In
+three-valued logic unknown, represented using a NULL value, is
+also an outcome.  Aspects of the system that branch based upon
+whether a condition variable is true or false thus must also
+decide how to behave when then condition is NULL.  The remaining
+sub-sections summarize these decisions.
+   
+  
+
+  
+The Cardinal Rule of NULL
+   
+The cardinal rule, NULL is never equal or unequal to any non-null
+value (or itself).  [NULL = anything yields NULL].
+Checking for NULL has an explicit tes

Re: EXPLAN redundant options

2024-05-02 Thread David G. Johnston
On Thu, May 2, 2024 at 6:17 AM jian he  wrote:

> explain (verbose, verbose off, analyze on, analyze off, analyze on)
>
>
I would just update this paragraph to note the last one wins behavior.

"When the option list is surrounded by parentheses, the options can be
written in any order.  However, if options are repeated the last one listed
is used."

I have no desire to introduce breakage here.  The implemented concept is
actually quite common.  The inconsistency with COPY seems like a minor
point.  It would influence my green field choice but not enough for
changing long-standing behavior.

David J.


Document NULL

2024-05-01 Thread David G. Johnston
Hi,

Over in [1] it was rediscovered that our documentation assumes the reader
is familiar with NULL.  It seems worthwhile to provide both an introduction
to the topic and an overview of how this special value gets handled
throughout the system.

Attached is a very rough draft attempting this, based on my own thoughts
and those expressed by Tom in [1], which largely align with mine.

I'll flesh this out some more once I get support for the goal, content, and
placement.  On that point, NULL is a fundamental part of the SQL language
and so having it be a section in a Chapter titled "SQL Language" seems to
fit well, even if that falls into our tutorial.  Framing this up as
tutorial content won't be that hard, though I've skipped on examples and
such pending feedback.  It really doesn't fit as a top-level chapter under
part II nor really under any of the other chapters there.  The main issue
with the tutorial is the forward references to concepts not yet discussed
but problem points there can be addressed.

I do plan to remove the entity reference and place the content into
query.sgml directly in the final version.  It is just much easier to write
an entire new section in its own file.

David J.

[1] https://www.postgresql.org/message-id/1859814.1714532025%40sss.pgh.pa.us
From a068247e92e620455a925a0ae746adc225ae1339 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Wed, 1 May 2024 07:45:48 -0700
Subject: [PATCH] Document NULL

---
 doc/src/sgml/filelist.sgml |  1 +
 doc/src/sgml/null.sgml | 79 ++
 doc/src/sgml/query.sgml|  2 +
 3 files changed, 82 insertions(+)
 create mode 100644 doc/src/sgml/null.sgml

diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 38ec362d8f..ac4fd52978 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -10,6 +10,7 @@
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/null.sgml b/doc/src/sgml/null.sgml
new file mode 100644
index 00..5f95b2494e
--- /dev/null
+++ b/doc/src/sgml/null.sgml
@@ -0,0 +1,79 @@
+
+ Handling Unkowns (NULL)
+
+ 
+  NULL
+ 
+
+ 
+  Looking again at our example weather data you will note that we do not know
+  the amount of precipitation Hayward.  We communicated that implicitly by
+  not including the prcp column in the insert.  Explicitly, we can communicate
+  this fact by writing.  [example using null].  When a column is not specified
+  in an insert the default value for that column is recorded and the default
+  default value is NULL.
+ 
+
+ 
+  As a value NULL crops up all throughout the database and interacts with many
+  features.  The main portion of this book will detail the interactions specific
+  features have with NULL but it is helpful to have a reference page where one
+  can get an overview.
+ 
+
+ 
+  First, like all values, NULLs are typed.  But since any value can be unknown
+  NULL is a valid value for all data types.
+ 
+
+ 
+  Second, when speaking generally NULL is assumed to mean unknown.  However,
+  in practice meaning comes from context and so a model design may state that
+  NULL is to be used to represent "not applicable" - i.e., that a value is not
+  even possible.  SQL has only the single value NULL while there are multiple
+  concepts that people have chosen to apply it to.  In any case the behavior
+  of the system when dealing with NULL is the same regardless of the meaning
+  the given to it in the surrounding context.
+ 
+
+ 
+  The cardinal rule, NULL is never equal or unequal to any non-null
+  value; and when asked to be combined with a known value in an operation the
+  result of the operation becomes unknown.  e.g., both 1 = NULL and 1 + NULL
+  result in NULL.  Exceptions to this are documented.  See [chapter] for
+  details on how to test for null.  Specifically, note that concept of
+  distinctness is introduced to allow for true/false equality tests.
+ 
+
+ 
+  Extending from the previous point, function calls are truly a mixed bag.
+  Aggregate functions in particular will usually just ignore NULL inputs
+  instead of forcing the entire aggregate result to NULL.  Function
+  specifications has a "strictness" attribute that, when set to "strict"
+  (a.k.a. "null on null input") will tell the executor to return NULL for any
+  function call having at least one NULL input value, without executing the
+  function.
+ 
+
+ 
+  A WHERE clause that evaluates to NULL for a given row will exclude that row.
+  This was demonstrated in the tutorial query where cities with prcp > 0 were
+  requested and Hayward was not returned due to this and the cardinal rule.
+ 
+
+ 
+  While not yet discussed, it is possible to define validation expressions on
+  tables that ensure only values passing those expressions are inserted.  While
+  this seems like it would behave as a WHERE clause on a table, the choice here
+  when an expression evaulates to NULL is allow the row to be in

Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-04-29 Thread David G. Johnston
On Monday, April 29, 2024, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > My solution to this was to rely on the fact that the bootstrap superuser
> is
> > assigned OID 10 regardless of its name.
>
> Yeah, I wrote it that way to start with too, but reconsidered
> because
>
> (1) I don't like hard-coding numeric OIDs.  We can avoid that in C
> code but it's harder to do in SQL.


If the tests don’t involve, e.g., the predefined role pg_monitor and its
grantor of the memberships in the other predefined roles, this indeed can
be avoided.  So I think my test still needs to check for 10 even if some
other superuser is allowed to produce the test output since a key output in
my case was the bootstrap superuser and the initdb roles.


> (2) It's not clear to me that this test couldn't be run by a
> non-bootstrap superuser.  I think "current_user" is actually
> the correct thing for the role executing the test.
>

Agreed, testing against current_role is correct if the things being queried
were created while executing the test.  I would need to do this as well to
remove the current requirement that my tests be run by the bootstrap
superuser.

David J.


Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-04-29 Thread David G. Johnston
On Monday, April 29, 2024, Tom Lane  wrote:

> Daniel Gustafsson  writes:
> >> On 28 Apr 2024, at 20:52, Tom Lane  wrote:
>
>
> >> This is of course not bulletproof: with a sufficiently weird
> >> bootstrap superuser name, we could get false matches to parts
> >> of "regress_dump_test_role" or to privilege strings.  That
> >> seems unlikely enough to live with, but I wonder if anybody has
> >> a better idea.
>
> > I think that will be bulletproof enough to keep it working in the
> buildfarm and
> > among 99% of hackers.
>
> It occurred to me to use "aclexplode" to expand the initprivs, and
> then we can substitute names with simple equality tests.  The test
> query is a bit more complicated, but I feel better about it.
>

My solution to this was to rely on the fact that the bootstrap superuser is
assigned OID 10 regardless of its name.

David J.


Re: pg_input_error_info doc 2 exampled crammed together

2024-04-28 Thread David G. Johnston
On Sunday, April 28, 2024, Michael Paquier  wrote:

> On Sun, Apr 28, 2024 at 06:45:30PM -0700, David G. Johnston wrote:
> > My preference would be to limit this section to a single example.  The
> > numeric one, as it provides values for more output columns.  I would
> change
> > the output format to expanded from default, in order to show all columns
> > and not overrun the length of a single line.
>
> Agreed that having two examples does not bring much, so this could be
> brought to a single one.  The first one is enough to show the point of
> the function, IMO.  It is shorter in width and it shows all the output
> columns.
>
>
Agreed.  The column names are self-explanatory if you’ve seen errors
before.  The values are immaterial.  Plus we don’t generally use
psql-specific features in our examples.

David J.


Re: pg_input_error_info doc 2 exampled crammed together

2024-04-28 Thread David G. Johnston
On Sunday, April 28, 2024, jian he  wrote:
>
>
> after checking the definition of [1], [2],
> maybe here we should use 


>
Possibly, though I’d be curious to see how consistent we are on this point
elsewhere before making a point of it.


>
> and also add `(1 row)` information.


Doesn’t seem like added value.


>
> or we can simply add a empty new line between
> ` value "420" is out of range for type integer ||  |
> 22003`
> and
> ``


My preference would be to limit this section to a single example.  The
numeric one, as it provides values for more output columns.  I would change
the output format to expanded from default, in order to show all columns
and not overrun the length of a single line.

David J.


Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-28 Thread David G. Johnston
On Sunday, April 28, 2024, Alexander Lakhin  wrote:

>
> When we deal with mixed ownership, say, bob is an owner of a
> partitioned table, but not an owner of a partition, should we
> allow him to perform merge with that partition?
>
>
Attaching via alter table requires the user to own both the partitioned
table and the table being acted upon.  Merge needs to behave similarly.

The fact that we let the superuser break the requirement of common
ownership is unfortunate but I guess understandable.  But given the
existing behavior of attach merge should likewise fail if it find the user
doesn’t own the partitions being merged.  The fact that the user can select
from those tables can be acted upon manually if desired; these
administrative commands should all ensure common ownership and fail if that
precondition is not met.

David J.


Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-28 Thread David G. Johnston
On Sunday, April 28, 2024, Alexander Lakhin  wrote:

>
> When we deal with mixed ownership, say, bob is an owner of a
> partitioned table, but not an owner of a partition, should we
> allow him to perform merge with that partition?
>
>
IIUC Merge causes the source tables to be dropped, their data having been
effectively moved into the new partition.  bob must not be allowed to drop
Alice’s tables.  Only an owner may do that.  So if we do allow bob to build
a new partition using his select access, the tables he selected from would
have to remain behind if he is not an owner of them.

David J.


Re: Read table rows in chunks

2024-04-27 Thread David G. Johnston
On Sat, Apr 27, 2024 at 12:47 AM Sushrut Shivaswamy <
sushrut.shivasw...@gmail.com> wrote:

>
> I"m trying to read the rows of a table in chunks to process them in a
> background worker.
>

This list really isn't the place for this kind of discussion.  You are
doing application-level stuff, not working on patches for core.  General
discussions and questions like this should be directed to the -general
mailing list.

I want to ensure that each row is processed only once.
>

Is failure during processing possible?


> I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT
> {limit_size}` functionality for this but I"m running into issues.
>

FOR UPDATE and SKIPPED LOCKED clauses usually come into play for this use
case.


> Can you please suggest any alternative to periodically read rows from a
> table in chunks while processing each row exactly once.
>
>
I think you are fooling yourself if you think you can do this without
writing back to the row the fact it has been processed.  At which point
ensuring that you only retrieve and process unprocessed rows is trivial -
just don't select ones with a status of processed.

If adding a column to the data is not possible, record processed row
identifiers into a separate table and inspect that.

DavId J.


Re: query_id, pg_stat_activity, extended query protocol

2024-04-27 Thread David G. Johnston
On Sat, Apr 27, 2024 at 6:55 AM Imseih (AWS), Sami 
wrote:

>
> Hmm, you raise a good point. Isn't this a fundamental problem
> with prepared statements? If there is DDL on the
> relations of the prepared statement query, shouldn't the prepared
> statement be considered invalid at that point and raise an error
> to the user?
>
>
We choose a arguably more user-friendly option:

https://www.postgresql.org/docs/current/sql-prepare.html

"""
Although the main point of a prepared statement is to avoid repeated parse
analysis and planning of the statement, PostgreSQL will force re-analysis
and re-planning of the statement before using it whenever database objects
used in the statement have undergone definitional (DDL) changes or their
planner statistics have been updated since the previous use of the prepared
statement.
"""

David J.


Re: Partitioned tables and [un]loggedness

2024-04-24 Thread David G. Johnston
On Wed, Apr 24, 2024 at 4:35 PM Michael Paquier  wrote:

>
> I disagree here, actually.  Temporary tables are a different beast
> because they require automated cleanup which would include interacting
> with the partitionining information if temp and non-temp relations are
> mixed.  That's why the current restrictions are in place: you should
>
[ not ] ?

> be able to mix them.
>

My point is that if you feel that treating logged as a copy-able property
is OK then doing the following should also just work:

postgres=# create temp table parentt ( id integer ) partition by range (id);
CREATE TABLE
postgres=# create table child10t partition of parentt for values from (0)
to (9);
ERROR:  cannot create a permanent relation as partition of temporary
relation "parentt"

i.e., child10t should be created as a temporary partition under parentt.

David J.


Re: Partitioned tables and [un]loggedness

2024-04-24 Thread David G. Johnston
On Wed, Apr 24, 2024 at 1:26 PM Nathan Bossart 
wrote:

> On Wed, Apr 24, 2024 at 04:17:44PM +0900, Michael Paquier wrote:
> > - Support ALTER TABLE .. SET LOGGED/UNLOGGED for partitioned tables,
> > where the command only works on partitioned tables so that's only a
> > catalog switch.
>
> I'm not following what this means.  Does SET [UN]LOGGED on a partitioned
> table recurse to its partitions?  Does this mean that you cannot changed
> whether a single partition is [UN]LOGGED?  How does this work with
> sub-partitioning?
>
> > - CREATE TABLE PARTITION OF would make a new partition inherit the
> > logged ness of the parent if UNLOGGED is not directly specified.
>
> This one seems generally reasonable to me, provided it's properly noted in
> the docs.
>

I don't see this being desirable at this point.  And especially not by
itself.  It is an error to not specify TEMP on the partition create table
command when the parent is temporary, and that one is a no-brainer for
having the persistence mode of the child be derived from the parent.  The
current policy of requiring the persistence of the child to be explicitly
specified seems perfectly reasonable.  Or, put differently, the specific
current persistence of the parent doesn't get copied or even considered
when creating children.

In any case we aren't going to be able to do exactly what it means by
marking a partitioned table unlogged - namely that we execute the truncate
command on it after a crash.  Forcing the implementation here just so that
our existing decision to ignore unlogged on the parent table is, IMO,
letting optics corrupt good design.

I do agree with having an in-core way for the DBA to communicate that they
wish for partitions to be created with a known persistence unless the
create table command specifies something different.  The way I would
approach this is to add something like the following to the syntax/system:

CREATE [ persistence_mode ] TABLE ...
...
WITH (partition_default_persistence = { logged, unlogged, temporary }) --
storage_parameter, logged is effectively the default

This method is more explicit and has zero implications for existing backups
or upgrading.


> > - How about ONLY?  Would it make sense to support it so as ALTER TABLE
> > ONLY on a partitioned table does not touch any of its partitions?
>

I'd leave it to the community to develop and maintain scripts that iterate
over the partition hierarchy and toggle the persistence mode between logged
and unlogged on the individual partitions using whatever throttling and
batching strategy each individual user requires for their situation.


> > - CREATE TABLE does not have a LOGGED keyword, hence it is not
> > possible through the parser to force a partition to have a permanent
> > persistence even if its partitioned table uses UNLOGGED.
>
> Could we add LOGGED for CREATE TABLE?
>
>
I do agree with adding LOGGED to the list of optional persistence_mode
specifiers, possibly regardless of whether any of this happens.  Seems
desirable to give our default mode a name.

David J.


Re: doc: create table improvements

2024-04-24 Thread David G. Johnston
On Wed, Apr 24, 2024 at 7:45 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Apr 24, 2024 at 3:30 AM Peter Eisentraut 
> wrote:
>
>>  > +   The reliability characteristics of a table are governed by its
>>  > +   persistence mode.  The default mode is described
>>  > +   here
>>  > +   There are two alternative modes that can be specified during
>>  > +   table creation:
>>  > +   temporary and
>>  > +   unlogged.
>>
>> Not sure reliability is the best word here.  I mean, a temporary table
>> isn't any less reliable than any other table.  It just does different
>> things.
>>
>>
> Given the name of the section where this is all discussed I'm having
> trouble going with a different word.  But better framing and phrasing I can
> do:
>
> A table may be opted out of certain storage aspects of reliability, as
> described [here], by specifying either of the alternate persistence modes:
> [temporary] or [logged]. The specific trade-offs and implications are
> detailed below.
>
>
Or maybe:

A table operates in one of three persistence modes (default, [temporary],
and [unlogged]) described in [Chapter 28]. --point to the intro page for
the chapter as expanded as below, not the reliability page.

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 05e2a8f8be..102cfeca68 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -5,8 +5,17 @@

  
   This chapter explains how to control the reliability of
-  PostgreSQL, including details about the
-  Write-Ahead Log.
+  PostgreSQL.  At its core this
+  involves writing all changes to disk twice - first to a
+  journal of changes called the write-ahead-log (WAL) and
+  then to the physical pages that comprise permanent tables
+  on disk (heap).  This results in four high-level
+  persistence modes for tables.
+  The default mode results in both these features being
+  enabled.  Temporary tables forgo both of these options,
+  while unlogged tables only forgo WAL.  There is no WAL-only
+  operating mode.  The rest of this chapter discusses
+  implementation details related to these two options.
  

David J.


Re: doc: create table improvements

2024-04-24 Thread David G. Johnston
On Wed, Apr 24, 2024 at 3:30 AM Peter Eisentraut 
wrote:

>  > +   The reliability characteristics of a table are governed by its
>  > +   persistence mode.  The default mode is described
>  > +   here
>  > +   There are two alternative modes that can be specified during
>  > +   table creation:
>  > +   temporary and
>  > +   unlogged.
>
> Not sure reliability is the best word here.  I mean, a temporary table
> isn't any less reliable than any other table.  It just does different
> things.
>
>
Given the name of the section where this is all discussed I'm having
trouble going with a different word.  But better framing and phrasing I can
do:

A table may be opted out of certain storage aspects of reliability, as
described [here], by specifying either of the alternate persistence modes:
[temporary] or [logged]. The specific trade-offs and implications are
detailed below.

David J.


Re: [Doc] Improvements to ddl.sgl Privileges Section and Glossary

2024-04-22 Thread David G. Johnston
Any thoughts?

On Thu, Jan 25, 2024 at 1:59 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> Hey,
>
> In a nearby user complaint email [1] some missing information regarding
> ownership reassignment came to light.  I took that and went a bit further
> to add what I felt was further missing information and context for how the
> privilege system is designed.  I've tried to formalize and label existing
> concepts a bit and updated the glossary accordingly.
>
> The attached is a partial rewrite of the patch on the linked post after
> those comments were taken into consideration.
>
> The new glossary entry for privileges defines various qualifications of
> the term that are used in the new prose.  I've marked up each of the
> variants and point them all back to the main entry.  I didn't try to
> incorporate those terms, or even really look, anywhere else in the
> documentation.  If the general idea is accepted that kind of work can be
> done as a follow-up.
>
> David J.
>
> [1]
> https://www.postgresql.org/message-id/d294818d12280f6223ddf169ab5454927f5186b6.camel%40cybertec.at
>
>


Re: [Doc] Improve hostssl related descriptions and option presentation

2024-04-22 Thread David G. Johnston
Thoughts anyone?

On Thu, Feb 1, 2024 at 3:47 PM David G. Johnston 
wrote:

> Motivated by a recent complaint [1] I found the hostssl related material
> in our docs quite verbose and even repetitive.  Some of that is normal
> since we have both an overview/walk-through section as well as a reference
> section.  But the overview in particular was self-repetitive.  Here is a
> first pass at some improvements.  The commit message contains both the
> intent of the patch and some thoughts/questions still being considered.
>
> David J.
>
> [1]
> https://www.postgresql.org/message-id/170672433664.663.11895343120533141715%40wrigleys.postgresql.org
>


doc: create table improvements

2024-04-22 Thread David G. Johnston
Hey,

The attached patch addresses four somewhat related aspects of the create
table reference page that bother me.

This got started with Bug# 15954 [1] (unlogged on a partitioned table
doesn't make sense) and I've added a paragraph under "unlogged" to address
it.

While doing that, it seemed desirable to explicitly frame up both temporary
and unlogged as being "persistence modes" so I added a mention of both in
the description.  Additionally, it seemed appropriate to do so immediately
after the opening paragraph since the existing second paragraph goes
immediately into talking about temporary tables and schemas.  I figured a
link to the reliability chapter where one learns about WAL and why/how
these alternative persistence modes exist is worthwhile. (I added a missing
comma to the first sentence while I was in the area)

Third, I've had a long-standing annoyance regarding the excessive length of
the CREATE line of each of the create table syntax blocks.  Replacing the
syntax for the persistence modes with a named placeholder introduces
structure and clarity while reducing the length of the line nicely.

Finally, while fixing line lengths, the subsequent line (first form) for
column specification is even more excessive.  Pulling out the
column_storage syntax into a named reference nicely cleans this line up.

David J.

P.S. I didn't go into depth on the fact the persistence options are not
inherited/copied/like-able; so for now the fact they are not so is
discovered by their omission when discussing those topics.

[1]
https://www.postgresql.org/message-id/flat/15954-b61523bed4b110c4%40postgresql.org
From e375044d55809d239be33f31c4efa8410790d3f0 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Mon, 22 Apr 2024 11:51:53 -0700
Subject: [PATCH] doc: create table doc enhancements

---
 doc/src/sgml/ref/create_table.sgml | 37 +-
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 02f31d2d6f..9a5dafb9af 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -21,8 +21,8 @@ PostgreSQL documentation
 
  
 
-CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name ( [
-  { column_name data_type [ STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN | DEFAULT } ] [ COMPRESSION compression_method ] [ COLLATE collation ] [ column_constraint [ ... ] ]
+CREATE [ persistence_mode ] TABLE [ IF NOT EXISTS ] table_name ( [
+  { column_name data_type [ column_storage ] [ COLLATE collation ] [ column_constraint [ ... ] ]
 | table_constraint
 | LIKE source_table [ like_option ... ] }
 [, ... ]
@@ -34,7 +34,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
 [ TABLESPACE tablespace_name ]
 
-CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name
+CREATE [ persistence_mode ] TABLE [ IF NOT EXISTS ] table_name
 OF type_name [ (
   { column_name [ WITH OPTIONS ] [ column_constraint [ ... ] ]
 | table_constraint }
@@ -46,7 +46,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
 [ TABLESPACE tablespace_name ]
 
-CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name
+CREATE [ persistence_mode ] TABLE [ IF NOT EXISTS ] table_name
 PARTITION OF parent_table [ (
   { column_name [ WITH OPTIONS ] [ column_constraint [ ... ] ]
 | table_constraint }
@@ -58,7 +58,16 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
 [ TABLESPACE tablespace_name ]
 
-where column_constraint is:
+where persistence_mode is: 
+
+[ GLOBAL | LOCAL ] { TEMPORARY | TEMP } |
+UNLOGGED
+
+and column_storage is:
+
+STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN | DEFAULT } [ COMPRESSION compression_method ]
+
+and column_constraint is:
 
 [ CONSTRAINT constraint_name ]
 { NOT NULL |
@@ -118,11 +127,21 @@ WITH ( MODULUS numeric_literal, REM
   Description
 
   
-   CREATE TABLE will create a new, initially empty table
+   CREATE TABLE will create a new, initially empty, table
in the current database. The table will be owned by the user issuing the
command.
   
 
+  
+   The reliability characteristics of a table are governed by its
+   persistence mode.  The default mode is described
+   here
+   There are two alternative modes that can be specified during
+   table creation:
+   temporary and
+   unlogged.
+  
+
   
If a schema name is given (for example, CREATE TABLE
myschema.mytable ...) then the table is created in the specified
@@ -221,6 +240,12 @@ WITH ( MODULUS numeric_literal, REM
   If this is specified, any sequences created together 

Re: Things I don't like about \du's "Attributes" column

2024-04-15 Thread David G. Johnston
On Sun, Feb 18, 2024 at 4:14 AM Pavel Luzanov 
wrote:

> 2. Tom's advise:
>
> Not sure it's worth worrying about
>
> Show real values for 'Valid until' and 'Connection limit' without any hints.
>
>
At this point I'm on board with retaining the \dr charter of simply being
an easy way to access the detail exposed in pg_roles with some display
formatting but without any attempt to convey how the system uses said
information.  Without changing pg_roles.  Our level of effort here, and
degree of dependence on superuser, doesn't seem to be bothering people
enough to push more radical changes here through and we have good
improvements that are being held up in the hope of possible perfection.

David J.


Re: Things I don't like about \du's "Attributes" column

2024-04-15 Thread David G. Johnston
On Sat, Apr 13, 2024 at 7:02 PM Wen Yi  wrote:

> I think we can change the output like this:
>
> postgres=# \du
> List of roles
>  Role name | Login | Attributes  | Password | Valid until | Connection
> limit
>
> ---+---+-+--+-+--
>  test  |   | Inherit |  | |
>  test2 | Can   | Inherit | Has  | |
>  wenyi | Can   | Superuser  +|  | |
>|   | Create DB  +|  | |
>|   | Create role+|  | |
>|   | Inherit+|  | |
>|   | Replication+|  | |
>|   | Bypass RLS  |  | |
> (3 rows)
>
> And I submit my the patch, have a look?
>
>
Why?  I actually am generally open to false being encoded as blank where
there are only two possible values, but there is no precedence of choosing
something besides 'yes' or 'true' to represent the boolean true value.

Whether Password is truly two-valued is debatable per the ongoing
discussion.

David J.


Re: Stability of queryid in minor versions

2024-04-14 Thread David G. Johnston
On Sun, Apr 14, 2024 at 7:03 PM David Rowley  wrote:

> On Mon, 15 Apr 2024 at 13:37, David G. Johnston
>  wrote:
> > Seems we can improve things by simply removing the "rule of thumb"
> sentence altogether.  The prior paragraph states the things the queryid
> depends upon at the level of detail the reader needs.
>
> I don't think that addresses the following, which I mentioned earlier:
>
> > but not stable across *major* versions does *not* mean stable across
> > *minor* versions. The reader is just left guessing if that's true.
>
>
The base assumption here is that changes in the things we don't mention do
not influence the queryid.  We didn't mention minor versions, changing them
doesn't influence the queryid.

Now, reading that entire paragraph is a bit of a challenge IMO, and agree,
as I subsequently noted, that the sentence you pointed out could be
reworked.  I stand by my statement that removing the sentence about "rule
of thumb" altogether is a win.  The prior paragraph should be sufficient -
it is technically at the moment but am not opposed to rewording.

David J.


Re: Stability of queryid in minor versions

2024-04-14 Thread David G. Johnston
On Sun, Apr 14, 2024 at 6:32 PM David Rowley  wrote:

> On Mon, 15 Apr 2024 at 13:19, Tom Lane  wrote:
> >
> > Michael Paquier  writes:
> > > On Mon, Apr 15, 2024 at 11:20:16AM +1200, David Rowley wrote:
> > >> 1. We cannot change Node enums in minor versions
> > >> 2. We're *unlikely* to add fields to Node types in minor versions, and
> > >> if we did we'd likely be leaving them out of the jumble calc, plus it
> > >> seems highly unlikely any new field we wedged into the padding would
> > >> relate at all to the parsed query.
> >
> > > Since 16 these new fields would be added by default unless the node
> > > attribute query_jumble_ignore is appended to it.
> >
> > They'd also be written/read by outfuncs/readfuncs, thereby breaking
> > stored views/rules if the Node is one that can appear in a parsetree.
> > So the bar to making such a change in a stable branch would be very
> > high.
>
> I think a soft guarantee in the docs for it being stable in minor
> versions would be ok then.
>
> I'm unsure if "Rule of thumb" is the correct way to convey that. We
> can't really write "We endeavour to", as who is "We".  Maybe something
> like "Generally, it can be assumed that queryid is stable between all
> minor versions of a major version of ..., providing that  reasons>".
>
>
So, there are three kinds of dependencies:

Product
Machine
User Data

The user data dependencies are noted as being OID among other things
The machine dependencies are the architecture and other facets
The product dependencies are not enumerated but can be simply stated to be
internals stable throughout a major version.

A minimal rewording of the last sentence in the prior paragraph could be:

Lastly, the queryid depends upon aspects of PostgreSQL internals that can
only change with each major version release.

I'm disinclined to note minor releases here given the above wording.  Sure,
like with lots of things, circumstances may require us to break a policy,
but we don't seem to make that point everywhere we conceive it could happen.

David J.


Re: Stability of queryid in minor versions

2024-04-14 Thread David G. Johnston
On Sun, Apr 14, 2024 at 4:20 PM David Rowley  wrote:

>
> I've drafted a patch which I think improves things, but it probably
> needs more work and opinions.
>
>
Seems we can improve things by simply removing the "rule of thumb" sentence
altogether.  The prior paragraph states the things the queryid depends upon
at the level of detail the reader needs.

The sentence "Two servers participating in replication based on physical
WAL replay can be expected to have identical queryid values for the same
query." apparently assumes that to participate both servers must share the
same machine architecture.  I am under the impression that this is only an
advisory, not a requirement.  Rather, two servers participating in physical
replication will be ensured that the catalog metadata and major versions
are identical.  This is not the case for servers related via logical
replication.

David J.


Re: In MacOS, psql reacts on SIGINT in a strange fashion (Linux is fine)

2024-04-13 Thread David G. Johnston
On Saturday, April 13, 2024, Dmitry Koterov 
wrote:

>
>
> % psql --version
> psql (PostgreSQL) 16.0
>

How did you install this and can you install other, supported, versions?

David J.


Re: documentation structure

2024-04-05 Thread David G. Johnston
On Fri, Apr 5, 2024 at 9:18 AM Robert Haas  wrote:

> On Fri, Apr 5, 2024 at 12:15 PM David G. Johnston
>  wrote:
> > Here is a link to my attempt at this a couple of years ago.  It
> basically "abuses" refentry.
> >
> >
> https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com
> >
> > I never did dive into the man page or PDF dynamics of this particular
> change but it seemed to solve HTML pagination without negative consequences
> and with minimal risk of unintended consequences since only the markup on
> the pages we want to alter is changed, not global configuration.
>
> Hmm, but it seems like that might have generated some man page entries
> that we don't want?
>

If so (didn't check) maybe just remove them in post?

David J.


Re: documentation structure

2024-04-05 Thread David G. Johnston
On Fri, Apr 5, 2024 at 9:01 AM Robert Haas  wrote:

>
> > The rendering can be adjusted to some degree, but then we also need to
> > make sure any new chunking makes sense in other chapters.  (And it might
> > also change a bunch of externally known HTML links.)
>
> I looked into this and I'm unclear how much customization is possible.
>
>
Here is a link to my attempt at this a couple of years ago.  It basically
"abuses" refentry.

https://www.postgresql.org/message-id/CAKFQuwaVm%3D6d_sw9Wrp4cdSm5_k%3D8ZVx0--v2v4BH4KnJtqXqg%40mail.gmail.com

I never did dive into the man page or PDF dynamics of this
particular change but it seemed to solve HTML pagination without negative
consequences and with minimal risk of unintended consequences since only
the markup on the pages we want to alter is changed, not global
configuration.

David J.


Re: Reports on obsolete Postgres versions

2024-04-04 Thread David G. Johnston
On Thu, Apr 4, 2024 at 11:23 AM Bruce Momjian  wrote:

> On Wed, Apr  3, 2024 at 06:01:41PM -0700, David G. Johnston wrote:
> >  
> >  The PostgreSQL Global Development Group supports a major version for 5
> years
> > -after its initial release. After its five year anniversary, a major
> version
> > will
> > -have one last minor release containing any fixes and will be considered
> > -end-of-life (EOL) and no longer supported.
> > +after its initial release. After its fifth anniversary, a major version
> will
> > +have one last minor release and will be considered
> > +end-of-life (EOL), meaning no new bug fixes will be written for it.
> >  
> >
> > # "fifth anniversary "seems more correct than "five year anniversary".
> > # The fact it gets a minor release implies it receives fixes.
> > # I've always had an issue with our use of the phrasing "no longer
> supported".
> > It seems vague when practically it just means we stop writing patches
> for it.
> > Whether individual community members refrain from answering questions or
> > helping people on these older releases is not a project decision but a
> personal
> > one.  Also, since we already say it is supported for 5 years it seems a
> bit
> > redundant to then also say "after 5 years it is unsupported".
>
> I went with the attached patch.  I tightned up the "unsupported" part,
> trying to
> tie it to the fact that we don't make anymore releases for it.
>
> >  Version Numbering
> > @@ -45,11 +45,12 @@ number, e.g. 9.5.3 to 9.5.4.
> >  Upgrading
> >
> >  
> > -Major versions usually change the internal format of the system tables.
> > -These changes are often complex, so we do not maintain backward
> > -compatibility of all stored data.  A dump/reload of the database or use
> of the
> > -pg_upgrade module is required
> > -for major upgrades. We also recommend reading the
> > +Major versions need the data directory to be initialized so that the
> system
> > tables
> > +specific to that version can be created.  There are two options to then
> > +migrate the user data from the old directory to the new one: a
> dump/reload
> > +of the database or using the
> > +pg_upgrade module.
> > +We also recommend reading the
> >  upgrading section of the
> major
> >  version you are planning to upgrade to. You can upgrade from one major
> version
> >  to another without upgrading to intervening versions, but we recommend
> reading
> > @@ -58,14 +59,15 @@ versions prior to doing so.
> >  
> >
> > # This still talked about "stored data" when really that isn't the main
> concern
> > and if it was pg_upgrade wouldn't work as an option.
> > # The choice to say "data directory" here seems reasonable if arguable.
> But it
> > implies the location of user data and we state it also contains
> > version-specific system tables.  It can go unsaid that they are
> > version-specific because the collection as a whole and the layout of
> individual
> > tables can and almost certainly will change between versions.
> >
> >  
> > -Minor release upgrades do not require a dump and restore;  you simply
> stop
> > +Minor releases upgrades do not impact the data directory, only the
> binaries.
> > +You simply stop
> >  the database server, install the updated binaries, and restart the
> server.
> > -Such upgrades might require manual changes to complete so always read
> > +However, some upgrades might require manual changes to complete so
> always read
> >  the release notes first.
> >  
> >
> > # The fact minor releases don't require dump/reload doesn't directly
> preclude
> > them from needing pg_upgrade and writing "Such upgrades" seems like it
> could
>
> Minor upgrades never have required pg_upgrade.
>
>
How about this:
"""
Major versions make complex changes, so the contents of the data directory
cannot be maintained in a backward compatible way.  A dump and restore of
the databases is required, either done manually or as part of running the
pg_upgrade server application.
"""

My main change here is to mirror "dump and restore" in both paragraphs and
make it clear that this action is required and that the unnamed
pg_dump/pg_restore tools or pg_upgrade are used in order to perform this
task.  Since minor version upgrades do not require "dump and restore" they
need not use these tools.

Also, calling pg_upgrade a module doesn't seem correct.  It is found under
server applications in our docs and consists solely of that program (and a
bunch of manual steps) from the user's perspective.

David J.


Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-04-03 Thread David G. Johnston
On Thu, Mar 28, 2024 at 8:02 PM Erik Wienhold  wrote:

> Thanks, that sounds better.  I incorporated that with some minor edits
> in the attached v3.
>

Looks good.

You added my missing ( but dropped the comma after "i.e."

diff --git a/doc/src/sgml/ref/create_table.sgml
b/doc/src/sgml/ref/create_table.sgml
index dc69a3f5dc..b2e9e97b93 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -251,7 +251,7 @@ WITH ( MODULUS numeric_literal, REM
  
   Creates a typed table, which takes its
structure
   from an existing (name optionally schema-qualified) stand-alone
composite
-  type (i.e. created using ) though it
+  type (i.e., created using ) though it
   still produces a new composite type as well.  The table will have
   a dependency on the referenced type such that cascaded alter and drop
   actions on the type will propagate to the table.

David J.


Re: Reports on obsolete Postgres versions

2024-04-03 Thread David G. Johnston
On Tue, Apr 2, 2024 at 1:47 PM Bruce Momjian  wrote:

> On Tue, Apr  2, 2024 at 11:34:46AM +0200, Magnus Hagander wrote:
>
> Okay, I changed "superseded" to "old", and changed "latest" to
> "current", patch attached.
>
>
I took a pass at this and found a few items of note.  Changes on top of
Bruce's patch.

diff --git a/templates/support/versioning.html
b/templates/support/versioning.html
index 0ed79f4f..d4762967 100644
--- a/templates/support/versioning.html
+++ b/templates/support/versioning.html
@@ -21,9 +21,9 @@ a release available outside of the minor release roadmap.

 
 The PostgreSQL Global Development Group supports a major version for 5
years
-after its initial release. After its five year anniversary, a major
version will
-have one last minor release containing any fixes and will be considered
-end-of-life (EOL) and no longer supported.
+after its initial release. After its fifth anniversary, a major version
will
+have one last minor release and will be considered
+end-of-life (EOL), meaning no new bug fixes will be written for it.
 

# "fifth anniversary "seems more correct than "five year anniversary".
# The fact it gets a minor release implies it receives fixes.
# I've always had an issue with our use of the phrasing "no longer
supported".  It seems vague when practically it just means we stop writing
patches for it.  Whether individual community members refrain from
answering questions or helping people on these older releases is not a
project decision but a personal one.  Also, since we already say it is
supported for 5 years it seems a bit redundant to then also say "after 5
years it is unsupported".


 Version Numbering
@@ -45,11 +45,12 @@ number, e.g. 9.5.3 to 9.5.4.
 Upgrading

 
-Major versions usually change the internal format of the system tables.
-These changes are often complex, so we do not maintain backward
-compatibility of all stored data.  A dump/reload of the database or use of
the
-pg_upgrade module is required
-for major upgrades. We also recommend reading the
+Major versions need the data directory to be initialized so that the
system tables
+specific to that version can be created.  There are two options to then
+migrate the user data from the old directory to the new one: a dump/reload
+of the database or using the
+pg_upgrade module.
+We also recommend reading the
 upgrading section of the major
 version you are planning to upgrade to. You can upgrade from one major
version
 to another without upgrading to intervening versions, but we recommend
reading
@@ -58,14 +59,15 @@ versions prior to doing so.
 

# This still talked about "stored data" when really that isn't the main
concern and if it was pg_upgrade wouldn't work as an option.
# The choice to say "data directory" here seems reasonable if arguable.
But it implies the location of user data and we state it also contains
version-specific system tables.  It can go unsaid that they are
version-specific because the collection as a whole and the layout of
individual tables can and almost certainly will change between versions.

 
-Minor release upgrades do not require a dump and restore;  you simply stop
+Minor releases upgrades do not impact the data directory, only the
binaries.
+You simply stop
 the database server, install the updated binaries, and restart the server.
-Such upgrades might require manual changes to complete so always read
+However, some upgrades might require manual changes to complete so always
read
 the release notes first.
 

# The fact minor releases don't require dump/reload doesn't directly
preclude them from needing pg_upgrade and writing "Such upgrades" seems
like it could lead someone to think that.
# Data Directory seems generic enough to be understood here and since I
mention it in the Major Version as to why data migration is needed,
mentioning here
# as something not directly altered and thus precluding the data migration
has a nice symmetry.  The potential need for manual changes becomes clearer
as well.


 
-Minor releases only fix frequently-encountered bugs, security issues, and data corruption
 problems, so such upgrades are very low risk.  The community
 considers performing minor upgrades to be less risky than continuing to

# Reality mostly boils down to trusting our judgement when it comes to bugs
as each one is evaluated individually.  We do not promise to leave simply
buggy behavior alone in minor releases which is the only policy that would
nearly eliminate upgrade risk.  We back-patch 5 year old bugs quite often
which by definition are not frequently encountered.  I cannot think of a
good adjective to put there nor does one seem necessary even if I agree it
reads a bit odd otherwise.  I think that has more to do with this being
just the wrong structure to get our point across.  Can we pick out some
statistics from our long history of publishing minor releases to present an
objective reality to the reader to convince them to trust our track-record
in this matter?

David J.


Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-03-28 Thread David G. Johnston
On Thu, Mar 7, 2024 at 9:29 PM Erik Wienhold  wrote:

> I wrote:
> > The attached v2 is a simpler patch that instead modifies the existing
> > error message.
>
> Forgot to attach v2.
>
>
For consideration for the doc portion.  The existing wording is too
imprecise for my liking and just tacking on "expects...create type" is
jarring.

"""
Creates a typed table, which takes it structure from an existing (name
optionally schema-qualified) stand-alone composite type i.e., one created
using CREATE TYPE) though it still produces a new composite type as well.
The table will have a dependency to the referenced type such cascaded alter
and drop actions on the type will propagate to the table.

A typed table always has the same column names and data types as the type
it is derived from, and you cannot specify additional columns.  But the
CREATE TABLE command can add defaults and constraints to the table, as well
as specify storage parameters.
"""

We do use the term "stand-alone composite" in create type so I'm inclined
to use it instead of "composite created with CREATE TYPE"; especially in
the error messages; I'm a bit more willing to add the cross-reference to
create type in the user docs.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread David G. Johnston
On Wed, Mar 27, 2024 at 5:43 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

>
> This section is also the main entry point for users into the configuration
> subsystem and hasn't been updated to reflect this new feature.  That seems
> like an oversight that needs to be corrected.
>
>
Shouldn't the "alter system" reference page receive an update as well?

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread David G. Johnston
On Wed, Mar 27, 2024 at 5:17 PM Bruce Momjian  wrote:

> On Thu, Mar 28, 2024 at 12:43:29AM +0100, Jelte Fennema-Nio wrote:
> > +  xreflabel="allow_alter_system">
> > +  allow_alter_system (boolean)
> > +  
> > +   allow_alter_system configuration
> parameter
> > +  
> > +  
> > +  
> > +   
> > +When allow_alter_system is set to
> > +off, an error is returned if the
> ALTER
> > +SYSTEM command is used. This parameter can only be
> set in
>
> "command is used." -> "command is issued." ?
>

"command is executed" seems even better.  I'd take used over issued.


> > +the postgresql.conf file or on the server
> command
> > +line. The default value is on.
> > +   
> > +
> > +   
> > +Note that this setting cannot be regarded as a security
> feature. It
>
> "setting cannot be regarded" -> "setting should not be regarded"
>

"setting must not be regarded" is the correct option here.  Stronger than
should; we are unable to control whether someone can/does regard it
differently.


> > +
> > +   
> > +Turning this setting off is intended for environments where the
> > +configuration of PostgreSQL is
> managed by
> > +some outside mechanism.
> > +In such environments, a well intenioned superuser user might
> > +mistakenly use ALTER
> SYSTEM
> > +to change the configuration instead of using the outside
> mechanism.
> > +This might even appear to update the configuration as intended,
> but
>
> "This might even appear to update" -> "This might temporarily update"
>

I strongly prefer temporarily over may/might/could.



>
> > +then might be discarded at some point in the future when that
> outside
>
> "that outside" -> "the outside"
>

Feel like "external" is a more context appropriate term here than "outside".

External also has precedent.
https://www.postgresql.org/docs/current/config-setting.html#CONFIG-INCLUDES
"External tools may also modify postgresql.auto.conf. It is not recommended
to do this while the server is running,"

That suggests using "external tools" instead of "outside mechanisms"

This section is also the main entry point for users into the configuration
subsystem and hasn't been updated to reflect this new feature.  That seems
like an oversight that needs to be corrected.

> +   
> > +
> > +   
> > +This parameter only controls the use of ALTER
> SYSTEM.
> > +The settings stored in
> postgresql.auto.conf always
>
> "always" -> "still"
>

Neither qualifier is needed, nor does one seem clearly better than the
other.  Always is true so the weaker "still" seems like the worse choice.

The following is a complete and clear sentence.

The settings stored in postgresql.auto.conf take effect even if
allow_alter_system is set to off.


> Should this paragraph be moved after or as part of the paragraph about
> modifying postgresql.auto.conf?
>
>
I like it by itself.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread David G. Johnston
On Wed, Mar 27, 2024 at 3:18 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Mar 27, 2024 at 3:13 PM Bruce Momjian  wrote:
>
>> On Wed, Mar 27, 2024 at 06:09:02PM -0400, Bruce Momjian wrote:
>> > On Wed, Mar 27, 2024 at 11:05:55AM -0400, Robert Haas wrote:
>> > > On Wed, Mar 27, 2024 at 10:43 AM Jelte Fennema-Nio <
>> postg...@jeltef.nl> wrote:
>> > > > Alright, changed the GUC name to "allow_alter_system" since that
>> seems
>> > > > to have the most "votes". One other option would be to call it
>> simply
>> > > > "alter_system", just like "jit" is not called "allow_jit" or
>> > > > "enable_jit".
>> > > >
>> > > > But personally I feel that the "allow_alter_system" is clearer than
>> > > > plain "alter_system" for the GUC name.
>> > >
>> > > I agree, and have committed your 0001.
>> >
>> > So, I email "Is this really a patch we think we can push into PG 17. I
>> > am having my doubts," and the patch is applied a few hours after my
>> > email.  Wow.
>>
>> Also odd is that I don't see the commit in git master, so now I am
>> confused.
>>
>
> The main feature being discussed is in the 0002 patch while Robert pushed
> a doc section rename in the 0001 patch.
>
>
Well, the internal category name was changed though the docs did remain
unchanged.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread David G. Johnston
On Wed, Mar 27, 2024 at 3:13 PM Bruce Momjian  wrote:

> On Wed, Mar 27, 2024 at 06:09:02PM -0400, Bruce Momjian wrote:
> > On Wed, Mar 27, 2024 at 11:05:55AM -0400, Robert Haas wrote:
> > > On Wed, Mar 27, 2024 at 10:43 AM Jelte Fennema-Nio 
> wrote:
> > > > Alright, changed the GUC name to "allow_alter_system" since that
> seems
> > > > to have the most "votes". One other option would be to call it simply
> > > > "alter_system", just like "jit" is not called "allow_jit" or
> > > > "enable_jit".
> > > >
> > > > But personally I feel that the "allow_alter_system" is clearer than
> > > > plain "alter_system" for the GUC name.
> > >
> > > I agree, and have committed your 0001.
> >
> > So, I email "Is this really a patch we think we can push into PG 17. I
> > am having my doubts," and the patch is applied a few hours after my
> > email.  Wow.
>
> Also odd is that I don't see the commit in git master, so now I am
> confused.
>

The main feature being discussed is in the 0002 patch while Robert pushed a
doc section rename in the 0001 patch.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread David G. Johnston
On Wed, Mar 27, 2024 at 10:12 AM Isaac Morland 
wrote:

> On Wed, 27 Mar 2024 at 13:05, Greg Sabino Mullane 
> wrote:
>
>> The purpose of the setting is to prevent accidental
>>> modifications via ALTER SYSTEM in environments where
>>
>>
>> The emphasis on 'accidental' seems a bit heavy here, and odd. Surely,
>> just "to prevent modifications via ALTER SYSTEM in environments where..."
>> is enough?
>>
>
> Not necessarily disagreeing, but it's very important nobody ever mistake
> this for a security feature. I don't know if the extra word "accidental" is
> necessary, but I think that's the motivation.
>

Prevent non-malicious modifications via ALTER SYSTEM in environments where
...

David J.


Re: Extension for PostgreSQL WIP

2024-03-24 Thread David G. Johnston
On Sun, Mar 24, 2024 at 5:49 AM ShadowGhost 
wrote:

> Cast_jsonb_to_hstore WIP
> v1
> This extension add function that can cast jsonb to hstore.
> That link to my github where does my extension lie
> https://github.com/antuanviolin/cast_jsonb_to_hstore
>

If you are intending to submit this to the project you need to follow the
correct procedures.

https://wiki.postgresql.org/wiki/Submitting_a_Patch

Otherwise, this is not the correct place to be promoting your extension.

David J.

p.s. I would advise that including the whole bit about jsonb and hstore in
the email subject line (if you resubmit in a proper format) instead of only
in the body of the email.  Subject lines are very important on a mailing
list such as this and should be fairly precise - not just the word
extension.


Re: documentation structure

2024-03-22 Thread David G. Johnston
On Fri, Mar 22, 2024 at 11:19 AM Robert Haas  wrote:

> On Fri, Mar 22, 2024 at 1:35 PM Bruce Momjian  wrote:
>
> But that all seems like a separate question from why we have the
> statistic collector views in a completely different part of the
> documentation from the rest of the system views. My guess is that it's
> just kind of a historical accident, but maybe there was some other
> logic to it.
>
>
The details under-pinning the cumulative statistics subsystem are
definitely large enough to warrant their own subsection. And it isn't like
placing them into the monitoring chapter is wrong and aside from a couple
of views those under System Views don't fit into what we've defined as
monitoring.  I don't have any desire to lump them under the generic system
views; which itself could probably use a level of categorization since the
nature of pg_locks and pg_cursors is decidedly different than pg_indexes
and pg_config.  This all becomes more appealing to work on once we solve
the problem of all sect2 entries being placed on a single page.

I struggled for a long while where I'd always look for pg_stat_activity
under system views instead of monitoring.  Amending my prior suggestion in
light of this I would suggest we move the Cumulative Statistics Views into
Reference but as its own Chapter, not part of System Views, and change its
name to "Monitoring Views" (going more generalized here feels like a win to
me). I'd move pg_locks, pg_cursors, pg_backend_memory_contexts,
pg_prepared_*, pg_shmem_allocations, and pg_replication_*.  Those all have
the same general monitoring nature to them compared to the others that
basically provide details regarding schema and static or session
configuration.

The original server admin monitoring section can go into detail regarding
Cumulative Statistics versus other kinds of monitoring.  We can use
section ordering to fulfill logical grouping desires until we are able to
make section3 entries appear on their own pages.

David J.


Re: documentation structure

2024-03-22 Thread David G. Johnston
On Fri, Mar 22, 2024, 09:32 Robert Haas  wrote:

>
>
> I notice that you say that the "Installation" section should "cover
> the architectural overview and point people to where they can find the
> stuff they need to install PostgreSQL in the various ways available to
> them" so maybe you're not imagining a four-sentence chapter, either.
>

Fair point but I posit that new users are looking for a chapter named
Installation in the documentation.  At least the ones willing to read
documentation.  Having two of them isn't needed but having zero doesn't
make sense either.

The current proposal does that so I'm ok as-is but it can be further
improved by moving source install talk elsewhere and having the
installation chapter redirect the reader there for details.  I'm not
concerned with how long or short the resultant installation chapter is.

David J.

>
>


Re: documentation structure

2024-03-22 Thread David G. Johnston
On Fri, Mar 22, 2024 at 7:10 AM Robert Haas  wrote:

>
> That's actually what we had in chapter
> 18, "Installation from Source Code on Windows", since removed. But for
> some reason we decided that on non-Windows platforms, it needed a
> whole new chapter rather than an extra sentence in the existing one. I
> think that's massively overkill.
>
>
I agree with the premise that we should have a single chapter, in the main
documentation flow, named "Installation".  It should cover the
architectural overview and point people to where they can find the stuff
they need to install PostgreSQL in the various ways available to them.  I
agree with moving the source installation material to the appendix.  None
of the sections under Installation would then actually detail how to
install the software since that isn't something the project itself handles
but has delegated to packagers for the vast majority of cases and the
source install details are in the appendix for the one "supported"
mechanism that most people do not use.

David J.


Re: documentation structure

2024-03-21 Thread David G. Johnston
On Thu, Mar 21, 2024 at 11:30 AM Robert Haas  wrote:

>
> My second thought is that the stuff from "VII. Internals" that I
> categorized as reference material should move into section "VI.
> Reference". I think we should also consider moving appendix F,
> "Additional Supplied Modules and Extensions," and appendix G,
> "Additional Supplied Programs" to the reference section.
>
>
For "VI. Reference" I propose the following Chapters:

SQL Commands
PL/pgSQL
Cumulative Statistics Views
System Views
System Catalogs
Client Applications
Server Applications
Modules and Extensions

-- Remove Appendix G (Programs) altogether and just note for the two that
are listed that they are in contrib as opposed to core.

-- The PostgreSQL qualifier doesn't seem helpful and once you add the
additional chapters its unusual presence stands out even more.

-- PL/pgSQL gets its own reference chapter since we wrote it.  Stuff like
Perl and Python have entire books that the user can consult as reference
material for those languages.

David J.


Re: documentation structure

2024-03-21 Thread David G. Johnston
On Wed, Mar 20, 2024 at 9:43 AM Robert Haas  wrote:

> On Tue, Mar 19, 2024 at 5:39 PM Andrew Dunstan 
> wrote:
> > +many for improving the index.
>
> Here's a series of four patches.


I reviewed the most recent set of 5 patches.


> Taken together, they cut down the
> number of numbered chapters from 76 to 68. I think we could easily
> save that much again if I wrote a few more patches along similar
> lines, but I'm posting these first to see what people think.
>
> 0001 removes the "Installation from Binaries" chapter. The whole thing
> is four sentences. I moved the most important information into the
> "Installation from Source Code" chapter and retitled it
> "Installation".
>

Makes sense


> 0002 removes the "Monitoring Disk Usage" chapter by folding it into
> the immediately-preceding "Monitoring Database Activity" chapter. I
> kind of feel like the "Monitoring Disk Usage" chapter might be in need
> of a bigger rewrite or just outright removal, but there's surely not
> enough content here to justify making it a top-level chapter.
>

Just going to note that the section on the cumulative statistics views
being a single page is still a strongly bothersome issue here.  Though the
quick fix actually requires upgrading the section to chapter status...

Maybe we can stub out that section in the "Monitoring Database Activity"
chapter and move that entire section after "System Views" in the Internals
part?

I agree with subordinating Monitoring Disk Usage.


> 0003 merges all of the "Internals" chapters whose names are the names
> of built-in index access methods (Btree, Gin, etc.) into a single
> chapter called "Built-In Index Access Methods". All of these chapters
> have a very similar structure and none of them are very long, so it
> makes a lot of sense, at least in my mind, to consolidate them into
> one.
>

One of the more impactful and wanted improvements, IMO.


> 0004 merges the "Generic WAL Records" and "Custom WAL Resource
> Managers" chapter together, creating a new chapter called "Write Ahead
> Logging for Extensions".
>
>
The positioning of this and the preceding Built-in Index Access Methods
chapter seem like they should be switched.

If this sticks we should add an introductory paragraph for the chapter.

and I've added a new 0005 that does essentially the same
> thing for the PL chapters.
>

The following page needs to be reworded to take the new structure into
account:

https://www.postgresql.org/docs/current/xfunc-pl.html

Not having pl/pgsql appear on the main ToC seems like a loss but the others
make sense and a special exception for it probably isn't warranted.

Maybe "pl/pgsql and Other Procedural Languages" as the title?

David J.


Re: REVOKE FROM warning on grantor

2024-03-16 Thread David G. Johnston
On Sat, Mar 16, 2024 at 1:00 PM Étienne BERSAC 
wrote:

>
> > The choice of warning is made because after the command ends the
> > grantmin question does not exist.  The revoke was a no-op and the
> > final state is as the user intended.
>
>
> Sorry, can you explain me what's the grantmin question is ?
>
>
That should have read:  the granted permission does not exist

David J.


Re: REVOKE FROM warning on grantor

2024-03-14 Thread David G. Johnston
On Thursday, March 14, 2024, Étienne BERSAC 
wrote:

>
> However, I'd prefer if Postgres fails properly. Because the GRANT is
> actually not revoked. This prevent ldap2pg to report an issue in
> handling privileges on such roles.
>
> What do you think of make this warning an error ?
>
>
The choice of warning is made because after the command ends the grantmin
question does not exist.  The revoke was a no-op and the final state is as
the user intended.  Historically doing this didn’t give any message at all
which was confusing so we added a warning so the semantics of not failing
were preserved but there was some indication that something was amiss.  I
don’t have a compelling argument to,change the long-standing behavior.
Client code can and probably should look for a show errors reported by the
backend.  It is indeed possibly to treat this warning more serverly than
the server chooses to.

David J.


Re: confusing `case when` column name

2024-03-12 Thread David G. Johnston
On Tuesday, March 12, 2024, adjk...@126.com  wrote:

>
> Nee we change the title of the case-when output column?
>
>
Choosing either a or b as the label seems wrong and probably worth changing
to something that has no meaning and encourages the application of a column
alias.

David J.


Re: Better error messages for %TYPE and %ROWTYPE in plpgsql

2024-02-26 Thread David G. Johnston
On Mon, Feb 26, 2024 at 6:54 PM Andy Fan  wrote:

>
> "David G. Johnston"  writes:
>
> > On Mon, Feb 26, 2024 at 5:46 PM Andy Fan  wrote:
> >
> >  > Per recent discussion[1], plpgsql returns fairly unhelpful "syntax
> >  > error" messages when a %TYPE or %ROWTYPE construct references a
> >  > nonexistent object.  Here's a quick little finger exercise to try
> >  > to improve that.
> >
> >  Looks this modify the error message, I want to know how ould we treat
> >  error-message-compatible issue during minor / major upgrade.
> >
> > There is no bug here so no back-patch; and we are not yet past feature
> freeze for v17.
>
> Acutally I didn't asked about back-patch.


What else should I be understanding when you write the words "minor
upgrade"?


> So if the error message is changed, the above code may be broken.
>
>
A fair point to bring up, and is change-specific.  User-facing error
messages should be informative and where they are not changing them is
reasonable.  Runtime errors probably need more restraint since they are
more likely to be in a production monitoring alerting system but anything
that is reporting what amounts to a syntax error should be reasonable to
change and not expect people to be writing production code looking for
them.  This seems to fall firmly into the "badly written code"/syntax
category.

David J.


Re: Better error messages for %TYPE and %ROWTYPE in plpgsql

2024-02-26 Thread David G. Johnston
On Mon, Feb 26, 2024 at 5:46 PM Andy Fan  wrote:

> > Per recent discussion[1], plpgsql returns fairly unhelpful "syntax
> > error" messages when a %TYPE or %ROWTYPE construct references a
> > nonexistent object.  Here's a quick little finger exercise to try
> > to improve that.
>
> Looks this modify the error message, I want to know how ould we treat
> error-message-compatible issue during minor / major upgrade.
>

There is no bug here so no back-patch; and we are not yet past feature
freeze for v17.

David J.


Re: Things I don't like about \du's "Attributes" column

2024-02-16 Thread David G. Johnston
On Mon, Feb 12, 2024 at 2:29 PM Pavel Luzanov 
wrote:

>  regress_du_role1 | no| Inherit | no| 2024-12-31 
> 00:00:00+03(invalid) | 50   | Group role without password but 
> with valid until
>  regress_du_role2 | yes   | Inherit | yes   | 
> | Not allowed  | No connections allowed
>  regress_du_role3 | yes   | | yes   | 
> | 10   | User without attributes
>  regress_du_su| yes   | Superuser  +| yes   | 
> | 3(ignored)   | Superuser but with connection limit
>
>
Per the recent bug report, we should probably add something like (ignored)
after the 50 connections for role1 since they are not allowed to login so
the value is indeed ignored.  It is ignored to zero as opposed to unlimited
for the Superuser so maybe a different word (not allowed)?

David J.


Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

2024-02-16 Thread David G. Johnston
On Fri, Feb 16, 2024 at 1:16 PM Jim Jones  wrote:

> In case all columns of a record have been set to null due to data type
> incompatibility, should we insert it at all?


Yes.  In particular not all columns in the table need be specified in the
copy command so while the parsed input data is all nulls the record itself
may not be.

The system should allow the user to exclude rows with incomplete data by
ignoring a not null constraint violation.

In short we shouldn't judge non-usefulness and instead give tools to the
user to decide for themselves.

David J.


Re: Function and Procedure with same signature?

2024-02-09 Thread David G. Johnston
On Fri, Feb 9, 2024, 12:05 Deepak M  wrote:

> Hello Hackers,
>

Wrong list, this is for discussions regarding patches.



> Folks, When tried to create a function with the same signature as
> procedure it fails.
>

That seems like a good hint you cannot do it.  Specifically because they
get defined in the same internal catalog within which names must be unique.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-02-07 Thread David G. Johnston
On Wednesday, February 7, 2024, Joel Jacobson  wrote:

>
> On Fri, Sep 8, 2023, at 23:43, Magnus Hagander wrote:
> > We need a "allowlist" of things a user can do, rather than a blocklist
> > of "they can do everything they can possibly think of and a computer
> > is capable of doing, except for this one specific thing". Blocklisting
> > individual permissions of a superuser will never be secure.
>
> +1 for preferring an "allowlist" approach over a blocklist.
>

The status quo is allow everything so while the theory is nice it seems
that requiring it to be allowlist is just going to scare anyone off of
actually improving matters.

Also, this isn’t necessarily about blocking the superuser, it is about
effectively disabling features deemed undesirable at runtime.  All features
enabled by default seems like a valid policy.

While the only features likely to be disabled are those involving someone’s
definition of security the security policy is still that superuser can do
everything the system is capable of doing.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-02-06 Thread David G. Johnston
On Tue, Feb 6, 2024 at 7:10 AM Peter Eisentraut 
wrote:

>
> How about ALTER SYSTEM is disabled if the file
> postgresql.auto.conf.disabled exists? This is somewhat similar to making
> the file read-only, but doesn't risk other tools breaking when they
> encounter such a file.  And it's more obvious and self-explaining.
>

A separate configuration file would be self-documenting and able to always
exist; the same properties as postgres.conf

ISTM the main requirement regardless of how the file system API is designed
- assuming there is a filesystem API - is that the running postgres process
be unable to write to the file.  It seems immaterial how the OS admin
accomplishes that goal.

The command line argument method seems appealing but it seems harder in
that case to ensure that the postgres process be disallowed from modifyIng
whatever file defines what should be run.

One concern with a file configuration is that if we require it to be
present in the data directory that goes somewhat against the design of
allowing configuration files to be placed anywhere by changing the
config_file guc.

Any design should factor in the almost immediate need to be extended to
prevent copy variants that touch the local filesystem or shell directly.

I was pondering a directory in pgdata where you could add *.disabled files
indicating which features to disable.  This is a bit more pluggable than a
single configuration file but the later still seems better to me.

David J.


Re: Grant read-only access to exactly one database amongst many

2024-02-05 Thread David G. Johnston
On Monday, February 5, 2024, Graham Leggett  wrote:

>
> Also, how do you handle the race condition between the time a database db3
> is created, and the the time all readonly users have their access revoked
> to db3?
>
>
You alter the default privileges for the system so PUBLIC does not get
connect privileges on newly created databases.

David J.

p.s. this mailing list is for discussing patches, discussions Lon how to
use the system belong on the -general list.


Re: Grant read-only access to exactly one database amongst many

2024-02-04 Thread David G. Johnston
On Sun, Feb 4, 2024 at 5:04 PM Graham Leggett  wrote:

> Hi all,
>
> I have a postgresql 15 instance with two databases in it, and I have a
> need to grant read-only access to one of those databases to a given user.
>
> To do this I created a dedicated role for readonly access to the database
> db1:
>
> CREATE ROLE "dv_read_db1"
> GRANT CONNECT ON DATABASE db1 TO dv_read_db1
>

This grant is basically pointless since by default all roles can connect
everywhere via the PUBLIC pseudo-role.  You need to revoke that grant, or
even alter it being given out by default.



> Trouble is, I can create tables in db1 which is write access.


Since in v15 PUBLIC also gets CREATE on the public schema.

I can also connect to db2 (bad),


See my comment regarding the pointless grant in a default setup.

and I can enumerate the tables in db2 (bad),
>

Connect privilege grants reading all catalog data by design.


> I appears the mechanism I am using above has insecure side effects.
>

It has, from your expectation, insecure defaults which you never changed.
We changed public schema in v16 but the ease-of-use database connecting
remains.

David J.


Re: Postgres and --config-file option

2024-02-02 Thread David G. Johnston
On Fri, Feb 2, 2024 at 2:23 PM David G. Johnston 
wrote:

> On Mon, Jan 15, 2024 at 4:35 AM Aleksander Alekseev <
> aleksan...@timescale.com> wrote:
>
>> PFA the patch. It's short but I think it mitigates the problem.
>>
>>
> I took a look at where these options are discussed in the documentation
> and now feel that we should make these options clear more broadly (config
> and libpq, plus pointing to --name from -c in a couple of places).  It
> doesn't add much verbosity and, frankly, if I was to pick one
> "--name=value" would win and so I'd rather document it, leaving -c alone
> for historical reasons.
>
> I've attached a replacement patch with the additional changes.
>
>
And I just saw one more apparently undocumented requirement (or a typo)

You must specify the --config-file

The actual parameter is "config_file", so apparently we are supposed to
either convert underscores to hyphens or we have a typo.

David J.


Re: Postgres and --config-file option

2024-02-02 Thread David G. Johnston
On Mon, Jan 15, 2024 at 4:35 AM Aleksander Alekseev <
aleksan...@timescale.com> wrote:

> PFA the patch. It's short but I think it mitigates the problem.
>
>
I took a look at where these options are discussed in the documentation and
now feel that we should make these options clear more broadly (config and
libpq, plus pointing to --name from -c in a couple of places).  It doesn't
add much verbosity and, frankly, if I was to pick one "--name=value" would
win and so I'd rather document it, leaving -c alone for historical reasons.

I've attached a replacement patch with the additional changes.

David J.
From adc6c807d2bc0d73271b47e4d1908e5c069e5b24 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Fri, 2 Feb 2024 14:16:24 -0700
Subject: [PATCH] v4 configs review

---
 doc/src/sgml/config.sgml   | 11 ++-
 doc/src/sgml/libpq.sgml|  7 ---
 doc/src/sgml/ref/postgres-ref.sgml |  8 +---
 src/backend/main/main.c|  4 ++--
 src/backend/utils/misc/guc.c   |  2 +-
 5 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5..1eb58c0793 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -333,9 +333,10 @@ UPDATE pg_settings SET setting = reset_val WHERE name = 'configuration_parameter
   
During server startup, parameter settings can be
passed to the postgres command via the
-   -c command-line parameter.  For example,
+   -c name=value command-line parameter, or its equivalent
+   --name=value variation.  For example,
 
-postgres -c log_connections=yes -c log_destination='syslog'
+postgres -c log_connections=yes --log_destination='syslog'
 
Settings provided in this way override those set via
postgresql.conf or ALTER SYSTEM,
@@ -352,10 +353,10 @@ postgres -c log_connections=yes -c log_destination='syslog'
   of the session, but do not affect other sessions.
   For historical reasons, the format of PGOPTIONS is
   similar to that used when launching the postgres
-  command; specifically, the -c flag must be specified.
-  For example,
+  command; specifically, the -c , or prepended
+  --, before the name must be specified. For example,
 
-env PGOPTIONS="-c geqo=off -c statement_timeout=5min" psql
+env PGOPTIONS="-c geqo=off --statement_timeout=5min" psql
 
  
 
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc..89f7aa590d 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1374,9 +1374,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   

 Specifies command-line options to send to the server at connection
-start.  For example, setting this to -c geqo=off sets the
-session's value of the geqo parameter to
-off.  Spaces within this string are considered to
+start.  For example, setting this to -c geqo=off
+or --geoq=off sets the session's value of the
+geqo parameter to off.
+Spaces within this string are considered to
 separate command-line arguments, unless escaped with a backslash
 (\); write \\ to represent a literal
 backslash.  For a detailed discussion of the available
diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml
index b13a16a117..cbf06d86cb 100644
--- a/doc/src/sgml/ref/postgres-ref.sgml
+++ b/doc/src/sgml/ref/postgres-ref.sgml
@@ -123,7 +123,8 @@ PostgreSQL documentation
 described in . Most of the
 other command line options are in fact short forms of such a
 parameter assignment.  -c can appear multiple times
-to set multiple parameters.
+to set multiple parameters.  See the --name
+option below for an alternate syntax.

   
  
@@ -342,8 +343,9 @@ PostgreSQL documentation
   --name=value
   

-Sets a named run-time parameter; a shorter form of
--c.
+Sets the named run-time parameter; a shorter form of
+-c.  See 
+for a listing of parameters.

   
  
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 51ffb8e773..f1b88faa5c 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -324,7 +324,7 @@ help(const char *progname)
 	printf(_("Usage:\n  %s [OPTION]...\n\n"), progname);
 	printf(_("Options:\n"));
 	printf(_("  -B NBUFFERSnumber of shared buffers\n"));
-	printf(_("  -c NAME=VALUE  set run-time parameter\n"));
+	printf(_("  -c NAME=VALUE  set run-time parameter (see also --NAME)\n"));
 	printf(_("  -C NAMEprint value of run-time parameter, then exit\n"));
 	printf(_("  -d 1-5 debugging level\n"));
 	printf(_("  -D DATADIR database directory\n"));

Re: to_regtype() Raises Error

2024-02-02 Thread David G. Johnston
On Mon, Jan 29, 2024 at 8:45 AM David E. Wheeler 
wrote:

> Hey there, coming back to this. I poked at the logs in the master branch
> and saw no mention of to_regtype; did I miss it?
>

With no feedback regarding my final suggestion I lost interest in it and
never produced a patch.


> On Sep 17, 2023, at 10:58 PM, David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
> > Parses a string of text, extracts a potential type name from it, and
> translates that name into an OID.  Failure to extract a valid potential
> type name results in an error while a failure to determine that the
> extracted name is known to the system results in a null output.
> >
> > I take specific exception to describing your example as a “textual type
> name”.
>
> More docs seem like a reasonable compromise. Perhaps it’d be useful to
> also describe when an error is likely and when it’s not.
>
>
Seems like most just want to leave well enough alone and deal with the rare
question for oddball input on the mailing list.  If you are interested
enough to come back after 4 months I'd suggest you write up and submit a
patch.  I'm happy to review it and see if that is enough to get a committer
to respond.

David J.


Re: Small fix on COPY ON_ERROR document

2024-02-02 Thread David G. Johnston
On Thu, Feb 1, 2024 at 11:59 PM Yugo NAGATA  wrote:

>
> I attached a updated patch including fixes you pointed out above.
>
>
Removed "which"; changed "occupying" to "occupy"
Removed on of the two "amounts"
Changed "unacceptable to the input function" to just "converting" as that
is what the average reader is more likely to be thinking.
The rest was good.

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 3c2feaa11a..55764fc1f2 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -385,8 +385,8 @@ COPY { table_name [ ( ON_ERROR
 
  
-  Specifies which how to behave when encountering an error due to
column values
-  unacceptable to the input function of each attribute's data type.
+  Specifies how to behave when encountering an error converting a
column's
+  input value into its data type.
   An error_action value of
   stop means fail the command, while
   ignore means discard the input row and continue
with the next one.
@@ -589,7 +589,7 @@ COPY count
 The COPY FROM command physically inserts input rows
 into the table as it progresses.  If the command fails, these rows are
 left in a deleted state; these rows will not be visible, but still
-occupying disk space. This might amount to a considerable amount of
+occupy disk space. This might amount to considerable
 wasted disk space if the failure happened well into a large copy
 operation. VACUUM should be used to recover the
 wasted space.

David J.


[Doc] Improve hostssl related descriptions and option presentation

2024-02-01 Thread David G. Johnston
Motivated by a recent complaint [1] I found the hostssl related material in
our docs quite verbose and even repetitive.  Some of that is normal since
we have both an overview/walk-through section as well as a reference
section.  But the overview in particular was self-repetitive.  Here is a
first pass at some improvements.  The commit message contains both the
intent of the patch and some thoughts/questions still being considered.

David J.

[1]
https://www.postgresql.org/message-id/170672433664.663.11895343120533141715%40wrigleys.postgresql.org
From 5bd53027d5f1cbe9021c9b4f7abe3be5792589dd Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Thu, 1 Feb 2024 15:24:30 -0700
Subject: [PATCH] docs: improve hostssl related descriptions and option
 presentation

runtime.sgml discussion regarding SSL was repetitive with itself
and presented in a problematic order - with discussion about
setting up the server configuration happening before a broad
overview of what all of the moving parts are.  Put describing
the two approaches first, with links to the reference material,
and then discuss the implementation detail of a trusted root
certificate.

client-auth.sgml treatment of hostssl ssl auth options and
the cert auth method is confusing.  They are related and so
discuss them once, together.  The cert method page provides
a nice place to isolate these, in addition to a largely
repeated discussion on the runtime.sgml page which points
back to this one spot for the extra layer of details for,
in particular, the user mapping.

NOTES:

I left the "comparison is done with the DN in RFC"
paragraph content as-is though I think its presence or
content needs re-evaluation.

I moved the wording regarding "trust+clientcert" to
the runtime page as that material fits better in
overview than reference.  I also changed it from
words to an actual pg_hba.conf line; which seems
a bit out of place but going with it for now.

The method pages are all children of the pg_hba.conf
page so if you land on cert from the runtime page
going from their directly to pg_hba.conf, or
realizing that you are basically in a section
that only pertains to pg_hba.conf, is not that
obvious.  Seems like these pages should have
xrefs added to them giving that context and
linking back to pg_hba.conf directly.
---
 doc/src/sgml/client-auth.sgml | 106 ++
 doc/src/sgml/runtime.sgml |  87 
 2 files changed, 92 insertions(+), 101 deletions(-)

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 56747c0e36..638c8e7057 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -621,8 +621,9 @@ include_dir directory
 cert
 
  
-  Authenticate using SSL client certificates. See
-   for details.
+  Authenticate the hostssl connection attempt using only the SSL certificate.
+  See  for details regarding the auth-options
+  any hostssl connection line can specify.
  
 

@@ -660,45 +661,15 @@ include_dir directory
After the auth-method field, there can be field(s) of
the form name=value that
specify options for the authentication method. Details about which
-   options are available for which authentication methods appear below.
+   options are available for which authentication methods appear on their
+   individual detail pages.
   
 
   
-   In addition to the method-specific options listed below, there is a
-   method-independent authentication option clientcert, which
-   can be specified in any hostssl record.
-   This option can be set to verify-ca or
-   verify-full. Both options require the client
-   to present a valid (trusted) SSL certificate, while
-   verify-full additionally enforces that the
-   cn (Common Name) in the certificate matches
-   the username or an applicable mapping.
-   This behavior is similar to the cert authentication
-   method (see ) but enables pairing
-   the verification of client certificates with any authentication
-   method that supports hostssl entries.
-  
-  
-   On any record using client certificate authentication (i.e. one
-   using the cert authentication method or one
-   using the clientcert option), you can specify
-   which part of the client certificate credentials to match using
-   the clientname option. This option can have one
-   of two values. If you specify clientname=CN, which
-   is the default, the username is matched against the certificate's
-   Common Name (CN). If instead you specify
-   clientname=DN the username is matched against the
-   entire Distinguished Name (DN) of the certificate.
-   This option is probably best used in conjunction with a username map.
-   The comparison is don

Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread David G. Johnston
On Tuesday, January 30, 2024, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Tuesday, January 30, 2024, Tom Lane  wrote:
> >> My larger point here is that trying to enforce restrictions on
> >> superusers *within* Postgres is simply not a good plan, for
> >> largely the same reasons that Robert questioned making the
> >> GUC mechanism police itself.  It needs to be done outside,
> >> either at the filesystem level or via some other kernel-level
> >> security system.
>
> > The idea of adding a file to the data directory appeals to me.
> >
> > optional_runtime_features.conf
> > alter_system=enabled
> > copy_from_program=enabled
> > copy_to_program=disabled
>
> ... so, exactly what keeps an uncooperative superuser from
> overwriting that file?
>
> You cannot enforce such restrictions within Postgres.
> It has to be done by an outside mechanism.  If you think
> different, you are mistaken.
>

If the only user on the OS that can modify that file is root, how does the
superuser, who is hard coded to not be root, modify it?  The root/admin
user on the OS and it’s filesystem permissions is the outside mechanism
being suggested here.

If the complaint is that the in-memory boolean is changeable by the
superuser, or even the logic pertaining to the error branch of the code,
then yes this is a lost cause.

But root prevents superuser from controlling that file and then that file
can prevent the superuser from escaping to the operating system and
leveraging its OS postgres user.

David J.


Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread David G. Johnston
On Tuesday, January 30, 2024, Tom Lane  wrote:
>
>
> My larger point here is that trying to enforce restrictions on
> superusers *within* Postgres is simply not a good plan, for
> largely the same reasons that Robert questioned making the
> GUC mechanism police itself.  It needs to be done outside,
> either at the filesystem level or via some other kernel-level
> security system.
>
>
The idea of adding a file to the data directory appeals to me.

optional_runtime_features.conf
alter_system=enabled
copy_from_program=enabled
copy_to_program=disabled

If anyone tries to use disabled features the system emits an error:

ERROR: Cannot send copy output to program, action disabled by host.

My main usability question is whether restart required is an acceptable
restriction.

Making said file owned by root (or equivalent) and only readable by the
postgres process user suffices to lock it down.  Refusing to start if the
file is writable, and at least one feature is disabled can be considered,
with a startup option to bypass that check if desired.

David J.


Small fix on COPY ON_ERROR document

2024-01-28 Thread David G. Johnston
On Sunday, January 28, 2024, Yugo NAGATA  wrote:

> On Fri, 26 Jan 2024 08:04:45 -0700
> "David G. Johnston"  wrote:
>
> > On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA  wrote:
> >
> > > On Fri, 26 Jan 2024 00:00:57 -0700
> > > "David G. Johnston"  wrote:
> > >
> > > > I will need to make this tweak and probably a couple others to my own
> > > > suggestions in 12 hours or so.
> > > >
> > >
> > >
> > And here is my v2.
> >
> > Notably I choose to introduce the verbiage "soft error" and then define
> in
> > the ON_ERROR clause the specific soft error that matters here - "invalid
> > input syntax".
>
> I am not sure we should use "soft error" without any explanation
> because it seems to me that the meaning of words is unclear for users.


Agreed. It needs to be added to the glossary.



>
> Also, I think "invalid input syntax" is a bit ambiguous. For example,
> COPY FROM raises an error when the number of input column does not match
> to the table schema, but this error is not ignored by ON_ERROR while
> this seems to fall into the category of "invalid input syntax".



It is literally the error text that appears if one were not to ignore it.
It isn’t a category of errors.  But I’m open to ideas here.  But being
explicit with what on actually sees in the system seemed preferable to
inventing new classification terms not otherwise used.


>
> So, keeping consistency with the existing description, we can say:
>
> "Specifies which how to behave when encountering an error due to
>  column values unacceptable to the input function of each attribute's
>  data type."


Yeah, I was considering something along those lines as an option as well.
But I’d rather add that wording to the glossary.



> Currently, ON_ERROR doesn't support other soft errors, so it can explain
> it more simply without introducing the new concept, "soft error" to users.
>
>
Good point.  Seems we should define what user-facing errors are ignored
anywhere in the system and if we aren’t consistently leveraging these in
all areas/commands make the necessary qualifications in those specific
places.



> > I also note the log message behavior when ignore mode is chosen.  I
> haven't
> > confirmed that it is accurate but that is readily tweaked if approved of.
> >
>
> +  An INFO level context message containing the
> ignored row count is
> +  emitted at the end of the COPY FROM if at least
> one row was discarded.
>
>
> The log level is NOTICE not INFO.


Makes sense, I hadn’t experimented.


> I think "left in a deleted state" is also unclear for users because this
> explains the internal state but not how looks from user's view.How about
> leaving the explanation "These rows will not be visible or accessible" in
> the existing statement?
>

Just visible then, I don’t like an “or” there and as tuples at least they
are accessible to the system, in vacuum especially.  But I expected the
user to understand “as if you deleted it” as their operational concept more
readily than visible.  I think this will be read by people who haven’t read
MVCC to fully understand what visible means but know enough to run vacuum
to clean up updated and deleted data as a rule.

David J.


Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

2024-01-28 Thread David G. Johnston
On Sun, Jan 28, 2024 at 4:51 PM jian he  wrote:

> On Fri, Jan 26, 2024 at 11:09 PM David G. Johnston
>  wrote:
> >
> > Hi,
> >
> > The option choice of "ignore" in the COPY ON_ERROR clause seems overly
> generic.  There would seem to be two relevant ways to ignore bad column
> input data - drop the entire row or just set the column value to null.  I
> can see us wanting to provide the set to null option and in any case having
> the option name be explicit that it ignores the row seems like a good idea.
>
> two issue I found out while playing around with it;
> create table x1(a int not null, b int not null );
>
> another issue:
> COPY x1 from stdin (on_error null);
>
> when we already have `not null` top level constraint for table x1.
> Do we need an error immediately?
> "on_error null" seems to conflict with `not null` constraint (assume
> refers to the same column).
> it may fail while doing bulk inserts while on_error is set to null
> because of violating a not null constraint.
>

You should not error immediately since whether or not there is a problem is
table and data dependent.  I would not check for the case of all columns
being defined not null and just let the mismatch happen.

That said, maybe with this being a string we can accept something like:
'null, ignore'

And so if attempting to place any one null fails, assuming we can make that
a soft error too, we would then ignore the entire row.

David J.


Re: Things I don't like about \du's "Attributes" column

2024-01-28 Thread David G. Johnston
On Sun, Jan 28, 2024 at 1:29 PM Pavel Luzanov 
wrote:

> I'd suggest pulling out this system view change into its own patch.
>
>
> But within this thread or new one?
>
>
>
Thread.  The subject line needs to make clear we are proposing changing a
system view.

The connection limit can be set to 0. What should be displayed in this
case, blank or 0?
>
> 0 or even "not allowed" to make it clear


> The connection limit can be set for superusers. What should be displayed in 
> this case,
> blank or actual non-effective value?
>
> print "# (ignored)" ?


CREATE|ALTER ROLE commands allow incorrect values to be set for 'Conn
limit' and 'Valid until'.
> How can the administrator see them and fix them?
>
>
That is unfortunate...but they can always go look at the actual system
view.  Or do what i showed above and add (invalid) after the real value.
Note I'm only really talking about -1 here being the value that is simply
hidden from display since it means unlimited and not actually -1

I'd be more inclined to print "forever" for valid until since the existing
presentation of a timestamp is already multiple characters. Using a word
for a column that is typically a number is less appealing.

David J.


Re: psql: add \create_function command

2024-01-26 Thread David G. Johnston
On Fri, Jan 26, 2024 at 12:23 PM Tom Lane  wrote:

>
> \set fbody `cat source_file.txt`
> CREATE FUNCTION foo() RETURNS whatever AS :'fbody' LANGUAGE ...;
>
> and maybe we should say that that's sufficient.


I really don't have a problem, and kinda prefer, using psql variables this
way but feel much more comfortable not having to invoke a shell.


>   It's a bit
> klugy though.  One level of improvement could be to get rid
> of the dependency on "cat" by inventing a backslash command
> to read a file into a variable:
>
> \file_read fbody source_file.txt
>

This I would use to reliably read external json text files into a psql
variable so that I could use jsonb_to_recordset(:var) on the contents.


> (\file_write to go the other way seems potentially useful too.)
>

The nearby discussions regarding trying to produce json into files would
support this claim.


> Or we could cut out the intermediate variable altogether
> by inventing something that works like :'...' but reads
> from a file not a variable.  That might be too specialized
> though, and I'm not sure about good syntax for it either.
> Maybe like
>
> CREATE FUNCTION foo() RETURNS whatever AS :{source_file.txt} LANGUAGE ...;
>
>
IMO, not enough improvement to be had over letting psql variables act as
the intermediary to justify the effort.

David J.


Re: Add new error_action COPY ON_ERROR "log"

2024-01-26 Thread David G. Johnston
On Thu, Jan 25, 2024 at 9:42 AM torikoshia 
wrote:

> Hi,
>
> As described in 9e2d870119, COPY ON_EEOR is expected to have more
> "error_action".
> (Note that option name was changed by b725b7eec)
>
> I'd like to have a new option "log", which skips soft errors and logs
> information that should have resulted in errors to PostgreSQL log.
>
>
Seems like an easy win but largely unhelpful in the typical case.  I
suppose ETL routines using this feature may be running on their machine
under root or "postgres" but in a system where they are not this very
useful information is inaccessible to them.  I suppose the DBA could set up
an extractor to send these specific log lines elsewhere but that seems like
enough hassle to disfavor this approach and favor one that can place the
soft error data and feedback into user-specified tables in the same
database.  Setting up temporary tables or unlogged tables probably is going
to be a more acceptable methodology than trying to get to the log files.

David J.


  1   2   3   4   5   6   7   8   9   10   >