Re: Add a different archive_command example for Linux / Unix

2024-02-08 Thread Stephen Frost
Greetings,

* gp...@free.fr (gp...@free.fr) wrote:
> Thanks Stephen for your detailed reply and broad perspective.
> But I see the cp example command used **as is** most of the time.

In those cases- how would changing it to be a dd command be helpful?
The directory still wouldn't be fsync'd and there's a very good chance
that the rest of the documentation isn't followed or understood either,
leading almost certainly to broken backup setups.  This wouldn't be the
only issue in any case, to be sure.

This comes back to my earlier suggestion that perhaps we should just
change it to something like:

archive_command = 'backup_tool %p /mnt/server/archivedir/%f'

and not talk about specific tools that exist but don't perform in the
manner we actually expect from an archive command that we're using.  We
already make it pretty clear to anyone who knows the tools mentioned
that the 'example' command won't work, if you read everything under that
section.

Alternatively, we could actually document the tools we're aware of that
do work and which do strive, at least, to try and be good backup tools
and good archive commands for PG.  That would certainly be a service to
our users and might result in far fewer misconfigured systems using the
examples because they thought (despite the explicit note in our
documentation) that they were recommendations.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Add a different archive_command example for Linux / Unix

2024-02-08 Thread Stephen Frost
Greetings,

* PG Doc comments form (nore...@postgresql.org) wrote:
> Hello,
> in
> https://www.postgresql.org/docs/16/continuous-archiving.html#BACKUP-ARCHIVING-WAL
> the example given could be improved for Linux / Unix environment.
> 
> As cp command is buffered it means it could return success although the data
> didn't reach the disk
> which puts backups at risk.

Yup.

> I propose to use dd command with its fsync option.
> 
> So the actual equivalent example would be :
> 
> archive_command = 'dd if=%p of=/mnt/server/archivedir/%f bs=1M
> conv=fsync,excl status=none' # Unix
> 
> What do you think ?

This doesn't fsync the directory though, for one thing, and there are
other considerations beyond that when having archive_command run and
more generally when doing backups with PG.  In short, the example in the
documentation is not to ever be used but is intended to show how the
replacement is done when the command is called, so that backup tool
authors know how it works.

In reality though, to write backup software for PG, you really do need
to know PG in much more detail than the documentation provides, which
means reading the source- for example, backup software should be
checking the pg_control file's CRC as it's possible to read it just as
it's being written and end up with an invalid pg_control file in the
backup, making the backup invalid.  There's been some discussion about
how to improve this situation but nothing exists today from PG, so
backup authors have to handle it.  This is just one example, there are
lots of others- unlogged table handling, temporary file handling, etc,
etc.

I'd strongly recommend using one of the existing well maintained backup
tools which have been written specifically for PG for your backups.
Writing a new backup tool for PG is a good bit of work and isn't
really reasonable to do with shell scripts or simple unix commands.

I do feel that we could improve the documentation around this by
dropping comments like "using any convenient file-system-backup tool
such as tar or cpio" as those don't, for example, support any way to
reasonably deal with unlogged tables by themselves.  Technically you
could scan the data directory and provide an exclude file, or not
include unlogged table files in the list of files to include, but then
you're starting to get into things like how to tell if a file is
associated with an unlogged table or not and while that's deep in the
documentation, we don't make any mention or reference to unlogged tables
in the backup documentation.  Perhaps an addition to the low-level
documentation under 'Backing Up The Data Directory' along these lines
would be helpful:

#
You should omit any unlogged relation files (other than the 'init' fork)
as they will be reset to be empty upon recovery and backing them up will
simply increase your backup size (potentially significantly) and slow
down the restore process.  Unlogged tables have an init fork (link to
storage-init.html) which is a file with the same filenode number as the
relation but with a suffix added of '_init' (link to
storage-file-layout.html).  When an '_init' fork exists for a given
relation, the '_init' file should be included in the backup, but all
other files for that relation (the 'main' fork, which does not have a
suffix, and all other forks which exist other than the 'init' fork)
should be excluded from the backup.
#

There's also no way for tar or cpio to directly validate that the copy
of pg_control that they copied is valid.  Worse, on a restore, they'll
restore pg_control more-or-less whenever and then if the restore doesn't
complete for whatever reason, you might end up with a cluster that can
be started, run for a while, but be missing whole tables.  While it was
only demonstrated relatively recently that the pg_control file can, in
fact, be invalid when read during a backup, it's a real issue that's
been around for, probably, forever, and there isn't really a good way to
address it today.  Still, perhaps we should include in the
documentation, again under the 'Backup Up The Data Directory', something
like:

#
You must be sure to check that the copy of pg_control which was
copied is valid.  The pg_controldata tool can be used for this purpose-
after making the copy of the 'data_dir/global/pg_control' file which
will be included in the backup, restore it into a new directory
'test_data_dir/global' and then run 'pg_controldata' on 'test_data_dir'
and ensure that no error is returned.  If an error is returned, attempt
to re-copy the 'data_dir/global/pg_control' file and test again (during
the backup), or fail the backup entirely.  A backup without a valid
pg_control file is not able to be restored.

Further, on restore, it is strongly recommended to restore the
pg_control file last and only after the successfully restoring the rest
of the database, to prevent the database from being started either while
the restore is happening or in the event that not all of the 

Re: unclear wording re: spoofing prevention on network connections

2023-12-09 Thread Stephen Frost
Greetings,

On Sat, Dec 9, 2023 at 18:02 David G. Johnston 
wrote:

> On Saturday, December 9, 2023, Stephen Frost  wrote:
>
>>
>>
>> The idea is that you can use both TLS and GSSAPI-with-encryption at the
>> same time within a given cluster for connections but you wouldn’t use them
>> on the same connection.  Certainly would welcome suggestions as to the best
>> way to phrase that.
>>
>
> It isn’t really connection driven though - or even specific to these two
> options.  The pg_hba.conf file can contain any number of different
> authentication methods that are usable simultaneously (from the perspective
> of the cluster).  But a given login request is only going to match a single
> one of those lines; so it isn’t like the client somehow decides during each
> login using the same machine and user name which way they are going to
> verify who they say they are.
>

Not sure how it isn’t connection driven- as mentioned, the options are
available and can be used at the same time but only on independent
connections.

>
Didn’t mean to suggest it was really up to the client.  Also- you can use
TLS with GSSAPI, just not with GSSAPI-with-encryption.

We don’t call out being able to use password and peer simultaneously, the
> description and specification of the pg_hba.conf file itself imparts that
> information.  I’m unclear why these two would warrant a special calling out.
>

Cross verification between client and server with encryption…. Perhaps
SCRAM with channel binding and TLS could also be listed.  Password doesn’t
provide this security, nor does LDAP, nor PAM. Peer doesn’t provide
encryption (tho fair that it isn’t really necessary, but that doesn’t make
it meet the criteria or intention of this).

Thanks,

Stephen

>


Re: unclear wording re: spoofing prevention on network connections

2023-12-09 Thread Stephen Frost
Greetings,

On Sat, Dec 9, 2023 at 17:29 Bruce Momjian  wrote:

> On Fri, Dec  8, 2023 at 05:42:27PM +, PG Doc comments form wrote:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/16/preventing-server-spoofing.html
> > Description:
> >
> > When I read:
> > To prevent spoofing on TCP connections, either use SSL certificates and
> make
> > sure that clients check the server's certificate, or use GSSAPI
> encryption
> > (or both, if they're on separate connections).
> >
> > It takes some thought to figure out what "separate connections" are being
> > referred to.  Does it mean separate TLS connection and
> > non-tls-with-gssapi-encryption?


Short answer here is “yes, you understand correctly.”

I have no idea.  It was added in this commit:


…

Agreed that the wording isn’t great.

The idea is that you can use both TLS and GSSAPI-with-encryption at the
same time within a given cluster for connections but you wouldn’t use them
on the same connection.  Certainly would welcome suggestions as to the best
way to phrase that.

Thanks,

Stephen

>


Re: User mapping security

2023-11-09 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jul 16, 2019 at 02:01:00AM +, PG Doc comments form wrote:
> > The following documentation comment has been logged on the website:
> > 
> > Page: https://www.postgresql.org/docs/10/postgres-fdw.html
> > Description:
> > 
> > Regarding the documentation pages
> > https://www.postgresql.org/docs/10/postgres-fdw.html?origin_team=T02HEPYKQ
> > and https://www.postgresql.org/docs/10/sql-createusermapping.html
> > 
> > I suppose it should be warned on the pages that foreign credentials with be
> > stored as simple text and will be available for viewing in pg_user_mappings.
> 
> I know this is four years old, but the attached patch documents it.  I
> don't think postgresql-fdw needs it since it relies on user mapping and
> discourages passwords in the connection string.

A bit on the fence about it ... but I do wonder if we should encourage
use of gssapi and credential delegation now that we support that and
point out that storing passwords isn't required if you're using gssapi.

Thanks,

Stephen

> diff --git a/doc/src/sgml/ref/create_user_mapping.sgml 
> b/doc/src/sgml/ref/create_user_mapping.sgml
> index 55debd5401..e93bfe48f6 100644
> --- a/doc/src/sgml/ref/create_user_mapping.sgml
> +++ b/doc/src/sgml/ref/create_user_mapping.sgml
> @@ -92,7 +92,11 @@ CREATE USER MAPPING [ IF NOT EXISTS ] FOR {  class="parameter">user_
>This clause specifies the options of the user mapping.  The
>options typically define the actual user name and password of
>the mapping.  Option names must be unique.  The allowed option
> -  names and values are specific to the server's foreign-data wrapper.
> +  names and values are specific to the server's foreign-data
> +  wrapper.  Option values, including passwords, are visible in the
> +   +  
> linkend="catalog-pg-user-mapping">pg_user_mapping
> +  system view.
>   
>  
> 



signature.asc
Description: PGP signature


Re: pg_basebackup application does not dump all database objects

2023-05-19 Thread Stephen Frost
Greetings,

Please don't top-post on these lists.

* Kristjan Mustkivi (sonicmon...@gmail.com) wrote:
> Thank you very much for the feedback and a reference! I was under the
> impression that pg_basebackup does everything
> pg_start_backup/pg_stop_backup but with a more user-friendly manner
> and additional steps. But they are then functionally different. For my
> particular case the pg_start_backup/pg_stop_backup approach was better
> as it retained the replication slots as well (the end goal was to move
> the db from old hardware to the new one and have it on the new hw
> exactly as it was on the old one). So I was surprised that
> pg_basebackup did not bring the rep slots along.

pg_basebackup does effectively call pg_start_backup/pg_stop_backup (or
pg_backup_start/pg_backup_stop in newer versions of PG..) and is more
user-friendly and does do the additional steps expected of a low level
backup- and which are documented in the link that I sent you.

Again, it's not enough to just call pg_start_backup and pg_stop_backup
and do a 'cp -a', and that's why pg_basebackup does more than that and
also why we document those other steps for anyone interested in writing
their own backup tool.

Perhaps an argument can be made that replication slots should be
optional for pg_basebackup to include or not (and similarily for other
PG backup tools which use the low-level API).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_basebackup application does not dump all database objects

2023-05-16 Thread Stephen Frost
Greetings,

* PG Doc comments form (nore...@postgresql.org) wrote:
> It would be good to point out on the pg_basebackup documentation that it
> behaves differently from
> 
> SELECT * FROM pg_start_backup('label', true, false);
> cp -a xxx yyy
> SELECT * FROM pg_stop_backup(false, true);

... that's not actually a backup.  Doing the above, you won't get a
backup_label and unless you take other steps, PG will end up thinking
it's doing crash recovery.  That's not good and can lead to corruption.
If you're going to consider taking a low-level backup you should read
the documentation here:

https://www.postgresql.org/docs/15/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

Which specifically goes into the question about replication slots.

> method. One thing discovered was that the latter retains replication slot
> information on the clone while pg_basebackup discards all replication slot
> information. Not sure if it is a feature or a bug (replication slots
> retention is a good thing).

Our documentation points out a number of reasons why it's not, I'd
suggest you read the above.

> And perhaps there are more such subtle differences?

Not sure that there's really that many other differences between
pg_basebackup and a properly implemented low-level backup.  You
certainly have to do a lot more than what you have above to have a
properly implemented low-level backup though.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Make SSPI documentation clearer

2023-03-20 Thread Stephen Frost
Greetings,

* Tomas Pospisek (t...@sourcepole.ch) wrote:
> what is the status of bringing the proposed SSPI documentation chages into
> the git repo?

> - is there anything to be improved or missing with the proposed changes?

... maybe, see below.

> - does the improvement need a hat tip of someone?

Needs a committer willing to include the change.  I'm generally alright
with being that committer once we get to some finalized wording unless
there's comments from others.

> - anybody here that has the guts, time and willingness to take the
>   supposed improvement and apply put it into the git repo?

This isn't helpful and frankly is detrimental to getting this change
included.  We strive strongly to have a positive tone and focus on
technical excellence.

> Anything else needed here?

See below for comments on the change.

> If the change is in principle OK, then I can prepare a patch or such (I
> assume postgres doesn't take pull requests from https://github.com/postgres
> ?)?


If you'd like to work on a patch, that'd be great.  The process would be
to make the change in your local git repo, create a patch with the change
(generally with git format-patch) and then post an email to the
pgsql-hackers mailing list with the proposed patch and finally register
it into the open commitfest (which is the app that we use for managing
the many, many requested changes to PG):

https://commitfest.postgresql.org

> On 13.03.23 10:00, Tomas Pospisek wrote:
> > On 13.03.23 01:36, Stephen Frost wrote:
> > 
> >  > * PG Doc comments form (nore...@postgresql.org) wrote:
> >  > > Page: https://www.postgresql.org/docs/15/sspi-auth.html
> >  > > Description:
> >  > >
> >  > > The [current SSPI
> >  > > documentation](https://www.postgresql.org/docs/current/sspi-auth.html)
> >  > > reads:
> >  > >
> >  > > "SSPI authentication only works when both server and client are
> >  > > running Windows, or, on non-Windows platforms, when GSSAPI is
> >  > > available."
> >  > >
> >  > > I interpret that phrase like this:
> >  > >
> >  > > * there's a case where both server and client are running Windows
> >  > > * there's a case where both are running non-Windows
> >  >
> >  > Yeah, that phrasing isn't great.
> >  >
> >  > > What about mixed cases? When the client is non-Windows, then can it
> >  > > use SSPI? No, AFAIK not. So I'd suggest to make that phrase above
> >  > > clearer and completely explicit:
> >  >
> >  > SSPI is Windows-specific, yeah.
> >  >
> >  > > "SSPI authentication works when both server and client are running
> >  > > Windows.
> >  > >
> >  > > When the server is on a non-Windows platform then the server must
> >  > > use GSSAPI if it wants to authenticate the client either via
> >  > > Kerberos or via Active Directory. A client on a Windows platform
> >  > > that connects to a non-Windows Postgresql server can either use SSPI
> >  > > (strongly encouraged) or GSS (much more difficult to set up) if it
> >  > > wants to authenticate via Kerberos or Active Directory. A client
> >  > > from a non-Windows platform must use GSS if it wants to authenticate
> >  > > via Kerberos or Active Directory."
> >  >
> >  > Rather than work in negative, I feel like it might make more sense to
> >  > work in positives?  That is, perhaps this instead:
> >  >
> >  > On Windows platforms, SSPI is the default and most commonly used
> >  > mechanism.  Note that an SSPI client can authenticate to a server
> >  > which is using either SSPI or GSSAPI, and a GSSAPI client can
> >  > authenticate to a server which is using either SSPI or GSSAPI.
> >  > Generally speaking, clients and servers on Windows are recommended to
> >  > use SSPI while clients and servers on Unix (non-Windows) platforms use
> >  > GSSAPI.
> >  >
> >  > Stricltly speaking, this is all independent of if AD is being used as
> >  > the KDC or not.
> > 
> > I agree, that's a better formulation. I'd suggest to improve your
> > version in three ways:
> > 
> > 1. replace "mechanism" with "authentication mechanism"

Sure.

> > 2. be explicit about Active Directory so there's no doubt wrt to setting
> >     up authentication

Not against the idea of mentioning AD...

> > 3. be explicit that GSSAPI should be used on non-Windows platform
> >     servers when one wants clients in an AD domain to seamles

Re: Make SSPI documentation clearer

2023-03-12 Thread Stephen Frost
Greetings,

* PG Doc comments form (nore...@postgresql.org) wrote:
> Page: https://www.postgresql.org/docs/15/sspi-auth.html
> Description:
> 
> The [current SSPI
> documentation](https://www.postgresql.org/docs/current/sspi-auth.html)
> reads:
> 
> "SSPI authentication only works when both server and client are running
> Windows, or, on non-Windows platforms, when GSSAPI is available."
> 
> I interpret that phrase like this:
> 
> * there's a case where both server and client are running Windows
> * there's a case where both are running non-Windows

Yeah, that phrasing isn't great.

> What about mixed cases? When the client is non-Windows, then can it use
> SSPI? No, AFAIK not. So I'd suggest to make that phrase above clearer and
> completely explicit:

SSPI is Windows-specific, yeah.

> "SSPI authentication works when both server and client are running
> Windows.
> 
> When the server is on a non-Windows platform then the server must use GSSAPI
> if it wants to authenticate the client either via Kerberos or via Active
> Directory. A client on a Windows platform that connects to a non-Windows
> Postgresql server can either use SSPI (strongly encouraged) or GSS (much
> more difficult to set up) if it wants to authenticate via Kerberos or Active
> Directory. A client from a non-Windows platform must use GSS if it wants to
> authenticate via Kerberos or Active Directory."

Rather than work in negative, I feel like it might make more sense to
work in positives?  That is, perhaps this instead:

On Windows platforms, SSPI is the default and most commonly used
mechanism.  Note that an SSPI client can authenticate to a server which
is using either SSPI or GSSAPI, and a GSSAPI client can authenticate to
a server which is using either SSPI or GSSAPI.  Generally speaking,
clients and servers on Windows are recommended to use SSPI while clients
and servers on Unix (non-Windows) platforms use GSSAPI.

Stricltly speaking, this is all independent of if AD is being used as
the KDC or not.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Row Level Security Execution within the SQL Evaluation Pipeline

2023-02-21 Thread Stephen Frost
Greetings,

* PG Doc comments form (nore...@postgresql.org) wrote:
> RLS documentation seems to say the user's predicate is evaluated AFTER the
> policy is evaluated. This presents to me an issue that I can't wrap my head
> around, and seems to confuse others as well.
> 
> Setup:
> 1. We have a policy-enabled table that has millions of rows, with ID as a
> primary key. 
> 2. There exists a row in this table whose ID column is 10.
> 3. User submits a query: SELECT * FROM MYTABLE WHERE id = 10. 
> 
> According to the documentation, the WHERE clause is not evaluated until
> AFTER the policy is evaluated, thus there is initially a full table scan of
> MYTABLE that returns only the set of rows that the user has access to (via
> the policy), THEN the WHERE clause is activated that reduces the row count
> to 1 or 0.
> 
> This sounds non-performant, and if this is or is not the case, I think it
> should be more clearly explained. In addition, a link to a "best practices
> using the policy effectively" would be useful, as from reviewing stack
> overflow, there is lots of concern over performance of RLS.

Functions which are leakproof can be pushed down below the policy
because those functions won't ever leak information about the values
that they might see in the data that the user shouldn't be allowed to
see.

Logically, the WHERE clause still comes after the policy, but with the
leakproof function that backs the '=' operator, we're able to optimize
the query and use the 'id' index that exists.

Of course, RLS isn't going to be free and you can certainly have cases
where you're using a function or operator that isn't leakproof and then
you'll have the issues you describe, or just in general adding on the
conditions of the policy could have performance impacts, but this
specific case isn't going to be an issue.  Note that we do explicitly
perform query optimization *after* adding in the RLS policies into the
query.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Improve documentation for pg_upgrade, standbys and rsync

2022-04-05 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Jul 26, 2021 at 3:11 PM Stephen Frost  wrote:
> > * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> > > Thanks for looking at this!
> >
> > Sure.  Thanks for working on it!
> 
> Stephen, do you intend to do something about this patch in terms of
> getting it committed? You're the only reviewer but haven't responded
> to the thread for more than 5 months.

I tried to be clear in the last email on the thread, the one which you
just responded to, here:

* Stephen Frost (sfr...@snowman.net) wrote:
> This, of course, all comes back to the original complaint I had about
> documenting this approach, which is that these things should only be
> done by someone extremely familiar with the PG codebase, until and
> unless we write an actual tool to do this.

To be more explicit though- we should write a tool to do this.  We
shouldn't try to document a way to do it because it's hard to get right.
While rsync is very capable, what's needed to really do this goes beyond
what we could reasonably put into any rsync command or really even into
a documented procedure.  I get that we already have it documented (and
I'll note that doing so was against my recommendation..) and that some
folks (likely those who follow this mailing list) have had success using
it, but I'd really rather we just take it out and put it on a wiki
somewhere as a "we need a tool that does this stuff" and hope that
someone finds time to write one.

> I don't feel that I know this area of the documentation well enough to
> feel comfortable passing judgement on whether this change is an
> improvement or not. However I do feel somewhat uncomfortable with
> this:
> 
> -   
> -Prepare for standby server upgrades
> -
> -
> - If you are upgrading standby servers using methods outlined in
> section  - linkend="pgupgrade-step-replicas"/>, verify that the old standby
> - servers are caught up by running 
> pg_controldata
> - against the old primary and standby clusters.  Verify that the
> - Latest checkpoint location values match in all clusters.
> - (There will be a mismatch if old standby servers were shut down
> - before the old primary or if the old standby servers are still running.)
> - Also, make sure wal_level is not set to
> - minimal in the
> postgresql.conf file on the
> - new primary cluster.
> -
> -   
> 
> Right now, we say that you should stop the standby servers and then
> prepared for standby server upgrades. With this patch, we say that you
> should first prepare for standby server upgrades, and then stop the
> standby servers. But the last part of the text about preparing for
> standby server upgrades now mentions things to be done after carrying
> out the next step where the servers are actually stopped. That seems
> confusing. Perhaps we need two separate steps here, one to be
> performed before stopping both servers and the other after.

It should really be both- things to do on the primary ahead of time
(truncate all unlogged tables, make sure there aren't any orphaned
temporary tables, etc), and then things to do on the replicas after
shutting the primary down (basically, make sure they are fully caught up
with where the primary was at shutdown).  I tried to explain that in my
prior email but perhaps didn't do a very good job.

> Also, let me express my general terror at the idea of anyone actually
> using this procedure.

I mean, yeah, I agree.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: psql's commit df9f599b is not documented

2021-08-03 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Dave Cramer  writes:
> > I would agree. If it's worth coding it's worth documenting. Unless of
> > course the intention is that this is temporary.
> 
> Documenting it would mean committing to keeping it indefinitely,
> which I think was exactly what people didn't want to do.  It's
> a kluge and we might find ourselves backed into a situation where
> we have to take it out.

Given that it's been a while since 11 came out, this argument doesn't
seem like it really holds much water.

> > Do we have other undocumented features ?
> 
> Yup.  There are plenty of behaviors that are explained in code comments
> but not anywhere user-visible.  If we tried to document absolutely
> everything that someone might be curious about, the manual would be
> three times its current size, but not more useful.

This isn't a deep internal behavior that we're talking about, it's a
very user-visible feature and now we're having users discover it and
then complain that it's not documented.  I tend to agree with the
original poster and some others on this thread that it probably should
be documented.

Across major versions, if we feel the need to for whatever reason, we
can take it out even when it's documented.  We've certainly done so for
other things in the past, I don't see why this should be viewed as
special in that regard.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Improve documentation for pg_upgrade, standbys and rsync

2021-07-16 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote:
> > * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> > > I revently tried to upgrade a standby following the documentation,
> > > but I found it hard to understand, and it took me several tries to
> > > get it right.  This is of course owing to my lack of expertise with
> > > rsync, but I think the documentation and examples could be clearer.
> > > 
> > > I think it would be a good idea to recommend the --relative option
> > > of rsync.
> > 
> > An additional thing that we should really be mentioning is to tell
> > people to go in and TRUNCATE all of their UNLOGGED tables before going
> > through this process, otherwise the rsync will end up spending a bunch
> > of time copying the files for UNLOGGED relations which you really don't
> > want.
> 
> I have thought about that some more, and I am not certain that we should
> unconditionally recommend that.  Perhaps the pain of rebuilding the
> unlogged table on the primary would be worse than rsyncing it to the
> standby.

I disagree entirely.  The reason to even consider using this approach is
to minimize the time required to get things back online and there's no
question that having the unlogged tables get rsync'd across would
increase the time required.

> The documentation already mentions
> 
>   "Unfortunately, rsync needlessly copies files associated with temporary
>and unlogged tables because these files don't normally exist on standby
>servers."
> 
> I'd say that is good enough, and people can draw their conclusions from
> that.

I disagree.  Instead, we should have explicit steps included which
detail how to find and truncate unlogged tables and what to do to remove
or exclude temporary files once the server is shut down.

> Attached is a new patch with an added reminder to create "standby.signal",
> as mentioned in [1].
> 
> Yours,
> Laurenz Albe
> 
>  [1]: 
> https://www.postgr.es/m/1a5a1b6e-7bb6-47eb-8443-40222b769...@iris.washington.edu

> From 47b685b700548af06ab08673187bdd1df7236464 Mon Sep 17 00:00:00 2001
> From: Laurenz Albe 
> Date: Fri, 16 Jul 2021 07:45:22 +0200
> Subject: [PATCH] Improve doc for pg_upgrade and standby servers
> 
> Recommend using the --relative option of rsync for clarity
> and adapt the code samples accordingly.
> Using relative paths makes clearer what is meant by "current
> directory" and "remote_dir".

I'm not really convinced that this is actually a positive change, though
I don't know that it's really a negative one either.  In general, I
prefer fully qualified paths to try and make things very clear about
what's happening, but this is also a bit of an odd case due to hard
links, etc.

> Add a reminder that "standby.signal" needs to be created.

This makes sense to include, certainly, but it should be an explicit
step, not just a "don't forget" note at the end.  I'm not really sure
why we talk about "log shipping" either..?  Wouldn't it make more sense
to have something like:

g. Configure standby servers

Review the prior configuration of the standby servers and set up the
same configuration in the newly rsync'd directory.

1. touch /path/to/replica/standby.signal
2. Configure restore_command to pull from WAL archive
3. For streaming replicas, configure primary_conninfo

Probably back-patched all the way, with adjustments made for the pre-12
releases accordingly.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Link missing in the 14 release notes

2021-07-05 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Jul  2, 2021 at 05:15:23AM +, Daniel Westermann (DWE) wrote:
> > >On Thu, Jul  1, 2021 at 10:01:46PM -0400, Stephen Frost wrote:
> > >Only the first mention of "factorial()" uses a link;  the second item
> > >doesn't have any link.
> > 
> > There are other places as well, like this on:
> > "The postgres_fdw supports these type of scans if async_capable is set."
> > 
> > async_capable is not a link. Or this one:
> > "This speeds normalize() and IS NORMALIZED." 
> > 
> > I still think it should be more consistent and all should be links, even if 
> > they point to the same section.
> 
> I am using standard typographic style, where the first mention of
> something is often italics, but later mentions might not be.  I am not
> in favor of adding links to everthing that can have a link _unless_
> there is a unique URI for each item.

While I understand that's the approach you've been following, I don't
agree with it.  Providing links gives users a way to find more
information and that applies to each equally.  I don't believe users of
the website and release notes would be confused, upset, or otherwise
misunderstand when two items which are covered in the same place have a
repeated link for each of the items.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Link missing in the 14 release notes

2021-07-01 Thread Stephen Frost
Greetings,

On Thu, Jul 1, 2021 at 21:57 Bruce Momjian  wrote:

> On Thu, Jul  1, 2021 at 09:47:47PM -0400, Stephen Frost wrote:
> > On Thu, Jul 1, 2021 at 21:35 Bruce Momjian  wrote:
> > I normally link to only the _first_ mention of something, and since
> they
> > are both in the same section, I didn't add a link for the second one.
> > Adding a second link might suggest that there is more information
> > available, even though the coarseness of our links means that
> section is
> > the same.
> >
> > Perhaps the wording would be better as something like:
> >
> > New predefined roles for granting access to read/write
> all
> > tables have been added, called pg_read_all_data and pg_write_all_data ..?
> >
> > Or something along those lines?
>
> I have avoided doing that since the link is about the roles, not about
> the term "predefined roles".


I tend to agree that it’s about the roles, but it isn’t about just one of
them …

> Just a thought.
> >
> > Perhaps another idea would be to make one link which includes both names.
>
> Uh, that is going to look odd, I am afraid.


This thread started with the point that the current entry looks odd. I’m
not sure this would look more odd than what is there now.

Thanks,

Stephen

>


Re: Link missing in the 14 release notes

2021-07-01 Thread Stephen Frost
Greetings,

On Thu, Jul 1, 2021 at 21:35 Bruce Momjian  wrote:

> On Thu, Jul  1, 2021 at 05:46:16PM -0400, Stephen Frost wrote:
> > Greetings,
> >
> > * Euler Taveira (eu...@eulerto.com) wrote:
> > > On Wed, Jun 30, 2021, at 11:47 AM, Daniel Westermann (DWE) wrote:
> > > > "Add predefined roles pg_read_all_data and pg_write_all_data
> (Stephen Frost)
> > > >
> > > > These non-login roles can be used to give read or write permission
> to all tables, views, and sequences."
> > > >
> > > > pg_read_all_data links to "Predefined roles", pg_write_all_data does
> not. Is that on purpose? At least it looks strange.
> > > It seems so. Once you click on the link, you will notice that
> pg_write_all_data
> > > is there too. Role names are similar to make the reader suspect that
> both
> > > descriptions will be on the same page.
> >
> > I tend to agree that it'd make sense to have them both as links.
> >
> > I've CC'd Bruce to make sure he sees this discussion.
>
> I normally link to only the _first_ mention of something, and since they
> are both in the same section, I didn't add a link for the second one.
> Adding a second link might suggest that there is more information
> available, even though the coarseness of our links means that section is
> the same.


Perhaps the wording would be better as something like:

New predefined roles for granting access to read/write all
tables have been added, called pg_read_all_data and pg_write_all_data ..?

Or something along those lines?

Just a thought.

Perhaps another idea would be to make one link which includes both names.

Thanks,

Stephen

>


Re: Link missing in the 14 release notes

2021-07-01 Thread Stephen Frost
Greetings,

* Euler Taveira (eu...@eulerto.com) wrote:
> On Wed, Jun 30, 2021, at 11:47 AM, Daniel Westermann (DWE) wrote:
> > "Add predefined roles pg_read_all_data and pg_write_all_data (Stephen Frost)
> > 
> > These non-login roles can be used to give read or write permission to all 
> > tables, views, and sequences."
> > 
> > pg_read_all_data links to "Predefined roles", pg_write_all_data does not. 
> > Is that on purpose? At least it looks strange.
> It seems so. Once you click on the link, you will notice that 
> pg_write_all_data
> is there too. Role names are similar to make the reader suspect that both
> descriptions will be on the same page.

I tend to agree that it'd make sense to have them both as links.

I've CC'd Bruce to make sure he sees this discussion.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Improve documentation for pg_upgrade, standbys and rsync

2021-05-19 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> I revently tried to upgrade a standby following the documentation,
> but I found it hard to understand, and it took me several tries to
> get it right.  This is of course owing to my lack of expertise with
> rsync, but I think the documentation and examples could be clearer.
> 
> I think it would be a good idea to recommend the --relative option
> of rsync.
> 
> Here is a patch that does that, as well as update the versions in
> the code samples to something more recent.  Also, I think it makes
> sense to place the data directory in the sample in /var/lib/postgresql,
> which is similar to what many people will have in real life.

Haven't had a chance to look at this in depth but improving things here
would be good.

An additional thing that we should really be mentioning is to tell
people to go in and TRUNCATE all of their UNLOGGED tables before going
through this process, otherwise the rsync will end up spending a bunch
of time copying the files for UNLOGGED relations which you really don't
want.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2021-04-02 Thread Stephen Frost
Greetings,

* Jonathan S. Katz (jk...@postgresql.org) wrote:
> On 4/1/21 3:34 PM, Stephen Frost wrote:
> > * Stephen Frost (sfr...@snowman.net) wrote:
> 
> >> I've written a patch to begin to make this change and sent it into
> >> -hackers for comments, thread is here:
> >>
> >> https://postgr.es/m/20201120211304.gg16...@tamriel.snowman.net
> > 
> > Just to wrap this up- this change has now been committed for v14.
> 
> Thanks Stephen! Do we have any additional follow up on the pgweb side?

Yes and no. :)

Once the next set of minor releases is pushed out, I'll double-check
that everything is working properly (the 'obsolete' pages should start
showing up instead of the redirects, since we'll no longer be going
through the "page not found" code path) and will then go remove the
alias/redirects from the pages that no longer need them.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2021-04-01 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Fri, Oct 23, 2020 at 05:40:37PM -0400, Stephen Frost wrote:
> > > Therefore, on the whole, the proposal that I would put out there is:
> > > 
> > > - Rename everything (code, docs, etc) to 'predefined roles' for v14+
> > > - Update the documentation (only) in back-branches to mention that the
> > >   feature was renamed in v14 to 'predefined roles'
> > > - Use the 'doc page alias' feature to make the documentation page for
> > >   'default roles' on the older versions link to 'predefined roles' in
> > >   the newer versions, and vice-versa
> > 
> > Sure, that works for me.  If it was only docs, backpatch makes more
> > sense, but once you showed the renaming in those other areas,
> > master-only makes more sense.  Thanks for working on this.
> 
> I've written a patch to begin to make this change and sent it into
> -hackers for comments, thread is here:
> 
> https://postgr.es/m/20201120211304.gg16...@tamriel.snowman.net

Just to wrap this up- this change has now been committed for v14.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: TLS acronym

2021-03-28 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Daniel Gustafsson (dan...@yesql.se) wrote:
> > As discussed in the NSS thread, we've had TLS defined as an  since
> > commit c6763156589 in 2014 without actually having it defined in 
> > acronyms.sgml.
> > 
> > The attached adds the definition linking to the Wikipedia entry for TLS.
> 
> Updated patch attached, mainly just adding author/reviewed-by/discussion
> items.
> 
> Barring objections, I'll plan to commit and back-patch this all the way
> over the weekend sometime.

Done.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: TLS acronym

2021-03-26 Thread Stephen Frost
Greetings,

* Daniel Gustafsson (dan...@yesql.se) wrote:
> As discussed in the NSS thread, we've had TLS defined as an  since
> commit c6763156589 in 2014 without actually having it defined in 
> acronyms.sgml.
> 
> The attached adds the definition linking to the Wikipedia entry for TLS.

Updated patch attached, mainly just adding author/reviewed-by/discussion
items.

Barring objections, I'll plan to commit and back-patch this all the way
over the weekend sometime.

Thanks!

Stephen
From 07c3564caa5dab0c198d261b7d5e17f1082dc871 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Fri, 26 Mar 2021 16:37:54 -0400
Subject: [PATCH] doc: Define TLS as an acronym

Commit c6763156589 added an acronym reference for "TLS" but the definition
was never added.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/27109504-82db-41a8-8e63-c0498314f...@yesql.se
---
 doc/src/sgml/acronyms.sgml | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/doc/src/sgml/acronyms.sgml b/doc/src/sgml/acronyms.sgml
index 4e5ec983c0..13bd819eb1 100644
--- a/doc/src/sgml/acronyms.sgml
+++ b/doc/src/sgml/acronyms.sgml
@@ -684,6 +684,16 @@
 

 
+   
+TLS
+
+ 
+  https://en.wikipedia.org/wiki/Transport_Layer_Security;>
+  Transport Layer Security
+ 
+
+   
+

 TOAST
 
-- 
2.27.0



signature.asc
Description: PGP signature


Re: incoorect restore_command

2021-02-26 Thread Stephen Frost
Greetings,

* Philipp Gramzow (p...@philphonic.de) wrote:
> Am 24.02.2021 um 06:49 schrieb Michael Paquier:
> >On Mon, Feb 22, 2021 at 07:36:28AM +, PG Doc comments form wrote:
> >>I've been trying out saving and restoring compressed archive logs. The
> >>restore_command stated in the docs at "25.3.6.2. Compressed Archive Logs"
> >>('gunzip < /mnt/server/archivedir/%f > %p') did not work for me, because the
> >>archive_command ('gzip < %p > /var/lib/pgsql/archive/%f') alters the
> >>filename to %f.gz
> >On which platform please?  Using a pipe with gzip does not alter the
> >output file name where the data is pushed to.
> We're on RHEL 7. Thanks for pointing out my mistake, someone altered our
> archive_command to use .gz extensions without me noticing it. Please excuse
> the confusion.

Note that the simple commands in the documentation are really just to
give you an example of what an archive command would look like- they
shouldn't be used verbatim since they don't provide any guarantees that
the resulting compressed file has actually been written out to disk
(something that an archive command really should provide the guarantee
of- otherwise a crash will result in WAL files likely going missing and
therefore you'll lose the ability to do PITR).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Trusted versus untrusted Pl language

2020-12-24 Thread Stephen Frost
Greetings,

* Steven Pousty (steve.pou...@gmail.com) wrote:
> If you consider the application developer or data scientist's perspective
> it makes total sense. I don't like the pattern of appdevs always working as
> the postgres user, it encourages bad patterns and can often blow up when
> you move the application to production.

> Instead I think a good flow for an appdev or a data scientists to follow
> when developing their function in Pl/Python or PL/R is:
> 1) Make the langauge trusted on the appdevs or data scientist's instance of
> Postgres. Most developers either work on a cluster on their laptop or in a
> container.

The way to give non-superusers access to things which are usually
superuser-only is to set up a way to have that ability GRANT'd to them,
either through privileges on a function, or through a new role to manage
that access.

In this case, it would seem likely that the right answer would be a new
role along the lines of "pg_use_untrusted_language", which would then
allow a user who has been GRANT'd that role to be able to create
functions in untrusted languages.  An interesting question might be if
we'd allow such a role to create C language functions or not.

Clearly, such a privilege could be used by someone to get superuser
access themselves, but that's nothing new when it comes to such roles
and I appreciate the angle you're taking here where you'd like the
developer to be able to operate as a non-superuser in general while
still being able to create such functions.

> 2) Send the finished product to the DBA and security teams for review.
> 3) If it passes review and testing then you can put it into production.
> 
> The SQL I am talking about is this:
> UPDATE pg_language SET lanpltrusted = true WHERE lanname LIKE 'plr';
> 
> There should also be a reminder to NOT do this in production.

I can't agree with this part, it's just not a good idea for anyone to be
issuing direct UPDATE calls against the catalogs.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2020-11-23 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Oct 23, 2020 at 05:40:37PM -0400, Stephen Frost wrote:
> > Therefore, on the whole, the proposal that I would put out there is:
> > 
> > - Rename everything (code, docs, etc) to 'predefined roles' for v14+
> > - Update the documentation (only) in back-branches to mention that the
> >   feature was renamed in v14 to 'predefined roles'
> > - Use the 'doc page alias' feature to make the documentation page for
> >   'default roles' on the older versions link to 'predefined roles' in
> >   the newer versions, and vice-versa
> 
> Sure, that works for me.  If it was only docs, backpatch makes more
> sense, but once you showed the renaming in those other areas,
> master-only makes more sense.  Thanks for working on this.

I've written a patch to begin to make this change and sent it into
-hackers for comments, thread is here:

https://postgr.es/m/20201120211304.gg16...@tamriel.snowman.net

Naturally, it won't make sense to make a change to the back branches
until the change has been agreed to and committed for v14.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2020-10-23 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> OK, Stephen, I believe you have a more comprehensive patch to make this
> change.  When can you complete it?  The above email is from April.

Having gone back through this thread (again) to figure out where we are
on this, and chatting with Jonathan about the options that are available
through the web system, and trying to sort out my own feelings on what
makes the most sense here, I'd like to bring up a few points and a
proposal regarding the rename:

- General agreement that 'predefined' is better than 'default'
- Unclear on back-patching:
  Voices in favor: Rob, Laurenz, Bruce
  Voices against: Ian, Simon, Peter, Stephen 

Having understood now a bit better what is possible with the web system,
many thanks to Jonathan for chatting with me about it, I'll try to
explain here what the options are:

- Change the back-branches and use the new feature ("doc page
  redirects") to redirect old links to default-roles to
  predefined-roles.

- Change the feature's name in v14, but create a "doc page alias" to
  allow users viewing v13's "default roles" to see a v14 link which
  would take them to "predefined roles".

I'll note that we've got a number of examples of "doc page aliases"
being used to handle name changes:

catalog-pg-replication-slots.html <-> view-pg-replication-slots.html
pgxlogdump.html <-> pgwaldump.html
app-pgresetxlog.html <-> app-pgresetwal.html
legalnotice.html <-> LEGALNOTICE.html
app-pgreceivexlog.html <-> app-pgreceivewal.html

I've also been thinking about the code change impact from this, as I do
think we should have the code and the documentation and general feature
naming be consistent, but that implies changing things like
DEFAULT_ROLE_MONITOR to be PREDEFINED_ROLE_MONITOR, potentially breaking
compilation of extensions that use those #define's (such as even our own
extensions like file_fdw do) in a minor version update, which doesn't
seem terribly nice to do.

Therefore, on the whole, the proposal that I would put out there is:

- Rename everything (code, docs, etc) to 'predefined roles' for v14+
- Update the documentation (only) in back-branches to mention that the
  feature was renamed in v14 to 'predefined roles'
- Use the 'doc page alias' feature to make the documentation page for
  'default roles' on the older versions link to 'predefined roles' in
  the newer versions, and vice-versa

Thoughts?

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Possible mistake in backup documentation

2020-10-07 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> On Fri, 2020-09-25 at 14:50 +0200, Magnus Hagander wrote:
> > On Fri, Sep 25, 2020 at 2:32 PM Laurenz Albe  
> > wrote:
> > > On Tue, 2020-09-22 at 14:17 +, PG Doc comments form wrote:
> > > > In "25.3.3.2. Making An Exclusive Low-Level Backup", you said that "The
> > > > exclusive backup method is deprecated and should be avoided. Prior to
> > > > PostgreSQL 9.6, this was the only low-level method available, but it is 
> > > > now
> > > > recommended that all users upgrade their scripts to use non-exclusive
> > > > backups". But in the example in "25.3.6.1. Standalone Hot Backups" you 
> > > > use
> > > > the exclusive version of backup command. Is it a mistake or not?
> > > 
> > > Yes, that's true.
> > 
> > Well, technically it is *correct*. It's just rather silly that we are using 
> > the deprecated API in the example.
> > 
> > > How about the attached patch?
> > > Perhaps that is too complicated, but I have no idea how to make it 
> > > simpler.
> > 
> > For this example, can't we just show two sessions. That is, "in a psql, run 
> > pg_start_backup().
> >  Then in a different session, copy all the files, and then back in psql run 
> > pg_stop_backup()" or such?
> > 
> > This is still just an example of a low level operation, where the 
> > recommendation is (and is there iirc)
> >  to use a different tool for it already.
> 
> I thought the point of the example is to show a workable script that could
> perform a backup and could be used as a starting point to develop your own
> backup solution.  (I know that there are people who think writing your own
> backup solution is evil, but I am not one of them.)

That's certainly not actually the case.  We don't currently have
anywhere close to sufficient documention (unless you are reading the
code and understand how PG works) for someone to develop their own
backup solution using the low-level API.  There's been attempts to fix
that, and I suspect there'll be another attempt to improve the situation
for v14, but not sure if that'll be successful.

> If we replace that with a verbal description of how to do it, the example just
> duplicates what is already documented.
> 
> In that case I would opt for simply removing the example.

Removing the example, as it really doesn't do what it implies, is
probably the best way to address this.  I'm not convinced that we could
come up with a sufficient verbal description, but I'm happy to take a
look and provide feedback if someone wants to try.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Openssl v3_ca

2020-08-24 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> A few years ago I figured out how to create intermediate certificates
> that are transferred across OpenSSL connections by using the v3_ca
> extension, and added this to the PG documentation.
> 
> I have now just figured out that v3_ca is just a heading in the openssl
> configuration file, e.g., /etc/ssl/openssl.cnf, and that it is
> specifically this line that enables this to work:
> 
>   basicConstraints = critical,CA:true

Yes, v3_ca refers to a stanza in the default openssl config.

> I have created the attached documentation patch to clarify exactly what
> is needed, in case non-openssl tools are used.

> diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
> index c8698898f3..f705c4fec1 100644
> --- a/doc/src/sgml/runtime.sgml
> +++ b/doc/src/sgml/runtime.sgml
> @@ -2194,7 +2194,8 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
> can also be appended to the file.  Doing this avoids the necessity of
> storing intermediate certificates on clients, assuming the root and
> intermediate certificates were created with v3_ca

All CAs need to have CA:TRUE set on them, root and intermediate, so the
above isn't really correct..

> -   extensions.  This allows easier expiration of intermediate certificates.
> +   extensions (which sets CA:TRUE on certificates).

Probably better would be to specifically say "This sets 'ca' to 'true'
for the basic constraints of the certificate." or similar language.
Simply saying "CA:TRUE" doesn't seem to really be an improvement over
just referencing the v3_ca stanza.

> +   This allows easier expiration of intermediate certificates.

While true, there's certainly other reasons why someone might want to
run intermediate CAs.. I'm not sure that we really need to go into the
discussion about why they make sense to have.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: "stable storage"

2020-08-21 Thread Stephen Frost
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Fri, Aug 21, 2020 at 3:12 PM Bruce Momjian  wrote:
> > > Is "stable storage" the same as "durable storage"?  Should we rename
> > > "stable storage" mentions to "durable storage"?
> >
> > I have developed the attached doc patch to change "stable storage" to
> > "durable storage".
> 
> I agree that this is an improvement. The word "durable" is more
> descriptive than "stable". (The word "stable" might be more
> recognizable, but that in itself doesn't make the text any clearer.)

+1.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Client parameter list omits timezone

2020-07-26 Thread Stephen Frost
Greetings,

Please don't top-post on these lists.

* Kevin Burke (ke...@meter.com) wrote:
> I guess my broader issue is that it's hard to figure out how to configure
> things like the timezone in a given client. Generally, you pass some things
> as parameters in the connection string but it's not obvious which values
> are accepted or where to pass them, unless you go rooting around in the
> protocol details and the source code of the client, which is more than most
> users will or should be asked to do.

As I said in my previous reply, you need to discuss that with the
maintainer of that driver.  The documentation you found does,
accurately, reflect what options can be passed for the driver that it is
the documentation for- namely the libpq C driver.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Client parameter list omits timezone

2020-07-26 Thread Stephen Frost
Greetings,

* PG Doc comments form (nore...@postgresql.org) wrote:
> Page: https://www.postgresql.org/docs/11/libpq-connect.html
> Description:
> 
> I am looking for a complete list of parameters that I can pass in the
> "connect"/"open database connection" list on the client (in this case,
> github.com/lib/pq). 
> 
> The list of parameters documented here:
> https://www.postgresql.org/docs/11/libpq-connect.html#LIBPQ-PARAMKEYWORDS
> does not include the "timezone" parameter. However, github.com/lib/pq seems
> to indicate that "timezone" is an acceptable parameter, for example, here:
> https://github.com/lib/pq/blob/master/encode_test.go#L123

The documentation you're looking at on postgresql.org is for the libpq
C driver (which is written and maintained as part of core PostgreSQL),
not for the Go driver that's at github.com/lib/pq.

For my part, it doesn't seem like a good idea for the Go driver to
deviate in this way, but you'd have to talk to the author of it about
that, not us.  Then again, that git repo also says at the bottom that
it's not actively being developed and recommends a different Go driver
be used.

> Elsewhere in the docs, the PGTZ environment variable can be used to set the
> timezone on the client.

This is true, for the libpq driver.  I don't know about the Go driver.

> I would expect to see "timezone" (and anything else that can be configured)
> in that list.

'timezone' isn't accepted by libpq, which is what our documentation is
referring to.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Logical replication subscription owner

2020-05-09 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Paquier  writes:
> > Not to make the life of everybody more complicated here, but I don't
> > agree.  LOGIN and REPLICATION are in my opinion completely orthogonal
> > and it sounds more natural IMO that a REPLICATION user should be able
> > to log into the server only if it has LOGIN defined.
> 
> ISTM those statements are contradictory.  The two privileges could
> only be called orthogonal if it's possible to make use of one without
> having the other.  As things stand, REPLICATION without LOGIN is an
> entirely useless setting.

Allowing a login to the system by a role that doesn't have the LOGIN
privilege isn't sensible though.

Perhaps a middle ground would be to set LOGIN on a role when REPLICATION
is set on it, if it's not already set (maybe with a NOTICE or WARNING or
such saying "also enabling LOGIN for role X", or maybe not if people
really think it should be obvious).

I don't think taking away login should take away replication though as
maybe there's some reason why someone would want that, nor should we
take away login if replication is taken away, this would strictly just
be a change for when REPLICATION is added to a role that doesn't have
LOGIN already.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2020-04-28 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Sat, Apr 25, 2020 at 1:38 AM Jonathan S. Katz 
> wrote:
> 
> > On 4/9/20 11:45 PM, Bruce Momjian wrote:
> > > On Thu, Apr  9, 2020 at 08:47:56PM -0400, Jonathan Katz wrote:
> > >> On 4/9/20 4:57 PM, Bruce Momjian wrote:
> > >>> Jonathan, Stephen, with the minor release done, can we focus on adding
> > >>> the URL redirect and completing the patch to rename this feature in the
> > >>> docs?  Thanks.
> > >>
> > >> Yes, I have a prototype for this ready, which I had scrambled together
> > >> before the release. I am happy to make it committable in the coming
> > days.
> > >
> > > Thanks.  Once that is done Stephen can apply my patch with his
> > > additions.
> >
> > Please see attached patch for pgweb that allows for the documentation to
> > be redirected from a page that is removed to a newer page. The way it
> > works:
> >
> > - Checks to see if a page is found. If it is, great!
> > - Now if a page 404s, we first check to see if there is a forwarding
> > address, i.e. the new page. If it is, we issue a 301 (permanent redirecTt).
> > - If it's still not found, we abort.
> >
> > I believe this gives us the desired behavior.
> >
> > Thoughts on the patch?
> 
>  Looks good to me. Go for it.

What's the plan for how to maintain it going forward?  Just email -docs
and ask someone to perform the update whenever a release is going out..?
Or do we have some kind of better process in mind?

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Logical replication subscription owner

2020-04-23 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Alvaro Herrera  writes:
> > I had it in my mind that LOGIN was for regular (SQL-based) login, and
> > REPLICATION was for replication login, and that they were orthogonal.
> 
> Yeah, that's what I would've expected.  Otherwise, is REPLICATION
> without LOGIN useful at all?

No, but it's less surprising, at least to me, for all roles that login
to require having the LOGIN right.  Having REPLICATION ignore that would
be surprising (and a change from today).  Maybe if we called it
REPLICATIONLOGIN or something along those lines it would be less
surprising, but it still seems pretty awkward.

I view REPLICATION as allowing a specific kind of connection, but you
first need to be able to login.

Also- what about per-database connections?  Does having REPLICATION mean
you get to override the CONNECT privileges on a database, if you're
connecting for the purposes of doing logical replication?

I agree we could do better in these areas, but I'd argue that's mostly
around improving the documentation rather than baking in implications
that one privilege implies another.  We certainly get people who
complain about getting a permission denied error when they have UPDATE
rights on a table (but not SELECT) and they include a WHERE clause in
their update statement, but that doesn't mean we should assume that
having UPDATE rights means you also get SELECT rights, just because
UPDATE is next to useless without SELECT.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Logical replication subscription owner

2020-04-22 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> On 2020-Apr-15, PG Doc comments form wrote:
> > If the logical replication subscription is owned by a role that is not
> > allowed to login (for example, if the LOGIN privilege is removed after the
> > subscription is created) then the logical replication worker (which uses the
> > owner to connect to the database) will start to fail with this error
> > (repeated every 5 seconds), which is pretty much undocumented:
> > 
> > FATAL:  role "XXX" is not permitted to log in
> > LOG:  background worker "logical replication worker" (PID X) exited with
> > exit code 1
> > 
> > You might want to include that error message in the docs, to ensure that web
> > searches for it bring the user to this documentation.
> 
> I wonder if a better answer is to allow the connection when the
> REPLICATION priv is granted, ignoring the LOGIN prov.

Erm, no, I wouldn't have thought that'd make sense- maybe someone
specifically wants to stop allowing that role to login and they remove
LOGIN?  That REPLICATION would override that would surely be surprising
and counter-intuitive..

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2020-02-09 Thread Stephen Frost
Greetings,

On Sun, Feb 9, 2020 at 14:24 Tom Lane  wrote:

> Stephen Frost  writes:
> > * Jonathan S. Katz (jk...@postgresql.org) wrote:
> >> On 2/6/20 12:11 AM, Bruce Momjian wrote:
> >>> Folks, is it Thursday.  Can we revert this and return to it when we are
> >>> not rushed?  Alternatively, can someone who controls all the moving
> >>> parts, like  redirects and Stephen's patch additions take ownership of
> >>> this issue, with authority to revert the patch if things are too close?
>
> >> Magnus and I briefly discussed what it would take, but the probability
> >> of having the redirects ready on pgweb before the release is
> >> slim-to-none, as we'd like to thoroughly test.
>
> > Given the discussion, barring objections, I'll revert the changes
> > tomorrow and then work with Jonathan and Magnus to try and get the
> > website redirect stuff working so that we can make this change in a
> > future release.
>
> It's Sunday afternoon, and nothing has happened, so I took it on
> myself to revert this.


Ah, thanks, Friday got away from me and it’s been a busy weekend, I had
figured on tonight but glad you took care of it.

Thanks again,

Stephen

>


Re: Documentation: 21.5. Default Roles

2020-02-06 Thread Stephen Frost
Greetings,

* Jonathan S. Katz (jk...@postgresql.org) wrote:
> On 2/6/20 12:11 AM, Bruce Momjian wrote:
> > On Tue, Feb  4, 2020 at 11:31:19AM -0500, Jonathan Katz wrote:
> >>> Using this feature to handle the rename of a file *between* major
> >>> versions, thus leaving the changes in master, should be safe (as long
> >>> as we add an entry to that table in pgweb).
> >>>
> >>> As for back branches, I think we have to say that it's too close to
> >>> the minor release to safely have something done in pgweb before then.
> >>
> >> This part I agree on, but let's sync offline to see if there is
> >> something within reason, with a preference to *not* rush if we're
> >> worried about breaking something right before release.
> > 
> > Folks, is it Thursday.  Can we revert this and return to it when we are
> > not rushed?  Alternatively, can someone who controls all the moving
> > parts, like  redirects and Stephen's patch additions take ownership of
> > this issue, with authority to revert the patch if things are too close?
> 
> Magnus and I briefly discussed what it would take, but the probability
> of having the redirects ready on pgweb before the release is
> slim-to-none, as we'd like to thoroughly test.
> 
> We can start experimenting with it now, and commit a fix after the release.
> 
> I don't have the power to revert changes directly to the documentation
> in core, so I cannot take ownership of that part. However, I am happy to
> own the completion of the redirect feature.

I can own the core bits, since you've offered to take on the web pieces.
:)

Given the discussion, barring objections, I'll revert the changes
tomorrow and then work with Jonathan and Magnus to try and get the
website redirect stuff working so that we can make this change in a
future release.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Users/Roles do not align.

2020-02-05 Thread Stephen Frost
Greetings,

* Jürgen Purtz (juer...@purtz.de) wrote:
> >Based on this, I believe Section 5.9 should read:
> >`A PostgreSQL database cluster contains one or more named databases. Roles
> >are shared across the entire cluster, but no other data is shared across
> >databases. Any given client connection to the server can access only the
> >data in a single database, the one specified in the connection request.`
> 
> imo the following is a more precise wording:
> 
> 'A cluster contains three or more named databases ('template0', 'template1',
> 'postgres', ...). Roles, which are users or groups, see Chapter 21, - as

Roles aren't 'users or groups', they're roles, and we don't actually
have users or groups today.

> well as database names and tablespace definitions - are shared across the
> entire cluster. No other data is shared across databases or schemas. Any
> given client connection to the server can access only the data in a single
> database, the one specified in the connection request. If it has the
> necessary privileges, the connection can access all schemas within this
> database.'

There's a few things wrong about this part anyway- namely that we've got
FDWs now, and there's certainly other cluster-wide things that exist
beyond the specific items listed, so I wonder if perhaps we should just
stop trying to list everything here.  The description given in 22.1
seems like it's a lot better since it talks about the hierarchy in a
general sense.

The minimalistic approach suggested initially seems like it might be the
best answer to this right now.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2020-02-04 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> On 2020-Feb-03, Jonathan S. Katz wrote:
> 
> > So, if there was something done to redirect people from specific
> > deprecated documentation pages historically, it was before my time. Most
> > of the redirects have been as general purposes ones (e.g. /docs/12), the
> > rules we put in for getting rid of "static", and the release notes,
> > which still receives some negative feedback towards it for different
> > reasons (though I think overall the effort was well-received). Anyway,
> > if we had a redirect in place, I'd want us to do it well.
> 
> I +1 changing the title and +1 having the redirect.  That said, I think
> people landing in a page titled "predefined roles" when they're looking
> for a page titled "default roles" would be quite confused for a while
> and perhaps continue to search for the page they think they're looking
> for.  (I know, it has happened to me.)  I suggest we add a very short
> paragraph --maybe a -- to the effect that feature-X used to be
> called feature-Y, immediately following the section title.

I'm a bit confused as it sounded like having the redirect wouldn't be
easy to do..  I also agree that keeping the URL the same as today but
changing that URL to be talking about 'predefined roles' would be quite
confusing.

Having some kind of landing page with the existing URL that had a link
for people to click on to get to the new URL seems like it would
possibly work.  I'd be happier with having that done in master and
keeping the back-branches as-is, since I don't really like the idea that
we're retroactively renaming a feature in our existing releases.

That said, if we at least have the landing page then at least people
will be less likely to be confused.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Documentation: 21.5. Default Roles

2020-01-14 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jan  7, 2020 at 11:46:31AM +0100, Laurenz Albe wrote:
> > On Fri, 2019-12-27 at 12:16 -0500, Bruce Momjian wrote:
> > > On Fri, Dec 27, 2019 at 05:44:10AM +, PG Doc comments form wrote:
> > > > The following documentation comment has been logged on the website:
> > > > 
> > > > Page: https://www.postgresql.org/docs/12/default-roles.html
> > > > Description:
> > > > 
> > > > The title is wrong.   The roles are not defaults; they are predefined 
> > > > and
> > > > privileged.  The title suggests that a user should expect  to be 
> > > > assigned
> > > > these roles.   "21.5 Sub-Administrator Roles"  would be 
> > > > accurate--improving
> > > > clarity over all  and removing any need to explain why postgres is not 
> > > > in
> > > > this list of roles.
> > > > 
> > > 
> > > Good points.  I have developed the attached documentation patch which
> > > includes your ideas.
> > 
> > +1
> > 
> > I think that "predefined role" is better than "default role".
> 
> Thanks, patch applied through 9.6.

Erm, I didn't agree with this and pointed to reasons why it was based,
for starters, on a misunderstanding and further wasn't a particularly
good idea anyway.  I'm not happy that it was committed, and to have been
back-patched strikes me as even worse.  What about existing links to
things like: https://www.postgresql.org/docs/9.6/default-roles.html
which will now be broken, like from here?:

https://paquier.xyz/postgresql-2/postgres-11-new-system-roles/

Or that the documentation wasn't properly updated to reflect this change
as a simple "git grep 'default role'" would have shown?  There's at
least 5 references still to 'default role' in the documentation after
this commit.

Not to mention that, with this patch, we now have confusion between
things like 'DEFAULT_ROLE_WRITE_SERVER_FILES' in the code vs. the
documentation.

In short, I don't agree with this change, which strikes me as looking
largely like it's trying to make PG look more like Oracle than anything
else, but if we're going to move in this direction we should only be
doing so in master and we should be much more careful making sure that
the documentation, at least, is updated and consistent and that
appropriate comments are made to the code to explain that DEFAULT_ROLE
in the code is referring to "predefined roles" (or we should change the
code, though I can understand if there's argument that doing so would
create unnecessary back-patching hazards..  though there isn't all
*that* much code, so I could go either way on that myself).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Postgres.js driver - for Node.js

2020-01-09 Thread Stephen Frost
Greetings,

* Jonathan S. Katz (jk...@postgresql.org) wrote:
> On 1/8/20 7:59 PM, Tom Lane wrote:
> > "Jonathan S. Katz"  writes:
> >> On 1/8/20 7:00 PM, Jonathan Buhacoff wrote:
> >>> I'm wondering if the short list in the documentation [2] could be
> >>> replaced with a link to the corresponding wiki page [1] ?
> > 
> >> ...
> >> Also, while scanning the list, I also saw that someone recently added
> >> several drivers that appear to be non-OSS.
> > 
> > Yeah, the lack of curation there is exactly why we generally *don't* link
> > to the wiki from the formal docs.  We should make more of an effort to
> > transpose vetted info about this topic into the docs, though.
> 
> I'd be fine with taking a crack at this. I had some notes (I think I put
> it on -hackers during the SCRAM discussion) on how I reorganized the
> list, with what determined what was on the list, and what was an
> unsupported driver.

I agree with putting useful and vetted info into the docs, in general.

> >> ... I will wait until there is some
> >> consensus before removing the other closed-source ones.
> > 
> > Hm.  I'd vote against including those in the docs' list, because we can't
> > as-a-community review their quality and suitability. 
> 
> 100% agree there.

I agree that we can't- but I'm also of the opinion that if we're going
to care about things the we "as-a-community" have reviewed, then we need
to actually *do* such a review, at some level, and not just list
everything there that is OSS.

There's a couple of levels of review here, at least imv-

a) It's a driver maintained by someone in the community who is active
   and takes responsibility for it (and therefore presumably either
   wrote the code or has done some amount of code-level review)

b) It's a driver that's been packaged by the community and is
   distributed through community resources

The items in 'a' are also, again in my personal view, eligible for
hosting on pginfra- things like the ODBC driver and the JDBC driver.
The items in 'b' probably also make sense to list and we expect the
packager to at least take some responsibility for dealing with issues,
even if they haven't done a code-level review.  It'd be nice if we
distinguished between those two somehow, but I'm not sure it's really
required.

I'm pretty skeptical about listing drivers beyond those categories in
our official documentation as it lends our credibility to them while
being completely outside of our control.

> > However, I'm not
> > sure that there's anything wrong with having them in the wiki's list
> > as long as they're appropriately marked as not-OSS.
> 
> I had an offline suggestion about including a "License" column, which
> seems like a good idea in general as we also have in the docs. And if we
> are going to include more drivers in the docs, we'd (read "I'd" for this
> first pass) have to pull those together anyway.

I agree with having a license column (*cough* I might have also been the
one to make the offline suggestion, so don't consider this an
independent advocating of that ;).

> My personal preference would be to at least separate the open source
> from the closed source, but certain things we include in the column
> (e.g. supports SCRAM) we would not be able to validate on a closed
> source driver. But perhaps if we include said drivers on that page, we'd
> not include that info.

I don't think I agree with this- we don't validate anything for a closed
source driver, or an open source driver, to have it be listed on the
wiki- we don't really even control the wiki and we shouldn't be acting
like we do.  We can include if the driver supports SCRAM or not and
people can add that info in about whatever driver they're adding to the
list- or leave it blank if they don't know, and then someone else with
that info can choose to fill it in if they want.

> (Also my preference with the closed source drivers would be to keep them
> listed here:
> https://www.postgresql.org/download/products/2-drivers-and-interfaces/
> ...which I realized adds yet another twist to this discussion because
> one could then say "Why not list them all here?" And then it is a
> curated listed as it goes through the pgweb team...and then we could
> potentially link it from the docs...)

Linking it from the docs has the same issues as noted before though..
Maybe you could generate the docs by pulling from that list but I'm not
really a fan of that either..

Another option though would be to have the list of things from category
'a' and 'b' above be in our docs and then a link to the
drivers-and-interfaces page from the docs, and let people do what they
want with the wiki (maybe put a link at the top of such a page to the
official documentation page though, which makes it clearer that the wiki
isn't the official project position or curated list).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: I'm surprised to see the word master here

2019-10-09 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 2019-10-08 18:39, Stephen Frost wrote:
> > I'm also at a loss as to why we're discussing what to call 'master/
> > master' here since we don't actually use the term 'master/master' in our
> > docs at all today that I can see.
> 
> The underlying premise appears to be that the word should be banned.  In
> which case that would certainly propagate to the web site, the wiki,
> other nearby resources, some of which certainly do use that term
> liberally in various configurations.  So we should think that through
> regardless.

Saying 'banned' goes beyond what I think the goal here is- but, yes, I
believe the premise here is that we should avoid using the term.  That's
only part of it though- we should also be trying to actually be consistent
in our use of terms and we're far from that today.  Fixing the consistency
issue would just about entirely remove the term by itself, to the point
where adjusting the remaining one page in the docs is a modest change
that also puts us in a better light.

Regarding taking that avoidance to other places- I believe it's pretty
clear that we have not implemented any kind of ban even for other words
which we now avoid using on postgresql.org and in the docs- a search for
'slave' on the wiki turns up a pretty impressive number of hits, so I
don't know why we're thinking of expanding this to include that as we
haven't before.  Perhaps we should consider that, but that's an entirely
different discussion and not relevant to this discussion.  Additionally,
the wiki is pretty clearly a user resource and nothing on it should be
considered an endorsement from the project- which is part of the reason
we moved things like policies from the wiki over to the main website.

Reviewing the website, beyond the docs, turns up almost zero references
to 'master' or 'multi-master' or even 'remastering'- and the ones that
do exist appear to all be from older press releases which I don't think
anyone would reasonably expect us to change at this point (though I do
wonder if perhaps we should start to clean up "presskit91" and
friends...).  We do have "Mastering PostgreSQL", as a reference to the
book of the same name, but I don't believe anyone has an issue with
that.  Saying that these terms are in use liberally on the main website
just doesn't stand up to any scrutiny or even cursory review, from what
I can see, and if we aren't using them in the docs, it seems extremely
unlikely that they'd end up getting added to the main website.

If there are other resources which you have specific concerns regarding,
I'd suggest you go look and see to what extent they're actually using
the terminology in question and then we can discuss them and how much
they are viewed as representing the project.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: I'm surprised to see the word master here

2019-10-08 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Wed, Oct  2, 2019 at 03:04:55PM -0400, Stephen Frost wrote:
> > I don't agree with this claim.  While we could argue about if a hot
> > standby is considered "active" or not, the vast majority of the world
> > considers "active/active" to actually be where you can use the two
> > systems interchangably, including being able to write to both.  As such,
> > I disagree with this claim- while perhaps you could make an argument
> > that it's "technically" correct, it's not how the terms are used in
> > practice and saying active/active instead would be well understood by
> > the community and industry at large.
> 
> With master/standby-replica-slave, it is clear what multi-master is, and
> what master/replica is.  If you start using active-active, is it
> active/replica?  The full choices are:

I'm a bit confused, as I thought that I explained exactly my thoughts on
this in the paragraph you quoted.

No, active/active isn't "active/replica", it's what we call today
"multi-master".

I'm also at a loss as to why we're discussing what to call 'master/
master' here since we don't actually use the term 'master/master' in our
docs at all today that I can see.  There's one place where we talk about
'Multimaster' and that's here:

https://www.postgresql.org/docs/current/different-replication-solutions.html

> Whatever terms we use, it would be nice to use the same term for the
> multi-master as for master/replica.  Using active-active and
> primary/replica just seems odd.  Multi-primary?  Seems odd since primary
> suggests one, though multiple master seems odd too, i.e., more than one
> master.  Multi-active seems the most logical, or active-active, but then
> active-replica seems odd because it suggests the repica is not active,
> i.e. does nothing.  Is no clear logical terminology possible?

I'm confused here again.  Why would we want to use the same term for two
primaries that are working together as we would for a primary and a
replica...?  Those are two quite different setups, and, at least in my
experience, 'active-active' only applies to the case where you have two
primaries involved.

In any case, removing the term 'master' from:

https://www.postgresql.org/docs/current/warm-standby.html

doesn't actually run into any of these issues because we don't say
'Multimaster' or 'master/master' or anything like that on that page, we
just keep going back and forth between 'master' and 'primary' without
any real reason for doing so.  There's also some areas which could use
clarification, like:

In standby mode, the server continuously applies WAL received from the
master server.

Which isn't entirely accurate- the server continuously applies WAL
received from the primary, or the upstream standby it is connected to,
though we could possibly just include a reference down to Cacading
Replication to clarify that.

Of course, down in cascading replication we actually say:

which eventually links to a single master/primary server.

Which just re-enforces that we're pretty confused in that page regarding
the language/distinction between those two.

Coming back around to:

https://www.postgresql.org/docs/current/different-replication-solutions.html

and the "master/Multimaster" references there- most of the "master" ones
can be changed directly to "primary" (and most of the "standby" one
should really probably be "replica"...  that's a different task though,
to go through and clean up that, but also a worthwhile one, imv).  The
"Multimaster" references could go to "active-active" without any loss in
readability or understanding, imv, and we further explain each case in
the description of those anyway.

Though I have to say that I take some exception with the general
"Multimaster" or "active-active" concept being applied to offline /
periodically sync'd systems like when you have laptops or such involved,
but that's a different issue.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: I'm surprised to see the word master here

2019-10-02 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Wed, Oct 2, 2019 at 3:10 PM Jonathan S. Katz 
> wrote:
> > On 10/2/19 7:39 AM, Chris Travers wrote:
> > > On Wed, Oct 2, 2019 at 12:57 PM Erikjan Rijkers  > > > wrote:
> > >
> > > On 2019-10-02 12:46, Peter Eisentraut wrote:
> > > > On 2019-10-02 10:21, Magnus Hagander wrote:
> > > >> Exactly. Both might be accurate, but one comes with a lot less
> > > >> baggage.
> > > >>
> > > >> I support a search and replace.
> > > >>
> > > >> I think it'll take a bit more than just a simple "sed script to
> > > >> replace", if that's what you mean. But probably not all that much
> > --
> > > >> but
> > > >> there can certainly be cases where nearby langaugae also has to be
> > > >> changed to make it work properly. But I have a hard time seeing
> > it as
> > > >> being a *huge* undertaking.
> > > >
> > > > I find this proposal to be dubious and unsubstantiated.  Do we
> > need to
> > > > get rid of "multimaster", "postmaster"?
> > > >
> > >
> > > IMHO, hat would seem a bad idea.  Let's not take the politicising too
> > > far.
> > >
> > > I would say leave it at abolishing 'slave' (as we have already done).
> > >
> > >
> > > But that raises an important point, which is that if we remove master
> > > entirely from the replication lexicon, then I don't see how multi-master
> > > makes sense.  If consistency is a goal, postmaster still works but there
> > > is no alternative to multi-master in common usage.
> >
> > At various events and tradeshows that include representation from other
> > database systems, the terminology that I hear is "active-active" -- this
> > is not one-off, but from a lot of people. This is also a common term for
> > the major proprietary systems as well. I hear it much more commonly than
> > "multi-master" even.
> 
> That has the tiny problem of not being correct though.
> 
> A classic primary/standby cluster is *also* active/active. It used to be
> very common to have active/passive clusters -- these were the typical
> shared-disk-mounted-on-one-node-at-a-time style clusters. This indicates
> that the standby node isn't available *at all* until after a
> fail/switchover. So pretty much anything based on our streaming replication
> today is active/active..

I don't agree with this claim.  While we could argue about if a hot
standby is considered "active" or not, the vast majority of the world
considers "active/active" to actually be where you can use the two
systems interchangably, including being able to write to both.  As such,
I disagree with this claim- while perhaps you could make an argument
that it's "technically" correct, it's not how the terms are used in
practice and saying active/active instead would be well understood by
the community and industry at large.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: I'm surprised to see the word master here

2019-10-02 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 2019-09-25 00:28, Dave Cramer wrote:
> > Ya, I was under that impression as well. 
> > 
> > Dave Cramer
> > 
> > 
> > On Tue, 24 Sep 2019 at 18:18, Renee  > > wrote:
> > 
> > I was under the impression that both terms were being deprecated. Is
> > that only the case when they appear in tandem?
> 
> Again, you might be confusing this.  I don't recall any such initiative
> nor do I see any commits to that effect.

Alright then, given we have multiple people asking about this- should we
be considering adopting different language, even if we hadn't previously
had such an initiative?

I know that I tend towards primary/replica when discussing physical
replication, and we do that quite a bit in the documentation (consider
https://www.postgresql.org/docs/11/warm-standby.html where we seem to be
pretty confused about if we want to talk about the system as a 'primary'
or as a 'master'- but *clearly* primary is winning the war there).

Even if we aren't avoiding the term for its negative connotations
explicitly, having some consistency here strikes me as worthwhile.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: readability changes to postgres.sgml

2019-08-15 Thread Stephen Frost
Greetings,

* Joshua D. Drake (j...@commandprompt.com) wrote:
> attached

While I'm all for working on improving the documentation and, in
particular, our tutorials, the above description of what the suggested
change is seems to be rather..  lacking, and the changes themselves
don't come across as obvious or clear improvements (and in some cases
they seem to be simply removing words and removing content that is
actually important and valuable, making it a net negative change).

Specifically-

>  Welcome to the PostgreSQL Tutorial.  The
> -following few chapters are intended to give a simple introduction
> +tutorial is intended to give an introduction
>  to PostgreSQL, relational database

I disagree with removing 'simple'- after all, that's exactly what we
want this tutorial to be and including that hopefully encourages
individuals to move forward.  I'd argue the same applies to pointing out
that the tutorial itself is only a few chapters and isn't the whole rest
of the documentation.

> -concepts, and the SQL language to those who are new to any one of
> -these aspects.  We only assume some general knowledge about how to
> -use computers.  No particular Unix or programming experience is
> -required.  This part is mainly intended to give you some hands-on
> -experience with important aspects of the
> -PostgreSQL system.  It makes no attempt
> -to be a complete or thorough treatment of the topics it covers.
> +concepts, and the SQL language. We assume some general knowledge about 
> +how to use computers and no particular Unix or programming experience is
> +required.  This tutorial is intended to provide hands-on experience with 
> +important aspects of the PostgreSQL system.  
> +It makes no attempt to be a comprehensive treatment of the topics it 
> covers.
> 

This seems to primairly just remove the "who are new to any one of those
aspects" but that's pretty key to the goal of this tutorial and it
speaks to how we should be thinking about the rest of this part of the
documentation.

> 
> -After you have worked through this tutorial you might want to move
> -on to reading  to gain a more formal knowledge
> +After you have successfully completed this tutorial you will want to
> +read the  section to gain a better understanding
>  of the SQL language, or  for
> -information about developing applications for
> -PostgreSQL.  Those who set up and
> -manage their own server should also read .
> +information about developing applications with
> +PostgreSQL.  Those who provision and
> +manage their own PostgreSQL installation should also read  linkend="admin"/>.
> 
>

Why change "might" to "will"..?  Or remove "formal"?  Also not sure
about changing "set up" to "provision", the latter seems to imply a
particular environment while the former doesn't.

> @@ -66,28 +64,26 @@
>  This part describes the use of the SQL language
>  in PostgreSQL.  We start with
>  describing the general syntax of SQL, then
> -explain how to create the structures to hold data, how to populate
> -the database, and how to query it.  The middle part lists the
> -available data types and functions for use in
> -SQL commands.  The rest treats several
> -aspects that are important for tuning a database for optimal
> -performance.
> +how to create tables, how to populate the database, and how to 
> +query it.  The middle part lists the available data types and 
> +functions for use in SQL commands.  Lasty,
> +we address several aspects of importantance for tuning a database.
> 

The term "structures to hold data" seems to be specifically used because
we haven't yet defined what a 'table' is, so I don't agree with this
change either.

The later changes seem to be in a similar vein..  Dropping things like
"language" when talking about server-side programming languages,
removing references to "in this part" or changing them to be "in this
tutorial" or similar, and just don't strike me as particularly good
improvements or ones which have a specific direction or a defined reason
for being made.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Google Season of Docs 2019 - PostgreSQL

2019-05-20 Thread Stephen Frost
Greetings,

* Raghav Jajodia (jajodia.rag...@gmail.com) wrote:
> I am Raghav Jajodia, a software engineer from Bangalore, India. I have been
> a diligent open source contributor and have been a student under the
> following:
> 1. Google Summer of Code 2017 student
> 2. OWASP Code Sprint Winner
> 3. FOSSASIA CodeHeat 2018 Grand Prize winner

GSoD, based on my understanding, is not intended as an internship and is
not comparable to GSoC in that regard.  Instead, it's for experienced
technical writers.

> Apart from building a solid proposal, I would like to know, what exactly
> should I do in this period to increase my chances of selection. I am
> looking forward to be a long term contributor, both in terms of code and
> documentation for PostgreSQL organisation. Looking forward to your response.

I would suggest that you reach out to Google to discuss if GSoD is a
good fit for you.  Note also that the PG GSoD wiki page asks for
proposals to be sent to the pgsql-docs mailing list, not here.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: SET ROLE documentation not entirely correct

2019-04-25 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Joe Conway  writes:
> > On 4/25/19 1:08 PM, Joe Conway wrote:
> >> There is one image which can be used as an example though:
> >> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=doc/src/sgml/storage.sgml;hb=HEAD
> 
> > Here is a more precise URL:
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=doc/src/sgml/storage.sgml;hb=HEAD#l934
> 
> Comparing that to what I see at
> 
> https://www.postgresql.org/docs/devel/storage-page-layout.html
> 
> does not lead to a warm feeling about our images-in-docs stuff
> really being ready for prime time.  It works OK in some browsers
> but others (at least Safari) cut off the image.

Yeah, I'm seeing it cut off in Chrome too, initially, and then it seems
to fix itself if I resize the window down from 4k (and stay fixed even
if I then go back to the original, much larger, size).  That seems like
a CSS or layout issue though, not an issue with the image itself..?

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Minor typo in sentence

2018-12-06 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * PG Doc comments form (nore...@postgresql.org) wrote:
> > The following documentation comment has been logged on the website:
> > 
> > Page: https://www.postgresql.org/docs/11/planner-stats.html
> > Description:
> > 
> > There is typo in "14.2.2. Extended Statistics": 
> > 
> > > Statistics objects are created using CREATE STATISTICS, which see for more
> > details.
> > 
> > I think the part after the comma can be removed
> 
> Agreed, my thinking is that we should have that be:
> 
> Statistics objects are created using the CREATE STATISTICS command.
> 
> which is more in-line with how we refer to commands elsewhere.  I'll
> make that change and back-patch it later today.

I've pushed this change now.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Minor typo in sentence

2018-12-06 Thread Stephen Frost
Greetings,

* PG Doc comments form (nore...@postgresql.org) wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/11/planner-stats.html
> Description:
> 
> There is typo in "14.2.2. Extended Statistics": 
> 
> > Statistics objects are created using CREATE STATISTICS, which see for more
> details.
> 
> I think the part after the comma can be removed

Agreed, my thinking is that we should have that be:

Statistics objects are created using the CREATE STATISTICS command.

which is more in-line with how we refer to commands elsewhere.  I'll
make that change and back-patch it later today.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Return codes for archive and restore commands

2018-11-28 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Wed, Nov 28, 2018 at 09:39:58PM -0500, Stephen Frost wrote:
> > Having discussed this quite a bit lately with David Steele and Magnus,
> > it's pretty clear that we need to completely rip out how this works
> > today and rewrite it based around an extension model where a background
> > worker can start up and essentially take the place of the archiver
> > process, with flexibility to jump forward through the WAL stream,
> > communicate clearly with other processes, handle failure to do so
> > gracefully based on the specific cases, etc.
> 
> Hm.  When an instance state is in PM_SHUTDOWN_2, the postmaster
> explicitely waits for the WAL senders and the archiver to shut down.  So
> I think that you would need more control regarding the timing a bgworker
> should be shut down first to be completely correct.

Yes, it couldn't be exactly the same as a generic background worker,
that's a good point.  We definitely need to make sure that the
postmaster waits for the archiver to shut down, as it does for the WAL
senders.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Return codes for archive and restore commands

2018-11-28 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Wed, Nov 28, 2018 at 11:00:31AM +, PG Doc comments form wrote:
> > For the archive command:
> > <=128 There are not errors in the PostgreSQL log (messages with severity
> > equal or higher than ERROR). Firstly 3 messages of type LOG about fault,
> > then WARNING about this and pause for 1 minute, then repeated.
> > >=129 FATAL error in the PostgeSQL log. The message about stoping an archive
> > process, but not the database. Repeated after roughly 16 seconds.
> 
> This code is around for some time, and comes from this commit:
> commit: 3ad0728c817bf8abd2c76bd11d856967509b307c
> author: Tom Lane 
> date: Tue, 21 Nov 2006 20:59:53 +
> committer: Tom Lane 
> date: Tue, 21 Nov 2006 20:59:53 +
> On systems that have setsid(2) (which should be just about everything except
> Windows), arrange for each postmaster child process to be its own process
> group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
> process group not only the direct child process.  This provides saner behavior
> for archive and recovery scripts; in particular, it's possible to shut down a
> warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
> of SIGQUIT to the startup subprocess will result in killing the waiting
> recovery_command.  Also, this makes Query Cancel and statement_timeout apply
> to scripts being run from backends via system().  (There is no support in the
> core backend for that, but it's widely done using untrusted PLs.)  Per gripe
> from Stephen Harris and subsequent discussion.
> 
> The relevant part if pgarch_archiveXlog() in pgarch.c, and this part
> is most relevant:
> * Per the Single Unix Spec, shells report exit status > 128 when a
> * called command died on a signal.
> 
> > In this case PostgreSQL tries confirm rules for return codes of a unix
> > shell. A unix shell return 126 in the case of "command not executable", 127
> > in the case "command not found", 128+# of signal in the case if application
> > interrupted by uncatched signal.
> 
> If you were to rewrite those paragraphs or make them more precise, how
> would you actually shape your suggestions?  I personally quite like the
> current formulations, but I am rather used to it to be honest.

This is another example, at least imv, of why we really need to move
away from archive_command as an interface for doing WAL archiving.

Having discussed this quite a bit lately with David Steele and Magnus,
it's pretty clear that we need to completely rip out how this works
today and rewrite it based around an extension model where a background
worker can start up and essentially take the place of the archiver
process, with flexibility to jump forward through the WAL stream,
communicate clearly with other processes, handle failure to do so
gracefully based on the specific cases, etc.

We could then possibly write an extension to be included that mimics
what archive_command does today, but imv we should immediately consider
it deprecated and encourage people to move off of it.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Document the limit on the number of parameters

2018-08-07 Thread Stephen Frost
Greetings Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> On Tue, Jul 17, 2018 at 01:25:08PM +, PG Doc comments form wrote:
> >>> The backend protocol limits the number of parameters that can be passed 
> >>> for
> >>> a prepared statement to 32767 (2 byte signed integer).
> 
> >> Uh, is this something people will care about?
> 
> > Seems like a limitation in the backend protocol should probably be
> > documented in the backend protocol section of the docs..
> 
> It is: the definition of the Bind message format says that the parameter
> count is an int16.
> 
> The question is whether this deserves to be cross-referenced from
> elsewhere.  I agree with Bruce that it seems unlikely to be helpful.

Ah, I read the original poster as saying that it wasn't in the protocol
docs.  If it's there already then I agree, that's sufficient.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Document the limit on the number of parameters

2018-08-07 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Jul 17, 2018 at 01:25:08PM +, PG Doc comments form wrote:
> > The following documentation comment has been logged on the website:
> > 
> > Page: https://www.postgresql.org/docs/10/static/sql-prepare.html
> > Description:
> > 
> > The backend protocol limits the number of parameters that can be passed for
> > a prepared statement to 32767 (2 byte signed integer).
> > 
> > As that is something that is independent of the client library, I think this
> > should be documented in the user facing manual for the Postgres server
> > somewhere. 
> > 
> > I did not find anything in the chapter about PREPARE
> > https://www.postgresql.org/docs/current/static/sql-prepare.html or about the
> > backend protocol:
> > https://www.postgresql.org/docs/current/static/protocol.html nor on the FAQ
> > page: https://wiki.postgresql.org/wiki/FAQ
> 
> Uh, is this something people will care about?

Seems like a limitation in the backend protocol should probably be
documented in the backend protocol section of the docs..

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Missing column_constraint explanation

2018-01-14 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Sat, Jan 13, 2018 at 09:06:22PM -0500, Stephen Frost wrote:
> > I'm not really sure that we want to go there for this case though.
> > Perhaps others disagree, but that seems like a lot to avoid this
> > particular duplication, which really isn't all that bad.
> > 
> > This patch also seems to have gotten lost in the shuffle of things, but
> > it still applies cleanly and I took another look at it today and it
> > looks good to me, so I'm going to stick it in the CF and mark it as
> > Needs Review for now.  Perhaps someone else can give it another
> > once-over to make sure everything looks good and, if so, mark it as
> > Ready for Committer and then I'll take care of it.
> 
> I may be missing something, but no patch is attached to this
> thread. Honestly I think that duplicating this code should be avoided,
> and the patch to produce is not that complicated technically. So are you
> planning to just duplicate hte definitions in CREATE TABLE to ALTER
> TABLE? This is a recipy for forgetting updates in the future on one
> page or the other...

I'm not really sure why the thread keeps getting broken, but the
original was here:

https://www.postgresql.org/message-id/flat/CAB_COdgOoA=G18RhWPoW8zZ+xOxTns7xD7psHA=ct+xccok...@mail.gmail.com#CAB_COdgOoA=G18RhWPoW8zZ+xOxTns7xD7psHA=ct+xccok...@mail.gmail.com

where the latest patch is from Amit.

I agree that there's some risk that someone changing what
'column_constraint' means will miss changing it everywhere it needs to
be changed in the documentation, but that's hardly new, we have quite a
few places that need to be changed when someone adds such a new feature.

We also do not actually have the exact same definition of it everywhere
either- see CREATE TABLE and CREATE FOREIGN TABLE for example.  In this
case, column_constraint is the same between CREATE TABLE and ALTER
TABLE, and presumably the same is true between CREATE FOREIGN TABLE and
ALTER FOREIGN TABLE, but it's not the same between regular tables and
foreign tables, despite sharing the same name.  As such, someone
extending what column_constraint means would need to at least be
thinking about if their new feature needs to be added into the CREATE
FOREIGN TABLE definition too.  Unless we start marking up the
documentation somehow further to indicate which bits of
column_constraint are accepted for regular tables and which are accepted
for foreign tables, but we may also have cases where one part of
'column_constraint' is actually different between the two for perhaps
entirely appropriate technical reasons.

Further, as mentioned upthread in
https://www.postgresql.org/message-id/20171221031511.GX4628%40tamriel.snowman.net
I'm concerned that this wouldn't be very easy to make work with the way
psql's documentation is generated.  Did you take a look at how that
works with create_help.pl?  I don't see create_help.pl as supporting
'entity'.  We could perhaps fix that, but I'm not really sure it's worth
it and it's certainly quite a bit to add on to this particular patch.

If we used the method that features-supported.sgml uses, where we have
another perl script that's used to build the sgml files up from text
files, then we'd also need to make building psql depend on the docs
having been built, I believe, and I don't think we really want to do
that either.

I wouldn't be against a larger patch which actually refactored all of
these sub-stanzas into independent files (with unique names, unlike how
column_constraint is today) and also taught create_help.pl and the doc
build to be able to work with them, but that's a great deal more work
and nothing in this patch stops that from being done in the future.

In the end, I'm inclinced to simply adopt this change and even consider
back-patching it as a documentation fix.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Is this still accurate?

2018-01-05 Thread Stephen Frost
Greetings,

* Moser, Glen G (glen.mo...@charter.com) wrote:
> That's really the gist of the concern from a team member of mine.  Not that 
> the 4TB number is wrong but that it could be misleading to assume that 4TB is 
> some sort of upper bound.
> 
> That's how this concern was relayed to me and I am just following up.

Well, saying 'in excess of' is pretty clear, but I don't think the
sentence is really adding much either, so perhaps we should just remove
it.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Is this still accurate?

2018-01-05 Thread Stephen Frost
Greetings,

* Moser, Glen G (glen.mo...@charter.com) wrote:
> Can someone confirm the accuracy of the information found at 
> https://www.postgresql.org/about/?
> 
> Specifically the maximum data values in the screen shot below...it seems as 
> though this documentation might be out of date.

The part you highlighted was:

"There are active PostgreSQL systems in production environments that
manage in excess of 4 terabytes of data."

Which is pretty accurate, I know of some myself that are larger than
4TB.  That 4TB number isn't a limit of any kind and the sentence says
"in excess of" meaning that there are databases larger than that.
There's actually some which are quite a bit larger than that, in fact.

We could bump the number up there or remove the sentence, but I don't
think there's anything inaccurate about the statement.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Missing column_constraint explanation

2017-12-20 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Wed, Dec 20, 2017 at 6:08 PM, PG Doc comments form
>  wrote:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/9.6/static/sql-altertable.html
> > Description:
> >
> > Missing column_constraint explanation in parameters section
> 
> Those docs say already that ADD COLUMN follows the same grammar as
> CREATE TABLE, which basically means that there is no need to duplicate
> the same definition in two places. Note that the same thing applies to
> table_constraint.

I actually disagree with this because it means that psql's \h output for
ALTER TABLE references column_constraint but doesn't define it anywhere.

I'd rather see us move in the other direction- let's try to make the \h
output for each command actually stand alone.  There was some progress
made in that direction recently though I don't recall which command it
was for off-hand, but I'd prefer if it was a general rule.

Now, if we could do that in such a way that we avoid having to actually
duplicate the 'source' for these productions into different places in
the documentation, that would be fantastic because it certainly isn't
fun having to find all the places that need to be updated, but I'm not
sure how easy that would be to do (and to make work with how psql's help
is generated...).

Thanks!

Stephen


signature.asc
Description: Digital signature