Re: Stronger safeguard for archive recovery not to miss data

2021-01-20 Thread Laurenz Albe
On Wed, 2021-01-20 at 13:10 +0900, Fujii Masao wrote:
> +errhint("Run recovery again from a new base 
> backup taken after setting wal_level higher than minimal")));
> 
> Isn't it impossible to do this in the normal archive recovery case?  In that
> case, since the server crashed and the database got corrupted, we probably
> cannot take a new base backup.
> 
> Originally, even when users accidentally set wal_level to minimal, they could
> start the server from the backup by disabling hot_standby and salvage the data.
> But with the patch, we lose that option. Right? I was wondering whether
> WARNING was used intentionally there for that case.

I would argue that if you set your "wal_level" to minimal, do some work,
reset it to "replica" and recover past that, two things could happen:

1. Since "archive_mode" was off, you are missing some WAL segments and
   cannot recover past that point anyway.

2. You are missing some relevant WAL records, and your recovered
   database is corrupted.  This is very likely, because you probably
   switched to "minimal" with the intention to generate less WAL.

Now I see your point that a corrupted database may be better than no
database at all, but wouldn't you agree that a warning in the log
(that nobody reads) is too little information?

Normally we don't take such a relaxed attitude towards data corruption.

Perhaps there could be a GUC "recovery_allow_data_corruption" to
override this check, but I'd say that a warning is too little.

> if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
>     ereport(ERROR,
>             (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
>              errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
> 
> With the patch, we never reach the above code?

Right, that would have to go.  I didn't notice that.

Yours,
Laurenz Albe





Re: Stronger safeguard for archive recovery not to miss data

2021-01-21 Thread Laurenz Albe
On Thu, 2021-01-21 at 11:49 +, osumi.takami...@fujitsu.com wrote:
> Adding a condition that checks if "recovery_allow_data_corruption" is 'on'
> around the end of CheckRequiredParameterValues() sounds safer to me too,
> although implementing a new GUC parameter is bigger than what I expected
> at first.  The default value should be 'off' to protect users from ending
> up with a corrupted server.
> Does everyone agree with this direction?

I'd say that adding such a GUC is material for another patch, if we want it at all.

I think it is very unlikely that people will switch from "wal_level=replica"
to "minimal" and back very soon afterwards and also try to recover past such
a switch, which probably explains why nobody has complained about data
corruption generated that way.  To get the server to start with
"wal_level=minimal", you must set "archive_mode=off" and "max_wal_senders=0",
and few people will do that and still expect recovery to work.
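
For illustration, that is the combination one would have to set (a sketch
using ALTER SYSTEM; editing postgresql.conf directly works just as well):

  ALTER SYSTEM SET wal_level = minimal;
  ALTER SYSTEM SET archive_mode = off;
  ALTER SYSTEM SET max_wal_senders = 0;
  -- all three are postmaster-level settings, so a restart is required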

My vote is that we should not have a GUC for such an unlikely event, and that
stopping recovery is good enough.

Yours,
Laurenz Albe





Re: Stronger safeguard for archive recovery not to miss data

2021-01-21 Thread Laurenz Albe
On Thu, 2021-01-21 at 13:09 +, osumi.takami...@fujitsu.com wrote:
> > My vote is that we should not have a GUC for such an unlikely event,
> > and that stopping recovery is good enough.
> 
> OK. IIUC, my current patch for this fix doesn't need to be changed or withdrawn.
> Thank you for your explanation.

Well, that's just my opinion.
Fujii Masao seemed to disagree with the patch, and his voice carries weight.

Yours,
Laurenz Albe





Re: Stronger safeguard for archive recovery not to miss data

2021-01-24 Thread Laurenz Albe
On Thu, 2021-01-21 at 15:30 +0100, I wrote:
> On Thu, 2021-01-21 at 13:09 +, osumi.takami...@fujitsu.com wrote:
> 
> > > My vote is that we should not have a GUC for such an unlikely event,
> > > and that stopping recovery is good enough.
> > OK. IIUC, my current patch for this fix doesn't need to be changed or withdrawn.
> > Thank you for your explanation.
> 
> Well, that's just my opinion.
> 
> Fujii Masao seemed to disagree with the patch, and his voice carries weight.

I think you should post another patch where the second, now superfluous,
error message is removed.

Yours,
Laurenz Albe





Re: Stronger safeguard for archive recovery not to miss data

2021-01-25 Thread Laurenz Albe
On Mon, 2021-01-25 at 08:19 +, osumi.takami...@fujitsu.com wrote:
> > I think you should post another patch where the second, now superfluous,
> > error message is removed.
>
> Updated. This patch showed no failure during regression tests
> and has been aligned by pgindent.

Looks good to me.
I'll set it to "ready for committer" again.

Yours,
Laurenz Albe





Re: Error on failed COMMIT

2021-01-26 Thread Laurenz Albe
On Mon, 2021-01-25 at 11:29 -0500, Dave Cramer wrote:
> Rebased against head 
> 
> Here's my summary of the long thread above.
> 
> This change is in keeping with the SQL spec.
> 
> There is an argument (Tom) that says that this will annoy more people than
> it will please.  I presume this is due to the fact that libpq behaviour
> will change.
> 
> As the author of the JDBC driver, and I believe I speak for other driver
> authors as well (NPGSQL for one) that have implemented the protocol, I would
> argue that the current behaviour is more annoying.
> 
> We currently have to keep state and determine if COMMIT actually failed or
> it ROLLED BACK.  There are a number of async drivers that would also benefit
> from not having to keep state in the session.

I think this change makes sense, and I believe everybody agrees that it
makes PostgreSQL more standards compliant.

About the fear that it will break users' applications:

I think that the breakage will be minimal.  All that will change is that
COMMIT of an aborted transaction raises an error.

Applications that catch an error in a transaction and roll back will not
be affected.  What will be affected are applications that do *not* check for
errors in statements in a transaction, but check for errors in the COMMIT.
I think that doesn't happen often.
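
A minimal session to illustrate the change (the exact error message is up
to the patch):

  BEGIN;
  SELECT 1/0;   -- fails and aborts the transaction
  COMMIT;       -- today: silently rolls back and reports "ROLLBACK";
                -- with the patch: raises an error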

I agree that some people will be hurt, but I don't think it will be a major problem.

The patch applies and passes regression tests.

I wonder about the introduction of the new USER_ERROR level:

 #define WARNING_CLIENT_ONLY 20 /* Warnings to be sent to client as usual, but
                                 * never to the server log. */
-#define ERROR      21  /* user error - abort transaction; return to
+#define USER_ERROR 21
+#define ERROR      22  /* user error - abort transaction; return to
                         * known state */
 /* Save ERROR value in PGERROR so it can be restored when Win32 includes
  * modify it.  We have to use a constant rather than ERROR because macros
  * are expanded only when referenced outside macros.
  */
 #ifdef WIN32
-#define PGERROR    21
+#define PGERROR    22
 #endif
-#define FATAL      22  /* fatal error - abort process */
-#define PANIC      23  /* take down the other backends with me */
+#define FATAL      23  /* fatal error - abort process */
+#define PANIC      24  /* take down the other backends with me */

I see that without that, COMMIT AND CHAIN does not behave correctly,
since the respective regression tests fail.

But I don't understand why.  I think that this needs some more comments to
make this clear.

Is this new message level something we need to allow setting for
"client_min_messages" and "log_min_messages"?

Yours,
Laurenz Albe





Re: Error on failed COMMIT

2021-01-26 Thread Laurenz Albe
On Tue, 2021-01-26 at 11:09 -0500, Dave Cramer wrote:
> On Tue, 26 Jan 2021 at 05:05, Laurenz Albe  wrote:
>
> > I wonder about the introduction of the new USER_ERROR level:
> > 
> >  #define WARNING_CLIENT_ONLY 20 /* Warnings to be sent to client as usual, but
> >                                  * never to the server log. */
> > -#define ERROR      21  /* user error - abort transaction; return to
> > +#define USER_ERROR 21
> > +#define ERROR      22  /* user error - abort transaction; return to
> >                          * known state */
> >  /* Save ERROR value in PGERROR so it can be restored when Win32 includes
> >   * modify it.  We have to use a constant rather than ERROR because macros
> >   * are expanded only when referenced outside macros.
> >   */
> >  #ifdef WIN32
> > -#define PGERROR    21
> > +#define PGERROR    22
> >  #endif
> > -#define FATAL      22  /* fatal error - abort process */
> > -#define PANIC      23  /* take down the other backends with me */
> > +#define FATAL      23  /* fatal error - abort process */
> > +#define PANIC      24  /* take down the other backends with me */
> > 
> > I see that without that, COMMIT AND CHAIN does not behave correctly,
> > since the respective regression tests fail.
> > 
> > But I don't understand why.  I think that this needs some more comments to
> > make this clear.
> 
> First off thanks for reviewing.
> 
> The problem is that ereport does not return for any level equal to or
> above ERROR.  This code required it to return so that it could continue
> processing.

Oh, I see.

After thinking some more about it, I think that COMMIT AND CHAIN would have
to change behavior: if COMMIT throws an error (because the transaction was
aborted), no new transaction should be started.  Everything else seems fishy:
the statement fails, but still starts a new transaction?
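
A sketch of the case I mean:

  BEGIN;
  SELECT 1/0;        -- aborts the transaction
  COMMIT AND CHAIN;  -- with the patch, this raises an error;
                     -- should it still start a new transaction?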

I guess that's also at fault for the unexpected result status that
Masahiko complained about in the other message.

So I think we should not introduce USER_ERROR at all.  It is too much
of a kluge: fail, but not really...

I guess that is one example for the incompatibilities that Tom worried
about upthread.  I am beginning to see his point better now.

Yours,
Laurenz Albe





Re: Error on failed COMMIT

2021-01-26 Thread Laurenz Albe
On Tue, 2021-01-26 at 12:25 -0500, Dave Cramer wrote:
> > After thinking some more about it, I think that COMMIT AND CHAIN would have
> > to change behavior: if COMMIT throws an error (because the transaction was
> > aborted), no new transaction should be started.  Everything else seems fishy:
> > the statement fails, but still starts a new transaction?
> > 
> > I guess that's also at fault for the unexpected result status that
> > Masahiko complained about in the other message.
> 
>  
> I haven't had a look at the result status in libpq. For JDBC we don't see
> that.  We throw an exception when we get this error report. This is very
> consistent, as the commit fails and we throw an exception.
> 
> > So I think we should not introduce USER_ERROR at all.  It is too much
> > of a kluge: fail, but not really...
> 
> What we do now is actually worse, as we do not get an error report and we
> silently change commit to rollback.
> How is this better?

I see your point from the view of the JDBC driver.

It just feels hacky - somewhat similar to what you say
above: don't go through the normal transaction rollback steps,
but issue an error message.

At least we should fake it well...

Yours,
Laurenz Albe





Re: Commitfest 2021-01 is now closed.

2021-02-01 Thread Laurenz Albe
On Mon, 2021-02-01 at 23:33 +0900, Masahiko Sawada wrote:
> I've closed this commitfest. If you have feedback or comment on my CFM
> work, please tell me here or by directly emailing me.

I think you did an excellent job.

Yours,
Laurenz Albe





Bug with indexes on whole-row expressions

2020-06-29 Thread Laurenz Albe
CREATE TABLE boom (a integer, b integer);

-- index on whole-row expression
CREATE UNIQUE INDEX ON boom ((boom));

INSERT INTO boom VALUES
   (1, 2),
   (1, 3);

ALTER TABLE boom DROP b;

TABLE boom;

 a 
---
 1
 1
(2 rows)

REINDEX TABLE boom;
ERROR:  could not create unique index "boom_boom_idx"
DETAIL:  Key ((boom.*))=((1)) is duplicated.

The problem here is that there *is* a "pg_depend" entry for the
index, but it only depends on the whole table, not on specific columns.
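
You can see that with a query like the following (a sketch; "refobjsubid = 0"
means the dependency is on the table as a whole rather than on a column):

  SELECT objid::regclass AS index, refobjsubid
  FROM pg_depend
  WHERE classid = 'pg_class'::regclass
    AND refobjid = 'boom'::regclass;

An index on a plain column would instead carry the column's attribute number
in "refobjsubid", which is what makes DROP COLUMN notice it.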

I have been thinking what would be the right approach to fix this:

1. Don't fix it, because it is an artificial corner case.
   (But I can imagine someone trying to exclude duplicate rows with
   a unique index.)

2. Add code that checks if there is an index with a whole-row reference
   in the definition before dropping a column.
   That feels like a wart for a special case.

3. Forbid indexes on whole-row expressions.
   After all, you can do the same with an index on all the columns.
   That would open the question what to do about upgrading old databases
   that might have such indexes today.

4. Add dependencies on all columns whenever a whole-row expression
   is used in an index.
   That would need special processing for pg_upgrade.

I'd like to hear your opinions.

Yours,
Laurenz Albe





Re: Remove Deprecated Exclusive Backup Mode

2020-07-01 Thread Laurenz Albe
On Tue, 2020-06-30 at 17:38 -0400, David Steele wrote:
> Rebased patch is attached.

I remain -1 on removing the exclusive backup API.

People who don't mind manual intervention when PostgreSQL doesn't
start after a crash in backup mode could safely use it.

We have been over this, and there is not much point in re-opening
the discussion.  I just had to speak up.

> Lastly, there is some concern about getting the backup label too late 
> when doing snapshots or using traditional backup software. It would 
> certainly be possible to return the backup_label and tablespace_map from 
> pg_start_backup() and let the user make the choice about what to do with 
> them. Any of these methods will require some scripting so I don't see 
> why the files couldn't be written as, for example, backup_label.snap and 
> tablespace_map.snap and then renamed by the restore script. The patch 
> does not currently make this change, but it could be added pretty easily 
> if that overcomes this objection.

Those would be interesting improvements that would make the non-exclusive
backup API easier to use.

Yours,
Laurenz Albe





Re: Remove Deprecated Exclusive Backup Mode

2020-07-01 Thread Laurenz Albe
On Wed, 2020-07-01 at 08:47 -0400, Stephen Frost wrote:
> > I remain -1 on removing the exclusive backup API.
> 
> It's still deprecated and I'm certainly against removing that status
> until/unless someone actually fixes it (including documentation), and if
> we're not going to do that then we should remove it.

Well, it doesn't need fixing, since it is working just fine (for certain
values of "just fine").

I agree with the need to document the problems better.

Yours,
Laurenz Albe





Re: Remove Deprecated Exclusive Backup Mode

2020-07-02 Thread Laurenz Albe
On Wed, 2020-07-01 at 16:46 -0400, Stephen Frost wrote:
> I continue to find it curious that we stress a great deal over (very
> likely) poorly written backup scripts that haven't been updated in the
> 5+? years since exclusive mode was deprecated, but we happily add new
> keywords in new major versions and remove columns in our catalog tables
> or rename columns or entirely redo how recovery works (breaking every
> script out there that performed a restore..) or do any number of other
> things that potentially break applications that communicate through the
> PG protocol and other parts of the system that very likely have scripts
> that were written to work with them.
> 
> I'm really of the opinion that we need to stop doing that.
> 
> If someone could explain what is so special about *this* part of the
> system that we absolutely can't possibly accept any change that might
> break user's scripts, and why it's worth all of the angst, maintenance,
> ridiculously difficult documentation to understand and hacks (the
> interface to pg_start_backup is ridiculously warty because of this), I'd
> greatly appreciate it.

Easy.  Lots of people write backup scripts, and fewer people write
complicated pg_catalog queries.  We would make way more users unhappy.

Besides, there is nothing "poorly written" about a backup script using
exclusive backup as long as the user is aware that there could be some
small degree of trouble in case of a crash.

You have a point about reworking the way recovery works: after teaching
several classes and running into oddities with recovery parameters in
postgresql.conf, I am not sure if that was a good move.
Particularly the necessity for "recovery.signal" makes the New Way no
easier than the Old Way - perhaps something like "pg_ctl start_in_recovery"
would have been smarter.

But one instance where we made users unhappy is not justification
for making them unhappy again.

Yours,
Laurenz Albe





Add session statistics to pg_stat_database

2020-07-08 Thread Laurenz Albe
Here is a patch that adds the following to pg_stat_database:
- number of connections
- number of sessions that were not disconnected regularly
- total time spent in database sessions
- total time spent executing queries
- total idle in transaction time

This is useful to check if connection pooling is working.
It also helps to estimate the size of the connection pool
required to keep the database busy, which depends on the
percentage of the transaction time that is spent idling.
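
For example, with the new columns you can estimate the idle fraction per
database with a query like this (a sketch against the columns this patch adds):

  SELECT datname,
         connections,
         100 * active_time / nullif(session_time, 0) AS pct_active,
         100 * idle_in_transaction_time / nullif(session_time, 0) AS pct_idle_in_xact
  FROM pg_stat_database
  WHERE datname IS NOT NULL;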

Yours,
Laurenz Albe
From: Laurenz Albe 
Date: Wed, 8 Jul 2020 13:12:42 +0200
Subject: [PATCH] Add session statistics to pg_stat_database

If "track_counts" is active, track the following per database:
- number of connections
- number of sessions that were not disconnected regularly
- total time spent in database sessions
- total time spent executing queries
- total idle in transaction time

This is useful to check if connection pooling is working.
It also helps to estimate the size of the connection pool
required to keep the database busy, which depends on the
percentage of the transaction time that is spent idling.
---
 doc/src/sgml/monitoring.sgml |  46 +
 src/backend/catalog/system_views.sql |   5 +
 src/backend/postmaster/pgstat.c  | 138 ++-
 src/backend/tcop/postgres.c  |   5 +
 src/backend/utils/adt/pgstatfuncs.c  |  78 +++
 src/include/catalog/pg_proc.dat  |  20 
 src/include/pgstat.h |  29 +-
 src/test/regress/expected/rules.out  |   5 +
 8 files changed, 324 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dfa9d0d641..da66808f02 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3519,6 +3519,52 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
   
  
 
+ 
+  
+   session_time double precision
+  
+  
+   Time spent in database sessions in this database, in milliseconds.
+  
+ 
+
+ 
+  
+   active_time double precision
+  
+  
+   Time spent executing SQL statements in this database, in milliseconds.
+  
+ 
+
+ 
+  
+   idle_in_transaction_time double precision
+  
+  
+   Time spent idling while in a transaction in this database, in milliseconds.
+  
+ 
+
+ 
+  
+   connections bigint
+  
+  
+   Number of connections established to this database.
+  
+ 
+
+ 
+  
+   aborted_sessions bigint
+  
+  
+   Number of database sessions to this database that did not end
+   with a regular client disconnection.
+  
+ 
+
  
   
stats_reset timestamp with time zone
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5314e9348f..64a4e5f0d4 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -909,6 +909,11 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_checksum_last_failure(D.oid) AS checksum_last_failure,
 pg_stat_get_db_blk_read_time(D.oid) AS blk_read_time,
 pg_stat_get_db_blk_write_time(D.oid) AS blk_write_time,
+pg_stat_get_db_session_time(D.oid) AS session_time,
+pg_stat_get_db_active_time(D.oid) AS active_time,
+pg_stat_get_db_idle_in_transaction_time(D.oid) AS idle_in_transaction_time,
+pg_stat_get_db_connections(D.oid) AS connections,
+pg_stat_get_db_aborted_sessions(D.oid) AS aborted_sessions,
 pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset
 FROM (
 SELECT 0 AS oid, NULL::name AS datname
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index c022597bc0..7b62028358 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -247,6 +247,11 @@ static int	pgStatXactCommit = 0;
 static int	pgStatXactRollback = 0;
 PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
+static TimestampTz pgStatActiveStart = DT_NOBEGIN;
+static PgStat_Counter pgStatActiveTime = 0;
+static TimestampTz pgStatTransactionIdleStart = DT_NOBEGIN;
+static PgStat_Counter pgStatTransactionIdleTime = 0;
+bool pgStatSessionDisconnected = false;
 
 /* Record that's written to 2PC state file when pgstat state is persisted */
 typedef struct TwoPhasePgStatRecord
@@ -326,6 +331,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber anum_oid);
+static void pgstat_send_connstats(bool force);
 
 static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared);
 
@@ -359,6 +365,7 @@ static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len);
 static void pgstat_recv_recoveryconflict(PgStat_MsgR

Re: Postgres is not able to handle more than 4k tables!?

2020-07-10 Thread Laurenz Albe
On Thu, 2020-07-09 at 12:47 -0400, Stephen Frost wrote:
> I realize this is likely to go over like a lead balloon, but the churn
> in pg_class from updating reltuples/relpages has never seemed all that
> great to me when just about everything else is so rarely changed, and
> only through some user DDL action- and I agree that it seems like those
> particular columns are more 'statistics' type of info and less info
> about the definition of the relation.  Other columns that do get changed
> regularly are relfrozenxid and relminmxid.  I wonder if it's possible to
> move all of those elsewhere- perhaps some to the statistics tables as
> you seem to be alluding to, and the others to $somewhereelse that is
> dedicated to tracking that information which VACUUM is primarily
> concerned with.

Perhaps we could create pg_class with a fillfactor less than 100,
so that we get HOT updates there.
That would be less invasive.
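
Just to illustrate the idea, one could experiment on a test cluster with
something like this (emphatically a sketch, not a recommendation; modifying
catalogs needs allow_system_table_mods, settable at run time since v13):

  SET allow_system_table_mods = on;
  ALTER TABLE pg_catalog.pg_class SET (fillfactor = 90);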

Yours,
Laurenz Albe





Re: Transactions involving multiple postgres foreign servers, take 2

2020-07-16 Thread Laurenz Albe
On Fri, 2020-07-17 at 05:21 +, tsunakawa.ta...@fujitsu.com wrote:
> From: Masahiko Sawada 
> > I have briefly checked only oracle_fdw, but in general I think that
> > if an existing FDW supports transaction begin, commit, and rollback,
> > these can be ported to new FDW transaction APIs easily.
> 
> Does oracle_fdw support begin, commit and rollback?

Yes.

> And most importantly, do other major DBMSs, including Oracle, provide the
> API for preparing a transaction?  In other words, will the FDWs other than
> postgres_fdw really be able to take advantage of the new FDW functions to
> join the 2PC processing?
> I think we need to confirm that there are concrete examples.

I bet they do.  There is even a standard for that (X/Open XA).

I am not looking forward to adapting oracle_fdw, and I didn't read the patch.

But using distributed transactions is certainly a good thing if it is done right.

The trade off is the need for a transaction manager, and implementing that
correctly is a high price to pay.

Yours,
Laurenz Albe





Re: Add session statistics to pg_stat_database

2020-08-11 Thread Laurenz Albe
On Thu, 2020-07-23 at 18:16 +0500, Ahsan Hadi wrote:
> On Wed, Jul 8, 2020 at 4:17 PM Laurenz Albe  wrote:
> > Here is a patch that adds the following to pg_stat_database:
> > - number of connections
> 
> Is it expected behaviour to not count idle connections? The connection is
> included after it is aborted, but not while it was idle.

Thanks for looking.

Currently, the patch counts connections when they close.
I could change the behavior so that they are counted immediately.

Yours,
Laurenz Albe





Re: Add session statistics to pg_stat_database

2020-10-13 Thread Laurenz Albe
On Fri, 2020-10-02 at 15:10 -0700, Soumyadeep Chakraborty wrote:
> On Tue, Sep 29, 2020 at 2:44 AM Laurenz Albe  wrote:
> > > * Are we trying to capture ONLY client initiated disconnects in
> > > m_aborted (we are not handling other disconnects by not accounting for
> > > EOF..like if psql was killed)? If yes, why?
> > 
> > I thought it was interesting to know how many database sessions are
> > ended regularly as opposed to ones that get killed or end by unexpected
> > client death.
> 
> It may very well be. It would also be interesting to find out how many
> connections are still open on the database (something we could easily
> glean if we had the number of all disconnects, client-initiated or
> unnatural). Maybe we could have both?
> 
> m_sessions_disconnected;
> m_sessions_killed;

We already have "numbackends" in "pg_stat_database", so we know the number
of active connections, right?

> You are absolutely right! PgBackendStatus is not the place for any of
> these fields. We could place them in LocalPgBackendStatus perhaps. But
> I don't feel too strongly about that now, having looked at similar fields
> such as pgStatXactCommit, pgStatXactRollback etc. If we decide to stick
> with the globals, let's isolate and decorate them with a comment such as
> this example from the source:
> 
> /*
>  * Updated by pgstat_count_buffer_*_time macros
>  */
> extern PgStat_Counter pgStatBlockReadTime;
> extern PgStat_Counter pgStatBlockWriteTime;

I have reduced the number of variables with my latest patch; I think
the rewrite based on your review is definitely an improvement.

The comment you quote is from "pgstat.h", and my only global variable
has a comment there.

> > > pgStatSessionDisconnected is not required as it can be determined if a
> > > session has been disconnected by looking at the force argument to
> > > pgstat_report_stat() [unless we would want to distinguish between
> > > client-initiated disconnects, which I am not sure why, as I have
> > > brought up above].
> > 
> > But wouldn't that mean that we count *every* end of a session as regular
> > disconnection, even if the backend was killed?
> 
> See my comment above about client-initiated and unnatural disconnects.

I decided to leave the functionality as it is; I think it is interesting
information to know if your clients disconnect cleanly or not.


Masahiro Ikeda wrote:
> I think adding the number of login failures is good for security.
> Although we can see the events in the log files, it is useful to have an
> overview of whether the database may be under attack or not.

I don't think login failures can be reasonably reported in
"pg_stat_database", since authentication happens before the session is
attached to a database.

What if somebody attempts to connect to a non-existing database?

I agree that this is interesting information, but I don't think it
belongs into this patch.

> By the way, could you rebase the patch since the latest patches
> failed to be applied to the master branch?

Yes, the patch has bit-rotted.

Attached is v3 with improvements.

Yours,
Laurenz Albe
From 0cc86e8a2bf3ffc76358c9022636502779c30910 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 13 Oct 2020 13:26:48 +0200
Subject: [PATCH] Add session statistics to pg_stat_database

If "track_counts" is active, track the following per database:
- number of connections
- number of sessions that were not disconnected regularly
- total time spent in database sessions
- total time spent executing queries
- total idle in transaction time

This is useful to check if connection pooling is working.
It also helps to estimate the size of the connection pool
required to keep the database busy, which depends on the
percentage of the transaction time that is spent idling.

Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.ca...@cybertec.at
Reviewed-By: Soumyadeep Chakraborty, Masahiro Ikeda
---
 doc/src/sgml/monitoring.sgml |  46 
 src/backend/catalog/system_views.sql |   5 ++
 src/backend/postmaster/pgstat.c  | 105 ++-
 src/backend/tcop/postgres.c  |   5 ++
 src/backend/utils/adt/pgstatfuncs.c  |  78 
 src/include/catalog/pg_proc.dat  |  20 +
 src/include/pgstat.h |  27 +++
 src/test/regress/expected/rules.out  |   5 ++
 8 files changed, 290 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 66566765f0..13ef586857 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3415,6 +3415,52 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
   
  
 
+ 
+  
+   sessi

Re: Add session statistics to pg_stat_database

2020-10-14 Thread Laurenz Albe
Thanks for the --- as always --- valuable review!

On Tue, 2020-10-13 at 17:55 -0500, Justin Pryzby wrote:
> On Tue, Oct 13, 2020 at 01:44:41PM +0200, Laurenz Albe wrote:
> > Attached is v3 with improvements.
> 
> +  
> +   Time spent in database sessions in this database, in milliseconds.
> +  
> 
> Should say "Total time spent *by* DB sessions..." ?

That is indeed better.  Fixed.

> I think these counters are only accurate as of the last state change, right?
> So a session which has been idle for 1hr, that 1hr is not included.  I think
> the documentation should explain that, or (ideally) the implementation would 
> be
> more precise.  Maybe the timestamps should only be updated after a session
> terminates (and the docs should say so).

I agree, and I have added an explanation that the value doesn't include
the duration of the current state.

Of course it would be nice to have totally accurate values, but I think
that the statistics are by nature inaccurate (datagrams can get lost),
and more frequent statistics updates increase the work load.
I don't think that is worth the effort.

> +  
> +   connections bigint
> +  
> +  
> +   Number of connections established to this database.
> 
> *Total* number of connections established, otherwise it sounds like it might
> mean "the number of sessions [currently] established".

Fixed like that.

> +   Number of database sessions to this database that did not end
> +   with a regular client disconnection.
> 
> Does that mean "sessions which ended irregularly" ?  Or does it also include
> "sessions which have not ended" ?

I have added an explanation for that.

> +   msg.m_aborted = (!disconnect || pgStatSessionDisconnected) ? 0 : 1;
> 
> I think this can be just:
> msg.m_aborted = (bool) (disconnect && !pgStatSessionDisconnected);

I mulled over this and finally decided to leave it as it is.

Since "m_aborted" gets added to the total counter, I'd prefer to
have it be an "int".

Your proposed code works (the cast is actually not necessary, right?).
But I think that my version is more readable if you think of
"m_aborted" as a counter rather than a flag.

> +   if ((dbentry = pgstat_fetch_stat_dbentry(dbid)) == NULL)
> +       result = 0;
> +   else
> +       result = ((double) dbentry->n_session_time) / 1000.0;
> 
> I think these can say:
> > double result = 0;
> > if ((dbentry=..) != NULL)
> >  result = (double) ..;
> 
> That not only uses fewer LOC, but also the assignment to zero is (known to be)
> done at compile time (BSS) rather than runtime.

I didn't know about the performance difference.
Concise code (if readable) is good, so I changed the code like you propose.

The code pattern is actually copied from neighboring functions,
which then should also be changed like this, but that is outside
the scope of this patch.

Attached is v4 of the patch.

Yours,
Laurenz Albe
From 9e8bf3efd984306c73243736d0b4a4023cdd5f3a Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 14 Oct 2020 11:08:20 +0200
Subject: [PATCH] Add session statistics to pg_stat_database

If "track_counts" is active, track the following per database:
- total number of connections
- number of sessions that ended other than with a client disconnect
- total time spent in database sessions
- total time spent executing queries
- total idle in transaction time

This is useful to check if connection pooling is working.
It also helps to estimate the size of the connection pool
required to keep the database busy, which depends on the
percentage of the transaction time that is spent idling.

Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.ca...@cybertec.at
Reviewed-By: Soumyadeep Chakraborty, Justin Pryzby, Masahiro Ikeda

(This requires a catversion bump, as well as an update to
PGSTAT_FILE_FORMAT_ID)
---
 doc/src/sgml/monitoring.sgml |  49 +
 src/backend/catalog/system_views.sql |   5 ++
 src/backend/postmaster/pgstat.c  | 105 ++-
 src/backend/tcop/postgres.c  |   5 ++
 src/backend/utils/adt/pgstatfuncs.c  |  68 +
 src/include/catalog/pg_proc.dat  |  20 +
 src/include/pgstat.h |  27 +++
 src/test/regress/expected/rules.out  |   5 ++
 8 files changed, 283 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 66566765f0..a50fc025d5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3663,6 +3663,55 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
   
  
 
+ 
+  
+   session_time double precision
+  
+  
+

Re: Disable WAL logging to speed up data loading

2020-10-28 Thread Laurenz Albe
On Wed, 2020-10-28 at 04:11 +, osumi.takami...@fujitsu.com wrote:
> I wrote and attached the first patch to disable WAL logging.
> This patch passes the regression test of check-world already
> and is formatted by pgindent.

Without reading the code, I have my doubts about that feature.

While it clearly will improve performance, it opens the door to
data loss.  People will use it to speed up their data loads and
then be unhappy if they cannot use their backups to recover from
a problem.

What happens if you try to do archive recovery across a time where
wal_level was "none"?  Will the recovery process fail, as it should,
or will you end up with data corruption?

We already have a performance-related footgun in the shape of
fsync = off.  Do we want to add another one?

Yours,
Laurenz Albe





Re: Disable WAL logging to speed up data loading

2020-10-28 Thread Laurenz Albe
On Wed, 2020-10-28 at 09:55 +, osumi.takami...@fujitsu.com wrote:
> > > I wrote and attached the first patch to disable WAL logging.
> > > This patch passes the regression test of check-world already and is
> > > formatted by pgindent.
> >
> > Without reading the code, I have my doubts about that feature.
> > While it clearly will improve performance, it opens the door to data loss.
> 
> Therefore, this feature must avoid that
> that kind of inconsistent server starts up again.
> This has been discussed in this thread already.
>
> > People will use it to speed up their data loads and then be unhappy if
> > they cannot use their backups to recover from a problem.
> > What happens if you try to do archive recovery across a time where
> > wal_level was "none"?  Will the recovery process fail, as it should,
> > or will you end up with data corruption?
> > We already have a performance-related footgun in the shape of fsync = off.
> > Do we want to add another one?
> 
> Further, in this thread, we discussed that this feature is intended to
> serve some specific opportunities, like when the DBA wants to load data
> as fast as possible and/or the operation itself is easily *repeatable*.
> So, before and after the change of wal_level, the DBA needs to take a
> full backup to prepare for an unexpected crash.
> 
> But anyway, I'll fix and enrich the documents. Thanks.

I read through the thread and the patch now.

The only safety I see is that startup after a crash is prevented.

But what if someone sets wal_level=none, performs some data modifications,
sets wal_level=archive, and after some more processing decides to restore
from a backup that was taken before the cluster was set to wal_level=none?
Then they would end up with a corrupted database, right?

I think the least this patch needs is that starting with wal_level=none emits
a WAL record that will make recovery fail.

I am aware that this is intended for "specific opportunities", but we still
should make it as hard as possible for the user to cause harm.  It may be that
MySQL, which inspired this feature, does not care about that, but I think we
should do better.

Another point that makes me worry is that this feature will unconditionally
break all replication, and there is not the least mention of that in the
documentation.

Yours,
Laurenz Albe





Re: Disable WAL logging to speed up data loading

2020-10-28 Thread Laurenz Albe
On Wed, 2020-10-28 at 14:05 +0100, I wrote:
> But what if someone sets wal_level=none, performs some data modifications,
> sets wal_level=archive, and after some more processing decides to restore
> from a backup that was taken before the cluster was set to wal_level=none?
> Then they would end up with a corrupted database, right?
> 
> I think the least this patch needs is that starting with wal_level=none emits
> a WAL record that will make recovery fail.

I just realized that changing "wal_level" will cause a WAL record anyway.
Besides, the situation is not much different from changing to "wal_level = minimal".
So as long as PostgreSQL refuses to start after a crash, we should be good.

Sorry for the noise, and I am beginning to think that this is actually
a useful feature.

Yours,
Laurenz Albe





Re: Disable WAL logging to speed up data loading

2020-10-29 Thread Laurenz Albe
On Thu, 2020-10-29 at 11:42 +0900, Fujii Masao wrote:
> > But what if someone sets wal_level=none, performs some data modifications,
> > sets wal_level=archive, and after some more processing decides to restore
> > from a backup that was taken before the cluster was set to wal_level=none?
> > Then they would end up with a corrupted database, right?
> 
> I think that the guard to prevent the server from starting up from
> the corrupted database in that scenario is necessary.

That wouldn't apply, I think, because the backup from which you start
was taken with wal_level = replica, so the guard wouldn't alert.

But as I said in the other thread, changing wal_level emits a WAL
record, and I am sure that recovery will refuse to proceed if
wal_level < replica.

> I'm still not sure if it's worth supporting this feature in core,
> because it can really easily cause users to corrupt the whole database.

You mean, if they take no backup before setting wal_level = none
and then crash the database, so that they are stuck with an
unrecoverable database?

Yes, that feels somewhat too fast and loose...
It would at least require some fat warnings in the documentation
and in postgresql.conf.

Yours,
Laurenz Albe





Re: Disable WAL logging to speed up data loading

2020-10-29 Thread Laurenz Albe
On Thu, 2020-10-29 at 00:07 +, osumi.takami...@fujitsu.com wrote:
> > Sorry for the noise, and I am beginning to think that this is actually
> > a useful feature.
> 
> No problem at all.
> Probably the name "none" was confusing for some developers?

No, I think that "none" is quite accurate.

The worry is that it makes it quite easy for people to end up with
a crashed database that cannot start - at the very least there should be
warnings in the documentation and postgresql.conf that are as dire as
for "fsync = off".

Yours,
Laurenz Albe





Re: Disable WAL logging to speed up data loading

2020-10-30 Thread Laurenz Albe
On Fri, 2020-10-30 at 05:00 +0900, Fujii Masao wrote:
> > But as I said in the other thread, changing wal_level emits a WAL
> > record, and I am sure that recovery will refuse to proceed if
> > wal_level < replica.
> 
> Yes. What I meant was such a safe guard needs to be implemented.

That should be easy; just modify this code:

static void
CheckRequiredParameterValues(void)
{
    /*
     * For archive recovery, the WAL must be generated with at least 'replica'
     * wal_level.
     */
    if (ArchiveRecoveryRequested &&
        ControlFile->wal_level == WAL_LEVEL_MINIMAL)
    {
        ereport(WARNING,
                (errmsg("WAL was generated with wal_level=minimal, data may be missing"),
                 errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));
    }

so that it tests

if (ArchiveRecoveryRequested && ControlFile->wal_level <= WAL_LEVEL_MINIMAL)

and we should be safe.
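
In diff form, that would be just the following (assuming the proposed new
level sorts below WAL_LEVEL_MINIMAL in the WalLevel enum):

-    if (ArchiveRecoveryRequested &&
-        ControlFile->wal_level == WAL_LEVEL_MINIMAL)
+    if (ArchiveRecoveryRequested &&
+        ControlFile->wal_level <= WAL_LEVEL_MINIMAL)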

Yours,
Laurenz Albe





Re: Collation versioning

2020-11-04 Thread Laurenz Albe
On Tue, 2020-11-03 at 23:14 +0100, Christoph Berg wrote:
> Re: Thomas Munro
> > for all I know, "en" variants might change
> > independently (I doubt it in practice, but in theory it's wrong).
> 
> Long before the glibc 2.28 incident, the same collation change
> had already happened twice, namely between RedHat 5 -> 6 -> 7, for
> de_DE.UTF-8 only. de_AT.UTF-8 and all other locales were unaffected.
> 
> At the time I didn't connect the dots to check if Debian was affected
> as well, but of course later testing revealed it was since it was a
> change in glibc.

Yes, this is a persistent pain; I had several customers suffering from
these issues too.

I wish
https://postgr.es/m/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com
had made it into core.

Yours,
Laurenz Albe





Re: Add session statistics to pg_stat_database

2020-11-11 Thread Laurenz Albe
On Tue, 2020-11-10 at 15:03 +, Georgios Kokolatos wrote:
> I noticed that the cfbot fails for this patch.
> 
> For this, I am setting the status to: 'Waiting on Author'.

Thanks for noticing, it was only the documentation build.

Version 5 attached, status changed back to "waiting for review".

Yours,
Laurenz Albe
From afc37856c12fd0a85587c638fca291a0b5652d9b Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 11 Nov 2020 20:14:28 +0100
Subject: [PATCH] Add session statistics to pg_stat_database

If "track_counts" is active, track the following per database:
- total number of connections
- number of sessions that ended other than with a client disconnect
- total time spent in database sessions
- total time spent executing queries
- total idle in transaction time

This is useful to check if connection pooling is working.
It also helps to estimate the size of the connection pool
required to keep the database busy, which depends on the
percentage of the transaction time that is spent idling.

Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.ca...@cybertec.at
Reviewed-By: Soumyadeep Chakraborty, Justin Pryzby, Masahiro Ikeda

(This requires a catversion bump, as well as an update to
PGSTAT_FILE_FORMAT_ID)
---
 doc/src/sgml/monitoring.sgml |  49 +
 src/backend/catalog/system_views.sql |   5 ++
 src/backend/postmaster/pgstat.c  | 105 ++-
 src/backend/tcop/postgres.c  |   5 ++
 src/backend/utils/adt/pgstatfuncs.c  |  68 +
 src/include/catalog/pg_proc.dat  |  20 +
 src/include/pgstat.h |  27 +++
 src/test/regress/expected/rules.out  |   5 ++
 8 files changed, 283 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 98e1995453..89610d1010 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3704,6 +3704,55 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
   
  
 
+ 
+  
+   session_time double precision
+  
+  
+   Time spent by database sessions in this database, in milliseconds
+   (note that statistics are only updated when the state of a session
+   changes, so if sessions have been idle for a long time, this idle time
+   won't be included)
+  
+ 
+
+ 
+  
+   active_time double precision
+  
+  
+   Time spent executing SQL statements in this database, in milliseconds
+  
+ 
+
+ 
+  
+   idle_in_transaction_time double precision
+  
+  
+   Time spent idling while in a transaction in this database, in milliseconds
+  
+ 
+
+ 
+  
+   connections bigint
+  
+  
+   Total number of connections established to this database
+  
+ 
+
+ 
+  
+   aborted_sessions bigint
+  
+  
+   Number of database sessions to this database that were terminated
+   by something else than a regular client disconnection
+  
+ 
+
  
   
stats_reset timestamp with time zone
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2e4aa1c4b6..998b4d542a 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -924,6 +924,11 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_checksum_last_failure(D.oid) AS checksum_last_failure,
 pg_stat_get_db_blk_read_time(D.oid) AS blk_read_time,
 pg_stat_get_db_blk_write_time(D.oid) AS blk_write_time,
+pg_stat_get_db_session_time(D.oid) AS session_time,
+pg_stat_get_db_active_time(D.oid) AS active_time,
+pg_stat_get_db_idle_in_transaction_time(D.oid) AS idle_in_transaction_time,
+pg_stat_get_db_connections(D.oid) AS connections,
+pg_stat_get_db_aborted_sessions(D.oid) AS aborted_sessions,
 pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset
 FROM (
 SELECT 0 AS oid, NULL::name AS datname
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index e76e627c6b..9978aab60a 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -249,6 +249,9 @@ static int	pgStatXactCommit = 0;
 static int	pgStatXactRollback = 0;
 PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
+static PgStat_Counter pgStatActiveTime = 0;
+static PgStat_Counter pgStatTransactionIdleTime = 0;
+bool pgStatSessionDisconnected = false;
 
 /* Record that's written to 2PC state file when pgstat state is persisted */
 typedef struct TwoPhasePgStatRecord
@@ -334,6 +337,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);
 static HTAB *pgstat_collect_oids(Oid catalogid, AttrNumber a

Re: Add session statistics to pg_stat_database

2020-11-12 Thread Laurenz Albe
I wrote:
> On Tue, 2020-11-10 at 15:03 +, Georgios Kokolatos wrote:
> > I noticed that the cfbot fails for this patch.
> > 
> > For this, I am setting the status to: 'Waiting on Author'.
> 
> Thanks for noticing, it was only the documentation build.
> 
> Version 5 attached, status changed back to "waiting for review".

The patch is still failing, so I looked again:

  make[3]: Entering directory '/home/travis/build/postgresql-cfbot/postgresql/doc/src/sgml'
  { \
    echo ""; \
    echo ""; \
  } > version.sgml
  '/usr/bin/perl' ./mk_feature_tables.pl YES ../../../src/backend/catalog/sql_feature_packages.txt ../../../src/backend/catalog/sql_features.txt > features-supported.sgml
  '/usr/bin/perl' ./mk_feature_tables.pl NO ../../../src/backend/catalog/sql_feature_packages.txt ../../../src/backend/catalog/sql_features.txt > features-unsupported.sgml
  '/usr/bin/perl' ./generate-errcodes-table.pl ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
  '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
  /usr/bin/bison -Wno-deprecated -d -o gram.c gram.y
  /usr/bin/xmllint --path . --noout --valid postgres.sgml
  error : Unknown IO error
  postgres.sgml:21: warning: failed to load external entity "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
  ]>
  ^
  postgres.sgml:23: element book: validity error : No declaration for attribute id of element book
  <book id="postgres">
  ^
  postgres.sgml:24: element title: validity error : No declaration for element title
   <title>PostgreSQL &version; Documentation</title>

I have the impression that this is not the fault of my patch; something
seems to be wrong with the cfbot.

I see that other patches are failing with the same error.

Yours,
Laurenz Albe





Re: Add session statistics to pg_stat_database

2020-11-17 Thread Laurenz Albe
On Fri, 2020-10-16 at 16:24 +0500, Ahsan Hadi wrote:
> I have applied the latest patch on master; all the regression test cases
> are passing, and the implemented functionality is also looking fine.
> The point that I raised about idle connections not being included has
> also been addressed.

If you think that the patch is ready to go, you could mark it as
"ready for committer" in the commitfest app.

Yours,
Laurenz Albe





Re: Disable WAL logging to speed up data loading

2020-11-18 Thread Laurenz Albe
On Thu, 2020-11-19 at 05:24 +, osumi.takami...@fujitsu.com wrote:
> > > > ereport(WARNING,
> > > >         (errmsg("WAL was generated with wal_level=minimal, data may be missing"),
> > > >          errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));
> > > > There's definitely a question about if a WARNING there is really
> > > > sufficient or not, considering that you could end up with 'logged'
> > > > tables on the replica that are missing data, but I'm not sure that
> > > > inventing a new, independent, mechanism for checking WAL level
> > > > changes makes sense.
> >
> > I don't know why WARNING was chosen.  I think it should be FATAL,
> > resulting in the standby shutting down, disabling restarting it, and
> > urging the user to rebuild the standby.  (I guess that's an overreaction,
> > because the user may not perform operations that lack WAL while
> > wal_level is minimal.)
> 
> Yeah, I agree that WARNING is not sufficient.

I missed that this is only a warning when I looked at it before.
Yes, it should be a fatal error.

I think that there should be two patches: one that turns this warning into
a FATAL error and should be backpatched.  If you change the test to

   ControlFile->wal_level <= WAL_LEVEL_MINIMAL

it will automatically work for your new feature too.

Then your new wal_level would be a second patch only for HEAD.

With that, the only remaining consideration with this patch is the danger
that enabling wal_level=none without taking a backup before can lead to
data loss.  But that is intended, so I think that an unmistakable warning
in the documentation would be good enough.
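
A sketch of what the backpatchable part could look like, reusing the
existing message wording:

    if (ArchiveRecoveryRequested &&
        ControlFile->wal_level <= WAL_LEVEL_MINIMAL)
        ereport(FATAL,
                (errmsg("WAL was generated with wal_level=minimal, data may be missing"),
                 errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));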

Yours,
Laurenz Albe





Re: passwordcheck: Log cracklib diagnostics

2020-08-25 Thread Laurenz Albe
On Tue, 2020-08-25 at 13:48 +0200, Daniel Gustafsson wrote:
> > On 25 Aug 2020, at 12:20, Peter Eisentraut 
> >  wrote:
> > 
> > A user tried to use the cracklib build-time option of the passwordcheck 
> > module.  This failed, as it turned out because there was no dictionary 
> > installed in the right place, but the error was not
> > properly reported, because the existing code just throws away the error 
> > message from cracklib.  Attached is a patch that changes this by logging 
> > any error message returned from the cracklib call.
> 
> +1 on this, it's also in line with the example documentation from cracklib.
> The returned error is potentially a bit misleading now, as it might claim
> that a strong password is easily cracked if the dictionary fails to load.
> Given that there is no way to distinguish between the classes of returned
> errors, it's hard to see how we can do better though.
> 
> While poking at this, we might as well update the docs to point to the right
> URL for CrackLib as it moved from Sourceforge five years ago.  The attached
> diff fixes that.

+1 on both patches.

Yours,
Laurenz Albe





Re: Change a constraint's index - ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...

2020-09-04 Thread Laurenz Albe
On Mon, 2020-08-10 at 09:29 +0300, Anna Akenteva wrote:
> On 2020-07-07 01:08, Tom Lane wrote:
> 
> > Alvaro Herrera  writes:
> > > On 2020-Jul-05, Anna Akenteva wrote:
> > > > -- Swapping primary key's index for an equivalent index,
> > > > -- but with INCLUDE-d attributes.
> > > > CREATE UNIQUE INDEX new_idx ON target_tbl (id) INCLUDE (info);
> > > > ALTER TABLE target_tbl ALTER CONSTRAINT target_tbl_pkey
> > > >   USING INDEX new_idx;
> > > > ALTER TABLE referencing_tbl ALTER CONSTRAINT referencing_tbl_id_ref_fkey
> > > >   USING INDEX new_idx;
> > > How is this state represented by pg_dump?
> > Even if it's possible to represent, I think we should flat out reject
> > this "feature".  Primary keys that aren't primary keys don't seem like
> > a good idea.  For one thing, it won't be possible to describe the
> > constraint accurately in the information_schema.
> 
> 
> Do you think it could still be a good idea if we only swap the 
> relfilenodes of indexes, as it was suggested in [1]? The original use 
> case was getting rid of index bloat, which is now solved by REINDEX 
> CONCURRENTLY, but this feature still has its own use case of adding 
> INCLUDE-d columns to constraint indexes.

How can you just swap the filenodes if "indnatts" and "indkey" are
different, since one index has an INCLUDE clause?

I think that the original proposal is better, except that foreign key
dependencies should be changed along with the primary or unique index,
so that everything is consistent once the command is done.

Then the ALTER CONSTRAINT form that replaces the index referenced
by a foreign key becomes unnecessary and should be removed.

The value I see in this is:
- replacing a primary key index
- replacing the index behind a constraint targeted by a foreign key

Some code comments:

+   
+ALTER CONSTRAINT constraint_name [USING INDEX 

Re: Change a constraint's index - ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...

2020-09-04 Thread Laurenz Albe
On Fri, 2020-09-04 at 10:41 -0400, Alvaro Herrera wrote:
> > The value I see in this is:
> > - replacing a primary key index
> > - replacing the index behind a constraint targeted by a foreign key
> 
> But why is this better than using REINDEX CONCURRENTLY?

It is not better, but it can be used to replace a constraint index
with an index with a different INCLUDE clause, which is something
that cannot easily be done otherwise.

For exclusion constraints it is pretty useless, and for unique
constraints it can be worked around with CREATE UNIQUE INDEX CONCURRENTLY.

Admitted, the use case is pretty narrow, and I am not sure if it is
worth adding code and SQL syntax for that.
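
The workaround I mean, for the case where no foreign key references the
constraint (a sketch; table and constraint names are made up):

  CREATE UNIQUE INDEX CONCURRENTLY t_key_new ON t (id) INCLUDE (val);
  BEGIN;
  ALTER TABLE t DROP CONSTRAINT t_id_key;
  ALTER TABLE t ADD CONSTRAINT t_id_key UNIQUE USING INDEX t_key_new;
  -- note that the index is renamed to the constraint name
  COMMIT;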

Yours,
Laurenz Albe





Re: Add session statistics to pg_stat_database

2020-09-04 Thread Laurenz Albe
On Tue, 2020-08-11 at 13:53 +0200, I wrote:
> On Thu, 2020-07-23 at 18:16 +0500, Ahsan Hadi wrote:
> 
> > On Wed, Jul 8, 2020 at 4:17 PM Laurenz Albe  
> > wrote:
> > > Here is a patch that adds the following to pg_stat_database:
> > > - number of connections
> >
> > Is it expected behaviour to not count idle connections? The connection is 
> > included after it is aborted but not while it was idle.
> 
> Currently, the patch counts connections when they close.
> 
> I could change the behavior that they are counted immediately.

I have changed the code so that connections are counted immediately.

Attached is a new version.

Yours,
Laurenz Albe
From 6d9bfbd682a9f4723f030fdc461f731175f55f44 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 4 Sep 2020 17:30:24 +0200
Subject: [PATCH] Add session statistics to pg_stat_database

If "track_counts" is active, track the following per database:
- number of connections
- number of sessions that were not disconnected regularly
- total time spent in database sessions
- total time spent executing queries
- total idle in transaction time

This is useful to check if connection pooling is working.
It also helps to estimate the size of the connection pool
required to keep the database busy, which depends on the
percentage of the transaction time that is spent idling.
---
 doc/src/sgml/monitoring.sgml |  46 +
 src/backend/catalog/system_views.sql |   5 +
 src/backend/postmaster/pgstat.c  | 146 ++-
 src/backend/tcop/postgres.c  |   5 +
 src/backend/utils/adt/pgstatfuncs.c  |  78 ++
 src/include/catalog/pg_proc.dat  |  20 
 src/include/pgstat.h |  29 +-
 src/test/regress/expected/rules.out  |   5 +
 8 files changed, 332 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 673a0e73e4..aa5e22d213 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3514,6 +3514,52 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
   
  
 
+ 
+  
+   session_time double precision
+  
+  
+   Time spent in database sessions in this database, in milliseconds.
+  
+ 
+
+ 
+  
+   active_time double precision
+  
+  
+   Time spent executing SQL statements in this database, in milliseconds.
+  
+ 
+
+ 
+  
+   idle_in_transaction_time double precision
+  
+  
+   Time spent idling while in a transaction in this database, in milliseconds.
+  
+ 
+
+ 
+  
+   connections bigint
+  
+  
+   Number of connections established to this database.
+  
+ 
+
+ 
+  
+   aborted_sessions bigint
+  
+  
+   Number of database sessions to this database that did not end
+   with a regular client disconnection.
+  
+ 
+
  
   
stats_reset timestamp with time zone
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ed4f3f142d..d8b28c7600 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -912,6 +912,11 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_checksum_last_failure(D.oid) AS checksum_last_failure,
 pg_stat_get_db_blk_read_time(D.oid) AS blk_read_time,
 pg_stat_get_db_blk_write_time(D.oid) AS blk_write_time,
+pg_stat_get_db_session_time(D.oid) AS session_time,
+pg_stat_get_db_active_time(D.oid) AS active_time,
+pg_stat_get_db_idle_in_transaction_time(D.oid) AS idle_in_transaction_time,
+pg_stat_get_db_connections(D.oid) AS connections,
+pg_stat_get_db_aborted_sessions(D.oid) AS aborted_sessions,
 pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset
 FROM (
 SELECT 0 AS oid, NULL::name AS datname
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 5f4b168fd1..12a7543554 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -247,6 +247,12 @@ static int	pgStatXactCommit = 0;
 static int	pgStatXactRollback = 0;
 PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
+static TimestampTz pgStatActiveStart = DT_NOBEGIN;
+static PgStat_Counter pgStatActiveTime = 0;
+static TimestampTz pgStatTransactionIdleStart = DT_NOBEGIN;
+static PgStat_Counter pgStatTransactionIdleTime = 0;
+static bool pgStatSessionReported = false;
+bool pgStatSessionDisconnected = false;
 
 /* Record that's written to 2PC state file when pgstat state is persisted */
 typedef struct TwoPhasePgStatRecord
@@ -326,6 +332,7 @@ static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static void pgstat_send_slru(void);

Re: Change a constraint's index - ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...

2020-09-07 Thread Laurenz Albe
On Fri, 2020-09-04 at 13:31 -0400, Alvaro Herrera wrote:
> On 2020-Sep-04, Laurenz Albe wrote:
> > On Fri, 2020-09-04 at 10:41 -0400, Alvaro Herrera wrote:
> > > > The value I see in this is:
> > > > - replacing a primary key index
> > > > - replacing the index behind a constraint targeted by a foreign key
> > > But why is this better than using REINDEX CONCURRENTLY?
> > It is not better, but it can be used to replace a constraint index
> > with an index with a different INCLUDE clause, which is something
> > that cannot easily be done otherwise.
> 
> 
> I can see that there is value in having an index that serves both a
> uniqueness constraint and coverage purposes.  But this seems a pretty
> roundabout way to get that -- I think you should have to do "CREATE
> UNIQUE INDEX ... INCLUDING ..." instead.  That way, the fact that this
> is a Postgres extension remains clear.
> 
> 55432 14devel 24138=# create table foo (a int not null, b int not null, c int);
> CREATE TABLE
> Duration: 1,775 ms
> 55432 14devel 24138=# create unique index on foo (a, b) include (c);
> CREATE INDEX
> Duration: 1,481 ms
> 55432 14devel 24138=# create table bar (a int not null, b int not null, foreign key (a, b) references foo (a, b));
> CREATE TABLE
> Duration: 2,559 ms
> 
> Now you have a normal index that you can reindex in the normal way, if you
> need it.

Yes, that is true.

But what if you have done

  CREATE TABLE a (id bigint CONSTRAINT a_pkey PRIMARY KEY, val integer);
  CREATE TABLE b (id bigint CONSTRAINT b_fkey REFERENCES a);

and later you figure out that it would actually be better to have
an index ON a (id) INCLUDE (val), and you don't want to maintain
two indexes.

Yes, you could do

  CREATE UNIQUE INDEX CONCURRENTLY ind ON a (id) INCLUDE (val);
  ALTER TABLE a ADD UNIQUE USING INDEX ind;
  ALTER TABLE a DROP CONSTRAINT a_pkey CASCADE;
  ALTER TABLE b ADD FOREIGN KEY (id) REFERENCES a(id);

but then you don't have a primary key, and you have to live without
the foreign key for a while.

Adding a primary key to a large table is very painful, because it
locks the table exclusively for a long time.


This patch would provide a more convenient way to do that.

Again, I am not sure if that justifies the effort.

Yours,
Laurenz Albe





Re: Change a constraint's index - ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...

2020-09-08 Thread Laurenz Albe
On Mon, 2020-09-07 at 11:42 -0300, Alvaro Herrera wrote:
> > This patch would provide a more convenient way to do that.
> > Again, I am not sure if that justifies the effort.
> 
> I have to admit I've seen cases where it'd be useful to have included
> columns in primary keys.
> 
> TBH I think if we really wanted the feature of primary keys with
> included columns, we'd have to add it to the PRIMARY KEY syntax rather
> than having an ad-hoc ALTER TABLE ALTER CONSTRAINT USING INDEX command
> to replace the index underneath.  Then things like pg_dump would work
> normally.
> 
> (I have an answer for the information_schema question Tom posed; I'd
> like to know what's yours.)

Gah, now I see my mistake.  I was under the impression that a
primary key can have an INCLUDE clause today, which is not true.

So this would introduce that feature in a weird way.
I agree that that is undesirable.

We should at least have

  ALTER TABLE ... ADD PRIMARY KEY (id) INCLUDE (val);

or something before we consider this patch.

As to the information_schema, that could pretend that the INCLUDE
columns just don't exist.

Yours,
Laurenz Albe





Re: Add session statistics to pg_stat_database

2020-09-29 Thread Laurenz Albe
On Thu, 2020-09-24 at 14:38 -0700, Soumyadeep Chakraborty wrote:
> Thanks for submitting this! Please find my feedback below.

Thanks for the thorough review.

Before I update the patch, I have a few comments and questions.

> * Are we trying to capture ONLY client initiated disconnects in
> m_aborted (we are not handling other disconnects by not accounting for
> EOF..like if psql was killed)? If yes, why?

I thought it was interesting to know how many database sessions are
ended regularly as opposed to ones that get killed or end by unexpected
client death.

> * pgstat_send_connstats(): How about renaming the "force" argument to
> "disconnected"?

Yes, that might be better.  I'll do that.

> *
> > static TimestampTz pgStatActiveStart = DT_NOBEGIN;
> > static PgStat_Counter pgStatActiveTime = 0;
> > static TimestampTz pgStatTransactionIdleStart = DT_NOBEGIN;
> > static PgStat_Counter pgStatTransactionIdleTime = 0;
> > static bool pgStatSessionReported = false;
> > bool pgStatSessionDisconnected = false;
> 
> I think we can house all of these globals inside PgBackendStatus and can
> follow the protocol for reading/writing fields in PgBackendStatus.
> Refer: PGSTAT_{BEGIN|END}_WRITE_ACTIVITY

Are you sure that is the right way to go?

Correct me if I am wrong, but isn't PgBackendStatus for relevant status
information that other processes can access?
I'd assume that it is not the correct place to store backend-private data
that are not relevant to others.
Besides, if data is written to this structure more often, readers would
have to deal with more contention, which could affect performance.

But I agree with the following:

> Also, some of these fields are not required:
> 
> I don't think we need pgStatActiveStart and pgStatTransactionIdleStart -
> instead of these two we could use
> PgBackendStatus.st_state_start_timestamp which marks the beginning TS of
> the backend's current state (st_state). We can look at that field along
> with the current and to-be-transitioned-to states inside
> pgstat_report_activity() when there is a transition away from
> STATE_RUNNING, STATE_IDLEINTRANSACTION or
> STATE_IDLEINTRANSACTION_ABORTED, in order to update pgStatActiveTime and
> pgStatTransactionIdleTime. We would also need to update those counters
> on disconnect/PGSTAT_STAT_INTERVAL timeout if the backend's current
> state was STATE_RUNNING, STATE_IDLEINTRANSACTION or
> STATE_IDLEINTRANSACTION_ABORTED (in pgstat_send_connstats())

Yes, that would be better.

> pgStatSessionDisconnected is not required as it can be determined if a
> session has been disconnected by looking at the force argument to
> pgstat_report_stat() [unless we would want to distinguish between
> client-initiated disconnects, which I am not sure why, as I have
> brought up above].

But wouldn't that mean that we count *every* end of a session as regular
disconnection, even if the backend was killed?

I personally would want all my database connections to be closed by
the client, unless something unexpected happens.

> pgStatSessionReported is not required. We can glean this information by
> checking if the function local static last_report in
> pgstat_report_stat() is 0 and passing this on as another param
> "first_report" to pgstat_send_connstats().

Yes, that is better.

> * PGSTAT_FILE_FORMAT_ID needs to be updated when a stats collector data
> structure changes and we had a change in PgStat_StatDBEntry.

I think that should be left to the committer.

> * We can directly use PgBackendStatus.st_proc_start_timestamp for
> calculating m_session_time. We can also choose to report session uptime
> even when the report is for the not-disconnect case
> (PGSTAT_STAT_INTERVAL elapsed). No reason why not. Then we would need to
> pass in the value of last_report to pgstat_send_connstats() -> calculate
> m_session_time to be number of time units from
> PgBackendStatus.st_proc_start_timestamp for the first report and then
> number of time units from the last_report for all subsequent reports.

Yes, that would make for better statistics, since client connections
can last quite long.

> * We would need to bump the catalog version since we have made
> changes to system views. Refer: #define CATALOG_VERSION_NO

Again, I think that's up to the committer.

Thanks again!

Yours,
Laurenz Albe





Re: Change a constraint's index - ALTER TABLE ... ALTER CONSTRAINT ... USING INDEX ...

2020-09-30 Thread Laurenz Albe
On Wed, 2020-09-30 at 16:27 +0900, Michael Paquier wrote:
> On Fri, Sep 04, 2020 at 01:59:47PM +0200, Laurenz Albe wrote:
> > I'll set the commitfest entry to "waiting for author".
> 
> This review, as well as any of the follow-up emails, have not been
> answered by the author, so I have marked the patch as returned with
> feedback.

I had the impression that "rejected" would be more appropriate.

It doesn't make sense to introduce a feature whose only remaining
use case is modifying a constraint index in a way that is currently
not supported.

Yours,
Laurenz Albe





Re: [PATCH] Add reloption for views to enable RLS

2022-03-14 Thread Laurenz Albe
On Mon, 2022-03-14 at 13:40 +0100, Christoph Heiss wrote:
> On 3/9/22 16:06, Laurenz Albe wrote:
> > This paragraph contains a couple of grammatical errors.
>
> Replaced the two paragraphs with your suggestion, it is indeed easier to 
> read.
> 
> > Also, this:
> > could be written like this (introducing a new variable):
> > 
> >    if (rule->event == CMD_SELECT
> >    && relation->rd_rel->relkind == RELKIND_VIEW
> >    && RelationHasSecurityInvoker(relation))
> >    user_for_check = InvalidOid;
> >    else
> >    user_for_check = relation->rd_rel->relowner;
> > 
> >    setRuleCheckAsUser((Node *) rule->actions, user_for_check);
> >    setRuleCheckAsUser(rule->qual, user_for_check);
> > 
> > This might be easier to read.
> 
> Makes sense, I've changed that. This also seems to be more in line with 
> all the other code.
> While at it I also split the comment alongside it to match, hopefully 
> that makes sense.

The patch is fine from my point of view.

It passes "make check-world".

I'll mark it as "ready for committer".
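
For context, the reloption under discussion is used like this (a sketch
with made-up names):

  CREATE VIEW v WITH (security_invoker = true) AS SELECT * FROM t;

With that option, row-level security on "t" is checked as the invoking
user rather than as the view owner.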

Yours,
Laurenz Albe





Re: Can we consider "24 Hours" for "next day" in INTERVAL datatype ?

2022-03-15 Thread Laurenz Albe
On Tue, 2022-03-15 at 12:54 +0530, Prabhat Sahu wrote:
> Kindly check the below scenario with INTERVAL datatype.
> 
> postgres=# select interval '01 20:59:59' + interval '00 05:00:01' as interval;
>     interval    
> ----------------
>  1 day 26:00:00
> (1 row)
> 
> In any operation with INTERVAL data, we are changing the interval values:
> "60 sec" becomes "next minute",
> "60 min" becomes "next hour".
> Similarly, can't we consider "24 hours" to be the "next day"?
> Is there any specific purpose for holding the hours as an increasing
> number beyond 24 hours?
> 
> But when we are dealing with TIMESTAMP plus INTERVAL values, "24 hours"
> is considered to be the "next day".
> 
> postgres=# select timestamp '01-MAR-22 20:59:59' + interval '00 05:00:01' as interval;
>       interval       
> ---------------------
>  2022-03-02 02:00:00
> (1 row)

The case is different with days: an interval of "1 day" preserves the
wall-clock time across a daylight saving time switch, while "24 hours"
always means 24 elapsed hours:

test=> SELECT TIMESTAMPTZ '2022-03-26 20:00:00 Europe/Vienna' + INTERVAL '12 hours' + INTERVAL '12 hours';
        ?column?        
════════════════════════
 2022-03-27 21:00:00+02
(1 row)

test=> SELECT TIMESTAMPTZ '2022-03-26 20:00:00 Europe/Vienna' + INTERVAL '1 day';
        ?column?        
════════════════════════
 2022-03-27 20:00:00+02
(1 row)

Yours,
Laurenz Albe





Re: [PATCH] Add reloption for views to enable RLS

2022-03-21 Thread Laurenz Albe
On Sat, 2022-03-19 at 01:10 +0000, Dean Rasheed wrote:
> I have been hacking on it a bit, and attached is an updated version.
> Aside from some general copy editing, the most notable changes are:
> [...]

Thanks for your diligent work on this, and the patch looks good to me.
It is good that you found the oversight in LOCK - I wasn't even
aware that views could be locked.

Yours,
Laurenz Albe





Re: [PATCH] Add reloption for views to enable RLS

2022-03-21 Thread Laurenz Albe
On Mon, 2022-03-21 at 18:09 +0800, Japin Li wrote:
> After apply the patch, I found pg_checksums.c also has the similar code.
> 
> In progress_report(), I'm not sure we can do this replace for this code.
> 
>     snprintf(total_size_str, sizeof(total_size_str), INT64_FORMAT,
>  total_size / (1024 * 1024));
>     snprintf(current_size_str, sizeof(current_size_str), INT64_FORMAT,
>  current_size / (1024 * 1024));
> 
>     fprintf(stderr, _("%*s/%s MB (%d%%) computed"),
>     (int) strlen(current_size_str), current_size_str, total_size_str,
>     percent);

I think you replied to the wrong thread...

Yours,
Laurenz Albe





Re: [PoC] Let libpq reject unexpected authentication requests

2022-03-23 Thread Laurenz Albe
On Wed, 2022-03-23 at 21:31 +0000, Jacob Champion wrote:
> On Mon, 2022-03-07 at 11:44 +0100, Laurenz Albe wrote:
> > I am all for the idea, but you implemented the reverse of proposal 2.
> >
> > Wouldn't it be better to list the *rejected* authentication methods?
> > Then we could have "password" on there by default.
> 
> Specifying the allowed list rather than the denied list tends to have
> better security properties.
> 
> In the case I'm pursuing (the attack vector from the CVE), the end user
> expects certificates to be used. Any other authentication method --
> plaintext, hashed, SCRAM, Kerberos -- is unacceptable;

That makes sense.

> But that doesn't help your case; you want to choose a good default, and
> I agree that's important. Since there are arguments already for
> accepting a OR in the list, and -- if we couldn't find a good
> orthogonal method for certs, like Tom suggested -- an AND, maybe it
> wouldn't be so bad to accept a NOT as well?
> 
>     require_auth=cert    # certs only
>     require_auth=cert+scram-sha-256  # SCRAM wrapped by certs
>     require_auth=cert,scram-sha-256  # SCRAM or certs (or both)
>     require_auth=!password   # anything but plaintext
>     require_auth=!password,!md5  # no plaintext or MD5

Great, if there is a !something syntax, then I have nothing left to wish for.
It may not be the most secure way to do it, but it sure is convenient.
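
For example, a client that insists on SCRAM could then use something like
this (a sketch using the syntax from this proposal):

  psql "host=example.com dbname=test require_auth=scram-sha-256"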

Yours,
Laurenz Albe





Re: pg_stat_reset_single_*_counters vs pg_stat_database.stats_reset

2022-03-23 Thread Laurenz Albe
On Wed, 2022-03-23 at 17:55 -0700, Andres Freund wrote:
> Starting with the below commit, pg_stat_reset_single_function_counters,
> pg_stat_reset_single_table_counters don't just reset the stats for the
> individual function, but also set pg_stat_database.stats_reset.

I see the point in the fine-grained reset, but I am -1 on having that
reset "pg_stat_database.stats_reset".  That would make the timestamp
mostly useless.

One could argue that resetting a single counter and *not* resetting
"pg_stat_database.stats_reset" would also be a lie, but at least it is
a smaller lie.
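
To illustrate what is at stake (a sketch; "mytab" is a made-up name):

  SELECT pg_stat_reset_single_table_counters('mytab'::regclass);
  SELECT datname, stats_reset FROM pg_stat_database
  WHERE datname = current_database();

With the commit in question, the second query would show a fresh timestamp
although only the counters of a single table were reset.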

Yours,
Laurenz Albe





Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-04-06 Thread Laurenz Albe
On Tue, 2022-04-05 at 13:06 -0700, Nathan Bossart wrote:
> On Tue, Apr 05, 2022 at 11:25:36AM -0400, Stephen Frost wrote:
> > Please find attached an updated patch + commit message.  Mostly, I just
> > went through and did a bit more in terms of updating the documentation
> > and improving the comments (there were some places that were still
> > worrying about the chance of a 'stray' backup_label file existing, which
> > isn't possible any longer), along with some additional testing and
> > review.  This is looking pretty good to me, but other thoughts are
> > certainly welcome.  Otherwise, I'm hoping to commit this tomorrow.
> 
> LGTM!

Cassandra (not the software) from the sidelines predicts that we will
get some fire from users for this, although I concede the theoretical
sanity of the change.

Yours,
Laurenz Albe





Re: Should pg_dumpall dump ALTER SYSTEM settings?

2022-04-07 Thread Laurenz Albe
On Wed, 2022-04-06 at 21:39 -0400, Robert Haas wrote:
> On Wed, Apr 6, 2022 at 2:26 PM Tom Lane  wrote:
> > Thoughts?
> 
> I'm a little bit skeptical about this proposal, mostly because it
> seems like it has the end result that values that are configured in
> postgresql.conf and postgresql.auto.conf end up being handled
> differently: one file has to be copied by hand, while the other file's
> contents are propagated forward to the new version by pg_dump. I don't
> think that's what people are going to be expecting...

"postgresql.auto.conf" is an implementation detail, and I would expect
most users to distinguish between "parameters set in postgresql.conf"
and "parameters set via the SQL statement ALTER SYSTEM".
If that is the way you look at things, then it seems natural for the
latter to be included in a dump, but not the former.

As another case in point, the Ubuntu/Debian packages split up the data
directory so that the config files are under /etc, while the rest of
the data directory is under /var/lib.  "postgresql.auto.conf" is *not*
in /etc, but in /var/lib there.  So a user of these distributions would
naturally think that the config files in /etc need to be handled manually,
but "postgresql.auto.conf" need not.

I am +1 on Tom's idea.

Yours,
Laurenz Albe





Re: Proposal: %T Prompt parameter for psql for current time (like Oracle has)

2023-03-02 Thread Laurenz Albe
On Wed, 2023-03-01 at 11:13 -0500, Kirk Wolak wrote:
> Thanks, corrected, and confirmed Unix line endings.

The patch builds fine and works as intended.

I leave it to the committers to decide whether the patch is worth the
effort or not, given that you can get a similar effect with %`date`.
It adds some value by being simpler and uniform across all platforms.
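
For comparison (a sketch; the first line is today's workaround, %T is the
parameter added by the patch):

  \set PROMPT1 '%`date +%H:%M:%S` %/%R%# '
  \set PROMPT1 '%T %/%R%# '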

I'll mark the patch as "ready for committer".

Yours,
Laurenz Albe




Re: Allow tailoring of ICU locales with custom rules

2023-03-02 Thread Laurenz Albe
On Wed, 2023-02-22 at 18:35 +0100, Peter Eisentraut wrote:
> > - there doesn't seem to be a way to add rules to template1.
> > If someone wants to have icu rules and initial contents to their new
> > databases, I think they need to create a custom template database
> > (from template0) for that purpose, in addition to template1.
> >   From a usability standpoint, this is a bit cumbersome, as it's
> > normally the role of template1.
> > To improve on that, shouldn't initdb be able to create template0 with
> > rules too?
> 
> Right, that would be an initdb option.  Is that too many initdb options 
> then?  It would be easy to add, if we think it's worth it.

An alternative would be to document that you can drop "template1" and
create it again using the ICU collation rules you need.

But I'd prefer an "initdb" option.

Yours,
Laurenz Albe




Re: Allow tailoring of ICU locales with custom rules

2023-03-08 Thread Laurenz Albe
On Tue, 2023-03-07 at 22:06 -0800, Jeff Davis wrote:
> On Fri, 2023-03-03 at 13:45 +0100, Peter Eisentraut wrote:
> > You can mess with people by setting up your databases like this:
> > 
> > initdb -D data --locale-provider=icu --icu-rules='&a < c < b < e < d'
> > 
> > ;-)
> 
> Would we be the first major database to support custom collation rules?
> This sounds useful for testing, experimentation, hacking, etc.
> 
> What are some of the use cases? Is it helpful to comply with unusual or
> outdated standards or formats? Maybe there are people using special
> delimiters/terminators and they need them to be treated a certain way
> during comparisons?

I regularly see complaints about the sort order; recently this one:
https://postgr.es/m/cafcrh--xt-j8awoavhb216kom6tqnap35ttveqqs5bhh7gm...@mail.gmail.com

So being able to influence the sort order is useful.
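
With this patch, such tailoring can also be confined to a single collation
(a sketch; names made up):

  CREATE COLLATION custom (provider = icu, locale = 'und',
                           rules = '&a < c < b');
  SELECT x FROM (VALUES ('a'), ('b'), ('c')) AS t(x)
  ORDER BY x COLLATE custom;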

Yours,
Laurenz Albe




Re: Allow tailoring of ICU locales with custom rules

2023-03-08 Thread Laurenz Albe
On Fri, 2023-03-03 at 13:45 +0100, Peter Eisentraut wrote:
> Ok, here is a version with an initdb option and also a createdb option 
> (to expose the CREATE DATABASE option).
> 
> You can mess with people by setting up your databases like this:
> 
> initdb -D data --locale-provider=icu --icu-rules='&a < c < b < e < d'

Looks good.  I cannot get it to misbehave, "make check-world" is successful
(the regression tests misbehave in interesting ways when running
"make installcheck" on a cluster created with non-standard ICU rules, but
that can be expected).

I checked the documentation, tested "pg_dump" support, everything fine.

I'll mark it as "ready for committer".

Yours,
Laurenz Albe




Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-03-22 Thread Laurenz Albe
Thanks for the review!

On Tue, 2023-03-21 at 16:32 +0100, Christoph Berg wrote:
> I have some comments:
> 
> > This allows EXPLAIN to generate generic plans for parameterized statements
> > (that have parameter placeholders like $1 in the statement text).
> 
> > +   
> > +    GENERIC_PLAN
> > +    
> > + 
> > +  Generate a generic plan for the statement (see  > linkend="sql-prepare"/>
> > +  for details about generic plans).  The statement can contain 
> > parameter
> > +  placeholders like $1 (but then it has to be a 
> > statement
> > +  that supports parameters).  This option cannot be used together with
> > +  ANALYZE, since a statement with unknown parameters
> > +  cannot be executed.
> 
> Like in the commit message quoted above, I would put more emphasis on
> "parameterized query" here:
> 
>   Allow the statement to contain parameter placeholders like
>   $1 and generate a generic plan for it.
>   This option cannot be used together with ANALYZE.

I went with

   Allow the statement to contain parameter placeholders like
   $1 and generate a generic plan for it.
   See  for details about generic plans
   and the statements that support parameters.
   This option cannot be used together with ANALYZE.

> > +   /* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
> > +   if (es->generic && es->analyze)
> > +   ereport(ERROR,
> > +   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +    errmsg("EXPLAIN ANALYZE cannot be used 
> > with GENERIC_PLAN")));
> 
> To put that in line with the other error messages in that context, I'd
> inject an extra "option":
> 
>   errmsg("EXPLAIN option ANALYZE cannot be used with GENERIC_PLAN")));

Done.

> > --- a/src/test/regress/sql/explain.sql
> > +++ b/src/test/regress/sql/explain.sql
> > [...]
> > +create extension if not exists postgres_fdw;
> 
> "create extension postgres_fdw" cannot be used from src/test/regress/
> since contrib/ might not have been built.

Ouch.  Good catch.

> I suggest leaving this test in place here, but with local tables (to
> show that plan time pruning using the one provided parameter works),
> and add a comment here explaining that is being tested:
> 
> -- create a partition hierarchy to show that plan time pruning removes
> -- the key1=2 table but generates a generic plan for key2=$1

I did that, with a different comment.

> The test involving postgres_fdw is still necessary to exercise the new
> EXEC_FLAG_EXPLAIN_GENERIC code path, but needs to be moved elsewhere,
> probably src/test/modules/.

Tests for postgres_fdw are in contrib/postgres_fdw/sql/postgres_fdw.sql,
so I added the test there.

Version 9 of the patch is attached.

Yours,
Laurenz Albe
From 85aa88280069ca2befe7f4308d7e6f724cdb143a Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 22 Mar 2023 14:08:49 +0100
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized statements
(that have parameter placeholders like $1 in the statement text).
Invent a new executor flag EXEC_FLAG_EXPLAIN_GENERIC that disables runtime
partition pruning for such plans: that would fail if the non-existing
parameters are involved, and we want to show all subplans anyway.

Author: Laurenz Albe
Reviewed-by: Julien Rouhaud, Christoph Berg, Michel Pelletier, Jim Jones
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 .../postgres_fdw/expected/postgres_fdw.out| 30 +
 contrib/postgres_fdw/sql/postgres_fdw.sql | 25 +++
 doc/src/sgml/ref/explain.sgml | 14 +++
 src/backend/commands/explain.c| 11 +
 src/backend/executor/execMain.c   |  3 ++
 src/backend/executor/execPartition.c  |  9 ++--
 src/backend/parser/analyze.c  | 29 +
 src/include/commands/explain.h|  1 +
 src/include/executor/executor.h   | 18 +---
 src/test/regress/expected/explain.out | 42 +++
 src/test/regress/sql/explain.sql  | 24 +++
 11 files changed, 197 insertions(+), 9 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 04a3ef450c..25b91ab2e1 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11783,3 +11783,33 @@ ANALYZE analyze_table;
 -- cleanup
 DROP FOREIGN TABLE analyze_ftable;
 DROP TABLE analyze_table;
+-- ===
+-- test EXPLAIN (GENERIC_PLAN) with foreign partitions

Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-03-23 Thread Laurenz Albe
And here is v10, which includes tab completion for the new option.

Yours,
Laurenz Albe
From dfe6d36d79c74fba7bf70b990fdada166d012ff4 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Thu, 23 Mar 2023 19:28:49 +0100
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized statements
(that have parameter placeholders like $1 in the statement text).
Invent a new executor flag EXEC_FLAG_EXPLAIN_GENERIC that disables runtime
partition pruning for such plans: that would fail if the non-existing
parameters are involved, and we want to show all subplans anyway.

Author: Laurenz Albe
Reviewed-by: Julien Rouhaud, Christoph Berg, Michel Pelletier, Jim Jones
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 .../postgres_fdw/expected/postgres_fdw.out| 30 +
 contrib/postgres_fdw/sql/postgres_fdw.sql | 25 +++
 doc/src/sgml/ref/explain.sgml | 14 +++
 src/backend/commands/explain.c| 11 +
 src/backend/executor/execMain.c   |  3 ++
 src/backend/executor/execPartition.c  |  9 ++--
 src/backend/parser/analyze.c  | 29 +
 src/bin/psql/tab-complete.c   |  4 +-
 src/include/commands/explain.h|  1 +
 src/include/executor/executor.h   | 18 +---
 src/test/regress/expected/explain.out | 42 +++
 src/test/regress/sql/explain.sql  | 24 +++
 12 files changed, 199 insertions(+), 11 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 04a3ef450c..25b91ab2e1 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11783,3 +11783,33 @@ ANALYZE analyze_table;
 -- cleanup
 DROP FOREIGN TABLE analyze_ftable;
 DROP TABLE analyze_table;
+-- ===
+-- test EXPLAIN (GENERIC_PLAN) with foreign partitions
+-- ===
+-- this is needed to exercise the EXEC_FLAG_EXPLAIN_GENERIC flag
+CREATE TABLE gen_part (
+key1 integer NOT NULL,
+key2 integer NOT NULL
+) PARTITION BY LIST (key1);
+CREATE TABLE gen_part_1
+PARTITION OF gen_part FOR VALUES IN (1)
+PARTITION BY RANGE (key2);
+CREATE FOREIGN TABLE gen_part_1_1
+PARTITION OF gen_part_1 FOR VALUES FROM (1) TO (2)
+SERVER testserver1 OPTIONS (table_name 'whatever_1_1');
+CREATE FOREIGN TABLE gen_part_1_2
+PARTITION OF gen_part_1 FOR VALUES FROM (2) TO (3)
+SERVER testserver1 OPTIONS (table_name 'whatever_1_2');
+CREATE FOREIGN TABLE gen_part_2
+PARTITION OF gen_part FOR VALUES IN (2)
+SERVER testserver1 OPTIONS (table_name 'whatever_2');
+-- this should only scan "gen_part_1_1" and "gen_part_1_2", but not "gen_part_2"
+EXPLAIN (GENERIC_PLAN, COSTS OFF) SELECT key1, key2 FROM gen_part WHERE key1 = 1 AND key2 = $1;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Append
+   ->  Foreign Scan on gen_part_1_1 gen_part_1
+   ->  Foreign Scan on gen_part_1_2 gen_part_2
+(3 rows)
+
+DROP TABLE gen_part;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 4f3088c03e..6adc3f2c78 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3979,3 +3979,28 @@ ANALYZE analyze_table;
 -- cleanup
 DROP FOREIGN TABLE analyze_ftable;
 DROP TABLE analyze_table;
+
+-- ===
+-- test EXPLAIN (GENERIC_PLAN) with foreign partitions
+-- ===
+
+-- this is needed to exercise the EXEC_FLAG_EXPLAIN_GENERIC flag
+CREATE TABLE gen_part (
+key1 integer NOT NULL,
+key2 integer NOT NULL
+) PARTITION BY LIST (key1);
+CREATE TABLE gen_part_1
+PARTITION OF gen_part FOR VALUES IN (1)
+PARTITION BY RANGE (key2);
+CREATE FOREIGN TABLE gen_part_1_1
+PARTITION OF gen_part_1 FOR VALUES FROM (1) TO (2)
+SERVER testserver1 OPTIONS (table_name 'whatever_1_1');
+CREATE FOREIGN TABLE gen_part_1_2
+PARTITION OF gen_part_1 FOR VALUES FROM (2) TO (3)
+SERVER testserver1 OPTIONS (table_name 'whatever_1_2');
+CREATE FOREIGN TABLE gen_part_2
+PARTITION OF gen_part FOR VALUES IN (2)
+SERVER testserver1 OPTIONS (table_name 'whatever_2');
+-- this should only scan "gen_part_1_1" and "gen_part_1_2", but not "gen_part_2"
+EXPLAIN (GENERIC_PLAN, COSTS OFF) SELECT key1, key2 FROM gen_part WHERE key1 = 1 AND key2 = $1;
+DROP TABLE gen_part;
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 0fce622423..4985545c78 100644
--- a/doc/src/sgml/ref/explain.sgml

Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-03-27 Thread Laurenz Albe
On Fri, 2023-03-24 at 16:41 -0400, Tom Lane wrote:
> Christoph Berg  writes:
> > Re: Tom Lane
> > > I don't actually see why a postgres_fdw test case is needed at all?
> > > The tests in explain.sql seem sufficient.
> 
> > When I asked Laurenz about that he told me that local tables wouldn't
> > exercise the code specific for EXEC_FLAG_EXPLAIN_GENERIC.
> 
> But there isn't any ... or at least, I fail to see what isn't sufficiently
> exercised by that new explain.sql test case that's identical to this one
> except for being a non-foreign table.  Perhaps at some point this patch
> modified postgres_fdw code?  But it doesn't now.
> 
> I don't mind having a postgres_fdw test if there's something for it
> to test, but it just looks duplicative to me.  Other things being
> equal, I'd prefer to test this feature in explain.sql, since (a) it's
> a core feature and (b) the core tests are better parallelized than the
> contrib tests, so the same test should be cheaper to run.
> 
> > (Admittedly my knowledge of the planner wasn't deep enough to verify
> > that. Laurenz is currently traveling, so I don't know if he could
> > answer this himself now.)
> 
> OK, thanks for the status update.  What I'll do to get this off my
> plate is to push the patch without the postgres_fdw test -- if
> Laurenz wants to advocate for that when he returns, we can discuss it
> more.

Thanks for committing this.

As Christoph mentioned, I found that I could not reproduce the problem
that was addressed by the EXEC_FLAG_EXPLAIN_GENERIC hack using local
partitioned tables.  My optimizer knowledge is not deep enough to tell
why, and it might well be a bug lurking in the FDW part of the
optimizer code.  It is not specific to one FDW, since I discovered it
with oracle_fdw and could reproduce it with postgres_fdw.

I was aware that it is awkward to add a test to a contrib module, but
I thought that I should add a test that exercises the new code path.
But I am fine without the postgres_fdw test.

Yours,
Laurenz Albe




Re: Add standard collation UNICODE

2023-03-27 Thread Laurenz Albe
On Thu, 2023-03-23 at 13:16 -0700, Jeff Davis wrote:
> Another thought: for ICU, do we want the default collation to be
> UNICODE (root collation)? What we have now gets the default from the
> environment, which is consistent with the libc provider.
> 
> But now that we have the UNICODE collation, it makes me wonder if we
> should just default to that. The server's environment doesn't
> necessarily say much about the locale of the data stored in it or the
> locale of the applications accessing it.
> 
> I don't have a strong opinion here, but I thought I'd raise the issue.
> 
> By my count, >50% of locales are actually just the root locale. I'm not
> sure if that should matter or not -- we don't want to weigh some
> locales over others -- but I found it interesting.

I second that.  Most people don't pay attention to that when creating a
cluster, so having a locale-agnostic collation is often better than
inheriting whatever default happened to be set in your shell.
For example, the Debian/Ubuntu binary packages create a cluster when
you install the server package, and most people just go on using that.
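
With the UNICODE collation, a locale-agnostic sort order can also be
requested explicitly (a sketch):

  CREATE TABLE t (s text COLLATE "unicode");
  SELECT s FROM t ORDER BY s;  -- root-locale order, independent of the environment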

Yours,
Laurenz Albe




Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-01-09 Thread Laurenz Albe
On Tue, 2022-12-27 at 14:37 -0800, Michel Pelletier wrote:
> I built and tested this patch for review and it works well, although I got 
> the following warning when building:
> 
> analyze.c: In function 'transformStmt':
> analyze.c:2919:35: warning: 'generic_plan' may be used uninitialized in this 
> function [-Wmaybe-uninitialized]
>  2919 |         pstate->p_generic_explain = generic_plan;
>       |         ~~^~
> analyze.c:2909:25: note: 'generic_plan' was declared here
>  2909 |         bool            generic_plan;
>       |                         ^~~~

Thanks for checking.  The variable should indeed be initialized, although
my compiler didn't complain.

Attached is a fixed version.

Yours,
Laurenz Albe
From baf60d9480d8022866d1ed77b00c7b8506f97f70 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 9 Jan 2023 17:37:40 +0100
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized
statements (that have parameter placeholders like $1 in the
statement text).

Author: Laurenz Albe
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 doc/src/sgml/ref/explain.sgml | 15 +++
 src/backend/commands/explain.c|  9 +
 src/backend/parser/analyze.c  | 13 +
 src/backend/parser/parse_coerce.c | 15 +++
 src/backend/parser/parse_expr.c   | 16 
 src/include/commands/explain.h|  1 +
 src/include/parser/parse_node.h   |  2 ++
 src/test/regress/expected/explain.out | 14 ++
 src/test/regress/sql/explain.sql  |  5 +
 9 files changed, 90 insertions(+)

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..221f905a59 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statementboolean ]
 COSTS [ boolean ]
 SETTINGS [ boolean ]
+GENERIC_PLAN [ boolean ]
 BUFFERS [ boolean ]
 WAL [ boolean ]
 TIMING [ boolean ]
@@ -167,6 +168,20 @@ ROLLBACK;
 

 
+   
+GENERIC_PLAN
+
+ 
+  Generate a generic plan for the statement (see 
+  for details about generic plans).  The statement can contain parameter
+  placeholders like $1 and must be a statement that can
+  use parameters.  This option cannot be used together with
+  ANALYZE, since a statement with unknown parameters
+  cannot be executed.
+ 
+
+   
+

 BUFFERS
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e4621ef8d6..7ee3d24da2 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -190,6 +190,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 			es->wal = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "settings") == 0)
 			es->settings = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "generic_plan") == 0)
+			es->generic = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "timing") == 0)
 		{
 			timing_set = true;
@@ -227,6 +229,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
+	if (es->generic && es->analyze)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("EXPLAIN ANALYZE cannot be used with GENERIC_PLAN")));
+
+	/* check that WAL is used with EXPLAIN ANALYZE */
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 5b90974e83..8b56eadf7e 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -27,6 +27,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
+#include "commands/defrem.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -2905,6 +2906,18 @@ static Query *
 transformExplainStmt(ParseState *pstate, ExplainStmt *stmt)
 {
 	Query	   *result;
+	ListCell   *lc;
+	bool		generic_plan = false;
+
+	foreach(lc, stmt->options)
+	{
+		DefElem*opt = (DefElem *) lfirst(lc);
+
+		if (strcmp(opt->defname, "generic_plan") == 0)
+			generic_plan = defGetBoolean(opt);
+		/* don't "break", as we want the last value */
+	}
+	pstate->p_generic_explain = generic_plan;
 
 	/* transform contained query, allowing SELECT INTO */
 	stmt->query = (Node *) transformOptionalSelectInto(pstate, stmt->query);
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c

Re: minor bug

2023-01-17 Thread Laurenz Albe
On Mon, 2023-01-16 at 19:59 +0100, Torsten Förtsch wrote:
> not sure if this is known behavior.
> 
> Server version is 14.6 (Debian 14.6-1.pgdg110+1).
> 
> In a PITR setup I have these settings:
> 
> recovery_target_xid = '852381'
> recovery_target_inclusive = 'false'
> 
> In the log file I see this message:
> 
> LOG:  recovery stopping before commit of transaction 852381, time 2000-01-01 
> 00:00:00+00
> 
> But:
> 
> postgres=# select * from pg_last_committed_xact();
>   xid   |           timestamp           | roident 
> --------+-------------------------------+---------
>  852380 | 2023-01-16 18:00:35.054495+00 |       0
> 
> So, the timestamp displayed in the log message is certainly wrong.

Redirected to -hackers.

If recovery stops at a WAL record that has no timestamp, you get this
bogus recovery stop time.  I think we should show the recovery stop time
only if time was the target, as in the attached patch.

Yours,
Laurenz Albe
From 622e52bbd652fc8872448e46c3ca0bc78dd847fe Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 17 Jan 2023 10:38:40 +0100
Subject: [PATCH] Don't show bogus recovery stop time
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If target for archive recovery was different from time, the
WAL record that ended recovery might not have a timestamp.
In that case, the log message at the end of recovery would
show a recovery time of midnight on 2000-01-01.

Reported-by: Torsten Förtsch
Author: Laurenz Albe
Discussion: https://postgr.es/m/cakkg4_kuevpqbmyoflajx7opaqk6cvwkvx0hrcfjspfrptx...@mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index bc3c3eb3e7..ce3718ca2d 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2570,19 +2570,21 @@ recoveryStopsBefore(XLogReaderState *record)
 		recoveryStopLSN = InvalidXLogRecPtr;
 		recoveryStopName[0] = '\0';
 
-		if (isCommit)
+		/* for targets other than time, we might have no stop time */
+		if (recoveryTarget == RECOVERY_TARGET_TIME)
 		{
 			ereport(LOG,
-					(errmsg("recovery stopping before commit of transaction %u, time %s",
+					(errmsg("recovery stopping before %s of transaction %u, time %s",
+							(isCommit ? "commit" : "abort"),
 							recoveryStopXid,
 							timestamptz_to_str(recoveryStopTime))));
 		}
 		else
 		{
 			ereport(LOG,
-					(errmsg("recovery stopping before abort of transaction %u, time %s",
-							recoveryStopXid,
-							timestamptz_to_str(recoveryStopTime))));
+					(errmsg("recovery stopping before %s of transaction %u",
+							(isCommit ? "commit" : "abort"),
+							recoveryStopXid)));
 		}
 	}
 
-- 
2.39.0



Re: minor bug

2023-01-18 Thread Laurenz Albe
On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
> Laurenz Albe  writes:
> > On Mon, 2023-01-16 at 19:59 +0100, Torsten Förtsch wrote:
> > > So, the timestamp displayed in the log message is certainly wrong.
> 
> > If recovery stops at a WAL record that has no timestamp, you get this
> > bogus recovery stop time.  I think we should show the recovery stop time
> > only if time was the target, as in the attached patch.
> 
> I don't think that is a tremendously useful definition: the user
> already knows what recoveryStopTime is, or can find it out from
> their settings easily enough.
> 
> I seem to recall that the original idea was to report the timestamp
> of the commit/abort record we are stopping at.  Maybe my memory is
> faulty, but I think that'd be significantly more useful than the
> current behavior, *especially* when the replay stopping point is
> defined by something other than time.
> 
> (Also, the wording of the log message suggests that that's what
> the reported time is supposed to be.  I wonder if somebody messed
> this up somewhere along the way.)

recoveryStopTime is set to recordXtime (the time of the xlog record)
a few lines above that patch, so this is useful information if it is
present.

I realized that my original patch might be a problem for translation;
here is an updated version that does not take any shortcuts.

Yours,
Laurenz Albe
From b01428486f7795f757edd14f66362ee575d2f168 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 18 Jan 2023 09:23:50 +0100
Subject: [PATCH] Don't show bogus recovery stop time
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If target for archive recovery was different from time, the
WAL record that ended recovery might not have a timestamp.
In that case, the log message at the end of recovery would
show a recovery time of midnight on 2000-01-01.

Reported-by: Torsten Förtsch
Author: Laurenz Albe
Discussion: https://postgr.es/m/cakkg4_kuevpqbmyoflajx7opaqk6cvwkvx0hrcfjspfrptx...@mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c | 35 +--
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index bc3c3eb3e7..73929a535e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2570,19 +2570,38 @@ recoveryStopsBefore(XLogReaderState *record)
 		recoveryStopLSN = InvalidXLogRecPtr;
 		recoveryStopName[0] = '\0';
 
+		/* this could be simplified, but that might break translatability */
 		if (isCommit)
 		{
-			ereport(LOG,
-					(errmsg("recovery stopping before commit of transaction %u, time %s",
-							recoveryStopXid,
-							timestamptz_to_str(recoveryStopTime))));
+			if (recoveryStopTime != 0)
+			{
+				ereport(LOG,
+						(errmsg("recovery stopping before commit of transaction %u, time %s",
+								recoveryStopXid,
+								timestamptz_to_str(recoveryStopTime))));
+			}
+			else
+			{
+				ereport(LOG,
+						(errmsg("recovery stopping before commit of transaction %u",
+								recoveryStopXid)));
+			}
 		}
 		else
 		{
-			ereport(LOG,
-					(errmsg("recovery stopping before abort of transaction %u, time %s",
-							recoveryStopXid,
-							timestamptz_to_str(recoveryStopTime))));
+			if (recoveryStopTime != 0)
+			{
+				ereport(LOG,
+						(errmsg("recovery stopping before abort of transaction %u, time %s",
+								recoveryStopXid,
+								timestamptz_to_str(recoveryStopTime))));
+			}
+			else
+			{
+				ereport(LOG,
+						(errmsg("recovery stopping before abort of transaction %u",
+								recoveryStopXid)));
+			}
 		}
 	}
 
-- 
2.39.0



Re: document the need to analyze partitioned tables

2023-01-18 Thread Laurenz Albe
On Tue, 2023-01-17 at 16:16 -0500, Bruce Momjian wrote:
> On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:
> > Maybe (all?) the clarification the docs need is to say:
> > "Partitioned tables are not *themselves* processed by autovacuum."
> 
> Yes, I think the lack of autovacuum needs to be specifically mentioned
> since most people assume autovacuum handles _all_ statistics updating.
> 
> Can someone summarize how bad it is we have no statistics on partitioned
> tables?  It sounds bad to me.

Andrey Lepikhov had an example earlier in this thread[1].  It doesn't take
an exotic query. 
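
For a self-contained illustration (a sketch): autovacuum will analyze the
partitions of the following table, but only a manual ANALYZE maintains the
statistics of "part" itself:

  CREATE TABLE part (id integer, val text) PARTITION BY RANGE (id);
  CREATE TABLE part_1 PARTITION OF part FOR VALUES FROM (0) TO (1000000);
  INSERT INTO part SELECT i, 'x' FROM generate_series(1, 999999) AS i;
  ANALYZE part;  -- also processes the partitions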

Attached is a new version of my patch that tries to improve the wording.

Yours,
Laurenz Albe

 [1]: https://postgr.es/m/3df5c68b-13aa-53d0-c0ec-ed98e6972e2e%40postgrespro.ru
From 53da8083556364490d42077492e608152f9ae02e Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 18 Jan 2023 10:09:21 +0100
Subject: [PATCH] Improve autovacuum doc on partitioned tables

The documentation mentioned that autovacuum doesn't process
partitioned tables, but it was unclear about the impact.
The old wording could be interpreted to mean that there are
problems with dead tuple cleanup on partitioned tables.
Clarify that the only potential problem is autoanalyze, and
that statistics for the partitions will be gathered.

Author: Laurenz Albe
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/1fd81ddc7710a154834030133c6fea41e55c8efb.camel%40cybertec.at
---
 doc/src/sgml/maintenance.sgml | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 759ea5ac9c..3954e797a4 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -860,8 +860,15 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu

 

-Partitioned tables are not processed by autovacuum.  Statistics
-should be collected by running a manual ANALYZE when it is
+The partitions of a partitioned table are normal tables and get processed
+by autovacuum, but autovacuum doesn't process the partitioned table itself.
+This is no problem as far as VACUUM is concerned, since
+processing the partitions is sufficient.  But, as mentioned in
+, it also means that autovacuum won't
+run ANALYZE on the partitioned table itself.
+While statistics are gathered on the partitions, some queries may rely on
+the statistics for the partitioned table.  You should collect statistics by
+running a manual ANALYZE when the partitioned table is
 first populated, and again whenever the distribution of data in its
 partitions changes significantly.

-- 
2.39.0



Re: document the need to analyze partitioned tables

2023-01-18 Thread Laurenz Albe
On Wed, 2023-01-18 at 11:49 -0600, Justin Pryzby wrote:
> I tweaked this a bit to end up with:
> 
> > -    Partitioned tables are not processed by autovacuum.  Statistics
> > -    should be collected by running a manual ANALYZE 
> > when it is
> > +    The leaf partitions of a partitioned table are normal tables and are 
> > processed
> > +    by autovacuum; however, autovacuum does not process the partitioned 
> > table itself.
> > +    This is no problem as far as VACUUM is concerned, 
> > since
> > +    there's no need to vacuum the empty, partitioned table.  But, as 
> > mentioned in
> > +    , it also means that autovacuum 
> > won't
> > +    run ANALYZE on the partitioned table.
> > +    Although statistics are automatically gathered on its leaf partitions, 
> > some queries also need
> > +    statistics on the partitioned table to run optimally.  You should 
> > collect statistics by
> > +    running a manual ANALYZE when the partitioned table 
> > is
> >  first populated, and again whenever the distribution of data in its
> >  partitions changes significantly.
> >     
> 
> "partitions are normal tables" was techically wrong, as partitions can
> also be partitioned.

I am fine with your tweaks.  I think this is good to go.

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2023-01-19 Thread Laurenz Albe
On Wed, 2023-01-18 at 16:23 -0500, Bruce Momjian wrote:
> Is it possible to document when partition table statistics helps?

I think it would be difficult to come up with an exhaustive list.

Yours,
Laurenz Albe




Re: minor bug

2023-01-19 Thread Laurenz Albe
On Wed, 2023-01-18 at 15:03 -0500, Tom Lane wrote:
> Laurenz Albe  writes:
> > On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
> > > I seem to recall that the original idea was to report the timestamp
> > > of the commit/abort record we are stopping at.  Maybe my memory is
> > > faulty, but I think that'd be significantly more useful than the
> > > current behavior, *especially* when the replay stopping point is
> > > defined by something other than time.
> > > (Also, the wording of the log message suggests that that's what
> > > the reported time is supposed to be.  I wonder if somebody messed
> > > this up somewhere along the way.)
> 
> > recoveryStopTime is set to recordXtime (the time of the xlog record)
> > a few lines above that patch, so this is useful information if it is
> > present.
> 
> Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
> Digging in the git history, I see that this did use to work as
> I remember: we always extracted the record time before printing it.
> That was accidentally broken in refactoring in c945af80c.  I think
> the correct fix is more like the attached.

Yes, you are right.  Your patch looks fine to me.

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2023-01-20 Thread Laurenz Albe
On Thu, 2023-01-19 at 15:56 -0500, Bruce Momjian wrote:
> On Thu, Jan 19, 2023 at 01:50:05PM +0100, Laurenz Albe wrote:
> > On Wed, 2023-01-18 at 16:23 -0500, Bruce Momjian wrote:
> > > Is it possible to document when partition table statistics helps?
> > 
> > I think it would be difficult to come up with an exhaustive list.
> 
> I was afraid of that.  I asked only because most people assume
> autovacuum handles _all_ statistics needs, but this case is not handled.
> Do people even have any statistics maintenance process anymore, and if
> not, how would they know they need to run a manual ANALYZE?

Probably not.  I think this would warrant an entry in the TODO list:
"make autovacuum collect statistics for partitioned tables".

Even if we cannot give better advice than "run ANALYZE manually if
the execution plan looks fishy", the patch is still an improvement,
isn't it?

I have already seen several questions by people who read the current
documentation and were worried that autovacuum wouldn't clean up their
partitioned tables.

Yours,
Laurenz Albe




Re: ***Conflict with recovery error***

2023-01-20 Thread Laurenz Albe
On Fri, 2023-01-20 at 08:56 +0000, Abhishek Prakash wrote:
> We are facing the below issue with a read replica.  We worked around it
> by setting hot_standby_feedback, max_standby_streaming_delay and
> max_standby_archive_delay, which in turn had adverse effects on the
> primary DB and its storage.  Our DB is nearly 6 TB and runs as AWS
> Postgres RDS.
>  
> The below error even occurs on tables where vacuum is disabled and no
> DML operations are permitted.  Is there any chance that row versions
> change even though vacuum is disabled?
> Please advise.
>  
> 2023-01-13 07:20:12 
> UTC:10.64.103.75(61096):ubpreplica@ubprdb01:[17707]:ERROR:  canceling 
> statement due to conflict with recovery
> 2023-01-13 07:20:12 
> UTC:10.64.103.75(61096):ubpreplica@ubprdb01:[17707]:DETAIL:  User query might 
> have needed to see row versions that must be removed.

It could be HOT chain pruning or an anti-wraparound autovacuum (which runs
even if autovacuum is disabled).
Disabling autovacuum is not a smart idea to begin with.

Your best bet is to set "max_standby_streaming_delay = -1".

More reading:
https://www.cybertec-postgresql.com/en/streaming-replication-conflicts-in-postgresql/

Yours,
Laurenz Albe




Re: ***Conflict with recovery error***

2023-01-20 Thread Laurenz Albe
On Fri, 2023-01-20 at 09:59 +0000, Abhishek Prakash wrote:
> We had set max_standby_streaming_delay = -1, but faced storage issues:
> nearly 3.5 TB of storage was consumed.

Then either don't run queries that take that long or run fewer data
modifications on the primary.

Or invest in a few more TB of disk storage.

Yours,
Laurenz Albe




Re: postgres_fdw: dead lock in a same transaction when postgres_fdw server is lookback

2022-10-01 Thread Laurenz Albe
On Sat, 2022-10-01 at 04:02 +0000, Xiaoran Wang wrote:
> I created a postgres_fdw server "lookback" as the test does, then ran the
> following SQLs
> 
> [create a foreign server via loopback and manipulate the same data locally 
> and via foreign table]
> 
> Then the transaction got stuck. Should the "lookback" server be disabled in 
> the postgres_fdw?

It shouldn't; there are good use cases for that ("autonomous transactions").
At most, some cautionary documentation could be added, but I am not convinced
that even that is necessary.

I'd say that this is a pretty obvious case of pilot error.

Yours,
Laurenz Albe




Re: future of serial and identity columns

2022-10-04 Thread Laurenz Albe
On Tue, 2022-10-04 at 09:41 +0200, Peter Eisentraut wrote:
> In PostgreSQL 10, we added identity columns, as an alternative to serial 
> columns (since 6.something).  They mostly work the same.  Identity 
> columns are SQL-conforming, have some more features (e.g., overriding 
> clause), and are a bit more robust in schema management.  Some of that 
> was described in [0].  AFAICT, there have been no complaints since that 
> identity columns lack features or are somehow a regression over serial 
> columns.
> 
> But clearly, the syntax "serial" is more handy, and most casual examples 
> use that syntax.  So it seems like we are stuck with maintaining these 
> two variants in parallel forever.  I was thinking we could nudge this a 
> little by remapping "serial" internally to create an identity column 
> instead.  At least then over time, the use of the older serial 
> mechanisms would go away.

I think that would be great.
That might generate some confusion among users who follow old tutorials
and are surprised that the eventual table definition differs, but I'd say
that is a good thing.
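
For illustration, the idea is that (a sketch)

  CREATE TABLE t (id serial PRIMARY KEY);

would end up behaving like

  CREATE TABLE t (id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);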

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2022-10-05 Thread Laurenz Albe
On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:
> I've pushed the last version, and backpatched it to 10 (not sure I'd
> call it a bugfix, but I certainly agree with Justin it's worth
> mentioning in the docs, even on older branches).

I'd like to suggest an improvement to this.  The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.


By the way, where are the statistics of a partitioned table used?  The actual
tables scanned are always the partitions, and in the execution plans that
I have seen, the optimizer always used the statistics of the partitions.

Yours,
Laurenz Albe
From 5209f228f09e52780535edacfee5f7efd2c25081 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 5 Oct 2022 10:31:47 +0200
Subject: [PATCH] Improve autovacuum doc on partitioned tables

The documentation mentioned that autovacuum doesn't process
partitioned tables, but it was unclear about the impact.
The old wording could be interpreted to mean that there are
problems with dead tuple cleanup on partitioned tables.
Clarify that the only potential problem is autoanalyze, and
that statistics for the partitions will be gathered.
---
 doc/src/sgml/maintenance.sgml | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 759ea5ac9c..53e3fadbaf 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -860,10 +860,15 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu

 

-Partitioned tables are not processed by autovacuum.  Statistics
-should be collected by running a manual ANALYZE when it is
-first populated, and again whenever the distribution of data in its
-partitions changes significantly.
+Partitioned tables are not processed by autovacuum.  This is no problem
+as far as VACUUM is concerned, since autovacuum will process
+the partitions.  But, as mentioned in ,
+it also means that autovacuum won't run ANALYZE on the
+partitioned table itself.  While statistics are gathered for the partitions,
+some queries may rely on the statistics for the partitioned table.  You should
+collect statistics by running a manual ANALYZE when the
+partitioned table is first populated, and again whenever the distribution
+of data in its partitions changes significantly.

 

-- 
2.37.3



Make EXPLAIN generate a generic plan for a parameterized query

2022-10-11 Thread Laurenz Albe
Today you get

test=> EXPLAIN SELECT * FROM tab WHERE col = $1;
ERROR:  there is no parameter $1

which makes sense.  Nonetheless, it would be great to get a generic plan
for such a query.  Sometimes you don't have the parameters (if you grab
the statement from "pg_stat_statements", or if it is from an error message
in the log, and you didn't enable "log_parameter_max_length_on_error").
Sometimes it is just very painful to substitute the 25 parameters from
the detail message.

With the attached patch you can get the following:

test=> SET plan_cache_mode = force_generic_plan;
SET
test=> EXPLAIN (COSTS OFF) SELECT * FROM pg_proc WHERE oid = $1;
  QUERY PLAN   
═══
 Index Scan using pg_proc_oid_index on pg_proc
   Index Cond: (oid = $1)
(2 rows)

That's not the same as a full-fledged EXPLAIN (ANALYZE, BUFFERS),
but it can definitely be helpful.

I tied that behavior to the setting of "plan_cache_mode" where you
are guaranteed to get a generic plan; I couldn't think of a better way.

Yours,
Laurenz Albe
From 2bc91581acd478d4648176b58745cadb835d5fbc Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 11 Oct 2022 13:05:31 +0200
Subject: [PATCH] Add EXPLAIN support for parameterized statements

If "plan_cache_mode = force_generic_plan", allow EXPLAIN to
generate generic plans for parameterized statements (that
have parameter placeholders like $1 in the statement text).

This repurposes hooks used by PL/pgSQL, so we better not try
to do that inside PL/pgSQL.
---
 doc/src/sgml/ref/explain.sgml | 10 +
 src/backend/parser/analyze.c  | 53 +++
 src/test/regress/expected/explain.out | 28 ++
 src/test/regress/sql/explain.sql  | 13 +++
 4 files changed, 104 insertions(+)

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..928d67b9b4 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -321,6 +321,16 @@ ROLLBACK;
execution, and on machines that have relatively slow operating
system calls for obtaining the time of day.
   
+
+  
+   If <xref linkend="guc-plan-cache-mode"/> is set to
+   force_generic_plan, you can use EXPLAIN
+   to generate generic plans for statements that contain placeholders like
+   $1 without knowing the actual parameter type or value.
+   Note that expressions like $1 + $2 are ambiguous if you
+   don't specify the parameter data types, so you may have to add explicit type
+   casts in such cases.
+  
  
 
  
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6688c2a865..c481d45376 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -52,6 +52,7 @@
 #include "utils/guc.h"
 #include "utils/queryjumble.h"
 #include "utils/rel.h"
+#include "utils/plancache.h"
 #include "utils/syscache.h"
 
 
@@ -86,6 +87,10 @@ static Query *transformCallStmt(ParseState *pstate,
 CallStmt *stmt);
 static void transformLockingClause(ParseState *pstate, Query *qry,
    LockingClause *lc, bool pushedDown);
+static Node * fakeUnknownParam(ParseState *pstate, ParamRef *pref);
+static Node * coerceUnknownParam(ParseState *pstate, Param *param,
+ Oid targetTypeId, int32 targetTypeMod,
+ int location);
 #ifdef RAW_EXPRESSION_COVERAGE_TEST
 static bool test_raw_expression_coverage(Node *node, void *context);
 #endif
@@ -2895,6 +2900,22 @@ transformExplainStmt(ParseState *pstate, ExplainStmt *stmt)
 {
 	Query	   *result;
 
+	/*
+	 * If we EXPLAIN a statement and are certain to generate a generic plan,
+	 * we can tolerate undefined parameters.  For that purpose, supply
+	 * parameters of type "unknown" and coerce them to the appropriate type
+	 * as needed.
+	 * If we are called from PL/pgSQL, the hooks are already set for the
+	 * purpose of resolving variables, and we don't want to disturb that.
+	 */
+	if (plan_cache_mode == PLAN_CACHE_MODE_FORCE_GENERIC_PLAN &&
+		pstate->p_paramref_hook == NULL &&
+		pstate->p_coerce_param_hook == NULL)
+	{
+		pstate->p_paramref_hook = fakeUnknownParam;
+		pstate->p_coerce_param_hook = coerceUnknownParam;
+	}
+
 	/* transform contained query, allowing SELECT INTO */
 	stmt->query = (Node *) transformOptionalSelectInto(pstate, stmt->query);
 
@@ -3466,6 +3487,38 @@ applyLockingClause(Query *qry, Index rtindex,
 	qry->rowMarks = lappend(qry->rowMarks, rc);
 }
 
+/*
+ * Return an "unknown" parameter for use with EXPLAIN of a parameterized
+ * statement.
+ */
+Node *
+fakeUnknownParam(ParseState *pstate, ParamRef *pref)
+{
+	Param  *param;
+
+	param = makeNode(Param);
+	param->paramkind = PARAM_EXTERN;
+	param->paramid = pref->number;
+	param->paramtype = UNKNOWNOID;
+	param->paramtypmod = -1;
+	param->paramcollid = InvalidOid;

Re: [PATCH] Allow usage of archive .backup files as backup_label

2022-10-17 Thread Laurenz Albe
On Tue, 2022-10-18 at 10:55 +0900, Michael Paquier wrote:
> On Mon, Aug 22, 2022 at 05:16:58PM +0200, Michael Banck wrote:
> > The .backup files written to the archive (if archiving is on) are very
> > similar to the backup_label that's written/returned by
> > pg_stop_backup()/pg_backup_stop(), they just have a few extra lines
> > about the end of backup process that are missing from backup_label.
> 
> Historically, there is "STOP WAL LOCATION" after "START WAL LOCATION",
> and "STOP TIME"/"STOP TIMELINE" at the end.
> 
> > The parser in xlogrecovery.c however barfs on them because it does not
> > expect the additional STOP WAL LOCATION on line 2.
> 
> Hm, no.  I don't think that I'd want to expand the use of the backup
> history file in the context of recovery, so as we are free to add any
> extra information into it if necessary without impacting the
> compatibility of the recovery code.  This file is primarily here for
> debugging, so I'd rather let it be used only for this purpose.
> Opinions of others are welcome, of course.

I tend to agree with you.  It is easy to break PostgreSQL by manipulating
or removing "backup_label", and copying a file from the WAL archive and
renaming it to "backup_label" sounds like a footgun of the first order.
There is nothing that prevents you from copying the wrong file.
Such practices should not be encouraged.

Anybody who knows enough about PostgreSQL to be sure that what they are
doing is correct should be smart enough to know how to edit the copied file.
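
For reference, a backup history file looks roughly like this (all values
invented).  Everything a backup_label has is there, plus the STOP WAL LOCATION
line and the STOP TIME/STOP TIMELINE lines, which is what you would have to
remove:

START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
STOP WAL LOCATION: 0/9000138 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: streamed
BACKUP FROM: primary
START TIME: 2022-10-17 10:00:00 CEST
LABEL: nightly base backup
START TIMELINE: 1
STOP TIME: 2022-10-17 10:05:00 CEST
STOP TIMELINE: 1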

Yours,
Laurenz Albe




Re: Make EXPLAIN generate a generic plan for a parameterized query

2022-10-25 Thread Laurenz Albe
On Wed, 2022-10-12 at 00:03 +0800, Julien Rouhaud wrote:
> On Tue, Oct 11, 2022 at 09:49:14AM -0400, Tom Lane wrote:
> > I think it might be better to drive it off an explicit EXPLAIN option,
> > perhaps
> >
> > EXPLAIN (GENERIC_PLAN) SELECT * FROM tab WHERE col = $1;
> > 
> > If you're trying to investigate custom-plan behavior, then you
> > need to supply concrete parameter values somewhere, so I think
> > this approach is fine for that case.  (Shoehorning parameter
> > values into EXPLAIN options seems like it'd be a bit much.)
> > However, investigating generic-plan behavior this way is tedious,
> > since you have to invent irrelevant parameter values, plus mess
> > with plan_cache_mode or else run the explain half a dozen times.
> > So I can get behind having a more convenient way for that.
> 
> One common use case is tools identifying a slow query using 
> pg_stat_statements,
> identifying some missing indexes and then wanting to check whether the index
> should be useful using some hypothetical index.
> 
> FTR I'm working on such a project and for now we have to go to great lengths
> trying to "unjumble" such queries, so having a way to easily get the answer 
> for
> a generic plan would be great.

Thanks for the suggestions and the encouragement.  Here is a patch that
implements it with an EXPLAIN option named GENERIC_PLAN.
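
Usage then looks like this (assuming a table "tab" with a column "col"):

EXPLAIN (GENERIC_PLAN) SELECT * FROM tab WHERE col = $1;

-- parameters whose type cannot be deduced need an explicit cast:
EXPLAIN (GENERIC_PLAN) SELECT $1::integer + $2::integer;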

Yours,
Laurenz Albe
From 85991f35f0de6e4e0a0b5843373e2ba3d5976c85 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 25 Oct 2022 11:01:53 +0200
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized
statements (that have parameter placeholders like $1 in the
statement text).

Author: Laurenz Albe
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 doc/src/sgml/ref/explain.sgml | 15 ++
 src/backend/commands/explain.c|  9 +
 src/backend/parser/analyze.c  | 13 +
 src/backend/parser/parse_coerce.c | 15 ++
 src/backend/parser/parse_expr.c   | 16 +++
 src/include/commands/explain.h|  1 +
 src/include/parser/parse_node.h   |  2 ++
 src/test/regress/expected/explain.out | 28 +++
 src/test/regress/sql/explain.sql  | 16 +++
 9 files changed, 115 insertions(+)

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..659d5c51b6 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 VERBOSE [ boolean ]
 COSTS [ boolean ]
 SETTINGS [ boolean ]
+GENERIC_PLAN [ boolean ]
 BUFFERS [ boolean ]
 WAL [ boolean ]
 TIMING [ boolean ]
@@ -167,6 +168,20 @@ ROLLBACK;
 

 
+   
+GENERIC_PLAN
+
+ 
+  Generate a generic plan for the statement (see <xref linkend="sql-prepare"/>
+  for details about generic plans).  The statement can contain parameter
+  placeholders like $1 and must be a statement that can
+  use parameters.  This option cannot be used together with
+  ANALYZE, since a statement with unknown parameters
+  cannot be executed.
+ 
+
+   
+

 BUFFERS
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..7b7ca3f90a 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -190,6 +190,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 			es->wal = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "settings") == 0)
 			es->settings = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "generic_plan") == 0)
+			es->generic = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "timing") == 0)
 		{
 			timing_set = true;
@@ -227,6 +229,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
+	if (es->generic && es->analyze)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN ANALYZE cannot be used with GENERIC_PLAN")));
+
+	/* check that WAL is used with EXPLAIN ANALYZE */
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6688c2a865..c849765151 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -27,6 +27,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
+#include "commands/defrem.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h&quo

Re: explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))

2022-10-25 Thread Laurenz Albe
On Thu, 2022-10-20 at 21:09 -0500, Justin Pryzby wrote:
> Rebased.

I had a look at the patch set.

It applies and builds cleanly and passes the regression tests.

0001: Add GUC: explain_regress

  I like the idea of the "explain_regress" GUC.  That should simplify
  the regression tests.

  --- a/src/test/regress/pg_regress.c
  +++ b/src/test/regress/pg_regress.c
  @@ -625,7 +625,7 @@ initialize_environment(void)
   * user's ability to set other variables through that.
   */
  {
  -   const char *my_pgoptions = "-c intervalstyle=postgres_verbose";
   +   const char *my_pgoptions = "-c intervalstyle=postgres_verbose -c explain_regress=on";
  const char *old_pgoptions = getenv("PGOPTIONS");
  char   *new_pgoptions;
 
  @@ -2288,6 +2288,7 @@ regression_main(int argc, char *argv[],
  fputs("log_lock_waits = on\n", pg_conf);
  fputs("log_temp_files = 128kB\n", pg_conf);
  fputs("max_prepared_transactions = 2\n", pg_conf);
  +   // fputs("explain_regress = yes\n", pg_conf);
 
  for (sl = temp_configs; sl != NULL; sl = sl->next)
  {

  This code comment is a leftover and should go.

0002: exercise explain_regress

  This is the promised simplification.  Looks good.

0003: Make explain default to BUFFERS TRUE

  Yes, please!
  This patch is independent from the previous patches.
  I'd like this to be applied, even if the GUC is rejected.

  (I understand that that would cause more changes in the regression
  test output if we changed that without introducing "explain_regress".)

  The patch changes the default value of "auto_explain.log_buffers"
  to "on", which makes sense.  However, the documentation in
  doc/src/sgml/auto-explain.sgml should be updated to reflect that.

  --- a/doc/src/sgml/perform.sgml
  +++ b/doc/src/sgml/perform.sgml
   @@ -731,8 +731,8 @@ EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
  
 
  
   -EXPLAIN has a BUFFERS option that can be used with
   -ANALYZE to get even more run time statistics:
   +EXPLAIN ANALYZE has a BUFFERS option which
   +provides even more run time statistics:
 
   
   EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

  This is not enough.  The patch would have to update all the examples
  that use EXPLAIN ANALYZE.  I see two options:

  1. Change the output of all the examples and move this explanation to
     the first example with EXPLAIN ANALYZE:

       The numbers in the Buffers: line help to identify which parts
       of the query are the most I/O-intensive.

  2. Change all examples that currently do *not* use BUFFERS to
     EXPLAIN (BUFFERS OFF).  Then you could change the example that shows
     BUFFERS to something like

       If you do not suppress the BUFFERS option that can be used with
       EXPLAIN (ANALYZE), you get even more run time statistics:

0004, 0005, 0006, 0007: EXPLAIN (MACHINE)

  I think it is confusing that these are included in this patch set.
  EXPLAIN (MACHINE OFF) is similar to "explain_regress = on", only it goes
  even further:
  no query ID, no Heap Fetches, no Sort details, ...
  Why not add this functionality to the GUC?

  0005 suppresses "rows removed by filter", but how is that machine dependent?

> BTW, I think it may be that the GUC should be marked PGDLLIMPORT ?

I think it is project policy to apply this mark wherever it is needed.
Do you think that third-party extensions might need to use this in C code?

Yours,
Laurenz Albe




Re: Make EXPLAIN generate a generic plan for a parameterized query

2022-10-29 Thread Laurenz Albe
On Tue, 2022-10-25 at 19:03 +0800, Julien Rouhaud wrote:
> On Tue, Oct 25, 2022 at 11:08:27AM +0200, Laurenz Albe wrote:
> > Here is a patch that
> > implements it with an EXPLAIN option named GENERIC_PLAN.
> 
> I only have a quick look at the patch for now.  Any reason why you don't rely
> on the existing explain_filter() function for emitting stable output (without
> having to remove the costs)?  It would also take care of checking that it 
> works
> in plpgsql.

No, there is no principled reason I did it like that.  Version 2 does it like
you suggest.

Yours,
Laurenz Albe
From 8704d51f5810619be152ae68faa5743dcf26c7a9 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 28 Oct 2022 20:58:59 +0200
Subject: [PATCH] Add EXPLAIN option GENERIC_PLAN

This allows EXPLAIN to generate generic plans for parameterized
statements (that have parameter placeholders like $1 in the
statement text).

Author: Laurenz Albe
Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel%40cybertec.at
---
 doc/src/sgml/ref/explain.sgml | 15 +++
 src/backend/commands/explain.c|  9 +
 src/backend/parser/analyze.c  | 13 +
 src/backend/parser/parse_coerce.c | 15 +++
 src/backend/parser/parse_expr.c   | 16 
 src/include/commands/explain.h|  1 +
 src/include/parser/parse_node.h   |  2 ++
 src/test/regress/expected/explain.out | 14 ++
 src/test/regress/sql/explain.sql  |  5 +
 9 files changed, 90 insertions(+)

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..659d5c51b6 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 VERBOSE [ boolean ]
 COSTS [ boolean ]
 SETTINGS [ boolean ]
+GENERIC_PLAN [ boolean ]
 BUFFERS [ boolean ]
 WAL [ boolean ]
 TIMING [ boolean ]
@@ -167,6 +168,20 @@ ROLLBACK;
 

 
+   
+GENERIC_PLAN
+
+ 
+  Generate a generic plan for the statement (see <xref linkend="sql-prepare"/>
+  for details about generic plans).  The statement can contain parameter
+  placeholders like $1 and must be a statement that can
+  use parameters.  This option cannot be used together with
+  ANALYZE, since a statement with unknown parameters
+  cannot be executed.
+ 
+
+   
+

 BUFFERS
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..7b7ca3f90a 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -190,6 +190,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 			es->wal = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "settings") == 0)
 			es->settings = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "generic_plan") == 0)
+			es->generic = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "timing") == 0)
 		{
 			timing_set = true;
@@ -227,6 +229,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
+	if (es->generic && es->analyze)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN ANALYZE cannot be used with GENERIC_PLAN")));
+
+	/* check that WAL is used with EXPLAIN ANALYZE */
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6688c2a865..c849765151 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -27,6 +27,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
+#include "commands/defrem.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -2894,6 +2895,18 @@ static Query *
 transformExplainStmt(ParseState *pstate, ExplainStmt *stmt)
 {
 	Query	   *result;
+	ListCell   *lc;
+	bool		generic_plan = false;
+
+	foreach(lc, stmt->options)
+	{
+		DefElem*opt = (DefElem *) lfirst(lc);
+
+		if (strcmp(opt->defname, "generic_plan") == 0)
+			generic_plan = defGetBoolean(opt);
+		/* don't "break", as we want the last value */
+	}
+	pstate->p_generic_explain = generic_plan;
 
 	/* transform contained query, allowing SELECT INTO */
 	stmt->query = (Node *) transformOptionalSelectInto(pstate, stmt->query);
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index c4e958e4aa..171d8c60a8 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -369,6 +369,21 @@ coerce_type(ParseState *pstate, Node *node,
 
 		return result;
 	}
+	/*
+	 * If we 

Re: explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))

2022-11-04 Thread Laurenz Albe
Thanks for the updated patch set!

On Fri, 2022-10-28 at 17:59 -0500, Justin Pryzby wrote:
> 
> > 0004, 0005, 0006, 0007: EXPLAIN (MACHINE)
> > 
> >   I think it is confusing that these are included in this patch set.
> >   EXPLAIN (MACHINE OFF) is similar to "explain_regress = on", only it goes 
> > even further:
> >   no query ID, no Heap Fetches, no Sort details, ...
> >   Why not add this functionality to the GUC?
> 
> Good question, and one I've asked myself.
> 
> explain_regress affects pre-existing uses of explain in the regression
> tests, aiming to simplify current (or maybe only future) uses.  And
> is obviously used to allow enabling BUFFERS by default.
> 
> explain(MACHINE off) is mostly about allowing "explain ANALYZE" to be
> used in regression tests.  Which, right now, is rare.  Maybe I shouldn't
> include those patches until the earlier patches progress (or, maybe we
> should just defer their discussion).

Essentially, "explain_regress" is to simplify the current regression tests,
and MACHINE OFF is to simplify future regression tests.  That does not seem
like a meaningful distinction to me.  Both are only useful for the regression
tests, and I see no need for two different knobs.

My opinion is now like this:

+1 on enabling BUFFERS by default for EXPLAIN (ANALYZE)

+0.2 on "explain_regress"

-1 on EXPLAIN (MACHINE) in addition to "explain_regress"

-0.1 on adding the MACHINE OFF functionality to "explain_regress"

"explain_regress" reduces the EXPLAIN options you need for regression tests.
This is somewhat useful, but not a big win.  Also, it will make backpatching
regression tests slightly harder for the next 5 years.

Hence the +0.2 for "explain_regress".

For me, an important question is: do we really want more EXPLAIN (ANALYZE)
in the regression tests?  It will probably slow the tests down, and I wonder
if there is a lot of added value, if we have to remove all the machine-specific
output anyway.

Hence the -0.1.

> >   0005 suppresses "rows removed by filter", but how is that machine 
> > dependent?
> 
> It can vary with the number of parallel workers (see 13e8b2ee8), which
> may not be "machine dependent" but the point of that patch is to allow
> predictable output of EXPLAIN ANALYZE.  Maybe it needs a better name (or
> maybe it should be folded into the first patch).

Now it makes sense to me.  I just didn't think of that.

> I'm not actually sure if this patch should try to address parallel
> workers at all, or (if so) if what it's doing now is adequate.
> 
> > > BTW, I think it may be that the GUC should be marked PGDLLIMPORT ?
> > 
> > I think it is project policy to apply this mark wherever it is needed.  Do 
> > you think
> > that third-party extensions might need to use this in C code?
> 
> Since v15 (8ec569479), the policy is:
> > On Windows, export all the server's global variables using PGDLLIMPORT 
> > markers (Robert Haas)
> 
> I'm convinced now that's what's intended here.

You convinced me too.

> > This is not enough.  The patch would have to update all the examples
> > that use EXPLAIN ANALYZE.
> 
> Fair enough.  I've left a comment to handle this later if the patch
> gains any traction and 001 is progressed.

I agree with that.

I would really like to see 0003 go in, but it currently depends on 0001, which
I am not so sure about.
I understand that you did that so that "explain_regress" can turn off BUFFERS
and there is no extra churn in the regression tests.

Still, it would be a shame if resistance against "explain_regress" would
be a show-stopper for 0003.

If I could get my way, I'd want two separate patches: first, one to turn
BUFFERS on, and second one for "explain_regress" with its current functionality
on top of that.

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-04 Thread Laurenz Albe
On Sat, 2022-10-15 at 21:08 -0400, Bruce Momjian wrote:
> I therefore merged all three paragraphs into
> one and tried to make the text saner;  release_savepoint.sgml diff
> attached, URL content updated.

I wanted to have a look at this, but I am confused.  The original patch
was much bigger.  Is this just an incremental patch?  If so, it would
be nice to have a "grand total" patch, so that I can read it all
in one go.

Yours,
Laurenz Albe




Re: Return value of pg_promote()

2023-08-28 Thread Laurenz Albe
On Thu, 2023-08-17 at 09:37 +0900, Michael Paquier wrote:
> I have just noticed that we do not have a CF entry for this proposal,
> so I have added one with Laurenz as author:
> https://commitfest.postgresql.org/44/4504/

I have changed the author to Fujii Masao.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-08-28 Thread Laurenz Albe
On Thu, 2023-08-24 at 18:23 +0200, Matthias van de Meent wrote:
> On Wed, 19 Jul 2023 at 15:13, Thom Brown  wrote:
> > 
> > On Wed, 19 Jul 2023, 13:58 Laurenz Albe,  wrote:
> > > I agree that the name "max_local_update" could be improved.
> > > Perhaps "avoid_hot_above_size_mb".
> > 
> > Or "hot_table_size_threshold" or "hot_update_limit"?
> 
> Although I like these names, it doesn't quite cover the use of the
> parameter for me, as updated tuples prefer to be inserted on the same
> page as the old tuple regardless of whether HOT applies.
> 
> How about 'local_update_limit'?

I agree with your concern.  I cannot think of a better name than yours.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-05 Thread Laurenz Albe
On Wed, 2023-08-30 at 09:31 -0400, Robert Haas wrote:
> On Wed, Aug 30, 2023 at 9:01 AM Matthias van de Meent
>  wrote:
> > I've reworked the patch a bit to remove the "excessive bloat with low
> > fillfactors when local space is available" issue that this parameter
> > could cause - local updates are now done if the selected page we would
> > be inserting into is after the old tuple's page and the old tuple's
> > page still (or: now) has space available.
> > 
> > Does that alleviate your concerns?
> 
> That seems like a good chance, but my core concern is around people
> having to micromanage local_update_limit, and probably either not
> knowing how to do it properly, or not being able or willing to keep
> updating it as things change.
> 
> In a way, this parameter is a lot like work_mem, which is notoriously
> very difficult to tune.

I don't think that is a good comparison.  While most people probably
never need to touch "local_update_limit", "work_mem" is something everybody
has to consider.

And it is not so hard to tune: the setting would be the desired table
size, and you could use pgstattuple to find a good value.
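
For example (a rough sketch; "mytable" is a stand-in):

CREATE EXTENSION pgstattuple;

SELECT table_len,
       table_len - free_space - dead_tuple_len AS approximate_minimum_size
FROM pgstattuple('mytable');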

Other use cases may come to mind, but I mainly see it as a tool to
shrink a table after it has grown big holes, perhaps after a mass delete.
Today, you can only VACUUM (FULL) or play with the likes of pg_squeeze and
pg_repack.

I think this is useful.

To alleviate your concerns, perhaps it would help to describe the use case
and ideas for a good setting in the documentation.

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2023-09-05 Thread Laurenz Albe
Sorry for dropping the ball on this; I'll add it to the next commitfest.

On Wed, 2023-01-25 at 21:43 +1300, David Rowley wrote:
> > I think your first sentence it a bit clumsy and might be streamlined to
> > 
> >   Partitioned tables do not directly store tuples and consequently do not
> >   require autovacuum to perform any VACUUM operations.
> 
> That seems better than what I had.

Ok, I went with it.

> > Also, I am a little bit unhappy about
> > 
> > 1. Your paragraph states that partitioned table need no autovacuum,
> >    but doesn't state unmistakably that they will never be treated
> >    by autovacuum.
> 
> hmm. I assume the reader realises from the text that lack of any
> tuples means VACUUM is not required.  The remaining part of what
> autovacuum does not do is explained when the text goes on to say that
> ANALYZE operations are also not performed on partitioned tables. I'm
> not sure what is left that's mistakable there.

I rewrote the paragraph a little so that it looks clearer to me.
I hope it is OK for you as well.

> > 2. You make a distinction between table partitions and "normal tables",
> >    but really there is no distiction.
> 
> We may have different mental models here. This relates to the part
> that I wasn't keen on in your patch, i.e:
> 
> +    The partitions of a partitioned table are normal tables and get processed
> +    by autovacuum
> 
> While I agree that the majority of partitions are likely to be
> relkind='r', which you might ordinarily consider a "normal table", you
> just might change your mind when you try to INSERT or UPDATE records
> that would violate the partition constraint. Some partitions might
> also be themselves partitioned tables and others might be foreign
> tables. That does not really matter much when it comes to what
> autovacuum does or does not do, but I'm not really keen to imply in
> our documents that partitions are "normal tables".

Agreed, there are differences between partitions and normal tables.
And this is not the place in the documentation where we would like to
get into detail about the differences.

Attached is the next version of my patch.

Yours,
Laurenz Albe
From 33ef30888b5f5b57c776a1bd00065e0c94daccdb Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 6 Sep 2023 05:43:30 +0200
Subject: [PATCH] Improve autovacuum doc on partitioned tables

The documentation mentioned that autovacuum doesn't process
partitioned tables, but it was unclear about the impact.
The old wording could be interpreted to mean that there are
problems with dead tuple cleanup on partitioned tables.
Clarify that the only potential problem is autoanalyze, and
that statistics for the partitions will be gathered.

Author: Laurenz Albe
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/1fd81ddc7710a154834030133c6fea41e55c8efb.camel%40cybertec.at
---
 doc/src/sgml/maintenance.sgml | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 9cf9d030a8..10b1f211e8 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -861,10 +861,18 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu

 

-Partitioned tables are not processed by autovacuum.  Statistics
-should be collected by running a manual ANALYZE when it is
-first populated, and again whenever the distribution of data in its
-partitions changes significantly.
+Partitioned tables do not directly store tuples and consequently do not
+require autovacuum to perform any VACUUM operations.
+Autovacuum performs a VACUUM on the partitioned
+table's partitions the same as it does with other tables.  While that
+works fine, the fact that autovacuum doesn't process partitioned tables
+also means that it doesn't run ANALYZE on them, and this
+can be problematic, as there are various places in the query planner that
+attempt to make use of table statistics for partitioned tables when
+partitioned tables are queried.  For now, you can work around this problem
+by running a manual ANALYZE command on the partitioned
+table when the partitioned table is first populated, and again whenever
+the distribution of data in its partitions changes significantly.

 

-- 
2.41.0



Re: to_regtype() Raises Error

2023-09-17 Thread Laurenz Albe
On Mon, 2023-09-18 at 00:57 +0200, Vik Fearing wrote:
> On 9/18/23 00:41, Erik Wienhold wrote:
> > On 18/09/2023 00:13 CEST David E. Wheeler  wrote:
> > > The docs for `to_regtype()` say, “this function will return NULL rather 
> > > than
> > > throwing an error if the name is not found.” And it’s true most of the 
> > > time:
> > > 
> > > david=# select to_regtype('foo'), to_regtype('clam');
> > >   to_regtype | to_regtype
> > > +
> > >   [null] | [null]
> > > 
> > > But not others:
> > > 
> > > david=# select to_regtype('inteval second');
> > > ERROR:  syntax error at or near "second"
> > > LINE 1: select to_regtype('inteval second');
> > >  ^
> > > CONTEXT:  invalid type name "inteval second”
> > 
> > Probably a typo and you meant 'interval second' which works.
> No, that is precisely the point.  The result should be null instead of 
> an error.

Right.  I debugged into this, and found this comment on typeStringToTypeName():

 * If the string cannot be parsed as a type, an error is raised,
 * unless escontext is an ErrorSaveContext node, in which case we may
 * fill that and return NULL.  But note that the ErrorSaveContext option
 * is mostly aspirational at present: errors detected by the main
 * grammar, rather than here, will still be thrown.

"escontext" is an ErrorSaveContext node, and it is the parser failing.

Not sure if we can do anything about that or if it is worth the effort.

Perhaps the documentation could reflect the implementation.
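
If we leave it as it is, a possible workaround on the user's side would be
a wrapper like this (the function name is my invention):

CREATE FUNCTION to_regtype_lax(text) RETURNS regtype
   LANGUAGE plpgsql AS
$$BEGIN
   RETURN to_regtype($1);
EXCEPTION
   WHEN others THEN
      RETURN NULL;
END;$$;

SELECT to_regtype_lax('inteval second');  -- NULL rather than an error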

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-18 Thread Laurenz Albe
On Mon, 2023-09-18 at 12:22 -0400, Robert Haas wrote:
> On Tue, Sep 5, 2023 at 11:15 PM Laurenz Albe  wrote:
> > I don't think that is a good comparison.  While most people probably
> > never need to touch "local_update_limit", "work_mem" is something everybody
> > has to consider.
> > 
> > And it is not so hard to tune: the setting would be the desired table
> > size, and you could use pgstattuple to find a good value.
> 
> What I suspect would happen, though, is that you'd end up tuning the
> value over and over. You'd set it to some value and after some number
> of vacuums maybe you'd realize that you could save even more disk
> space if you reduced it a bit further or maybe your data set would
> grow a bit and you'd have to increase it a little (or a lot). And if
> you didn't keep adjusting it then maybe something quite bad would
> happen to your database.

There is that risk, yes.

> work_mem isn't quite the same [...] But what is the same here and in the case
> of work_mem is that you can suddenly get hosed if the situation
> changes substantially and you don't respond by updating the parameter
> setting. In the case of work_mem, again in my experience, it's quite
> common for people to suddenly find themselves in a lot of trouble if
> they have a load spike, because now they're running a lot more copies
> of the same query and the machine runs out of memory.

So the common ground is "both parameters are not so easy to get right,
and if you get them wrong, it's a problem".  For me the big difference is
that while you pretty much have to tune "work_mem", you can normally just ignore
"local_update_limit".

> The equivalent
> problem here would be if the table suddenly gets a lot bigger due to a
> load spike or some change in the way the application is used. Then
> suddenly, a setting that was previously serving to keep the table
> pleasantly small and un-bloated on disk is instead causing tons of
> updates that would have been HOT to become non-HOT, which could very
> easily result in both the table and its indexes bloating quite
> rapidly. I really don't like the idea of an anti-bloat feature that,
> when set to the wrong value, becomes a bloat-amplification feature. I
> don't know how to describe that other than "fragile and dangerous."

Yes, you can hurt yourself that way.  But that applies to many other
settings as well.  You can tank your performance with a bad value for
"commit_delay", "hot_standby_feedback" can bloat your primary, and
so on.  Still we consider these useful parameters.

> Imagine a hypothetical feature that knew how small the table could
> reasonably be kept, say by magic, and did non-HOT updates instead of
> HOT updates whenever doing so would allow moving a tuple from a page
> beyond that magical boundary to an earlier page. Such a feature would
> not have the downsides that this one does -- if there were
> opportunities to make the table smaller, the system would take
> advantage of them automatically, and if the table grew, the system
> would automatically become more relaxed to stay out of trouble. Such a
> feature is clearly more work to design and implement than what is
> proposed here, but it would also work a lot better in practice.

That sounds a bit like we should not have "shared_buffers" unless we
have a magical tool built in that gets the value right automatically.
Yes, the better is the enemy of the good.  You can kill everything with
a line of reasoning like that.

> In
> fact, I daresay that if we accept the feature as proposed, somebody's
> going to go out and write a tool to calculate what the threshold ought
> to be and automatically adjust it as things change. Users of the tool
> will then divide into two camps:
> 
> - People who try to tune it manually and get burned if anything
> changes on their system.
> - People who use that out-of-core tool.
> 
> So the out-of-core tool that does this tuning becomes a stealth
> dependency for any user who is facing this problem. Gosh, don't we
> have enough of those already? Connection pooling being perhaps the
> most obvious example, but far from the only one.

I cannot follow you there.  What I envision is that "local_update_limit"
is not set permanently on a table.  You set it when you realize your table
got bloated.  Then you wait until the bloat goes away or you launch a
couple of UPDATEs that eventually shrink the table.  Then you reset
"local_update_limit" again.
It's a more difficult, but less invasive, alternative to VACUUM (FULL).
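
Roughly like this (a sketch only: the reloption is just a proposal, and
I am assuming for the example that it is set to a block number):

-- the table has become bloated, for example after a mass delete
ALTER TABLE bloated SET (local_update_limit = 10000);

-- "no-op" updates relocate tuples from beyond the limit
UPDATE bloated SET id = id WHERE ctid >= '(10000,0)'::tid;

VACUUM bloated;  -- can now truncate the empty pages at the end

ALTER TABLE bloated RESET (local_update_limit);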

If a setting is hard to understand and hard to get right, we could invest
in good documentation that explains the use cases and pitfalls.
Wouldn't that go a long way towards defusing this perceived footgun?
I am aware that a frightening number of users don't read documentation,
but I find it hard to believe that anyone would twiddle a non-obvious
knob like "local_update_limit" without first trying to figure out what
it actually does.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-19 Thread Laurenz Albe
On Tue, 2023-09-19 at 14:50 -0400, Robert Haas wrote:
> But I know people will try to use it for instant compaction too, and
> there it's worth remembering why we removed old-style VACUUM FULL. The
> main problem is that it was mind-bogglingly slow. The other really bad
> problem is that it caused massive index bloat. I think any system
> that's based on moving around my tuples right now to make my table
> smaller right now is likely to have similar issues.

I had the same feeling that this is sort of bringing back old-style
VACUUM (FULL).  But I don't think that index bloat is a show stopper
these days, when we have REINDEX CONCURRENTLY, so I am not worried.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-19 Thread Laurenz Albe
On Tue, 2023-09-19 at 12:52 -0400, Robert Haas wrote:
> On Tue, Sep 19, 2023 at 12:30 PM Alvaro Herrera  
> wrote:
> > I was thinking something vaguely like "a table size that's roughly what
> > an optimal autovacuuming schedule would leave the table at" assuming 0.2
> > vacuum_scale_factor.  You would determine the absolute minimum size for
> > the table given the current live tuples in the table, then add 20% to
> > account for a steady state of dead tuples and vacuumed space.  So it's
> > not 1.2x of the "current" table size at the time the local_update_limit
> > feature is installed, but 1.2x of the optimal table size.
> 
> Right, that would be great. And honestly if that's something we can
> figure out, then why does the parameter even need to be an integer
> instead of a Boolean? If the system knows the optimal table size, then
> the user can just say "try to compact this table" and need not say to
> what size. The 1.2 multiplier is probably situation dependent and
> maybe the multiplier should indeed be a configuration parameter, but
> we would be way better off if the absolute size didn't need to be.

I don't have high hopes for a reliable way to automatically determine
the target table size.  There are these queries floating around to estimate
table bloat, which are used by various monitoring systems.  I find that they
get it right a lot of the time, but sometimes they get it wrong.  Perhaps
we can do better than that, but I vastly prefer a setting that I can control
(even at the danger that I can misconfigure it) over an automatism that I
cannot control and that sometimes gets it wrong.

I like Alvaro's idea to automatically reset "local_update_limit" when the
table has shrunk enough.  Why not perform that task during vacuum truncation?
If vacuum truncation has taken place, check if the table size is no bigger
than "local_update_limit" * (1 + "autovacuum_vacuum_scale_factor"), and if
it is no bigger, reset "local_update_limit".  That way, we would not have
to worry about a lock, because vacuum truncation already has the table locked.

Yours,
Laurenz Albe




Regression in COPY FROM caused by 9f8377f7a2

2023-09-25 Thread Laurenz Albe
In v16 and later, the following fails:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

COPY boom FROM STDIN;
ERROR:  value too long for type character varying(5)

In PostgreSQL v15 and earlier, the COPY statement succeeds.

The error is thrown in BeginCopyFrom in line 1578 (HEAD)

  defexpr = expression_planner(defexpr);

Bisecting shows that the regression was introduced by commit 9f8377f7a2,
which introduced DEFAULT values for COPY FROM.

The table definition is clearly silly, so I am not sure if that
regression is worth fixing.  On the other hand, it is not cool if
something that worked without an error in v15 starts to fail later on.

Yours,
Laurenz Albe




Re: Regression in COPY FROM caused by 9f8377f7a2

2023-09-25 Thread Laurenz Albe
On Mon, 2023-09-25 at 09:54 +0200, Laurenz Albe wrote:
> In v16 and later, the following fails:
> 
> CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
> 
> COPY boom FROM STDIN;
> ERROR:  value too long for type character varying(5)
> 
> In PostgreSQL v15 and earlier, the COPY statement succeeds.
> 
> The error is thrown in BeginCopyFrom in line 1578 (HEAD)
> 
>   defexpr = expression_planner(defexpr);
> 
> Bisecting shows that the regression was introduced by commit 9f8377f7a2,
> which introduced DEFAULT values for COPY FROM.

I suggest the attached fix, which evaluates default values only if
the DEFAULT option was specified or if the column does not appear in
the column list of COPY.
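
To spell out the resulting behavior:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string',
                   i integer);

COPY boom FROM STDIN;      -- works again: no column needs its default
COPY boom (i) FROM STDIN;  -- still an error: the default for "t" is needed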

Yours,
Laurenz Albe
From 4af982c56df57a1a0f04340d394c297559fbabb5 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 25 Sep 2023 10:56:15 +0200
Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary

Since commit 9f8377f7a2, we evaluate the column default values in
COPY FROM for all columns except generated ones, because they could
be needed if the input value matches the DEFAULT option.

This can cause a surprising regression:

  CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
  COPY boom FROM STDIN;
  ERROR:  value too long for type character varying(5)

This worked before 9f8377f7a2, since default values were only
evaluated for columns that were not specified in the column list.

To fix, fetch the default values only if the DEFAULT option was
specified or for columns not specified in the column list.
---
 src/backend/commands/copyfrom.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 70871ed819..320b764aa9 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate,
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
 
-		if (!att->attgenerated)
+		/*
+		 * We need the default values only for columns that do not appear in the
+		 * column list or if the DEFAULT option was given.  We also don't need
+		 * it for generated columns.
+		 */
+		if ((!list_member_int(cstate->attnumlist, attnum) ||
+			 cstate->opts.default_print != NULL) &&
+			!att->attgenerated)
 		{
 			Expr	   *defexpr = (Expr *) build_column_default(cstate->rel,
 attnum);
-- 
2.41.0



Re: Postgres auto vacuum - Disable

2022-11-07 Thread Laurenz Albe
On Mon, 2022-11-07 at 12:12 +, Karthik Jagadish (kjagadis) wrote:
> I have follow-up question where the vacuum process is waiting and not doing 
> it’s job.
> When we grep on waiting process we see below output. Whenever we see this we 
> notice
> that the vacuum is not happening and the system is running out of space.
>  
> [root@zpah0031 ~]# ps -ef | grep 'waiting'
> postgres  8833 62646  0 Jul28 ?        00:00:00 postgres: postgres cgms 
> [local] VACUUM waiting
> postgres 18437 62646  0 Jul27 ?        00:00:00 postgres: postgres cgms 
> [local] VACUUM waiting
>  
>  
> What could be the reason as to why the vacuum is not happening? Is it because 
> some lock is
> present in the table/db or any other reason?

Look in "pg_stat_activity".  I didn't check, but I'm sure it's the intentional
pause configured with "autovacuum_vacuum_cost_delay".  Reduce that parameter
for more autovacuum speed.
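
Something like this should show what the VACUUM processes are waiting for:

SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE query ILIKE 'vacuum%'
   OR backend_type = 'autovacuum worker';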

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-07 Thread Laurenz Albe
It would be a better place.  Or, even better, the new "xact.sgml" chapter.

> --- /dev/null
> +++ b/doc/src/sgml/xact.sgml

+  Transaction Management

+   The word transaction is often abbreviated as "xact".

Should use  here.

> +   Transactions and Identifiers

> +   
> +Once a transaction writes to the database, it is assigned a
> +non-virtual TransactionId (or xid),
> +e.g., 278394. Xids are assigned sequentially
> +using a global counter used by all databases within the
> +PostgreSQL cluster. This property is used by
> +the transaction system to order transactions by their first database
> +write, i.e., lower-numbered xids started writing before higher-numbered
> +xids.  Of course, transactions might start in a different order.
> +   

"This property"?  How about:
"Because transaction IDs are assigned sequentially, the transaction system can
use them to order transactions by their first database write"

I would want some additional information here: why does the transaction system 
have
to order transactions by their first database write?

"Of course, transactions might start in a different order."

Now that confuses me.  Are you saying that BEGIN could be in a different order
than the first database write?  Perhaps like this:

"Note that the order in which transactions perform their first database write
might be different from the order in which the transactions started."

> +The internal transaction ID type xid is 32-bits wide

There should be no hyphen in "32 bits wide", just as in "3 years old".

> +A 32-bit epoch is incremented during each
> +wrap around.

We usually call this "wraparound" without a space.

> + There is also a 64-bit type xid8 which
> +includes this epoch and therefore does not wrap around during the
> +life of an installation and can be converted to xid by casting.

Running "and"s.  Better:

"There is also ... and does not wrap ... life of an installation.
 xid8 can be converted to xid by casting."
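
For example:

SELECT pg_current_xact_id();       -- returns xid8, which includes the epoch
SELECT pg_current_xact_id()::xid;  -- the same, truncated to 32 bits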

> +  Xids are used as the
> +basis for PostgreSQL's <link linkend="mvcc">MVCC</link> concurrency
> +mechanism, <link linkend="hot-standby">Hot Standby</link>, and Read
> +Replica servers.

What is the difference between a hot standby and a read replica?  I think
one of these terms is sufficient.

> +In addition to vxid and xid,
> +when a transaction is prepared for two-phase commit it
> +is also identified by a Global Transaction Identifier
> +(GID).

Better:

"In addition to vxid and xid,
 prepared transactions also have a Global Transaction Identifier
 (GID) that is assigned when the transaction is
 prepared for two-phase commit."

> +  
> +
> +   Transactions and Locking
> +
> +   
> +Currently-executing transactions are shown in
> +<link linkend="view-pg-locks">pg_locks</link>
> +in columns virtualxid and
> +transactionid.

Better:

"The transaction IDs of currently executing transactions are shown in pg_locks
 in the columns virtualxid and
 transactionid."

> +Lock waits on table-level locks are shown waiting for
> +virtualxid, while lock waits on row-level
> +locks are shown waiting for transactionid.

That's not true.  Transactions waiting for table-level locks are shown
waiting for a "relation" lock in both "pg_stat_activity" and "pg_locks".
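
That is easy to see (assuming a table "t" that one session has locked and
a second session tries to lock):

-- in a third session, while session 2 is blocked behind session 1:
SELECT locktype, relation::regclass, virtualxid, transactionid, granted
FROM pg_locks
WHERE NOT granted;
-- the waiting lock has locktype 'relation', not 'virtualxid'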

> +Row-level read and write locks are recorded directly in locked
> +rows and can be inspected using the <xref linkend="pgrowlocks"/>
> +extension.  Row-level read locks might also require the assignment
> +of multixact IDs (mxid). Mxids are recorded in
> +the pg_multixact directory.

"are recorded directly in *the* locked rows"

I think the mention of multixacts should link to
.  Again, I would not
specifically mention the directory, since it is already described in
"storage.sgml", but I have no strong opinion there.

> +  
> +
> +   Subtransactions

> +The word subtransaction is often abbreviated as
> +subxact.

I'd use , not .

> +If a subtransaction is assigned a non-virtual transaction ID,
> +its transaction ID is referred to as a subxid.

Again, I would use , since we don't  "subxid"
elsewhere.

+   Up to
+64 open subxids are cached in shared memory for each backend; after
+that point, the overhead increases significantly since we must look
+up subxid entries in pg_subtrans.

Comma before "since".  Perhaps you should mention that this means disk I/O.

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-07 Thread Laurenz Albe
On Mon, 2022-11-07 at 23:04 +0100, Laurenz Albe wrote:
> On Sat, 2022-11-05 at 10:08 +, Simon Riggs wrote:
> > Agreed; new compilation patch attached, including mine and then
> > Robert's suggested rewordings.
> 
Thanks.  There is clearly a lot of useful information in this.
> 
> Some comments: [...]

I thought some more about the patch, and I don't like the title
"Transaction Management" for the new chapter.  I'd expect some more
from a chapter titled "Internals" / "Transaction Management".

In reality, the new chapter is about transaction IDs.  So perhaps the
name should reflect that, so that it does not mislead the reader.

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-09 Thread Laurenz Albe
On Mon, 2022-11-07 at 22:43 -0500, Bruce Momjian wrote:
> > I thought some more about the patch, and I don't like the title
> > "Transaction Management" for the new chapter.  I'd expect some more
> > from a chapter titled "Internals" / "Transaction Management".
> > 
> > In reality, the new chapter is about transaction IDs.  So perhaps the
> > name should reflect that, so that it does not mislead the reader.
> 
> I renamed it to "Transaction Processing" since we also cover locking and
> subtransactions.  How is that?

It is better.  Did you take my suggestions from [1] into account in your
latest cumulative patch in [2]?  Otherwise, it will be difficult to
integrate both.

Yours,
Laurenz Albe

 [1]: 
https://postgr.es/m/3603e6e85544daa5300c7106c31bc52673711cd0.camel%40cybertec.at
 [2]: https://postgr.es/m/Y2nP04/3BHQOviVB%40momjian.us




Re: New docs chapter on Transaction Management and related changes

2022-11-09 Thread Laurenz Albe
On Wed, 2022-11-09 at 09:16 -0500, Robert Treat wrote:
> On Mon, Nov 7, 2022 at 5:04 PM Laurenz Albe  wrote:
> > Some comments:
> > 
> 
> > > --- a/doc/src/sgml/ref/release_savepoint.sgml
> > > +++ b/doc/src/sgml/ref/release_savepoint.sgml
> > > @@ -34,23 +34,16 @@ RELEASE [ SAVEPOINT ] 
> > > savepoint_name
> > >    Description
> > > 
> > >    
> > > -   RELEASE SAVEPOINT destroys a savepoint previously 
> > > defined
> > > -   in the current transaction.
> > > +   RELEASE SAVEPOINT will subcommit the subtransaction
> > > +   established by the named savepoint, if one exists. This will release
> > > +   any resources held by the subtransaction. If there were any
> > > +   subtransactions of the named savepoint, these will also be 
> > > subcommitted.
> > >    
> > > 
> > >    
> > 
> > "Subtransactions of the named savepoint" is somewhat confusing; how about
> > "subtransactions of the subtransaction established by the named savepoint"?
> > 
> > If that is too long and explicit, perhaps "subtransactions of that 
> > subtransaction".
> 
> Personally, I think these are more confusing.

Perhaps.  I was worried because everywhere else, the wording makes a clear
distinction between a savepoint and the subtransaction created by a savepoint.
But perhaps some sloppiness is acceptable if it avoids such word cascades.

> > > --- a/doc/src/sgml/ref/rollback.sgml
> > > +++ b/doc/src/sgml/ref/rollback.sgml
> > > @@ -56,11 +56,14 @@ ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
> > >  AND CHAIN
> > >  
> > >   
> > > -  If AND CHAIN is specified, a new transaction is
> > > +  If AND CHAIN is specified, a new unaborted transaction is
> > >    immediately started with the same transaction characteristics (see
> > >    <xref linkend="sql-set-transaction"/>) as the just finished one.
> > >    Otherwise, no new transaction is started.
> > 
> > I don't think that is an improvement.  "Unaborted" is an un-word.
> > A new transaction is always "unaborted", isn't it?
> 
> I thought about this as well when reviewing it, but I do think
> something is needed for the case where you have a transaction which
> has suffered an error and then you issue "rollback and chain"; if you
> just say "a new transaction is immediately started with the same
> transaction characteristics" it might imply to some the new
> transaction has some kind of carry over of the previous broken
> transaction... the use of the word unaborted makes it clear that the
> new transaction is 100% functional.

A new transaction is never aborted in my understanding.  Being aborted is not a
characteristic of a transaction, but a state.

> > 
> 
> 
> > > +    The internal transaction ID type xid is 32-bits wide
> > 
> > There should be no hyphen in "32 bits wide", just as in "3 years old".
> 
> Minor aside, we should clean up glossary.sgml as well.

Right, it has this:

 The numerical, unique, sequentially-assigned identifier that each
 transaction receives when it first causes a database modification.
 Frequently abbreviated as xid.
 When stored on disk, xids are only 32-bits wide, so only
 approximately four billion write transaction IDs can be generated;
 to permit the system to run for longer than that,
 epochs are used, also 32 bits wide.

Which reminds me that I should have suggested  rather than
 where I complained about the use of .

> > 
> > +   Up to
> > +    64 open subxids are cached in shared memory for each backend; after
> > +    that point, the overhead increases significantly since we must look
> > +    up subxid entries in pg_subtrans.
> > 
> > Comma before "since".  Perhaps you should mention that this means disk I/O.
> 
> ISTR that you only use a comma before since in cases where the
> preceding thought contains a negative.

Not being a native speaker, I'll leave that to those who are; I went by feeling.

> In any case, are you thinking something like this:
> 
> " 64 open subxids are cached in shared memory for each backend; after
>  that point the overhead increases significantly due to additional disk I/O
>  from looking up subxid entries in pg_subtrans."

Yes.

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-13 Thread Laurenz Albe
On Thu, 2022-11-10 at 12:17 +0100, Alvaro Herrera wrote:
> On 2022-Nov-10, Laurenz Albe wrote:
> > On Wed, 2022-11-09 at 09:16 -0500, Robert Treat wrote:
> > > > > -  If AND CHAIN is specified, a new transaction is
> > > > > +  If AND CHAIN is specified, a new unaborted transaction is
> > > > >    immediately started with the same transaction characteristics
> > > > >    (see <xref linkend="sql-set-transaction"/>) as the just finished
> > > > >    one.  Otherwise, no new transaction is started.
> > 
> > A new transaction is never aborted in my understanding.  Being aborted
> > is not a characteristic of a transaction, but a state.
> 
> I agree, but maybe it's good to make the point explicit, because it
> doesn't seem obvious.  Perhaps something like
> 
> "If X is specified, a new transaction (never in aborted state) is
> immediately started with the same transaction characteristics (see X) as
> the just finished one.  Otherwise ..."
> 
> Getting the wording of that parenthical comment right is tricky, though.
> What I propose above is not great, but I don't know how to make it
> better.  Other ideas that seem slightly worse but may inspire someone:
> 
>   ... a new transaction (which is never in aborted state) is ...
>   ... a new transaction (not in aborted state) is ...
>   ... a new transaction (never aborted, even if the previous is) is ...
>   ... a new (not-aborted) transaction is ...
>   ... a new (never aborted) transaction is ...
>   ... a new (never aborted, even if the previous is) transaction is ...
>   ... a new (never aborted, regardless of the status of the previous one) 
> transaction is ...
> 
> 
> Maybe there's a way to reword the entire phrase that leads to a better
> formulation of the idea.

Any of your suggestions is better than "unaborted".

How about:

  If AND CHAIN is specified, a new transaction is
  immediately started with the same transaction characteristics
  (see <xref linkend="sql-set-transaction"/>) as the just finished one.
  This new transaction won't be in the aborted state, even
  if the old transaction was aborted.

Yours,
Laurenz Albe




Re: Reducing power consumption on idle servers

2022-11-20 Thread Laurenz Albe
On Mon, 2022-11-21 at 10:13 +1300, Thomas Munro wrote:
> I'll wait 24 hours before committing, to
> provide a last chance for anyone who wants to complain about dropping
> promote_trigger_file.

Remove "promote_trigger_file"?  Now I have never seen anybody use that
parameter, but I don't think that it is a good idea to deviate from our
usual standard of deprecating a feature for about five years before
actually removing it.

Yours,
Laurenz Albe




Re: Reducing power consumption on idle servers

2022-11-21 Thread Laurenz Albe
On Mon, 2022-11-21 at 11:42 +0530, Bharath Rupireddy wrote:
> On Mon, Nov 21, 2022 at 10:37 AM Laurenz Albe  
> wrote:
> > 
> > On Mon, 2022-11-21 at 10:13 +1300, Thomas Munro wrote:
> > > I'll wait 24 hours before committing, to
> > > provide a last chance for anyone who wants to complain about dropping
> > > promote_trigger_file.
> > 
> > Remove "promote_trigger_file"?  Now I have never seen anybody use that
> > parameter, but I don't think that it is a good idea to deviate from our
> > usual standard of deprecating a feature for about five years before
> > actually removing it.
> 
> I'm not sure what the guidelines are here, however years have gone by
> since pg_ctl promote [1] in 2011 and pg_promote() [2] in 2018 were
> added. With two such alternatives in place for many years, it was sort
> of an undeclared deprecation of promote_trigger_file GUC. And the
> changes required to move to newer ways from the GUC aren't that hard
> for those who're still relying on the GUC. Therefore, I think it's now
> time for us to do away with the GUC.

I disagree.  With the same argument, you could rip out "serial", since we
have supported identity columns since v11.

I understand the desire to avoid needless wakeups, but isn't it possible
to only perform the regular poll if "promote_trigger_file" is set?
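
For reference, the newer mechanism the thread points to, as a sketch
(pg_promote() exists since version 12; the parameters shown are its
defaults):

    -- on the standby, instead of creating a trigger file:
    SELECT pg_promote(wait => true, wait_seconds => 60);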

Yours,
Laurenz Albe




Re: Reducing power consumption on idle servers

2022-11-21 Thread Laurenz Albe
On Mon, 2022-11-21 at 07:36 +, Simon Riggs wrote:
> On Mon, 21 Nov 2022 at 05:07, Laurenz Albe  wrote:
> > 
> > On Mon, 2022-11-21 at 10:13 +1300, Thomas Munro wrote:
> > > I'll wait 24 hours before committing, to
> > > provide a last chance for anyone who wants to complain about dropping
> > > promote_trigger_file.
> > 
> > Remove "promote_trigger_file"?  Now I have never seen anybody use that
> > parameter, but I don't think that it is a good idea to deviate from our
> > usual standard of deprecating a feature for about five years before
> > actually removing it.
> 
> We aren't removing the ability to promote, just enforcing a change to
> a better mechanism, hence I don't see a reason for a long(er)
> deprecation period than we have already had.

We have had a deprecation period?  I looked at the documentation, but found
no mention of a deprecation.  How hard can it be to leave the GUC and only
poll for the existence of the file if it is set?

I personally don't need the GUC, and I know nobody who does, but I think
we should not be cavalier about introducing unnecessary compatibility breaks.

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-21 Thread Laurenz Albe
On Fri, 2022-11-18 at 14:28 -0500, Bruce Momjian wrote:
> New patch attached.

Thanks.

> --- a/doc/src/sgml/ref/release_savepoint.sgml
> +++ b/doc/src/sgml/ref/release_savepoint.sgml

> +   RELEASE SAVEPOINT releases the named savepoint and
> +   all active savepoints that were created after the named savepoint,
> +   and frees their resources.  All changes made since the creation of the
> +   savepoint, excluding rolled back savepoints changes, are merged into
> +   the transaction or savepoint that was active when the named savepoint
> +   was created.  Changes made after RELEASE SAVEPOINT
> +   will also be part of this active transaction or savepoint.

I am not sure if "rolled back savepoints changes" is clear enough.
I understand that you are trying to avoid the term "subtransaction".
How about:

  All changes made since the creation of the savepoint that didn't already
  get rolled back are merged ...
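
A sketch of the semantics being described (table name invented):

    CREATE TEMP TABLE demo (t text);

    BEGIN;
    INSERT INTO demo VALUES ('kept: before savepoint');
    SAVEPOINT a;
    INSERT INTO demo VALUES ('discarded');
    ROLLBACK TO SAVEPOINT a;  -- this change is rolled back, savepoint a remains
    INSERT INTO demo VALUES ('kept: after rollback');
    RELEASE SAVEPOINT a;      -- merges the changes that were not rolled back
    COMMIT;

    SELECT * FROM demo;       -- only the two 'kept' rows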

> --- a/doc/src/sgml/ref/rollback.sgml
> +++ b/doc/src/sgml/ref/rollback.sgml
>
> +  If AND CHAIN is specified, a new (unaborted)

*Sigh* I'll make one last plea for "not aborted".

> --- /dev/null
> +++ b/doc/src/sgml/xact.sgml

> +  
> +   Transactions can be created explicitly using BEGIN
> +   and COMMIT, which creates a transaction block.
> +   An SQL statement outside of a transaction block automatically uses
> +   a single-statement transaction.
> +  

Sorry, I should have noted that the first time around.

Transactions are not created using COMMIT.

Perhaps we should also avoid the term "transaction block".  Even without
speaking of a "block", way too many people confuse PL/pgSQL's BEGIN ... END blocks
with transactions.  On the other hand, we use "transaction block" everywhere
else in the documentation...

How about:

  
   Multi-statement transactions can be created explicitly using
   BEGIN or START TRANSACTION and
   are ended using COMMIT or ROLLBACK.
   An SQL statement outside of a transaction block automatically uses
   a single-statement transaction.
  

> + 
> +
> +  Transactions and Locking
> +
> +  
> +   The transaction IDs of currently executing transactions are shown in
> +   pg_locks
> +   in columns virtualxid and
> +   transactionid.  Read-only transactions
> +   will have virtualxids but NULL
> +   transactionids, while read-write transactions
> +   will have both as non-NULL.
> +  

Perhaps the following will be prettier than "have both as non-NULL":

  ..., while both columns will be set in read-write transactions.
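
To see the behavior that sentence describes, a self-contained sketch:

    BEGIN;
    -- read-only so far: a virtualxid, but no transactionid yet
    SELECT locktype, virtualxid, transactionid
    FROM pg_locks
    WHERE pid = pg_backend_pid()
      AND locktype IN ('virtualxid', 'transactionid');

    CREATE TEMP TABLE t (i int);  -- the first write assigns a real xid

    SELECT locktype, virtualxid, transactionid
    FROM pg_locks
    WHERE pid = pg_backend_pid()
      AND locktype IN ('virtualxid', 'transactionid');
    COMMIT;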

Yours,
Laurenz Albe




Re: Reducing power consumption on idle servers

2022-11-21 Thread Laurenz Albe
On Mon, 2022-11-21 at 12:11 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > The reason that I pushed back -- not as successfully as I would have
> > liked -- on the changes to pg_stop_backup / pg_start_backup is that I
> > know there are people using the old method successfully, and it's not
> > just a 1:1 substitution. Here I don't, and it is. I'm totally open to
> > the feedback that such people exist and to hearing why adopting one of
> > the newer methods would be a problem for them, if that's the case. But
> > if there's no evidence that such people exist or that changing is a
> > problem for them, I don't think waiting 5 years on principle is good
> > for the project.
> 
> We make incompatible changes in every release; see the release notes.
> Unless somebody can give a plausible use-case where this'd be a
> difficult change to deal with, I concur that we don't need to
> deprecate it ahead of time.

Since I am the only one that seems to worry, I'll shut up.  You are probably
right that the feature won't be missed by many users.

Yours,
Laurenz Albe




Re: New docs chapter on Transaction Management and related changes

2022-11-22 Thread Laurenz Albe
On Tue, 2022-11-22 at 13:50 -0500, Bruce Momjian wrote:
> Agreed, updated patch attached.

I cannot find any more problems, and I shouldn't mention the extra empty
line at the end of the patch.

I'd change the commitfest status to "ready for committer" now if it were
not already in that status.

Yours,
Laurenz Albe




Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2022-11-26 Thread Laurenz Albe
On Fri, 2022-11-25 at 14:47 -0800, Peter Geoghegan wrote:
> Attached WIP patch invents the idea of a regular autovacuum that is
> tasked with advancing relfrozenxid -- which is really just another
> trigger criteria, reported on in the server log in its autovacuum
> reports. Of course we retain the idea of antiwraparound autovacuums.
> The only difference is that they are triggered when table age has
> advanced by twice the usual amount, which is presumably only possible
> because a regular autovacuum couldn't start or couldn't complete in
> time (most likely due to continually being auto-cancelled).
> 
> As I said before, I think that the most important thing is to give
> regular autovacuuming a chance to succeed. The exact approach taken
> has a relatively large amount of slack, but that probably isn't
> needed. So the new antiwraparound cutoffs were chosen because they're
> easy to understand and remember, which is fairly arbitrary.

The target is a table that receives no DML at all, right?
I think that is a good idea.
Wouldn't it make sense to trigger that at *half* "autovacuum_freeze_max_age"?
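
For context, the table age that would drive this trigger can be watched
with a standard catalog query — a sketch:

    SELECT c.relname,
           age(c.relfrozenxid) AS xid_age,
           current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
    FROM pg_class AS c
    WHERE c.relkind IN ('r', 'm', 't')
    ORDER BY age(c.relfrozenxid) DESC
    LIMIT 10;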

Yours,
Laurenz Albe




Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2022-11-27 Thread Laurenz Albe
On Sat, 2022-11-26 at 11:00 -0800, Peter Geoghegan wrote:

> > I think that is a good idea.
> > Wouldn't it make sense to trigger that at *half* 
> > "autovacuum_freeze_max_age"?
> 
> That would be equivalent to what I've done here, provided you also
> double the autovacuum_freeze_max_age setting.

That's exactly what I was trying to debate.  Wouldn't it make sense to
trigger VACUUM earlier so that it has a chance of being less heavy?
On the other hand, if there are not sufficiently many modifications
on the table to trigger autovacuum, perhaps it doesn't matter in many
cases.

> I did it this way
> because I believe that it has fewer problems. The approach I took
> makes the general perception that antiwraparound autovacuum are a
> scary thing (really just needed for emergencies) become true, for the
> first time.
> 
> We should expect to see very few antiwraparound autovacuums with the
> patch, but when we do see even one it'll be after a less aggressive
> approach was given the opportunity to succeed, but (for whatever
> reason) failed.

Is that really so much less aggressive?  Will that autovacuum run want
to process all pages that are not all-frozen?  If not, it probably won't
do much good.  If yes, it will be just as heavy as an anti-wraparound
autovacuum (except that it won't block other sessions).

> Just seeing any antiwraparound autovacuums will become
> a useful signal of something being amiss in a way that it just isn't
> at the moment. The docs can be updated to talk about antiwraparound
> autovacuum as a thing that you should encounter approximately never.
> This is possible even though the patch isn't invasive at all.

True.  On the other hand, it might happen that after this, people start
worrying about normal autovacuum runs because they occasionally experience
a table age autovacuum that is much heavier than the other ones.  And
they can no longer tell the reason, because it doesn't show up anywhere.

Yours,
Laurenz Albe




Re: Patch: Global Unique Index

2022-11-29 Thread Laurenz Albe
On Tue, 2022-11-29 at 13:58 +0100, Vik Fearing wrote:
> I disagree.  A user does not need to know that a table is partitionned, 
> and if the user wants a unique constraint on the table then making them 
> type an extra word to get it is just annoying.

Hmm.  But if I created a primary key without thinking too hard about it,
only to discover later that dropping old partitions has become a problem,
I would not be too happy either.
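
For context, a sketch of what works today without the patch (names
invented):

    CREATE TABLE meas (id bigint, logdate date) PARTITION BY RANGE (logdate);

    -- fails today: a unique index on a partitioned table must
    -- include all partition key columns
    CREATE UNIQUE INDEX ON meas (id);

    -- works: uniqueness is enforced per partition
    CREATE UNIQUE INDEX ON meas (id, logdate);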

Yours,
Laurenz Albe




Re: Patch: Global Unique Index

2022-11-30 Thread Laurenz Albe
On Wed, 2022-11-30 at 10:09 +0100, Vik Fearing wrote:
> On 11/29/22 17:29, Laurenz Albe wrote:
> > On Tue, 2022-11-29 at 13:58 +0100, Vik Fearing wrote:
> > > I disagree.  A user does not need to know that a table is partitionned,
> > > and if the user wants a unique constraint on the table then making them
> > > type an extra word to get it is just annoying.
> > 
> > Hmm.  But if I created a primary key without thinking too hard about it,
> > only to discover later that dropping old partitions has become a problem,
> > I would not be too happy either.
> 
> I have not looked at this patch, but my understanding of its design is 
> the "global" part of the index just makes sure to check a unique index 
> on each partition.  I don't see from that how dropping old partitions 
> would be a problem.

Right, I should have looked closer.  But, according to the parallel discussion,
ATTACH PARTITION might be a problem.  A global index is likely to be a footgun
one way or the other, so I think it should at least have a safety on
(CREATE PARTITIONED GLOBAL INDEX or something).

Yours,
Laurenz Albe



