Re: GIN pending list cleanup during autoanalyze blocks cleanup by VACUUM
On Thu, Oct 14, 2021 at 12:50 AM Masahiko Sawada wrote:
> Looking at the original commit, as you mentioned, ISTM performing
> pending list cleanup during (auto)analyze (and analyze_only) was
> introduced to perform the pending list cleanup on insert-only tables.
> Now that we have autovacuum_vacuum_insert_threshold, we don't
> necessarily need to rely on that.

Right.

> On the other hand, I still see a little value in performing the
> pending list cleanup during autoanalyze. For example, if the user
> wants to clean up the pending list frequently in the background (so
> that it's not triggered in the INSERT path), it might be better to do
> that during autoanalyze than autovacuum. If the table has garbage,
> autovacuum has to vacuum all indexes and the table, taking a very long
> time. But autoanalyze can be done in a shorter time. If we trigger
> autoanalyze frequently and perform pending list cleanup, the pending
> list cleanup can also be done in a relatively short time, preventing
> MVCC snapshots from being held for a long time.

I agree that that's true -- there is at least a little value. But
that's just an accident of history. Today, ginvacuumcleanup() won't
need to scan the whole index in the autoanalyze path that I'm concerned
about -- it will just do pending list insertion.

This does mean that the autoanalyze path taken within
ginvacuumcleanup() should be a lot faster than a similar cleanup-only
call to ginvacuumcleanup(). But is there actually a good reason for
that? Why should a cleanup-only VACUUM ANALYZE (i.e. a VACUUM ANALYZE
where the VACUUM does not see any heap-page LP_DEAD items) be so much
slower than a similar ANALYZE against the same table, under the same
conditions? I see no good reason.

Ideally, ginvacuumcleanup() would behave like btvacuumcleanup() and
hashvacuumcleanup(). That is, it should not unnecessarily scan the
index (even when used by VACUUM).
In other words, it should have something like the "Skip full index
scan" mechanism that you added to nbtree in commit 857f9c36. That way
it just wouldn't make sense to have this autoanalyze path anymore -- it
would no longer have this accidental advantage over a regular
ginvacuumcleanup() call made from VACUUM.

More generally, I think it's a big problem that ginvacuumcleanup() has
so many weird special cases. Why does the full_clean argument to
ginInsertCleanup() even exist? It makes the behavior inside
ginInsertCleanup() vary based on whether we're in autovacuum (or
autoanalyze) for no reason at all. I think that the true reason for
this is simple: the code in ginInsertCleanup() is *bad*. full_clean was
just forgotten about by one of its many bug fixes since the code
quality started to go down. Somebody (who shall remain nameless) was
just careless when maintaining that code.

VACUUM should be in charge of index AMs -- not the other way around.
It's good that the amvacuumcleanup() interface is so flexible, but I
think that GIN is over-relying on this flexibility. Ideally, VACUUM
wouldn't have to think about the specific index AMs involved at all --
why should GIN be so different to GiST, nbtree, or hash? If GIN (or any
other index AM) behaves like a special little snowflake, with its own
unique performance characteristics, then it is harder to improve
certain important things inside VACUUM. For example, the conveyor belt
index vacuuming design from Robert Haas won't work as well as it could.

> Therefore, I personally think that it's better to eliminate
> analyze_only code after introducing a way that allows us to perform
> the pending list cleanup more frequently. I think that the idea of
> Jaime Casanova's patch is a good solution.

I now think that it would be better to fix ginvacuumcleanup() in the
way that I described (I changed my mind). Jaime's patch does seem like
a good idea to me, but not for this.
It makes sense to have that, for the reasons that Jaime said himself.

> I'm not very positive about back-patching. The first reason is what I
> described above; I still see little value in performing pending list
> cleanup during autoanalyze. Another reason is that if the user relies
> on autoanalyze to perform pending list cleanup, they have to enable
> autovacuum_vacuum_insert_threshold instead during the minor upgrade.
> Since it also means to trigger autovacuum in more cases I think it
> will have a profound impact on the existing system.

I have to admit that when I wrote my earlier email, I was still a
little shocked by the problem -- which is not a good state of mind when
making a decision about backpatching. But I now think that I
underappreciated the risk of making the problem worse in the
backbranches, rather than better. I won't be backpatching anything
here.

The problem report from Nikolay was probably an unusually bad case,
where the pending list cleanup/insertion was particularly expensive.
This *is* really awful behavior, but that alone isn't a good enough
reason to backpatch.

--
Peter Geoghegan
Re: [Bug] Logical Replication failing if the DateStyle is different in Publisher & Subscriber
On Sat, 16 Oct 2021 at 22:42, Japin Li wrote:
> On Thu, 14 Oct 2021 at 19:49, Dilip Kumar wrote:
>> On Thu, Oct 14, 2021 at 3:48 PM Sadhuprasad Patro wrote:
>>>
>>> Hi All,
>>>
>>> Publisher 'DateStyle' is set as "SQL, MDY", whereas in Subscriber as
>>> "SQL, DMY", the logical replication is not working...
>>>
>>> From Publisher:
>>> postgres=# INSERT INTO calendar VALUES ('07-18-1036', '1'), ('05-15-1135', '1');
>>> INSERT 0 2
>>>
>>> Getting below error in the subscriber log file,
>>> 2021-10-14 00:59:23.067 PDT [38262] ERROR:  date/time field value out
>>> of range: "07/18/1036"
>>> 2021-10-14 00:59:23.067 PDT [38262] HINT:  Perhaps you need a
>>> different "datestyle" setting.
>>>
>>> Is this an expected behavior?
>>
>> Looks like a problem to me. I think for fixing this, on a logical
>> replication connection we should always set the subscriber's DateStyle;
>> with that the walsender will always send the data in the same DateStyle
>> that the worker understands, and then we are good.
>
> Right! Attached fix it.

Add a test case in subscription/t/100_bugs.pl. Please consider the v2
patch for review.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 5c6e56a5b2..81cf9e30d7 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -30,6 +30,7 @@
 #include "pqexpbuffer.h"
 #include "replication/walreceiver.h"
 #include "utils/builtins.h"
+#include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/pg_lsn.h"
 #include "utils/tuplestore.h"
@@ -218,6 +219,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 	if (logical)
 	{
 		PGresult   *res;
+		const char *datestyle;
 
 		res = libpqrcv_PQexec(conn->streamConn,
 							  ALWAYS_SECURE_SEARCH_PATH_SQL);
@@ -229,6 +231,23 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 					 pchomp(PQerrorMessage(conn->streamConn)))));
 		}
 		PQclear(res);
+
+		datestyle = GetConfigOption("datestyle", true, true);
+		if (datestyle != NULL)
+		{
+			char	   *sql;
+
+			sql = psprintf("SELECT pg_catalog.set_config('datestyle', '%s', false);", datestyle);
+			res = libpqrcv_PQexec(conn->streamConn, sql);
+			if (PQresultStatus(res) != PGRES_TUPLES_OK)
+			{
+				PQclear(res);
+				ereport(ERROR,
+						(errmsg("could not set datestyle: %s",
+								pchomp(PQerrorMessage(conn->streamConn)))));
+			}
+			PQclear(res);
+			pfree(sql);
+		}
 	}
 
 	conn->logical = logical;
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index baa4a90771..a88a61df41 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 5;
+use Test::More tests => 6;
 
 # Bug #15114
@@ -224,3 +224,44 @@ $node_sub->safe_psql('postgres', "DROP TABLE tab1");
 $node_pub->stop('fast');
 $node_pub_sub->stop('fast');
 $node_sub->stop('fast');
+
+# Verify different datestyle between publisher and subscriber.
+$node_publisher = PostgresNode->new('datestyle_publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+	"datestyle = 'SQL, MDY'");
+$node_publisher->start;
+
+$node_subscriber = PostgresNode->new('datestyle_subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->append_conf('postgresql.conf',
+	"datestyle = 'SQL, DMY'");
+$node_subscriber->start;
+
+$node_publisher->safe_psql('postgres',
+	"CREATE TABLE tab_rep(a date)");
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab_rep VALUES ('07-18-2021'), ('05-15-2018')");
+
+$node_subscriber->safe_psql('postgres',
+	"CREATE TABLE tab_rep(a date)");
+
+# Setup logical replication
+my $node_publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tab_pub FOR ALL TABLES");
+$node_subscriber->safe_psql('postgres',
+	"CREATE SUBSCRIPTION tab_sub CONNECTION '$node_publisher_connstr' PUBLICATION tab_pub");
+
+$node_publisher->wait_for_catchup('tab_sub');
+
+my $result = $node_subscriber->safe_psql('postgres',
+	"SELECT count(*) FROM tab_rep");
+is($result, qq(2), 'failed to replication date from different datestyle');
+
+# Clean up the tables on both publisher and subscriber as we don't need them
+$node_publisher->safe_psql('postgres', 'DROP TABLE tab_rep');
+$node_subscriber->safe_psql('postgres', 'DROP TABLE tab_rep');
+
+$node_publisher->stop('fast');
+$node_subscriber->stop('fast');
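The failure mode itself is easy to reproduce outside of Postgres: the same MM/DD vs. DD/MM text is valid under one datestyle and out of range under the other. A minimal sketch in Python (the strptime formats are chosen here merely to mirror "SQL, MDY" and "SQL, DMY"; the error raised is Python's, not the server's):

```python
from datetime import datetime

# The publisher emitted this literal while running under DateStyle "SQL, MDY".
value = "07/18/1036"

# Publisher-style (MDY) parse succeeds: July 18, year 1036.
mdy = datetime.strptime(value, "%m/%d/%Y")

# Subscriber-style (DMY) parse fails: there is no month 18.  This is the
# same class of failure the apply worker reports as
# "date/time field value out of range".
try:
    dmy = datetime.strptime(value, "%d/%m/%Y")
except ValueError:
    dmy = None

print(mdy.month, mdy.day, dmy)  # 7 18 None
```

This is why the proposed fix pins the walsender's datestyle to the subscriber's value: the text sent over the wire is then guaranteed to parse the same way on the receiving side.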
Re: Reset snapshot export state on the transaction abort
On Sat, Oct 16, 2021 at 08:31:36AM -0700, Zhihong Yu wrote:
> On Sat, Oct 16, 2021 at 3:10 AM Dilip Kumar wrote:
>> On Sat, Oct 16, 2021 at 9:13 AM Michael Paquier wrote:
>>> One solution would be as simple as saving
>>> SavedResourceOwnerDuringExport into a temporary variable before
>>> calling AbortCurrentTransaction(), and save it back into
>>> CurrentResourceOwner once we are done in
>>> SnapBuildClearExportedSnapshot() as we need to rely on
>>> AbortTransaction() to do the static state cleanup if an error happens
>>> until the command after the replslot creation command shows up.
>>
>> Yeah, this idea looks fine to me. I have modified the patch. In
>> addition to that I have removed calling
>> ResetSnapBuildExportSnapshotState from
>> SnapBuildClearExportedSnapshot because that is anyway being called
>> from AbortTransaction.

That seems logically fine. I'll check that tomorrow.

> +extern void ResetSnapBuildExportSnapshotState(void);
>
> ResetSnapBuildExportSnapshotState() is only called inside snapbuild.c.
> I wonder if the addition to snapbuild.h is needed.

As of xact.c in v2 of the patch, we have that:

@@ -2698,6 +2699,9 @@ AbortTransaction(void)
 	/* Reset logical streaming state. */
 	ResetLogicalStreamingState();
 
+	/* Reset snapshot export state. */
+	ResetSnapBuildExportSnapshotState();

--
Michael
Refactoring pg_dump's getTables()
Attached is a proposed patch that refactors getTables() along the same
lines as some previous work (e.g. 047329624, ed2c7f65b, daa9fe8a5) to
avoid having multiple partially-redundant copies of the SQL query.
This gets rid of nearly 300 lines of duplicative spaghetti code,
creates a uniform style for dealing with cross-version changes
(replacing at least three different methods currently being used for
that in this same stretch of code), and allows moving some comments to
be closer to the code they describe.

There's a lot I still want to change here, but this part seems like it
should be fairly uncontroversial. I've tested it against servers back
to 8.0 (which is what led me to trip over the bug fixed in 40dfac4fc).
Any objections to just pushing it?

			regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6ec524f8e6..e35ff6e7fb 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -6274,65 +6274,152 @@ getTables(Archive *fout, int *numTables)
 	 * We include system catalogs, so that we can work if a user table is
 	 * defined to inherit from a system catalog (pretty weird, but...)
 	 *
-	 * We ignore relations that are not ordinary tables, sequences, views,
-	 * materialized views, composite types, or foreign tables.
-	 *
-	 * Composite-type table entries won't be dumped as such, but we have to
-	 * make a DumpableObject for them so that we can track dependencies of the
-	 * composite type (pg_depend entries for columns of the composite type
-	 * link to the pg_class entry not the pg_type entry).
-	 *
 	 * Note: in this phase we should collect only a minimal amount of
 	 * information about each table, basically just enough to decide if it is
 	 * interesting. We must fetch all tables in this phase because otherwise
 	 * we cannot correctly identify inherited columns, owned sequences, etc.
-	 *
-	 * We purposefully ignore toast OIDs for partitioned tables; the reason is
-	 * that versions 10 and 11 have them, but 12 does not, so emitting them
-	 * causes the upgrade to fail.
 	 */
+	appendPQExpBuffer(query,
+					  "SELECT c.tableoid, c.oid, c.relname, "
+					  "c.relnamespace, c.relkind, "
+					  "(%s c.relowner) AS rolname, "
+					  "c.relchecks, "
+					  "c.relhasindex, c.relhasrules, c.relpages, "
+					  "d.refobjid AS owning_tab, "
+					  "d.refobjsubid AS owning_col, "
+					  "tsp.spcname AS reltablespace, ",
+					  username_subquery);
+
+	if (fout->remoteVersion >= 80400)
+		appendPQExpBufferStr(query,
+							 "c.relhastriggers, ");
+	else
+		appendPQExpBufferStr(query,
+							 "(c.reltriggers <> 0) AS relhastriggers, ");
+
+	if (fout->remoteVersion >= 90500)
+		appendPQExpBufferStr(query,
+							 "c.relrowsecurity, c.relforcerowsecurity, ");
+	else
+		appendPQExpBufferStr(query,
+							 "false AS relrowsecurity, "
+							 "false AS relforcerowsecurity, ");
+
+	if (fout->remoteVersion >= 120000)
+		appendPQExpBufferStr(query,
+							 "false AS relhasoids, ");
+	else
+		appendPQExpBufferStr(query,
+							 "c.relhasoids, ");
+
+	if (fout->remoteVersion >= 80200)
+		appendPQExpBufferStr(query,
+							 "c.relfrozenxid, tc.relfrozenxid AS tfrozenxid, ");
+	else
+		appendPQExpBufferStr(query,
+							 "0 AS relfrozenxid, 0 AS tfrozenxid, ");
+
+	if (fout->remoteVersion >= 90300)
+		appendPQExpBufferStr(query,
+							 "c.relminmxid, tc.relminmxid AS tminmxid, ");
+	else
+		appendPQExpBufferStr(query,
+							 "0 AS relminmxid, 0 AS tminmxid, ");
+
+	if (fout->remoteVersion >= 80200)
+		appendPQExpBufferStr(query,
+							 "tc.oid AS toid, ");
+	else
+		appendPQExpBufferStr(query,
+							 "0 AS toid, ");
+
+	if (fout->remoteVersion >= 90100)
+		appendPQExpBufferStr(query,
+							 "c.relpersistence, ");
+	else
+		appendPQExpBufferStr(query,
+							 "'p' AS relpersistence, ");
+
+	if (fout->remoteVersion >= 90300)
+		appendPQExpBufferStr(query,
+							 "c.relispopulated, ");
+	else
+		appendPQExpBufferStr(query,
+							 "'t' as relispopulated, ");
+
+	if (fout->remoteVersion >= 90400)
+		appendPQExpBufferStr(query,
+							 "c.relreplident, ");
+	else
+		appendPQExpBufferStr(query,
+							 "'d' AS relreplident, ");
+
+	if (fout->remoteVersion >= 90100)
+		appendPQExpBufferStr(query,
+							 "CASE WHEN c.relkind = " CppAsString2(RELKIND_FOREIGN_TABLE) " THEN "
+							 "(SELECT ftserver FROM pg_catalog.pg_foreign_table WHERE ftrelid = c.oid) "
+							 "ELSE 0 END AS foreignserver, ");
+	else
+		appendPQExpBufferStr(query,
+							 "0 AS foreignserver, ");
+
+	if (fout->remoteVersion >= 90300)
+		appendPQExpBufferStr(query,
+							 "array_remove(array_remove(c.reloptions,'check_option=local'),'check_option=cascaded') AS reloptions, "
+							 "CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text "
+							 "WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text ELSE NULL END AS checkoption, ");
+	else if (fout->remoteVersion >= 80200)
+		appendPQExpBufferStr(query,
access to record's field in dynamic SQL doesn't work
Hi

I played with dynamic access to a record's fields, and I was surprised
that I cannot access a record field from dynamic SQL. Is there some
reason why it is not possible? Today all composite types in PL/pgSQL
are records:

do $$
declare
  r record;
  _relname varchar;
begin
  for r in select * from pg_class limit 3
  loop
    execute 'select ($1).relname' using r into _relname;
    raise notice '%', _relname;
  end loop;
end;
$$;

ERROR:  could not identify column "relname" in record data type
LINE 1: select ($1).relname
                ^
QUERY:  select ($1).relname
CONTEXT:  PL/pgSQL function inline_code_block line 6 at EXECUTE

but:

do $$
declare
  r record;
  _relname varchar;
begin
  for r in select * from pg_class limit 3
  loop
    --execute 'select ($1).relname' using r into _relname;
    raise notice '%', r.relname;
  end loop;
end;
$$;

NOTICE:  pg_statistic
NOTICE:  pg_type
NOTICE:  pg_toast_1255

and:

postgres=# do $$
declare
  r pg_class;
  _relname varchar;
begin
  for r in select * from pg_class limit 3
  loop
    execute 'select ($1).relname' using r into _relname;
    raise notice '%', _relname;
  end loop;
end;
$$;

NOTICE:  pg_statistic
NOTICE:  pg_type
NOTICE:  pg_toast_1255

That works too. Why is there a difference between a typed composite and
the record type, although internally they should be the same?

Regards

Pavel
Re: Trivial doc patch
On Sat, Oct 16, 2021 at 11:14:46AM +0900, Michael Paquier wrote:
> On Fri, Oct 15, 2021 at 01:13:14PM -0400, rir wrote:
> > This removes the outer square brackets in the create_database.sgml
> > file's synopsis. In the command synopses, this is the only case
> > where an optional group contains only optional groups.
> >
> >  CREATE DATABASE name
> > -[ [ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
> > +[ WITH ] [ OWNER [=] <replaceable class="parameter">user_name</replaceable> ]
> > [...]
> > -       [ IS_TEMPLATE [=] <replaceable class="parameter">istemplate</replaceable> ] ]
> > +       [ IS_TEMPLATE [=] <replaceable class="parameter">istemplate</replaceable> ]
>
> You are not wrong, and the existing doc is not wrong either. I tend
> to prefer the existing style, though, as it insists on the options
> as being a single group, with or without the keyword WITH.

Michael, perhaps I mistake you; it seems you would like it better with
the extra '[' before OWNER. That would more accurately point up

    CREATE DATABASE name WITH;

Either way, my argument would have the same basis. In what sense are
the options a single group? That they all might follow the 'WITH' is
expressed without the duplicated brackets.

That the extra brackets promote readability relies on an assumption or
knowledge of the command. Given that 'optional, optional' has no
independent meaning beyond 'optional', it requires one to scan the
entire set looking for the non-optional embedded in the options. So no
gain.

Rob
Re: XTS cipher mode for cluster file encryption
On Sat, Oct 16, 2021 at 09:15:05AM -0700, Andres Freund wrote:
> Hi,
>
> On 2021-10-16 10:16:25 -0400, Bruce Momjian wrote:
> > As a final comment to Andres's email, adding GCM has the problems
> > above, plus it wouldn't detect changes to pg_xact, fsm, vm, etc.,
> > which could also affect the integrity of the data. Someone could
> > also restore an old copy of a page to revert a change, and that
> > would not be detected even by GCM.
>
> > I consider this a checkbox feature and making it too complex will
> > cause it to be rightly rejected.
>
> You're just deferring / hiding the complexity. For one, we'll need
> integrity before long if we add encryption support. Then we'll deal
> with a more complex on-disk format because there will be two different
> ways of encrypting. For another, you're spreading out the security
> analysis to a lot of places in the code and, more importantly, to
> future changes affecting on-disk data.
>
> If it's really just a checkbox feature without a real use case, then
> we should just reject requests for it and use our energy for useful
> things.

Agreed. That is the conclusion I came to in May:

https://www.postgresql.org/message-id/20210526210201.GZ3048%40momjian.us
https://www.postgresql.org/message-id/20210527160003.GF5646%40momjian.us

--
  Bruce Momjian        https://momjian.us
  EDB                  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.
Re: XTS cipher mode for cluster file encryption
Hi,

On 2021-10-16 10:16:25 -0400, Bruce Momjian wrote:
> As a final comment to Andres's email, adding GCM has the problems
> above, plus it wouldn't detect changes to pg_xact, fsm, vm, etc.,
> which could also affect the integrity of the data. Someone could also
> restore an old copy of a page to revert a change, and that would not
> be detected even by GCM.

> I consider this a checkbox feature and making it too complex will
> cause it to be rightly rejected.

You're just deferring / hiding the complexity. For one, we'll need
integrity before long if we add encryption support. Then we'll deal
with a more complex on-disk format because there will be two different
ways of encrypting. For another, you're spreading out the security
analysis to a lot of places in the code and, more importantly, to
future changes affecting on-disk data.

If it's really just a checkbox feature without a real use case, then we
should just reject requests for it and use our energy for useful
things.

Greetings,

Andres Freund
Re: Reset snapshot export state on the transaction abort
On Sat, Oct 16, 2021 at 3:10 AM Dilip Kumar wrote:
> On Sat, Oct 16, 2021 at 9:13 AM Michael Paquier wrote:
>>
>> While double-checking this stuff, I have noticed something that's
>> wrong in the patch when a command that follows a
>> CREATE_REPLICATION_SLOT query resets SnapBuildClearExportedSnapshot().
>> Once the slot is created, the WAL sender is in a TRANS_INPROGRESS
>> state, meaning that AbortCurrentTransaction() would call
>> AbortTransaction(), hence calling ResetSnapBuildExportSnapshotState()
>> and resetting SavedResourceOwnerDuringExport to NULL before we store a
>> NULL into CurrentResourceOwner :)
>
> Right, good catch!
>
>> One solution would be as simple as saving
>> SavedResourceOwnerDuringExport into a temporary variable before
>> calling AbortCurrentTransaction(), and save it back into
>> CurrentResourceOwner once we are done in
>> SnapBuildClearExportedSnapshot() as we need to rely on
>> AbortTransaction() to do the static state cleanup if an error happens
>> until the command after the replslot creation command shows up.
>
> Yeah, this idea looks fine to me. I have modified the patch. In
> addition to that I have removed calling
> ResetSnapBuildExportSnapshotState from
> SnapBuildClearExportedSnapshot because that is anyway being called
> from AbortTransaction.
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com

Hi,

bq. While exporting a snapshot we set a temporary states which get

"a temporary states" -> "temporary states"

+extern void ResetSnapBuildExportSnapshotState(void);

ResetSnapBuildExportSnapshotState() is only called inside snapbuild.c.
I wonder if the addition to snapbuild.h is needed.

Cheers
Re: [Bug] Logical Replication failing if the DateStyle is different in Publisher & Subscriber
On Thu, 14 Oct 2021 at 19:49, Dilip Kumar wrote:
> On Thu, Oct 14, 2021 at 3:48 PM Sadhuprasad Patro wrote:
>>
>> Hi All,
>>
>> Publisher 'DateStyle' is set as "SQL, MDY", whereas in Subscriber as
>> "SQL, DMY", the logical replication is not working...
>>
>> From Publisher:
>> postgres=# INSERT INTO calendar VALUES ('07-18-1036', '1'), ('05-15-1135', '1');
>> INSERT 0 2
>>
>> Getting below error in the subscriber log file,
>> 2021-10-14 00:59:23.067 PDT [38262] ERROR:  date/time field value out
>> of range: "07/18/1036"
>> 2021-10-14 00:59:23.067 PDT [38262] HINT:  Perhaps you need a
>> different "datestyle" setting.
>>
>> Is this an expected behavior?
>
> Looks like a problem to me. I think for fixing this, on a logical
> replication connection we should always set the subscriber's DateStyle;
> with that the walsender will always send the data in the same DateStyle
> that the worker understands, and then we are good.

Right! Attached fix it.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 5c6e56a5b2..81cf9e30d7 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -30,6 +30,7 @@
 #include "pqexpbuffer.h"
 #include "replication/walreceiver.h"
 #include "utils/builtins.h"
+#include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/pg_lsn.h"
 #include "utils/tuplestore.h"
@@ -218,6 +219,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 	if (logical)
 	{
 		PGresult   *res;
+		const char *datestyle;
 
 		res = libpqrcv_PQexec(conn->streamConn,
 							  ALWAYS_SECURE_SEARCH_PATH_SQL);
@@ -229,6 +231,23 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
 					 pchomp(PQerrorMessage(conn->streamConn)))));
 		}
 		PQclear(res);
+
+		datestyle = GetConfigOption("datestyle", true, true);
+		if (datestyle != NULL)
+		{
+			char	   *sql;
+
+			sql = psprintf("SELECT pg_catalog.set_config('datestyle', '%s', false);", datestyle);
+			res = libpqrcv_PQexec(conn->streamConn, sql);
+			if (PQresultStatus(res) != PGRES_TUPLES_OK)
+			{
+				PQclear(res);
+				ereport(ERROR,
+						(errmsg("could not set datestyle: %s",
+								pchomp(PQerrorMessage(conn->streamConn)))));
+			}
+			PQclear(res);
+			pfree(sql);
+		}
 	}
 
 	conn->logical = logical;
Re: XTS cipher mode for cluster file encryption
On Fri, Oct 15, 2021 at 10:57:03PM +0200, Tomas Vondra wrote:
> > That said, I don't think that's really a huge issue or something
> > that's a show stopper or a reason to hold off on using XTS. Note
> > that what those bits actually *are* isn't leaked, just that they
> > changed in some fashion inside of that 16-byte cipher text block.
> > That they're directly leaked with CTR is why there was concern
> > raised about using that method, as discussed above and previously.
>
> Yeah. With CTR you pretty much learn where the hint bits are exactly,
> while with XTS the whole ciphertext changes.
>
> This also means CTR is much more malleable, i.e. you can tweak the
> ciphertext bits to flip the plaintext, while with XTS that's not really
> possible - it's pretty much guaranteed to break the block structure.
> Not sure if that's an issue for our use case, but if it is then neither
> of the two modes is a solution.

Yes, this is a very good point. Let's look at the impact of _not_ using
the LSN. For CTR (already rejected) bit changes would be visible by
comparing old/new page contents. For CBC (also not under consideration)
the first 16-byte block would show a change, and all later 16-byte
blocks would show a change. For CBC, you see the 16-byte blocks change,
but you have no idea how many bits were changed, and in what locations
in the 16-byte block (AES uses substitution and diffusion).

For XTS, because earlier blocks don't change the IV used by later
blocks as they do in CBC, you would be able to see each 16-byte block
that changed in the 8k page. Again, you would not know the number of
bits changed or their locations.

Do we think knowing which 16-byte blocks on an 8k page change would
leak useful information? If so, we should use the LSN and just accept
that some cases might leak as described above. If we don't care, then
we can skip the use of the LSN and simplify the patch.

> Not sure - it seems a bit weird to force LSN change even in cases that
> don't generate any WAL. I was not following the encryption thread and
> maybe it was discussed/rejected there, but I've always imagined we'd
> have a global nonce generator (similar to a sequence) and we'd store
> it at the end of each block, or something like that.

Storing the nonce in the page means more code complexity, possible
performance impact, and the inability to create standbys via binary
replication that use cluster file encryption.

As a final comment to Andres's email, adding GCM has the problems
above, plus it wouldn't detect changes to pg_xact, fsm, vm, etc., which
could also affect the integrity of the data. Someone could also restore
an old copy of a page to revert a change, and that would not be
detected even by GCM.

I consider this a checkbox feature and making it too complex will cause
it to be rightly rejected.

--
  Bruce Momjian        https://momjian.us
  EDB                  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.
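The block-visibility argument above can be made concrete with a small sketch. This uses the third-party pyca/cryptography package; the keys, tweak, nonce, and the flipped byte offset are arbitrary illustrative values, not anything from the patch. It encrypts the same 8kB "page" twice, with a single hint-bit-style flip between the two versions and a deliberately reused IV/tweak (modeling a page rewritten without an IV change), then compares which ciphertext bytes differ:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt(key, mode, data):
    # One-shot AES encryption of the whole buffer in the given mode.
    enc = Cipher(algorithms.AES(key), mode).encryptor()
    return enc.update(data) + enc.finalize()

def diff_bytes(a, b):
    # Offsets where the two ciphertexts disagree.
    return [i for i in range(len(a)) if a[i] != b[i]]

page_a = os.urandom(8192)
page_b = bytearray(page_a)
page_b[100] ^= 0x01          # flip one "hint bit" in the page image
page_b = bytes(page_b)

# CTR with a reused nonce: the XOR keystream cancels out, so the
# ciphertext difference is exactly the flipped plaintext bit.
key, nonce = os.urandom(32), os.urandom(16)
ctr_diff = diff_bytes(encrypt(key, modes.CTR(nonce), page_a),
                      encrypt(key, modes.CTR(nonce), page_b))

# XTS with a reused tweak: the entire 16-byte block containing the bit
# changes unpredictably, and no block outside it changes.
key2, tweak = os.urandom(64), os.urandom(16)   # AES-256-XTS needs a 512-bit key
xts_diff = diff_bytes(encrypt(key2, modes.XTS(tweak), page_a),
                      encrypt(key2, modes.XTS(tweak), page_b))

print(ctr_diff)                                # [100] -- exact bit position leaks
print(sorted(set(i // 16 for i in xts_diff)))  # [6]   -- only the changed block leaks
```

This mirrors the trade-off described above: without an LSN in the IV, XTS still reveals only *which* 16-byte blocks of the page changed, never how many bits changed or where inside the block, whereas CTR with a repeated IV reveals the exact changed bits.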
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On 10/16/21 7:21 AM, Bharath Rupireddy wrote:
> On Wed, Oct 6, 2021 at 4:31 PM Bharath Rupireddy wrote:
>> On Wed, Oct 6, 2021 at 7:52 AM Michael Paquier wrote:
>>> On Wed, Oct 06, 2021 at 07:19:04AM +0530, Bharath Rupireddy wrote:
>>>> I was thinking of the same. +1 for "vcregress check-world" which is
>>>> more in sync with its peer "make check-world" instead of "vcregress
>>>> all-taptest". I'm not sure whether we can also have "vcregress
>>>> installcheck-world" as well.
>>>
>>> check-world, if it spins new instances for each contrib/ test, would
>>> be infinitely slower than installcheck-world. I recall that's why
>>> Andrew has been doing an installcheck for contribcheck to minimize
>>> the load. If you run the TAP tests, perhaps you don't really care
>>> anyway, but I think that there is a case for making this set of
>>> targets run as fast as we can, if we can, when TAP is disabled.
>>
>> Out of all the regression tests available with the vcregress command
>> today, the tests shown at [1] require an already running postgres
>> instance (much like installcheck). Should we change these for
>> "vcregress checkworld" so that they spin up tmp instances and run? I
>> don't want to go in this direction and change the code a lot.
>>
>> To be simple, we could just have "vcregress installcheckworld" which
>> requires users to spin up an instance so that the tests shown at [1]
>> would run along with other tests [2] that spin up their own instance.
>> The problem with this approach is that a user might set up a
>> different GUC value in the instance that he/she spins up, expecting
>> it to be effective for the tests at [2] as well. I'm not sure if
>> anyone would do that. We can ignore "vcregress checkworld" but
>> mention in the documentation why we don't do this, something like "it
>> makes tests slower as it spins up a lot of temporary pg instances".
>>
>> Another idea, simplest of all, is to just have "vcregress
>> subscriptioncheck" as proposed in the first mail and not have
>> "vcregress checkworld" or "vcregress installcheckworld". With this
>> new option and the existing options of the vcregress tool, it sort of
>> covers all the tests - core, TAP, contrib, bin, isolation, modules,
>> upgrade, recovery etc.
>>
>> Thoughts?
>>
>> [1]
>> vcregress installcheck
>> vcregress plcheck
>> vcregress contribcheck
>> vcregress modulescheck
>> vcregress isolationcheck
>>
>> [2]
>> vcregress check
>> vcregress ecpgcheck
>> vcregress bincheck
>> vcregress recoverycheck
>> vcregress upgradecheck
>> vcregress subscriptioncheck
>
> The problems with having "vcregress checkworld" are: 1) more code
> modifications are required, as the available "vcregress" test
> functions, which require a pre-started pg instance, can't be used
> directly; 2) it looks like spinning up a separate postgres instance
> for all module tests takes time on Windows, which might make the test
> time longer. If we were to have "vcregress installcheckworld", we
> might have to add new code for converting some of the existing
> functions to not use a pre-started pg instance.
>
> IMHO, we can just have "vcregress subscriptioncheck" and let users
> decide which tests they want to run.
>
> I would like to hear more opinions on this.

My opinion hasn't changed. There is already a way to spell this and I'm
opposed to adding more such specific tests to vcregress.pl. Every such
addition we make increases the maintenance burden.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Missing log message in recoveryStopsAfter() for RECOVERY_TARGET_TIME recovery target type
On Thu, Oct 14, 2021 at 7:05 AM Kyotaro Horiguchi wrote:
>
> At Wed, 13 Oct 2021 19:56:17 +0530, Bharath Rupireddy wrote in
> > Hi,
> >
> > I see that recoveryStopsAfter() doesn't have any "recovery stopping after ..." sort of log for the RECOVERY_TARGET_TIME recovery target type. It has similar logs for other recoveryTarget types though. Is there any specific reason for not having it?
> >
> > I see that we have "starting point-in-time recovery to ..." sorts of logs for all the recovery target types, and also recoveryStopsBefore() has a log (by setting stopsHere) for RECOVERY_TARGET_TIME.
>
> So you should have seen the following comment there.
>
> /*
>  * There can be many transactions that share the same commit time, so
>  * we stop after the last one, if we are inclusive, or stop at the
>  * first one if we are exclusive
>  */
>
> Since both inclusive and exclusive cases are processed in recoveryStopsBefore(), recoveryStopsAfter() has nothing to do for the target type.

IIUC, recoveryStopsBefore handles the target type RECOVERY_TARGET_TIME, and recoveryStopsAfter has nothing to do with the target type RECOVERY_TARGET_TIME when the actual recovery ends. Am I correct? If yes, can we have a comment in recoveryStopsBefore or recoveryStopsAfter?

I have another question: do recoveryStopsAfter and recoveryStopsBefore ever do useful work when ArchiveRecoveryRequested is true and recoveryTarget is RECOVERY_TARGET_UNSET? With Assert(recoveryTarget != RECOVERY_TARGET_UNSET); in those two functions, the regression tests fail. May I know what the recovery scenario is (crash recovery, recovery with a specified target, or recovery with an unspecified target) that makes the startup process call recoveryStopsAfter and recoveryStopsBefore when ArchiveRecoveryRequested is true?

Regards,
Bharath Rupireddy.
Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?
On Wed, Oct 6, 2021 at 4:31 PM Bharath Rupireddy wrote:
>
> On Wed, Oct 6, 2021 at 7:52 AM Michael Paquier wrote:
> >
> > On Wed, Oct 06, 2021 at 07:19:04AM +0530, Bharath Rupireddy wrote:
> > > I was thinking of the same. +1 for "vcregress check-world" which is more in sync with its peer "make check-world" instead of "vcregress all-taptest". I'm not sure whether we can also have "vcregress installcheck-world" as well.
> >
> > check-world, if it spins new instances for each contrib/ test, would be infinitely slower than installcheck-world. I recall that's why Andrew has been doing an installcheck for contribcheck to minimize the load. If you run the TAP tests, perhaps you don't really care anyway, but I think that there is a case for making this set of targets run as fast as we can, if we can, when TAP is disabled.
>
> Out of all the regression tests available with the vcregress command today, the tests shown at [1] require an already running postgres instance (much like installcheck). Should we change these for "vcregress checkworld" so that they spin up tmp instances and run? I don't want to go in this direction and change the code a lot.
>
> To be simple, we could just have "vcregress installcheckworld", which requires users to spin up an instance so that the tests shown at [1] would run along with the other tests [2] that spin up their own instances. The problem with this approach is that a user might set up a different GUC value in the instance that he/she spins up, expecting it to be effective for the tests at [2] as well. I'm not sure if anyone would do that. We can ignore "vcregress checkworld" but mention in the documentation why we don't do this, something like "it makes tests slower as it spins up a lot of temporary pg instances".
>
> Another idea, simplest of all, is to just have "vcregress subscriptioncheck" as proposed in the first mail and not have "vcregress checkworld" or "vcregress installcheckworld". With this new option and the existing options of the vcregress tool, it covers more or less all the tests - core, TAP, contrib, bin, isolation, modules, upgrade, recovery etc.
>
> Thoughts?
>
> [1]
> vcregress installcheck
> vcregress plcheck
> vcregress contribcheck
> vcregress modulescheck
> vcregress isolationcheck
>
> [2]
> vcregress check
> vcregress ecpgcheck
> vcregress bincheck
> vcregress recoverycheck
> vcregress upgradecheck
> vcregress subscriptioncheck

The problems with having "vcregress checkworld" are: 1) more code modifications are required, as the available "vcregress" test functions, which require a pre-started pg instance, can't be used directly; 2) it looks like spinning up a separate postgres instance for every module test takes time on Windows, which might make the test run longer. If we were to have "vcregress installcheckworld", we might have to add new code to convert some of the existing functions to not use a pre-started pg instance.

IMHO, we can just have "vcregress subscriptioncheck" and let users decide which tests they want to run.

I would like to hear more opinions on this.

Regards,
Bharath Rupireddy.
Re: Accommodate startup process in a separate ProcState array slot instead of in MaxBackends slots.
On Thu, Oct 14, 2021 at 10:56 AM Fujii Masao wrote:
> On 2021/10/12 15:46, Bharath Rupireddy wrote:
> > On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao wrote:
> >>
> >> On 2021/10/12 4:07, Bharath Rupireddy wrote:
> >>> Hi,
> >>>
> >>> While working on [1], it was found that currently the ProcState array doesn't have entries for auxiliary processes; it only has entries for MaxBackends. But the startup process is eating up one slot from MaxBackends. We need to increase the size of the ProcState array by 1, at least for the startup process. The startup process uses a ProcState slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit. The ProcState array size is initialized to MaxBackends in SInvalShmemSize.
> >>>
> >>> The consequence of not fixing this issue is that the database may hit the error "sorry, too many clients already" soon in SharedInvalBackendInit.
>
> On second thought, I wonder if this error could not happen in practice. No? Because autovacuum doesn't work during recovery, the startup process can safely use the ProcState entry of an autovacuum worker process. Also, since the minimal allowed value of autovacuum_max_workers is one, the ProcState array is guaranteed to have at least one entry for an autovacuum worker.
>
> If this understanding is right, we don't need to enlarge the array and can just update the comment. I don't strongly oppose enlarging the array in master, but I'm not sure it's worth doing that in back branches if the issue can cause no actual error.

Yes, the issue can't happen. The comment in SInvalShmemSize, mentioning that the startup process always has an extra slot because the autovacuum worker is not active during recovery, looks okay. But is it safe to assume that always? Do we have a way to express that in the form of an Assert(when_i_am_startup_proc && autovacuum_not_running) (this looks a bit dirty though)?

Instead, we can just enlarge the array in master and be confident that the startup process always has one dedicated slot.

Regards,
Bharath Rupireddy.
Re: Improve the HINT message of the ALTER command for postgres_fdw
On Wed, Oct 13, 2021 at 11:06 PM Fujii Masao wrote:
> On 2021/10/13 14:00, Bharath Rupireddy wrote:
> > On Tue, Oct 12, 2021 at 11:11 PM Fujii Masao wrote:
> >> BTW, I found that file_fdw.c, dblink.c, postgres_fdw/option.c and foreign.c use different error codes for the same error message, as follows. Should they use the same error code? If yes, ISTM that ERRCODE_FDW_INVALID_OPTION_NAME is better because the error message is "invalid option ...".
> >>
> >> - ERRCODE_FDW_INVALID_OPTION_NAME (file_fdw.c)
> >> - ERRCODE_FDW_OPTION_NAME_NOT_FOUND (dblink.c)
> >> - ERRCODE_FDW_INVALID_OPTION_NAME (postgres_fdw/option.c)
> >> - ERRCODE_SYNTAX_ERROR (foreign.c)
> >
> > Good catch. ERRCODE_FDW_INVALID_OPTION_NAME seems reasonable to me. I think we can remove the error code ERRCODE_FDW_OPTION_NAME_NOT_FOUND (it is being used only by dblink.c) and instead use ERRCODE_FDW_INVALID_OPTION_NAME for all option parsing and validation.
>
> Alvaro told me the difference between those error codes, as follows, at [1]. This makes me think that ERRCODE_FDW_OPTION_NAME_NOT_FOUND is more proper for the error message. Thoughts?
>
> ---
> in SQL/MED compare GetServerOptByName: INVALID OPTION NAME is used when the buffer length does not match the option name length; OPTION NAME NOT FOUND is used when an option of that name is not found
> ---
>
> [1] https://twitter.com/alvherre/status/1447991206286348302

I'm fine with the distinction that's made; now I'm thinking about the appropriate places where ERRCODE_FDW_INVALID_OPTION_NAME can be used. Is it correct to use ERRCODE_FDW_INVALID_OPTION_NAME in postgresImportForeignSchema, where we don't check buffer length against option name length but throw the error when we don't find what's expected for the IMPORT FOREIGN SCHEMA command? Isn't ERRCODE_FDW_OPTION_NAME_NOT_FOUND the right choice there?

I've looked at some of the option parsing logic (with the search text "stmt->options)" in the code base); it mostly uses "option \"%s\" not recognized" without an error code, or "unrecognized option \"%s\"" with ERRCODE_SYNTAX_ERROR. I'm not sure which is right. If not in postgresImportForeignSchema, where else can ERRCODE_FDW_INVALID_OPTION_NAME be used?

If we were to retain the error code ERRCODE_FDW_INVALID_OPTION_NAME, do we need to document the difference, or note it in code comments, or at least in the commit message?

Regards,
Bharath Rupireddy.
Re: Reset snapshot export state on the transaction abort
On Sat, Oct 16, 2021 at 9:13 AM Michael Paquier wrote:
>
> While double-checking this stuff, I have noticed something that's wrong in the patch when a command that follows a CREATE_REPLICATION_SLOT query resets SnapBuildClearExportedSnapshot(). Once the slot is created, the WAL sender is in a TRANS_INPROGRESS state, meaning that AbortCurrentTransaction() would call AbortTransaction(), hence calling ResetSnapBuildExportSnapshotState() and resetting SavedResourceOwnerDuringExport to NULL before we store a NULL into CurrentResourceOwner :)

Right, good catch!

> One solution would be as simple as saving SavedResourceOwnerDuringExport into a temporary variable before calling AbortCurrentTransaction(), and saving it back into CurrentResourceOwner once we are done in SnapBuildClearExportedSnapshot(), as we need to rely on AbortTransaction() to do the static state cleanup if an error happens before the command after the replslot creation command shows up.

Yeah, this idea looks fine to me. I have modified the patch. In addition, I have removed the call to ResetSnapBuildExportSnapshotState from SnapBuildClearExportedSnapshot, because that is anyway called from AbortTransaction.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

From 062a84dcd821fa5e41998c8714f8ec16befcf983 Mon Sep 17 00:00:00 2001
From: Dilip Kumar
Date: Mon, 11 Oct 2021 20:38:58 +0530
Subject: [PATCH v2] Reset snapshot export state during abort

While exporting a snapshot we set temporary state which gets cleaned up only while executing the next replication command. But if the snapshot-exporting transaction gets aborted, that state is not cleaned up. This patch fixes this by cleaning it up during AbortTransaction.
---
 src/backend/access/transam/xact.c           |  4
 src/backend/replication/logical/snapbuild.c | 18 +-
 src/include/replication/snapbuild.h         |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 4cc38f0..d3f7440 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -46,6 +46,7 @@
 #include "replication/logical.h"
 #include "replication/logicallauncher.h"
 #include "replication/origin.h"
+#include "replication/snapbuild.h"
 #include "replication/syncrep.h"
 #include "replication/walsender.h"
 #include "storage/condition_variable.h"
@@ -2698,6 +2699,9 @@ AbortTransaction(void)
 	/* Reset logical streaming state. */
 	ResetLogicalStreamingState();
 
+	/* Reset snapshot export state. */
+	ResetSnapBuildExportSnapshotState();
+
 	/* If in parallel mode, clean up workers and exit parallel mode. */
 	if (IsInParallelMode())
 	{
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index a54..d889c22 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -682,6 +682,8 @@ SnapBuildGetOrBuildSnapshot(SnapBuild *builder, TransactionId xid)
 void
 SnapBuildClearExportedSnapshot(void)
 {
+	ResourceOwner tmpResOwner;
+
 	/* nothing exported, that is the usual case */
 	if (!ExportInProgress)
 		return;
@@ -689,10 +691,24 @@ SnapBuildClearExportedSnapshot(void)
 	if (!IsTransactionState())
 		elog(ERROR, "clearing exported snapshot in wrong transaction state");
 
+	/*
+	 * Abort transaction will reset the SavedResourceOwnerDuringExport so
+	 * remember this in a local variable.
+	 */
+	tmpResOwner = SavedResourceOwnerDuringExport;
+
 	/* make sure nothing could have ever happened */
 	AbortCurrentTransaction();
 
-	CurrentResourceOwner = SavedResourceOwnerDuringExport;
+	CurrentResourceOwner = tmpResOwner;
+}
+
+/*
+ * Reset snapshot export state on the transaction abort.
+ */
+void
+ResetSnapBuildExportSnapshotState(void)
+{
 	SavedResourceOwnerDuringExport = NULL;
 	ExportInProgress = false;
 }
diff --git a/src/include/replication/snapbuild.h b/src/include/replication/snapbuild.h
index de72124..6a1082b 100644
--- a/src/include/replication/snapbuild.h
+++ b/src/include/replication/snapbuild.h
@@ -70,6 +70,7 @@ extern void SnapBuildSnapDecRefcount(Snapshot snap);
 extern Snapshot SnapBuildInitialSnapshot(SnapBuild *builder);
 extern const char *SnapBuildExportSnapshot(SnapBuild *snapstate);
 extern void SnapBuildClearExportedSnapshot(void);
+extern void ResetSnapBuildExportSnapshotState(void);
 extern SnapBuildState SnapBuildCurrentState(SnapBuild *snapstate);
 extern Snapshot SnapBuildGetOrBuildSnapshot(SnapBuild *builder,
-- 
1.8.3.1
Re: [PATCH] Proposal for HIDDEN/INVISIBLE column
On 15/10/2021 at 20:51, Bruce Momjian wrote:
> Why is this not better addressed by creating a view on the original table, even perhaps renaming the original table and creating a view using the old table name?

Because when you use the view for the SELECT you cannot use the "hidden" column in your query, for example in the WHERE or ORDER BY clause. Also, if you have a hundred tables, let's say with a tsvector column that you want to hide, you will have to create a hundred views. The other problem is writing through the view: if you have a complex modification involving other tables in the query, you have to define rules. Handling a technical column through a view over the real table requires a lot of work; this feature would save that time.

--
Gilles Darold
Re: [PATCH] Proposal for HIDDEN/INVISIBLE column
On 15/10/2021 at 18:42, Aleksander Alekseev wrote:
> Hi Gilles,
>
>> Yes, I didn't want to offend you or to troll. This was just to point out that the position "SELECT * is bad practice" is not a good argument in my point of view, just because it is allowed for everyone. I mean that in an extension or a client which allows user query input, we must handle the case.
>
> Sure, no worries. And my apologies if my feedback seemed a little harsh.
>
> I'm sure our goal is mutual - to make PostgreSQL even better than it is now. Finding a consensus occasionally can take time though.

Right, no problem Aleksander. My English speaking and understanding is not very good, so it doesn't help either. Let's have a beer next time :-)
RE: Data is copied twice when specifying both child and parent table in publication
On Friday, October 15, 2021 7:23 PM houzj.f...@fujitsu.com wrote:
> Attach a patch to fix it.

Attached is a new version of the patch, which refactors the fix in a cleaner way.

Best regards,
Hou zj

v2-0001-fix-double-publish.patch
Description: v2-0001-fix-double-publish.patch