Re: archive_command / pg_stat_archiver & documentation
On Mon, Mar 1, 2021 at 3:36 PM Michael Paquier wrote:
>
> On Fri, Feb 26, 2021 at 10:03:05AM +0100, Benoit Lobréau wrote:
> > Done here : https://commitfest.postgresql.org/32/3012/
>
> Documenting that properly for the archive command, as already done for
> restore_command, sounds good to me. I am not sure that there is much
> point in doing a cross-reference to the archiving section for one
> specific field of pg_stat_archiver.

Agreed.

> For the second paragraph, I would recommend to move that to a
> different paragraph to outline this special case, leading to the
> attached.

+1

> What do you think?

LGTM!
Re: proposal: schema variables
On Tue, Feb 16, 2021 at 6:46 PM Pavel Stehule wrote:
> Hi
>
> On Tue, Feb 2, 2021 at 9:43 AM Pavel Stehule wrote:
>
>> Hi
>>
>> rebase and set PK for pg_variable table
>>
>
> rebase

rebase

Pavel

>
>> Regards
>>
>> Pavel
>>

Attachment: schema-variables-20200301.patch.gz
Re: archive_command / pg_stat_archiver & documentation
On Fri, Feb 26, 2021 at 10:03:05AM +0100, Benoit Lobréau wrote:
> Done here : https://commitfest.postgresql.org/32/3012/

Documenting that properly for the archive command, as already done for restore_command, sounds good to me. I am not sure that there is much point in doing a cross-reference to the archiving section for one specific field of pg_stat_archiver.

For the second paragraph, I would recommend to move that to a different paragraph to outline this special case, leading to the attached. What do you think?
--
Michael

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 21094c6a9d..c5557d5444 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -639,6 +639,15 @@ test ! -f /mnt/server/archivedir/000100A90065 && cp pg_wal/0
 it will try again periodically until it succeeds.
+
+When the archive command is terminated by a signal (other than
+SIGTERM that is used as part of a server
+shutdown) or an error by the shell with an exit status greater than
+125 (such as command not found), the archiver process aborts and gets
+restarted by the postmaster. In such cases, the failure is
+not reported in .
+
+
 The archive command should generally be designed to refuse to overwrite
 any pre-existing archive file. This is an important safety feature to
[PATCH] postgres-fdw: column option to override foreign types
Full use of a custom data type with postgres_fdw currently requires the type be maintained in both the local and remote databases. `CREATE FOREIGN TABLE` does not check declared types against the remote table, but declaring e.g. a remote enum to be local text works only partway, as seen here. A simple select query against alpha_items returns the enum values as text; however, *filtering* on the column yields an error.

create database alpha;
create database beta;

\c alpha
create type itemtype as enum ('one', 'two', 'three');
create table items (
  id serial not null primary key,
  type itemtype not null
);
insert into items (type) values ('one'), ('one'), ('two');

\c beta
create extension postgres_fdw;
create server alpha foreign data wrapper postgres_fdw
  options (dbname 'alpha', host 'localhost', port '5432');
create user mapping for postgres server alpha options (user 'postgres');
create foreign table alpha_items (
  id int,
  type text
) server alpha options (table_name 'items');

select * from alpha_items; -- ok

select * from alpha_items where type = 'one';
ERROR: operator does not exist: public.itemtype = text
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
CONTEXT: remote SQL command: SELECT id, type FROM public.items WHERE ((type = 'one'::text))

The attached changeset adds a new boolean option for postgres_fdw foreign table columns, `use_local_type`. When true, ColumnRefs for the relevant attribute will be deparsed with a cast to the type defined in `CREATE FOREIGN TABLE`.

create foreign table alpha_items (
  id int,
  type text options (use_local_type 'true')
) server alpha options (table_name 'items');

select * from alpha_items where type = 'one'; -- succeeds

This builds and checks, with a new regression test and documentation.

From 09961e10daf72e2c1fbbdddb05b2940b2da14df0 Mon Sep 17 00:00:00 2001
From: Dian M Fay
Date: Mon, 1 Mar 2021 00:44:15 -0500
Subject: [PATCH] postgres_fdw: column option to override foreign types

Enabling the use_local_type option on a foreign table column will deparse it with a cast to the locally-declared type, allowing overrides of foreign types not known to the local database.
--- contrib/postgres_fdw/deparse.c | 15 --- contrib/postgres_fdw/expected/postgres_fdw.out | 17 + contrib/postgres_fdw/option.c | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 10 ++ doc/src/sgml/postgres-fdw.sgml | 13 + 5 files changed, 53 insertions(+), 3 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 6faf499f9a..c722d49316 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -2264,6 +2264,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte, else { char *colname = NULL; + bool uselocaltype = NULL; List *options; ListCell *lc; @@ -2280,10 +2281,9 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte, DefElem*def = (DefElem *) lfirst(lc); if (strcmp(def->defname, "column_name") == 0) - { colname = defGetString(def); - break; - } + else if (strcmp(def->defname, "use_local_type") == 0) + uselocaltype = defGetBoolean(def); } /* @@ -2297,6 +2297,15 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte, ADD_REL_QUALIFIER(buf, varno); appendStringInfoString(buf, quote_identifier(colname)); + + if (uselocaltype) + { + Oid coltype = get_atttype(rte->relid, varattno); + char *typname = deparse_type_name(coltype, -1); + + appendStringInfoString(buf, "::"); + appendStringInfoString(buf, typname); + } } } diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 0649b6b81c..c3271bac8f 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -4105,6 +4105,23 @@ ERROR: invalid input syntax for type integer: "foo" CONTEXT: processing expression at position 2 in select list ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE user_enum; -- === +-- conversion with use_local_type +-- === +ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE text; +SELECT * FROM ft1 WHERE c8 = 'foo' LIMIT 1; -- ERROR +ERROR: operator does not
Re: macOS SIP, next try
So, we haven't gotten anywhere satisfying with these proposed technical solutions. I have since learned that there is a way to disable only the part of SIP that is relevant for us. This seems like a useful compromise, and it appears that a number of other open-source projects are following the same route. I suggest the attached documentation patch and then close this issue. From 7efb0ec3e15f37f9c5e12845aeccd9cd8693c01d Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 1 Mar 2021 07:58:17 +0100 Subject: [PATCH] doc: Update information on macOS SIP On more recent versions of macOS, it is sufficient to disable only a part of SIP in order to get make check to run before make install. Document how. --- doc/src/sgml/installation.sgml | 26 ++ 1 file changed, 26 insertions(+) diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 66ad4ba938..39adf1f5d9 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -2375,6 +2375,9 @@ macOS You may or may not wish to also install Xcode. + +Sysroot + On recent macOS releases, it's necessary to embed the sysroot path in the include switches used to @@ -2419,6 +2422,10 @@ macOS to build with a non-Apple compiler, but beware that that case is not tested or supported by the PostgreSQL developers. + + + +System Integrity Protection macOS's System Integrity @@ -2429,6 +2436,25 @@ macOS install before make check. Most PostgreSQL developers just turn off SIP, though. + + +To disable SIP, boot into recovery mode, open a terminal, and run + +csrutil disable + + and reboot. In macOS version 10.14 and later, it is sufficient to disable + the Debugging part of SIP, by running + +csrutil enable --without debug + +instead. The status of SIP can be shown using + +csrutil status + +Note that that status display does not reflect changes until after a +reboot. + + -- 2.30.1
Re: Side effect of remove_useless_groupby_columns
Hi, On Sun, Feb 28, 2021 at 5:15 PM David Rowley wrote: > On Sun, 28 Feb 2021 at 20:52, Richard Guo wrote: > > When looking at [1], I realized we may have a side effect when removing > > redundant columns in the GROUP BY clause. Suppose we have a query with > > ORDER BY 'b', and meanwhile column 'b' is also a group key. If we decide > > that 'b' is redundant due to being functionally dependent on other GROUP > > BY columns, we would remove it from group keys. This will make us lose > > the opportunity to leverage the index on 'b'. > > > Any thoughts? > > I had initially thought that this was a non-issue before incremental > sort was added, but the following case highlights that's wrong. > > # create table t (a int not null, b int); > # insert into t select i, i%1000 from generate_series(1,100)i; > # create index on t(b,a); > > # explain (analyze) select b from t group by a, b order by b,a limit 10; > # alter table t add constraint t_pkey primary key (a); > # explain (analyze) select b from t group by a, b order by b,a limit 10; > > Execution slows down after adding the primary key since that allows > the functional dependency to be detected and the "b" column removed > from the GROUP BY clause. > > Perhaps we could do something like add: > > diff --git a/src/backend/optimizer/plan/planner.c > b/src/backend/optimizer/plan/planner.c > index 545b56bcaf..7e52212a13 100644 > --- a/src/backend/optimizer/plan/planner.c > +++ b/src/backend/optimizer/plan/planner.c > @@ -3182,7 +3182,8 @@ remove_useless_groupby_columns(PlannerInfo *root) > if (!IsA(var, Var) || > var->varlevelsup > 0 || > !bms_is_member(var->varattno - > FirstLowInvalidHeapAttributeNumber, > - > surplusvars[var->varno])) > + > surplusvars[var->varno]) || > + list_member(parse->sortClause, sgc)) > new_groupby = lappend(new_groupby, sgc); > } > > to remove_useless_groupby_columns(). It's not exactly perfect though > since it's still good to remove the useless GROUP BY columns if > there's no useful index to implement the ORDER BY. We just don't know > that when we call remove_useless_groupby_columns(). > Or, rather than thinking it as a side effect of removing useless groupby columns, how about we do an additional optimization as 'adding useful groupby columns', to add into group keys some column which matches ORDER BY and meanwhile is being functionally dependent on other GROUP BY columns. What I'm thinking is a scenario like below. # create table t (a int primary key, b int, c int); # insert into t select i, i%1000, i%1000 from generate_series(1,100)i; # create index on t(b); # explain (analyze) select b from t group by a, c order by b limit 10; For this query, we can first remove 'c' from groupby columns with the existing optimization of 'removing useless groupby columns', and then add 'b' to groupby columns with the new-added optimization of 'adding useful groupby columns'. We can do that because we know 'b' is functionally dependent on 'a', and we are required to be sorted by 'b', and meanwhile there is index on 'b' that we can leverage. Thanks Richard
Re: Parallel INSERT (INTO ... SELECT ...)
On Mon, Mar 1, 2021 at 12:38 PM Greg Nancarrow wrote:
> On Fri, Feb 26, 2021 at 5:50 PM Amit Langote wrote:
> >
> > On Fri, Feb 26, 2021 at 3:35 PM Greg Nancarrow wrote:
> > > On Fri, Feb 26, 2021 at 4:07 PM Amit Langote wrote:
> > > > The attached patch fixes this, although I am starting to have second
> > > > thoughts about how we're tracking partitions in this patch. Wondering
> > > > if we should bite the bullet and add partitions into the main range
> > > > table instead of tracking them separately in partitionOids, which
> > > > might result in a cleaner patch overall.
> > >
> > > Thanks Amit,
> > >
> > > I was able to reproduce the problem using your instructions (though I
> > > found I had to run that explain an extra time, in order to hit the
> > > breakpoint).
> > > Also, I can confirm that the problem doesn't occur after application
> > > of your patch.
> > > I'll leave it to your better judgement as to what to do next - if you
> > > feel the current tracking method is not sufficient
> >
> > Just to be clear, I think the tracking method added by the patch is
> > sufficient AFAICS for the problems we were able to discover. The
> > concern I was trying to express is that we seem to be duct-taping
> > holes in our earlier chosen design to track partitions separately from
> > the range table. If we had decided to add partitions to the range
> > table as "extra" target relations from the get-go, both the issues I
> > mentioned with cached plans -- partitions not being counted as a
> > dependency and partitions not being locked before execution -- would
> > have been prevented. I haven't fully grasped how invasive that design
> > would be, but it sure sounds like it would be a bit more robust.
>
> Posting an updated set of patches that includes Amit Langote's patch
> to the partition tracking scheme...

Thanks Greg.

> (the alternative of adding partitions to the range table needs further
> investigation)

I looked at this today with an intention to write and post a PoC patch, but was quickly faced with the task of integrating that design with how ModifyTable works for inserts. Specifically if, in addition to adding partitions to the range table, we also add their RT indexes to ModifyTable.resultRelations, then maybe we'll need to rethink our executor tuple routing code. That code currently tracks the insert's target partitions separately from the range table, exactly because they are not there to begin with. So if we change the latter as in this hypothetical PoC, maybe we should also revisit the former.

Also, it would not be great that the planner's arrays in PlannerInfo would get longer as a result of Query.rtable getting longer by adding partitions, thus making all the loops over those arrays slower for no benefit.

So, let's do this in a follow-on patch, if at all, instead of considering this topic a blocker for committing the current set of patches.

--
Amit Langote
EDB: http://www.enterprisedb.com
Re: [PATCH] refactor ATExec{En,Dis}ableRowSecurity
On Sun, Feb 28, 2021 at 02:27:44PM -0800, Zhihong Yu wrote:
> For 0002-Further-refactoring.patch, should there be assertion
> inside ATExecSetRowSecurity() on the values for rls and force_rls ?
> There could be 3 possible values: -1, 0 and 1.

0001 is a clean simplification and a good catch, so I'll see about applying it. 0002 just makes the code more confusing to the reader IMO, and its interface could easily lead to unwanted errors.
--
Michael
Re: doc review for v14
On Sun, Feb 28, 2021 at 10:33:55PM -0600, Justin Pryzby wrote:
> I still have an issue with the sentence that begins:
> "Checksums are normally enabled..."

You could say here "Checksums can be enabled", but "normally" does not sound bad to me either as it insists on the fact that it is better to do that when the cluster is initdb'd as this has no downtime compared to enabling checksums on an existing cluster.

> Note, the patch I sent said "create" but should be "created".

Initialized sounds better to me, FWIW.

> "page corruptions" is wrong .. you could say "corrupt pages"

"corruptED pages" would sound more correct to me as something that has already happened. Anyway, I'd rather keep what I am proposing upthread.
--
Michael
Re: simplifying foreign key/RI checks
> > > It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I > > know that doesn't mean the usefulness of the macro but the mechanism > > the macro suggests, but it is confusing.) On the other hand, > > RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no > > longer used. (Couldn't we remove them?) > > Yeah, better to just remove those _PK macros and say this module no > longer runs any queries on the PK table. > > How about the attached? > > Sorry for the delay. I see that the changes were made as described. Passes make check and make check-world yet again. I'm marking this Ready For Committer unless someone objects.
Re: repeated decoding of prepared transactions
On Mon, Mar 1, 2021 at 7:23 AM Ajin Cherian wrote: > Pushed, the first patch in the series. -- With Regards, Amit Kapila.
Re: [HACKERS] Custom compression methods
On Sat, Feb 27, 2021 at 9:35 PM Justin Pryzby wrote: > > > Subject: [PATCH v28 3/4] Add default_toast_compression GUC > > This part isn't working. My first patch worked somewhat better: due to doing > strcmp() with the default GUC, it avoided using the cached AM OID. (But it > would've failed with more than 2 AMs, since the cache wasn't invalidated, > since > I couldn't tell when it was needed). > > Your patch does this: > > |postgres=# SET default_toast_compression=lz4 ; > |postgres=# CREATE TABLE t(a text); > |postgres=# \d+ t > | a | text | | | | extended | pglz| >| > > assign_default_toast_compression() should set > default_toast_compression_oid=InvalidOid, rather than > default_toast_compression=NULL. I will fix this. > In my original patch, that was commented, since I was confused, not realizing > that the GUC machinery itself assigns to the string value. We should assign > to > the cached Oid, instead. > Reading my own patch, I see that in AccessMethodCallback() should also say > InvalidOid. > | default_toast_compression_oid = false; > The false assignment was copied from namespace.c: baseSearchPathValid. I will fix this. So as of now, we can make this patch such that it is enough to work with the built-in method and later we can add another enhancement patch that can work with the custom compression methods. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Custom compression methods
On Mon, Mar 01, 2021 at 10:32:23AM +0530, Dilip Kumar wrote:
> On Sun, Feb 28, 2021 at 9:48 AM Justin Pryzby wrote:
> >
> > On my PC, this new change is causing a test failure:
> >
> > SELECT SUBSTR(f1, 2000, 50) FROM cmdata1;
> > - substr
> > -
> > - 01234567890123456789012345678901234567890123456789
> > -(1 row)
> > -
> > +ERROR: compressed lz4 data is corrupt
>
> The older version of lz4 had this problem that while decompressing
> partial if we don't give the buffer size up to full data length it was
> failing[1] but it is solved in version 1.9.

Thanks. It seems like that explains it. I think if that's a problem with recent versions, then you'll have to conditionally disable slicing.
https://packages.debian.org/liblz4-dev

Slicing isn't generally usable if it sometimes makes people's data inaccessible and gives errors about corruption.

I guess you could make it a compile time test on these constants (I don't know the necessary version, though)

#define LZ4_VERSION_MAJOR    1   /* for breaking interface changes */
#define LZ4_VERSION_MINOR    7   /* for new (non-breaking) interface capabilities */
#define LZ4_VERSION_RELEASE  1   /* for tweaks, bug-fixes, or development */
#define LZ4_VERSION_NUMBER (LZ4_VERSION_MAJOR *100*100 + LZ4_VERSION_MINOR *100 + LZ4_VERSION_RELEASE)

If the version is too low, either make it #error, or disable slicing. The OS usual library version infrastructure will make sure the runtime version is at least the MAJOR+MINOR of the compile time version.
--
Justin
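To make the compile-time idea above concrete, here is a minimal sketch of such a guard. It assumes, per the discussion above, that lz4 1.9 is the version where partial (sliced) decompression became reliable; the HAVE_LZ4_DECOMPRESS_PARTIAL symbol is invented purely for illustration.

#include <lz4.h>

/*
 * Hypothetical sketch: require a new-enough liblz4 for slicing, or fall
 * back to decompressing the full datum.  1.9.0 encodes to 10900 under the
 * LZ4_VERSION_NUMBER scheme shown above.
 */
#if LZ4_VERSION_NUMBER >= 10900
#define HAVE_LZ4_DECOMPRESS_PARTIAL 1
#else
/* Alternatively: #error "liblz4 1.9 or newer is required for sliced decompression" */
#undef HAVE_LZ4_DECOMPRESS_PARTIAL
#endif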
Re: Add client connection check during the execution of the query
On Sat, Aug 3, 2019 at 4:40 AM Konstantin Knizhnik wrote: > On 18.07.2019 6:19, Tatsuo Ishii wrote: > > So the performance is about 5% down with the feature enabled in this > > case. For me, 5% down is not subtle. Probably we should warn this in > > the doc. > > I also see some performance degradation, although it is not so large in > my case (I used pgbench with scale factor 10 and run the same command as > you). > In my case difference is 103k vs. 105k TPS is smaller than 2%. I didn't test, but hopefully the degradation is fixed by commit 09cf1d52? > If OS detected closed connection, it should return POLLHUP, should not it? > I am not sure if it is more portable or more efficient way - just seems > to be a little bit more natural way (from my point of view) to check if > connection is still alive. That's if you're sleeping inepoll etc. This patch is for CPU-bound backends, running a long query. We need to do something special to find out if the kernel knows that the connection has been closed. I've done a quick rebase of this the patch and added it to the commitfest. No other changes. Several things were mentioned earlier that still need to be tidied up. From 5f7c327c5896369b80529467a3f1f64eab690887 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 1 Mar 2021 18:08:23 +1300 Subject: [PATCH v3] Detect dropped connections while running queries. Provide a new optional GUC that can be used to check whether the client connection has gone away periodically while running very long queries. Author: Sergey Cherkashin Author: Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru --- doc/src/sgml/config.sgml | 20 src/backend/libpq/pqcomm.c| 31 + src/backend/tcop/postgres.c | 5 + src/backend/utils/init/globals.c | 1 + src/backend/utils/init/postinit.c | 13 +++ src/backend/utils/misc/guc.c | 10 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/include/libpq/libpq.h | 2 + src/include/miscadmin.h | 3 +- src/include/utils/timeout.h | 1 + src/test/modules/Makefile | 1 + src/test/modules/connection/Makefile | 16 +++ .../connection/t/001_close_connection.pl | 107 ++ 13 files changed, 212 insertions(+), 1 deletion(-) create mode 100644 src/test/modules/connection/Makefile create mode 100644 src/test/modules/connection/t/001_close_connection.pl diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index b5718fc136..db8798eb95 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -9220,6 +9220,26 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' + + client_connection_check_interval (integer) + + client_connection_check_interval configuration parameter + + + + +Sets a time interval, in milliseconds, between periodic +verification of client-server connection during query execution. +If the client aborts the connection, the query is terminated. + + +Default value is zero - it disables connection +checks, so the backend will detect client disconnection only when trying +to send a response to the query. 
+ + + + diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 27a298f110..44fb79c2b5 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -118,6 +118,7 @@ */ int Unix_socket_permissions; char *Unix_socket_group; +int client_connection_check_interval; /* Where the Unix socket files are (list of palloc'd strings) */ static List *sock_paths = NIL; @@ -2022,3 +2023,33 @@ pq_settcpusertimeout(int timeout, Port *port) return STATUS_OK; } + +/* + * pq_check_client_connection - check if client connected to socket or not + * + */ +void pq_check_client_connection(void) +{ + CheckClientConnectionPending = false; + if (IsUnderPostmaster && + MyProcPort != NULL && !PqCommReadingMsg && !PqCommBusy) + { + char nextbyte; + int r; + +#ifdef WIN32 + pgwin32_noblock = 1; +#endif + r = recv(MyProcPort->sock, &nextbyte, 1, MSG_PEEK); +#ifdef WIN32 + pgwin32_noblock = 0; +#endif + + if (r == 0 || (r == -1 && + errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)) + { + ClientConnectionLost = true; + InterruptPending = true; + } + } +} diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index bb5ccb4578..4ce004a2d5 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -3128,6 +3128,8 @@ ProcessInterrupts(void) (errcode(ERRCODE_ADMIN_SHUTDOWN), errmsg("terminating connection due to administrator command"))); } + if (CheckClientConne
Re: Speeding up GIST index creation for tsvectors
I have added this one in the March commitfest. https://commitfest.postgresql.org/32/3023/
Re: Printing backtrace of postgres processes
Hi, I also think this feature would be useful when supporting environments that lack debugger or debug symbols. I think such environments are not rare. + for more information. This +will help in identifying where exactly the backend process is currently +executing. When I read this, I expected a backtrace would be generated at the moment when it receives the signal, but actually it just sets a flag that causes the next CHECK_FOR_INTERRUPTS to print a backtrace. How about explaining the timing of the backtrace generation? +print backtrace of superuser backends. This feature is not supported +for postmaster, logging and statistics process. Since the current patch use BackendPidGetProc(), it does not support this feature not only postmaster, logging, and statistics but also checkpointer, background writer, and walwriter. And when I specify pid of these PostgreSQL processes, it says "PID is not a PostgreSQL server process". I think it may confuse users, so it might be worth changing messages for those PostgreSQL processes. AuxiliaryPidGetProc() may help to do it. diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index 54a818b..5fae328 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -57,6 +57,7 @@ #include "storage/shmem.h" #include "storage/smgr.h" #include "storage/spin.h" +#include "tcop/tcopprot.h" #include "utils/guc.h" #include "utils/memutils.h" #include "utils/resowner.h" @@ -547,6 +548,13 @@ HandleCheckpointerInterrupts(void) if (ProcSignalBarrierPending) ProcessProcSignalBarrier(); + /* Process printing backtrace */ + if (PrintBacktracePending) + { + PrintBacktracePending = false; + set_backtrace(NULL, 0); + } + Although it implements backtrace for checkpointer, when I specified pid of checkpointer it was refused from BackendPidGetProc(). Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
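For readers following the last two review points, a rough sketch of how the PID lookup could cover auxiliary processes as well as regular backends is shown below. This is not code from the posted patch; the use of AuxiliaryPidGetProc() here is only an assumption about how the suggestion could be applied.

/*
 * Hypothetical sketch: also look up auxiliary processes so that the
 * checkpointer, background writer and walwriter get a proper answer
 * instead of "PID ... is not a PostgreSQL server process".
 */
PGPROC	   *proc = BackendPidGetProc(pid);

if (proc == NULL)
	proc = AuxiliaryPidGetProc(pid);	/* checkpointer, bgwriter, ... */

if (proc == NULL)
{
	ereport(WARNING,
			(errmsg("PID %d is not a PostgreSQL server process", pid)));
	PG_RETURN_BOOL(false);
}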
Re: Is Recovery actually paused?
On Fri, Feb 26, 2021 at 1:33 PM Kyotaro Horiguchi wrote: > > At Thu, 25 Feb 2021 13:22:53 +0530, Dilip Kumar wrote > in > > On Thu, Feb 25, 2021 at 12:42 PM Kyotaro Horiguchi > > wrote: > > > The latest version applies (almost) cleanly to the current master and > > > works fine. > > > I don't have further comment on this. > > > > > > I'll wait for a day before marking this RfC in case anyone have > > > further comments. > > > > Okay. > > Hearing nothing, done that. Thanks. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Different compression methods for FPI
On Mon, Mar 01, 2021 at 01:57:12PM +0900, Michael Paquier wrote: > On Sun, Feb 28, 2021 at 05:08:17PM -0600, Justin Pryzby wrote: > > Does this need to patch ./configure{,.ac} and Solution.pm for HAVE_LIBLZ4 ? > > I suggest to also include an 0002 patch (not for commit) which changes to > > use a > > non-default compression, to exercise this on the CIs - linux and bsd > > environments now have liblz4 installed, and for windows you can keep > > "undef". > > Good idea. But are you sure that lz4 is available in the CF bot > environments? Yes, see here https://www.postgresql.org/message-id/flat/20210119205643.GA1941%40telsasoft.com -- Justin
Re: [HACKERS] Custom compression methods
On Sun, Feb 28, 2021 at 9:48 AM Justin Pryzby wrote: > > On my PC, this new change is causing a test failure: > > SELECT SUBSTR(f1, 2000, 50) FROM cmdata1; > - substr > - > - 01234567890123456789012345678901234567890123456789 > -(1 row) > - > +ERROR: compressed lz4 data is corrupt The older version of lz4 had this problem that while decompressing partial if we don't give the buffer size up to full data length it was failing[1] but it is solved in version 1.9. > @@ -119,15 +119,15 @@ lz4_cmdecompress_slice(const struct varlena *value, > int32 slicelength) > int32 rawsize; > struct varlena *result; > > - /* allocate memory for holding the uncompressed data */ > - result = (struct varlena *) palloc(VARRAWSIZE_4B_C(value) + VARHDRSZ); > + /* allocate memory for the uncompressed data */ > + result = (struct varlena *) palloc(slicelength + VARHDRSZ); > > - /* decompress partial data using lz4 routine */ > + /* decompress the data */ > rawsize = LZ4_decompress_safe_partial((char *) value + > VARHDRSZ_COMPRESS, > > VARDATA(result), > > VARSIZE(value) - VARHDRSZ_COMPRESS, > > slicelength, > - > VARRAWSIZE_4B_C(value)); > + > slicelength); This is done for the latest version as now we don't need to allocate the buffer of full size, it is enough the allocate just equal to the slicelength. > Also, in the tests, you have this at both the top and bottom of the file: > > src/test/regress/sql/compression.sql:\set HIDE_COMPRESSAM false > src/test/regress/sql/compression.sql:\set HIDE_COMPRESSAM false > > Whereas the patch I sent had at the end: > > +\set HIDE_COMPRESSAM on > > ("on" is the default when run under pg_regress) I will fix this. [1] https://docs.unrealengine.com/en-US/API/Runtime/Core/Compression/LZ4_decompress_safe_partial/index.html -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: repeated decoding of prepared transactions
On Mon, Mar 1, 2021 at 7:23 AM Ajin Cherian wrote: > > On Sat, Feb 27, 2021 at 11:06 PM Amit Kapila wrote: > > > Few comments on 0002 patch: > > = > > 1. > > + > > + /* > > + * Disable two-phase here, it will be set in the core if it was > > + * enabled whole creating the slot. > > + */ > > + ctx->twophase = false; > > > > Typo, /whole/while. I think we don't need to initialize this variable > > here at all. > > > > 2. > > + /* If twophase is set on the slot at create time, then > > + * make sure the field in the context is also updated > > + */ > > + if (MyReplicationSlot->data.twophase) > > + { > > + ctx->twophase = true; > > + } > > + > > > > For multi-line comments, the first line of comment should be empty. > > Also, I think this is not the right place because the WALSender path > > needs to set it separately. I guess you can set it in > > CreateInitDecodingContext/CreateDecodingContext by doing something > > like > > > > ctx->twophase &= MyReplicationSlot->data.twophase > > Updated accordingly. > > > > > 3. I think we can support this option at the protocol level in a > > separate patch where we need to allow it via replication commands (say > > when we support it in CreateSubscription). Right now, there is nothing > > to test all the code you have added in repl_gram.y. > > > > Removed that. > > > > 4. I think we can expose this new option via pg_replication_slots. > > > > Done. Added, > v7-0002-Add-option-to-enable-two-phase-commits-in-pg_crea.patch adds twophase to pg_create_logical_replication_slot, I feel this option should be documented in src/sgml/func.sgml. Regards, Vignesh
Re: Different compression methods for FPI
On Sun, Feb 28, 2021 at 05:08:17PM -0600, Justin Pryzby wrote:
> Does this need to patch ./configure{,.ac} and Solution.pm for HAVE_LIBLZ4 ?
> I suggest to also include an 0002 patch (not for commit) which changes to use a
> non-default compression, to exercise this on the CIs - linux and bsd
> environments now have liblz4 installed, and for windows you can keep "undef".

Good idea. But are you sure that lz4 is available in the CF bot environments?

> Your patch looks fine, but I wonder if we should first implement a generic
> compression API. Then, the callers don't need to have a bunch of #ifdef.
> If someone calls zs_create() for an algorithm which isn't enabled at compile
> time, it throws the error at a lower level.

Yeah, the patch feels incomplete with its footprint in xloginsert.c for something that could be available in src/common/ like pglz, particularly knowing that you will need to have this information available to frontend tools, no?

> In some cases there's an argument that the compression ID should be globally
> defined constant, not like a dynamic "plugable" OID. That may be true for the
> libpq compression, WAL compression, and pg_dump, since there's separate
> "producer" and "consumer". I think if there's "pluggable" compression (like
> the TOAST patch), then it may have to map between the static or dynamic OIDs
> and the global compression ID.

This declaration should be frontend-aware. As presented, the patch would enforce zlib or lz4 on top of pglz, but I guess that it would be more user-friendly to complain when attempting to set up wal_compression_method (why not just using wal_compression?) to an unsupported option rather than doing it after-the-fact for the first full page generated once the new parameter value is loaded.

This stuff could just add tests in src/test/recovery/. See for example src/test/ssl and with_ssl to see how conditional tests happen depending on the configure options.

Not much a fan either of assuming that it is just fine to add one byte to XLogRecordBlockImageHeader for the compression_method field.
--
Michael
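For readers following the "generic compression API" idea above, a minimal sketch of what such a frontend-usable facade in src/common/ might look like is given below. Every name here (pg_compress_algorithm, pg_compress) is invented for illustration and is not part of the posted patch; the lz4 branch is guarded by the HAVE_LIBLZ4 symbol mentioned in the quoted review.

#include "common/pg_lzcompress.h"
#ifdef HAVE_LIBLZ4
#include <lz4.h>
#endif

/* Hypothetical sketch: one compression entry point, no per-caller #ifdefs. */
typedef enum pg_compress_algorithm
{
	PG_COMPRESS_PGLZ,
	PG_COMPRESS_LZ4
} pg_compress_algorithm;

/*
 * Compress srclen bytes of src into dst (of size dstlen).  Returns the
 * compressed size, or -1 if the algorithm is not available in this build,
 * so unsupported algorithms fail in exactly one place.
 */
static int
pg_compress(pg_compress_algorithm alg,
			const char *src, int srclen, char *dst, int dstlen)
{
	switch (alg)
	{
		case PG_COMPRESS_PGLZ:
			return pglz_compress(src, srclen, dst, PGLZ_strategy_default);
#ifdef HAVE_LIBLZ4
		case PG_COMPRESS_LZ4:
			return LZ4_compress_default(src, dst, srclen, dstlen);
#endif
		default:
			/* not built with support for this algorithm */
			return -1;
	}
}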
Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?
On Sat, Feb 27, 2021 at 4:14 PM Thomas Munro wrote: > Here's a new version. The condition variable patch 0001 fixes a bug: > CleanupProcSignalState() also needs to broadcast. The hunk that > allows the interrupt handlers to use CVs while you're already waiting > on a CV is now in a separate patch 0002. I'm thinking of going ahead > and committing those two. Done. Of course nothing in the tree reaches any of this code yet. It'll be exercised by cfbot in this thread and (I assume) Amul's "ALTER SYSTEM READ { ONLY | WRITE }" thread. > The 0003 patch to achieve $SUBJECT needs > more discussion. Rebased. The more I think about it, the more I think that this approach is good enough for an initial solution to the problem. It only affects Windows, dropping tablespaces is hopefully rare, and it's currently broken on that OS. That said, it's complex enough, and I guess more to the point, enough of a compromise, that I'm hoping to get some explicit consensus about that. A better solution would probably have to be based on the sinval queue, somehow. Perhaps with a new theory or rule making it safe to process at every CFI(), or by deciding that we're prepared have the operation wait arbitrarily long until backends reach a point where it is known to be safe (probably near ProcessClientReadInterrupt()'s call to ProcessCatchupInterrupt()), or by inventing a new kind of lightweight "sinval peek" that is safe to run at every CFI() point, being based on reading (but not consuming!) the sinval queue and performing just vfd-close of referenced smgr relations in this case. The more I think about all that complexity for a super rare event on only one OS, the more I want to just do it the stupid way and close all the fds. Robert opined similarly in an off-list chat about this problem. From 5daf2244894bf5a399aa8022efbf71b5428335c7 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Mon, 1 Mar 2021 17:14:11 +1300 Subject: [PATCH v5] Use a global barrier to fix DROP TABLESPACE on Windows. Previously, it was possible for DROP TABLESPACE to fail because a table had been dropped, but other backends still had open file handles. Windows won't allow a directory containing unlinked-but-still-open files to be unlinked. Tackle this problem by forcing all backends to close all fds. No change for Unix systems, which didn't suffer from the problem, but simulate the need to do it in assertion builds just to exercise the code. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/ca+hukgldemy2gbm80kz20gte6hnvwoere8kwcjk6-u56ost...@mail.gmail.com --- src/backend/commands/tablespace.c| 22 +++--- src/backend/storage/ipc/procsignal.c | 22 ++ src/backend/storage/smgr/md.c| 6 ++ src/backend/storage/smgr/smgr.c | 12 src/include/storage/md.h | 1 + src/include/storage/procsignal.h | 7 +-- src/include/storage/smgr.h | 1 + 7 files changed, 42 insertions(+), 29 deletions(-) diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c index 69ea155d50..8853f1b658 100644 --- a/src/backend/commands/tablespace.c +++ b/src/backend/commands/tablespace.c @@ -520,15 +520,23 @@ DropTableSpace(DropTableSpaceStmt *stmt) * but we can't tell them apart from important data files that we * mustn't delete. So instead, we force a checkpoint which will clean * out any lingering files, and try again. - * - * XXX On Windows, an unlinked file persists in the directory listing - * until no process retains an open handle for the file. 
The DDL - * commands that schedule files for unlink send invalidation messages - * directing other PostgreSQL processes to close the files. DROP - * TABLESPACE should not give up on the tablespace becoming empty - * until all relevant invalidation processing is complete. */ RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT); + /* + * On Windows, an unlinked file persists in the directory listing until + * no process retains an open handle for the file. The DDL + * commands that schedule files for unlink send invalidation messages + * directing other PostgreSQL processes to close the files, but nothing + * guarantees they'll be processed in time. So, we'll also use a + * global barrier to ask all backends to close all files, and wait + * until they're finished. + */ +#if defined(WIN32) || defined(USE_ASSERT_CHECKING) + LWLockRelease(TablespaceCreateLock); + WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE)); + LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE); +#endif + /* And now try again. */ if (!destroy_tablespace_directories(tablespaceoid, false)) { /* Still not empty, the files must be important then */ diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c index 8e5ee49fbd..236f6edd60 100644 --- a/src/backend/storage/ipc/procsignal.c
Re: doc review for v14
On Mon, Mar 01, 2021 at 01:11:10PM +0900, Michael Paquier wrote: > On Sun, Feb 28, 2021 at 06:46:47PM -0600, Justin Pryzby wrote: > > It looks like you applied 0010...but I agree that it's not an improvement. > > It > > appears that's something I intended to go back and revisit myself. > > The rest of the patch looks right, to me. > > Oops. This was not intended. > > > I'm suggesting to either revert that part, or apply these more polished > > changes > > in 0002. > > I would just group both things together. Monday helping, I can see > that the new wording is better on a couple of points after doing a > diff of wal.sgml with c82d59d6: > - "checksum protected" in the first sentence is weird, so I agree that > using "By default, data pages are not protected by checksums" is an > improvement. > - "assigned" is indeed a bit strange, "includes" is an improvement, > and I would tend to not use a passive form here. +1 > - "to recover from corrupt data" is redundant with "to recover data" > so the second one should be removed. My take is to use "page > corruptions" instead of "corrupt data", which should be corrupted data > to be grammatically correct. > - Checksums verification is normally ENABLED when the cluster is > initialized using + Checksums are normally enabled when the cluster is initialized using - When attempting to recover from corrupt data, it may be necessary to > bypass > - the checksum protection. To do this, temporarily set the configuration > - parameter . > + When attempting to recover from page corruptions, it may be necessary to > + bypass the checksum protection. To do this, temporarily set the > + configuration parameter . "page corruptions" is wrong .. you could say "corrupt pages" -- Justin
Re: doc review for v14
On Sun, Feb 28, 2021 at 06:46:47PM -0600, Justin Pryzby wrote: > It looks like you applied 0010...but I agree that it's not an improvement. It > appears that's something I intended to go back and revisit myself. > The rest of the patch looks right, to me. Oops. This was not intended. > I'm suggesting to either revert that part, or apply these more polished > changes > in 0002. I would just group both things together. Monday helping, I can see that the new wording is better on a couple of points after doing a diff of wal.sgml with c82d59d6: - "checksum protected" in the first sentence is weird, so I agree that using "By default, data pages are not protected by checksums" is an improvement. - "assigned" is indeed a bit strange, "includes" is an improvement, and I would tend to not use a passive form here. - "to recover from corrupt data" is redundant with "to recover data" so the second one should be removed. My take is to use "page corruptions" instead of "corrupt data", which should be corrupted data to be grammatically correct. This gives the attached, that has as result to not change the second paragraph compared to the pre-c82d59d6 version of the area. -- Michael diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index 02f576a1a9..f75527f764 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -237,19 +237,19 @@ - By default, data pages are not protected by checksums, but this can optionally be - enabled for a cluster. When enabled, each data page will be ASSIGNED a - checksum that is updated when the page is written and verified each time + By default, data pages are not protected by checksums, but this can + optionally be enabled for a cluster. When enabled, each data page includes + a checksum that is updated when the page is written and verified each time the page is read. Only data pages are protected by checksums; internal data structures and temporary files are not. - Checksums verification is normally ENABLED when the cluster is initialized using initdb. They can also be enabled or disabled at a later time as an offline operation. Data checksums are enabled or disabled at the full cluster - level, and cannot be specified for individual databases or tables. + level, and cannot be specified individually for databases or tables. @@ -260,9 +260,9 @@ - When attempting to recover from corrupt data, it may be necessary to bypass - the checksum protection. To do this, temporarily set the configuration - parameter . + When attempting to recover from page corruptions, it may be necessary to + bypass the checksum protection. To do this, temporarily set the + configuration parameter . signature.asc Description: PGP signature
Re: 64-bit XIDs in deleted nbtree pages
On Fri, Feb 26, 2021 at 9:58 AM Peter Geoghegan wrote: > > If we don't want btvacuumcleanup() to collect index statistics, we can > > remove vacuum_cleanup_index_scale_factor (at least from btree > > perspectives), as you mentioned. One thing that may be worth > > mentioning is that the difference between the index statistics taken > > by ANALYZE and btvacuumcleanup() is that the former statistics is > > always an estimation. That’s calculated by compute_index_stats() > > whereas the latter uses the result of an index scan. If > > btvacuumcleanup() doesn’t scan the index and always returns NULL, it > > would become hard to get accurate index statistics, for example in a > > static table case. I've not checked which cases index statistics > > calculated by compute_index_stats() are inaccurate, though. > > The historic context makes it easier to understand what to do here -- > it makes it clear that amvacuumcleanup() routine does not (or should > not) do any index scan when the index hasn't (and won't) be modified > by the current VACUUM operation. The relevant sgml doc sentence I > quoted to you recently ("It is OK to return NULL if the index was not > changed at all during the VACUUM operation...") was added by commit > e57345975cf in 2006. Much of the relevant 2006 discussion is here, > FWIW: > > https://www.postgresql.org/message-id/flat/26433.1146598265%40sss.pgh.pa.us#862ee11c24da63d0282e0025abbad19c > > So now we have the formal rules for index AMs, as well as background > information about what various hackers (mostly Tom) were considering > when the rules were written. > > > According to the doc, if amvacuumcleanup/btvacuumcleanup returns NULL, > > it means the index is not changed at all. So do_analyze_rel() executed > > by VACUUM ANALYZE also doesn't need to update the index statistics > > even when amvacuumcleanup/btvacuumcleanup returns NULL. No? > > Consider hashvacuumcleanup() -- here it is in full (it hasn't really > changed since 2006, when it was updated by that same commit I cited): > > /* > * Post-VACUUM cleanup. > * > * Result: a palloc'd struct containing statistical info for VACUUM displays. > */ > IndexBulkDeleteResult * > hashvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats) > { > Relationrel = info->index; > BlockNumber num_pages; > > /* If hashbulkdelete wasn't called, return NULL signifying no change */ > /* Note: this covers the analyze_only case too */ > if (stats == NULL) > return NULL; > > /* update statistics */ > num_pages = RelationGetNumberOfBlocks(rel); > stats->num_pages = num_pages; > > return stats; > } > > Clearly hashvacuumcleanup() was considered by Tom when he revised the > documentation in 2006. Here are some observations about > hashvacuumcleanup() that seem relevant now: > > * There is no "analyze_only" handling, just like nbtree. > > "analyze_only" is only used by GIN, even now, 15+ years after it was > added. GIN uses it to make autovacuum workers (never VACUUM outside of > an AV worker) do pending list insertions for ANALYZE -- just to make > it happen more often. This is a niche thing -- clearly we don't have > to care about it in nbtree, even if we make btvacuumcleanup() (almost) > always return NULL when there was no btbulkdelete() call. > > * num_pages (which will become pg_class.relpages for the index) is not > set when we return NULL -- hashvacuumcleanup() assumes that ANALYZE > will get to it eventually in the case where VACUUM does no real work > (when it just returns NULL). 
> > * We also use RelationGetNumberOfBlocks() to set pg_class.relpages for > index relations during ANALYZE -- it's called when we call > vac_update_relstats() (I quoted this do_analyze_rel() code to you > directly in a recent email). > > * In general, pg_class.relpages isn't an estimate (because we use > RelationGetNumberOfBlocks(), both in the VACUUM-updates case and the > ANALYZE-updates case) -- only pg_class.reltuples is truly an estimate > during ANALYZE, and so getting a "true count" seems to have only > limited practical importance. > > I think that this sets a precedent in support of my view that we can > simply get rid of vacuum_cleanup_index_scale_factor without any > special effort to maintain pg_class.reltuples. As I said before, we > can safely make btvacuumcleanup() just like hashvacuumcleanup(), > except when there are known deleted-but-not-recycled pages, where a > full index scan really is necessary for reasons that are not related > to statistics at all (of course we still need the *logic* that was > added to nbtree by the vacuum_cleanup_index_scale_factor commit -- > that is clearly necessary). My guess is that Tom would have made > btvacuumcleanup() look identical to hashvacuumcleanup() in 2006 if > nbtree didn't have page deletion to consider -- but that had to be > considered. Makes sense. If getting a true pg_class.reltuples is not important in practice, it seems not to need btvacuumcle
Re: Update docs of logical replication for commit ce0fdbfe97.
On Fri, Feb 26, 2021 at 8:29 AM Amit Kapila wrote: > > We forgot to update the logical replication configuration settings > page in commit ce0fdbfe97. After commit ce0fdbfe97, table > synchronization workers also started using replication origins to > track the progress and the same should be reflected in docs. > > Attached patch for the same. > Pushed! -- With Regards, Amit Kapila.
Re: Reducing WaitEventSet syscall churn
On Sat, Feb 27, 2021 at 2:48 PM Thomas Munro wrote: > Here's the alternative patch set, with no change to existing error > message behaviour. I'm going to commit this version and close this CF > item soon if there are no objections. Pushed. That leaves just walreceiver and postgres_fdw in need of WaitEventSet improvements along these lines. The raw ingredients for that were present in earlier patches, and I think Horiguchi-san had the right idea: we should use the libpq event system to adjust our WES as appropriate when it changes the socket underneath us. I will leave this CF entry open a bit longer in case he would like to post an updated version of that part (considering Tom's feedback[1]). If not, we can close this and try again in the next cycle. [1] https://www.postgresql.org/message-id/2446176.1594495351%40sss.pgh.pa.us
Re: locking [user] catalog tables vs 2pc vs logical rep
On Tue, Feb 23, 2021 at 12:00 PM Amit Kapila wrote: > > On Tue, Feb 23, 2021 at 9:33 AM Andres Freund wrote: > > > > On 2021-02-23 09:24:18 +0530, Amit Kapila wrote: > > > Okay, so is it sufficient to add comments in code, or do we want to > > > add something in docs? I am not completely sure if we need to add in > > > docs till we have core-implementation of prepare waiting to get > > > logically replicated. > > > > There's plenty users of logical decoding that aren't going through the > > normal replication mechanism - so they can hit this. So I think it needs > > to be documented somewhere. > > > > As per discussion, the attached patch updates both docs and comments > in the code. > I have pushed this patch. -- With Regards, Amit Kapila.
Re: alter table set TABLE ACCESS METHOD
On Mon, Mar 01, 2021 at 11:16:36AM +0900, Michael Paquier wrote: > On Sun, Feb 28, 2021 at 04:25:30PM -0600, Justin Pryzby wrote: > > I called this "set TABLE access method" rather than just "set access method" > > for the reasons given on the LIKE thread: > > https://www.postgresql.org/message-id/20210119210331.gn8...@telsasoft.com > > ALTER TABLE applies to a table (or perhaps a sequence, still..), so > that sounds a bit weird to me to add again the keyword "TABLE" for > that. I don't know if you're following along the toast compression patch - Alvaro had suggested that instead of making a new catalog just for a handful of tuples for compression types, to instead store them in pg_am, with a new am_type='c'. So I proposed a patch for | CREATE TABLE .. (LIKE other INCLUDING ACCESS METHOD), but then decided that it should say INCLUDING *TABLE* ACCESS METHOD, since otherwise it was somewhat strange that it didn't include the compression access methods (which have a separate LIKE option). > > I've tested this with zedstore, but the lack of 2nd, in-core table AM limits > > testing possibilities. It also limits at least my own ability to reason > > about > > the AM API. > > > > For example, I was surprised to hear that toast is a concept that's > > intended to be applied to AMs other than heap. > > https://www.postgresql.org/message-id/flat/CA%2BTgmoYTuT4sRtviMLOOO%2B79VnDCpCNyy9rK6UZFb7KEAVt21w%40mail.gmail.com > > What kind of advanced testing do you have in mind? It sounds pretty > much enough to me for a basic patch to use the trick with heap2 as > your patch does. That would be enough to be sure that the rewrite > happens and that data is still around. The issue is that the toast data can be compressed, so it needs to be detoasted before pushing it to the other AM, which otherwise may not know how to decompress it. If it's not detoasted, this works with "COMPRESSION lz4" (since zedstore happens to know how to decompress it) but that's just an accident, and it fails with when using pglz. That's got to do with 2 non-core patches - when core has only heap, then I don't see how something like this can be exercized. postgres=# DROP TABLE t; CREATE TABLE t (a TEXT COMPRESSION pglz) USING heap; INSERT INTO t SELECT repeat(a::text,) FROM generate_series(1,99)a; ALTER TABLE t SET ACCESS METHOD zedstore; SELECT * FROM t; DROP TABLE CREATE TABLE INSERT 0 99 ALTER TABLE 2021-02-28 20:50:42.653 CST client backend[14958] psql ERROR: compressed lz4 data is corrupt 2021-02-28 20:50:42.653 CST client backend[14958] psql STATEMENT: SELECT * FROM t; -- Justin
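A side note on the detoasting issue described above: the rewrite would have to flatten externally-stored or compressed values before handing tuples to the new AM. The following is only a hypothetical sketch of that step, not code from the posted patch; it assumes the usual values/isnull arrays and tuple descriptor of a table-rewrite loop.

/*
 * Hypothetical sketch: inside the per-tuple loop of the rewrite, flatten
 * toasted or compressed varlena values so the destination AM never sees
 * compressed data it may not know how to decompress.
 */
for (int i = 0; i < newTupDesc->natts; i++)
{
	Form_pg_attribute att = TupleDescAttr(newTupDesc, i);

	if (att->attisdropped || isnull[i] || att->attlen != -1)
		continue;			/* only varlena columns can be toasted */

	if (VARATT_IS_EXTENDED(DatumGetPointer(values[i])))
		values[i] = PointerGetDatum(
			detoast_attr((struct varlena *) DatumGetPointer(values[i])));
}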
Re: [PATCH] pgbench: Remove ecnt, a member variable of CState
2021-02-28 08:06, Michael Paquier wrote: On Fri, Feb 26, 2021 at 04:36:41PM -0300, Alvaro Herrera wrote: +1 Thanks, done. -- Michael Thanks for reviewing and committing this patch! Regards -- Kota Miyake
Re: [BUG] segfault during delete
On Sun, Feb 28, 2021 at 6:14 AM Álvaro Herrera wrote: > Thanks Amit for working on this fix! It seems correct to me, so I pushed it > with trivial changes. Thanks Bertrand for reporting the problem. Thanks Alvaro. > In addition to backpatching the code fix to pg12, I backpatched the test case > to pg11. It worked fine for me (with no code changes), but it seems good to > have it there just to make sure the buildfarm agrees with us on this. Ah, good call. -- Amit Langote EDB: http://www.enterprisedb.com
Re: alter table set TABLE ACCESS METHOD
On Sun, Feb 28, 2021 at 04:25:30PM -0600, Justin Pryzby wrote: > I called this "set TABLE access method" rather than just "set access method" > for the reasons given on the LIKE thread: > https://www.postgresql.org/message-id/20210119210331.gn8...@telsasoft.com ALTER TABLE applies to a table (or perhaps a sequence, still..), so that sounds a bit weird to me to add again the keyword "TABLE" for that. > I've tested this with zedstore, but the lack of 2nd, in-core table AM limits > testing possibilities. It also limits at least my own ability to reason about > the AM API. > > For example, I was surprised to hear that toast is a concept that's > intended to be applied to AMs other than heap. > https://www.postgresql.org/message-id/flat/CA%2BTgmoYTuT4sRtviMLOOO%2B79VnDCpCNyy9rK6UZFb7KEAVt21w%40mail.gmail.com What kind of advanced testing do you have in mind? It sounds pretty much enough to me for a basic patch to use the trick with heap2 as your patch does. That would be enough to be sure that the rewrite happens and that data is still around. If you are worrying about recovery, a TAP test with an immediate stop of the server could equally be used to check after the FPWs produced for the new relfilenode during the rewrite. -- Michael signature.asc Description: PGP signature
Re: repeated decoding of prepared transactions
On Sat, Feb 27, 2021 at 11:06 PM Amit Kapila wrote: > Few comments on 0002 patch: > = > 1. > + > + /* > + * Disable two-phase here, it will be set in the core if it was > + * enabled whole creating the slot. > + */ > + ctx->twophase = false; > > Typo, /whole/while. I think we don't need to initialize this variable > here at all. > > 2. > + /* If twophase is set on the slot at create time, then > + * make sure the field in the context is also updated > + */ > + if (MyReplicationSlot->data.twophase) > + { > + ctx->twophase = true; > + } > + > > For multi-line comments, the first line of comment should be empty. > Also, I think this is not the right place because the WALSender path > needs to set it separately. I guess you can set it in > CreateInitDecodingContext/CreateDecodingContext by doing something > like > > ctx->twophase &= MyReplicationSlot->data.twophase Updated accordingly. > > 3. I think we can support this option at the protocol level in a > separate patch where we need to allow it via replication commands (say > when we support it in CreateSubscription). Right now, there is nothing > to test all the code you have added in repl_gram.y. > Removed that. > 4. I think we can expose this new option via pg_replication_slots. > Done. Added, regards, Ajin Cherian Fujitsu Australia v7-0001-Avoid-repeated-decoding-of-prepared-transactions-.patch Description: Binary data v7-0002-Add-option-to-enable-two-phase-commits-in-pg_crea.patch Description: Binary data
Re: [PoC] Non-volatile WAL buffer
Hi Sawada, I am relieved to hear that the performance problem was solved. And I added a tip about PMEM namespace and partitioning in PG wiki[1]. Regards, [1] https://wiki.postgresql.org/wiki/Persistent_Memory_for_WAL#Configure_and_verify_DAX_hugepage_faults -- Takashi Menjo
Re: Optimising latch signals
On Sat, Feb 27, 2021 at 12:04 AM Thomas Munro wrote: > I'm planning to commit this soon if there are no objections. Pushed, with the addition of an SFD_CLOEXEC flag for the signalfd. Time to watch the buildfarm to find out if my speculation about illumos is correct...
Regex back-reference semantics and performance
I noticed that some of the slowest cases in Joel's regex test corpus had issues with back-reference matching, and along the way to fixing that discovered what seems to me to be a bug in the engine's handling of back-references. To wit, what should happen if a back-reference is to a subexpression that contains constraints? A simple example is

SELECT regexp_match('foof', '(^f)o*\1');

To my mind, the back reference is only chartered to match the literal characters matched by the referenced subexpression. Here, since that expression matches "f", the backref should too, and thus we should get a match to "foof". Perl gives that answer, anyway; but our existing code says there's no match. That's because it effectively copies the constraints within the referenced subexpression, in addition to making the data comparison. The "^" can't match where the second "f" is, so we lose.

0001 attached fixes this by stripping constraint arcs out of the NFA that's applied to the backref subre tree node.

Now, as to the performance issue ... if you load up the data in "trouble.sql" attached, and do

SELECT regexp_matches(subject, pattern, 'g') FROM trouble;

you'll be waiting a good long time, even with our recent improvements. (Up to now I hadn't tried the 'g' flag with Joel's test cases, so I hadn't noticed what a problem this particular example has got.) The reason for the issue is that the pattern is

(["'`])(?:\\\1|.)*?\1

and the subject string has a mix of " and ' quote characters. As currently implemented, our engine tries to resolve the match at any substring ending in either " or ', since the NFA created for the backref can match either. That leads to O(N^2) time wasted trying to verify wrong matches.

I realized that this could be improved by replacing the NFA/DFA match step for a backref node with a string literal match, if the backreference match string is already known at the time we try to apply the NFA/DFA. That's not a panacea, but it helps in most simple cases including this one.

The way to visualize what is happening is that we have a tree of binary concatenation nodes:

            concat
           /      \
      capture      concat
                  /      \
         other stuff      backref

Each concat node performs fast NFA/DFA checks on both its children before recursing to the children to make slow exact checks. When we recurse to the capture node, it records the actual match substring, so now we know whether the capture is " or '. Then, when we recurse to the lower concat node, the capture is available while it makes NFA/DFA checks for its two children; so it will never mistakenly guess that its second child matches a substring it doesn't, and thus it won't try to do exact checking of the "other stuff" on a match that's bound to fail later.

So this works as long as the tree of concat nodes is right-deep, which fortunately is the normal case. It won't help if we have a left-deep tree:

            concat
           /      \
      concat       backref
     /      \
capture      other stuff

because the upper concat node will do its NFA/DFA check on the backref node before recursing to its left child, where the capture will occur. But to get that tree, you have to have written extra parentheses:

((capture)otherstuff)\2

I don't see a way to improve that situation, unless perhaps with massive rejiggering of the regex execution engine. But 0002 attached does help a lot in the simple case.

(BTW, the connection between 0001 and 0002 is that if we want to keep the existing semantics that a backref enforces constraints, 0002 doesn't work, since it won't do that.)
regards, tom lane diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 08f08322ca..6c189bfed2 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -6166,6 +6166,9 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo; The subexpression must entirely precede the back reference in the RE. Subexpressions are numbered in the order of their leading parentheses. Non-capturing parentheses do not define subexpressions. +The back reference considers only the string characters matched by the +referenced subexpression, not any constraints contained in it. For +example, (^\d)\1 will match 22. diff --git a/src/backend/regex/regc_nfa.c b/src/backend/regex/regc_nfa.c index a10a346e8f..77b860cb0f 100644 --- a/src/backend/regex/regc_nfa.c +++ b/src/backend/regex/regc_nfa.c @@ -1382,6 +1382,77 @@ duptraverse(struct nfa *nfa, } } +/* + * removeconstraints - remove any constraints in an NFA + * + * Constraint arcs are replaced by empty arcs, essentially treating all + * constraints as automatically satisfied. + */ +static void +removeconstraints(struct nfa *nfa, + struct state *start, /* process subNFA starting
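For illustration, a minimal SQL sketch of the two cases discussed above. The expected result of the first query assumes the 0001 semantics change (the backref re-matches only the captured characters, as Perl does); the unpatched engine reports no match. The second query only shows the shape of the quoted-string pattern from trouble.sql on a toy input; the O(N^2) behaviour only becomes visible on large subject strings.

-- Backref to a subexpression containing a constraint (^).
-- With the proposed semantics this matches and returns {f};
-- the pre-patch engine returns no match.
SELECT regexp_match('foof', '(^f)o*\1');

-- The quoted-string pattern with mixed quote characters: one row per
-- quoted substring, capturing the quote character that was matched.
SELECT regexp_matches($$he said "hi", she said 'bye'$$,
                      $$(["'`])(?:\\\1|.)*?\1$$, 'g');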
Re: doc review for v14
On Wed, Feb 24, 2021 at 04:18:51PM +0900, Michael Paquier wrote: > This leaves 0003, 0004, 0005, 0010, 0012, 0018, 0020 and 0021 as these > did not look like improvements after review. It looks like you applied 0010...but I agree that it's not an improvement. It appears that's something I intended to go back and revisit myself. The rest of the patch looks right, to me. Subject: [PATCH 10/21] doc review for checksum docs doc/src/sgml/wal.sgml | 18 +- I'm suggesting to either revert that part, or apply these more polished changes in 0002. -- Justin >From 03d014809720d90ba43c780cb34fc82dd7173c8d Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Fri, 26 Feb 2021 19:42:54 -0600 Subject: [PATCH 1/2] Partially revert bcf2667bf62d72faced64cb60ffbd2e599a0ebe8 --- doc/src/sgml/wal.sgml | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index 02f576a1a9..66de1ee2f8 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -237,19 +237,19 @@ - By default, data pages are not protected by checksums, but this can optionally be - enabled for a cluster. When enabled, each data page will be ASSIGNED a - checksum that is updated when the page is written and verified each time - the page is read. Only data pages are protected by checksums; internal data + Data pages are not checksum protected by default, but this can optionally be + enabled for a cluster. When enabled, each data page will be assigned a + checksum that is updated when the page is written and verified every time + the page is read. Only data pages are protected by checksums, internal data structures and temporary files are not. - Checksums verification is normally ENABLED when the cluster is initialized using initdb. They can also be enabled or disabled at a later time as an offline operation. Data checksums are enabled or disabled at the full cluster - level, and cannot be specified for individual databases or tables. + level, and cannot be specified individually for databases or tables. @@ -260,9 +260,9 @@ - When attempting to recover from corrupt data, it may be necessary to bypass - the checksum protection. To do this, temporarily set the configuration - parameter . + When attempting to recover from corrupt data it may be necessary to bypass + the checksum protection in order to recover data. To do this, temporarily + set the configuration parameter . -- 2.17.0 >From bed0d532bc2415cea1e2da54b6842053ecc27e04 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Fri, 26 Feb 2021 19:56:25 -0600 Subject: [PATCH 2/2] Second attempt to improve checksum docs 'assigned' is odd Each not every Colon not comma Rephase 'Checksums are normally enabled...', since this is easy to mis-interpret and sounds like they're enabled by default Do not repeat: 'to recover...data' --- doc/src/sgml/wal.sgml | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index 66de1ee2f8..347be030b2 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -238,18 +238,18 @@ Data pages are not checksum protected by default, but this can optionally be - enabled for a cluster. When enabled, each data page will be assigned a - checksum that is updated when the page is written and verified every time - the page is read. Only data pages are protected by checksums, internal data + enabled for a cluster. When enabled, a checksum is included in each data page + when the page is written and verified each time + the page is read. 
Only data pages are protected by checksums; internal data structures and temporary files are not. - Checksums are normally enabled when the cluster is initialized using initdb. They can also be enabled or disabled at a later time as an offline operation. Data checksums are enabled or disabled at the full cluster - level, and cannot be specified individually for databases or tables. + level, and cannot be specified for individual databases or tables. @@ -260,8 +260,8 @@ - When attempting to recover from corrupt data it may be necessary to bypass - the checksum protection in order to recover data. To do this, temporarily + When attempting to recover from corrupt data, it may be necessary to bypass + the checksum protection. To do this, temporarily set the configuration parameter . -- 2.17.0
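As a quick aside for readers following this wording: the state being documented can be checked directly, and the bypass parameter the last hunk refers to is ignore_checksum_failure. A minimal illustration:

-- Is this cluster running with data checksums?
SHOW data_checksums;

-- Temporarily bypass checksum verification while salvaging corrupt data
-- (superuser-only setting; turn it back off once the data is recovered)
SET ignore_checksum_failure = on;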
Re: Different compression methods for FPI
On Sat, Feb 27, 2021 at 12:43:52PM +0500, Andrey Borodin wrote:
> So I think it worth to propose a patch to make wal_compression_method =
> {"pglz", "lz4", "zlib"}. Probably, "zstd" can be added to the list.
> Attached is a draft taking CompressionId from "custom compression methods"
> patch and adding zlib to it.

Thanks for submitting it. Does this need to patch ./configure{,.ac} and Solution.pm for HAVE_LIBLZ4?

I suggest also including an 0002 patch (not for commit) which switches to a non-default compression method, to exercise this on the CIs - the linux and bsd environments now have liblz4 installed, and for windows you can keep "undef".

Daniil had a patch to add src/common/z_stream.c:
https://github.com/postgrespro/libpq_compression/blob/0a9c70d582cd4b1ef60ff39f8d535f6e800bd7d4/src/common/z_stream.c
https://www.postgresql.org/message-id/470e411e-681d-46a2-a1e9-6de11b5f5...@yandex-team.ru

Your patch looks fine, but I wonder if we should first implement a generic compression API. Then the callers don't need to have a bunch of #ifdef: if someone calls zs_create() for an algorithm which isn't enabled at compile time, the error is thrown at a lower level. That also allows a central place to handle options (compression level, and things like zstd --long, --rsyncable, etc).

In some cases there's an argument that the compression ID should be a globally defined constant, not a dynamic "pluggable" OID. That may be true for the libpq compression, WAL compression, and pg_dump, since there are separate "producer" and "consumer" sides. I think if there's "pluggable" compression (like the TOAST patch), then it may have to map between the static or dynamic OIDs and the global compression ID.

-- 
Justin
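To make the proposal concrete, a hypothetical sketch of how the GUC from the draft patch might be used. wal_compression_method does not exist in any released PostgreSQL (today there is only the boolean wal_compression), so both the parameter name and the accepted values are assumptions taken from the quoted proposal:

-- Hypothetical: select the FPI compression algorithm proposed in the draft patch
ALTER SYSTEM SET wal_compression_method = 'lz4';
SELECT pg_reload_conf();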
Re: Remove latch.c workaround for Linux < 2.6.27
On Sat, Feb 27, 2021 at 9:30 PM Thomas Munro wrote: > On Sat, Feb 27, 2021 at 9:01 PM Heikki Linnakangas wrote: > > What happens if you try to try to compile or run on such an ancient kernel? > > Does it fall back to something else? Can you still make it work with > > different configure options? > > > > I'm just curious, not objecting. > > With this patch, I guess you'd have to define WAIT_USE_POLL. I > suppose we could fall back to that automatically if EPOLL_CLOEXEC > isn't defined, if anyone thinks that's worth bothering with. I thought about doing: /* don't overwrite manual choice */ -#elif defined(HAVE_SYS_EPOLL_H) +#elif defined(HAVE_SYS_EPOLL_H) && defined(EPOLL_CLOEXEC) #define WAIT_USE_EPOLL ... but on reflection, I don't think we should expend energy on a desupported OS vintage that isn't present in our build farm, at least not without a reasonable field complaint; I wouldn't even know if it worked. So, pushed without that.
alter table set TABLE ACCESS METHOD
I thought this was a good idea, but didn't hear back when I raised it before. I was motivated to finally look into it by Dilip's toast compression patch, which can also do a table rewrite when changing a column's toast compression.

I called this "set TABLE access method" rather than just "set access method" for the reasons given on the LIKE thread:
https://www.postgresql.org/message-id/20210119210331.gn8...@telsasoft.com

I've tested this with zedstore, but the lack of a second, in-core table AM limits testing possibilities. It also limits at least my own ability to reason about the AM API. For example, I was surprised to hear that toast is a concept that's intended to be applied to AMs other than heap.
https://www.postgresql.org/message-id/flat/CA%2BTgmoYTuT4sRtviMLOOO%2B79VnDCpCNyy9rK6UZFb7KEAVt21w%40mail.gmail.com

I plan to add this to the CF - it seems like a minor addition, but may not be for v14.

https://www.postgresql.org/message-id/20190818193533.gl11...@telsasoft.com
On Sun, Aug 18, 2019 at 02:35:33PM -0500, Justin Pryzby wrote:
> . What do you think about pg_restore --no-tableam; similar to
>   --no-tablespaces, it would allow restoring a table to a different AM:
>   PGOPTIONS='-c default_table_access_method=zedstore' pg_restore --no-tableam ./pg_dump.dat -d postgres
>   Otherwise, the dump says "SET default_table_access_method=heap", which
>   overrides any value from PGOPTIONS and precludes restoring to new AM.
...
> . it'd be nice if there was an ALTER TABLE SET ACCESS METHOD, to allow
>   migrating data. Otherwise I think the alternative is:
>   begin; lock t;
>   CREATE TABLE new_t LIKE (t INCLUDING ALL) USING (zedstore);
>   INSERT INTO new_t SELECT * FROM t;
>   for index; do CREATE INDEX...; done
>   DROP t; RENAME new_t (and all its indices). attach/inherit, etc.
>   commit;
>
> . Speaking of which, I think LIKE needs a new option for ACCESS METHOD, which
>   is otherwise lost.
>From b13dac0ffbe108c45c97095b69218e76bf02799f Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sat, 27 Feb 2021 20:40:54 -0600 Subject: [PATCH] ALTER TABLE SET ACCESS METHOD This does a tuple-level rewrite to the new AM --- src/backend/catalog/heap.c | 1 + src/backend/commands/cluster.c | 17 +++-- src/backend/commands/matview.c | 1 + src/backend/commands/tablecmds.c| 86 ++--- src/backend/parser/gram.y | 10 +++ src/include/commands/cluster.h | 2 +- src/include/commands/event_trigger.h| 1 + src/include/nodes/parsenodes.h | 1 + src/test/regress/expected/create_am.out | 10 +++ src/test/regress/sql/create_am.sql | 5 ++ 10 files changed, 119 insertions(+), 15 deletions(-) diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 9abc4a1f55..9d5a9240e9 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -1117,6 +1117,7 @@ AddNewRelationType(const char *typeName, * reltypeid: OID to assign to rel's rowtype, or InvalidOid to select one * reloftypeid: if a typed table, OID of underlying type; else InvalidOid * ownerid: OID of new rel's owner + * accessmtd: OID of new rel's access method * tupdesc: tuple descriptor (source of column definitions) * cooked_constraints: list of precooked check constraints and defaults * relkind: relkind for new rel diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 096a06f7b3..426a5b83ba 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -577,6 +577,7 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose) { Oid tableOid = RelationGetRelid(OldHeap); Oid tableSpace = OldHeap->rd_rel->reltablespace; + Oid accessMethod = OldHeap->rd_rel->relam; Oid OIDNewHeap; char relpersistence; bool is_system_catalog; @@ -598,6 +599,7 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose) /* Create the transient table that will receive the re-ordered data */ OIDNewHeap = make_new_heap(tableOid, tableSpace, relpersistence, + accessMethod, AccessExclusiveLock); /* Copy the heap data into the new table in the desired order */ @@ -619,15 +621,15 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose) * Create the transient table that will be filled with new data during * CLUSTER, ALTER TABLE, and similar operations. The transient table * duplicates the logical structure of the OldHeap, but is placed in - * NewTableSpace which might be different from OldHeap's. Also, it's built - * with the specified persistence, which might differ from the original's. + * NewTableSpace/accessMethod/persistence, which might differ from those + * of the OldHeap. * * After this, the caller should load the new heap with transferred/modified * data, then call finish_heap_swap to complete the operation. */ Oid make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, char r
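To make the intended usage concrete, a short sketch assuming the patch above is applied. Since only heap ships in core, a second table AM is faked here by re-registering the heap handler under another name ("heap2" and table "t" are purely illustrative):

-- Register a second table AM (superuser); heap_tableam_handler is the in-core handler
CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;

-- Proposed syntax from the patch: rewrite table "t" into the other AM
ALTER TABLE t SET ACCESS METHOD heap2;

-- Confirm which AM the table now uses
SELECT a.amname
FROM pg_class c JOIN pg_am a ON a.oid = c.relam
WHERE c.relname = 't';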
Re: [PATCH] refactor ATExec{En,Dis}ableRowSecurity
Hi,

For 0002-Further-refactoring.patch, should there be an assertion inside ATExecSetRowSecurity() on the values for rls and force_rls? There could be 3 possible values: -1, 0 and 1.

Cheers

On Sun, Feb 28, 2021 at 1:19 PM Justin Pryzby wrote:
> tablecmds.c is 17k lines long, this makes it ~30 lines shorter.
>
[PATCH] refactor ATExec{En,Dis}ableRowSecurity
tablecmds.c is 17k lines long, this makes it ~30 lines shorter. >From 2e9500227d45142eb00e9e1ebee001642a834518 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sat, 27 Feb 2021 20:39:10 -0600 Subject: [PATCH 1/2] Refactor ATExec{En,Dis}ableRowSecurity in the style of ATExecForceNoForceRowSecurity commit 088c83363a11200f2225f279d4a5c6cc6f9db3d2 ALTER TABLE .. FORCE ROW LEVEL SECURITY commit 491c029dbc4206779cf659aa0ff986af7831d2ff Row-Level Security Policies (RLS) --- src/backend/commands/tablecmds.c | 34 +--- 1 file changed, 5 insertions(+), 29 deletions(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 7adeaedd0e..69e3184c86 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -526,8 +526,7 @@ static ObjectAddress ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKM static void ATExecDropOf(Relation rel, LOCKMODE lockmode); static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode); static void ATExecGenericOptions(Relation rel, List *options); -static void ATExecEnableRowSecurity(Relation rel); -static void ATExecDisableRowSecurity(Relation rel); +static void ATExecSetRowSecurity(Relation rel, bool rls); static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls); static void index_copy_data(Relation rel, RelFileNode newrnode); @@ -4865,10 +4864,10 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel, ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) cmd->def, lockmode); break; case AT_EnableRowSecurity: - ATExecEnableRowSecurity(rel); + ATExecSetRowSecurity(rel, true); break; case AT_DisableRowSecurity: - ATExecDisableRowSecurity(rel); + ATExecSetRowSecurity(rel, false); break; case AT_ForceRowSecurity: ATExecForceNoForceRowSecurity(rel, true); @@ -14883,30 +14882,7 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode * ALTER TABLE ENABLE/DISABLE ROW LEVEL SECURITY */ static void -ATExecEnableRowSecurity(Relation rel) -{ - Relation pg_class; - Oid relid; - HeapTuple tuple; - - relid = RelationGetRelid(rel); - - pg_class = table_open(RelationRelationId, RowExclusiveLock); - - tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid)); - - if (!HeapTupleIsValid(tuple)) - elog(ERROR, "cache lookup failed for relation %u", relid); - - ((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = true; - CatalogTupleUpdate(pg_class, &tuple->t_self, tuple); - - table_close(pg_class, RowExclusiveLock); - heap_freetuple(tuple); -} - -static void -ATExecDisableRowSecurity(Relation rel) +ATExecSetRowSecurity(Relation rel, bool rls) { Relation pg_class; Oid relid; @@ -14922,7 +14898,7 @@ ATExecDisableRowSecurity(Relation rel) if (!HeapTupleIsValid(tuple)) elog(ERROR, "cache lookup failed for relation %u", relid); - ((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = false; + ((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = rls; CatalogTupleUpdate(pg_class, &tuple->t_self, tuple); table_close(pg_class, RowExclusiveLock); -- 2.17.0 >From bf200702203eebe08011eaa007e93cae46a63ccb Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Sat, 27 Feb 2021 20:52:35 -0600 Subject: [PATCH 2/2] Further refactoring --- src/backend/commands/tablecmds.c | 46 +--- 1 file changed, 13 insertions(+), 33 deletions(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 69e3184c86..2e322e78ae 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -526,8 +526,7 @@ static ObjectAddress 
ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKM static void ATExecDropOf(Relation rel, LOCKMODE lockmode); static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode); static void ATExecGenericOptions(Relation rel, List *options); -static void ATExecSetRowSecurity(Relation rel, bool rls); -static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls); +static void ATExecSetRowSecurity(Relation rel, int rls, int force_rls); static void index_copy_data(Relation rel, RelFileNode newrnode); static const char *storage_name(char c); @@ -4864,16 +4863,16 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel, ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) cmd->def, lockmode); break; case AT_EnableRowSecurity: - ATExecSetRowSecurity(rel, true); + ATExecSetRowSecurity(rel, true, -1); break; case AT_DisableRowSecurity: - ATExecSetRowSecurity(rel, false); + ATExecSetRowSecurity(rel, false, -1); break; case AT_ForceRowSecurity: - ATExecForceNoForceRowSecurity(rel, true); + ATExecSetRowSecurity(rel, -1, true); break; case AT_NoForceRowSecurity: - ATExecForceNoForceRowSecurity(rel, false); + ATExecSetRowSecurity(rel, -1, false); break; case AT_GenericOptions:
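For context, these are the four ALTER TABLE forms that funnel into the single ATExecSetRowSecurity(rel, rls, force_rls) entry point after 0002, with -1 meaning "leave this field unchanged" (the table name is just an example):

ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;    -- rls = true,  force_rls = -1
ALTER TABLE accounts DISABLE ROW LEVEL SECURITY;   -- rls = false, force_rls = -1
ALTER TABLE accounts FORCE ROW LEVEL SECURITY;     -- rls = -1,    force_rls = true
ALTER TABLE accounts NO FORCE ROW LEVEL SECURITY;  -- rls = -1,    force_rls = false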
Re: proposal: psql –help reflecting service or URI usage
Hi Mark,

> On 28. Feb, 2021, at 17:54, Mark Dilger wrote:
>
> "definited" is a typo.

Yes, definitely a typo, sorry. Thanks for pointing this out.

> Should this say "as defined in pg_service.conf"? That's the default, but the
> user might have $PGSERVICEFILE set to something else. Perhaps you could
> borrow the wording of other options and use "(default: as defined in
> pg_service.conf)", or something like that, but of course being careful to
> still fit in the line length limit.

I agree with all of that, thanks. What is the line length limit?

> Other client applications follow the same pattern as psql, so if this change
> were adopted, it should apply to all of them.

Well, psql is central and IMHO the best place to start. I'd have to try out all of them then. What I do know, though, is that pg_isready does not understand a URI (why is that?), which is very unfortunate. So, I'd have to try them all out and supply patches for them all?

Still, supporting a feature and not documenting it in its help is IMHO not a good idea.

> Your proposal seems like something that would have been posted to the list
> before, possibly multiple times. Any chance you could dig up past
> conversations on this subject and post links here for context?

I don't know of any past discussions here. I only subscribed to pgsql-hackers today. It might have been mentioned on pgsql-general, though, but I'm not sure. This idea popped into my mind just yesterday when I was playing around with psql and URIs and noticed that psql --help doesn't show them.

Cheers,
Paul
Re: proposal: psql –help reflecting service or URI usage
> On Feb 28, 2021, at 1:57 AM, Paul Förster wrote: > > Hi, > > I'd like to propose a patch to psql --help output: > > Currently it is: > > Usage: > psql [OPTION]... [DBNAME [USERNAME]] > > ... > > Connection options: > -h, --host=HOSTNAME database server host or socket directory (default: > "local socket") > -p, --port=PORT database server port (default: "5432") > -U, --username=USERNAME database user name (default: "paul") > -w, --no-passwordnever prompt for password > -W, --password force password prompt (should happen automatically) > > I'd like to change it to the following to reflect the psql ability to process > a service name or a PostgreSQL URI: > > Usage: > psql [OPTION]... [DBNAME [USERNAME]|service|uri] > > ... > > Connection options: > -h, --host=HOSTNAME database server host or socket directory (default: > "local socket") > -p, --port=PORT database server port (default: "5432") > -U, --username=USERNAME database user name (default: "paul") > -w, --no-passwordnever prompt for password > -W, --password force password prompt (should happen automatically) > service=name service name as definited in pg_service.conf "definited" is a typo. Should this say "as defined in pg_service.conf"? That's the default, but the user might have $PGSERVICEFILE set to something else. Perhaps you could borrow the wording of other options and use "(default: as defined in pg_service.conf)", or something like that, but of course being careful to still fit in the line length limit. > uri connection URI (postgresql://...) > > ... > > Attached is a patch for src/bin/psql/help.c for this. The file > doc/src/sgml/ref/psql-ref.sgml does not seem to need any changes for this. > > Any thoughts on this? Other client applications follow the same pattern as psql, so if this change were adopted, it should apply to all of them. Your proposal seems like something that would have been posted to the list before, possibly multiple times. Any chance you could dig up past conversations on this subject and post links here for context? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: regexp_positions()
I had a bug in the function, and I see I also accidentally renamed it to regexp_ranges(). Attached is a fixed version of the PoC.

This function is e.g. useful when we're interested in patterns in meta-data, where we're not actually finding patterns in the data, but in a string where each character corresponds to an element in an array containing the actual data. In such a case, we need to know the positions of the matches, since they tell which corresponding array elements matched.

For instance, let's take the UNIX diff tool we all know as an example. Let's say you have all the raw diff lines stored in a text[] array, and we want to produce a unified diff by finding hunks with up to 3 unchanged lines before/after each hunk containing changes. We produce a text string containing one character per diff line, using "=" for unchanged, "+" for addition, "-" for deletion. Example:

=-===+=-+==

We could then find the hunks using this regex:

(={0,3}[-+]+={0,3})+

using regexp_positions() to find the start and end positions for each hunk:

SELECT * FROM regexp_positions('=-===+=-+==','(={0,3}[-+]+={0,3})+');
 start_pos | end_pos
-----------+---------
         3 |       9
        11 |      24
(2 rows)

/Joel

regexp_positions.sql
Description: Binary data
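To tie the reported positions back to the per-line array from the diff example, a hypothetical follow-up query; it assumes the PoC's (start_pos, end_pos) output shown above and a table diffs(flags text, lines text[]), where flags is the one-character-per-line string and lines holds the raw diff lines:

-- Each hunk's positions index directly into the array of diff lines
SELECT r.start_pos, r.end_pos,
       d.lines[r.start_pos:r.end_pos] AS hunk_lines
FROM diffs AS d,
     regexp_positions(d.flags, '(={0,3}[-+]+={0,3})+') AS r;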
proposal: psql –help reflecting service or URI usage
Hi,

I'd like to propose a patch to psql --help output:

Currently it is:

Usage:
  psql [OPTION]... [DBNAME [USERNAME]]

...

Connection options:
  -h, --host=HOSTNAME      database server host or socket directory (default: "local socket")
  -p, --port=PORT          database server port (default: "5432")
  -U, --username=USERNAME  database user name (default: "paul")
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)

I'd like to change it to the following to reflect the psql ability to process a service name or a PostgreSQL URI:

Usage:
  psql [OPTION]... [DBNAME [USERNAME]|service|uri]

...

Connection options:
  -h, --host=HOSTNAME      database server host or socket directory (default: "local socket")
  -p, --port=PORT          database server port (default: "5432")
  -U, --username=USERNAME  database user name (default: "paul")
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  service=name             service name as definited in pg_service.conf
  uri                      connection URI (postgresql://...)

...

Attached is a patch for src/bin/psql/help.c for this. The file doc/src/sgml/ref/psql-ref.sgml does not seem to need any changes for this.

Any thoughts on this?

psql-help.c.patch
Description: Binary data

Cheers,
Paul
Re: Side effect of remove_useless_groupby_columns
On Sun, 28 Feb 2021 at 20:52, Richard Guo wrote: > When looking at [1], I realized we may have a side effect when removing > redundant columns in the GROUP BY clause. Suppose we have a query with > ORDER BY 'b', and meanwhile column 'b' is also a group key. If we decide > that 'b' is redundant due to being functionally dependent on other GROUP > BY columns, we would remove it from group keys. This will make us lose > the opportunity to leverage the index on 'b'. > Any thoughts? I had initially thought that this was a non-issue before incremental sort was added, but the following case highlights that's wrong. # create table t (a int not null, b int); # insert into t select i, i%1000 from generate_series(1,100)i; # create index on t(b,a); # explain (analyze) select b from t group by a, b order by b,a limit 10; # alter table t add constraint t_pkey primary key (a); # explain (analyze) select b from t group by a, b order by b,a limit 10; Execution slows down after adding the primary key since that allows the functional dependency to be detected and the "b" column removed from the GROUP BY clause. Perhaps we could do something like add: diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 545b56bcaf..7e52212a13 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -3182,7 +3182,8 @@ remove_useless_groupby_columns(PlannerInfo *root) if (!IsA(var, Var) || var->varlevelsup > 0 || !bms_is_member(var->varattno - FirstLowInvalidHeapAttributeNumber, - surplusvars[var->varno])) + surplusvars[var->varno]) || + list_member(parse->sortClause, sgc)) new_groupby = lappend(new_groupby, sgc); } to remove_useless_groupby_columns(). It's not exactly perfect though since it's still good to remove the useless GROUP BY columns if there's no useful index to implement the ORDER BY. We just don't know that when we call remove_useless_groupby_columns(). FWIW, these also don't really seem to be all that great examples since it's pretty useless to put the primary key columns in the GROUP BY unless there's some join that's going to duplicate those columns. Having the planner remove the GROUP BY completely in that case would be nice. That's a topic of discussion in the UniqueKeys patch thread. However, it is possible to add a join with a join condition matching the ORDER BY and have the same problem when a join type that preserves the outer path's order is picked. David > [1] > https://www.postgresql.org/message-id/flat/16869-26346b77d6ccaeec%40postgresql.org