Re: pgbench - use pg logging capabilities
Bonjour Michaël,

>> TBH, my recommendation would be to drop *all* of these likely() and
>> unlikely() calls. What evidence have you got that those are
>> meaningfully improving the quality of the generated code? And if
>> they're buried inside macros, they certainly aren't doing anything
>> useful in terms of documenting the code.
>
> Yes. I am wondering if we should not rework this part of the logging
> with something like the attached. My 2c, thoughts welcome.

ISTM that the intent is to minimise the performance impact of ignored pg_log calls, especially under debug, where being ignored is the most likely case AND the calls may sit in performance-critical places.

Compared to testing the level inside the call, using the level variable in the macro avoids a call-test-return cycle in that case, and the unlikely() should help the compiler reorder instructions so that no actual branch is taken in the common case. So I think that the current situation is a good thing, at least for debug. The other levels are on by default AND should not appear at performance-critical points, so there the whole effort of avoiding the call is moot.

I agree with Tom that the __pg_log_level variable name violates usual naming conventions. ISTM that making the variable explicitly global solves that issue, and that the level test can possibly be moved inside the function for all levels but debug. See attached, which reprises some of your ideas but keeps the outside filtering for the debug level.
-- 
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 39c1a243d5..e7b33bb8e0 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3295,7 +3295,7 @@ executeMetaCommand(CState *st, instr_time *now)
 	argc = command->argc;
 	argv = command->argv;
 
-	if (unlikely(__pg_log_level <= PG_LOG_DEBUG))
+	if (unlikely(pg_log_current_level <= PG_LOG_DEBUG))
 	{
 		PQExpBufferData buf;
diff --git a/src/common/logging.c b/src/common/logging.c
index c78ae793b8..ce12593e32 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -13,7 +13,7 @@
 
 #include "common/logging.h"
 
-enum pg_log_level __pg_log_level;
+enum pg_log_level pg_log_current_level;
 
 static const char *progname;
 static int	log_flags;
@@ -45,7 +45,7 @@ pg_logging_init(const char *argv0)
 	setvbuf(stderr, NULL, _IONBF, 0);
 
 	progname = get_progname(argv0);
-	__pg_log_level = PG_LOG_INFO;
+	pg_log_current_level = PG_LOG_INFO;
 
 	if (pg_color_env)
 	{
@@ -107,7 +107,7 @@ pg_logging_config(int new_flags)
 void
 pg_logging_set_level(enum pg_log_level new_level)
 {
-	__pg_log_level = new_level;
+	pg_log_current_level = new_level;
 }
 
 void
@@ -142,6 +142,9 @@ pg_log_generic_v(enum pg_log_level level, const char *pg_restrict fmt, va_list a
 	size_t		required_len;
 	char	   *buf;
 
+	if (unlikely(pg_log_current_level > level))
+		return;
+
 	Assert(progname);
 	Assert(level);
 	Assert(fmt);
diff --git a/src/include/common/logging.h b/src/include/common/logging.h
index 028149c7a1..c73c4ee76c 100644
--- a/src/include/common/logging.h
+++ b/src/include/common/logging.h
@@ -55,7 +55,7 @@ enum pg_log_level
 	PG_LOG_OFF,
 };
 
-extern enum pg_log_level __pg_log_level;
+extern enum pg_log_level pg_log_current_level;
 
 /*
  * Kind of a hack to be able to produce the psql output exactly as required by
@@ -72,24 +72,16 @@ void pg_logging_set_locus_callback(void (*cb) (const char **filename, uint64 *l
 void		pg_log_generic(enum pg_log_level level, const char *pg_restrict fmt,...) pg_attribute_printf(2, 3);
 void		pg_log_generic_v(enum pg_log_level level, const char *pg_restrict fmt, va_list ap) pg_attribute_printf(2, 0);
 
-#define pg_log_fatal(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_FATAL)) pg_log_generic(PG_LOG_FATAL, __VA_ARGS__); \
-	} while(0)
+#define pg_log_fatal(...) pg_log_generic(PG_LOG_FATAL, __VA_ARGS__)
 
-#define pg_log_error(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_ERROR)) pg_log_generic(PG_LOG_ERROR, __VA_ARGS__); \
-	} while(0)
+#define pg_log_error(...) pg_log_generic(PG_LOG_ERROR, __VA_ARGS__)
 
-#define pg_log_warning(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_WARNING)) pg_log_generic(PG_LOG_WARNING, __VA_ARGS__); \
-	} while(0)
+#define pg_log_warning(...) pg_log_generic(PG_LOG_WARNING, __VA_ARGS__)
 
-#define pg_log_info(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_INFO)) pg_log_generic(PG_LOG_INFO, __VA_ARGS__); \
-	} while(0)
+#define pg_log_info(...) pg_log_generic(PG_LOG_INFO, __VA_ARGS__)
 
 #define pg_log_debug(...) do { \
-		if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) pg_log_generic(PG_LOG_DEBUG, __VA_ARGS__); \
+		if (unlikely(pg_log_current_level <= PG_LOG_DEBUG)) pg_log_generic(PG_LOG_DEBUG, __VA_ARGS__); \
 	} while(0)
 
 #endif							/* COMMON_LOGGING_H */
Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after dropping temp table schema
On Thu, 9 Jan 2020 at 09:36, Michael Paquier wrote:
>
> On Wed, Jan 08, 2020 at 10:56:01AM +0900, Michael Paquier wrote:
> > And done this way as per the attached. I am of course open to
> > objections or better ideas, though this formulation looks pretty
> > good to me. Robert?
>
> Just to be clear here, I would like to commit this patch and backpatch
> it with the current formulation of the error strings in the follow-up
> days. In 9.4~10, the error cannot be reached, but that feels safer if
> we begin to work again on this portion of the autovacuum code. So if
> you would like to object, now is the moment..
> --

Hi,
I reviewed and tested the patch. After applying it, I am getting another assert failure:

postgres=# CREATE TEMPORARY TABLE temp (a int);
CREATE TABLE
postgres=# \d
          List of relations
  Schema   | Name | Type  |  Owner
-----------+------+-------+----------
 pg_temp_3 | temp | table | mahendra
(1 row)

postgres=# drop schema pg_temp_3 cascade ;
NOTICE:  drop cascades to table temp
DROP SCHEMA
postgres=# \d
Did not find any relations.
postgres=# CREATE TEMPORARY TABLE temp (a int);
CREATE TABLE
postgres=# \d
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
postgres=#

*Stack trace:*

(gdb) bt
#0  0x7f7d749bd277 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f7d749be968 in __GI_abort () at abort.c:90
#2  0x00eca3c4 in ExceptionalCondition (conditionName=0x114cc08 "relation->rd_backend != InvalidBackendId", errorType=0x114ca8b "FailedAssertion", fileName=0x114c8b0 "relcache.c", lineNumber=1123) at assert.c:67
#3  0x00eaacb9 in RelationBuildDesc (targetRelId=16392, insertIt=true) at relcache.c:1123
#4  0x00eadf61 in RelationIdGetRelation (relationId=16392) at relcache.c:2021
#5  0x0049f370 in relation_open (relationId=16392, lockmode=8) at relation.c:59
#6  0x0064ccda in heap_drop_with_catalog (relid=16392) at heap.c:1890
#7  0x006435f3 in doDeletion (object=0x2d623c0, flags=21) at dependency.c:1360
#8  0x00643180 in deleteOneObject (object=0x2d623c0, depRel=0x7ffcb9636290, flags=21) at dependency.c:1261
#9  0x00640d97 in deleteObjectsInList (targetObjects=0x2dce438, depRel=0x7ffcb9636290, flags=21) at dependency.c:271
#10 0x00640ed6 in performDeletion (object=0x7ffcb96363b0, behavior=DROP_CASCADE, flags=21) at dependency.c:356
#11 0x00aebc3d in do_autovacuum () at autovacuum.c:2269
#12 0x00aea478 in AutoVacWorkerMain (argc=0, argv=0x0) at autovacuum.c:1693
#13 0x00ae9cf9 in StartAutoVacWorker () at autovacuum.c:1487
#14 0x00b13cdc in StartAutovacuumWorker () at postmaster.c:5562
#15 0x00b131b5 in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5279
#16 <signal handler called>
#17 0x7f7d74a7cc53 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#18 0x00b09fc9 in ServerLoop () at postmaster.c:1691
#19 0x00b09544 in PostmasterMain (argc=3, argv=0x2ce2290) at postmaster.c:1400
#20 0x00974b43 in main (argc=3, argv=0x2ce2290) at main.c:210

I think we should fix this crash before committing the 1st patch, and then commit all the patches together.

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Re: CREATE ROUTINE MAPPING
[ I wasn't paying much attention to this thread at the time, but Kyotaro-san just drew my attention to it again ]

Masahiko Sawada writes:
> I agree that this feature covers A and B as the first step. But I'm
> concerned that for D (and maybe for C?) the volatility of a mapped
> function could be changed. That is, currently we allow pushing down
> only immutable functions, but this might break that. Also, could
> replacing a function with an arbitrary expression be a risk of SQL
> injection?

Yes, I'm afraid that there are serious security issues here that the SQL standard fails to address. Let's assume that user A has created a function F(), and user B has created a foreign server S. What privileges should be required for user C to create a mapping for F on S?

AFAICS, the spec only requires C to have USAGE on S. This means that all C needs is USAGE on S to create a trojan horse that will execute arbitrary code when any other user D executes a query using F on S. C doesn't need to have any privilege on F at all, and USAGE is not exactly a strong privilege on S --- you certainly wouldn't expect that "USAGE" translates to "I can backdoor anybody else's usage of this server".

I see that SQL:2011's access rules for CREATE ROUTINE MAPPING are:

1) The applicable privileges shall include the USAGE privilege on the
   foreign server identified by FSN.

2) Additional privileges, if any, necessary to execute are
   implementation-defined.

It seems to me that (2) should be read as "we know we blew it here, but we're leaving it up to implementors to fix this".

Some of the alternatives that were discussed upthread basically replace this whole idea with attaching properties to the original function F. I think I like that a lot better from a security perspective. If you are calling F in your query, you are already placing trust in F's owner. Better to let F's owner define how it maps to functions on remote servers than to let random third parties define that.

			regards, tom lane
Re: [Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDE
On Fri, Jan 10, 2020 at 10:31 AM Michael Paquier wrote:
>
> On Fri, Jan 10, 2020 at 07:30:34AM +0530, Dilip Kumar wrote:
> > On Thu, 9 Jan 2020 at 10:43 PM, Andres Freund wrote:
> >> There's not much point in having this assert, right? Given that it
> >> covers all choices? Seems better to just drop it.
> >
> > Yeah right!
>
> Refreshing my mind on that... The two remaining assertions still make
> sense for update and delete changes per the restrictions in place in
> CheckCmdReplicaIdentity(), and there is a gap with the regression
> tests. So combining all that I get the attached patch (origin point
> is 665d1fa). Thoughts?

Right, LGTM.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: remove some STATUS_* symbols
On Thu, Jan 09, 2020 at 11:15:08AM +0100, Peter Eisentraut wrote:
> You mean put the subsequent GrantLock() calls into LockCheckConflicts()?
> That would technically save some duplicate code, but it seems weird,
> because LockCheckConflicts() is notionally a read-only function that
> shouldn't change state.

Nah. I was thinking about the first part of the "if" clause that LockCheckConflicts() is part of here:

    if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
        status = STATUS_FOUND;
    else
        status = LockCheckConflicts(lockMethodTable, lockmode,
                                    lock, proclock);

But now that I look at it closely, it messes up heavily with ProcSleep() ;)
-- 
Michael
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Mon, Dec 30, 2019 at 3:43 PM Amit Kapila wrote:
>
> On Sun, Dec 29, 2019 at 1:34 PM Dilip Kumar wrote:
> >
> > I have observed some more issues
> >
> > 1. Currently, in ReorderBufferCommit it is always expected that
> > whenever we get REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM, we must
> > have already got REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT, and in
> > SPEC_CONFIRM we send the tuple we got in SPEC_INSERT. But now those
> > two messages can be in different streams. So we need to find a way to
> > handle this. Maybe once we get SPEC_INSERT we can remember the
> > tuple, and then if we get the SPEC_CONFIRM in the next stream we can
> > send that tuple?
>
> Your suggestion makes sense to me. So, we can try it.

I have implemented this and attached it as a separate patch in my latest patch set [1].

> > 2. During commit time in DecodeCommit we check whether we need to skip
> > the changes of the transaction or not by calling
> > SnapBuildXactNeedsSkip, but since we now support streaming it's
> > possible that before we decode the commit WAL we might have already
> > sent the changes to the output plugin, even though we could have
> > skipped those changes. So my question is: instead of checking at
> > commit time, can't we check before adding to the ReorderBuffer itself?
>
> I think if we can do that then the same will be true for the current
> code irrespective of this patch. I think it is possible that we can't
> take that decision while decoding because we haven't assembled a
> consistent snapshot yet. I think we might be able to do that while we
> try to stream the changes. I think we need to take care of all the
> conditions during streaming (when the logical_decoding_work_mem limit
> is reached) as we do in DecodeCommit. This needs a bit more study.

I have analyzed this further and I think we can not decide all the conditions even while streaming. Because IMHO once we get SNAPBUILD_FULL_SNAPSHOT we can add the changes to the reorder buffer, so that we can stream them if we get the commit of the transaction after we reach SNAPBUILD_CONSISTENT. However, if we get the commit before we reach SNAPBUILD_CONSISTENT then we need to ignore this transaction. Now, even if we have SNAPBUILD_FULL_SNAPSHOT we can stream changes which might get dropped later, but that we can not decide while streaming.

[1] https://www.postgresql.org/message-id/CAFiTN-snMb%3D53oqkM8av8Lqfxojjm4OBwCNxmFssgLCceY_zgg%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Noah Misch writes: > On Thu, Jan 09, 2020 at 12:45:41AM -0500, Tom Lane wrote: >> (1) Why did we get a crash and not some more-decipherable out-of-resources >> error? Can we improve that experience? > By default, 32-bit AIX binaries have maxdata:0x. Specifying > maxdata:0x1000 provides the same 256M of RAM, yet it magically changes the > SIGSEGV to ENOMEM: > ... > We could add -Wl,-bmaxdata:0x1000 (or a higher value) to LDFLAGS when > building for 32-bit AIX. +1, seems like that would improve matters considerably on that platform. >> (2) Should we be dialing back the resource consumption of this test? >> Even on machines where it doesn't fail outright, I'd imagine that it's >> costing a lot of buildfarm cycles. Is it actually worth that? > The test's resource usage, being quite low, should not be a factor in the > test's fate. On my usual development machine, the entire > 006_logical_decoding.pl file takes just 3s and ~250 MiB of RAM. Yeah, as I noted downthread, it appears that initdb itself can't succeed with less than ~250MB these days. My old-school self feels like that's excessive, but I must admit I'm not motivated to go reduce it right now. But I think it's a clear win to fail with "out of memory" rather than "SIGSEGV", so I think we ought to adjust the AIX build options as you suggest. regards, tom lane
Re: [HACKERS] Block level parallel vacuum
On Thu, 9 Jan 2020 at 19:33, Amit Kapila wrote:
>
> On Thu, Jan 9, 2020 at 10:41 AM Masahiko Sawada wrote:
> >
> > On Wed, 8 Jan 2020 at 22:16, Amit Kapila wrote:
> > >
> > > What do you think of the attached? Sawada-san, kindly verify the
> > > changes and let me know your opinion.
> >
> > I agreed to not include both the FAST option patch and the
> > DISABLE_LEADER_PARTICIPATION patch at this stage. It's better to focus
> > on the main part, and we can discuss and add them later if we want.
> >
> > I've looked at the latest version patch you shared. Overall it looks
> > good and works fine. I have a few small comments:
>
> I have addressed all your comments, slightly changed nearby comments,
> and ran pgindent. I think we can commit the first two preparatory
> patches now unless you or someone else has any more comments on those.

Yes.

I'd like to briefly summarize v43-0002-Allow-vacuum-command-to-process-indexes-in-parallel for other reviewers who want to start reviewing this patch:

This patch introduces the PARALLEL option to the VACUUM command. Parallel vacuum is enabled by default. When the user doesn't specify the parallel degree, or the PARALLEL option is omitted, the number of parallel workers is determined based on the number of indexes that support parallel index vacuum. Specifying PARALLEL 0 disables parallel vacuum.

In parallel vacuum, only the leader process scans the heap and collects dead tuple TIDs in the DSM segment. Before starting index vacuum or index cleanup, the leader launches the parallel workers and performs the work together with them. Each index is processed by one vacuum worker process, so parallel vacuum can be used when the table has at least 2 indexes (the leader always takes one index). After launching the parallel workers, the leader process first vacuums the indexes that don't support parallel index vacuum. The parallel workers process the indexes that do support parallel index vacuum, and the leader joins as a worker after completing its indexes. Once all indexes are processed, the parallel worker processes exit. After that, the leader re-initializes the parallel context so that it can use the same DSM for multiple passes of index vacuum and for performing index cleanup. For updating the index statistics we need to update the system catalog, and since updates are not allowed during parallel mode, we update the index statistics after exiting parallel mode.

When the vacuum cost-based delay is enabled, parallel vacuum is throttled as well. The basic idea of the cost-based vacuum delay for parallel index vacuuming is to let all parallel vacuum workers, including the leader process, share a view of the cost-related parameters (mainly VacuumCostBalance). We allow each worker to update it as and when it incurs any cost, and then based on that decide whether it needs to sleep. We let each worker sleep proportionally to the work it has done, and reduce VacuumSharedCostBalance by the amount consumed by the current worker (VacuumCostBalanceLocal). This avoids putting to sleep workers that have done little or no I/O compared to other workers, and therefore ensures that workers doing more I/O get throttled more.

Regards,

-- 
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Block level parallel vacuum
On Thu, 9 Jan 2020 at 17:31, Sergei Kornilov wrote:
>
> Hello
>
> I noticed that parallel vacuum uses the min_parallel_index_scan_size GUC
> to skip small indexes, but this is not mentioned in the documentation,
> neither for the VACUUM command nor for the GUC itself.
>
> +	/* Determine the number of parallel workers to launch */
> +	if (lps->lvshared->for_cleanup)
> +	{
> +		if (lps->lvshared->first_time)
> +			nworkers = lps->nindexes_parallel_cleanup +
> +				lps->nindexes_parallel_condcleanup - 1;
> +		else
> +			nworkers = lps->nindexes_parallel_cleanup - 1;
> +	}
> +	else
> +		nworkers = lps->nindexes_parallel_bulkdel - 1;

The v43-0001-Introduce-IndexAM-fields-for-parallel-vacuum patch looks fine to me. Below are some review comments for the v43-0002 patch.

1.
+      integer
+
+      Specifies a positive integer value passed to the selected option.
+      The integer value can also be omitted, in which case the value is
+      decided by the command based on the option used.
+

http://www.enterprisedb.com
Re: [Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDE
On Fri, Jan 10, 2020 at 07:30:34AM +0530, Dilip Kumar wrote:
> On Thu, 9 Jan 2020 at 10:43 PM, Andres Freund wrote:
>> There's not much point in having this assert, right? Given that it
>> covers all choices? Seems better to just drop it.
>
> Yeah right!

Refreshing my mind on that... The two remaining assertions still make sense for update and delete changes per the restrictions in place in CheckCmdReplicaIdentity(), and there is a gap with the regression tests. So combining all that I get the attached patch (origin point is 665d1fa). Thoughts?
-- 
Michael

diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index dcf7c08c18..3c6d0cd171 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -143,10 +143,6 @@ logicalrep_write_insert(StringInfo out, Relation rel, HeapTuple newtuple)
 {
 	pq_sendbyte(out, 'I');		/* action INSERT */
 
-	Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
-		   rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
-		   rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
-
 	/* use Oid as relation identifier */
 	pq_sendint32(out, RelationGetRelid(rel));
 
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 77a1560b23..b9608d21f2 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 22;
+use Test::More tests => 23;
 
 # Initialize publisher node
 my $node_publisher = get_new_node('publisher');
@@ -34,6 +34,10 @@ $node_publisher->safe_psql('postgres',
 $node_publisher->safe_psql('postgres',
 	"CREATE TABLE tab_include (a int, b text, CONSTRAINT covering PRIMARY KEY(a) INCLUDE(b))"
 );
+# Let this table with REPLICA IDENTITY NOTHING, allowing only INSERT changes.
+$node_publisher->safe_psql('postgres', "CREATE TABLE tab_nothing (a int)");
+$node_publisher->safe_psql('postgres',
+	"ALTER TABLE tab_nothing REPLICA IDENTITY NOTHING");
 
 # Setup structure on subscriber
 $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_notrep (a int)");
@@ -42,6 +46,7 @@ $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_full (a int)");
 $node_subscriber->safe_psql('postgres', "CREATE TABLE tab_full2 (x text)");
 $node_subscriber->safe_psql('postgres',
 	"CREATE TABLE tab_rep (a int primary key)");
+$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_nothing (a int)");
 
 # different column count and order than on publisher
 $node_subscriber->safe_psql('postgres',
@@ -59,7 +64,7 @@ $node_publisher->safe_psql('postgres', "CREATE PUBLICATION tap_pub");
 $node_publisher->safe_psql('postgres',
 	"CREATE PUBLICATION tap_pub_ins_only WITH (publish = insert)");
 $node_publisher->safe_psql('postgres',
-	"ALTER PUBLICATION tap_pub ADD TABLE tab_rep, tab_full, tab_full2, tab_mixed, tab_include"
+	"ALTER PUBLICATION tap_pub ADD TABLE tab_rep, tab_full, tab_full2, tab_mixed, tab_include, tab_nothing"
 );
 $node_publisher->safe_psql('postgres',
 	"ALTER PUBLICATION tap_pub_ins_only ADD TABLE tab_ins");
@@ -97,6 +102,9 @@ $node_publisher->safe_psql('postgres', "UPDATE tab_rep SET a = -a");
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab_mixed VALUES (2, 'bar', 2.2)");
 
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab_nothing VALUES (generate_series(1,20))");
+
 $node_publisher->safe_psql('postgres',
 	"INSERT INTO tab_include SELECT generate_series(1,50)");
 $node_publisher->safe_psql('postgres',
@@ -117,6 +125,10 @@ $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM tab_mixed");
 is( $result, qq(local|1.1|foo|1
 local|2.2|bar|2), 'check replicated changes with different column order');
 
+$result = $node_subscriber->safe_psql('postgres',
+	"SELECT count(*) FROM tab_nothing");
+is( $result, qq(20), 'check replicated changes with REPLICA IDENTITY NOTHING');
+
 $result = $node_subscriber->safe_psql('postgres',
 	"SELECT count(*), min(a), max(a) FROM tab_include");
 is($result, qq(20|-20|-1),
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Sat, Jan 4, 2020 at 4:07 PM Amit Kapila wrote:
>
> On Mon, Dec 30, 2019 at 3:11 PM Dilip Kumar wrote:
> >
> > On Thu, Dec 12, 2019 at 9:44 AM Dilip Kumar wrote:
> > > > 0002-Issue-individual-invalidations-with-wal_level-log
> > > > 1.
> > > > xact_desc_invalidations(StringInfo buf,
> > > > {
> > > > ..
> > > > + else if (msg->id == SHAREDINVALSNAPSHOT_ID)
> > > > + appendStringInfo(buf, " snapshot %u", msg->sn.relId);
> > > >
> > > > You have removed logging for the above cache but forgot to remove its
> > > > reference from one of the places. Also, I think you need to add a
> > > > comment somewhere in inval.c to say why you are writing WAL for
> > > > some types of invalidations and not for others?
> > > Done
>
> I don't see any new comments as asked by me.

Done.

> I think we should also consider WAL logging at each command end instead
> of doing it piecemeal as discussed in another email [1], which will
> have fewer code changes and may be better in performance. You might
> want to evaluate the performance of both approaches.

Still pending, will work on this.

> > > > 0005-Gracefully-handle-concurrent-aborts-of-uncommitte
> > > > 1.
> > > > @@ -1877,6 +1877,7 @@ ReorderBufferCommit(ReorderBuffer *rb, TransactionId xid,
> > > > PG_CATCH();
> > > > {
> > > > /* TODO: Encapsulate cleanup from the PG_TRY and PG_CATCH blocks */
> > > > +
> > > > if (iterstate)
> > > > ReorderBufferIterTXNFinish(rb, iterstate);
> > > >
> > > > Spurious line change.
> > >
> > > Done
>
> + /*
> +  * We don't expect direct calls to heap_getnext with valid
> +  * CheckXidAlive for regular tables. Track that below.
> +  */
> + if (unlikely(TransactionIdIsValid(CheckXidAlive) &&
> +	!(IsCatalogRelation(scan->rs_base.rs_rd) ||
> +	  RelationIsUsedAsCatalogTable(scan->rs_base.rs_rd))))
> +	elog(ERROR, "improper heap_getnext call");
>
> Earlier, I thought we don't need to check if it is a regular table in
> this check, but it is required because output plugins can try to do
> that, and if they do so during decoding (with historic snapshots), the
> same should not be allowed.
>
> How about changing the error message to "unexpected heap_getnext call
> during logical decoding" or something like that?

Done.

> > > > 2. The commit message of this patch refers to Prepared transactions.
> > > > I think that needs to be changed.
> > > >
> > > > 0006-Implement-streaming-mode-in-ReorderBuffer
>
> Few comments on v4-0018-Review-comment-fix-and-refactoring:
>
> 1.
> + if (streaming)
> + {
> +	/*
> +	 * Set the last lsn of the stream as the final lsn before calling
> +	 * stream stop.
> +	 */
> +	txn->final_lsn = prev_lsn;
> +	rb->stream_stop(rb, txn);
> + }
>
> Shouldn't we set final_lsn as is done by Vignesh's patch [2]?

Already agreed upon the current implementation.

> 2.
> + if (streaming)
> + {
> +	/*
> +	 * Set CheckXidAlive to the current (sub)xid for which this
> +	 * change belongs to so that we can detect the abort while we are
> +	 * decoding.
> +	 */
> +	CheckXidAlive = change->txn->xid;
> +
> +	/* Increment the stream count. */
> +	streamed++;
> + }
>
> Is the variable 'streamed' used anywhere?

Removed.

> 3.
> + /*
> +  * Destroy the (relfilenode, ctid) hashtable, so that we don't leak
> +  * any memory. We could also keep the hash table and update it with
> +  * new ctid values, but this seems simpler and good enough for now.
> +  */
> + ReorderBufferDestroyTupleCidHash(rb, txn);
>
> Won't this be required only when we are streaming changes?

Fixed.

> As per my understanding, apart from the above comments, the known
> pending work for this patchset is as follows:
> a. The two open items agreed to you in the email [3].
> b. Complete the handling of schema_sent as discussed above [4].
> c. Few comments by Vignesh and the response on the same by me [5][6].
> d. WAL overhead and performance testing for additional WAL logging by
> this patchset.
> e. Some way to see the tuple for streamed transactions by decoding API
> as speculated by you [7].
>
> Have I missed anything?

I have worked on most of these items; I will reply to them separately.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: pgbench - use pg logging capabilities
On Thu, Jan 09, 2020 at 08:09:29PM -0500, Tom Lane wrote:
> TBH, my recommendation would be to drop *all* of these likely()
> and unlikely() calls. What evidence have you got that those are
> meaningfully improving the quality of the generated code? And if
> they're buried inside macros, they certainly aren't doing anything
> useful in terms of documenting the code.

Yes. I am wondering if we should not rework this part of the logging with something like the attached. My 2c, thoughts welcome.
-- 
Michael

diff --git a/src/include/common/logging.h b/src/include/common/logging.h
index 028149c7a1..dd54dc57d0 100644
--- a/src/include/common/logging.h
+++ b/src/include/common/logging.h
@@ -55,8 +55,6 @@ enum pg_log_level
 	PG_LOG_OFF,
 };
 
-extern enum pg_log_level __pg_log_level;
-
 /*
  * Kind of a hack to be able to produce the psql output exactly as required by
  * the regression tests.
@@ -66,6 +64,7 @@ extern enum pg_log_level __pg_log_level;
 void		pg_logging_init(const char *argv0);
 void		pg_logging_config(int new_flags);
 void		pg_logging_set_level(enum pg_log_level new_level);
+enum pg_log_level pg_logging_get_level(void);
 void		pg_logging_set_pre_callback(void (*cb) (void));
 void		pg_logging_set_locus_callback(void (*cb) (const char **filename, uint64 *lineno));
 
@@ -73,23 +72,23 @@ void pg_log_generic(enum pg_log_level level, const char *pg_restrict fmt,...) p
 void		pg_log_generic_v(enum pg_log_level level, const char *pg_restrict fmt, va_list ap) pg_attribute_printf(2, 0);
 
 #define pg_log_fatal(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_FATAL)) pg_log_generic(PG_LOG_FATAL, __VA_ARGS__); \
+		pg_log_generic(PG_LOG_FATAL, __VA_ARGS__); \
 	} while(0)
 
 #define pg_log_error(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_ERROR)) pg_log_generic(PG_LOG_ERROR, __VA_ARGS__); \
+		pg_log_generic(PG_LOG_ERROR, __VA_ARGS__); \
 	} while(0)
 
 #define pg_log_warning(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_WARNING)) pg_log_generic(PG_LOG_WARNING, __VA_ARGS__); \
+		pg_log_generic(PG_LOG_WARNING, __VA_ARGS__); \
 	} while(0)
 
 #define pg_log_info(...) do { \
-		if (likely(__pg_log_level <= PG_LOG_INFO)) pg_log_generic(PG_LOG_INFO, __VA_ARGS__); \
+		pg_log_generic(PG_LOG_INFO, __VA_ARGS__); \
 	} while(0)
 
 #define pg_log_debug(...) do { \
-		if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) pg_log_generic(PG_LOG_DEBUG, __VA_ARGS__); \
+		pg_log_generic(PG_LOG_DEBUG, __VA_ARGS__); \
 	} while(0)
 
 #endif							/* COMMON_LOGGING_H */
diff --git a/src/common/logging.c b/src/common/logging.c
index c78ae793b8..18b3c94fe2 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -13,7 +13,7 @@
 
 #include "common/logging.h"
 
-enum pg_log_level __pg_log_level;
+static enum pg_log_level __pg_log_level;
 
 static const char *progname;
 static int	log_flags;
@@ -110,6 +110,12 @@ pg_logging_set_level(enum pg_log_level new_level)
 	__pg_log_level = new_level;
 }
 
+enum pg_log_level
+pg_logging_get_level(void)
+{
+	return __pg_log_level;
+}
+
 void
 pg_logging_set_pre_callback(void (*cb) (void))
 {
@@ -127,6 +133,9 @@ pg_log_generic(enum pg_log_level level, const char *pg_restrict fmt,...)
 {
 	va_list		ap;
 
+	if (level < __pg_log_level)
+		return;
+
 	va_start(ap, fmt);
 	pg_log_generic_v(level, fmt, ap);
 	va_end(ap);
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 39c1a243d5..5e63e1f51a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3295,7 +3295,7 @@ executeMetaCommand(CState *st, instr_time *now)
 	argc = command->argc;
 	argv = command->argv;
 
-	if (unlikely(__pg_log_level <= PG_LOG_DEBUG))
+	if (pg_logging_get_level() <= PG_LOG_DEBUG)
 	{
 		PQExpBufferData buf;
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
On Fri, Jan 10, 2020 at 6:10 AM Tom Lane wrote: > > I wrote: > > ReorderBuffer: 223302560 total in 26995 blocks; 7056 free (3 > > chunks); 223295504 used > > > The test case is only inserting 50K fairly-short rows, so this seems > > like an unreasonable amount of memory to be consuming for that; and > > even if you think it's reasonable, it clearly isn't going to scale > > to large production transactions. > > > Now, the good news is that v11 and later get through > > 006_logical_decoding.pl just fine under the same restriction. > > So we did something in v11 to fix this excessive memory consumption. > > However, unless we're willing to back-port whatever that was, this > > test case is clearly consuming excessive resources for the v10 branch. > > I dug around a little in the git history for backend/replication/logical/, > and while I find several commit messages mentioning memory leaks and > faulty spill logic, they all claim to have been back-patched as far > as 9.4. > > It seems reasonably likely to me that this result is telling us about > an actual bug, ie, faulty back-patching of one or more of those fixes > into v10 and perhaps earlier branches. > I think it would be good to narrow down this problem, but it seems we can do this separately. I think to avoid forgetting about this, can we track it somewhere as an open issue (In Older Bugs section of PostgreSQL 12 Open Items or some other place)? It seems to me that this test has found a problem in back-branches, so we might want to keep it after removing the max_files_per_process restriction. However, unless we narrow down this memory leak it is not a good idea to keep it at least not in v10. So, we have the below options: (a) remove this test entirely from all branches and once we found the memory leak problem in back-branches, then consider adding it again without max_files_per_process restriction. 
(b) keep this test without max_files_per_process restriction till v11 and once the memory leak issue in v10 is found, we can back-patch to v10 as well. Suggestions? > If I have to do so to prove my point, I will set up a buildfarm member > that uses USE_NAMED_POSIX_SEMAPHORES, and then insist that the patch > cope with that. > Shall we document that under USE_NAMED_POSIX_SEMAPHORES, we consume additional fd? I thought about it because the minimum limit for max_files_per_process is 25 and the system won't even start if someone has used a platform where USE_NAMED_POSIX_SEMAPHORES is enabled. Also, if it would have been explicitly mentioned, then I think this test wouldn't have tried to become so optimistic about max_files_per_process. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: base backup client as auxiliary backend process
On Fri, 22 Nov 2019 at 19:22, Peter Eisentraut wrote: > > On 2019-11-15 14:52, Sergei Kornilov wrote: > >> I looked into this. It seems trivial to make walsender create and use a > >> temporary replication slot by default if no permanent replication slot > >> is specified. This is basically the logic that pg_basebackup has but > >> done server-side. See attached patch for a demonstration. Any reason > >> not to do that? > > Seems this would break pg_basebackup --no-slot option? > > After thinking about this a bit more, doing the temporary slot stuff on > the walsender side might lead to too many complications in practice. > > Here is another patch set that implements the temporary slot use on the > walreceiver side, essentially mirroring what pg_basebackup already does. > > I think this patch set might be useful on its own, even without the base > backup stuff to follow. > I agree that these patches are useful on their own, and the 0001 and 0002 patches look good to me. For the 0003 patch, + linkend="guc-primary-slot-name"/>. Otherwise, the WAL receiver may use + a temporary replication slot (determined by ), but these are not shown + here. I think it's better to show the temporary slot name in the pg_stat_wal_receiver view. Otherwise the user would have no idea which temporary slot the WAL receiver is using. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Avoiding hash join batch explosions with extreme skew and weird stats
On Tue, Jan 7, 2020 at 4:14 PM Thomas Munro wrote: > * I am uneasy about BarrierArriveExplicitAndWait() (a variant of > BarrierArriveAndWait() that lets you skip directly to a given phase?); > perhaps you only needed that for a circular phase system, which you > could do with modular phase numbers, like PHJ_GROW_BATCHES_PHASE? I > tried to make the barrier interfaces look like the libraries in other > parallel programming environments, and I'd be worried that the > explicit phase thing could easily lead to bugs. > So, I actually use it to circle back up to the first phase while skipping the last phase. So I couldn't do it with modular phase numbers and a loop. The last phase detaches from the chunk barrier. I don't want to detach from the chunk barrier if there are more chunks. I basically need a way to only attach to the chunk barrier at the beginning of the first chunk and only detach at the end of the last chunk--not in between chunks. I will return from the function and re-enter between chunks -- say between chunk 2 and chunk 3 of 5. However, could this be solved by having more than one chunk barrier? A worker would attach to one chunk barrier and then when it moves to the next chunk it would attach to the other chunk barrier and then switch back when it switches to the next chunk. Then it could detach and attach each time it enters/leaves the function.
To recap: (1) When you emit a tuple, the > program counter escapes to some other node, and maybe that other node > waits for thee, (2) Maybe the leader is waiting for you but you're > waiting for it to drain its queue so you can emit a tuple (I learned a > proper name for this: "flow control deadlock"). That's why the > current code only ever detaches (a non-waiting operation) after it's > begun emitting tuples (that is, the probing phase). It just moves > onto another batch. That's not a solution here: you can't simply move > to another loop, loops are not independent of each other like batches. > It's possible that barriers are not the right tool for this part of > the problem, or that there is a way to use a barrier that you don't > remain attached to while emitting, or that we should remove the > deadlock risks another way entirely[1] but I'm not sure. Furthermore, > the new code in ExecParallelHashJoinNewBatch() appears to break the > rule even in the non-looping case (it calls BarrierArriveAndWait() in > ExecParallelHashJoinNewBatch(), where the existing code just > detaches). > Yea, I think I'm totally breaking that rule. Just to make sure I understand the way in which I am breaking that rule: In my patch, while attached to a chunk_barrier, worker1 emits a matched tuple (control leaves the current node). Meanwhile, worker2 has finished probing the chunk and is waiting on the chunk_barrier for worker1. How though could worker1 be waiting for worker2? Is this only a problem when one of the barrier participants is the leader and is reading from the tuple queue? (reading your tuple queue deadlock hazard example in the thread [1] you referred to). Basically is my deadlock hazard a tuple queue deadlock hazard? I thought maybe this could be a problem with nested HJ nodes, but I'm not sure. As I understand it, this isn't a problem with current master with batch barriers because while attached to a batch_barrier, a worker can emit tuples. 
No other workers will wait on the batch barrier once they have started probing. I need to think more about the suggestions you provided in [1] about nixing the tuple queue deadlock hazard. However, hypothetically, if we decide we don't want to break the no emitting tuples while attached to a barrier rule, how can we still allow workers to coordinate while probing chunks of the batch sequentially (1 chunk at a time)? I could think of two options (both sound slow and bad): Option 1: Stash away the matched tuples in a tuplestore and emit them at the end of the batch (incurring more writes). Option 2: Degenerate to 1 worker for fallback batches Any other ideas? > > > - Rename "chunk" (as in chunks of inner side) to something that is > > not already used in the context of memory chunks and, more > > importantly, SharedTuplestoreChunk > > +1. Fragments? Loops? Blocks (from > https://en.wikipedia.org/wiki/Block_nested_loop, though, no, strike > that, blocks are also super overloaded). > Hmmm. I think loop is kinda
Re: [Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDE
On Thu, 9 Jan 2020 at 10:43 PM, Andres Freund wrote: > Hi, > > On 2020-01-09 13:17:59 +0530, Dilip Kumar wrote: > > I am able to reproduce the failure, I think the assert in the > > 'logicalrep_write_insert' is not correct. IMHO even if the replica > > identity is set to NOTHING we should be able to replicate INSERT? > > > > This will fix the issue. > > > > diff --git a/src/backend/replication/logical/proto.c > > b/src/backend/replication/logical/proto.c > > index dcf7c08..471461c 100644 > > --- a/src/backend/replication/logical/proto.c > > +++ b/src/backend/replication/logical/proto.c > > @@ -145,7 +145,8 @@ logicalrep_write_insert(StringInfo out, Relation > > rel, HeapTuple newtuple) > > > > Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || > >rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || > > - rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX); > > + rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX || > > + rel->rd_rel->relreplident == > REPLICA_IDENTITY_NOTHING); > > > > /* use Oid as relation identifier */ > > pq_sendint32(out, RelationGetRelid(rel)); > > There's not much point in having this assert, right? Given that it > covers all choices? Seems better to just drop it. Yeah right! -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
On Thu, Jan 09, 2020 at 12:45:41AM -0500, Tom Lane wrote: > Noah Misch writes: > > Even so, a web search for "extend_brk" led to the answer. By default, > > 32-bit > > AIX binaries get only 256M of RAM for stack and sbrk. The new regression > > test > > used more than that, hence this crash. > > Hm, so > > (1) Why did we get a crash and not some more-decipherable out-of-resources > error? Can we improve that experience? By default, 32-bit AIX binaries have maxdata:0x00000000. Specifying maxdata:0x10000000 provides the same 256M of RAM, yet it magically changes the SIGSEGV to ENOMEM: $ OBJECT_MODE=32 gcc maxdata.c && ./a.out Segmentation fault $ OBJECT_MODE=32 gcc -Wl,-bmaxdata:0x00000000 maxdata.c && ./a.out Segmentation fault $ OBJECT_MODE=32 gcc -Wl,-bmaxdata:0x10000000 maxdata.c && ./a.out done at 255 MiB: Not enough space We could add -Wl,-bmaxdata:0x10000000 (or a higher value) to LDFLAGS when building for 32-bit AIX. > (2) Should we be dialing back the resource consumption of this test? > Even on machines where it doesn't fail outright, I'd imagine that it's > costing a lot of buildfarm cycles. Is it actually worth that? The test's resource usage, being quite low, should not be a factor in the test's fate. On my usual development machine, the entire 006_logical_decoding.pl file takes just 3s and ~250 MiB of RAM.
Re: table partitioning and access privileges
On Tue, Jan 7, 2020 at 5:15 PM Amit Langote wrote: > > On Fri, Dec 27, 2019 at 4:26 AM Tom Lane wrote: > > Fujii Masao writes: > > > My customer reported me that the queries through a partitioned table > > > ignore each partition's SELECT, INSERT, UPDATE, and DELETE privileges, > > > on the other hand, only TRUNCATE privilege specified for each partition > > > is applied. I'm not sure if this behavior is expected or not. But anyway > > > is it better to document that? For example, > > > > > Access privileges may be defined and removed separately for each > > > partition. > > > But note that queries through a partitioned table ignore each > > > partition's > > > SELECT, INSERT, UPDATE and DELETE privileges, and apply only TRUNCATE > > > one. > > > > I believe it's intentional that we only check access privileges on > > the table explicitly named in the query. So I'd say SELECT etc > > are doing the right thing, and if TRUNCATE isn't in step with them > > that's a bug to fix, not something to document. > > I tend to agree that TRUNCATE's permission model for inheritance > should be consistent with that for the other commands. How about the > attached patch toward that end? Thanks for the patch! The patch basically looks good to me. +GRANT SELECT (f1, fz), UPDATE (fz) ON atestc TO regress_priv_user2; +REVOKE TRUNCATE ON atestc FROM regress_priv_user2; These seem not to be necessary for the test. BTW, I found that LOCK TABLE on the parent table checks the permission of its child tables. This also needs to be fixed (as a separate patch)? Regards, -- Fujii Masao
Re: backup manifests
Hi Robert, On 1/7/20 6:33 PM, Stephen Frost wrote: > These are issues that we've thought > about and worried about over the years of pgbackrest and with that > experience we've come down on the side that a JSON-based format would be > an altogether better design. That's why we're advocating for it, not > because it requires more code or so that it delays the efforts here, but > because we've been there, we've used other formats, we've dealt with > user complaints when we do break things, this is all history for us > that's helped us learn- for PG, it looks like the future with a static > format, and I get that the future is hard to predict and pg_basebackup > isn't pgbackrest and yeah, I could be completely wrong because I don't > actually have a crystal ball, but this starting point sure looks really > familiar. For example, have you considered what will happen if you have a file in the cluster with a tab in the name? This is perfectly valid in Posix filesystems, at least. You may already be escaping tabs but the simple code snippet you provided earlier isn't going to work so well either way. It gets complicated quickly. I know users should not be creating weird files in PGDATA, but it's amazing how often this sort of thing pops up. We currently have an open issue because = in file names breaks our file format. Tab is surely less common but it's amazing what users will do. Another fun one is 03849840 which fixes the handling of \ characters in the code which checksums the manifest. The file is not fully JSON but the checksums are and that was initially missed in the C migration. The bug never got released but it easily could have been. In short, using a quick-and-dirty homegrown format seemed great at first but has caused many headaches. Because we don't change the repo format across releases we are kind of stuck with past sins until we create a new repo format and write update/compatibility code.
Users are understandably concerned if new versions of the software won't work with their repo, some of which contain years of backups (really). This doesn't even get into the work everyone else will need to do to read a custom format. I do appreciate your offer of contributing parser code to pgBackRest, but honestly I'd rather it were not necessary. Though of course I'd still love to see a contribution of some sort from you! Hard experience tells me that using a standard format where all these issues have been worked out is the way to go. There are a few MIT-licensed JSON projects that are implemented in a single file. cJSON is very capable while JSMN is very minimal. Is it possible that one of those (or something like it) would be acceptable? It looks like the one requirement we have is that the JSON can be streamed rather than just building up one big blob? Even with that requirement there are a few tricks that can be used. JSON nests rather nicely after all so the individual file records can be transmitted independently of the overall file format. Your first question may be why didn't pgBackRest use one of those parsers? The answer is that JSON parsing/rendering is pretty trivial. Memory management and a (datum-like) type system are the hard parts and pgBackRest already had those. Would it be acceptable to bring in JSON code with a compatible license to use in libcommon? If so I'm willing to help adapt that code for use in Postgres. It's possible that the pgBackRest code could be adapted similarly, but it might make more sense to start from one of these general purpose parsers. Thoughts? -- -David da...@pgmasters.net
RE: [PATCH] Resolve Parallel Hash Join Performance Issue
Regarding why setting the bit is not cheap anymore in a parallel join: as I explained in my original mail, it is because of 'false sharing' in cache coherence. In short, setting the bit marks the whole cache line (64 bytes) dirty, so all CPU cores holding that cache line have to load it again, which wastes much CPU time. The article https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads explains this in more detail. -Original Message- From: Tom Lane Sent: Thursday, January 9, 2020 10:43 PM To: Thomas Munro Cc: Deng, Gang ; pgsql-hack...@postgresql.org Subject: Re: [PATCH] Resolve Parallel Hash Join Performance Issue Thomas Munro writes: > Right, I see. The funny thing is that the match bit is not even used > in this query (it's used for right and full hash join, and those > aren't supported for parallel joins yet). Hmm. So, instead of the > test you proposed, an alternative would be to use if (!parallel). > That's a value that will be constant-folded, so that there will be no > branch in the generated code (see the pg_attribute_always_inline > trick). If, in a future release, we need the match bit for parallel > hash join because we add parallel right/full hash join support, we > could do it the way you showed, but only if it's one of those join > types, using another constant parameter. Can we base the test off the match type today, and avoid leaving something that will need to be fixed later? I'm pretty sure that the existing coding is my fault, and that it's like that because I reasoned that setting the bit was too cheap to justify having a test-and-branch around it. Apparently that's not true anymore in a parallel join, but I have to say that it's unclear why. In any case, the reasoning probably still holds good in non-parallel cases, so it'd be a shame to introduce a run-time test if we can avoid it. regards, tom lane
Re: pgbench - use pg logging capabilities
Alvaro Herrera writes: > On 2020-Jan-09, Fabien COELHO wrote: >> -if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) >> +if (pg_log_debug_level) >> { > Umm ... I find the original exceedingly ugly, but the new line is > totally impenetrable. So, I had not been paying any attention to this thread, but that snippet is already enough to set off alarm bells. 1. (problem with already-committed code, evidently) The C standard is quite clear that

-- All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.

-- All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.

"Reserved" in this context appears to mean "reserved for use by system headers and/or compiler special behaviors". Declaring our own global variables with double-underscore prefixes is not just asking for trouble, it's waving a red flag in front of a bull. 2. (problem with proposed patch) I share Alvaro's allergy for replacing uses of a common variable with a bunch of macros, especially macros that don't look like macros. That's not reducing the reader's cognitive burden. I'd even say it's actively misleading the reader, because what the new code *looks* like it's doing is referencing several independent global variables. We don't need our code to qualify as an entry for the Obfuscated C Contest. The notational confusion could be solved perhaps by writing the macros with function-like parentheses, but it still doesn't seem like an improvement. In particular, the whole point here is to have a common idiom for logging, but I'm unconvinced that every frontend program should be using unlikely() in this particular way. Maybe it's unlikely for pgbench's usage that verbose logging would be turned on, but why should we build in an assumption that that's universally the case? TBH, my recommendation would be to drop *all* of these likely() and unlikely() calls.
What evidence have you got that those are meaningfully improving the quality of the generated code? And if they're buried inside macros, they certainly aren't doing anything useful in terms of documenting the code. regards, tom lane
RE: [PATCH] Resolve Parallel Hash Join Performance Issue
Thank you for the comment. Yes, I agree with the alternative of using '(!parallel)', so that there is no need to test the bit. Will someone submit a patch for it accordingly? -Original Message- From: Thomas Munro Sent: Thursday, January 9, 2020 6:04 PM To: Deng, Gang Cc: pgsql-hack...@postgresql.org Subject: Re: [PATCH] Resolve Parallel Hash Join Performance Issue On Thu, Jan 9, 2020 at 10:04 PM Deng, Gang wrote: > Attached is a patch to resolve parallel hash join performance issue. This is > my first time to contribute patch to PostgreSQL community, I referred one of > previous thread as template to report the issue and patch. Please let me know > if need more information of the problem and patch. Thank you very much for investigating this and for your report. > HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)); > > changed to: > > if > (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple))) > > { > > > HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)); > > } > > Compared with original code, modified code can avoid unnecessary write to > memory/cache. Right, I see. The funny thing is that the match bit is not even used in this query (it's used for right and full hash join, and those aren't supported for parallel joins yet). Hmm. So, instead of the test you proposed, an alternative would be to use if (!parallel). That's a value that will be constant-folded, so that there will be no branch in the generated code (see the pg_attribute_always_inline trick). If, in a future release, we need the match bit for parallel hash join because we add parallel right/full hash join support, we could do it the way you showed, but only if it's one of those join types, using another constant parameter. > D. Result > > With the modified code, performance of hash join operation can scale better > with number of threads. Here is result of query02 after patch. For example, > performance improved ~2.5x when run 28 threads.
>
> number of thread:  1      4      8      16     28
> time used(sec):    465.1  193.1  97.9   55.9   41

Wow. That is a very nice improvement.
Re: pgbench - use pg logging capabilities
On Thu, Jan 09, 2020 at 09:27:42PM -0300, Alvaro Herrera wrote: > On 2020-Jan-09, Fabien COELHO wrote: >> -if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) >> +if (pg_log_debug_level) >> { > > Umm ... I find the original exceedingly ugly, but the new line is > totally impenetrable. Maybe just a pg_logging_get_level() for consistency with the _set_level() one, and then compare the returned result with PG_LOG_DEBUG in pgbench? -- Michael
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
I wrote: > ReorderBuffer: 223302560 total in 26995 blocks; 7056 free (3 > chunks); 223295504 used > The test case is only inserting 50K fairly-short rows, so this seems > like an unreasonable amount of memory to be consuming for that; and > even if you think it's reasonable, it clearly isn't going to scale > to large production transactions. > Now, the good news is that v11 and later get through > 006_logical_decoding.pl just fine under the same restriction. > So we did something in v11 to fix this excessive memory consumption. > However, unless we're willing to back-port whatever that was, this > test case is clearly consuming excessive resources for the v10 branch. I dug around a little in the git history for backend/replication/logical/, and while I find several commit messages mentioning memory leaks and faulty spill logic, they all claim to have been back-patched as far as 9.4. It seems reasonably likely to me that this result is telling us about an actual bug, ie, faulty back-patching of one or more of those fixes into v10 and perhaps earlier branches. I don't know this code well enough to take point on looking for the problem, though. regards, tom lane
Re: pgbench - use pg logging capabilities
On 2020-Jan-09, Fabien COELHO wrote: > - if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) > + if (pg_log_debug_level) > { Umm ... I find the original exceedingly ugly, but the new line is totally impenetrable. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
I looked at this a little while and was bothered by the perl changes; it seems out of place to have RecursiveCopy be thinking about tablespaces, which is way out of its league. So I rewrote that to use a callback: the PostgresNode code passes a callback that's in charge to handle the case of a symlink. Things look much more in place with that. I didn't verify that all places that should use this are filled. In 0002 I found adding a new function unnecessary: we can keep backwards compat by checking 'ref' of the third argument. With that we don't have to add a new function. (POD changes pending.) I haven't reviewed 0003. v8 of all these patches attached. "git am" told me your 0001 was in unrecognized format. It applied fine with "patch". I suggest that if you're going to submit a series with commit messages and all, please use "git format-patch" with the same "-v" argument (9 in this case) for all patches. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From a81e747f0bfa90af8021e2399e196e177a93f62c Mon Sep 17 00:00:00 2001 From: Asim R P Date: Fri, 20 Sep 2019 17:31:25 +0530 Subject: [PATCH v8 1/3] Support node initialization from backup with tablespaces User defined tablespaces appear as symlinks in in the backup. This commit tweaks recursive copy subroutine to allow for symlinks specific to tablespaces. Authored by Kyotaro --- src/test/perl/PostgresNode.pm | 29 +++- src/test/perl/RecursiveCopy.pm | 40 -- 2 files changed, 61 insertions(+), 8 deletions(-) diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm index 2e0cf4a2f3..3cae483ddb 100644 --- a/src/test/perl/PostgresNode.pm +++ b/src/test/perl/PostgresNode.pm @@ -593,6 +593,32 @@ sub backup_fs_cold return; } +sub _srcsymlink +{ + my ($srcpath, $destpath) = @_; + + croak "Cannot operate on symlink \"$srcpath\"" + if ($srcpath !~ qr{/(pg_tblspc/[0-9]+)$}); + + # We have mapped tablespaces. 
Copy them individually + my $tmpdir = TestLib::tempdir; + my $dstrealdir = TestLib::perl2host($tmpdir); + my $srcrealdir = readlink($srcpath); + + opendir(my $dh, $srcrealdir); + while (readdir $dh) + { + next if (/^\.\.?$/); + my $spath = "$srcrealdir/$_"; + my $dpath = "$dstrealdir/$_"; + RecursiveCopy::copypath($spath, $dpath); + } + closedir $dh; + + symlink $dstrealdir, $destpath; + + return 1; +} # Common sub of backup_fs_hot and backup_fs_cold sub _backup_fs @@ -680,7 +706,8 @@ sub init_from_backup my $data_path = $self->data_dir; rmdir($data_path); - RecursiveCopy::copypath($backup_path, $data_path); + RecursiveCopy::copypath($backup_path, $data_path, + srcsymlinkfn => \&_srcsymlink); chmod(0700, $data_path); # Base configuration for this node diff --git a/src/test/perl/RecursiveCopy.pm b/src/test/perl/RecursiveCopy.pm index baf5d0ac63..715edcdedd 100644 --- a/src/test/perl/RecursiveCopy.pm +++ b/src/test/perl/RecursiveCopy.pm @@ -66,6 +66,7 @@ sub copypath { my ($base_src_dir, $base_dest_dir, %params) = @_; my $filterfn; + my $srcsymlinkfn; if (defined $params{filterfn}) { @@ -80,31 +81,55 @@ sub copypath $filterfn = sub { return 1; }; } + if (defined $params{srcsymlinkfn}) + { + croak "if specified, srcsymlinkfn must be a subroutine reference" + unless defined(ref $params{srcsymlinkfn}) + and (ref $params{srcsymlinkfn} eq 'CODE'); + + $srcsymlinkfn = $params{srcsymlinkfn}; + } + else + { + $srcsymlinkfn = undef; + } + # Complain if original path is bogus, because _copypath_recurse won't. 
croak "\"$base_src_dir\" does not exist" if !-e $base_src_dir; # Start recursive copy from current directory - return _copypath_recurse($base_src_dir, $base_dest_dir, "", $filterfn); + return _copypath_recurse($base_src_dir, $base_dest_dir, "", $filterfn, $srcsymlinkfn); } # Recursive private guts of copypath sub _copypath_recurse { - my ($base_src_dir, $base_dest_dir, $curr_path, $filterfn) = @_; + my ($base_src_dir, $base_dest_dir, $curr_path, $filterfn, + $srcsymlinkfn) = @_; my $srcpath = "$base_src_dir/$curr_path"; my $destpath = "$base_dest_dir/$curr_path"; # invoke the filter and skip all further operation if it returns false return 1 unless &$filterfn($curr_path); - # Check for symlink -- needed only on source dir - # (note: this will fall through quietly if file is already gone) - croak "Cannot operate on symlink \"$srcpath\"" if -l $srcpath; - # Abort if destination path already exists. Should we allow directories # to exist already? croak "Destination path \"$destpath\" already exists" if -e $destpath; + # Check for symlink -- needed only on source dir + # If caller provided us with a callback, call it; otherwise we're out. + if (-l $srcpath) + { + if (defined $srcsymlinkfn) + { + return &$srcsymlinkfn($srcpath, $destpath); + } + else + { + croak "Cannot operate on symlink
Re: pgbench - use pg logging capabilities
On Thu, Jan 09, 2020 at 10:28:21AM +0100, Fabien COELHO wrote: > Yep, I thought of it, but I was not very keen on having a malloc/free cycle > just for one debug message. However under debug this is probably not an > issue. Consistency is more important here IMO, so applied. > Your patch works for me. It can avoid some level of format interpretation > overheads by switching to Char/Str functions, see first attachment. I kept both grouped to avoid any unnecessary churn with the manipulation of PQExpBufferData. > The other point is the test on __pg_log_level, see second attached. Maybe better to discuss that on a separate thread as that's not only related to pgbench. -- Michael
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Amit Kapila writes: > On Thu, Jan 9, 2020 at 11:15 AM Tom Lane wrote: >> Noah Misch writes: >>> Even so, a web search for "extend_brk" led to the answer. By default, >>> 32-bit >>> AIX binaries get only 256M of RAM for stack and sbrk. The new regression >>> test >>> used more than that, hence this crash. >> Hm, so >> (1) Why did we get a crash and not some more-decipherable out-of-resources >> error? Can we improve that experience? >> (2) Should we be dialing back the resource consumption of this test? > In HEAD, we have a guc variable 'logical_decoding_work_mem' by which > we can control the memory usage of changes and we have used that, but > for back branches, we don't have such a control. I poked into this a bit more by running the src/test/recovery tests under restrictive ulimit settings. I used ulimit -s 1024 ulimit -v 250000 (At least on my 64-bit RHEL6 box, reducing ulimit -v much below this causes initdb to fail, apparently because the post-bootstrap process tries to load all our tsearch and encoding conversion shlibs at once, and it hasn't got enough VM space to do so. Someday we may have to improve that.) I did not manage to duplicate Noah's crash this way. What I see in the v10 branch is that the new 006_logical_decoding.pl test fails, but with a clean "out of memory" error. The memory map dump that that produces fingers the culprit pretty unambiguously: ... ReorderBuffer: 223302560 total in 26995 blocks; 7056 free (3 chunks); 223295504 used ReorderBufferByXid: 24576 total in 2 blocks; 11888 free (3 chunks); 12688 used Slab: TXN: 8192 total in 1 blocks; 5208 free (21 chunks); 2984 used Slab: Change: 2170880 total in 265 blocks; 2800 free (35 chunks); 2168080 used ...
Grand total: 226714720 bytes in 27327 blocks; 590888 free (785 chunks); 226123832 used

The test case is only inserting 50K fairly-short rows, so this seems like an
unreasonable amount of memory to be consuming for that; and even if you
think it's reasonable, it clearly isn't going to scale to large production
transactions.

Now, the good news is that v11 and later get through 006_logical_decoding.pl
just fine under the same restriction. So we did something in v11 to fix this
excessive memory consumption. However, unless we're willing to back-port
whatever that was, this test case is clearly consuming excessive resources
for the v10 branch.

We're not out of the woods either. I also observe that v12 and HEAD fall
over, under these same test conditions, with a stack-overflow error in the
012_subtransactions.pl test. This seems to be due to somebody's decision to
use a heavily recursive function to generate a bunch of subtransactions.
Is there a good reason for hs_subxids() to use recursion instead of a loop?
If there is, what's the value of using 201 levels rather than, say, 10?

Anyway it remains unclear why Noah's machine got a crash instead of
something more user-friendly. But the reason why it's only in the v10
branch seems non-mysterious.

regards, tom lane
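To make the recursion-vs-loop point concrete: generating N nested levels recursively consumes one C stack frame per level, which is exactly what falls over under a small `ulimit -s`, while an iterative rewrite uses constant stack depth. A stand-alone sketch with invented names (not the test's actual hs_subxids()):

```c
#include <assert.h>

/* Count n nested "subtransaction" levels recursively: one stack frame per
 * level, so stack usage grows linearly with n. */
static int
levels_recursive(int n)
{
	if (n <= 0)
		return 0;
	return 1 + levels_recursive(n - 1);
}

/* Same result with constant stack depth. */
static int
levels_iterative(int n)
{
	int			count = 0;

	for (int i = 0; i < n; i++)
		count++;
	return count;
}
```

Both produce the same count, so a loop-based generator would preserve the test's behavior while staying safe under restrictive stack limits.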
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Mon, Jan 6, 2020 at 4:43 AM Masahiko Sawada < masahiko.saw...@2ndquadrant.com> wrote:
> On Sat, 4 Jan 2020 at 15:11, cary huang wrote:
>>
>> Hello Sawada and all
>>
>> I would like to elaborate more on Sehrope and Sawada's discussion on
>> passing NULL IV in "pg_cipher_encrypt/decrypt" functions during
>> kmgr_wrap_key and kmgr_unwrap_key routines in kmgr_utils.c. Openssl
>> implements key wrap according to RFC 3394, as Sawada mentioned, and
>> passing NULL will make openssl use the default IV, which equals
>> A6A6A6A6A6A6A6A6. I have confirmed this on my side; a key wrapped with a
>> NULL IV can be unwrapped successfully with IV=A6A6A6A6A6A6A6A6, and
>> unwrap will fail if the IV is set to anything other than NULL or
>> A6A6A6A6A6A6A6A6.
>>
>
> Sehrope also suggested that I not use the fixed IV in order to avoid
> getting the same result from the same value. I'm researching it now.
> Also, currently it's using the key wrap algorithm[1], but it accepts only
> multiples of 8 bytes as input. Since that's not good for some cases it's
> better to use the key wrap with padding algorithm[2] instead, which seems
> available in OpenSSL 1.1.0 or later.

Since the current kmgr only supports AES128 and AES256, the master key
generated by kmgr during bootstrap will always be a multiple of 8 bytes. I
believe AES keys in general must always be a multiple of 8 bytes. I have
not done enough research as to which encryption algorithms involve keys
that are not multiples of 8 bytes, so I think the current
key_wrap_algorithm is fine.

With the key wrap algorithm defined in RFC 3394, the IV is used only as an
"initial value" and it has to be static: either we randomize one or we use
the default A6A6A6... by passing NULL. It is different from the CTR block
cipher mode, which has been selected to encrypt WAL and buffers. In CTR
mode, each block requires a different and unique IV as input in order to be
secure; and we have agreed to use segment IDs as IVs.
For this reason, I think the current key wrap implementation is fine. The
least we can do is to generate an IV during bootstrap and store it in the
control file, and this generated IV will be used for all key wrapping /
unwrapping purposes instead of the default one.

Best regards

Cary Huang
HighGo Software Canada
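The two facts this exchange relies on can be checked mechanically: RFC 3394's default IV is the eight-byte constant A6A6A6A6A6A6A6A6, and every AES key size (16, 24, or 32 bytes) is a multiple of 8, so plain (unpadded) key wrap can always accept an AES key. A small stand-alone sketch (illustrative only, not kmgr code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 3394's default IV ("initial value"): eight 0xA6 bytes.  Passing a
 * NULL IV to OpenSSL's key wrap makes it fall back to this constant. */
static const uint8_t rfc3394_default_iv[8] =
{0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6};

/* Plain (unpadded) RFC 3394 key wrap only accepts inputs of at least two
 * 64-bit blocks that are a multiple of 8 bytes; AES keys (16, 24, or 32
 * bytes) always satisfy this. */
static int
wrappable_key_len(size_t len)
{
	return len >= 16 && len % 8 == 0;
}
```

A key material length like 20 bytes would need the padded variant (RFC 5649, "key wrap with padding") that Sawada mentions.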
Re: Setting min/max TLS protocol in clientside libpq
Thanks for the review, everyone! A v2 of the patch, which I believe addresses all concerns raised, is attached.

> On 6 Jan 2020, at 07:01, Michael Paquier wrote:
>
> On Thu, Jan 02, 2020 at 09:46:44PM +, cary huang wrote:
>> I agree with Arthur that it makes sense to check the validity of
>> "conn->sslmaxprotocolversion" first before checking if it is larger
>> than "conn->sslminprotocolversion"
>
> Here I don't agree. Why not just let OpenSSL handle things with
> SSL_CTX_set_min_proto_version? We don't bother about that in the
> backend code for that reason, on top of keeping the code more simple
> with less error handling. And things are cleaner when it comes to
> this libpq patch by giving up with the INT_MIN hack.

I looked into this and it turns out that OpenSSL does nothing to prevent the
caller from setting a nonsensical protocol range like min=tlsv1.3,max=tlsv1.1.
Thus, it's quite easy to screw up the backend server config and get it to
start properly, but with quite unrelated error messages as a result on
connection. Since I think this needs to be dealt with for both backend and
frontend (if this is accepted), I removed it from this patch to return to it
in a separate thread.

>> In the patch, if the client does not supply "sslminprotocolversion",
>> it will run to "else" statement and sets TLS min version to "INT_MIN",
>> which is a huge negative number and of course openssl won't set
>> it. I think this else statement can be enhanced a little to set
>> "sslminprotocolversion" to TLSv1.2 by default to match the server
>> and provide some warning message that TLS minimum has defaulted to
>> TLSv1.2.
>
> In this patch fe-secure-openssl.c has just done a copy-paste of
> SSL_CTX_set_min_proto_version and SSL_CTX_set_max_proto_version
> present in be-secure-openssl.c. That's not good. Could you refactor
> that please as a separate file?

Done. I opted for a more generic header to make usage of the code easier,
not sure if that's ok.
One thing I noticed when looking at it is that we now have sha2_openssl.c and openssl_protocol.c in src/common. For easier visual grouping of OpenSSL functionality, it makes sense to me to rename sha2_openssl.c to openssl_sha2.c, but that might just be pointless churn. > The patch should have tests in src/test/ssl/, like for invalid values, > incorrect combinations leading to failures, etc. Also done. cheers ./daniel libpq_minmaxproto_v2.patch Description: Binary data
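Since OpenSSL itself accepts a min > max range, the validation deferred to a separate thread would presumably amount to a simple ordering check before calling SSL_CTX_set_min_proto_version / SSL_CTX_set_max_proto_version. A stand-alone sketch with invented enum names (OpenSSL's real constants are TLS1_VERSION through TLS1_3_VERSION, with 0 meaning "no bound"):

```c
#include <assert.h>

/* Invented stand-ins for OpenSSL's protocol version constants; 0 plays
 * the role of "no bound", as in SSL_CTX_set_{min,max}_proto_version. */
enum tls_version
{
	TLS_ANY = 0,
	TLS_V1,
	TLS_V1_1,
	TLS_V1_2,
	TLS_V1_3
};

/* The check OpenSSL does not do for us: reject min > max up front, so the
 * failure is a clear configuration error rather than a confusing one at
 * connection time. */
static int
tls_range_valid(enum tls_version min_ver, enum tls_version max_ver)
{
	if (min_ver == TLS_ANY || max_ver == TLS_ANY)
		return 1;				/* an unbounded end is always acceptable */
	return min_ver <= max_ver;
}
```

Running such a check in both be-secure-openssl.c and the libpq frontend would catch the min=tlsv1.3,max=tlsv1.1 misconfiguration described above at startup.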
Re: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
On Fri, Jan 10, 2020 at 8:32 AM Tom Lane wrote:
>
> Andrew Dunstan writes:
> > On Fri, Jan 10, 2020 at 1:21 AM Robert Haas wrote:
> >> I share the concern about the security issue here. I can't testify to
> >> whether Christoph's whole analysis is here, but as a general point,
> >> non-superusers can't be allowed to do things that cause the server to
> >> access arbitrary local files.
>
> > It's probably fairly easy to do (c.f. 6136e94dcb). I'm not (yet)
> > convinced that there is any significant security threat here. This
> > doesn't give the user or indeed any postgres code any access to the
> > contents of these files. But if there is a consensus to restrict this
> > I'll do it.
>
> Well, even without access to the file contents, the mere ability to
> probe the existence of a file is something we don't want unprivileged
> users to have. And (I suppose) this is enough for that, by looking
> at what error you get back from trying it.

OK, that's convincing enough. Will do it before long.

cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgbench - rework variable management
Patch v4 is just a rebase. Patch v5 is a rebase with some adjustments.

--
Fabien.

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index f402fe7b91..e1d3ef9bb3 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -10,6 +10,7 @@ include $(top_builddir)/src/Makefile.global
 OBJS = \
 	$(WIN32RES) \
 	exprparse.o \
+	symbol_table.o \
 	pgbench.o

 override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 85d61caa9f..ca21ee012e 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -211,7 +211,7 @@ make_variable(char *varname)
 	PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr));

 	expr->etype = ENODE_VARIABLE;
-	expr->u.variable.varname = varname;
+	expr->u.variable.varnum = getOrCreateSymbolId(symbol_table, varname);
 	return expr;
 }
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 430bff38a6..a3c5ea348d 100644
--- a/src/bin/pgbench/exprscan.l
+++ b/src/bin/pgbench/exprscan.l
@@ -207,19 +207,19 @@ notnull [Nn][Oo][Tt][Nn][Uu][Ll][Ll]
 {digit}+		{
 					if (!strtoint64(yytext, true, &yylval->ival))
 						expr_yyerror_more(yyscanner, "bigint constant overflow",
-										  strdup(yytext));
+										  pg_strdup(yytext));
 					return INTEGER_CONST;
 				}
 {digit}+(\.{digit}*)?([eE][-+]?{digit}+)?	{
 					if (!strtodouble(yytext, true, &yylval->dval))
 						expr_yyerror_more(yyscanner, "double constant overflow",
-										  strdup(yytext));
+										  pg_strdup(yytext));
 					return DOUBLE_CONST;
 				}
 \.{digit}+([eE][-+]?{digit}+)?	{
 					if (!strtodouble(yytext, true, &yylval->dval))
 						expr_yyerror_more(yyscanner, "double constant overflow",
-										  strdup(yytext));
+										  pg_strdup(yytext));
 					return DOUBLE_CONST;
 				}
 {alpha}{alnum}*	{
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ee1134aea2..22ca43738a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -251,24 +251,32 @@ const char *progname;
 volatile bool timer_exceeded = false;	/* flag from signal handler */

 /*
- * Variable definitions.
+ * Variable contents.
+ *
+ * Variable names are managed in symbol_table with a number.
 *
 * If a variable only has a string value, "svalue" is that value, and value is
- * "not set".  If the value is known, "value" contains the value (in any
- * variant).
+ * not set.
+ *
+ * If the value is known, "value" contains the value (in any variant).
 *
 * In this case "svalue" contains the string equivalent of the value, if we've
- * had occasion to compute that, or NULL if we haven't.
+ * had occasion to compute that, or an empty string if we haven't.
+ *
+ * The string length is arbitrarily limited to benefit from static allocation.
 */
+#define SVALUE_SIZE 128
 typedef struct
 {
-	char	   *name;			/* variable's name */
-	char	   *svalue;			/* its value in string form, if known */
-	PgBenchValue value;			/* actual variable's value */
+	int			number;			/* variable symbol number, -1 if unset */
+	char		svalue[SVALUE_SIZE];	/* value in string form, if known */
+	PgBenchValue value;			/* actual value, if known */
 } Variable;

+#define undefined_variable_value(v) \
+	v.svalue[0] == '\0' && v.value.type == PGBT_NO_VALUE
+
 #define MAX_SCRIPTS	128		/* max number of SQL scripts allowed */
-#define SHELL_COMMAND_SIZE	256	/* maximum size allowed for shell command */

 /*
  * Simple data structure to keep stats about something.
@@ -414,7 +422,6 @@ typedef struct
 	/* client variables */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;		/* number of variables */
-	bool		vars_sorted;	/* are variables sorted by name? */

 	/* various times about current transaction */
 	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
@@ -427,6 +434,9 @@ typedef struct
 	/* per client collected stats */
 	int64		cnt;			/* client transaction count, for -t */
 	int			ecnt;			/* error count */
+
+	/* next buffer should be at thread level, but it would change functions... */
+	PQExpBufferData buf;		/* persistent buffer to avoid malloc/free cycles */
 } CState;

 /*
@@ -512,6 +522,7 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 * varprefix	SQL commands terminated with \gset have this set
 *				to a non NULL value.  If nonempty, it's used to prefix the
 *				variable name that receives the value.
+ * set_varnum	variable symbol number for \set and \setshell.
 * expr		Parsed expression, if needed.
 * stats		Time spent in this command.
 */
@@ -524,6 +535,7 @@ typedef struct Command
 	int			argc;
 	char	   *argv[MAX_ARGS];
 	char	   *varprefix;
+	int			set_varnum;
 	PgBenchExpr *expr;
 	SimpleStats stats;
 } Command;
@@ -540,6 +552,9 @@
 static ParsedScript sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;	/* number of scripts in sql_script[] */
 static int64 total_weight = 0;
+#define
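For readers of the patch, the essential contract of the new symbol_table module is a stable name-to-number mapping, so that Variable and Command can carry an int instead of a string. A much-simplified stand-alone sketch (the real getOrCreateSymbolId takes an explicit symbol table argument and is presumably more efficient than a linear scan; this version only illustrates the contract):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Linear-scan stand-in for the patch's symbol table: a variable name maps
 * to a stable small integer, and the same name always gets the same
 * number back. */
#define MAX_SYMBOLS 128

static char *symbols[MAX_SYMBOLS];
static int	nsymbols = 0;

static int
getOrCreateSymbolId(const char *name)
{
	for (int i = 0; i < nsymbols; i++)
		if (strcmp(symbols[i], name) == 0)
			return i;			/* known name: return its existing number */

	if (nsymbols >= MAX_SYMBOLS)
		return -1;				/* table full */

	symbols[nsymbols] = malloc(strlen(name) + 1);
	strcpy(symbols[nsymbols], name);
	return nsymbols++;
}
```

Once every variable name has a number, per-client variable lookup becomes array indexing, which is what lets the patch drop the vars_sorted bookkeeping.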
Re: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
Andrew Dunstan writes: > On Fri, Jan 10, 2020 at 1:21 AM Robert Haas wrote: >> I share the concern about the security issue here. I can't testify to >> whether Christoph's whole analysis is here, but as a general point, >> non-superusers can't be allowed to do things that cause the server to >> access arbitrary local files. > It's probably fairly easy to do (c.f. 6136e94dcb). I'm not (yet) > convinced that there is any significant security threat here. This > doesn't give the user or indeed any postgres code any access to the > contents of these files. But if there is a consensus to restrict this > I'll do it. Well, even without access to the file contents, the mere ability to probe the existence of a file is something we don't want unprivileged users to have. And (I suppose) this is enough for that, by looking at what error you get back from trying it. regards, tom lane
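Tom's point about existence probing can be demonstrated outside postgres entirely: the error a failed open reports differs between a missing file and an unreadable one, so error messages alone leak whether a path exists. A stand-alone C sketch (nothing here is postgres_fdw code):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Returns 0 if the file could be opened, else the errno from the failed
 * attempt.  ENOENT vs. EACCES is exactly the signal that lets a user who
 * can make the server open an arbitrary path learn whether it exists,
 * even with no access to the file's contents. */
static int
probe_errno(const char *path)
{
	FILE	   *f = fopen(path, "r");

	if (f != NULL)
	{
		fclose(f);
		return 0;
	}
	return errno;
}
```

An unprivileged user who can point sslkey at an arbitrary path and observe the resulting connection error gets the same distinction, which is why a superuser check is warranted.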
Re: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
> On 9 Jan 2020, at 22:38, Andrew Dunstan wrote:
> I'm not (yet)
> convinced that there is any significant security threat here. This
> doesn't give the user or indeed any postgres code any access to the
> contents of these files. But if there is a consensus to restrict this
> I'll do it.

I've seen successful exploits built from parts that I, in my wildest
imagination, couldn't have thought useful, so FWIW +1 for adding belts to
suspenders and restricting this.

cheers ./daniel
Re: pgsql: Add basic TAP tests for psql's tab-completion logic.
I wrote: > Peter Eisentraut writes: >> I think he means something like >> make check with_readline=no >> not for the actual build. > Oh, I see. I'd rather not codify that though, because it risks > problems if that symbol ever gets used any other way. I was > thinking of making the test script check for some independent > environment variable, say SKIP_READLINE_TESTS. I thought of another problem with the with_readline=no method, which is that it requires the user to be issuing "make check" directly; it wouldn't be convenient for a buildfarm owner, say. (*Perhaps* it'd work to set with_readline=no throughout a buildfarm run, but I think that's asking for trouble with the build part.) I pushed a patch using SKIP_READLINE_TESTS. Christoph should be able to set that for the Ubuntu branches where the test is failing. regards, tom lane
Re: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
On Fri, Jan 10, 2020 at 1:21 AM Robert Haas wrote:
>
> On Thu, Jan 9, 2020 at 5:30 AM Christoph Berg wrote:
> > I have some concerns about security, though. It's true that the
> > sslcert/sslkey options can only be set/modified by superusers when
> > "password_required" is set. But when password_required is not set, any
> > user can create user mappings that reference arbitrary files on the
> > server filesystem. I believe the options are still used in that case
> > for creating connections, even when that means the remote server isn't
> > set up for cert auth, which needs password_required=false to succeed.
> >
> > In short, I believe these options need explicit superuser checks.
>
> I share the concern about the security issue here. I can't testify to
> whether Christoph's whole analysis is here, but as a general point,
> non-superusers can't be allowed to do things that cause the server to
> access arbitrary local files.

It's probably fairly easy to do (c.f. 6136e94dcb). I'm not (yet)
convinced that there is any significant security threat here. This
doesn't give the user or indeed any postgres code any access to the
contents of these files. But if there is a consensus to restrict this
I'll do it.

cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Coding in WalSndWaitForWal
Alvaro Herrera writes: > However, we already have a large number of proc_exit() calls in switch > blocks that are not followed by breaks. In fact, the majority are > already like that. Oh, hmm ... consistency is good. regards, tom lane
Re: our checks for read-only queries are not great
Robert Haas writes: > On Thu, Jan 9, 2020 at 3:07 PM Tom Lane wrote: >> You could argue about exactly how to extend that to non-spec >> utility commands, but for the most part allowing them seems >> to make sense if DML is allowed. > But I think we allow them on all tables, not just temp tables, so I > don't think I understand this argument. Oh, I misunderstood your concern. Peter might remember more clearly, but I have a feeling that we concluded that the intent of the spec was for read-only-ness to disallow globally-visible changes in the visible database contents. VACUUM, for example, does not cause any visible change, so it should be admissible. REINDEX ditto. (We ignore here the possibility of such things causing, say, a change in the order in which rows are returned, since that's beneath the spec's notice to begin with.) ANALYZE ditto, except to the extent that if you look at pg_stats you might see something different --- but again, system catalog contents are outside the spec's purview. You could extend this line of argument, perhaps, far enough to justify ALTER SYSTEM SET as well. But I don't like that because some GUCs have visible effects on the results that an ordinary query minding its own business can get. Timezone is perhaps the poster child there, or search_path. If we were to subdivide the GUCs into "affects implementation details only" vs "can affect query semantics", I'd hold still for allowing ALTER SYSTEM SET on the former group. Doubt it's worth the trouble to distinguish, though. regards, tom lane
Re: our checks for read-only queries are not great
On Thu, Jan 9, 2020 at 3:37 PM Robert Haas wrote: > > You could argue about exactly how to extend that to non-spec > > utility commands, but for the most part allowing them seems > > to make sense if DML is allowed. > > But I think we allow them on all tables, not just temp tables, so I > don't think I understand this argument. Oh, wait: I'm conflating two things. The current behavior extends the spec behavior to COPY in a logical way. But it also allows CLUSTER, REINDEX, and VACUUM on any table. The spec presumably has no view on that, nor does the passage you quoted seem to apply here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: our checks for read-only queries are not great
On Thu, Jan 9, 2020 at 3:07 PM Tom Lane wrote: > Robert Haas writes: > > Maybe the SQL standard has something to say about this? > > [ pokes around ... ] Yeah, it does, and I'd say it's pretty clearly > in agreement with what Peter did, so far as DML ops go. For instance, > this bit from SQL99's description of DELETE: > > 1) If the access mode of the current SQL-transaction or the access > mode of the branch of the current SQL-transaction at the current > SQL-connection is read-only, and T is not a temporary table, > then an exception condition is raised: invalid transaction state > - read-only SQL-transaction. > > UPDATE and INSERT say the same. (I didn't look at later spec versions, > since Peter's 2003 commit was probably based on SQL99.) OK. That's good to know. > You could argue about exactly how to extend that to non-spec > utility commands, but for the most part allowing them seems > to make sense if DML is allowed. But I think we allow them on all tables, not just temp tables, so I don't think I understand this argument. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Removing pg_pltemplate and creating "trustable" extensions
On Thu, Jan 9, 2020 at 3:18 PM Tom Lane wrote: > * ISTM that that's assuming that the DBA and the sysadmin are the same > person (or at least hold identical views on this subject). In many > installations it'd only be root who has control over what's in that > directory, and I don't think it's unreasonable for the DBA to wish > to be able to exercise additional filtering. An emphatic +1 from me. This is what I've been trying to argue over and over, apparently rather unclearly. > * The point of a default role would be for the DBA to be able to > control which database users can install extensions. Even if the > DBA has full authority over the extension library, that would not > provide control over who can install, only over what is available > for any of them to install. I agree with that, too. I guess you could decide that the answer to the question "who can install extensions?" must be the same as the answer to the question "who owns a database?" but having the flexibility to make the answers to those questions different seems better than forcing them to always be the same. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR
> On 9 Jan 2020, at 17:43, Daniel Verite wrote: > > […] > (using plain text instead of HTML messages might help with that). Thanks. I’ll do that next time. > […] > * unnest-refcursor-v3.patch needs a slight rebase because this chunk > in the Makefile fails > - regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \ > + refcursor.o regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \ Likewise I’ll make that rebase in the next version. > * I'm under the impression that refcursor_unnest() is functionally > equivalent to a plpgsql implementation like this: > > […] > > but it would differ in performance, because internally a materialization step > could be avoided, but only when the other patch "Allow FunctionScans to > pipeline results" gets in? Yes. That’s at least true if unnest(x) is used in the FROM. If it’s used in the SELECT, actually it can get the performance benefit right away. However, in the SELECT case, there’s a bit of a gotcha because anonymous records can’t easily be manipulated because they have no type information available. So to make a useful performance contribution, it does need to be combined with another change — either to make it FROM pipeline as in my other patch, or perhaps enabling anonymous record types to be cast or otherwise manipulated. > […] > /* > - * UNNEST > + * UNNEST (array) > */ > > This chunk looks unnecessary? It was for purpose of disambiguating. But indeed it is unnecessary. Choosing a different name would avoid need for it. > * some user-facing doc would be needed. Indeed. I fully intend that. I figured I’d get the concept on a firmer footing first. > * Is it good to overload "unnest" rather than choosing a specific > function name? Yeah. I wondered about that. 
A couple of syntactically obvious ideas were:

SELECT … FROM TABLE (x)     (which is what I think Oracle does, but is reserved)
SELECT … FROM CURSOR (x)    (which seems likely to confuse, but, surprisingly, isn’t actually reserved)
SELECT … FROM FETCH (x)     (which I quite like, but is reserved)
SELECT … FROM ROWS_FROM (x) (is okay, but conflicts with our ROWS FROM construct)

So I kind of landed on UNNEST(x) for lack of a better idea. EXPAND(x) could
be an option. Actually ROWS_IN(x) or ROWS_OF(x) might work. Do you have any
preference or suggestion?

Thanks a lot for the feedback.

denty.
Re: Removing pg_pltemplate and creating "trustable" extensions
Stephen Frost writes: > I am not particularly concerned about that backwards compatibility issue > and I don’t intend to base my argument on it, but I would use that case to > point out that we have long had the ability to install trusted C functions > into the backend as a DB owner, without complaint from either users or > security pedants, Right, which is why my patch proposes generalizing that feature for trusted PLs into a general feature for other extensions. I'd be much leerier of that if we'd had any pushback on it for trusted PLs. > ... and that backs up my position that we are setting up this > privilege at the wrong level by using a default role which a superuser must > grant independently from DB ownership. Don't see how this follows. It's somewhat accidental I think that the existing behavior is tied to DB ownership. That's just because at the time, that's the only sort of privilege we had that seemed intermediate between superuser and Joe User. If we were designing the behavior today, with default roles already a done deal for handing out possibly-dangerous privileges, I think there's no question that we'd be setting up this privilege as a default role rather than tying it to DB ownership. We don't make DB ownership a prerequisite to creating other sorts of functions, yet other functions can be just as dangerous in some cases as C functions. regards, tom lane
Re: Coding in WalSndWaitForWal
On 2020-Jan-09, Tom Lane wrote:

> Alvaro Herrera writes:
> > In modern times, we define pg_attribute_noreturn() like this:
>
> > /* GCC, Sunpro and XLC support aligned, packed and noreturn */
> > #if defined(__GNUC__) || defined(__SUNPRO_C) || defined(__IBMC__)
> > #define pg_attribute_noreturn() __attribute__((noreturn))
> > #define HAVE_PG_ATTRIBUTE_NORETURN 1
> > #else
> > #define pg_attribute_noreturn()
> > #endif
>
> > I suppose this will cause warnings in compilers other than those, but
> > I'm not sure if we care. What about MSVC for example?
>
> Yeah, the lack of coverage for MSVC seems like the main reason not
> to assume this works "everywhere of interest".

That would be easy to add as __declspec(noreturn) ... except that in MSVC
the decoration goes *before* the prototype rather than after it, so this
seems difficult to achieve without invasive surgery.
https://docs.microsoft.com/en-us/cpp/cpp/noreturn?view=vs-2015

> > With the attached patch, everything compiles cleanly in my setup, no
> > warnings, but then it's GCC.
>
> Meh ... I'm not really convinced that any of those changes are
> improvements. Particularly not the removals of switch-case breaks.

However, we already have a large number of proc_exit() calls in switch
blocks that are not followed by breaks. In fact, the majority are
already like that. I can easily leave this well enough alone, though.

--
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Removing pg_pltemplate and creating "trustable" extensions
Greetings, On Thu, Jan 9, 2020 at 14:48 Tom Lane wrote: > Robert Haas writes: > > So, if I understand correctly, the patch you are proposing to commit > > has a new system role, and if you've got that system role, then you > > can install extensions. > > Install *trusted* extensions, correct. The patch as it stands also > allows DB owners to install trusted extensions. > > > I thought that part of the earlier debate was > > whether DB owners should also be able to install trusted extensions > > even without that role, and I thought it would be cleaner if the > > answer was "no," because then the superuser could decide whether to > > grant that role or not in particular cases. But I'm not clear whether > > you agreed with that, what Stephen thought about it, or whether that's > > still what you are proposing to commit. > > I agree that if we dropped the proviso about DB owners, it would be > a cleaner design. I included that only for backwards compatibility > with the existing behavior that DB owners can install trusted PLs. > If we can agree that we're willing to lose that behavior, I'd be > perfectly fine with removing the special case for DB owners. > However, I'm unsure whether that compatibility cost is acceptable. > It's definitely likely that it would cause an upgrade headache > for some installations. > > One idea for working around the upgrade problem would be to teach > pg_dumpall to automatically issue "GRANT pg_install_trusted_extension" > to each DB-owner role, when dumping from a pre-v13 database. There's > room to object to that, because it would end with more privilege than > before (that is, an owner of some DB could now install extensions > even in DBs she doesn't own, as long as she can connect to them). > So maybe it's a bad idea. But it would probably reduce the number > of complaints --- and I think a lot of installations would end up > making such grants anyway, because otherwise their DB owners can't > do things they expect to be able to do. 
>
> I should not put words into Stephen's mouth, but perhaps his
> concern about having some DB-level privilege here is to alleviate
> the problem that there's no exact equivalent to the old level of
> privilege that DB ownership afforded, ie you can install in your
> own DB but not others. It's not clear to me whether that behavior
> is critical to preserve.

I am not particularly concerned about that backwards compatibility issue
and I don't intend to base my argument on it, but I would use that case to
point out that we have long had the ability to install trusted C functions
into the backend as a DB owner, without complaint from either users or
security pedants, and that backs up my position that we are setting up this
privilege at the wrong level by using a default role which a superuser must
grant independently from DB ownership.

Thanks,

Stephen
Re: Removing pg_pltemplate and creating "trustable" extensions
Stephen Frost writes: > So I'm at a loss for why there is this insistence on a default role and > a superuser-explicit-granting based approach that goes beyond "is it > installed on the filesystem?" and "is it marked as trusted?". Okay, so it seems like we're down to just this one point of contention. You feel that the superuser can control what is in the extension library directory and that that ought to be sufficient control. I disagree with that, for two reasons: * ISTM that that's assuming that the DBA and the sysadmin are the same person (or at least hold identical views on this subject). In many installations it'd only be root who has control over what's in that directory, and I don't think it's unreasonable for the DBA to wish to be able to exercise additional filtering. * The point of a default role would be for the DBA to be able to control which database users can install extensions. Even if the DBA has full authority over the extension library, that would not provide control over who can install, only over what is available for any of them to install. regards, tom lane
Re: our checks for read-only queries are not great
Robert Haas writes: > Maybe the SQL standard has something to say about this? [ pokes around ... ] Yeah, it does, and I'd say it's pretty clearly in agreement with what Peter did, so far as DML ops go. For instance, this bit from SQL99's description of DELETE: 1) If the access mode of the current SQL-transaction or the access mode of the branch of the current SQL-transaction at the current SQL-connection is read-only, and T is not a temporary table, then an exception condition is raised: invalid transaction state - read-only SQL-transaction. UPDATE and INSERT say the same. (I didn't look at later spec versions, since Peter's 2003 commit was probably based on SQL99.) You could argue about exactly how to extend that to non-spec utility commands, but for the most part allowing them seems to make sense if DML is allowed. regards, tom lane
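The SQL99 rule quoted above reduces to a two-input predicate, which can be sketched directly (illustrative types only, not PostgreSQL's actual read-only enforcement code):

```c
#include <assert.h>

/* Illustrative stand-in for a table descriptor. */
typedef struct
{
	int			is_temporary;
} Table;

/* SQL99's rule for DELETE/UPDATE/INSERT: DML on table t is rejected in a
 * read-only transaction unless t is a temporary table ("invalid
 * transaction state - read-only SQL-transaction" in the spec's wording). */
static int
dml_allowed(int xact_read_only, const Table *t)
{
	return !(xact_read_only && !t->is_temporary);
}
```

The spec only speaks to DML; as the thread notes, how to extend this predicate to utility commands like COPY, VACUUM, CLUSTER, and REINDEX is a judgment call the standard leaves open.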
Re: Removing pg_pltemplate and creating "trustable" extensions
On Thu, Jan 9, 2020 at 2:48 PM Tom Lane wrote: > I agree that if we dropped the proviso about DB owners, it would be > a cleaner design. I included that only for backwards compatibility > with the existing behavior that DB owners can install trusted PLs. > If we can agree that we're willing to lose that behavior, I'd be > perfectly fine with removing the special case for DB owners. > However, I'm unsure whether that compatibility cost is acceptable. > It's definitely likely that it would cause an upgrade headache > for some installations. I was assuming that installing extensions was fairly infrequent and that it probably gets done mostly by superusers anyway, so probably most people won't care if, after upgrading, they needed an extra GRANT to get things working again. That might be wrong, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Removing pg_pltemplate and creating "trustable" extensions
Greetings, * Robert Haas (robertmh...@gmail.com) wrote: > On Thu, Jan 9, 2020 at 1:35 PM Tom Lane wrote: > > Robert Haas writes: > > > Again, as I said upthread, Tom had the exact feature about which I am > > > talking in the first version of the patch. That is a strong argument > > > in favor of it being practical. It's also a pretty good argument that > > > it is at least potentially useful, because Tom doesn't usually do > > > useless things for no reason. > > > > To try to clarify that a bit: I think there is certainly some value > > in allowing superusers to control which extensions could be installed > > by non-superusers, further restricting what we may think is trustworthy. > > Cool. I'm arguing for the position that superusers/admins have the ability to control which extensions exist on the filesystem, and that plus the 'trusted' marking is sufficient flexibility. > > However, I felt at the time that my GUC-based implementation of that > > was ugly, and then Peter raised some concrete points against it, > > so I took it out. I don't want to put it back in the same form. > > I think we could leave designing a replacement for later, because it's > > pretty optional, especially if we aren't aggressive about promoting > > contrib modules to "trusted" status. > > Agreed. Also agreed- which is why I figured we weren't really discussing that any more. > > I don't agree that the lack of > > such a feature is a reason not to commit what I've got. > > I said the same in > http://postgr.es/m/ca+tgmoygwgs_rnmoooczzcgrzfqtfngshaq2gu7lm5skxrf...@mail.gmail.com > - penultimate paragraph, last sentence. I also agree that we don't need the "who can install what extension" flexibility that the original GUC-based approach contemplated, but that's because I don't think we are likely to ever need it. If we do and someone comes up with a good design for it, that'd be fine too. 
> > In any case, AFAICT most of the heat-vs-light in this thread has not > > been about which extensions are trustworthy, but about which users > > should be allowed to install extensions, which seems like a totally > > independent discussion. > > I agree it's independent. It wasn't really the main point of what *I* > was trying to talk about, but the heat-vs-light problem seems to have > totally obscured what I *was* trying to talk about. I'm entirely confused about what you were trying to talk about then. Most of the back-and-forth, as I saw it anyway, were points being raised to say "we can't let the right of installing extensions be allowed to DB owners", which I don't agree with and which I've yet to see an actual justification for beyond "well, we think it should require some explicit superuser privilege-granting, beyond the granting that the superuser does when they create a database owned by a given user." > > And controlling that is also a feature that > > we don't have today, so I'd rather get a minimal feature committed > > for v13 and then later consider whether we need more functionality. > > > > The idea of a DB-level INSTALL privilege addresses the second > > point not the first, unless I'm totally misunderstanding it. As > > I said before, I'm not terribly comfortable with handing control > > of that over to non-superuser DB owners, and I sure don't see why > > doing so should be a required part of the minimal feature. > > So, if I understand correctly, the patch you are proposing to commit > has a new system role, and if you've got that system role, then you > can install extensions. I thought that part of the earlier debate was > whether DB owners should also be able to install trusted extensions > even without that role, and I thought it would be cleaner if the > answer was "no," because then the superuser could decide whether to > grant that role or not in particular cases. 
But I'm not clear whether > you agreed with that, what Stephen thought about it, or whether that's > still what you are proposing to commit. I do *not* agree with having a default role for this, at all. This looks just like the right to CREATE tables or functions inside a schema, except with a DB-level object (an extension) instead of a schema-level object, and that is the purview of the DB owner. The arguments raised about $SCARYEXTENSION and security concerns make a lot of sense- I agree that those are things we should discuss and make sure that allowing a DB owner this privilege won't pave a way for them to get superuser access, but, imv anyway, we discussed those and didn't actually come up with any cases where it'd be an issue, in part thanks to Tom's design where the objects end up owned by the bootstrap superuser except in specific cases. So I'm at a loss for why there is this insistence on a default role and a superuser-explicit-granting based approach that goes beyond "is it installed on the filesystem?" and "is it marked as trusted?". Thanks, Stephen
Re: our checks for read-only queries are not great
On Thu, Jan 9, 2020 at 2:24 PM Tom Lane wrote:
> Robert Haas writes:
> > I'd be really interested to hear if anyone knows the history behind
> > allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables.
> > It seems to have been that way for a long time. I wonder if it was a
> > deliberate choice or something that just happened semi-accidentally.
>
> Within a "read-only" xact you mean? I believe that allowing DML writes
> was intentional. As for the utility commands, I suspect that it was in
> part accidental (error of omission?), and then if anyone thought hard
> about it they decided that allowing DML writes to temp tables justifies
> those operations too.
>
> Have you tried excavating in our git history to see when the relevant
> permission tests originated?

check_xact_readonly() with a long list of command tags originated in the same commit that added read-only transactions. CLUSTER, REINDEX, and VACUUM weren't included in the list of prohibited operations then, either, but it's unclear whether that was a deliberate omission or an oversight. That commit also thought that COPY FROM - and queries - should allow temp tables. But there's nothing in the commit that seems to explain why, unless the commit message itself is a hint:

    commit b65cd562402ed9d3206d501cc74dc38bc421b2ce
    Author: Peter Eisentraut
    Date:   Fri Jan 10 22:03:30 2003 +

        Read-only transactions, as defined in SQL.

Maybe the SQL standard has something to say about this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Coding in WalSndWaitForWal
Alvaro Herrera writes:
> In modern times, we define pg_attribute_noreturn() like this:
>
> /* GCC, Sunpro and XLC support aligned, packed and noreturn */
> #if defined(__GNUC__) || defined(__SUNPRO_C) || defined(__IBMC__)
> #define pg_attribute_noreturn() __attribute__((noreturn))
> #define HAVE_PG_ATTRIBUTE_NORETURN 1
> #else
> #define pg_attribute_noreturn()
> #endif
>
> I suppose this will cause warnings in compilers other than those, but
> I'm not sure if we care. What about MSVC for example?

Yeah, the lack of coverage for MSVC seems like the main reason not to assume this works "everywhere of interest".

> With the attached patch, everything compiles cleanly in my setup, no
> warnings, but then it's GCC.

Meh ... I'm not really convinced that any of those changes are improvements. Particularly not the removals of switch-case breaks.

regards, tom lane
Re: Removing pg_pltemplate and creating "trustable" extensions
Robert Haas writes: > So, if I understand correctly, the patch you are proposing to commit > has a new system role, and if you've got that system role, then you > can install extensions. Install *trusted* extensions, correct. The patch as it stands also allows DB owners to install trusted extensions. > I thought that part of the earlier debate was > whether DB owners should also be able to install trusted extensions > even without that role, and I thought it would be cleaner if the > answer was "no," because then the superuser could decide whether to > grant that role or not in particular cases. But I'm not clear whether > you agreed with that, what Stephen thought about it, or whether that's > still what you are proposing to commit. I agree that if we dropped the proviso about DB owners, it would be a cleaner design. I included that only for backwards compatibility with the existing behavior that DB owners can install trusted PLs. If we can agree that we're willing to lose that behavior, I'd be perfectly fine with removing the special case for DB owners. However, I'm unsure whether that compatibility cost is acceptable. It's definitely likely that it would cause an upgrade headache for some installations. One idea for working around the upgrade problem would be to teach pg_dumpall to automatically issue "GRANT pg_install_trusted_extension" to each DB-owner role, when dumping from a pre-v13 database. There's room to object to that, because it would end with more privilege than before (that is, an owner of some DB could now install extensions even in DBs she doesn't own, as long as she can connect to them). So maybe it's a bad idea. But it would probably reduce the number of complaints --- and I think a lot of installations would end up making such grants anyway, because otherwise their DB owners can't do things they expect to be able to do. 
I should not put words into Stephen's mouth, but perhaps his concern about having some DB-level privilege here is to alleviate the problem that there's no exact equivalent to the old level of privilege that DB ownership afforded, ie you can install in your own DB but not others. It's not clear to me whether that behavior is critical to preserve. regards, tom lane
Re: Coding in WalSndWaitForWal
On 2019-Nov-12, Andres Freund wrote:
> We still have the curious
>     proc_exit(0);
>     abort();    /* keep the compiler quiet */
>
> pattern in WalSndShutdown() - wouldn't the right approach be to instead
> mark proc_exit() with pg_attribute_noreturn()?

proc_exit() is already marked noreturn ... and has been marked as such since commit eeece9e60984 (2012), which is the same commit that added abort() after some proc_exit calls as well as after other routines that were already marked noreturn, such as WalSenderMain(). However, back then we were using the GCC-specific notation of __attribute__((noreturn)), so perhaps the reason we kept the abort() (and a few breaks, etc) after proc_exit() was to satisfy compilers other than GCC.

In modern times, we define pg_attribute_noreturn() like this:

    /* GCC, Sunpro and XLC support aligned, packed and noreturn */
    #if defined(__GNUC__) || defined(__SUNPRO_C) || defined(__IBMC__)
    #define pg_attribute_noreturn() __attribute__((noreturn))
    #define HAVE_PG_ATTRIBUTE_NORETURN 1
    #else
    #define pg_attribute_noreturn()
    #endif

I suppose this will cause warnings in compilers other than those, but I'm not sure if we care. What about MSVC for example?

With the attached patch, everything compiles cleanly in my setup, no warnings, but then it's GCC.
-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index bfc629c753..e202ade4b9 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -304,7 +304,6 @@ AuxiliaryProcessMain(int argc, char *argv[])
 			write_stderr("Try \"%s --help\" for more information.\n",
 						 progname);
 			proc_exit(1);
-			break;
 		}
 	}
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index c2250d7d4e..d26868e578 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -89,8 +89,8 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
 	if (in_restore_command)
 		proc_exit(1);
-	else
-		shutdown_requested = true;
+
+	shutdown_requested = true;
 	WakeupRecovery();
 	errno = save_errno;
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 77360f1524..45b552cdf2 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -208,7 +208,6 @@ WalReceiverMain(void)
 			case WALRCV_STOPPED:
 				SpinLockRelease(&walrcv->mutex);
 				proc_exit(1);
-				break;
 
 			case WALRCV_STARTING:
 				/* The usual case */
@@ -616,8 +615,8 @@ WalRcvWaitForStartPosition(XLogRecPtr *startpoint, TimeLineID *startpointTLI)
 		SpinLockRelease(&walrcv->mutex);
 		if (state == WALRCV_STOPPING)
 			proc_exit(0);
-		else
-			elog(FATAL, "unexpected walreceiver state");
+
+		elog(FATAL, "unexpected walreceiver state");
 	}
 	walrcv->walRcvState = WALRCV_WAITING;
 	walrcv->receiveStart = InvalidXLogRecPtr;
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 9c063749b6..99f897cb32 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -337,7 +337,6 @@ WalSndShutdown(void)
 	whereToSendOutput = DestNone;
 
 	proc_exit(0);
-	abort();				/* keep the compiler quiet */
 }
 
 /*
Re: our checks for read-only queries are not great
Robert Haas writes: > I'd be really interested to hear if anyone knows the history behind > allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables. > It seems to have been that way for a long time. I wonder if it was a > deliberate choice or something that just happened semi-accidentally. Within a "read-only" xact you mean? I believe that allowing DML writes was intentional. As for the utility commands, I suspect that it was in part accidental (error of omission?), and then if anyone thought hard about it they decided that allowing DML writes to temp tables justifies those operations too. Have you tried excavating in our git history to see when the relevant permission tests originated? regards, tom lane
Re: Removing pg_pltemplate and creating "trustable" extensions
On Thu, Jan 9, 2020 at 1:35 PM Tom Lane wrote: > Robert Haas writes: > > Again, as I said upthread, Tom had the exact feature about which I am > > talking in the first version of the patch. That is a strong argument > > in favor of it being practical. It's also a pretty good argument that > > it is at least potentially useful, because Tom doesn't usually do > > useless things for no reason. > > To try to clarify that a bit: I think there is certainly some value > in allowing superusers to control which extensions could be installed > by non-superusers, further restricting what we may think is trustworthy. Cool. > However, I felt at the time that my GUC-based implementation of that > was ugly, and then Peter raised some concrete points against it, > so I took it out. I don't want to put it back in the same form. > I think we could leave designing a replacement for later, because it's > pretty optional, especially if we aren't aggressive about promoting > contrib modules to "trusted" status. Agreed. > I don't agree that the lack of > such a feature is a reason not to commit what I've got. I said the same in http://postgr.es/m/ca+tgmoygwgs_rnmoooczzcgrzfqtfngshaq2gu7lm5skxrf...@mail.gmail.com - penultimate paragraph, last sentence. > In any case, AFAICT most of the heat-vs-light in this thread has not > been about which extensions are trustworthy, but about which users > should be allowed to install extensions, which seems like a totally > independent discussion. I agree it's independent. It wasn't really the main point of what *I* was trying to talk about, but the heat-vs-light problem seems to have totally obscured what I *was* trying to talk about. > And controlling that is also a feature that > we don't have today, so I'd rather get a minimal feature committed > for v13 and then later consider whether we need more functionality. > > The idea of a DB-level INSTALL privilege addresses the second > point not the first, unless I'm totally misunderstanding it. 
As > I said before, I'm not terribly comfortable with handing control > of that over to non-superuser DB owners, and I sure don't see why > doing so should be a required part of the minimal feature. So, if I understand correctly, the patch you are proposing to commit has a new system role, and if you've got that system role, then you can install extensions. I thought that part of the earlier debate was whether DB owners should also be able to install trusted extensions even without that role, and I thought it would be cleaner if the answer was "no," because then the superuser could decide whether to grant that role or not in particular cases. But I'm not clear whether you agreed with that, what Stephen thought about it, or whether that's still what you are proposing to commit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: our checks for read-only queries are not great
On Wed, Jan 8, 2020 at 5:57 PM Stephen Frost wrote:
> Yeah, I don't like the WEAKLY_READ_ONLY idea either- explicitly having
> COMMAND_OK_IN_X seems cleaner.

Done that way. v2 attached.

> Would we be able to make a rule of "can't change on-disk stuff, except
> for things in temporary schemas" and have it stick without a lot of
> complaints? Seems like that would address Tom's ALTER SYSTEM SET
> concern, and would mean CLUSTER/REINDEX/VACUUM are disallowed in a
> backwards-incompatible way (though I think I'm fine with that..), and
> SET would still be allowed (which strikes me as correct too). I'm not
> quite sure how I feel about LISTEN, but that it could possibly actually
> be used post-promotion and doesn't change on-disk stuff makes me feel
> like it actually probably should be allowed.

I think we can make any rule we like, but I think we should have some measure of broad agreement on it. I'd like to go ahead with this for now and then further changes can continue to be discussed and debated. Hopefully we'll get a few more people to weigh in, too, because deciding something like this based on what a handful of people think doesn't seem like a good idea to me.

I'd be really interested to hear if anyone knows the history behind allowing CLUSTER, REINDEX, VACUUM, and some operations on temp tables. It seems to have been that way for a long time. I wonder if it was a deliberate choice or something that just happened semi-accidentally.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

v2-0001-Fix-problems-with-read-only-query-checks-and-refa.patch Description: Binary data
Re: Removing pg_pltemplate and creating "trustable" extensions
Robert Haas writes: > Again, as I said upthread, Tom had the exact feature about which I am > talking in the first version of the patch. That is a strong argument > in favor of it being practical. It's also a pretty good argument that > it is at least potentially useful, because Tom doesn't usually do > useless things for no reason. To try to clarify that a bit: I think there is certainly some value in allowing superusers to control which extensions could be installed by non-superusers, further restricting what we may think is trustworthy. However, I felt at the time that my GUC-based implementation of that was ugly, and then Peter raised some concrete points against it, so I took it out. I don't want to put it back in the same form. I think we could leave designing a replacement for later, because it's pretty optional, especially if we aren't aggressive about promoting contrib modules to "trusted" status. I don't agree that the lack of such a feature is a reason not to commit what I've got. In any case, AFAICT most of the heat-vs-light in this thread has not been about which extensions are trustworthy, but about which users should be allowed to install extensions, which seems like a totally independent discussion. And controlling that is also a feature that we don't have today, so I'd rather get a minimal feature committed for v13 and then later consider whether we need more functionality. The idea of a DB-level INSTALL privilege addresses the second point not the first, unless I'm totally misunderstanding it. As I said before, I'm not terribly comfortable with handing control of that over to non-superuser DB owners, and I sure don't see why doing so should be a required part of the minimal feature. regards, tom lane
Re: Removing pg_pltemplate and creating "trustable" extensions
On Thu, Jan 9, 2020 at 11:30 AM Stephen Frost wrote: > * Robert Haas (robertmh...@gmail.com) wrote: > > On Thu, Jan 9, 2020 at 10:09 AM Stephen Frost wrote: > > > [ wall of text ] > > This really isn't helpful. Sorry. That being said, I'm pretty tired of writing emails that say the same thing over and over again and having you write long responses that don't seem to actually respond to the points being raised in the email. Like: > > I don't see anything in here I really disagree with, but nor do I > > understand why any of it means that giving superusers the ability to > > customize which extensions are database-owner-installable would be a > > bad thing. > > Alright, I think there's definitely something we need to sort through. > > If you agree that the approach I advocated means less code for hosting > providers to have to change in their fork, and that they don't want to > give users superuser, and that they want non-superusers to be able to > install extensions, and that they really don't want to modify things > post-install, then I don't get why you're against the DB-level privilege > that I've been advocating for except that it's "not as customizable." What I was writing about in the quoted paragraph and what you are writing about in the response are two different things. I said *nothing* about a DB-level privilege in the paragraph I wrote, and yet somehow your response to that paragraph says that I'm opposing a DB-level privilege. > Are you saying that in order to have something here that we must make it > so that a superuser is able to specifiy, individually, which extensions > can be installed by which users? You keep coming back to this point of > saying that you want this to be 'customizable' but I really don't see > any justification for the level of customization you're asking for- but > I see an awful lot of work involved. When there's a lot of work > involved for a use-case that no one is actually asking for, I'm really > skeptical. 
The use-cases that you've presented, at least thus far, > certainly haven't swayed me into thinking that you're right that there's > a justifiable use-case here for this level of complicated privileges. I set forth my exact views in http://postgr.es/m/CA+TgmoZK=5EC2O13J3sfOUCvYtvjGtxUKg=wq11q-wy4sc4...@mail.gmail.com Everything since then has been trying to somehow clarify what I wrote in that email, which has resulted in me repeating everything I said there several times in different words. I would like to stop doing that now. It appears to be clarifying nothing, failing to advance the patch, and irritating you. > I'm also not convinced that such a design would even be practical- ... Again, as I said upthread, Tom had the exact feature about which I am talking in the first version of the patch. That is a strong argument in favor of it being practical. It's also a pretty good argument that it is at least potentially useful, because Tom doesn't usually do useless things for no reason. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Add pg_file_sync() to adminpack
Julien Rouhaud writes: > On Thu, Jan 9, 2020 at 6:16 PM Stephen Frost wrote: >> Why would you expect that when it isn't the case for the filesystem >> itself..? > Just a usual habit with durable property. I tend to agree with Stephen on this, mainly because the point of these adminpack functions is to expose filesystem access. If these functions were more "database-y" and less "filesystem-y", I'd agree with trying to impose database-like consistency requirements. We don't have to expose every wart of the filesystem semantics --- for example, it would be reasonable to make pg_file_sync() Do The Right Thing when applied to a directory, even if the particular platform we're on doesn't behave sanely for that. But having fsync separated from write is a pretty fundamental part of most filesystems' semantics, so we ought not try to hide that. regards, tom lane
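[Editorial sketch] The write/fsync separation Tom describes can be made concrete with a minimal Python illustration (not adminpack code; the function name durable_write is invented here). A pg_file_sync()-style wrapper has two durability steps for a freshly written file: fsync the file's own contents, and then fsync the containing directory so the directory entry itself survives a crash.

```python
# Minimal sketch of the write/fsync separation under discussion -- an
# illustration only, not adminpack code.  durable_write is a made-up name.
import os

def durable_write(path: str, data: bytes) -> None:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)            # step 1: file contents reach stable storage
    finally:
        os.close(fd)

    # step 2: make the new directory entry durable as well -- the same
    # extra step durable_rename()/durable_unlink() add in the backend
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)           # some platforms refuse fsync on directories
    except OSError:
        pass
    finally:
        os.close(dfd)
```

Dropping step 1 leaves plain write() semantics, which is exactly the distinction being argued: the filesystem itself never promises durability on write(), so a separate sync primitive mirrors that reality rather than hiding it.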
Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR
Dent John wrote:
> I've made a revision of this patch.

Some comments:

* The commitfest app did not extract the patch from the mail, possibly because it's buried in the MIME structure of the mail (using plain text instead of HTML messages might help with that). The patch has no status in http://commitfest.cputube.org/ probably because of this too.

* unnest-refcursor-v3.patch needs a slight rebase because this chunk in the Makefile fails:

    - regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \
    + refcursor.o regexp.o regproc.o ri_triggers.o rowtypes.o ruleutils.o \

* I'm under the impression that refcursor_unnest() is functionally equivalent to a plpgsql implementation like this:

    create function unnest(x refcursor) returns setof record as $$
    declare
      r record;
    begin
      loop
        fetch x into r;
        exit when not found;
        return next r;
      end loop;
    end
    $$ language plpgsql;

  but it would differ in performance, because internally a materialization step could be avoided, but only when the other patch "Allow FunctionScans to pipeline results" gets in? Or are performance benefits expected right away with this patch?

*   --- a/src/backend/utils/adt/arrayfuncs.c
    +++ b/src/backend/utils/adt/arrayfuncs.c
    @@ -5941,7 +5941,7 @@ array_fill_internal(ArrayType *dims, ArrayType *lbs,
     /*
    - * UNNEST
    + * UNNEST (array)
     */

  This chunk looks unnecessary?

* Some user-facing doc would be needed.

* Is it good to overload "unnest" rather than choosing a specific function name?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
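[Editorial sketch] The materialization-vs-pipelining distinction Daniel raises can be illustrated with a rough Python analogy (plain Python, not PostgreSQL internals; both function names are invented): the plpgsql workaround must build the whole result set before the caller sees a single row, while a pipelined unnest(refcursor) would hand rows to the consumer as the cursor produces them.

```python
# Rough analogy for materialized vs pipelined set-returning functions.
# Not PostgreSQL code; function names are invented for illustration.

def materialized_unnest(fetch_rows):
    """Collect every row up front, like a tuplestore materialization."""
    result = []
    for row in fetch_rows:
        result.append(row)
    return result

def pipelined_unnest(fetch_rows):
    """Yield rows one at a time, like a pipelined FunctionScan would."""
    for row in fetch_rows:
        yield row
```

Both produce the same rows; the difference is when the consumer sees the first one and how much memory is held in the meantime, which is why the benefit may hinge on the FunctionScan-pipelining patch.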
Re: Add pg_file_sync() to adminpack
On Thu, Jan 9, 2020 at 6:16 PM Stephen Frost wrote: > > Greetings, > > * Julien Rouhaud (rjuju...@gmail.com) wrote: > > On Thu, Jan 9, 2020 at 7:43 AM Fujii Masao wrote: > > > On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud > > > wrote: > > > > On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao > > > > wrote: > > > > > I'd like to propose to add pg_file_sync() function into > > > > > contrib/adminpack. > > > > > This function fsyncs the specified file or directory named by its > > > > > argument. > > > > > IMO this is useful, for example, when you want to fsync the file that > > > > > pg_file_write() writes out or that COPY TO exports the data into, > > > > > for durability. Thought? > > > > > > > > +1, that seems like a useful wrapper. Looking at existing functions, > > > > I see that there's a pg_file_rename() in adminpack, but it doesn't use > > > > durable_rename nor does it try to perform any fsync. Same for > > > > pg_file_unlink vs. durable_unlink. It's probably worth fixing that at > > > > the same time? > > > > > > I don't think that's a bug. I'm not sure if every users of those functions > > > need durable rename and unlink at the expense of performance. > > > So IMO it's better to add new argument like "durable" to those functions > > > and durable_rename or _unlink is used only if it's true. > > > > It's probably a POLA violation. I'm pretty sure that most people > > using those functions would expect that a successful call to > > pg_file_unlink() mean that the file cannot raise from the dead even > > with certain unlikely circumstances, at least I'd expect so. If > > performance is a problem here, I'd rather have a new wrapper with a > > sync flag that defaults to true so it's possible to disable it if > > needed instead of calling a different function. That being said, I > > agree with Arthur, it should be handled in a different patch. > > Why would you expect that when it isn't the case for the filesystem > itself..? 
Just a usual habit with the durable property.

> I agree with Fujii on this- you should have to explicitly ask
> for us to do more than the equivalent filesystem-level operation. We
> shouldn't be forcing that on you.

I just checked other somewhat related cases and saw that for instance COPY TO doesn't flush data either, so it's indeed the expected behavior.
Re: Make autovacuum sort tables in descending order of xid_age
On Thu, Dec 12, 2019 at 2:26 PM David Fetter wrote: > > I wonder if you might add information about table size, table changes, > > and bloat to your RelFrozenXidAge struct and modify rfxa_comparator to > > use a heuristic to cost the (age, size, bloat, changed) grouping and > > sort on that cost, such that really large bloated tables with old xids > > might get vacuumed before smaller, less bloated tables that have > > even older xids. Sorting the tables based purely on xid_age seems to > > ignore other factors that are worth considering. I do not have a > > formula for how those four factors should be weighted in the heuristic, > > but you are implicitly assigning three of them a weight of zero in > > your current patch. > > I think it's vastly premature to come up with complex sorting systems > right now. Just sorting in descending order of age should either have > or not have positive effects. A lot of previous efforts to improve autovacuum scheduling have fallen down precisely because they did something that was so simple that it was doomed to regress as many cases as it improved, so I wouldn't be too quick to dismiss Mark's suggestion. In general, sorting by XID age seems like it should be better, but it's not hard to come up with a counterexample: suppose table T1 is going to wrap around in 4 hours and takes 4 hours to vacuum, but table T2 is going to wrap around in 2 hours and takes 1 hour to vacuum. Your algorithm will prioritize T2, but it's better to prioritize T1. A second autovacuum worker may become available for this database later and still get T2 done before we run into trouble, but if we don't start T1 right now, we're hosed. The current algorithm gets this right if T1 was defined before T2 and thus appears earlier in pg_class; your algorithm gets it wrong regardless. I've had the thought for a while now that perhaps we ought to try to estimate the rate of XID consumption, because without that it's really hard to make smart decisions. 
In the above example, if the rate of XID consumption is 4x slower, then it might be smarter to vacuum T2 first, especially if T2 is very heavily updated compared to T1 and might bloat if we don't deal with it right away. At the lower rate of XID consumption, T1 is an urgent problem, but not yet an emergency. However, I've noticed that most people who complain about unexpected wraparound vacuums have them hit in peak periods, which when you think about it, makes a lot of sense. If you consume XIDs 10x as fast during your busy time as your non-busy times, then the XID that triggers the wraparound scan on any given table is very likely to occur during a busy period. So the *current* rate of XID consumption might not be very informative, which makes figuring out what to do here awfully tricky. I think Mark's suggestion of some kind of formula that takes into account the XID age as well as table size and bloat is probably a pretty good one. We'll probably need to make some of the parameters of that formula configurable. Ideally, they'll be easy enough to understand that users can say "oh, I'm using XIDs more or less quickly than normal here, so I need to change parameter X" and even figure out -- without using a calculator -- what sort of value for X might be appropriate. When there's a replication slot or prepared transaction or open transaction holding back xmin, you can't advance the relfrozenxid of that table past that point no matter how aggressively you vacuum it, so it would probably be a good idea to set up the formula so that the weight is based on the amount by which we think we'll be able to advance relfrozenxid rather than, say, the age relative to the last XID assigned. The dominant cost of vacuuming a table is often the number and size of the indexes rather than the size of the heap, particularly because the visibility map may permit skipping a lot of the heap. 
So you have N indexes that need to be read completely and 1 heap that needs to be read only partially. So, whatever portion of the score comes from estimating the cost of vacuuming that table ought to factor in the size of the indexes. Perhaps it should also consider the contents of the visibility map, although I'm less sure about that. One problem with the exponential in Mark's formula is that it might treat small XID differences between old tables as more important than they really are. I wonder if it might be a better idea to compute several different quantities and use the maximum from among them as the prioritization. We can model the priority of vacuuming a particular table as the benefit of vacuuming that table multiplied by the effort. The effort is easy to model: just take the size of the table and its indexes. The benefit is trickier, because there are four different possible benefits: relfrozenxid advancement, relminmxid advancement, dead tuple removal, and marking pages all-visible. So, suppose we model each benefit by a separate equation. For XID advancement, figure
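The benefit-times-effort scoring idea sketched above could look roughly like the following. This is purely an illustrative standalone sketch, not PostgreSQL code: the struct, field names, and the exact formula are all made up, and only the XID-advancement component of the benefit is modeled. Note how the benefit is capped at the xmin horizon, per the point above that relfrozenxid cannot be advanced past an old snapshot or replication slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-table inputs to the scoring formula. */
typedef struct TableVacInfo
{
    uint64_t    heap_pages;      /* size of the heap */
    uint64_t    index_pages;     /* combined size of all indexes */
    uint32_t    frozenxid_age;   /* age of the table's relfrozenxid */
    uint32_t    oldest_xmin_age; /* age of the oldest xmin horizon */
} TableVacInfo;

static uint64_t
vac_priority(const TableVacInfo *t)
{
    /* We can only advance relfrozenxid up to the xmin horizon. */
    uint32_t    advance = 0;

    if (t->frozenxid_age > t->oldest_xmin_age)
        advance = t->frozenxid_age - t->oldest_xmin_age;

    /* benefit (possible XID advancement) multiplied by effort (total size) */
    return (uint64_t) advance * (t->heap_pages + t->index_pages);
}
```

Under this kind of scoring, the big table T1 from the earlier example outranks the small table T2 even though T2's relfrozenxid is slightly older, and a table whose relfrozenxid is pinned by an old xmin scores zero, so vacuuming it for wraparound purposes is correctly deprioritized.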
Re: Add pg_file_sync() to adminpack
Greetings, * Julien Rouhaud (rjuju...@gmail.com) wrote: > On Thu, Jan 9, 2020 at 7:43 AM Fujii Masao wrote: > > On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud wrote: > > > On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao wrote: > > > > I'd like to propose to add pg_file_sync() function into > > > > contrib/adminpack. > > > > This function fsyncs the specified file or directory named by its > > > > argument. > > > > IMO this is useful, for example, when you want to fsync the file that > > > > pg_file_write() writes out or that COPY TO exports the data into, > > > > for durability. Thought? > > > > > > +1, that seems like a useful wrapper. Looking at existing functions, > > > I see that there's a pg_file_rename() in adminpack, but it doesn't use > > > durable_rename nor does it try to perform any fsync. Same for > > > pg_file_unlink vs. durable_unlink. It's probably worth fixing that at > > > the same time? > > > > I don't think that's a bug. I'm not sure if every user of those functions > > needs durable rename and unlink at the expense of performance. > > So IMO it's better to add a new argument like "durable" to those functions > > and durable_rename or _unlink is used only if it's true. > > It's probably a POLA violation. I'm pretty sure that most people > using those functions would expect that a successful call to > pg_file_unlink() means that the file cannot rise from the dead even > under certain unlikely circumstances, at least I'd expect so. If > performance is a problem here, I'd rather have a new wrapper with a > sync flag that defaults to true so it's possible to disable it if > needed instead of calling a different function. That being said, I > agree with Arthur, it should be handled in a different patch. Why would you expect that when it isn't the case for the filesystem itself..? I agree with Fujii on this- you should have to explicitly ask for us to do more than the equivalent filesystem-level operation. We shouldn't be forcing that on you. 
Thanks, Stephen
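The durability pattern under discussion - what durable_rename()/durable_unlink() add over the plain filesystem calls - is fsyncing not just the file but also its parent directory, so the directory entry itself survives a crash. A simplified standalone sketch (not adminpack code; the function name and error handling are illustrative only, and real code would also need to handle EINTR etc.):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write data to a file durably: fsync the file's contents, then fsync
 * the containing directory so the new directory entry is on disk too.
 * Returns 0 on success, -1 on failure.
 */
static int
durable_write(const char *dir, const char *path, const char *data)
{
    int         fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, data, strlen(data)) < 0 || fsync(fd) < 0)
    {
        close(fd);
        return -1;
    }
    close(fd);

    /* fsync the directory so the entry itself is durable */
    int         dfd = open(dir, O_RDONLY);

    if (dfd < 0)
        return -1;
    int         rc = fsync(dfd);

    close(dfd);
    return rc;
}
```

Skipping the directory fsync is exactly the gap Julien points at in pg_file_rename(): the file contents may be durable while the rename or unlink of its name is not.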
Re: [Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDE
Hi, On 2020-01-09 13:17:59 +0530, Dilip Kumar wrote: > I am able to reproduce the failure, I think the assert in the > 'logicalrep_write_insert' is not correct. IMHO even if the replica > identity is set to NOTHING we should be able to replicate INSERT? > > This will fix the issue. > > diff --git a/src/backend/replication/logical/proto.c > b/src/backend/replication/logical/proto.c > index dcf7c08..471461c 100644 > --- a/src/backend/replication/logical/proto.c > +++ b/src/backend/replication/logical/proto.c > @@ -145,7 +145,8 @@ logicalrep_write_insert(StringInfo out, Relation > rel, HeapTuple newtuple) > > Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || >rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || > - rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX); > + rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX || > + rel->rd_rel->relreplident == REPLICA_IDENTITY_NOTHING); > > /* use Oid as relation identifier */ > pq_sendint32(out, RelationGetRelid(rel)); There's not much point in having this assert, right? Given that it covers all choices? Seems better to just drop it. Greetings, Andres Freund
Re: [HACKERS] pg_shmem_allocations view
Hi, On 2020-01-09 11:13:46 -0500, Robert Haas wrote: > On Wed, Jan 8, 2020 at 2:30 AM Kyotaro Horiguchi > wrote: > > I agree with the patch as-is. Thanks for the explanation. > > OK. Thanks for the review, and for your thoughtful consideration of my > comments. > > I went ahead and committed this. Wee! Greetings, Andres Freund
Re: [Proposal] Global temporary tables
On Thu, Jan 09, 2020 at 02:17:08PM +0300, Konstantin Knizhnik wrote: On 06.01.2020 8:04, 曾文旌(义从) wrote: In the previous communication:
1 We agreed on the general direction
1.1 gtt uses local (private) buffers
1.2 no replica access in the first version
2 We feel that gtt needs to maintain statistics, but there is no agreement on how it will be done.
3 Still no one commented on GTT's transaction information processing, which includes:
3.1 Does gtt's frozenxid need to be taken care of?
3.2 gtt's clog cleanup
3.3 How to deal with "too old" gtt data
I suggest we discuss further, reach an agreement, and merge the two patches into one. I also hope that we can come to a common solution for GTT. If we do not try to address parallel execution issues and access to temp tables at replicas (and I agreed that it should be avoided in the first version of the patch), then the GTT patch becomes quite small. Well, that was kinda my goal - making the patch as small as possible by eliminating bits that are contentious or where we don't know the solution (like planning for parallel queries). The most complex and challenging task is to support GTT for all kinds of indexes. Unfortunately I cannot propose a good universal solution for it. Just patching all existing index implementations seems to be the only choice. I haven't looked at the indexing issue closely, but IMO we need to ensure that every session sees/uses only indexes on GTT that were defined before the session started using the table. Can't we track which indexes a particular session sees, somehow? Statistics are another important case. But once again I do not completely understand why we want to address all these issues with statistics in the first version of the patch? I think the question is which "issues with statistics" you mean. I'm sure we can ignore some of them, e.g. the one with parallel workers not having any stats (assuming we consider functions using GTT to be parallel restricted). 
It contradicts the idea of making this patch as small as possible. Well, there's a "making the patch as small as possible" vs. "patch behaving correctly" trade-off ;-) Also it seems to me that everybody agreed that users very rarely create indexes for temp tables and explicitly analyze them. I certainly *disagree* with this. We often see temporary tables as a fix for misestimates in complex queries, and/or as a replacement for CTEs with statistics/indexes. In fact it's a pretty valuable tool when helping customers with complex queries affected by poor estimates. So I think GTT will be useful even with limited support for statistics. In my version statistics for GTT are provided by pushing the corresponding information into the backend's cache for the pg_statistic table. I think someone pointed out pushing stuff directly into the cache is rather problematic, but I don't recall the details. Also I provided a pg_temp_statistic view for users to inspect it. The idea of making pg_statistic a view which combines statistics of normal and temporary tables is overkill from my point of view. I do not understand why we need to maintain a hash with some extra information for GTT in backend memory (as was done in Wenjing's patch). Also the idea of using CREATE EXTENSION for accessing this information seems to be dubious. I think the extension was more a PoC than a final solution. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Recognizing superuser in pg_hba.conf
Greetings, * Robert Haas (robertmh...@gmail.com) wrote: > But, again, we already *have* a way of solving this problem: use > quotes. As Simon pointed out, your proposed solution isn't really a > solution at all, because & can appear in role names. It probably > won't, but there probably also won't be a role name that matches > either of these keywords, so it's just six of one, half a dozen of the > other. The thing that really solves it is quoting. I really just can't agree with the idea that: "" and in pg_hba.conf should mean materially different things and have far reaching security differences. Depending on quoting in pg_hba.conf for this distinction is an altogether bad idea. > Now I admit that if we decide pg_hba.conf keywords have to start with > "pg_" and prevent names beginning with "pg_" from being used as object > names, then we'd have TWO ways of distinguishing between a keyword and > an object name. But I don't think TMTOWTDI is the right design > principle here. There is a *really* big difference here though which makes this not "two ways to do the same thing"- you *can't* create a user starting with "pg_". You *can* create a user with an '&' in it. If we prevented you from being able to create users with '&' in it then I'd be more open to the idea of using '&' to mean something special in pg_hba, and then it really would be two different ways to do the same thing, but that's not actually what's being proposed here. Thanks, Stephen
Re: Recognizing superuser in pg_hba.conf
Robert Haas writes: > On Thu, Jan 9, 2020 at 11:06 AM Tom Lane wrote: >> The problem is that we keep deciding that okay, it probably won't hurt >> anybody if this particular thing-that-ought-to-be-a-reserved-word isn't >> really reserved. > But, again, we already *have* a way of solving this problem: use > quotes. As Simon pointed out, your proposed solution isn't really a > solution at all, because & can appear in role names. I'm not sure that the pg_hba.conf parser allows that without quotes, but in any case it's irrelevant to the proposal to use a pg_ prefix. We don't allow non-built-in role names to be spelled that way, quoted or not. > Now I admit that if we decide pg_hba.conf keywords have to start with > "pg_" and prevent names beginning with "pg_" from being used as object > names, then we'd have TWO ways of distinguishing between a keyword and > an object name. But I don't think TMTOWTDI is the right design > principle here. The principle I'm concerned about is not letting a configuration file that was perfectly fine in version N silently become a security hazard in version N+1. The only way I will accept your proposal is if we change the pg_hba.conf parser to *require* quotes around every role and DB name that's not meant to be a keyword, so that people get used to that requirement. But I doubt that idea will fly. regards, tom lane
Re: [Proposal] Global temporary tables
On Thu, Jan 09, 2020 at 06:07:46PM +0300, Konstantin Knizhnik wrote: On 06.01.2020 14:01, Tomas Vondra wrote: On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote: In the previous communication:
1 We agreed on the general direction
1.1 gtt uses local (private) buffers
1.2 no replica access in the first version
OK, good.
2 We feel that gtt needs to maintain statistics, but there is no agreement on how it will be done.
I certainly agree GTT needs to maintain statistics, otherwise it'll lead to poor query plans. AFAIK the current patch stores the info in a hash table in backend-private memory, and I don't see how else to do that (e.g. storing it in a catalog would cause catalog bloat). FWIW this is one reason why I think just using shared buffers (instead of local ones) is not sufficient to support parallel queries as proposed by Alexander. The workers would not know the stats, breaking planning of queries in PARALLEL SAFE plpgsql functions etc. I do not think that an "all or nothing" approach is as good for software development as for database transactions. Well, sure. I'm not saying we need to have a perfect solution in v1. I'm saying if we have two choices:
(1) Use shared buffers even if it means the parallel query plan may be arbitrarily bad.
(2) Use private buffers, even if it means no parallel queries with temp tables.
Then I'm voting for (2) because it's less likely to break down. I can imagine allowing parallel queries with GTT when there's no risk of having to plan in the worker, but that's not there yet. If we can come up with a reasonable solution for the parallel case, we can enable it later. Yes, if we have a function in PL/pgSQL which performs queries on temporary tables, then parallel workers may build inefficient plans for these queries due to a lack of statistics. IMHO that's a pretty awful deficiency, because it essentially means users may need to disable parallelism for such queries. 
Which means we'll get complaints from users, and we'll have to come up with some sort of solution. I'd rather not be in that position. From my point of view this is not a pitfall of GTT but a result of the lack of a global plan cache in Postgres, and it should be fixed not at the GTT level. That doesn't give us a free pass to just ignore the issue. Even if it really was due to a lack of global plan cache, the fact is we don't have that feature, so we have a problem. I mean, if you need infrastructure that is not available, you either have to implement that infrastructure or make it work properly without it. Also I have never seen real use cases with such functions, even in systems which use temporary tables and stored procedures heavily. But there are many other real problems with temp tables (besides those already mentioned in this thread). Oh, I'm sure there are pretty large plpgsql applications, and I'd be surprised if at least some of those were not affected. And I'm sure there are apps using UDF to do all sorts of stuff (e.g. I wonder if PostGIS would have this issue - IIRC it's using SPI etc.). The question is whether we should consider existing apps affected, because they are using the regular temporary tables and not GTT. So unless they switch to GTT there is no regression ... But even in that case I don't think it's a good idea to accept this as an acceptable limitation. I admit one of the reasons why I think that may be that statistics and planning are my areas of interest, so I'm not quite willing to accept incomplete stuff as OK. In PgPro/EE we have fixes for some of them, for example:
1. Do not reserve space in the file for temp relations. Right now extending a relation causes a zero page to be written to disk by mdextend. This causes useless disk I/O for temp tables, which in most cases fit in memory and should not be written to disk.
2. Implicitly analyze a temp table immediately after storing data in it. Usually tables are analyzed by autovacuum in the background. 
But that doesn't work for temp tables, which are not processed by autovacuum and are accessed immediately after filling them with data; the lack of statistics may cause a very inefficient plan to be built. We have an online_analyze extension which forces an analyze of the table after a bulk of data is appended to it. It can be used for normal tables, but most of all it is useful for temp relations. Unlike the hypothetical example with a parallel-safe function working with temp tables, these are real problems observed by some of our customers. They are applicable both to local and global temp tables, and this is why I do not want to discuss them in the context of GTT. I think those are both interesting issues worth fixing, but I don't think it makes the issue discussed here less important. 3 Still no one commented on GTT's transaction information processing, which includes: 3.1 Does gtt's frozenxid need to be taken care of? 3.2 gtt's clog cleanup 3.3 How to deal with "too old" gtt data No idea what to do about this. I wonder
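For readers unfamiliar with the online_analyze extension mentioned above (it is a PgPro add-on, not part of core PostgreSQL): the manual equivalent of what it automates is simply running ANALYZE right after the bulk load, so the planner has fresh statistics before the first query. Table names here are made up for illustration:

```
CREATE TEMP TABLE tmp_ids (id int);
INSERT INTO tmp_ids SELECT generate_series(1, 100000);
ANALYZE tmp_ids;  -- give the planner statistics now, instead of
                  -- waiting for autovacuum (which skips temp tables)
SELECT count(*) FROM big_table b JOIN tmp_ids t ON b.id = t.id;
```

Without the explicit ANALYZE, the join above would be planned against default estimates for tmp_ids, which is exactly the "very inefficient plan" problem described.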
Re: Removing pg_pltemplate and creating "trustable" extensions
Greetings, * Robert Haas (robertmh...@gmail.com) wrote: > On Thu, Jan 9, 2020 at 10:09 AM Stephen Frost wrote: > > [ wall of text ] This really isn't helpful. > I don't see anything in here I really disagree with, but nor do I > understand why any of it means that giving superusers the ability to > customize which extensions are database-owner-installable would be a > bad thing. Alright, I think there's definitely something we need to sort through. If you agree that the approach I advocated means less code for hosting providers to have to change in their fork, and that they don't want to give users superuser, and that they want non-superusers to be able to install extensions, and that they really don't want to modify things post-install, then I don't get why you're against the DB-level privilege that I've been advocating for except that it's "not as customizable." Are you saying that in order to have something here we must make it so that a superuser is able to specify, individually, which extensions can be installed by which users? You keep coming back to this point of saying that you want this to be 'customizable' but I really don't see any justification for the level of customization you're asking for- but I see an awful lot of work involved. When there's a lot of work involved for a use-case that no one is actually asking for, I'm really skeptical. The use-cases that you've presented, at least thus far, certainly haven't swayed me into thinking that you're right that there's a justifiable use-case here for this level of complicated privileges. I'm also not convinced that such a design would even be practical- we don't know all of the extensions that a given PG install will be able to have when it's first installed. If postgis isn't on the filesystem when someone installs PG, how do I, as a superuser, say that $user is allowed to install postgis? Or do we always have to have this two-step "install on filesystem", "grant privs to $user to install" process? 
What if that extension is then uninstalled from the filesystem? Do we have to clean up the GRANT that was done? > > > I don't think changing what's in contrib helps much. Even if we rm > > > -rf'd it, there's the same problem with out-of-core extensions. Joe > > > Extensionman may think his extension ought to be trusted, and package > > > it as such, but Paula Skepticaldba is entitled to think Joe's view of > > > the security risks originating from his code is overly rosy. > > > > Out of core extensions have to get installed on to the system though, > > they don't just show up magically, and lots and lots of folks out there > > from corporate infrastructure groups to hosting providers have got lots > > of experience with deciding what they'll allow to be installed on a > > system and what they won't, what repositories of code they'll trust and > > which they won't. > > You seem to be ignoring the actual point of that example, which is > that someone may want to install the extension but have a different > view than the packager about whether it should be trusted. Why would someone want to install something that isn't trusted? You're implying that's what is happening here, but it doesn't make any sense to me and without it making sense I can't agree that it's a sensible enough use-case to demand a lot of work be put into it. > You seem to think that that hosting providers and system > administrators will be thrilled to accept the judgement of developers > about which extensions should be trusted in their environment. Great! Huh? Hosting providers are the ones that choose what gets installed on the filesystem, certainly not developers, so I am baffled how you came to the conclusion that I'm suggesting administrators are trusting the judgement of developers. That's just not at all the case. > Evidently you disagree, and that's fine, even if I don't understand > why. 
Given some of the development projects you've done in the past, I > find it extremely surprising to hear you now taking the position that > fine-grained security controls are, in this case, unnecessary and > useless, but you don't have to like it everywhere just because you > like it for some things. I'm all for fine-grained control- where it makes sense. I'm still *very* much of the opinion that we should be able to let DB owners and schema owners control what kind of objects users are allowed to create in their DBs/schemas. I want a "GRANT CREATE FUNCTION ON SCHEMA mine TO you;" ability. I'm not clamouring for a way to say "GRANT CREATE THISSPECIFICFUNCTION ON SCHEMA mine TO you;" or something like "GRANT CREATE FUNCTION MATCHING REGEXP 'abc_*' ON SCHEMA mine TO you;". In the end, superusers are, in fact, the ones who grant out ALL access, the question is what level of access they want to grant out and to whom. When it comes to trusted objects, we've historically said that it's the DB owner who gets to say who can grant out that access and all I'm trying
Re: Recognizing superuser in pg_hba.conf
On Thu, Jan 9, 2020 at 11:06 AM Tom Lane wrote: > The problem is that we keep deciding that okay, it probably won't hurt > anybody if this particular thing-that-ought-to-be-a-reserved-word isn't > really reserved. Your exercise in justifying that for "superuser" is > not unlike every other previous argument about this. Sooner or later > that's going to fail, and somebody's going to have a security problem > because they didn't know that a particular name has magic properties > in a particular context. (Which, indeed, maybe it didn't have when > they chose it.) Claiming they should have known better isn't where > I want to be when that happens. But, again, we already *have* a way of solving this problem: use quotes. As Simon pointed out, your proposed solution isn't really a solution at all, because & can appear in role names. It probably won't, but there probably also won't be a role name that matches either of these keywords, so it's just six of one, half a dozen of the other. The thing that really solves it is quoting. Now I admit that if we decide pg_hba.conf keywords have to start with "pg_" and prevent names beginning with "pg_" from being used as object names, then we'd have TWO ways of distinguishing between a keyword and an object name. But I don't think TMTOWTDI is the right design principle here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
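The quoting distinction being debated here is the existing pg_hba.conf rule that quoting a word strips its special meaning. Two illustrative lines (database name, address, and auth method are made up):

```
# unquoted "all" in the user column is the keyword matching every role;
# quoted "all" matches only a role literally named all
host  mydb  all    127.0.0.1/32  scram-sha-256
host  mydb  "all"  127.0.0.1/32  scram-sha-256
```

This is why Robert's position is that quoting already disambiguates keywords from role names, while Tom's concern is that an unquoted name can silently change meaning if a later release turns it into a keyword.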
Re: pgsql: Add basic TAP tests for psql's tab-completion logic.
Peter Eisentraut writes: > On 2020-01-09 16:50, Tom Lane wrote: >> That would disable psql's tab completion, command editing, and history >> altogether, which I doubt is what you want for production builds. >> If we conclude we can't work around the testing issues for ancient >> libedit, probably the right answer is to provide a switch to >> disable just the test. I've been trying to dance around that >> conclusion, but maybe we should just do it and move on. > I think he means something like > make check with_readline=no > not for the actual build. Oh, I see. I'd rather not codify that though, because it risks problems if that symbol ever gets used any other way. I was thinking of making the test script check for some independent environment variable, say SKIP_READLINE_TESTS. regards, tom lane
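If the environment-variable approach is adopted, skipping just the tab-completion test might look like this (SKIP_READLINE_TESTS is the name floated above, not an existing switch at this point):

```
SKIP_READLINE_TESTS=1 make -C src/bin/psql check
```

The advantage over `with_readline=no` is that it only affects the test script, with no risk of the symbol leaking into the build itself.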
Re: [HACKERS] pg_shmem_allocations view
On Wed, Jan 8, 2020 at 2:30 AM Kyotaro Horiguchi wrote: > I agree with the patch as-is. Thanks for the explanation. OK. Thanks for the review, and for your thoughtful consideration of my comments. I went ahead and committed this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pgsql: Add basic TAP tests for psql's tab-completion logic.
On 2020-01-09 16:50, Tom Lane wrote: Christoph Berg writes: Fwiw if libedit in xenial is Just Broken and fixing the tests would just codify the brokenness without adding any value, I'll just disable that test in that particular build. It looks like setting with_readline=no on the make command line should work. That would disable psql's tab completion, command editing, and history altogether, which I doubt is what you want for production builds. If we conclude we can't work around the testing issues for ancient libedit, probably the right answer is to provide a switch to disable just the test. I've been trying to dance around that conclusion, but maybe we should just do it and move on. I think he means something like make check with_readline=no not for the actual build. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Recognizing superuser in pg_hba.conf
Robert Haas writes: > On Thu, Jan 9, 2020 at 10:06 AM Tom Lane wrote: >> What I'm basically objecting to is the pseudo-reservedness. If we >> don't want to dodge that with special syntax, we should dodge it >> by making sure the keywords are actually reserved names. > ... > But I think I'm coming around to the view that we're making what ought > to be a simple change complicated, and we should just not do that. The problem is that we keep deciding that okay, it probably won't hurt anybody if this particular thing-that-ought-to-be-a-reserved-word isn't really reserved. Your exercise in justifying that for "superuser" is not unlike every other previous argument about this. Sooner or later that's going to fail, and somebody's going to have a security problem because they didn't know that a particular name has magic properties in a particular context. (Which, indeed, maybe it didn't have when they chose it.) Claiming they should have known better isn't where I want to be when that happens. I don't want to keep going down that path. These things are effectively reserved words, and they need to act like reserved words, so that you get an error if you misuse them. Silently doing something else than what (one reasonable reading of) a pg_hba.conf entry seems to imply is *bad*. regards, tom lane
Re: pgsql: Add basic TAP tests for psql's tab-completion logic.
Christoph Berg writes: > Fwiw if libedit in xenial is Just Broken and fixing the tests would > just codify the brokenness without adding any value, I'll just disable > that test in that particular build. It looks like setting > with_readline=no on the make command line should work. That would disable psql's tab completion, command editing, and history altogether, which I doubt is what you want for production builds. If we conclude we can't work around the testing issues for ancient libedit, probably the right answer is to provide a switch to disable just the test. I've been trying to dance around that conclusion, but maybe we should just do it and move on. regards, tom lane
Re: Patch to document base64 encoding
On Thu, 9 Jan 2020 08:27:28 +0100 (CET) Fabien COELHO wrote: > > Another option would be to use "bytes bytea". > > > (The current patch uses "string bytea".) > > This would probably also require some re-wording throughout. > I like it, but this is only my own limited opinion, and I'm not a > native English speaker. Per your request for consistency I made this change throughout the entire binary string section. New patch attached: doc_base64_v13.patch This required surprisingly little re-wording. Added word "binary" into the descriptions of convert(), substring(), convert_from(), and convert_to(). I also added data types to the call syntax of set_bit() and set_byte(). And this patch adds hyperlinks from the get_bit(), get_byte(), set_bit(), and set_byte() descriptions to the note that offsets are zero-based. I also removed the hyperlinked asterisks about the hash function results and instead hyperlinked the word "hash" in the descriptions. (Links to the note about md5() returning hex text and the others returning bytea and how to convert between the two.) Regards, Karl Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 4b42f12862..3bd7232814 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1459,6 +1459,13 @@ natively for the bit-string types. + + Functions which convert, both to and from, strings and + the bytea type + are documented + separately. + + SQL defines some string functions that use key words, rather than commas, to separate @@ -1820,101 +1827,6 @@ abcde,2,22 - - - - convert - -convert(string bytea, -src_encoding name, -dest_encoding name) - - bytea - -Convert string to dest_encoding. The -original encoding is specified by -src_encoding. The -string must be valid in this encoding. -Conversions can be defined by CREATE CONVERSION. -Also there are some predefined conversions. See for available conversions. 
- - convert('text_in_utf8', 'UTF8', 'LATIN1') - text_in_utf8 represented in Latin-1 - encoding (ISO 8859-1) - - - - - - convert_from - -convert_from(string bytea, -src_encoding name) - - text - -Convert string to the database encoding. The original encoding -is specified by src_encoding. The -string must be valid in this encoding. - - convert_from('text_in_utf8', 'UTF8') - text_in_utf8 represented in the current database encoding - - - - - - convert_to - -convert_to(string text, -dest_encoding name) - - bytea - -Convert string to dest_encoding. - - convert_to('some text', 'UTF8') - some text represented in the UTF8 encoding - - - - - - decode - -decode(string text, -format text) - - bytea - -Decode binary data from textual representation in string. -Options for format are same as in encode. - - decode('MTIzAAE=', 'base64') - \x3132330001 - - - - - - encode - -encode(data bytea, -format text) - - text - -Encode binary data into a textual representation. Supported -formats are: base64, hex, escape. -escape converts zero bytes and high-bit-set bytes to -octal sequences (\nnn) and -doubles backslashes. - - encode('123\000\001', 'base64') - MTIzAAE= - - @@ -1982,19 +1894,6 @@ 4 - - length(string bytea, -encoding name ) - int - -Number of characters in string in the given -encoding. The string -must be valid in this encoding. - - length('jose', 'UTF8') - 4 - - @@ -2044,8 +1943,8 @@ text -Calculates the MD5 hash of string, -returning the result in hexadecimal +Calculates the MD5 hash +of string, returning the result in hexadecimal md5('abc') 900150983cd24fb0 d6963f7d28e17f72 @@ -2358,6 +2257,66 @@ test + + + + sha224 + +sha224(string) + + bytea + +SHA-224 hash + + sha224('abc') + \x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7 + + + + + + sha256 + +sha256(string) + + bytea + +SHA-256 hash + + sha256('abc') +
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
On 2020-Jan-09, Tom Lane wrote: > Alvaro Herrera writes: > > Hmm, so why not revert the test only in the back branches, given that > > it's not so onerous in master? > > I grow tired of repeating myself, but: it's purely accidental that this > test passes in master for the existing set of buildfarm members. Oh, I forgot we had that problem. I agree with reverting the test, rather than building all the extra functionality needed to make it more stable, in that case. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Libpq support to connect to standby server as priority
On 2020-Jan-06, tsunakawa.ta...@fujitsu.com wrote: > Let me check my understanding. Are you proposing these? > > * The canonical libpq connection parameter is target_session_attr = {primary > | standby | prefer-standby}. Leave and document read-write as a synonym for > primary. > > * When the server version is 13 or later, libpq just checks in_recovery, not > checking transaction_read_only or sending SHOW transaction_read_only. > > * When the server version is before 13, libpq sends SHOW > transaction_read_only as before. Yes, that sounds good to me. > Personally, 100% agreed, considering what we really wanted to do when > target_session_attr was introduced is to tell if the server is primary or > standby. The questions are: > > Q1: Should we continue to use the name target_session_attr, or rename it to > target_server_type and make target_session_attr a synonym for it? I'm in > favor of the latter. I'm not 100% sure about this. I think part of the reason of making it target_session_attrs (note plural) is that the user could be able to specify more than one attribute (a comma-separated list, like the DateStyle GUC), if we supported some hypothetical attributes in the future that are independent of the existing ones. I'm not inclined to break that, unless the authors of the original feature agree to that. Maybe one possible improvement would be to add target_server_type as an additional one, that only accepts a single item (primary/standby/prefer-standby), as a convenience, while target_session_attrs retains its ability to receive more than one value. The two would be somewhat redundant but not exact synonyms. > Q2: Can we accept the subtle incompatibility that > target_session_attr=read-write and target_server_type=primary are not > the same, when default_transaction_read_only is on? (I'd like to hear > yes) ... on servers versions 12 and older, yes. (If I understand correctly, we wouldn't have such a difference in version 13). 
> Q3: Can we go without supporting standby and prefer-standby for older > servers? (I think yes because we can say that it's a new feature > effective for new servers.) Yes. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
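For context, the existing multi-host syntax under discussion looks like the following; the single-valued target_server_type form is only a proposal at this point, so its spelling here is hypothetical:

```
# Existing libpq syntax (PostgreSQL 10+): try each host in turn and keep
# the first connection whose session is read-write.
host=node1,node2,node3 port=5432 target_session_attrs=read-write

# Proposed convenience parameter (hypothetical, not implemented):
# accepts exactly one server type instead of a list of attributes.
host=node1,node2,node3 port=5432 target_server_type=prefer-standby
```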
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Alvaro Herrera writes: > Hmm, so why not revert the test only in the back branches, given that > it's not so onerous in master? I grow tired of repeating myself, but: it's purely accidental that this test passes in master for the existing set of buildfarm members. If I have to do so to prove my point, I will set up a buildfarm member that uses USE_NAMED_POSIX_SEMAPHORES, and then insist that the patch cope with that. But the real issue is that the test is abusing max_files_per_process to do something it was never intended for. What it was intended for, and works well at, is to constrain the total FD consumption of a collection of backends. It doesn't work well to constrain the maximum allocatedDescs consumption, because there's too much variability in our demand for other FDs. If we feel that we should have a test that is constraining that, then we need to invent some other mechanism to do it with. If we're not willing to invent an appropriate mechanism to support the test, then we should drop the test, because a half-baked test is worse than none. An appropriate mechanism, perhaps, would be some way to constrain max_safe_fds directly, without any platform- or environment-dependent effects in the way. It could be as simple as

 	/*
 	 * Take off the FDs reserved for system() etc.
 	 */
 	max_safe_fds -= NUM_RESERVED_FDS;

+	/*
+	 * Apply debugging limit, if defined.
+	 */
+#ifdef MAX_SAFE_FDS_LIMIT
+	max_safe_fds = Min(max_safe_fds, MAX_SAFE_FDS_LIMIT);
+#endif
+
 	/*
 	 * Make sure we still have enough to get by.
 	 */

and then somebody who was concerned about this could run a buildfarm member with "-DMAX_SAFE_FDS_LIMIT=10" or so. regards, tom lane
Re: Recognizing superuser in pg_hba.conf
On Thu, Jan 9, 2020 at 10:06 AM Tom Lane wrote: > What I'm basically objecting to is the pseudo-reservedness. If we > don't want to dodge that with special syntax, we should dodge it > by making sure the keywords are actually reserved names. You know, as I was reading this email, I got to thinking: aren't we engineering a solution to a problem for which we already have a solution? The documentation says: "Quoting one of the keywords in a database, user, or address field (e.g., all or replication) makes the word lose its special character, and just match a database, user, or host with that name." So if you're writing a pg_hba.conf file that contains a specific role name, and you want to make sure it doesn't get confused with a current or future keyword, just quote it. If you don't quote it, make sure to RTFM at the time and when upgrading. If you want to argue that this isn't the cleanest possible solution to the problem, I think I would agree. If we were doing this over again, we could probably design a better syntax for pg_hba.conf, perhaps one where all specific role names have to be quoted and anything that's not quoted is expected to be a keyword. But, as it is, nothing's really broken here, and practical confusion is unlikely. If someone has a role named "superuser", then it's probably a superuser account, and the worst that will happen is that we'll match all superuser accounts rather than only that one. If someone has a non-superuser account called "superuser", or if they have an account named "nonsuperuser," then, uh, that's lame, and if this patch causes them to improve their choice of role names, that's good. If it causes them to use quotes, that's also good. But I think I'm coming around to the view that we're making what ought to be a simple change complicated, and we should just not do that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
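To make the documented quoting rule concrete, here is a hypothetical pair of pg_hba.conf entries (the database name and network are invented for illustration):

```
# Unquoted, "all" is a keyword matching every role:
host    mydb    all      10.0.0.0/8    scram-sha-256

# Quoted, it loses its special meaning and matches only a role
# literally named "all":
host    mydb    "all"    10.0.0.0/8    scram-sha-256
```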
Re: Recognizing superuser in pg_hba.conf
Greetings, * Tom Lane (t...@sss.pgh.pa.us) wrote: > What I'm basically objecting to is the pseudo-reservedness. If we > don't want to dodge that with special syntax, we should dodge it > by making sure the keywords are actually reserved names. In other > words, add a "pg_" prefix, as somebody else suggested upthread. Yes, that was my suggestion, and it was also my change a few major versions ago that actually reserved the "pg_" prefix for roles. > BTW, although that solution works for the immediate need of > keywords that have to be distinguished from role names, it doesn't > currently scale to keywords for the database column, because we > don't treat "pg_" as a reserved prefix for database names: > > regression=# create role pg_zit; > ERROR: role name "pg_zit" is reserved > DETAIL: Role names starting with "pg_" are reserved. > regression=# create database pg_zit; > CREATE DATABASE > > Should we do so, or wait till there's an immediate need to? I seem to recall (but it was years ago, so I might be wrong) advocating that we should reserve the 'pg_' prefix for *all* object types. All I can recall is that there wasn't much backing for the idea (though I also don't recall any specific objection, and it's also quite possible that there was simply no response to the idea). For my 2c, I'd *much* rather we reserve it across the board, and sooner than later, since that would hopefully reduce the impact on people. The only justification for *not* reserving it is if we KNOW that we'll never need a special one of those, but, well, we're well past that for database names already- look at the fact that we've got a "replication" one, for example. Maybe we can't ever un-reserve that, but I like the idea of reserving "pg_" for database names and then having "pg_replication" be allowed to mean replication connections and then encouraging users to use that instead. Thanks, Stephen signature.asc Description: PGP signature
Re: Removing pg_pltemplate and creating "trustable" extensions
Greetings, * Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Jan 8, 2020 at 6:09 PM Stephen Frost wrote: > > > To me, this seems more accidental than the natural fallout of good design. > > > > I disagree that the privilege design for FDWs was accidental. > > That seems like a stronger statement than the one I made... Perhaps I misunderstood what you were referring to then. > > What I see differently is that the purview of those decisions should be > > that of the DB owner and not the superuser. > > Yeah, I don't agree with that at all. If I'm a hosting provider, I > want to tell my customers what they're allowed to run and what they're > not allowed to run, but I don't want them to have to call me when they > want to run an extension I've decided is OK. Now I'm really floored because I've also been contemplating exactly the situation of a hosting provider. In my experience, there's a few pretty common themes among the ones that I've played with: - Users definitely don't get true PG superuser - They certainly like users to be able to install trusted extensions - They'd generally prefer to minimize the size of their fork I've been feeling like the solution I'm pushing for would, in the end, *reduce* the amount of code they have that's different from PG, which I believe they'd see as an entirely good thing. The approach proposed in this thread seems likely to either increase the size of the fork or, at best, be about the same size. Now the question boils down to "what's trusted?" and if I'm a hosting provider, it'd sure be nice to foist off the responsibility of figuring that out on a community that I can trust that will do so at no cost to me, and who will address any bugs or security holes in those trusted extensions for me. 
Realistically, we pretty much already do that for contrib and that's really all that's relevant here- anything else would need to be physically installed on the system in order for a user to be able to install it, and I expect hosting providers to be pretty happy with a solution that boils down to just having to install RPMs, or not (or the equivalent with regard to putting files into the right places and such). Now, if we're talking about defining repositories or allowing users to upload their *own* C code from their systems into PG and install that as an extension, then, sure, that's gotta be limited to a superuser (and wouldn't be allowed by hosting providers). If the extension is installed on the system though, and it's marked as trusted, then why do we need a superuser to take some additional action to allow it to be installed..? When it comes to the hosting provider case, I'd argue that what we're missing here is a way for them to give their "not-quite-a-superuser role" the ability to have certain capabilities (at install time, of course)- a good example of that would be "allowed to make outbound network connections", with a way to then be able to delegate out that ability to others. Then they'd actually be able to use things like postgres_fdw and dblink after they're installed and without having to get a superuser to grant them that ability post-install (and that order of operations issue is a pretty key one- it's far simpler to set things up and hand them over to the user than to have some operation that has to happen as an actual superuser after the installation is done). Even if we marked postgres_fdw as trusted, and used this default-role based approach, we're almost certainly going to have the FDW itself be owned by the bootstrap superuser and therefore whatever non-superuser role that installs it wouldn't be able to actually use it without some other changes.
I'd love to get to a point where you could have an initially set up system with a not-quite-superuser role that would be able to actually install AND use postgres_fdw, without having to fork PG. > > As it relates to things in contrib that could be classified as 'a pile > > of crap' or 'too experimental'- that's *our* failing, and one which we > > should accept and address instead of punting on it. > > I don't think changing what's in contrib helps much. Even if we rm > -rf'd it, there's the same problem with out-of-core extensions. Joe > Extensionman may think his extension ought to be trusted, and package > it as such, but Paula Skepticaldba is entitled to think Joe's view of > the security risks originating from his code is overly rosy. Out of core extensions have to get installed on to the system though, they don't just show up magically, and lots and lots of folks out there from corporate infrastructure groups to hosting providers have got lots of experience with deciding what they'll allow to be installed on a system and what they won't, what repositories of code they'll trust and which they won't. Of course, when it comes to contrib extensions, if we don't feel comfortable with them and we don't want to spend the time to vet them, we can certainly just leave them marked as
Re: [Proposal] Global temporary tables
On 06.01.2020 14:01, Tomas Vondra wrote: On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote: In the previous communication 1 we agreed on the general direction 1.1 gtt use local (private) buffer 1.2 no replica access in first version OK, good. 2 We feel that gtt needs to maintain statistics, but there is no agreement on how it will be done. I certainly agree GTT needs to maintain statistics, otherwise it'll lead to poor query plans. AFAIK the current patch stores the info in a hash table in backend-private memory, and I don't see how else to do that (e.g. storing it in a catalog would cause catalog bloat). FWIW this is one reason why I think just using shared buffers (instead of local ones) is not sufficient to support parallel queries as proposed by Alexander. The workers would not know the stats, breaking planning of queries in PARALLEL SAFE plpgsql functions etc. I do not think that an "all or nothing" approach is as good for software development as it is for database transactions. Yes, if we have a function in PL/pgSQL which performs queries on temporary tables, then parallel workers may build inefficient plans for those queries due to lack of statistics. From my point of view this is not a pitfall of GTT but a result of the lack of a global plan cache in Postgres, and it should not be fixed at the GTT level. Also, I have never seen real use cases with such functions, even in systems which make heavy use of temporary tables and stored procedures. But there are many other real problems with temp tables (besides those already mentioned in this thread). In PgPro/EE we have fixes for some of them, for example: 1. Do not reserve space in the file for temp relations. Right now extending a relation causes mdextend to write a zero page to disk. This causes useless disk IO for temp tables, which in most cases fit in memory and should not be written to disk. 2. Implicitly perform an analyze of a temp table immediately after storing data in it. Usually tables are analyzed by autovacuum in background.
But it doesn't work for temp tables, which are not processed by autovacuum and are accessed immediately after being filled with data, so the lack of statistics may cause a very inefficient plan to be built. We have an online_analyze extension which forces an analyze of the table after a bulk of data is appended to it. It can be used for normal tables, but it is most useful for temp relations. Unlike the hypothetical example of a parallel-safe function working with temp tables, these are real problems observed by some of our customers. They are applicable both to local and global temp tables, which is why I do not want to discuss them in the context of GTT. 3 Still no one commented on GTT's transaction information processing, they include 3.1 Should gtt's frozenxid need to be care? 3.2 gtt’s clog clean 3.3 How to deal with "too old" gtt data No idea what to do about this. I wonder what is specific to GTT here? The same problem takes place for normal (local) temp tables, doesn't it? -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
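The load-then-analyze pattern described above can be sketched in plain SQL (table and column names are illustrative); it applies equally to local and global temporary tables:

```sql
-- Autovacuum never analyzes temp tables, so collect statistics
-- explicitly right after the bulk load, before the planner needs them.
CREATE TEMP TABLE bulk_data AS
    SELECT g AS id, md5(g::text) AS payload
    FROM generate_series(1, 100000) g;

ANALYZE bulk_data;  -- without this, plans are built from default estimates

SELECT count(*) FROM bulk_data WHERE id BETWEEN 100 AND 200;
```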
Re: Recognizing superuser in pg_hba.conf
Simon Riggs writes: > On Wed, 8 Jan 2020 at 23:55, Vik Fearing > wrote: >> What is being proposed is what is in the Subject and the original >> patch. The other patch is because Tom didn't like "the continuing creep >> of pseudo-reserved database and user names" so I wrote a patch to mark >> such reserved names and rebased my original patch on top of it. Only >> the docs changed in the rebase. The original patch (or its rebase) is >> what I am interested in. > Hopefully there will be no danger of me gaining access if I have a crafted > rolename? > postgres=# create role ""; > CREATE ROLE Well, the existence of such a role name wouldn't by itself cause any change in the way that pg_hba.conf is parsed. If you could then persuade a superuser to insert a pg_hba.conf line that is trying to match your username, the line might do something else than what the superuser expected, which is bad. But the *exact* same hazard applies to proposals based on inventing pseudo-reserved keywords (by which I mean things that look like names, but aren't reserved words, so that somebody could create a role name matching them). Either way, an uninformed superuser could be tricked. What I'm basically objecting to is the pseudo-reservedness. If we don't want to dodge that with special syntax, we should dodge it by making sure the keywords are actually reserved names. In other words, add a "pg_" prefix, as somebody else suggested upthread. I don't personally find that prettier than a punctuation prefix, but I can live with it if other people do. BTW, although that solution works for the immediate need of keywords that have to be distinguished from role names, it doesn't currently scale to keywords for the database column, because we don't treat "pg_" as a reserved prefix for database names: regression=# create role pg_zit; ERROR: role name "pg_zit" is reserved DETAIL: Role names starting with "pg_" are reserved. 
regression=# create database pg_zit; CREATE DATABASE Should we do so, or wait till there's an immediate need to? regards, tom lane
Re: pgsql: Add basic TAP tests for psql's tab-completion logic.
Re: Robert Haas 2020-01-09 > > So that raises the question: why does xenial's version of libedit > > not match either its documentation or the distributed source code? > > Because it doesn't. > > The level of effort you've put into this is extremely impressive, but > I can't shake the feeling that you're going to keep finding issues in > the test setup, the operating system, or the upstream libraries, > rather than bugs in PostgreSQL. Maybe this is all just one-time > stabilization effort, but... Fwiw if libedit in xenial is Just Broken and fixing the tests would just codify the brokenness without adding any value, I'll just disable that test in that particular build. It looks like setting with_readline=no on the make command line should work. Christoph
Re: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
On Thu, Jan 9, 2020 at 5:30 AM Christoph Berg wrote: > I have some concerns about security, though. It's true that the > sslcert/sslkey options can only be set/modified by superusers when > "password_required" is set. But when password_required is not set, any > user can create user mappings that reference arbitrary files on the > server filesystem. I believe the options are still used in that case > for creating connections, even when that means the remote server isn't > set up for cert auth, which needs password_required=false to succeed. > > In short, I believe these options need explicit superuser checks. I share the concern about the security issue here. I can't testify to whether Christoph's whole analysis here is correct, but as a general point, non-superusers can't be allowed to do things that cause the server to access arbitrary local files. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pgsql: Add basic TAP tests for psql's tab-completion logic.
On Tue, Jan 7, 2020 at 5:29 PM Tom Lane wrote: > So that raises the question: why does xenial's version of libedit > not match either its documentation or the distributed source code? > Because it doesn't. The level of effort you've put into this is extremely impressive, but I can't shake the feeling that you're going to keep finding issues in the test setup, the operating system, or the upstream libraries, rather than bugs in PostgreSQL. Maybe this is all just one-time stabilization effort, but... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: logical decoding : exceeded maxAllocatedDescs for .spill files
On 2020-Jan-09, Amit Kapila wrote: > In HEAD, we have a guc variable 'logical_decoding_work_mem' by which > we can control the memory usage of changes and we have used that, but > for back branches, we don't have such a control. > After the latest changes by Noah, the tern and mandrill both are > green. I will revert the test added by this patch unless there is > some strong argument to keep it. Hmm, so why not revert the test only in the back branches, given that it's not so onerous in master? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [PATCH] Resolve Parallel Hash Join Performance Issue
Thomas Munro writes: > Right, I see. The funny thing is that the match bit is not even used > in this query (it's used for right and full hash join, and those > aren't supported for parallel joins yet). Hmm. So, instead of the > test you proposed, an alternative would be to use if (!parallel). > That's a value that will be constant-folded, so that there will be no > branch in the generated code (see the pg_attribute_always_inline > trick). If, in a future release, we need the match bit for parallel > hash join because we add parallel right/full hash join support, we > could do it the way you showed, but only if it's one of those join > types, using another constant parameter. Can we base the test off the match type today, and avoid leaving something that will need to be fixed later? I'm pretty sure that the existing coding is my fault, and that it's like that because I reasoned that setting the bit was too cheap to justify having a test-and-branch around it. Apparently that's not true anymore in a parallel join, but I have to say that it's unclear why. In any case, the reasoning probably still holds good in non-parallel cases, so it'd be a shame to introduce a run-time test if we can avoid it. regards, tom lane
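The constant-folding trick Thomas refers to can be sketched as follows. This is a hedged illustration with invented names, not PostgreSQL's actual hash-join code: an always-inline helper whose flag parameter is a compile-time constant at every call site, so the compiler deletes the branch entirely and each caller gets specialized, branch-free code.

```c
#include <assert.h>

/* Counts match bits "set" by the serial join path (illustrative only). */
static int match_bits_set = 0;

/*
 * Hypothetical scan step. Because every caller passes a literal for
 * 'parallel', inlining plus constant folding removes the if-test from
 * the generated code for each specialization.
 */
static inline __attribute__((always_inline)) void
scan_bucket(int parallel)
{
	/*
	 * The serial join sets the match bit unconditionally; the parallel
	 * join skips it, since the bit is only consumed by right/full hash
	 * joins, which are not supported in parallel mode.
	 */
	if (!parallel)
		match_bits_set++;
}

/* Each wrapper passes a constant, so the branch above folds away. */
static void hash_join_serial(void)   { scan_bucket(0); }
static void hash_join_parallel(void) { scan_bucket(1); }
```

Calling `hash_join_serial()` increments the counter while `hash_join_parallel()` compiles down to a no-op, with no run-time test in either path.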
Re: pgsql: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
Re: Robert Haas 2020-01-09 > Does this mean that a non-superuser can induce postgres_fdw to read an > arbitrary file from the local filesystem? Yes, see my comments in the "Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings" thread. Christoph
Re: Fixing parallel make of libpq
Tom Lane writes: > My thoughts about the patch: > > 1) Changing from an "|"-style dependency to a plain dependency seems > like a semantics change. I've never been totally clear on the > difference though. I think Peter introduced our use of the "|" style, > so maybe he can comment. Makefile.shlib puts $(SHLIB_PREREQS) after the "|":

$ grep SHLIB_PREREQS src/Makefile.shlib
# SHLIB_PREREQS   Order-only prerequisites for library build target
$(stlib): $(OBJS) | $(SHLIB_PREREQS)
$(shlib): $(OBJS) | $(SHLIB_PREREQS)
$(shlib): $(OBJS) | $(SHLIB_PREREQS)
$(shlib): $(OBJS) | $(SHLIB_PREREQS)
$(shlib): $(OBJS) | $(SHLIB_PREREQS)
$(shlib): $(OBJS) $(DLL_DEFFILE) | $(SHLIB_PREREQS)

> 2) The same coding pattern is used in a bunch of other places, so if > this spot is broken, there probably are a lot of others that need a > similar change. On the other hand, there may not be that many > directories that are likely places to start a parallel build from, > so maybe we don't care elsewhere. Grepping the Makefiles for ':.*submake-' shows that they are on the actual build artifact target, libpq was just the outlier having it on the phony "all" target. For example pg_basebackup:

pg_basebackup: pg_basebackup.o $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
	$(CC) $(CFLAGS) pg_basebackup.o $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

pg_receivewal: pg_receivewal.o $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
	$(CC) $(CFLAGS) pg_receivewal.o $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

pg_recvlogical: pg_recvlogical.o $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
	$(CC) $(CFLAGS) pg_recvlogical.o $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

- ilmari -- "A disappointingly low fraction of the human race is, at any given time, on fire." - Stig Sandbeck Mathisen
Re: pgsql: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
On Thu, Jan 9, 2020 at 3:11 AM Andrew Dunstan wrote: > Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings > > This allows different users to authenticate with different certificates. > > Author: Craig Ringer > > https://git.postgresql.org/pg/commitdiff/f5fd995a1a24e6571d26b1e29c4dc179112b1003 Does this mean that a non-superuser can induce postgres_fdw to read an arbitrary file from the local filesystem? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Fixing parallel make of libpq
ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes: > Michael Paquier writes: >> Hmm. That logically makes sense. Isn't that a side effect of 7143b3e >> then? Now, FWIW, I am not able to reproduce it here, after trying on >> two different machines, various parallel job numbers (up to 32), and a >> couple of dozen attempts. Perhaps somebody else can see the failures? > It fails reliably for me on Debian Buster, with make 4.2.1-1.2 and -j4. Yeah, it's also reliable for me on Fedora 30:

$ make -s clean
$ make -s -j4 -C src/interfaces/libpq
/usr/bin/ld: cannot find -lpgcommon_shlib
/usr/bin/ld: cannot find -lpgport_shlib
collect2: error: ld returned 1 exit status
make: *** [../../../src/Makefile.shlib:293: libpq.so.5.13] Error 1
make: *** Waiting for unfinished jobs

On a RHEL6 box, the same test only draws a complaint about -lpgcommon_shlib, so it does seem like there's some make version dependency in here. And of course the whole thing is a race condition anyway, so naturally it's going to be pretty context-sensitive. My thoughts about the patch: 1) Changing from an "|"-style dependency to a plain dependency seems like a semantics change. I've never been totally clear on the difference though. I think Peter introduced our use of the "|" style, so maybe he can comment. 2) The same coding pattern is used in a bunch of other places, so if this spot is broken, there probably are a lot of others that need a similar change. On the other hand, there may not be that many directories that are likely places to start a parallel build from, so maybe we don't care elsewhere. regards, tom lane
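For readers unfamiliar with the distinction in Tom's point (1): in GNU make, prerequisites written after `|` are "order-only" — they must be built before the target's recipe runs, but their being newer than the target never makes the target out of date. A minimal, hypothetical illustration:

```makefile
# 'prog' is rebuilt whenever main.o changes, but changes under builddir/
# never make 'prog' out of date; make only guarantees that the builddir
# rule has run before prog's recipe.
prog: main.o | builddir
	$(CC) -o $@ main.o

builddir:
	mkdir -p $@
```

So moving a submake target from the order-only side to the normal side is indeed a semantics change: the target would then be considered out of date whenever that prerequisite is remade.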
Re: Add pg_file_sync() to adminpack
On Thu, Jan 9, 2020 at 7:31 AM Fujii Masao wrote: > On Mon, Jan 6, 2020 at 3:42 PM Michael Paquier wrote: > > On Mon, Jan 06, 2020 at 03:20:13PM +0900, Arthur Zakirov wrote: > > > It isn't the case if a file doesn't exist. But if there are no permissions on > > > the file: > > > > > > PANIC: could not open file "testfile": Permission denied > > > server closed the connection unexpectedly > > > > > > It could be fixed by implementing a function like pg_file_sync_internal() or > > > by making the function fsync_fname_ext() external. > > > > The patch uses stat() to make sure that the file exists and has no > > issues. Though it could be a problem with any kind of TOCTOU-like > > issues (looking at you, Windows, for ENOPERM), so I agree that it > > would make more sense to use pg_fsync() here with a fd opened first. > > I agree that it's not good for pg_file_sync() to cause a PANIC. > I updated the patch so that pg_file_sync() uses fsync_fname_ext() > instead of fsync_fname() as Arthur suggested. > > It's one of the ideas to make pg_file_sync() open the file and directly call > pg_fsync(). But fsync_fname_ext() already has such code and > I'd like to avoid the code duplication. This looks good to me. > Attached is the updated version of the patch.

+
+ pg_catalog.pg_file_sync(filename text)
+ boolean
+
+ Sync a file or directory
+

"Flush to disk" looks better than "sync" here.

+/*
+ * pg_file_sync
+ *
+ * We REVOKE EXECUTE on the function from PUBLIC.
+ * Users can then grant access to it based on their policies.
+ */

I think that pg_write_server_files should be allowed to call that function by default.
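At the syscall level, the behaviour being asked for looks roughly like the sketch below. This is an illustrative stand-in, not PostgreSQL's actual fsync_fname_ext() from src/common/file_utils.c: open the file, fsync the descriptor, and surface any failure (missing file, EACCES, and so on) through the return value and errno, so a user-callable SQL function can raise an ordinary ERROR instead of PANICking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Flush a file to disk by path. Returns 0 on success, -1 on failure
 * with errno set; the caller decides how to report the error.
 */
static int
sync_path(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;			/* e.g. ENOENT or EACCES: report, don't abort */

	if (fsync(fd) != 0)
	{
		int save_errno = errno;

		close(fd);
		errno = save_errno;	/* preserve the fsync error across close() */
		return -1;
	}
	return close(fd);
}
```

A directory needs the same open-then-fsync treatment, which is why a shared helper along the lines of fsync_fname_ext() avoids duplicating this dance in every caller.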
Re: Add pg_file_sync() to adminpack
On Thu, Jan 9, 2020 at 7:43 AM Fujii Masao wrote: > On Wed, Dec 25, 2019 at 11:11 PM Julien Rouhaud wrote: > > On Wed, Dec 25, 2019 at 2:01 PM Fujii Masao wrote: > > > Hi, > > > > > > I'd like to propose to add a pg_file_sync() function to contrib/adminpack. > > > This function fsyncs the file or directory named by its argument. > > > IMO this is useful, for example, when you want to fsync the file that > > > pg_file_write() writes out or that COPY TO exports the data into, > > > for durability. Thoughts? > > > > +1, that seems like a useful wrapper. Looking at existing functions, > > I see that there's a pg_file_rename() in adminpack, but it doesn't use > > durable_rename nor does it try to perform any fsync. Same for > > pg_file_unlink vs. durable_unlink. It's probably worth fixing that at > > the same time? > > I don't think that's a bug. I'm not sure that every user of those functions > needs durable rename and unlink at the expense of performance. > So IMO it's better to add a new argument like "durable" to those functions > so that durable_rename or _unlink is used only if it's true. It's probably a POLA violation. I'm pretty sure that most people using those functions would expect that a successful call to pg_file_unlink() means that the file cannot rise from the dead even under certain unlikely circumstances, at least I'd expect so. If performance is a problem here, I'd rather have a new wrapper with a sync flag that defaults to true so it's possible to disable it if needed instead of calling a different function. That being said, I agree with Arthur, it should be handled in a different patch.
Re: Allow 'sslkey' and 'sslcert' in postgres_fdw user mappings
Re: To Andrew Dunstan 2020-01-09 <20200109103014.ga4...@msg.df7cb.de> > I believe the options are still used in that case > for creating connections, even when that means the remote server isn't > set up for cert auth, which needs password_required=false to succeed. They are indeed: stat("/var/lib/postgresql/.postgresql/root.crt", 0x7ffcff3e2bb0) = -1 ENOENT (Datei oder Verzeichnis nicht gefunden) stat("/foo", 0x7ffcff3e2bb0)= -1 ENOENT (Datei oder Verzeichnis nicht gefunden) sslcert I'm not sure if that could be exploited in any way, but let's just forbid it. Christoph
Re: Error message inconsistency
Hi Mahendra, Thanks for the patch. I am not sure, but maybe the relation name should also be added to the following test case?

create table t4 (id int);
insert into t4 values (1);
ALTER TABLE t4 ADD CONSTRAINT c1 CHECK (id > 10) NOT VALID; -- succeeds
ALTER TABLE t4 VALIDATE CONSTRAINT c1;
ERROR:  check constraint "c1" is violated by some row

On Mon, 6 Jan 2020 at 18:31, Mahendra Singh Thalor wrote: > On Sat, 6 Jul 2019 at 09:53, Amit Kapila wrote: > > > > On Mon, Jul 1, 2019 at 10:05 PM Alvaro Herrera > wrote: > > > > > > Do we have an actual patch here? > > > > > > > We have a patch, but it needs some more work like finding similar > > places and change all of them at the same time and then change the > > tests to adapt the same. > > Hi all, > Based on the above discussion, I tried to find all the places where we > need to change the error for the "not null constraint". As Amit Kapila pointed out > one place, I changed the error and am attaching the modified patch. > > What does this patch do? > Before this patch, some "not-null constraint" errors did not display the relation > name, so the attached patch adds the relation name to the error in 2 places. I didn't change > the expected output files of the test suite as we haven't finalized the error messages. > > I also verified Robert's point about partition tables. With the error, we > are adding the relation name of the child table, and I think it is correct. > > Please review the attached patch and let me know your feedback. > > Thanks and Regards > Mahendra Singh Thalor > EnterpriseDB: http://www.enterprisedb.com -- M Beena Emerson
Re: [HACKERS] Block level parallel vacuum
Hello I noticed that parallel vacuum uses the min_parallel_index_scan_size GUC to skip small indexes, but this is not mentioned in the documentation for either the vacuum command or the GUC itself.

+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup - 1;
+		else
+			nworkers = lps->nindexes_parallel_cleanup - 1;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel - 1;

(lazy_parallel_vacuum_indexes)

Perhaps we need to add a comment for future readers explaining why we reduce the number of workers by 1. Maybe this would be cleaner?

+	/* Determine the number of parallel workers to launch */
+	if (lps->lvshared->for_cleanup)
+	{
+		if (lps->lvshared->first_time)
+			nworkers = lps->nindexes_parallel_cleanup +
+				lps->nindexes_parallel_condcleanup;
+		else
+			nworkers = lps->nindexes_parallel_cleanup;
+	}
+	else
+		nworkers = lps->nindexes_parallel_bulkdel;
+
+	/* The leader process will participate */
+	nworkers--;

I have no more comments after reading the patches. regards, Sergei
Re: [Proposal] Global temporary tables
On 06.01.2020 8:04, 曾文旌(义从) wrote: In the previous communication 1 we agreed on the general direction 1.1 gtt use local (private) buffer 1.2 no replica access in first version 2 We feel that gtt needs to maintain statistics, but there is no agreement on how it will be done. 3 Still no one commented on GTT's transaction information processing, they include 3.1 Should gtt's frozenxid need to be care? 3.2 gtt’s clog clean 3.3 How to deal with "too old" gtt data I suggest we discuss further, reach an agreement, and merge the two patches into one. I also hope that we can come to a common solution for GTT. If we do not try to address parallel execution issues and access to temp tables on replicas (and I agree that this should be avoided in the first version of the patch), then the GTT patch becomes quite small. The most complex and challenging task is to support GTT for all kinds of indexes. Unfortunately I cannot propose a good universal solution for it. Just patching all existing index implementations seems to be the only choice. Statistics are another important case. But once again, I do not completely understand why we want to address all these issues with statistics in the first version of the patch. It contradicts the idea of making this patch as small as possible. Also, it seems to me that everybody agreed that users very rarely create indexes on temp tables or explicitly analyze them. So I think GTT will be useful even with limited support for statistics. In my version, statistics for GTT are provided by pushing the corresponding information into the backend's cache for the pg_statistic table. I also provided a pg_temp_statistic view for inspecting it by users. The idea of making pg_statistic a view which combines statistics of normal and temporary tables is overkill from my point of view. I do not understand why we need to maintain a hash with some extra information for GTT in backend memory (as it was done in Wenjing's patch).
Also, the idea of using CREATE EXTENSION for accessing this information seems dubious. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company