Re: [HACKERS] cache lookup failed error for partition key with custom opclass
On Mon, Jul 24, 2017 at 7:23 PM, Tom Lane wrote:
> Rushabh Lathia writes:
> > PFA patch, where added elog() to add the error message same as all other
> > places.
>
> Some looking around says that this *isn't* the only place that just
> blithely assumes that it will find an opfamily entry.  But I agree
> that not checking for that isn't up to project standards.

Thanks Tom. I went through the get_opfamily_proc() calls in the system and
added the check for InvalidOid.

Thanks,
Rushabh Lathia
www.EnterpriseDB.com

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 5267a01..c9c1a54 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -1640,6 +1640,9 @@ ExecInitExprRec(Expr *node, PlanState *parent, ExprState *state,
 										 lefttype,
 										 righttype,
 										 BTORDER_PROC);
+			if (!OidIsValid(proc))	/* should not happen */
+				elog(ERROR, "missing support function %d(%u,%u) in opfamily %u",
+					 BTORDER_PROC, lefttype, righttype, opfamily);
 
 			/* Set up the primary fmgr lookup information */
 			finfo = palloc0(sizeof(FmgrInfo));
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index d8aceb1..ff758d6 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -1367,6 +1367,9 @@ ExecIndexBuildScanKeys(PlanState *planstate, Relation index,
 										   op_lefttype,
 										   op_righttype,
 										   BTORDER_PROC);
+			if (!OidIsValid(opfuncid))	/* should not happen */
+				elog(ERROR, "missing support function %d(%u,%u) in opfamily %u",
+					 BTORDER_PROC, op_lefttype, op_righttype, opfamily);
 
 			inputcollation = lfirst_oid(collids_cell);
 			collids_cell = lnext(collids_cell);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 43238dd..6bd93b0 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -945,6 +945,10 @@ RelationBuildPartitionKey(Relation relation)
 									 opclassform->opcintype,
 									 opclassform->opcintype,
 									 BTORDER_PROC);
+		if (!OidIsValid(funcid))	/* should not happen */
+			elog(ERROR, "missing support function %d(%u,%u) in opfamily %u",
+				 BTORDER_PROC, opclassform->opcintype, opclassform->opcintype,
+				 opclassform->opcfamily);
 
 		fmgr_info(funcid, &key->partsupfunc[i]);
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 7ec31eb..bc2f164 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -440,6 +440,10 @@ lookup_type_cache(Oid type_id, int flags)
 									   typentry->btree_opintype,
 									   typentry->btree_opintype,
 									   BTORDER_PROC);
+		if (!OidIsValid(cmp_proc))	/* should not happen */
+			elog(ERROR, "missing support function %d(%u,%u) in opfamily %u",
+				 BTORDER_PROC, typentry->btree_opintype,
+				 typentry->btree_opintype, typentry->btree_opf);
 
 		/* As above, make sure array_cmp or record_cmp will succeed */
 		if (cmp_proc == F_BTARRAYCMP &&

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Testlib.pm vs msys
I wrote:
> Andrew Dunstan writes:
>> The problem is command_like's use of redirection to strings. Why this
>> should be a problem for this particular use is a matter of speculation.

> I looked at jacana's two recent pg_ctlCheck failures, and they both
> seem to have failed on this:
>
>     command_like([ 'pg_ctl', 'start', '-D', "$tempdir/data",
>         '-l', "$TestLib::log_path/001_start_stop_server.log" ],
>         qr/done.*server started/s, 'pg_ctl start');
>
> That is redirecting the postmaster's stdout/stderr into a file,
> for sure, so the child processes shouldn't impact EOF detection AFAICS.
> It's also hard to explain this way why it only fails some of the time.

I reflected a bit more on this issue, and realized that there's a pretty
obvious mechanism for this to happen. While on Unix, we have pg_ctl
fork-and-exec the postmaster so that no extra processes are laying about,
that's not the case on Windows. The Windows code in pg_ctl.c creates a
subprocess that runs CMD.EXE, which in turn runs the postmaster as a
subprocess. The CMD.EXE process doesn't disappear until the postmaster
exits.

Now, we tell CMD to redirect the postmaster's stdout and stderr into
files, but there's no way to tell CMD to redirect its own handles. So if
the CMD process inherits pg_ctl's stdout and stderr, then the prove
process would not see EOF on those pipes after pg_ctl exits.

Now, this theory still has a Mack-truck-sized hole in it, which is if
that's the failure mechanism then why isn't it 100% effective? Seems like
the TAP test should fail every time, if this is a full description. But
maybe it's part of the answer, so it seems worth experimenting in this
direction.

A bit of googling Microsoft's documentation suggests that maybe all we
have to do is pass CreateProcessAsUser's bInheritHandles parameter as
FALSE not TRUE in pg_ctl.c. It's not apparent to me that there are any
handles we do need the CMD process to inherit.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
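[For concreteness, a sketch of the one-argument experiment Tom is
proposing. Everything except the bInheritHandles argument is a
placeholder, not the real pg_ctl.c call site:]

    /* Hypothetical sketch: keep the CMD.EXE wrapper from inheriting
     * pg_ctl's stdout/stderr, so prove's pipes see EOF as soon as
     * pg_ctl itself exits.  Token, command line, and error handling
     * here are placeholders. */
    STARTUPINFO si;
    PROCESS_INFORMATION pi;

    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);

    if (!CreateProcessAsUser(restrictedToken,
                             NULL,       /* application name */
                             cmdLine,    /* CMD.EXE /C "postgres ... > log 2>&1" */
                             NULL,       /* process security attributes */
                             NULL,       /* thread security attributes */
                             FALSE,      /* bInheritHandles: was TRUE */
                             0,          /* creation flags */
                             NULL,       /* environment */
                             NULL,       /* current directory */
                             &si, &pi))
        fprintf(stderr, "could not start server process\n");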
Re: [HACKERS] why not parallel seq scan for slow functions
On Mon, Jul 24, 2017 at 9:21 PM, Jeff Janes wrote:
> On Sat, Jul 22, 2017 at 8:53 PM, Amit Kapila wrote:
>> On Thu, Jul 13, 2017 at 7:38 AM, Amit Kapila wrote:
>>> On Wed, Jul 12, 2017 at 11:20 PM, Jeff Janes wrote:
>>>> Setting parallel_workers to 8 changes the threshold for the parallel
>>>> to even be considered from parallel_tuple_cost <= 0.0049 to <= 0.0076.
>>>> So it is going in the correct direction, but not by enough to matter.
>>>
>>> You might want to play with cpu_tuple_cost and or seq_page_cost.
>>
>> I don't know whether the patch will completely solve your problem, but
>> this seems to be the right thing to do. Do you think we should stick
>> this for next CF?
>
> It doesn't solve the problem for me, but I agree it is an improvement we
> should commit.

Okay, added the patch for next CF.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] More race conditions in logical replication
Alvaro Herrera wrote:
> I'm back at looking into this again, after a rather exhausting week.  I
> think I have a better grasp of what is going on in this code now,

Here's an updated patch, which I intend to push tomorrow.

> and it
> appears to me that we should change the locking so that active_pid is
> protected by ReplicationSlotControlLock instead of the slot's spinlock;
> but I haven't analyzed the idea fully yet and I don't have the patch
> done yet either.

This doesn't seem to be a good idea actually.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

>From 33f08678bf20eed3a4cb3d10960bb06543a1b3db Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Wed, 12 Jul 2017 18:38:33 -0400
Subject: [PATCH v4] Wait for slot to become free in when dropping it

---
 src/backend/replication/logical/logicalfuncs.c |   2 +-
 src/backend/replication/slot.c                 | 115 ++---
 src/backend/replication/slotfuncs.c            |  32 ---
 src/backend/replication/walsender.c            |   6 +-
 src/include/replication/slot.h                 |  10 ++-
 5 files changed, 110 insertions(+), 55 deletions(-)

diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 363ca82cb0..a3ba2b1266 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -244,7 +244,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
 	else
 		end_of_wal = GetXLogReplayRecPtr(&ThisTimeLineID);
 
-	ReplicationSlotAcquire(NameStr(*name));
+	ReplicationSlotAcquire(NameStr(*name), true);
 
 	PG_TRY();
 	{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index dc7de20e11..ea9cd1f22b 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -157,6 +157,7 @@ ReplicationSlotsShmemInit(void)
 			/* everything else is zeroed by the memset above */
 			SpinLockInit(&slot->mutex);
 			LWLockInitialize(&slot->io_in_progress_lock, LWTRANCHE_REPLICATION_SLOT_IO_IN_PROGRESS);
+			ConditionVariableInit(&slot->active_cv);
 		}
 	}
 }
@@ -313,25 +314,35 @@ ReplicationSlotCreate(const char *name, bool db_specific,
 	LWLockRelease(ReplicationSlotControlLock);
 
 	/*
-	 * Now that the slot has been marked as in_use and in_active, it's safe to
+	 * Now that the slot has been marked as in_use and active, it's safe to
 	 * let somebody else try to allocate a slot.
 	 */
 	LWLockRelease(ReplicationSlotAllocationLock);
+
+	/* Let everybody know we've modified this slot */
+	ConditionVariableBroadcast(&slot->active_cv);
 }
 
 /*
  * Find a previously created slot and mark it as used by this backend.
  */
 void
-ReplicationSlotAcquire(const char *name)
+ReplicationSlotAcquire(const char *name, bool nowait)
 {
-	ReplicationSlot *slot = NULL;
+	ReplicationSlot *slot;
+	int			active_pid;
 	int			i;
-	int			active_pid = 0; /* Keep compiler quiet */
 
+retry:
 	Assert(MyReplicationSlot == NULL);
 
-	/* Search for the named slot and mark it active if we find it. */
+	/*
+	 * Search for the named slot and mark it active if we find it.  If the
+	 * slot is already active, we exit the loop with active_pid set to the PID
+	 * of the backend that owns it.
+	 */
+	active_pid = 0;
+	slot = NULL;
 	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
 	for (i = 0; i < max_replication_slots; i++)
 	{
@@ -339,35 +350,59 @@ ReplicationSlotAcquire(const char *name)
 
 		if (s->in_use && strcmp(name, NameStr(s->data.name)) == 0)
 		{
+			/* Found the slot we want -- can we activate it? */
 			SpinLockAcquire(&s->mutex);
+			active_pid = s->active_pid;
 			if (active_pid == 0)
 				active_pid = s->active_pid = MyProcPid;
+
 			SpinLockRelease(&s->mutex);
 			slot = s;
+			break;
 		}
 	}
 	LWLockRelease(ReplicationSlotControlLock);
 
-	/* If we did not find the slot or it was already active, error out. */
 	if (slot == NULL)
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
 				 errmsg("replication slot \"%s\" does not exist", name)));
+
+	/*
+	 * If we found the slot but it's already active in another backend, we
+	 * either error out or retry after a short
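[The hunk above is cut off in the archive. The retry path it introduces
follows PostgreSQL's standard condition-variable wait pattern; roughly,
and not necessarily in the exact form Alvaro committed, it looks like
this sketch (the wait-event constant is an assumption on my part):]

    /* Rough sketch of the nowait/retry behavior the patch adds;
     * not the exact committed code. */
    if (active_pid != MyProcPid)
    {
        if (nowait)
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_IN_USE),
                     errmsg("replication slot \"%s\" is active for PID %d",
                            name, active_pid)));

        /* Sleep until the current holder signals active_cv, then retry */
        ConditionVariablePrepareToSleep(&slot->active_cv);
        ConditionVariableSleep(&slot->active_cv,
                               WAIT_EVENT_REPLICATION_SLOT_DROP);
        ConditionVariableCancelSleep();
        goto retry;
    }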
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Tomas Vondra writes:
> It seems to me that VACUUM and ANALYZE somewhat disagree on what exactly
> reltuples means. VACUUM seems to be thinking that
>     reltuples = live + dead
> while ANALYZE apparently believes that
>     reltuples = live

> The question is - which of the reltuples definitions is the right one?
> I've always assumed that "reltuples = live + dead" but perhaps not?

I think the planner basically assumes that reltuples is the live tuple
count, so maybe we'd better change VACUUM to get in step.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Hi,

It seems to me that VACUUM and ANALYZE somewhat disagree on what exactly
reltuples means. VACUUM seems to be thinking that

    reltuples = live + dead

while ANALYZE apparently believes that

    reltuples = live

This causes somewhat bizarre changes in the value, depending on which of
those commands was executed last.

To demonstrate the issue, let's create a simple table with 1M rows, delete
10% of the rows, and then do a bunch of VACUUM / ANALYZE and check
reltuples, n_live_tup and n_dead_tup in the catalogs. I've disabled
autovacuum so that it won't interfere with this, and there's another
transaction blocking VACUUM from actually cleaning any dead tuples.

test=# create table t as select i from generate_series(1,1000000) s(i);
test=# select reltuples, n_live_tup, n_dead_tup
         from pg_stat_user_tables join pg_class using (relname)
        where relname = 't';
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
     1e+06 |    1000000 |          0
(1 row)

So, that's nice. Now let's delete 10% of rows, and run VACUUM and ANALYZE
a few times.

test=# delete from t where random() < 0.1;
test=# vacuum t;
test=# select reltuples, n_live_tup, n_dead_tup
         from pg_stat_user_tables join pg_class using (relname)
        where relname = 't';
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
     1e+06 |     900413 |      99587

test=# analyze t;
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
    900413 |     900413 |      99587

test=# vacuum t;
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
     1e+06 |     900413 |      99587

So, analyze and vacuum disagree.

To further confuse the poor DBA, VACUUM always simply ignores the old
values while ANALYZE combines the old and new values on large tables (and
converges to the "correct" value after a few steps). This table is small
(less than 30k pages), so ANALYZE does not do that.

This is quite annoying, because people tend to look at reltuples while
investigating bloat (e.g. because the check_postgres query mentioned on
our wiki [1] uses reltuples in the formula).

[1] https://wiki.postgresql.org/wiki/Show_database_bloat

And when the cleanup is blocked for some reason (as in the example above),
VACUUM tends to be running much more often (because it can't clean up
anything). So reltuples tends to be set to the higher value, which I'd
argue is the wrong value for estimating bloat.

I haven't looked at the code yet, but I've confirmed this happens both on
9.6 and 10. I haven't checked older versions, but I guess those are
affected too.

The question is - which of the reltuples definitions is the right one?
I've always assumed that "reltuples = live + dead" but perhaps not?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: Fwd: [HACKERS] Syncing sql extension versions with shared library versions
Mat Arye writes:
> On Mon, Jul 24, 2017 at 1:38 PM, Tom Lane wrote:
>> I'm not really sure why planner hooks would have anything to do with your
>> exposed SQL API?

> Sorry what I meant was i'd like to package different versions of my
> extension -- both .sql and .c -- and have the extension act consistently
> for any version until I do a ALTER EXTENSION UPDATE.
> So, I'd prefer a DB with an older extension to have the logic/code in the
> hook not change even if I install a newer version's .so for use in another
> database (but don't update the extension to the newer version). Does that
> make any sense?

The newer version's .so simply is not going to load into the older
version; we intentionally prevent that from happening. It's not necessary
anyway because versions do not share library directories. Therefore, you
can have foo.so for 9.5 in your 9.5 version's library directory, and
foo.so for 9.6 in your 9.6 version's library directory, and the filesystem
will keep them straight for you. It is not necessary to call them
foo-9.5.so and foo-9.6.so.

As for the other point, the usual idea is that if you have a
SQL-accessible C function xyz() that needs to behave differently after an
extension version update, then you make the extension update script point
the SQL function to a different library entry point. If your 1.0
extension version originally had

    CREATE FUNCTION xyz(...) RETURNS ...
      LANGUAGE C AS 'MODULE_PATHNAME', 'xyz';

(note that the second part of the AS clause might have been implicit; no
matter), then your update script for version 1.1 could do

    CREATE OR REPLACE FUNCTION xyz(...) RETURNS ...
      LANGUAGE C AS 'MODULE_PATHNAME', 'xyz_1_1';

Then in the 1.1 version of the C code, the xyz_1_1() C function provides
the new behavior, while the xyz() C function provides the old behavior, or
maybe just throws an error if you conclude it's impractical to emulate the
old behavior exactly. As I mentioned earlier, you can usually set things
up so that you can share much of the code between two such functions.

The pgstattuple C function in contrib/pgstattuple is one example of having
changed a C function's behavior in this way over multiple versions.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
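[To make the pattern Tom describes concrete, a bare-bones sketch using
his hypothetical xyz names; the flag-passing arrangement is one common
way to share code between the two entry points, not the only one:]

    #include "postgres.h"
    #include "fmgr.h"

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(xyz);
    PG_FUNCTION_INFO_V1(xyz_1_1);

    /* shared implementation; behavior toggled by the flag */
    static Datum
    xyz_internal(FunctionCallInfo fcinfo, bool v1_1_behavior)
    {
        /* ... common work here ... */
        PG_RETURN_NULL();
    }

    /* old entry point: preserves 1.0 behavior for not-yet-updated DBs */
    Datum
    xyz(PG_FUNCTION_ARGS)
    {
        return xyz_internal(fcinfo, false);
    }

    /* new entry point, wired up by the 1.0--1.1 update script */
    Datum
    xyz_1_1(PG_FUNCTION_ARGS)
    {
        return xyz_internal(fcinfo, true);
    }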
Re: [HACKERS] Oddity in error handling of constraint violation in ExecConstraints for partitioned tables
On Mon, Jul 24, 2017 at 6:21 AM, Amit Langote wrote: > Yes, we need that there too. > > Done that in the attached v3 (including the test where > ExecPartitionCheck() would have crashed without the patch). Committed. Thanks to all of you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Fwd: [HACKERS] Syncing sql extension versions with shared library versions
(adding -hackers back into thread, got dropped by my email client, sorry) On Mon, Jul 24, 2017 at 1:38 PM, Tom Lane wrote: > Mat Arye writes: > > I tried looking in the contrib modules and didn't find many with lots of > > planner hook usage. > > I'm not really sure why planner hooks would have anything to do with your > exposed SQL API? > Sorry what I meant was i'd like to package different versions of my extension -- both .sql and .c -- and have the extension act consistently for any version until I do a ALTER EXTENSION UPDATE. So, I'd prefer a DB with an older extension to have the logic/code in the hook not change even if I install a newer version's .so for use in another database (but don't update the extension to the newer version). Does that make any sense? > > You will need to have separate builds of your extension for each PG > release branch you work with; we force that through PG_MODULE_MAGIC > whether you like it or not. But that doesn't translate to needing > different names for the library .so files. Generally people either > mantain separate source code per-branch (just as the core code does) > or put in a lot of #ifs testing CATALOG_VERSION_NO to see which > generation of PG they're compiling against. > Yeah we plan to use different branches for different PG versions. > > regards, tom lane >
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
On 2017-07-24 23:31, Mark Rofail wrote:
> On Mon, Jul 24, 2017 at 11:25 PM, Erik Rijkers wrote:
>> This patch doesn't apply to HEAD at the moment ( e2c8100e6072936 ).
>
> My bad, I should have mentioned that the patch is dependent on the
> original patch. Here is a *unified* patch that I just tested.

Thanks. Apply is now good, but I get this error when compiling:

ELEMENT' not present in UNRESERVED_KEYWORD section of gram.y
make[4]: *** [gram.c] Error 1
make[3]: *** [parser/gram.h] Error 2
make[2]: *** [../../src/include/parser/gram.h] Error 2
make[1]: *** [all-common-recurse] Error 2
make: *** [all-src-recurse] Error 2

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
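[That particular build error comes from the grammar's keyword
cross-check: a new keyword must be registered consistently in two
places. A sketch of the registration, assuming ELEMENT is intended to
be an unreserved keyword:]

    /* Sketch only, assuming ELEMENT should be unreserved.
     *
     * In src/backend/parser/gram.y, the token must be declared and also
     * listed in the matching keyword-category production:
     *
     *     %token <keyword> ... ELEMENT ...
     *
     *     unreserved_keyword:
     *         ... | ELEMENT | ...
     *
     * And in src/include/parser/kwlist.h (kept in alphabetical order):
     */
    PG_KEYWORD("element", ELEMENT, UNRESERVED_KEYWORD)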
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
On 2017-07-24 23:08, Mark Rofail wrote:
> Here is the new Patch with the bug fixes and the New Patch with the
> Index in place performance results.
>
> I just want to point this out because I still can't believe the numbers.
> In reference to the old patch: The new patch without the index suffers a
> 41.68% slow down, while the new patch with the index has a 95.18% speed
> up!
>
> [elemOperatorV4.patch]

This patch doesn't apply to HEAD at the moment ( e2c8100e6072936 ).

Can you have a look?

thanks,

Erik Rijkers

patching file doc/src/sgml/ref/create_table.sgml
Hunk #1 succeeded at 816 with fuzz 3.
patching file src/backend/access/gin/ginarrayproc.c
patching file src/backend/utils/adt/arrayfuncs.c
patching file src/backend/utils/adt/ri_triggers.c
Hunk #1 FAILED at 2650.
Hunk #2 FAILED at 2694.
2 out of 2 hunks FAILED -- saving rejects to file src/backend/utils/adt/ri_triggers.c.rej
patching file src/include/catalog/pg_amop.h
patching file src/include/catalog/pg_operator.h
patching file src/include/catalog/pg_proc.h
patching file src/test/regress/expected/arrays.out
patching file src/test/regress/expected/opr_sanity.out
patching file src/test/regress/sql/arrays.sql

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
It certainly is, thank you for the heads up. I included a note to encourage the user to index the referencing column instead. On Sun, Jul 23, 2017 at 4:41 AM, Robert Haas wrote: > > This is a jumbo king-sized can of worms, and even a very experienced > contributor would likely find it extremely difficult to sort all of > the problems that would result from a change in this area. Best Regards, Mark Rofail
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
> However, there is a bug that prevented me from testing the third
> scenario. I assume there's an incompatible-types problem, since the
> right operand type is anyelement and the supporting procedures expect
> anyarray. I am working on debugging it right now.

I have also solved the bug that prevented me from performance testing the
New Patch with the Index in place. Here is a summary of the results:

A - Original Patch
    DELETE Average Execution time = 3.508 ms
    UPDATE Average Execution time = 3.239 ms

B - New Patch
    DELETE Average Execution time = 4.970 ms
    UPDATE Average Execution time = 4.170 ms

C - With Index
    DELETE Average Execution time = 0.169 ms
    UPDATE Average Execution time = 0.147 ms
Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel
On Fri, Jul 21, 2017 at 8:05 AM, Alexey Chernyshov wrote: > the following patch transfers functionality from gevel module > (http://www.sai.msu.su/~megera/wiki/Gevel) which provides functions for > analyzing GIN and GiST indexes to pageinspect. Gevel was originally > designed by Oleg Bartunov, and Teodor Sigaev for developers of GiST and > GIN indexes. It's not clear from the web site in question that the relevant code is released under the PostgreSQL license. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Mishandling of WCO constraints in direct foreign table modification
On Fri, Jul 21, 2017 at 6:21 AM, Etsuro Fujita wrote:
> I mean constraints derived from WITH CHECK OPTIONs specified for parent
> views. We use the words "WITH CHECK OPTION constraints" in comments in
> nodeModifyTable.c, so the expression "CHECK OPTION constraints" doesn't
> sound that bad to me. (I used "CHECK OPTION", not "WITH CHECK OPTION",
> because we use "CHECK OPTION" a lot more in the documentation than "WITH
> CHECK OPTION".)

Yeah, it seems OK to me, too; if the consensus is otherwise, we also have
the option to change it later. Committed and back-patched as you had it,
but I removed a spurious comma.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
On 7/24/17 3:28 PM, David Steele wrote: Yes, and this is another behavioral change to consider -- one that is more likely to impact users than the change to pg_stop_backup(). If pg_basebackup is not currently waiting for WAL on a standby (but does on a primary) then that's pretty serious. My bad, before PG10 pg_basebackup did not check that WAL was archived on a primary *or* a standby unless the -x option was specified, as documented. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issue with circular references in VIEW
Le 24/07/2017 à 21:18, Tom Lane a écrit : > Gilles Darold writes: >> Le 24/07/2017 à 19:19, Tom Lane a écrit : >>> ... I'm inclined to think in terms of fixing it at that level >>> rather than in pg_dump. It doesn't look like it would be hard to fix: >>> both functions ultimately call get_query_def(), it's just that one passes >>> down a tuple descriptor for the view while the other currently doesn't. >> I was thinking that this was intentional that pg_get_ruledef() returns >> the raw code typed by the user. I will fix it and send a patch following >> your explanation. > Oh, I just committed a patch. That's fine, I'm sure it is better than the one I could produce :-) Thanks for fixing this issue. -- Gilles Darold Consultant PostgreSQL http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
On Mon, Jul 24, 2017 at 6:45 PM, Stephen Frost wrote:
> What the change would do is make the pg_stop_backup() caller block until
> the last WAL is archived, and perhaps that ends up taking hours, and
> then the connection is dropped for whatever reason and the backup fails
> where it otherwise, what, wouldn't have been valid anyway at that
> point, since it's not valid until the last WAL is actually archived.
> Perhaps eventually it would be archived and the caller was planning for
> that and everything is fine, but, well, that feels like an awful lot of
> wishful thinking.

Letting users unknowingly take inconsistent backups is worse than
potentially breaking scripts that were actually not working as Postgres
would expect. So I am +1 for back-patching a lighter version of the
proposed patch that makes the wait happen on purpose.

>> > I'd hate to have to do it, but we could technically add a GUC to address
>> > this in the back-branches, no? I'm not sure that's really worthwhile
>> > though..
>>
>> That would be mighty ugly.
>
> Oh, absolutely agreed.

Yes, let's avoid that. We are talking about a switch aimed at making
backups potentially inconsistent.

--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buildfarm failure and dubious coding in predicate.c
Thomas Munro writes: > Ahh, I think I see it. This is an EXEC_BACKEND build farm animal. > Theory: After the backend we see had removed the scratch entry and > before it had restored it, another backend started up and ran > InitPredicateLocks(), which inserted a new scratch entry without > interlocking. Ouch. Yes, I think you're probably right. It needs to skip that if IsUnderPostmaster. Seems like there ought to be an Assert(!found) there, too. And I don't think I entirely like the fact that there's no assertions about the found/not found cases below, either. Will fix, unless you're already on it? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
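[For reference, the shape of the fix being described, as a rough sketch
rather than the exact commit:]

    /* Rough sketch of the InitPredicateLocks() fix Tom describes: only
     * the postmaster (or a standalone backend) should create the scratch
     * entry; under EXEC_BACKEND, newly started backends re-run this
     * function and must not insert a second one.  Not the exact
     * committed code. */
    if (!IsUnderPostmaster)
    {
        (void) hash_search(PredicateLockTargetHash, &ScratchTargetTag,
                           HASH_ENTER, &found);
        Assert(!found);
    }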
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
On 7/24/17 12:28 PM, Stephen Frost wrote:
> * David Steele (da...@pgmasters.net) wrote:
>> While this patch brings pg_stop_backup() in line with the documentation,
>> it also introduces a behavioral change compared to 9.6. Currently, the
>> default 9.6 behavior on a standby is to return immediately from
>> pg_stop_backup(), but this patch will cause it to block by default
>> instead. Since action on the master may be required to unblock the
>> process, I see this as a pretty significant change. I still think we
>> should fix and backpatch, but I wanted to point this out.
>
> This will need to be highlighted in the release notes for 9.6.4 also,
> assuming there is agreement to back-patch this, and we'll need to update
> the documentation in 9.6 also to be clearer about what happens on a
> standby.

Agreed.

>> "If the WAL segments required for this backup have not been archived then
>> it might be necessary to force a segment switch on the primary."
>
> I'm a bit on the fence regarding the distinction here for the
> backup-from-standby and this errhint(). The issue is that the WAL for
> the backup hasn't been archived yet and that may be because of either:
>
> archive_command is failing
>
> OR
>
> No WAL is getting generated
>
> In either case, both of these apply to the primary and the standby. As
> such, I'm inclined to try to mention both in the original errhint()
> instead of making two different ones. I'm open to suggestions on this,
> of course, but my thinking would be more like:
>
> - Either archive_command is failing or not enough WAL has been generated
> to require a segment switch. Run pg_switch_wal() to request a WAL switch
> and monitor your logs to check that your archive_command is executing
> properly. -

Yes, that seems more concise. I like the idea of not having to maintain
two separate hints.

> And then change pg_switch_wal()'s errhint for when it's run on a replica
> to be:
>
> HINT: WAL control functions cannot be executed during recovery; they
> should be executed on the primary instead.

Looks good to me. This explanation is useful in general.

>> 2) At backup.sgml L1015 it says that pg_stop_backup() will automatically
>> switch the WAL segment. There should be a caveat here for standby
>> backups, like:
>>
>> When called on a master it terminates the backup mode and performs an
>> automatic switch to the next WAL segment. The reason for the switch is
>> to arrange for the last WAL segment written during the backup interval
>> to be ready to archive. When called on a standby the WAL segment switch
>> must be performed manually on the master if it does not happen due to
>> normal write activity.
>
> s/master/primary/g

Yes.

>> 3) The fact that this fix requires "archive_mode = always" seems like a
>> pretty big caveat, though I don't have any ideas on how to make it
>> better without big code changes. Maybe it would help to change:
>>
>> + the backup is taken on a standby, pg_stop_backup waits
>> + for WAL to be archived when archive_mode
>>
>> To:
>>
>> + the backup is taken on a standby, pg_stop_backup waits
>> + for WAL to be archived *only* when archive_mode
>
> I'm thinking of rewording that a bit to say "When archive_mode = always"
> instead, as I think that might be a little clearer.

Sure.

>> Perhaps this should be noted in the pg_basebackup docs as well?
>
> That seems like it's probably a good idea too, as people might wonder
> why pg_basebackup hasn't finished yet if it's waiting for WAL to be
> archived.

Yes, and this is another behavioral change to consider -- one that is
more likely to impact users than the change to pg_stop_backup(). If
pg_basebackup is not currently waiting for WAL on a standby (but does on
a primary) then that's pretty serious.

Any thoughts on this, Magnus?

--
-David
da...@pgmasters.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buildfarm failure and dubious coding in predicate.c
On Tue, Jul 25, 2017 at 7:24 AM, Tom Lane wrote: > Thomas Munro writes: >> Ahh, I think I see it. This is an EXEC_BACKEND build farm animal. >> Theory: After the backend we see had removed the scratch entry and >> before it had restored it, another backend started up and ran >> InitPredicateLocks(), which inserted a new scratch entry without >> interlocking. > > Ouch. Yes, I think you're probably right. It needs to skip that if > IsUnderPostmaster. Seems like there ought to be an Assert(!found) > there, too. And I don't think I entirely like the fact that there's > no assertions about the found/not found cases below, either. > > Will fix, unless you're already on it? I was going to send a short patch that would test IsUnderPostmaster, but I got lost down a rabbit hole trying to figure out how to make my EXEC_BACKEND builds run on this machine... Please go ahead. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issue with circular references in VIEW
Gilles Darold writes: > Le 24/07/2017 à 19:19, Tom Lane a écrit : >> ... I'm inclined to think in terms of fixing it at that level >> rather than in pg_dump. It doesn't look like it would be hard to fix: >> both functions ultimately call get_query_def(), it's just that one passes >> down a tuple descriptor for the view while the other currently doesn't. > I was thinking that this was intentional that pg_get_ruledef() returns > the raw code typed by the user. I will fix it and send a patch following > your explanation. Oh, I just committed a patch. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buildfarm failure and dubious coding in predicate.c
On Mon, Jul 24, 2017 at 11:51 AM, Thomas Munro wrote: > On Sun, Jul 23, 2017 at 8:32 AM, Tom Lane wrote: >> Meanwhile, it's still pretty unclear what happened yesterday on >> culicidae. > > That failure is indeed baffling. The only code that inserts > (HASH_ENTER[_NULL]) into PredicateLockTargetHash: > > 1. CreatePredicateLock(). I would be a bug if that ever tried to > insert a { 0, 0, 0, 0 } tag, and in any case it holds > SerializablePredicateLockListLock in LW_SHARED. > > 2. TransferPredicateLocksToNewTarget(), which removes and restores > the scratch entry and also explicitly inserts a transferred entry. It > asserts that it holds SerializablePredicateLockListLock and is called > only by PredicateLockPageSplit() which acquires it in LW_EXCLUSIVE. > > 3. DropAllPredicateLocksFromTable(), which removes and restores the > scratch entry and also explicitly inserts a transferred entry. > Acquires SerializablePredicateLockListLock in LW_EXCLUSIVE. Ahh, I think I see it. This is an EXEC_BACKEND build farm animal. Theory: After the backend we see had removed the scratch entry and before it had restored it, another backend started up and ran InitPredicateLocks(), which inserted a new scratch entry without interlocking. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Issue with circular references in VIEW
Le 24/07/2017 à 19:19, Tom Lane a écrit : > Gilles Darold writes: >> There is an issue with version prior to 10 when dumping views with circular >> references. I know that these views are now exported as views in 10 but they >> are still exported as TABLE + RULE in prior versions. This conduct to the >> following error when columns of sub-queries doesn't have the same aliases >> names: > The core of this issue, I think, is that pg_get_viewdef() knows that it > should make what it prints have output column names that match the view, > whereas pg_get_ruledef() does not, even when it is printing an ON SELECT > rule. This is a little bit surprising --- you'd really expect those > functions to produce identical SELECT statements --- and I think it's > likely to break other tools even if pg_dump has managed to skirt the > issue. So I'm inclined to think in terms of fixing it at that level > rather than in pg_dump. It doesn't look like it would be hard to fix: > both functions ultimately call get_query_def(), it's just that one passes > down a tuple descriptor for the view while the other currently doesn't. I was thinking that this was intentional that pg_get_ruledef() returns the raw code typed by the user. I will fix it and send a patch following your explanation. Thanks. -- Gilles Darold Consultant PostgreSQL http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Definitional questions for pg_sequences view
On 7/20/17 16:36, Tom Lane wrote: > What exactly is the point of the new pg_sequences view? It is analogous to pg_tables, pg_matviews, pg_indexes, and other such system views that are sort of half-way between system catalogs and information schema. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On Mon, Jul 24, 2017 at 10:50 AM, Joshua D. Drake wrote: > Does this suggest that we don't have a cleanup problem but a fragmentation > problem (or both at least for the index)? Having an index that is almost > twice the uncleaned up size isn't that uncommon. As Tom pointed out up-thread, it's important to distinguish between inherent overhead, and overhead due to garbage that needs to be cleaned-up by vacuum. It's really hard to delineate which is which here, and I'm not going to try to put a number on it. What I will point out is that you can see quite a significant difference between the space utilization of a B-Tree without any dead tuples, just from the order in which tuples are initially inserted. You can get about a 1/3 loss of space by inserting randomly, rather than inserting in sorted order, which is what REINDEX will more or less do for you. That's because random workloads almost entirely get 50:50 page splits, whereas sorted input will always split the rightmost page, and so will always get 90:10 splits. The space in the random case isn't exactly wasted; it's there for the taking, for key values that happen to fit on the page. You effectively require a larger average reserve of free space on pages with the random workload, because the implementation does not and cannot reason that it would be best to concentrate free space in parts of the keyspace where there is most need for it. That having been said, I do think that this workload suffers from index bloat in a way that isn't so easily explained. It does seem to be an issue with VACUUM controlling bloat in the index in particular. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
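[A quick sanity check of that 1/3 figure -- my arithmetic, not from the
thread: leaves produced by 50:50 splits drift between half full and
full, averaging roughly 67-70% utilization, while 90:10 splits on
sorted input leave pages about 90% full; 0.90 / 0.67 is about 1.35,
i.e., the randomly built index needs about a third more space.]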
Re: [HACKERS] segfault in HEAD when too many nested functions call
On 2017-07-24 13:27:58 -0400, Tom Lane wrote: > Andres Freund writes: > >> I seriously, seriously, seriously dislike that. You practically might as > >> well put miscadmin.h into postgres.h. Instead, what do you think of > >> requiring the individual ExecProcNode functions to perform > >> CHECK_FOR_INTERRUPTS? Since they're already responsible for doing that > >> if they have any long-running internal loops, this doesn't seem like a > >> modularity violation. It is a risk for bugs-of-omission, sure, but so > >> are a lot of other things that the per-node code has to do. > > > That'd work. Another alternative would be to move the inline definition > > of ExecProcNode() (and probably a bunch of other related functions) into > > a more internals oriented header. It seems likely that we're going to > > add more inline functions to the executor, and that'd reduce the > > coupling of external and internal users a bit. > > Well, it still ends up that the callers of ExecProcNode need to include > miscadmin.h, whereas if we move it into the per-node functions, then the > per-node files need to include miscadmin.h. I think the latter is better > because those files may need to have other CHECK_FOR_INTERRUPTS calls > anyway. > It's less clear from a modularity standpoint that executor > callers should need miscadmin.h. Well, that's why I'm pondering an executor_internal.h or something - there shouldn't be ExecProcNode() callers that don't also need CFI(), and no executor callers should need ExecProcNode(). executor.h right now really defines infrastructure to *use* the executor (Executor{Start,Run,Finish,End,Rewind}), functions internal to the executor (lots of initialization functions, EPQ, partition logic), some things inbetween (e.g. expression related stuff), and some things that really should be separate ExecOpenIndices etc, execReplication.c functions. But that's not something we can easily clear up just now. > (Or in short, I'm not really okay with *any* header file including > miscadmin.h.) Perhaps that's a sign that we should split it up? It's a weird grab bag atm. utils/interrupt.h or such would e.g. make sense for for the *INTERRUPTS, and *CRIT_SECTION macros, as well as ProcessInterrupts() itself, which imo isn't super well placed in postgres.c either. Including utils/interrupt.h in a header would be much less odious in my opinion than including miscadmin.h. > >> * Can we redefine the ExecCustomScan function pointer as type > >> ExecProcNodeCB, eliminating the unsightly cast in nodeCustom.c? > > > That'd change an "extension API", which is why I skipped it at this > > point of the release cycle. It's not like we didn't have this type of > > cast all over before. Ok, with changing it, but that's where I came > > down. > > Is this patch really not changing anything else that a custom-scan > extension would touch? If not, I'm okay with postponing this bit > of cleanup to v11. Not that I can see - I've build & tested citus which uses custom scans these days with and without patch without trouble. Nor do I see any change in the current patch that'd be troublesome - after all the API of ExecProcNode() stays the same. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location
On Mon, Jul 24, 2017 at 2:43 PM, Peter Geoghegan wrote: > On Mon, Jul 24, 2017 at 10:13 AM, Claudio Freire > wrote: >> Vacuum *might* be able to redistribute tuples too, while holding locks >> to all 3 pages (the two children and the parent page), since it can >> move the TID to the middle point of two actual child TIDs, mindlessly, >> without checking to see if such TID actually exists - it just needs to >> choose a TID cutoff point that will distribute item pointers evently. >> I haven't tried this, but it is theoretically possible. > > I don't understand what you mean here. As long as the TIDs are a first > class part of the keyspace, what VACUUM does or doesn't do should not > matter. Page deletion stashes a TID in the highkey position of a leaf > page within _bt_mark_page_halfdead(), but that shouldn't matter to > you. > > I guess you're talking about contriving a TID value that is the mean > of two real TID values -- the two that are on each side of the split > point during a leaf page split. While that seems possible, I don't see > much value in it. Unless you have normalized keys, you can only really > truncate whole attributes. And, I think it's a bad idea to truncate > anything other than the new high key for leaf pages, with or without > normalized keys. Changing the keys once they're in internal pages is > never going to be worth it. That's what I'm saying. It might not be worth it. I haven't tested yet. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
On 07/23/2017 12:03 PM, Joshua D. Drake wrote:
> As you can see even with aggressive vacuuming, over a period of 36 hours
> life gets increasingly miserable. The largest table is:
>
> postgres=# select pg_size_pretty(pg_total_relation_size('bmsql_order_line'));
>  pg_size_pretty
> ----------------
>  148 GB
> (1 row)
>
> [snip]
>
> With the PK being
>
> postgres=# select pg_size_pretty(pg_relation_size('bmsql_order_line_pkey'));
>  pg_size_pretty
> ----------------
>  48 GB
> (1 row)
>
> I tried to see how much data we are dealing with here:

-hackers,

I cleaned up the table with VACUUM FULL and ended up with the following:

postgres=# select pg_size_pretty(pg_total_relation_size('bmsql_order_line'));
 pg_size_pretty
----------------
 118 GB
(1 row)

postgres=# select pg_size_pretty(pg_relation_size('bmsql_order_line_pkey'));
 pg_size_pretty
----------------
 27 GB
(1 row)

Does this suggest that we don't have a cleanup problem but a
fragmentation problem (or both, at least for the index)? Having an index
that is almost twice its cleaned-up size isn't that uncommon.

Thanks in advance,

JD

--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own. *

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location
On Mon, Jul 24, 2017 at 10:13 AM, Claudio Freire wrote: > In most cases, the TID itself can be omitted when the key itself > differs already. > > When a split happens, a TID will be added referring to a real tid from > a child page iff necessary. > > This gives a lot of leeway in choosing the cutoff point, since a > cutoff point is only added when necessary. I agree with all that. That just sounds like a basic implementation of suffix truncation, to support making heap TID a part of the keyspace without paying a big cost in fan-in. > Vacuum *might* be able to redistribute tuples too, while holding locks > to all 3 pages (the two children and the parent page), since it can > move the TID to the middle point of two actual child TIDs, mindlessly, > without checking to see if such TID actually exists - it just needs to > choose a TID cutoff point that will distribute item pointers evently. > I haven't tried this, but it is theoretically possible. I don't understand what you mean here. As long as the TIDs are a first class part of the keyspace, what VACUUM does or doesn't do should not matter. Page deletion stashes a TID in the highkey position of a leaf page within _bt_mark_page_halfdead(), but that shouldn't matter to you. I guess you're talking about contriving a TID value that is the mean of two real TID values -- the two that are on each side of the split point during a leaf page split. While that seems possible, I don't see much value in it. Unless you have normalized keys, you can only really truncate whole attributes. And, I think it's a bad idea to truncate anything other than the new high key for leaf pages, with or without normalized keys. Changing the keys once they're in internal pages is never going to be worth it. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] segfault in HEAD when too many nested functions call
Andres Freund writes: > On 2017-07-21 20:17:54 -0400, Tom Lane wrote: >>> I dislike having the miscadmin.h include in executor.h - but I don't >>> quite see a better alternative. >> I seriously, seriously, seriously dislike that. You practically might as >> well put miscadmin.h into postgres.h. Instead, what do you think of >> requiring the individual ExecProcNode functions to perform >> CHECK_FOR_INTERRUPTS? Since they're already responsible for doing that >> if they have any long-running internal loops, this doesn't seem like a >> modularity violation. It is a risk for bugs-of-omission, sure, but so >> are a lot of other things that the per-node code has to do. > That'd work. Another alternative would be to move the inline definition > of ExecProcNode() (and probably a bunch of other related functions) into > a more internals oriented header. It seems likely that we're going to > add more inline functions to the executor, and that'd reduce the > coupling of external and internal users a bit. Well, it still ends up that the callers of ExecProcNode need to include miscadmin.h, whereas if we move it into the per-node functions, then the per-node files need to include miscadmin.h. I think the latter is better because those files may need to have other CHECK_FOR_INTERRUPTS calls anyway. It's less clear from a modularity standpoint that executor callers should need miscadmin.h. (Or in short, I'm not really okay with *any* header file including miscadmin.h.) >> * I think the comments need more work. Am willing to make a pass over >> that if you want. > That'd be good, but let's wait till we have something more final. Agreed, I'll wait till you produce another version. >> * Can we redefine the ExecCustomScan function pointer as type >> ExecProcNodeCB, eliminating the unsightly cast in nodeCustom.c? > That'd change an "extension API", which is why I skipped it at this > point of the release cycle. It's not like we didn't have this type of > cast all over before. Ok, with changing it, but that's where I came > down. Is this patch really not changing anything else that a custom-scan extension would touch? If not, I'm okay with postponing this bit of cleanup to v11. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
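[A sketch of the per-node alternative discussed above; illustrative
only, the real node functions do more than this. The PlanState*
signature is the one Andres's patch introduces:]

    /* Illustrative sketch: each ExecProcNode implementation starts with
     * its own interrupt check, so executor.h callers never need
     * miscadmin.h.  Simplified from nodeSeqscan.c. */
    #include "miscadmin.h"

    static TupleTableSlot *
    ExecSeqScan(PlanState *pstate)
    {
        SeqScanState *node = castNode(SeqScanState, pstate);

        CHECK_FOR_INTERRUPTS();

        return ExecScan(&node->ss,
                        (ExecScanAccessMtd) SeqNext,
                        (ExecScanRecheckMtd) SeqRecheck);
    }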
Re: [HACKERS] Increase Vacuum ring buffer.
On Mon, Jul 24, 2017 at 2:20 PM, Claudio Freire wrote: > On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura > wrote: >> On 2017-07-24 19:11, Claudio Freire wrote: >>> I was mostly thinking about something like the attached patch. >>> >>> Simple, unintrusive, and shouldn't cause any noticeable slowdown. >> >> >> Your change is small, clear, and currently useful for huge tables under >> high update load (until "allowing vacuum to use more than 1GB memory" >> is merged). > > In high-bloat conditions, it doesn't take long to accumulate 1GB of > dead tuples (which is about 178M tuples, btw). > > The index scan takes way longer than the heap scan in that case. > >> But it still delays updating fsm until whole first batch of dead tuples >> cleared (ie all indices scanned, and all heap pages cleared), and on such >> huge table it will be hours. > > So, true, it will get delayed considerably. But as you realized, > there's not much point in trying to vacuum the FSM sooner, since it > won't be accurate shortly afterwards anyway. Dead line pointers do use > up a fair bit of space, especially on narrow tables. > > In a particular table I have that exhibits this problem, most of the > time is spent scanning the index. It performs dozens of index scans > before it's done, so it would vacuum the FSM quite often enough, even > if I were to increase the mwm setting n-fold. I hate to reply to myself, but I wanted to add: in any case, the case I'm trying to avoid is the case where the FSM *never* gets vacuumed. That's bad. But it may not be the phenomenon you're experiencing in your tests. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Increase Vacuum ring buffer.
On Mon, Jul 24, 2017 at 2:10 PM, Sokolov Yura wrote: > On 2017-07-24 19:11, Claudio Freire wrote: >> >> On Mon, Jul 24, 2017 at 6:37 AM, Sokolov Yura >> wrote: >>> >>> Good day, Claudio >>> >>> >>> On 2017-07-22 00:27, Claudio Freire wrote: On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura wrote: > > > > My friend noticed, that I didn't said why I bother with autovacuum. > Our customers suffers from table bloating. I've made synthetic > bloating test, and started experiments with modifying micro- and > auto-vacuum. My first attempts were to update FSM early (both in > micro and autovacuum) and update it upto root, not only low level. This FSM thing is probably not a bad idea as well. We're forced to run regular manual vacuums because for some tables autovacuums seems to never be enough, no matter how it's configured, mostly because it gets canceled all the time. These are high-churn, huge tables, so vacuuming them takes hours or days, there's always someone with a conflicting lock at some point that ends up canceling the autovacuum task. The above paragraph triggered me to go check, and it seems in those cases the FSM never gets vacuumed. That's probably not a good thing, but I don't see how to vacuum the FSM after a cancel. So vacuuming the FSM from time to time during long-running vacuums seems like a good idea at this point. >>> >>> >>> >>> Attached patch changes fsm update: instead of updating only lowest >>> level, it propagates space increase up to root. >>> >>> It slows autovacuum a bit, so that I didn't propose it together with >>> ring buffer increase. >> >> >> I was mostly thinking about something like the attached patch. >> >> Simple, unintrusive, and shouldn't cause any noticeable slowdown. > > > Your change is small, clear, and currently useful for huge tables under > high update load (until "allowing vacuum to use more than 1GB memory" > is merged). In high-bloat conditions, it doesn't take long to accumulate 1GB of dead tuples (which is about 178M tuples, btw). The index scan takes way longer than the heap scan in that case. > But it still delays updating fsm until whole first batch of dead tuples > cleared (ie all indices scanned, and all heap pages cleared), and on such > huge table it will be hours. So, true, it will get delayed considerably. But as you realized, there's not much point in trying to vacuum the FSM sooner, since it won't be accurate shortly afterwards anyway. Dead line pointers do use up a fair bit of space, especially on narrow tables. In a particular table I have that exhibits this problem, most of the time is spent scanning the index. It performs dozens of index scans before it's done, so it would vacuum the FSM quite often enough, even if I were to increase the mwm setting n-fold. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
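[A rough sketch of the kind of change being discussed, against the
periodic cleanup round in lazy_scan_heap(); the placement and the
simplified trigger condition are mine, not necessarily the attached
patch's:]

    /* Rough sketch: vacuum the FSM after each index-cleaning round, so a
     * long-running vacuum publishes freed space even if it is later
     * canceled.  Trigger condition simplified for illustration. */
    if (vacrelstats->num_dead_tuples >= maxtuples)
    {
        /* Remove index entries, then dead tuples from the heap */
        for (i = 0; i < nindexes; i++)
            lazy_vacuum_index(Irel[i], &indstats[i], vacrelstats);
        lazy_vacuum_heap(onerel, vacrelstats);
        vacrelstats->num_dead_tuples = 0;

        /* Make the space we just freed visible in the upper FSM levels */
        FreeSpaceMapVacuum(onerel);
    }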
Re: [HACKERS] Issue with circular references in VIEW
Gilles Darold writes:
> There is an issue with versions prior to 10 when dumping views with
> circular references. I know that these views are now exported as views
> in 10, but they are still exported as TABLE + RULE in prior versions.
> This leads to the following error when columns of sub-queries don't
> have the same alias names:

The core of this issue, I think, is that pg_get_viewdef() knows that it
should make what it prints have output column names that match the view,
whereas pg_get_ruledef() does not, even when it is printing an ON SELECT
rule. This is a little bit surprising --- you'd really expect those
functions to produce identical SELECT statements --- and I think it's
likely to break other tools even if pg_dump has managed to skirt the
issue. So I'm inclined to think in terms of fixing it at that level
rather than in pg_dump. It doesn't look like it would be hard to fix:
both functions ultimately call get_query_def(), it's just that one passes
down a tuple descriptor for the view while the other currently doesn't.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Increase Vacuum ring buffer.
On Thu, Jul 20, 2017 at 12:51 PM, Tom Lane wrote: > Robert Haas writes: > > I think that's a valid point. There are also other concerns here - > > e.g. whether instead of adopting the patch as proposed we ought to (a) > > use some smaller size, or (b) keep the size as-is but reduce the > > maximum fraction of shared_buffers that can be consumed, or (c) divide > > the ring buffer size through by autovacuum_max_workers. Personally, > > of those approaches, I favor (b). I think a 16MB ring buffer is > > probably just fine if you've got 8GB of shared_buffers but I'm > > skeptical about it when you've got 128MB of shared_buffers. > > WFM. I agree with *not* dividing the basic ring buffer size by > autovacuum_max_workers. If you have allocated more AV workers, I think > you expect AV to go faster, not for the workers to start fighting among > themselves. > But fighting among themselves is just what they do regarding the autovacuum_vacuum_cost_limit, so I don't see why it should be one way there but different here. The reason for setting autovacuum_max_workers to N is so that small tables aren't completely starved of vacuuming even if N-1 larger tables are already being vacuumed simultaneously. Now the small tables get vacuumed at speed 1/N, which kind of sucks, but that is the mechanism we currently have. Of course just because we are in a hole with vacuum_cost_limit doesn't mean we should dig ourselves deeper, but we are being inconsistent then. Cheers, Jeff
Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location
On Mon, Jul 24, 2017 at 2:02 PM, Peter Geoghegan wrote: > On Mon, Jul 24, 2017 at 9:51 AM, Claudio Freire > wrote: >> My point was that the TID doesn't have to point to an actual tuple. >> >> It's more of a keyspace thing, so it doesn't need to match real >> tuples, it can just divide the keyspace with an arbitrary cutoff, and >> should be cheaper to maintain without that requirement. > > I agree, but unless you're using normalized keys, I don't see > that you get much useful leeway from using fake or truncated TID > values. Presumably the comparison logic will be based on comparing an > ItemPointerData field, which is impractical to truncate. In most cases, the TID itself can be omitted when the keys already differ. When a split happens, a TID referring to a real tuple on a child page is added only if necessary. This gives a lot of leeway in choosing the cutoff point, since a cutoff point is only added when necessary. Vacuum *might* be able to redistribute tuples too, while holding locks on all 3 pages (the two children and the parent page), since it can move the TID to the midpoint between two actual child TIDs, mindlessly, without checking whether such a TID actually exists - it just needs to choose a TID cutoff point that will distribute the item pointers evenly. I haven't tried this, but it is theoretically possible. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
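A hypothetical sketch of what "mindlessly" choosing such a cutoff could look like (illustrative only, not from any patch; assumes the usual itemptr.h macros):

#include "postgres.h"
#include "storage/itemptr.h"

/*
 * Pick a synthetic separator TID halfway between two real child TIDs.
 * The result need not point at any live tuple; it only has to sort
 * between left and right.
 */
static void
choose_cutoff_tid(ItemPointer left, ItemPointer right, ItemPointer cutoff)
{
	uint64		l = ((uint64) ItemPointerGetBlockNumber(left) << 16) |
	ItemPointerGetOffsetNumber(left);
	uint64		r = ((uint64) ItemPointerGetBlockNumber(right) << 16) |
	ItemPointerGetOffsetNumber(right);
	uint64		mid = l + (r - l) / 2;

	ItemPointerSet(cutoff, (BlockNumber) (mid >> 16),
				   (OffsetNumber) (mid & 0xFFFF));
}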
Re: [HACKERS] Increase Vacuum ring buffer.
On 2017-07-24 19:11, Claudio Freire wrote: On Mon, Jul 24, 2017 at 6:37 AM, Sokolov Yura wrote: Good day, Claudio On 2017-07-22 00:27, Claudio Freire wrote: On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura wrote: My friend noticed that I didn't say why I bother with autovacuum. Our customers suffer from table bloat. I've made a synthetic bloat test and started experiments with modifying micro- and auto-vacuum. My first attempts were to update the FSM early (both in micro- and autovacuum) and to update it up to the root, not only the low level. This FSM thing is probably not a bad idea as well. We're forced to run regular manual vacuums because for some tables autovacuum never seems to be enough, no matter how it's configured, mostly because it gets canceled all the time. These are high-churn, huge tables, so vacuuming them takes hours or days, and there's always someone with a conflicting lock at some point that ends up canceling the autovacuum task. The above paragraph triggered me to go check, and it seems in those cases the FSM never gets vacuumed. That's probably not a good thing, but I don't see how to vacuum the FSM after a cancel. So vacuuming the FSM from time to time during long-running vacuums seems like a good idea at this point. The attached patch changes the FSM update: instead of updating only the lowest level, it propagates the space increase up to the root. It slows autovacuum a bit, which is why I didn't propose it together with the ring buffer increase. I was mostly thinking about something like the attached patch. Simple, unintrusive, and shouldn't cause any noticeable slowdown. Your change is small, clear, and immediately useful for huge tables under high update load (until "allowing vacuum to use more than 1GB memory" is merged). But it still delays updating the FSM until the whole first batch of dead tuples is cleared (i.e. all indexes scanned and all heap pages cleaned), and on such a huge table that will take hours. On the other hand, if "dead" tuples consume all usable item pointers (MaxHeapTuplesPerPage, about 290 on an 8k page), then space that actually exists on a page cannot be used until the "dead" tuples are converted into "unused" tuples. With my patch I've seen that updating the FSM before dead tuples are cleared helps a little: this change slows bloat somewhat, but bloat stops growing only after the final cleanup of dead tuples. -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
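For reference, the ~290 figure comes straight from the MaxHeapTuplesPerPage definition in access/htup_details.h; a sketch of the arithmetic, assuming 8kB pages and 8-byte alignment:

#include <stdio.h>

int main(void)
{
	int blcksz = 8192;
	int page_header = 24;		/* SizeOfPageHeaderData */
	int tuple_header = 24;		/* MAXALIGN(SizeofHeapTupleHeader) */
	int line_pointer = 4;		/* sizeof(ItemIdData) */

	/* prints 291: once that many line pointers are dead, no new tuple
	 * fits on the page no matter how much free space it reports */
	printf("%d\n", (blcksz - page_header) / (tuple_header + line_pointer));
	return 0;
}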
Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location
On Mon, Jul 24, 2017 at 9:51 AM, Claudio Freire wrote: > My point was that the TID doesn't have to point to an actual tuple. > > It's more of a keyspace thing, so it doesn't need to match real > tuples, it can just divide the keyspace with an arbitrary cutoff, and > should be cheaper to maintain without that requirement. I agree, but unless you're using normalized keys, I don't see that you get much useful leeway from using fake or truncated TID values. Presumably the comparison logic will be based on comparing an ItemPointerData field, which is impractical to truncate. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [WIP] [B-Tree] Keep indexes sorted by heap physical location
On Wed, Nov 23, 2016 at 12:27 AM, Peter Geoghegan wrote: > On Mon, Nov 21, 2016 at 5:15 PM, Claudio Freire > wrote: >>> There are a couple >>> of tricky issues with that that you'd have to look out for, like >>> making sure that the high key continues to hold a real TID, which at a >>> glance looks like something that just happens already. I'm not sure >>> that we preserve the item pointer TID in all cases, since the current >>> assumption is that a high key's TID is just filler -- maybe we can >>> lose that at some point. >> >> I thought long about that, and inner pages don't need a real TID in >> their keys, as those keys only specify key space cutoff points and not >> real TIDs, and high-key TIDs in leaf pages are always real. > > Not if there are duplicates in the inner pages. Then, you have to add > in the TID, and separate the key space, to guide insertions to the > right leaf page directly (particularly for non-HOT UPDATEs). That's > what I'm mostly interested in investigating, here. My point was that the TID doesn't have to point to an actual tuple. It's more of a keyspace thing, so it doesn't need to match real tuples, it can just divide the keyspace with an arbitrary cutoff, and should be cheaper to maintain without that requirement. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Jul 24, 2017 at 12:31 PM, Stephen Frost wrote: > > Those backup scripts might very well be, today, producing invalid > > backups though, which would be much less good. > > True. However, I suspect that depends on what procedure is actually > being followed. If *everyone* who is using this is getting corrupt > backups, then of course changing the behavior is a no-brainer. But if > *some* people are getting correct backups and *some* people are > getting incorrect backups, depending on procedure, then I think > changing it is unwise. We should optimize for the case of a user who > is currently doing something smart, not one who is doing something > dumb. I'm not sure that we can call "writing code that depends on what the docs say" dumb, unfortunately, and if even one person is getting an invalid backup while following what the docs say then that's a strong case to do *something*. Of course, we also don't want to break the scripts of people who are doing things correctly, but I'm trying to figure out exactly how this change would break such scripts. What the change would do is make the pg_stop_backup() caller block until the last WAL is archived, and perhaps that ends up taking hours, and then the connection is dropped for whatever reason and the backup fails; but what would it have been otherwise? It wouldn't have been valid anyway at that point, since it's not valid until the last WAL is actually archived. Perhaps eventually it would be archived and the caller was planning for that and everything is fine, but, well, that feels like an awful lot of wishful thinking. And that's assuming that the script author made sure to write additional code that didn't mark the backup as valid until the ending WAL segment from pg_stop_backup() was confirmed to have been archived. Last, but perhaps not least, this issue only goes back to when pg_start/stop_backup was allowed on replicas, which is 9.6, so the behavior has been out for less than a year, and any script that might break due to this would at least be relatively new code. > > I'd hate to have to do it, but we could technically add a GUC to address > > this in the back-branches, no? I'm not sure that's really worthwhile > > though. > > That would be mighty ugly. Oh, absolutely agreed. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
On Mon, Jul 24, 2017 at 12:31 PM, Stephen Frost wrote: > Those backup scripts might very well be, today, producing invalid > backups though, which would be much less good. True. However, I suspect that depends on what procedure is actually being followed. If *everyone* who is using this is getting corrupt backups, then of course changing the behavior is a no-brainer. But if *some* people are getting correct backups and *some* people are getting incorrect backups, depending on procedure, then I think changing it is unwise. We should optimize for the case of a user who is currently doing something smart, not one who is doing something dumb. > I'd hate to have to do it, but we could technically add a GUC to address > this in the back-branches, no? I'm not sure that's really worthwhile > though. That would be mighty ugly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Jul 24, 2017 at 11:40 AM, David Steele wrote: > > Before reviewing the patch, I would note that it looks like this issue was > > introduced in b6a323a8c before the 9.6 release. The documentation states > > that the pg_stop_backup() function will wait for all required WAL to be > > archived, which corresponds to the default in the new patch of > > waitforarchive = true. The commit notes that the documentation needs > > updating, but since that didn't happen I think we should consider this a bug > > in 9.6 and back patch. > > > > While this patch brings pg_stop_backup() in line with the documentation, it > > also introduces a behavioral change compared to 9.6. Currently, the default > > 9.6 behavior on a standby is to return immediately from pg_stop_backup(), > > but this patch will cause it to block by default instead. Since action on > > the master may be required to unblock the process, I see this as a pretty > > significant change. I still think we should fix and backpatch, but I wanted > > to point this out. > > Hmm. That seems to me like it could break backup scripts that are > currently working, which seems like a *very* unfriendly thing to do in > a minor release, especially since 9.6 has no option to override the > default behavior (nor can we add one, since it would require a change > to catalog contents). Those backup scripts might very well be, today, producing invalid backups though, which would be much less good. I'd hate to have to do it, but we could technically add a GUC to address this in the back-branches, no? I'm not sure that's really worthwhile though. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
David, * David Steele (da...@pgmasters.net) wrote: > On 7/23/17 11:48 PM, Masahiko Sawada wrote: > >On Sat, Jul 22, 2017 at 8:04 AM, Stephen Frost wrote: > >> > >>I started discussing this with David off-list and he'd like a chance to > >>review it in a bit more detail (he's just returning from being gone for > >>a few weeks). That review will be posted to this thread on Monday, and > >>I'll wait until then to move forward with the patch. > > Before reviewing the patch, I would note that it looks like this > issue was introduced in b6a323a8c before the 9.6 release. The > documentation states that the pg_stop_backup() function will wait > for all required WAL to be archived, which corresponds to the > default in the new patch of waitforarchive = true. The commit notes > that the documentation needs updating, but since that didn't happen > I think we should consider this a bug in 9.6 and back patch. I tend to agree with this. The documentation clearly says that pg_stop_backup() waits for all WAL to be archived, but, today, if it's run on a standby then it doesn't wait, which might lead to invalid backups if the backup software is depending on that. > While this patch brings pg_stop_backup() in line with the > documentation, it also introduces a behavioral change compared to > 9.6. Currently, the default 9.6 behavior on a standby is to return > immediately from pg_stop_backup(), but this patch will cause it to > block by default instead. Since action on the master may be > required to unblock the process, I see this as a pretty significant > change. I still think we should fix and backpatch, but I wanted to > point this out. This will need to be highlighted in the release notes for 9.6.4, assuming there is agreement to back-patch this, and we'll also need to update the 9.6 documentation to be clearer about what happens on a standby. > The patch looks sensible to me. A few comments: > > 1) I would change: > > "Check if the WAL segment needed for this backup have been > completed, in which case a forced segment switch may be needed on > the primary." > > To something like: > > "If the WAL segments required for this backup have not been archived > then it might be necessary to force a segment switch on the > primary." I'm a bit on the fence regarding the distinction drawn here for backup-from-standby in this errhint(). The issue is that the WAL for the backup hasn't been archived yet, and that may be because either archive_command is failing or no WAL is getting generated. Both of these apply to the primary and the standby. As such, I'm inclined to try to mention both in the original errhint() instead of making two different ones. I'm open to suggestions on this, of course, but my thinking would be more like: - Either archive_command is failing or not enough WAL has been generated to require a segment switch. Run pg_switch_wal() to request a WAL switch and monitor your logs to check that your archive_command is executing properly. - And then change pg_switch_wal()'s errhint for when it's run on a replica to be: HINT: WAL control functions cannot be executed during recovery; they should be executed on the primary instead. (or similar, again, open to suggestions here). > 2) At backup.sgml L1015 it says that pg_stop_backup() will > automatically switch the WAL segment. There should be a caveat here > for standby backups, like: > > When called on a master it terminates the backup mode and performs > an automatic switch to the next WAL segment.
The reason for the > switch is to arrange for the last WAL segment written during the > backup interval to be ready to archive. When called on a standby > the WAL segment switch must be performed manually on the master if > it does not happen due to normal write activity. s/master/primary/g Otherwise it looks alright. Might try to reword to use language similar to what we have today in 25.3.3.1. > 3) The fact that this fix requires "archive_mode = always" seems > like a pretty big caveat, though I don't have any ideas on how to > make it better without big code changes. Maybe it would help to > change: > > + the backup is taken on a standby, pg_stop_backup waits > + for WAL to be archived when archive_mode > > To: > > + the backup is taken on a standby, pg_stop_backup waits > + for WAL to be archived *only* when archive_mode I'm thinking of rewording that a bit to say "When archive_mode = always" instead, as I think that might be a little clearer. > Perhaps this should be noted in the pg_basebackup docs as well? That seems like it's probably a good idea too, as people might wonder why pg_basebackup hasn't finished yet if it's waiting for WAL to be archived. Thanks! Stephen signature.asc Description: Digital signature
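If it helps to see it spelled out, the combined hint suggested above would look something like this in do_pg_stop_backup() (a sketch of the proposed wording only, not committed code):

/* sketch: emitted while waiting for the final segment to be archived */
ereport(WARNING,
		(errmsg("still waiting for all required WAL segments to be archived"),
		 errhint("Either archive_command is failing or not enough WAL has been "
				 "generated to require a segment switch.  Run pg_switch_wal() to "
				 "request a WAL switch and monitor your logs to check that your "
				 "archive_command is executing properly.")));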
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
On Mon, Jul 24, 2017 at 11:40 AM, David Steele wrote: > Before reviewing the patch, I would note that it looks like this issue was > introduced in b6a323a8c before the 9.6 release. The documentation states > that the pg_stop_backup() function will wait for all required WAL to be > archived, which corresponds to the default in the new patch of > waitforarchive = true. The commit notes that the documentation needs > updating, but since that didn't happen I think we should consider this a bug > in 9.6 and back patch. > > While this patch brings pg_stop_backup() in line with the documentation, it > also introduces a behavioral change compared to 9.6. Currently, the default > 9.6 behavior on a standby is to return immediately from pg_stop_backup(), > but this patch will cause it to block by default instead. Since action on > the master may be required to unblock the process, I see this as a pretty > significant change. I still think we should fix and backpatch, but I wanted > to point this out. Hmm. That seems to me like it could break backup scripts that are currently working, which seems like a *very* unfriendly thing to do in a minor release, especially since 9.6 has no option to override the default behavior (nor can we add one, since it would require a change to catalog contents). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] segfault in HEAD when too many nested functions call
Hi, On 2017-07-21 20:17:54 -0400, Tom Lane wrote: > > I dislike having the miscadmin.h include in executor.h - but I don't > > quite see a better alternative. > > I seriously, seriously, seriously dislike that. You practically might as > well put miscadmin.h into postgres.h. Instead, what do you think of > requiring the individual ExecProcNode functions to perform > CHECK_FOR_INTERRUPTS? Since they're already responsible for doing that > if they have any long-running internal loops, this doesn't seem like a > modularity violation. It is a risk for bugs-of-omission, sure, but so > are a lot of other things that the per-node code has to do. That'd work. Another alternative would be to move the inline definition of ExecProcNode() (and probably a bunch of other related functions) into a more internals-oriented header. It seems likely that we're going to add more inline functions to the executor, and that'd reduce the coupling of external and internal users a bit. > * I think the comments need more work. Am willing to make a pass over > that if you want. That'd be good, but let's wait till we have something more final. > * In most places, if there's an immediate return-if-trivial-case test, > we check stack depth only after that. There's no point in checking > and then returning; either you already crashed, or you're at peak > stack so far as this code path is concerned. I went back and forth a bit on that one. The calling code might call other functions that go deeper on the stack, which won't have the checks. I'm fine with moving it; just wanted to explain why I got there. > * Can we redefine the ExecCustomScan function pointer as type > ExecProcNodeCB, eliminating the unsightly cast in nodeCustom.c? That'd change an "extension API", which is why I skipped it at this point of the release cycle. It's not like we didn't have this type of cast all over before. OK with changing it, but that's where I came down. > * The various callers of ExecScan() are pretty inconsistently coded. > I don't care that much whether they use castNode() or just forcibly > cast to ScanState*, but let's make them all look the same. I tried to change the minimum; perfectly fine to move to castNode() in a wholesale manner. Btw, I really want to get rid of ExecScan(), at least as an external function. It does a lot of unnecessary things in a lot of cases and makes branch prediction a lot worse. Not v10 stuff, though. > * I believe the usual term for what these function pointers are is > "methods", not "callbacks". Certainly we'd call them that if we > were working in C++. K. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
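To illustrate the alternative Tom suggests, each node's entry point would take on the interrupt check itself, along these lines (a sketch modeled on nodeSeqscan.c, so executor.h never needs miscadmin.h):

/* each ExecProcNode method does its own interrupt check */
static TupleTableSlot *
ExecSeqScan(PlanState *pstate)
{
	SeqScanState *node = castNode(SeqScanState, pstate);

	CHECK_FOR_INTERRUPTS();

	return ExecScan(&node->ss,
					(ExecScanAccessMtd) SeqNext,
					(ExecScanRecheckMtd) SeqRecheck);
}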
Re: [HACKERS] Increase Vacuum ring buffer.
On Mon, Jul 24, 2017 at 6:37 AM, Sokolov Yura wrote: > Good day, Claudio > > > On 2017-07-22 00:27, Claudio Freire wrote: >> >> On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura >> wrote: >>> >>> >>> My friend noticed that I didn't say why I bother with autovacuum. >>> Our customers suffer from table bloat. I've made a synthetic >>> bloat test and started experiments with modifying micro- and >>> auto-vacuum. My first attempts were to update the FSM early (both in >>> micro- and autovacuum) and to update it up to the root, not only the low level. >> >> This FSM thing is probably not a bad idea as well. >> >> We're forced to run regular manual vacuums because for some tables >> autovacuum never seems to be enough, no matter how it's configured, >> mostly because it gets canceled all the time. These are high-churn, >> huge tables, so vacuuming them takes hours or days, and there's always >> someone with a conflicting lock at some point that ends up canceling >> the autovacuum task. >> >> The above paragraph triggered me to go check, and it seems in those >> cases the FSM never gets vacuumed. That's probably not a good thing, >> but I don't see how to vacuum the FSM after a cancel. So vacuuming the >> FSM from time to time during long-running vacuums seems like a good >> idea at this point. > > The attached patch changes the FSM update: instead of updating only the lowest > level, it propagates the space increase up to the root. > > It slows autovacuum a bit, which is why I didn't propose it together with > the ring buffer increase. I was mostly thinking about something like the attached patch. Simple, unintrusive, and shouldn't cause any noticeable slowdown. From 5da264507058175e614f6ce7c77d2bd0491b1416 Mon Sep 17 00:00:00 2001 From: Claudio Freire Date: Mon, 24 Jul 2017 13:09:10 -0300 Subject: [PATCH] Vacuum FSM after each index pass This prevents concurrent writes from accumulating bloat due to recently freed space not yet being visible in the FSM. When vacuum runs for hours or days, this can make a huge difference. --- src/backend/commands/vacuumlazy.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index fabb2f8d52..4d8d90e833 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -735,6 +735,9 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats, /* Remove tuples from heap */ lazy_vacuum_heap(onerel, vacrelstats); + /* Vacuum the Free Space Map */ + FreeSpaceMapVacuum(onerel); + /* * Forget the now-vacuumed tuples, and press on, but be careful * not to reset latestRemovedXid since we want that value to be -- 2.12.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Stephen Frost writes: > * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: >> On 3/6/17 03:33, Michael Banck wrote: >>> Would this be a candidate for backpatching, or is the behaviour change >>> in pg_dump trumping the issues it solves? >> Unless someone literally has a materialized view on pg_policy, it >> wouldn't make a difference, so I'm not very keen on bothering to >> backpatch this. > Agreed. So actually, the problem with Jim's patch is that it doesn't fix the problem. pg_dump's attempts to REFRESH matviews will still fail in common cases, because they still come out before GRANTs, because pg_dump treats ACLs as a completely independent thing to be done last. This was noted as far back as 2015 (in a thread previously linked from this thread), and it's also the cause of Jordan Gigov's current complaint at https://www.postgresql.org/message-id/CA%2BnBocAmQ%2BOPNSKUzaaLa-6eGYVw5KqexWJaRoGvrvLyDir9gg%40mail.gmail.com Digging around in the archives, I find that Kevin had already proposed a fix in https://www.postgresql.org/message-id/flat/20160202161407.2778.24659%40wrigleys.postgresql.org which I didn't particularly care for, and apparently nobody else did either. But we really oughta do *something*. The main problem with Kevin's fix, after thinking about it more, is that it shoves matview refresh commands into the same final processing phase where ACLs are done, which means that in a parallel restore they will not be done in parallel. That seems like a pretty serious objection, although maybe not so serious that we'd be willing to accept a major rewrite in the back branches to avoid it. I'm wondering at this point about having restore create a fake DO_ACLS object (fake in the sense that it isn't in the dump file) that would participate normally in the dependency sort, and which we'd give a priority before matview refreshes but after everything else. "Restore" of that object would perform the same operation we do now of running through the whole TOC and emitting grants/revokes. So it couldn't be parallelized in itself (at least not without an additional batch of work) but it could be treated as an indivisible parallelized task, and then the matview refreshes could be parallelizable tasks after that. There's also Peter's proposal of splitting up GRANTs from REVOKEs and putting only the latter at the end. I'm not quite convinced that that's a good idea but it certainly merits consideration. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
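A tiny sketch of the ordering idea (hypothetical names; the real sort priorities live in pg_dump_sort.c):

/*
 * Hypothetical sketch: a synthetic DO_ACLS entry participates in the
 * dependency sort with a priority after everything else but before
 * matview refreshes, so REFRESH MATERIALIZED VIEW runs with grants
 * already in place and the refreshes stay parallelizable.
 */
enum restore_priority
{
	PRIO_EVERYTHING_ELSE = 1,	/* schemas, tables, data, indexes, ... */
	PRIO_ACL_PASS,				/* fake DO_ACLS object: emit all GRANT/REVOKE */
	PRIO_REFRESH_MATVIEW		/* matview refreshes sort last */
};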
Re: [HACKERS] why not parallel seq scan for slow functions
On Sat, Jul 22, 2017 at 8:53 PM, Amit Kapila wrote: > On Thu, Jul 13, 2017 at 7:38 AM, Amit Kapila > wrote: > > On Wed, Jul 12, 2017 at 11:20 PM, Jeff Janes > wrote: > >> > >> > >> > >> Setting parallel_workers to 8 changes the threshold for a parallel plan to even > >> be considered from parallel_tuple_cost <= 0.0049 to <= 0.0076. So it is > >> going in the correct direction, but not by enough to matter. > >> > > > > You might want to play with cpu_tuple_cost and/or seq_page_cost. > > > > I don't know whether the patch will completely solve your problem, but > this seems to be the right thing to do. Do you think we should stick > this in the next CF? > It doesn't solve the problem for me, but I agree it is an improvement we should commit. Cheers, Jeff
Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server
On 7/23/17 11:48 PM, Masahiko Sawada wrote: On Sat, Jul 22, 2017 at 8:04 AM, Stephen Frost wrote: I started discussing this with David off-list and he'd like a chance to review it in a bit more detail (he's just returning from being gone for a few weeks). That review will be posted to this thread on Monday, and I'll wait until then to move forward with the patch. Before reviewing the patch, I would note that it looks like this issue was introduced in b6a323a8c before the 9.6 release. The documentation states that the pg_stop_backup() function will wait for all required WAL to be archived, which corresponds to the default in the new patch of waitforarchive = true. The commit notes that the documentation needs updating, but since that didn't happen I think we should consider this a bug in 9.6 and back-patch. While this patch brings pg_stop_backup() in line with the documentation, it also introduces a behavioral change compared to 9.6. Currently, the default 9.6 behavior on a standby is to return immediately from pg_stop_backup(), but this patch will cause it to block by default instead. Since action on the master may be required to unblock the process, I see this as a pretty significant change. I still think we should fix and back-patch, but I wanted to point this out. The patch looks sensible to me. A few comments: 1) I would change: "Check if the WAL segment needed for this backup have been completed, in which case a forced segment switch may be needed on the primary." To something like: "If the WAL segments required for this backup have not been archived then it might be necessary to force a segment switch on the primary." 2) At backup.sgml L1015 it says that pg_stop_backup() will automatically switch the WAL segment. There should be a caveat here for standby backups, like: When called on a master it terminates the backup mode and performs an automatic switch to the next WAL segment. The reason for the switch is to arrange for the last WAL segment written during the backup interval to be ready to archive. When called on a standby the WAL segment switch must be performed manually on the master if it does not happen due to normal write activity. 3) The fact that this fix requires "archive_mode = always" seems like a pretty big caveat, though I don't have any ideas on how to make it better without big code changes. Maybe it would help to change: + the backup is taken on a standby, pg_stop_backup waits + for WAL to be archived when archive_mode To: + the backup is taken on a standby, pg_stop_backup waits + for WAL to be archived *only* when archive_mode Perhaps this should be noted in the pg_basebackup docs as well? Regards, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Syncing sql extension versions with shared library versions
On Sat, Jul 22, 2017 at 10:50 AM, Robert Haas wrote: > On Fri, Jul 21, 2017 at 4:17 PM, Mat Arye wrote: > > (I > > want to avoid having to keep backwards-compatibility for all functions in > > future shared-libraries). > > Are you sure that's a good idea? No :). But a lot of (most of) our code is not user-called functions (planner/other hooks, etc.). It seems dangerous to update that code in the .so and have it possibly affect customers that are using old versions of the extension. While it's possible to do that kind of _v1 suffix code for planner functions as well, this seems like a nightmare in terms of code maintenance (we already have 1000s of lines of C code). I think a dynamic loader might be more work upfront but have major payoffs for speed of development in the long term for us. It may also have advantages in terms of update safety. It's also worth noting that our C code has some SPI upcalls, so keeping the SQL and C code in sync is even more of an issue for us (if we can't make the dynamic/stub loader approach work, this might be an anti-pattern and we may have to review doing upcalls at all). > It seems like swimming upstream > against the design. I mean, instead of creating a dispatcher library > that loads either v1 or v2, maybe you could just put it all in one > library, add a "v1" or "v2" suffix to the actual function names where > appropriate, and then set up the SQL definitions to call the correct > one. I mean, it's the same thing, but with less chance of the dynamic > loader ruining your day. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
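For what it's worth, a minimal sketch of the suffixing approach Robert describes (hypothetical function names), keeping both behaviors alive in one .so so each SQL-level extension version binds to the symbol it expects:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* behavior frozen for extension version 1.x */
PG_FUNCTION_INFO_V1(my_func_v1);
Datum
my_func_v1(PG_FUNCTION_ARGS)
{
	PG_RETURN_INT32(PG_GETARG_INT32(0));
}

/* new behavior for 2.x; the 2.0 upgrade script repoints the SQL
 * function at this symbol via AS 'MODULE_PATHNAME', 'my_func_v2' */
PG_FUNCTION_INFO_V1(my_func_v2);
Datum
my_func_v2(PG_FUNCTION_ARGS)
{
	PG_RETURN_INT32(PG_GETARG_INT32(0) + 1);
}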
[HACKERS] Issue with circular references in VIEW
Hi, There is an issue with versions prior to 10 when dumping views with circular references. I know that these views are now exported as views in 10, but they are still exported as TABLE + RULE in prior versions. This leads to the following error when the columns of sub-queries don't have the same alias names: ERROR: SELECT rule's target entry 1 has different column name from column "col_a" DETAIL: SELECT target entry is named "other_name1". Here are the steps to reproduce: CREATE TABLE t1 (f1 INT PRIMARY KEY, f2 text); CREATE VIEW v_t1 (col_a, col_b) AS WITH win_query AS ( SELECT 1::INTEGER AS col1, 'b'::text AS col2 ) SELECT imp.col1 AS other_name1, imp.col2 AS other_name2 FROM win_query imp UNION SELECT 2::INTEGER AS col1, 'z'::text AS col2 UNION SELECT * FROM t1 GROUP BY f1 ; This is translated into the following code by pg_dump with PostgreSQL 9.x: CREATE TABLE t1 ( f1 integer NOT NULL, f2 text ); CREATE TABLE v_t1 ( col_a integer, col_b text ); COPY t1 (f1, f2) FROM stdin; \. CREATE RULE "_RETURN" AS ON SELECT TO v_t1 DO INSTEAD WITH win_query AS ( SELECT 1 AS col1, 'b'::text AS col2 ) SELECT imp.col1 AS other_name1, imp.col2 AS other_name2 FROM win_query imp UNION SELECT 2 AS col1, 'z'::text AS col2 UNION SELECT t1.f1, t1.f2 FROM t1 GROUP BY t1.f1; and this dump can't be restored because of the error reported above. It is clear that the user is responsible for using the wrong aliases, but this doesn't generate an error at creation time, and looking at the view through a call to pg_get_viewdef(), the aliases are correctly rewritten: test_view=# \d+ v_t1 View "public.v_t1" Column | Type | Modifiers | Storage | Description --------+---------+-----------+----------+------------- col_a | integer | | plain | col_b | text | | extended | View definition: WITH win_query AS ( SELECT 1 AS col1, 'b'::text AS col2 ) SELECT imp.col1 AS col_a, imp.col2 AS col_b FROM win_query imp UNION SELECT 2 AS col_a, 'z'::text AS col_b UNION SELECT t1.f1 AS col_a, t1.f2 AS col_b FROM t1 GROUP BY t1.f1; The rule code retrieved using pg_get_ruledef(), however, still uses the original, incorrect column aliases: CREATE RULE "_RETURN" AS ON SELECT TO v_t1 DO INSTEAD WITH win_query AS ( SELECT 1 AS col1, 'b'::text AS col2 ) SELECT imp.col1 AS other_name1, imp.col2 AS other_name2 FROM win_query imp UNION SELECT 2 AS col1, 'z'::text AS col2 UNION SELECT t1.f1, t1.f2 FROM t1 GROUP BY t1.f1; PostgreSQL 10 now uses views rather than TABLE + RULE, so the call to pg_get_viewdef() fixes this issue by itself. My question is: will this method of exporting views be back-ported to prior versions, or should we fix it another way? In the latter case, could using pg_get_viewdef() to reconstruct the _RETURN rule be a simple fix? For example: 'CREATE RULE "_RETURN" AS ON SELECT TO v_t1 DO INSTEAD %s;', pg_get_viewdef(...) Of course, manually rewriting and replacing the view fixes the issue, but generating a dump that can't easily be restored can confuse users. -- Gilles Darold Consultant PostgreSQL http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cache lookup failed error for partition key with custom opclass
Rushabh Lathia writes: > PFA a patch that adds an elog() with the same error message as all the other > places. Some looking around says that this *isn't* the only place that just blithely assumes that it will find an opfamily entry. But I agree that not checking for that isn't up to project standards. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] cache lookup failed error for partition key with custom opclass
Hi, Consider the following test: CREATE OR REPLACE FUNCTION dummy_binaryint4(a int4, b int4) RETURNS int4 AS $$ BEGIN RETURN a; END; $$ LANGUAGE 'plpgsql' IMMUTABLE; CREATE OPERATOR CLASS custom_opclass2 FOR TYPE int2 USING BTREE AS OPERATOR 1 = , FUNCTION 1 dummy_binaryint4(int4, int4); t=# CREATE TABLE list_tab(a int2, b int) PARTITION BY LIST (a custom_opclass2); *ERROR: cache lookup failed for function 0* In the above test, the operator class is created for type int2, but the support function passed in is of type int4. During CREATE TABLE ... PARTITION BY, ComputePartitionAttrs() is able to resolve the opclass for the partition key (the partition key type is int2), but when looking up the support function for int2 it is unable to find a proper function and ends up with the cache lookup failed error. The error comes from RelationBuildPartitionKey(). I think overall this is expected, but the error could still be better - like all the other places where get_opfamily_proc() is unable to find a valid function OID. PFA a patch that adds an elog() with the same error message as all the other places. Thanks, Rushabh Lathia www.EnterpriseDB.com diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 43238dd..6bd93b0 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -945,6 +945,10 @@ RelationBuildPartitionKey(Relation relation) opclassform->opcintype, opclassform->opcintype, BTORDER_PROC); + if (!OidIsValid(funcid)) /* should not happen */ + elog(ERROR, "missing support function %d(%u,%u) in opfamily %u", + BTORDER_PROC, opclassform->opcintype, opclassform->opcintype, + opclassform->opcfamily); fmgr_info(funcid, &key->partsupfunc[i]); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oddity in error handling of constraint violation in ExecConstraints for partitioned tables
On 2017/07/24 17:30, Etsuro Fujita wrote: > On 2017/07/24 16:16, Amit Khandekar wrote: >> On 24 July 2017 at 12:11, Amit Langote >> wrote: >>> Attached is the updated version of your patch. > > Good catch, Amit K. and Amit L.! > >> Now that this is done, any particular reason it is not done in >> ExecPartitionCheck() ? I see that there is a do_convert_tuple() in >> that function, again without changing the slot descriptor. > > Yeah, I think we would need that in ExecPartitionCheck() as well. Yes, we need that there too. Done that in the attached v3 (including the test where ExecPartitionCheck() would have crashed without the patch). Thanks, Amit diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index b22de78516..78cbcd1a32 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -1879,6 +1879,7 @@ ExecPartitionCheck(ResultRelInfo *resultRelInfo, TupleTableSlot *slot, if (map != NULL) { tuple = do_convert_tuple(tuple, map); + ExecSetSlotDescriptor(slot, tupdesc); ExecStoreTuple(tuple, slot, InvalidBuffer, false); } } @@ -1956,6 +1957,7 @@ ExecConstraints(ResultRelInfo *resultRelInfo, if (map != NULL) { tuple = do_convert_tuple(tuple, map); + ExecSetSlotDescriptor(slot, tupdesc); ExecStoreTuple(tuple, slot, InvalidBuffer, false); } } @@ -2003,6 +2005,7 @@ ExecConstraints(ResultRelInfo *resultRelInfo, if (map != NULL) { tuple = do_convert_tuple(tuple, map); + ExecSetSlotDescriptor(slot, tupdesc); ExecStoreTuple(tuple, slot, InvalidBuffer, false); } } @@ -2112,6 +2115,7 @@ ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo, if (map != NULL) { tuple = do_convert_tuple(tuple, map); + ExecSetSlotDescriptor(slot, tupdesc); ExecStoreTuple(tuple, slot, InvalidBuffer, false); } } diff --git a/src/test/regress/expected/insert.out b/src/test/regress/expected/insert.out index c608ce377f..0dcc86fef4 100644 --- a/src/test/regress/expected/insert.out +++ b/src/test/regress/expected/insert.out @@ -410,6 +410,21 @@ with ins (a, b, c) as mlparted4 | 1 | 30 | 39 (5 rows) +alter table mlparted add c text; +create table mlparted5 (c text, a int not null, b int not null) partition by list (c); +create table mlparted5a (a int not null, c text, b int not null); +alter table mlparted5 attach partition mlparted5a for values in ('a'); +alter table mlparted attach partition mlparted5 for values from (1, 40) to (1, 50); +alter table mlparted add constraint check_b check (a = 1 and b < 45); +insert into mlparted values (1, 45, 'a'); +ERROR: new row for relation "mlparted5a" violates check constraint "check_b" +DETAIL: Failing row contains (1, 45, a). +create function mlparted5abrtrig_func() returns trigger as $$ begin new.c = 'b'; return new; end; $$ language plpgsql; +create trigger mlparted5abrtrig before insert on mlparted5a for each row execute procedure mlparted5abrtrig_func(); +insert into mlparted5 (a, b, c) values (1, 40, 'a'); +ERROR: new row for relation "mlparted5a" violates partition constraint +DETAIL: Failing row contains (b, 1, 40). 
+drop table mlparted5; -- check that message shown after failure to find a partition shows the -- appropriate key description (or none) in various situations create table key_desc (a int, b int) partition by list ((a+0)); diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out index caca81a70b..eab5c0334c 100644 --- a/src/test/regress/expected/updatable_views.out +++ b/src/test/regress/expected/updatable_views.out @@ -2368,8 +2368,8 @@ DETAIL: Failing row contains (-1, invalid). DROP VIEW v1; DROP TABLE t1; -- check that an auto-updatable view on a partitioned table works correctly -create table pt (a int, b int) partition by range (a, b); -create table pt1 (b int not null, a int not null) partition by range (b); +create table pt (a int, b int, v varchar) partition by range (a, b); +create
Re: [HACKERS] Increase Vacuum ring buffer.
On 2017-07-21 20:41, Sokolov Yura wrote: On 2017-07-21 19:32, Robert Haas wrote: On Fri, Jul 21, 2017 at 4:19 AM, Sokolov Yura wrote: Probably with an increased ring buffer there is no need to raise vacuum_cost_limit. Will you admit it? No, I definitely won't admit that. With default settings autovacuum won't write more than ~2.3MB/s if I remember the math correctly, so if you've got a 1TB table you're probably going to need a bigger value. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company I've seen the autovacuum process spending >50% of its time in fsync (with the current ring buffer, though I used autovacuum_cost_delay=2ms). An fsync can last up to a second on an HDD if there is concurrent IO. Even on an SSD, fsync can be really noticeable. But I agree that for a 1TB table autovacuum_cost_limit should still be increased, even with a larger ring buffer. My friend noticed that I didn't say why I bother with autovacuum. Our customers suffer from table bloat. I've made a synthetic bloat test and started experiments with modifying micro- and auto-vacuum. My first attempts were to update the FSM early (both in micro- and autovacuum) and to update it up to the root, not only the low level. Then I looked at an strace of the autovacuum process and noticed a storm of fsync calls. I caught backtraces with gdb breaking on fsync, and found that evicting dirty pages from the small ring buffer is the reason. After some experiments combining my "early FSM update" with the ring buffer size, I understood that increasing the ring buffer gives most of the benefit: autovacuum runs faster, and bloat is greatly reduced. In an extreme case, a 400MB table bloats to 17GB on master, but only to 5GB with the faster autovacuum. I used custom scripts, which is why my statistics are not complete. Still, I didn't find any performance reduction. In fact, it looks like the tests with the "larger autovacuum ring" did more queries per hour than the tests against master. I will run pgbench over the weekend, so latencies and percentiles will be collected. With regards, -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company The default pgbench script wasn't able to trigger autovacuum of the pgbench_accounts table in 8 hours (scale 400, 40 clients, 900tps average), so the weekend testing was not useful. I will re-run with a custom script over the next day or two. -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
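For context, the knob under discussion is the BAS_VACUUM ring in freelist.c's GetAccessStrategy(). As a sketch (illustrative only; 256kB is the current ring size, 16MB the proposed one):

#include "postgres.h"			/* BLCKSZ */

/* illustrative: vacuum's ring buffer, in buffers, before and after */
static int
vacuum_ring_size_in_buffers(bool proposed)
{
	int			ring_bytes = proposed ? 16 * 1024 * 1024 : 256 * 1024;

	return ring_bytes / BLCKSZ; /* 32 now vs 2048 proposed, at BLCKSZ = 8192 */
}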
Re: [HACKERS] Increase Vacuum ring buffer.
Good day, Claudio On 2017-07-22 00:27, Claudio Freire wrote: On Fri, Jul 21, 2017 at 2:41 PM, Sokolov Yura wrote: My friend noticed that I didn't say why I bother with autovacuum. Our customers suffer from table bloat. I've made a synthetic bloat test and started experiments with modifying micro- and auto-vacuum. My first attempts were to update the FSM early (both in micro- and autovacuum) and to update it up to the root, not only the low level. This FSM thing is probably not a bad idea as well. We're forced to run regular manual vacuums because for some tables autovacuum never seems to be enough, no matter how it's configured, mostly because it gets canceled all the time. These are high-churn, huge tables, so vacuuming them takes hours or days, and there's always someone with a conflicting lock at some point that ends up canceling the autovacuum task. The above paragraph triggered me to go check, and it seems in those cases the FSM never gets vacuumed. That's probably not a good thing, but I don't see how to vacuum the FSM after a cancel. So vacuuming the FSM from time to time during long-running vacuums seems like a good idea at this point. The attached patch changes the FSM update: instead of updating only the lowest level, it propagates the space increase up to the root. It slows autovacuum a bit, which is why I didn't propose it together with the ring buffer increase. -- Sokolov Yura aka funny_falcon Postgres Professional: https://postgrespro.ru The Russian Postgres Company From 60f76fc83ee8752362e037c1e19ed089d861e026 Mon Sep 17 00:00:00 2001 From: Sokolov Yura Date: Mon, 3 Jul 2017 15:14:07 +0300 Subject: [PATCH] fsm&vacuum: propagate free space increases to upper levels Every RecordPageWithFreeSpace now updates upper levels if the amount of free space increased. Also, do FreeSpaceMapVacuum after scanning the heap and before vacuuming indexes. --- src/backend/commands/vacuumlazy.c | 16 +- src/backend/storage/freespace/freespace.c | 49 ++- src/backend/storage/freespace/fsmpage.c | 4 ++- src/include/storage/fsm_internals.h | 2 +- 4 files changed, 53 insertions(+), 18 deletions(-) diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index fc9c4f0fb1..a7fff0c5ae 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -595,7 +595,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats, maxoff; bool tupgone, hastup; - int prev_dead_count; int nfrozen; Size freespace; bool all_visible_according_to_vm = false; @@ -925,7 +924,6 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats, has_dead_tuples = false; nfrozen = 0; hastup = false; - prev_dead_count = vacrelstats->num_dead_tuples; maxoff = PageGetMaxOffsetNumber(page); /* @@ -1245,16 +1243,16 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats, vacrelstats->nonempty_pages = blkno + 1; /* - * If we remembered any tuples for deletion, then the page will be - * visited again by lazy_vacuum_heap, which will compute and record - * its post-compaction free space. If not, then we're done with this - * page, so remember its free space as-is. (This path will always be - * taken if there are no indexes.) + * heap_page_prune could free a bit of space. Let's record it + * immediately even though it will be recorded again in lazy_vacuum_heap + * after more compaction.
*/ - if (vacrelstats->num_dead_tuples == prev_dead_count) - RecordPageWithFreeSpace(onerel, blkno, freespace); + RecordPageWithFreeSpace(onerel, blkno, freespace); } + /* fix up all tiny bits of freed space before vacuuming indices */ + FreeSpaceMapVacuum(onerel); + /* report that everything is scanned and vacuumed */ pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno); diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c index 4648473523..ca0c356f28 100644 --- a/src/backend/storage/freespace/freespace.c +++ b/src/backend/storage/freespace/freespace.c @@ -107,6 +107,8 @@ static Size fsm_space_cat_to_avail(uint8 cat); /* workhorse functions for various operations */ static int fsm_set_and_search(Relation rel, FSMAddress addr, uint16 slot, uint8 newValue, uint8 minValue); +static void fsm_set_recursive(Relation rel, FSMAddress addr, uint16 slot, + uint8 new_cat, bool only_increase); static BlockNumber fsm_search(Relation rel, uint8 min_cat); static uint8 fsm_vacuum_page(Relation rel, FSMAddress addr, bool *eof); static BlockNumber fsm_get_lastblckno(Relation rel, FSMAddress addr); @@ -173,9 +175,8 @@ RecordAndGetPageWithFreeSpace(Relation rel, BlockNumber oldPage, /* * RecordPageWithFreeSpace - update info about a page. * - * Note that if the new spaceAvail value is higher than the old value stored - * in the FSM, the space might not become visible to searchers until the next - * FreeSpaceMa
Re: [HACKERS] Oddity in error handling of constraint violation in ExecConstraints for partitioned tables
On 2017/07/24 16:16, Amit Khandekar wrote: On 24 July 2017 at 12:11, Amit Langote wrote: Attached is the updated version of your patch. Good catch, Amit K. and Amit L.! Now that this is done, any particular reason it is not done in ExecPartitionCheck() ? I see that there is a do_convert_tuple() in that function, again without changing the slot descriptor. Yeah, I think we would need that in ExecPartitionCheck() as well. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oddity in error handling of constraint violation in ExecConstraints for partitioned tables
On 24 July 2017 at 12:11, Amit Langote wrote: > Hi Amit, > > On 2017/07/24 14:09, Amit Khandekar wrote: On 2017/07/10 14:15, Etsuro Fujita wrote: Another thing I noticed is the error handling in ExecWithCheckOptions; it doesn't take any care for partitioned tables, so the column description in the error message for WCO_VIEW_CHECK is built using the partition table's rowtype, not the root table's. But I think it'd be better to build that using the root table's rowtype, like ExecConstraints, because that would make the column description easier to understand since the parent view(s) (from which the WithCheckOptions evaluated there are created) would have been defined on the root table. This seems independent from the above issue, so I created a separate patch, which I'm attaching. What do you think about that? >> >> + if (map != NULL) >> + { >> + tuple = do_convert_tuple(tuple, map); >> + ExecStoreTuple(tuple, slot, InvalidBuffer, false); >> + } >> >> Above, the tuple descriptor also needs to be set, since the parent and >> child partitions can have different column ordering. >> >> Due to this, the following testcase crashes: >> >> CREATE TABLE range_parted (a text,b int, c int) partition by range (b); >> CREATE VIEW upview AS SELECT * FROM range_parted WHERE (select c > >> 120) WITH CHECK OPTION; >> create table part_a_1_a_10(b int, c int, a text); >> alter table range_parted attach partition part_a_1_a_10 for values >> from (1) to (10); >> insert into upview values ('a', 2, 100); > > Good catch. Thanks for creating the patch. > > There are other places with similar code viz. those in ExecConstraints() > that would need fixing. Ah, OK. I should have noticed that. Thanks. > Attached is the updated version of your patch. Now that this is done, is there any particular reason it is not done in ExecPartitionCheck()? I see that there is a do_convert_tuple() in that function, again without changing the slot descriptor. -- Thanks, -Amit Khandekar EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] segfault in HEAD when too many nested functions call
On July 24, 2017 7:10:19 AM GMT+01:00, Noah Misch wrote: >On Tue, Jul 18, 2017 at 01:04:10PM -0700, Andres Freund wrote: >> Ok, I'll flesh out the patch till Thursday. But I do think we're >going >> to have to do something about the back branches, too. > >This PostgreSQL 10 open item is past due for your status update. >Kindly send >a status update within 24 hours, and include a date for your subsequent >status >update. Refer to the policy on open item ownership: >https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com I sent out a more fleshed-out patch last week, which Tom reviewed. Planning to update it to address that review today or tomorrow. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers