Re: [HACKERS] forcing a rebuild of the visibility map
On Thu, Jun 16, 2016 at 12:06 PM, Robert Haas wrote: > [ Changing subject line in the hopes of keeping separate issues in > separate threads. ] > > On Mon, Jun 6, 2016 at 11:00 AM, Tom Lane wrote: >> Robert Haas writes: >>> I'm intuitively sympathetic to the idea that we should have an option >>> for this, but I can't figure out in what case we'd actually tell >>> anyone to use it. It would be useful for the kinds of bugs listed >>> above to have VACUUM (rebuild_vm) to blow away the VM fork and rebuild >>> it, but that's different semantics than what we proposed for VACUUM >>> (even_frozen_pages). And I'd be sort of inclined to handle that case >>> by providing some other way to remove VM forks (like a new function in >>> the pg_visibilitymap contrib module, maybe?) and then just tell people >>> to run regular VACUUM afterwards, rather than putting the actual VM >>> fork removal into VACUUM. >> >> There's a lot to be said for that approach. If we do it, I'd be a bit >> inclined to offer an option to blow away the FSM as well. > > After having been scared out of my mind by the discovery of > longstanding breakage in heap_update[1], it occurred to me that this > is an excellent example of a case in which the option for which Andres > was agitating - specifically forcing VACUUM to visit every single page > in the heap - would be useful. Even if there's functionality > available to blow away the visibility map forks, you might not want to > just go remove them all and re-vacuum everything, because while you > were rebuilding the VM forks, index-only scans would stop being > index-only, and that might cause an operational problem. Being able > to simply force every page to be visited is better. Patch for that > attached. I decided to call the option disable_page_skipping rather > than even_frozen_pages because it should also force visiting pages > that are all-visible but not all-frozen. 
(I was sorely tempted to go > with the competing proposal of calling it VACUUM (THEWHOLEDAMNTHING) > but I refrained.) > > However, I also think that the approach described above - providing a > way to excise VM forks specifically - is useful. Patch for that > attached, too. It turns out that can't be done without either adding > a new WAL record type or extending an existing one; I chose to add a > "flags" argument to XLOG_SMGR_TRUNCATE, specifying the forks to be > truncated. Since this will require bumping the XLOG version, if we're > going to do it, and I think we should, it would be good to try to get > it done before beta2 rather than closer to release. However, I don't > want to commit this without some review, so please review this. > Actually, please review both patches.[2] The same core support could > be used to add an option to truncate the FSM, but I didn't write a > patch for that because I'm incredibly { busy | lazy }. > The option name DISABLE_PAGE_SKIPPING sounds good to me. I'm still working on this, but here are some review comments. --- +CREATE FUNCTION pg_truncate_visibility_map(regclass) +RETURNS void +AS 'MODULE_PATHNAME', 'pg_truncate_visibility_map' +LANGUAGE C STRICT +PARALLEL UNSAFE; -- let's not make this any more dangerous + "REVOKE ALL ON FUNCTION pg_truncate_visibility_map(regclass) FROM PUBLIC;" is missing in pg_visibility/pg_visibility--1.0--1.1.sql. --- I think that VACUUM with the VERBOSE option could show how many frozen pages were skipped, as autovacuum does. Thoughts? Patch attached. 
Regards, -- Masahiko Sawada diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index 7a67fa5..ddbb835 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -1329,6 +1329,8 @@ lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats, "Skipped %u pages due to buffer pins.\n", vacrelstats->pinskipped_pages), vacrelstats->pinskipped_pages); + appendStringInfo(&buf, _("Skipped %u frozen pages.\n"), + vacrelstats->frozenskipped_pages); appendStringInfo(&buf, ngettext("%u page is entirely empty.\n", "%u pages are entirely empty.\n", empty_pages), -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
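For reference, the privilege-revocation line the review above notes as missing would presumably sit in the contrib upgrade script right after the function definition, along these lines (a sketch only; the function definition is taken verbatim from the review, and the file layout is assumed):

```sql
-- pg_visibility--1.0--1.1.sql (sketch): after creating the function,
-- revoke execute rights from PUBLIC so that only superusers (or roles
-- explicitly granted EXECUTE) can truncate a visibility map fork.
CREATE FUNCTION pg_truncate_visibility_map(regclass)
RETURNS void
AS 'MODULE_PATHNAME', 'pg_truncate_visibility_map'
LANGUAGE C STRICT
PARALLEL UNSAFE; -- let's not make this any more dangerous

REVOKE ALL ON FUNCTION pg_truncate_visibility_map(regclass) FROM PUBLIC;
```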
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On 6/15/16, Michael Paquier wrote: > On Thu, Jun 16, 2016 at 12:10 PM, Vitaly Burovoy > wrote: >> In the initial letter[1] I posted a digest from the SQL-2011 standard >> and a conclusion as a design of a new patch. >> Now I have more free time and I'm hacking it that way. The new patch >> is at the very early stage, full of WIPs and TODOs. I hope it'll be >> ready to be shown in a month (maybe two). > > I have just read both your patch and Alvaro's, but yours does > not touch pg_constraint in any way. Isn't that unexpected? The consensus was to use CHECK constraints instead of a new type of constraint. Alvaro made an attempt[1] to write a PoC in 2012, but by Jan 2016 it no longer applied on top of master (and, after the conflicts were resolved, it led to core dumps). I rebased Alvaro's patch to understand how he wanted to solve the issue and to run tests and queries. Afterwards I sent the rebased, working patch for anyone who wants to read it and try it without core dumps. I have not published my patch for NOT NULLs yet. Alvaro, the correct patch[2] is in the second message[3] of the thread. What's wrong with it (I got the source from the previous[1] thread)? [1] https://www.postgresql.org/message-id/flat/1343682669-sup-2...@alvh.no-ip.org [2] https://www.postgresql.org/message-id/attachment/41886/catalog-notnull-2-c477e84_cleaned.patch [3] https://www.postgresql.org/message-id/CAKOSWNnXbOY4pEiwN9wePOx8J%2BB44yTj40BQ8RVo-eWkdiGDFQ%40mail.gmail.com -- Best regards, Vitaly Burovoy
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On Thu, Jun 16, 2016 at 1:27 PM, Alvaro Herrera wrote: > Michael Paquier wrote: >> On Wed, Jun 15, 2016 at 10:34 PM, Tom Lane wrote: > >> > IIRC, Alvaro posted a WIP patch for that awhile back. Not sure what the >> > current state is. >> >> Are you talking about that? >> https://www.postgresql.org/message-id/20110707213401.GA27098%40alvh.no-ip.org >> This is not a small patch :) >> >> Alvaro, others, any opinions? > > I haven't looked at this in a while. I suppose I should get it in shape > sometime. Unless you would like to work on it? Well, I could, but there is another big patch I am getting into shape for the next CF (Ahem. scram. Ahem), so I am going to stay focused on this one to have fuel for it during the CF. But I'm fine reviewing the patch for the DROP NOT NULL. > Note that Vitaly Burovoy rebased it, but I think the rebase is wrong. > https://www.postgresql.org/message-id/cakoswnkn6hsyatuys8xzxzrcr-kl1okhs5-b9qd9bf1rad3...@mail.gmail.com Yes, it is definitely wrong. -- Michael
Re: [HACKERS] Typo in parallel comment of heap_delete()
On Tue, Jun 7, 2016 at 12:00 AM, Robert Haas wrote: > On Sun, Jun 5, 2016 at 4:39 PM, Jim Nasby wrote: >> I'm pretty sure this is a typo... > > Sure is. Thanks. The same typo appears in heap_update. PS Far be it from me, but postgres_fdw.c seems to have a stray conditional form where I would expect a present subjunctive: "... lest the new path *would* kick ..." /me ducks -- Thomas Munro http://www.enterprisedb.com lest.patch Description: Binary data
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
Michael Paquier wrote: > On Wed, Jun 15, 2016 at 10:34 PM, Tom Lane wrote: > > IIRC, Alvaro posted a WIP patch for that awhile back. Not sure what the > > current state is. > > Are you talking about that? > https://www.postgresql.org/message-id/20110707213401.GA27098%40alvh.no-ip.org > This is not a small patch :) > > Alvaro, others, any opinions? I haven't looked at this in a while. I suppose I should get it in shape sometime. Unless you would like to work on it? Note that Vitaly Burovoy rebased it, but I think the rebase is wrong. https://www.postgresql.org/message-id/cakoswnkn6hsyatuys8xzxzrcr-kl1okhs5-b9qd9bf1rad3...@mail.gmail.com -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On Thu, Jun 16, 2016 at 12:10 PM, Vitaly Burovoy wrote: > In the initial letter[1] I posted a digest from the SQL-2011 standard > and a conclusion as a design of a new patch. > Now I have more free time and I'm hacking it that way. The new patch > is at the very early stage, full of WIPs and TODOs. I hope it'll be > ready to be shown in a month (maybe two). I have just read both your patch and Alvaro's, but yours does not touch pg_constraint in any way. Isn't that unexpected? > But it already forbids dropping NOT NULLs if they were set as a result > of inheritance. Okay, I'll drop any proposal on my side in this case. Looking forward to seeing what you've got for the first commit fest of 10.0. -- Michael
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Tue, Jun 14, 2016 at 12:18 PM, Tom Lane wrote: > Amit Kapila writes: >> On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane wrote: >>> ... I got a core dump in the window.sql test: >>> which I think may be another manifestation of the >>> failure-to-apply-proper-pathtarget issue we're looking at in this thread. Or maybe it's just >>> an unjustified assumption in make_partialgroup_input_target that the >>> input path must always have some sortgrouprefs assigned. > >> I don't see this problem after your recent commit >> - 89d53515e53ea080029894939118365b647489b3. Are you still getting it? > > No. I am suspicious that there's still a bug there, ie we're looking at > the wrong pathtarget; but the regression test doesn't actually crash. > That might only be because we don't choose the parallelized path. OK, so there are quite a number of separate things here:

1. The case originally reported by Thomas Munro still fails. To fix that, we probably need to apply scanjoin_target to each partial path. But we can only do that if it's parallel-safe. It seems like what we want is something like this: (1) During scan/join planning, somehow skip calling generate_gather_paths for the topmost scan/join rel as we do to all the others. (2) If scanjoin_target is not parallel-safe, build a path for the scan/join phase that applies a Gather node to the cheapest path and does projection at the Gather node. Then forget all the partial paths so we can't do any bogus upper planning. (3) If scanjoin_target is parallel-safe, replace the list of partial paths for the topmost scan/join rel with a new list where scanjoin_target has been applied to each one. I haven't tested this so I might be totally off-base about what's actually required here...

2. In https://www.postgresql.org/message-id/15695.1465827...@sss.pgh.pa.us Tom raised the issue that we need some test cases covering this area.

3. In https://www.postgresql.org/message-id/25521.1465837...@sss.pgh.pa.us Tom proposed adding a GUC to set the minimum relation size required for consideration of parallelism.

4. In https://www.postgresql.org/message-id/1597.1465846...@sss.pgh.pa.us Tom raised the question of whether there is a danger that MinMaxAggPath might not be parallel-safe.

5. In https://www.postgresql.org/message-id/20391.1465850...@sss.pgh.pa.us Tom proposed a patch to fix an apparent confusion on my part between subplans and subqueries.

6. In https://www.postgresql.org/message-id/ca+tgmozwjb9eaibj-meeaq913wkkoz4aq7vqncfrdls9wyh...@mail.gmail.com I discussed the need to fix the way consider_parallel is set for upper rels, and in https://www.postgresql.org/message-id/22832.1465854...@sss.pgh.pa.us Tom observed that, once that work is done, we can get rid of the wholePlanParallelSafe flag.

I don't think it's remotely realistic to get all of this fixed in time for beta2; I think we'll be lucky if we can get #1 fixed. But we'd better track all of these issues so that we can crank through them later. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On 6/15/16, Tom Lane wrote: > Michael Paquier writes: >> To put it short, it should not be possible to drop a NOT NULL >> constraint on a child relation when its parent table is using it, this >> should only be possible from the parent. Attached is a patch handling >> this problem by adding a new function in pg_inherits.c to fetch the >> list of parent relations for a given relation OID, and did some >> refactoring to stick with what is done when scanning child relations. > > This doesn't sound like the right approach; in particular, it won't really > help for deciding whether to propagate a DROP NOT NULL on a parent rel to > its children. What we've discussed in the past is to store NOT NULL > constraints in pg_constraint, much like CHECK constraints are already, and > use use-count logic identical to the CHECK case to keep track of whether > NOT NULL constraints are inherited or not. My feeling is that we'd keep > the pg_attribute.attnotnull field and continue to drive actual enforcement > off that, but it would just reflect a summary of the pg_constraint state. > > IIRC, Alvaro posted a WIP patch for that awhile back. Not sure what the > current state is. > > regards, tom lane The last thread about NOT NULLs as constraints is accessible by the link[1]. I rebased[2] Alvaro's patch to the then-current master, but I have not repeated it since then. In the initial letter[1] I posted a digest from the SQL-2011 standard and a conclusion as a design of a new patch. Now I have more free time and I'm hacking it that way. The new patch is at the very early stage, full of WIPs and TODOs. I hope it'll be ready to be shown in a month (maybe two). But it already forbids dropping NOT NULLs if they were set as a result of inheritance. 
[1] https://www.postgresql.org/message-id/flat/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA%40mail.gmail.com [2] https://www.postgresql.org/message-id/attachment/41886/catalog-notnull-2-c477e84_cleaned.patch -- Best regards, Vitaly Burovoy
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On Wed, Jun 15, 2016 at 10:34 PM, Tom Lane wrote: > Michael Paquier writes: > This doesn't sound like the right approach; in particular, it won't really > help for deciding whether to propagate a DROP NOT NULL on a parent rel to > its children. What we've discussed in the past is to store NOT NULL > constraints in pg_constraint, much like CHECK constraints are already, and > use use-count logic identical to the CHECK case to keep track of whether > NOT NULL constraints are inherited or not. My feeling is that we'd keep > the pg_attribute.attnotnull field and continue to drive actual enforcement > off that, but it would just reflect a summary of the pg_constraint state. OK, I see. Hm, by storing this information I would actually think that we want to drop attnotnull, so that we don't need to bother updating pg_attribute across the whole tree when dropping a NOT NULL constraint on the parent, and we do not actually need to store this information in two different places. I would also rather do nothing for the DDL interface regarding, for example, the possibility of changing the constraint names for domains and tables, to keep things simple. A patch of this caliber would be complicated enough if a catalog switch is done. > IIRC, Alvaro posted a WIP patch for that awhile back. Not sure what the > current state is. Are you talking about that? https://www.postgresql.org/message-id/20110707213401.GA27098%40alvh.no-ip.org This is not a small patch :) Alvaro, others, any opinions? -- Michael
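To make the behavior under discussion concrete, a minimal reproduction of the problem this thread wants to prevent might look like the following (table and column names are illustrative, not taken from any patch):

```sql
CREATE TABLE parent (a int NOT NULL);
CREATE TABLE child () INHERITS (parent);

-- Today this succeeds, leaving "child" free to store NULLs in a column
-- the parent declares NOT NULL; under the proposals in this thread it
-- would be rejected, because the constraint is inherited from "parent":
ALTER TABLE child ALTER COLUMN a DROP NOT NULL;
```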
[HACKERS] forcing a rebuild of the visibility map
[ Changing subject line in the hopes of keeping separate issues in separate threads. ] On Mon, Jun 6, 2016 at 11:00 AM, Tom Lane wrote: > Robert Haas writes: >> I'm intuitively sympathetic to the idea that we should have an option >> for this, but I can't figure out in what case we'd actually tell >> anyone to use it. It would be useful for the kinds of bugs listed >> above to have VACUUM (rebuild_vm) to blow away the VM fork and rebuild >> it, but that's different semantics than what we proposed for VACUUM >> (even_frozen_pages). And I'd be sort of inclined to handle that case >> by providing some other way to remove VM forks (like a new function in >> the pg_visibilitymap contrib module, maybe?) and then just tell people >> to run regular VACUUM afterwards, rather than putting the actual VM >> fork removal into VACUUM. > > There's a lot to be said for that approach. If we do it, I'd be a bit > inclined to offer an option to blow away the FSM as well. After having been scared out of my mind by the discovery of longstanding breakage in heap_update[1], it occurred to me that this is an excellent example of a case in which the option for which Andres was agitating - specifically forcing VACUUM to visit every single page in the heap - would be useful. Even if there's functionality available to blow away the visibility map forks, you might not want to just go remove them all and re-vacuum everything, because while you were rebuilding the VM forks, index-only scans would stop being index-only, and that might cause an operational problem. Being able to simply force every page to be visited is better. Patch for that attached. I decided to call the option disable_page_skipping rather than even_frozen_pages because it should also force visiting pages that are all-visible but not all-frozen. (I was sorely tempted to go with the competing proposal of calling it VACUUM (THEWHOLEDAMNTHING) but I refrained.) 
However, I also think that the approach described above - providing a way to excise VM forks specifically - is useful. Patch for that attached, too. It turns out that can't be done without either adding a new WAL record type or extending an existing one; I chose to add a "flags" argument to XLOG_SMGR_TRUNCATE, specifying the forks to be truncated. Since this will require bumping the XLOG version, if we're going to do it, and I think we should, it would be good to try to get it done before beta2 rather than closer to release. However, I don't want to commit this without some review, so please review this. Actually, please review both patches.[2] The same core support could be used to add an option to truncate the FSM, but I didn't write a patch for that because I'm incredibly { busy | lazy }. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] https://www.postgresql.org/message-id/ca+tgmoz-1txcsxwcp3i-kmo5dnb-6p-odqw7c-n_q3pzzy1...@mail.gmail.com [2] This request is directed at the list generally, not at any specific individual ... well, OK, I mean you. Yeah, you! From 010e99b403ec733d50c71a7d4ef646b1b446ef07 Mon Sep 17 00:00:00 2001 From: Robert Haas Date: Wed, 15 Jun 2016 22:52:58 -0400 Subject: [PATCH 2/2] Add VACUUM (DISABLE_PAGE_SKIPPING) for emergencies. --- doc/src/sgml/ref/vacuum.sgml | 21 - src/backend/commands/vacuum.c| 9 src/backend/commands/vacuumlazy.c| 87 src/backend/parser/gram.y| 10 + src/include/nodes/parsenodes.h | 3 +- src/test/regress/expected/vacuum.out | 1 + src/test/regress/sql/vacuum.sql | 2 + 7 files changed, 92 insertions(+), 41 deletions(-) diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index 19fd748..dee1c5b 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ table_name [ (column_name [, ...] 
) ] ] +VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE | DISABLE_PAGE_SKIPPING } [, ...] ) ] [ table_name [ (column_name [, ...] ) ] ] VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ table_name ] VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ table_name [ (column_name [, ...] ) ] ] @@ -130,6 +130,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ +DISABLE_PAGE_SKIPPING + + + Normally, VACUUM will skip pages based on the visibility map. Pages where + all tuples are known to be frozen can always be skipped, and those + where all tuples are known to be visible to all transactions may be + skipped except when performing an aggressive vacuum. Furthermore, + except when performing an aggressive vacuum, some pages may be skipped + in order to avoid waiting for other sessions to finish using them. + This option disables all page-skipping behavior, and is intended to + be used only the contents of the visi
Re: [HACKERS] parallel.c is not marked as test covered
On Wed, Jun 15, 2016 at 10:48 PM, Amit Kapila wrote: > exec_stmt_execsql() is used to execute SQL statements inside plpgsql, which > includes DML statements as well, so probably you wanted to play it safe by not > allowing the parallel option from that place. However, I think there shouldn't > be a problem in using CURSOR_OPT_PARALLEL_OK from this place as we have a > check in standard_planner which will take care of whether to choose parallel > mode or not for a particular statement. If you want, I can do more detailed > analysis and prepare a patch. That would be great. I have a vague recollection that that function might be called in some contexts where execution can be suspended, which would be no good for parallel query. But that might be wrong. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] parallel.c is not marked as test covered
On Thu, Jun 16, 2016 at 12:46 AM, Robert Haas wrote: > > On Wed, Jun 15, 2016 at 5:23 AM, Amit Kapila wrote: > >> > Considering above analysis is correct, we have below options: > >> > a. Modify the test such that it actually generates an error and to hide > >> > the > >> > context, we can use an exception block and raise some generic error. > >> > b. Modify the test such that it actually generates an error and to hide > >> > the > >> > context, we can use force_parallel_mode = regress; > >> > >> Either of those sounds okay. No need to raise a generic error; one can > >> raise > >> SQLERRM to keep the main message and not the context. I lean toward (a) > >> so we > >> have nonzero test coverage of force_parallel_mode=on. > > > > Patch implementing option (a) attached with this mail. > > OK, committed. Thanks. > I also changed "select" to "perform" per your > analysis. oops, it seems I have forgotten to make that change in the patch. > > I wonder if we need to revisit the choices I made inside > PL/pgsql and see why CURSOR_OPT_PARALLEL_OK is not being set here. > exec_stmt_execsql() is used to execute SQL statements inside plpgsql, which includes DML statements as well, so probably you wanted to play it safe by not allowing the parallel option from that place. However, I think there shouldn't be a problem in using CURSOR_OPT_PARALLEL_OK from this place as we have a check in standard_planner which will take care of whether to choose parallel mode or not for a particular statement. If you want, I can do more detailed analysis and prepare a patch. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
Kevin Grittner wrote: > test=# -- connection 1 > analyze verbose t1; -- when run after threshold, STO error occurs > INFO: analyzing "public.t1" > INFO: "t1": scanned 45 of 45 pages, containing 8999 live rows and > 1001 dead rows; 8999 rows in sample, 8999 estimated total rows > ERROR: snapshot too old > CONTEXT: SQL statement "SELECT (i * (select c1 from t2))" > PL/pgSQL function mysq(integer) line 3 at RETURN > > Is there some other behavior which would be preferred? The fact that the ERROR is being thrown seems okay to me. I was a bit surprised that the second INFO line is shown, but there's a simple explanation: we first acquire the sample rows (using acquire_sample_rows) and only after that's done we try to compute the stats from them. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On 2016-06-15 20:22:24 -0500, Kevin Grittner wrote: > I'm not clear where you see this as being in any way different with > STO. Above it seemed that you saw this as an issue related to > ANALYZE. If there is not early pruning for the table being > analyzed, nothing is at all different. If there is early pruning > the rows are not seen and there could be no detoasting. If there > is a function that lies about IMMUTABLE and reads from a table, it > either functions as before or throws a STO error on page access > (long before any detoasting). Am I missing something? I'm not sure, I might be missing something myself. Given the frequency of confusion of all senior hackers involved in this discussion... I previously was thinking of this in the context of ANALYZE, but I now think it's a bigger problem (and might not really affect ANALYZE itself). The simplest version of the scenario I'm concerned about is that a tuple in a table is *not* vacuumed, even though it's eligible to be removed due to STO. If that tuple has toasted columns, it could be that the toast table was independently vacuumed (autovac considers main/toast tables separately, or the horizon could change between the two heap scans, or pins could prevent vacuuming of one page, or ...). Toast accesses via tuptoaster.c currently don't perform TestForOldSnapshot_impl(), because they use SnapshotToastData, not SnapshotMVCC. That seems to mean two things: 1) You might get 'missing chunk number ...' errors on access to toasted columns. Misleading error, but ok. 2) Because the tuple has been pruned from the toast table, it's possible that the toast oid/va_valueid is reused, because now there's no conflict with GetNewOidWithIndex() anymore. In that case the toast_fetch_datum() might return a tuple from another column & type (all columns in a table share the same toast table), which could lead to almost arbitrary problems. 
That's not super likely to happen, but could have quite severe consequences once it starts. It seems the easiest way to fix this would be to make TestForOldSnapshot() "accept" SnapshotToast as well. Andres
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 5:34 PM, Andres Freund wrote: > On 2016-06-15 16:58:25 -0500, Kevin Grittner wrote: >> On Wed, Jun 15, 2016 at 3:25 PM, Andres Freund wrote: >>> On 2016-06-15 14:24:58 -0500, Kevin Grittner wrote: On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund wrote: > We might fetch a toast tuple which > has since been re-purposed for a datum of a different type. How would that happen? >>> >>> Autovac vacuums toast and heap tables independently. Once a toast datum >>> isn't used anymore, the oid used can be reused (because it doesn't >>> conflict via GetNewOidWithIndex() anymore. If analyze then detoasts a >>> datum, which hasn't been removed, the contents of that toast id, might >>> actually be for something different. >> >> What prevents that from happening now, without STO? > > Afaics we shouldn't ever look (i.e. detoast) at a "dead for everyone" > tuple in autovacuum (or anywhere else). There's one minor exception to > that, and that's enum datums in indexes, which is why we currently have > weird transactional requirements for them. I'm not entirely sure this > can be hit, but it's worth checking. I'm not clear where you see this as being in any way different with STO. Above it seemed that you saw this as an issue related to ANALYZE. If there is not early pruning for the table being analyzed, nothing is at all different. If there is early pruning the rows are not seen and there could be no detoasting. If there is a function that lies about IMMUTABLE and reads from a table, it either functions as before or throws a STO error on page access (long before any detoasting). Am I missing something? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increase message string buffer size of watch command of psql
Thanks, I agree that strftime() is better than asctime(). regards, Ioseph On 2016-06-16 08:37, Tom Lane wrote: > Alvaro Herrera writes: >> +1 to strftime("%c"). If we wanted to provide additional flexibility we >> could have a \pset option to change the strftime format string to >> something other than %c, but I don't think there's enough demand to >> justify it. > Agreed, that seems like something for later (or never). Pushed that way. > > I also widened the buffer a bit in the back branches, to address the > original complaint. > > regards, tom lane
Re: [HACKERS] increase message string buffer size of watch command of psql
Alvaro Herrera writes: > +1 to strftime("%c"). If we wanted to provide additional flexibility we > could have a \pset option to change the strftime format string to > something other than %c, but I don't think there's enough demand to > justify it. Agreed, that seems like something for later (or never). Pushed that way. I also widened the buffer a bit in the back branches, to address the original complaint. regards, tom lane
Re: [HACKERS] MultiXactId error after upgrade to 9.3.4
Stephen Frost wrote: > Greetings, > > Looks like we might not be entirely out of the woods yet regarding > MultiXactId's. After doing an upgrade from 9.2.6 to 9.3.4, we saw the > following: > > ERROR: MultiXactId 6849409 has not been created yet -- apparent wraparound > > The table contents can be select'd out and match the pre-upgrade > backup, but any attempt to VACUUM / VACUUM FULL / CLUSTER fails, > unsurprisingly. I finally figured out what is going on here, though I don't yet have a patch. This has been reported a number of times: https://www.postgresql.org/message-id/20140330040029.GY4582%40tamriel.snowman.net https://www.postgresql.org/message-id/538F3D70.6080902%40publicrelay.com https://www.postgresql.org/message-id/556439CF.7070109%40pscs.co.uk https://www.postgresql.org/message-id/20160614173150.GA443784@alvherre.pgsql https://www.postgresql.org/message-id/20160615203829.5798.4...@wrigleys.postgresql.org We theorised that we were missing some place that was failing to pass the "allow_old" flag to GetMultiXactIdMembers; and since we couldn't find any and the problem was worked around simply (by doing SELECT FOR UPDATE or equivalent on the affected tuples), there was no further research. (The allow_old flag is passed for tuples that match an infomask bit pattern that can only come from tuples locked in 9.2 and prior, i.e. one that is never set by 9.3ff). Yesterday I had to deal with it and quickly found what is going wrong: the problem is that in 9.2 and earlier it was acceptable (and common) to leave tuples with very old multixacts in xmax, even after multixact counter wraparound. When one such value was found in a live tuple, GetMultiXactIdMembers() would notice that it was out of range and simply return "no members", at which point heap_update and siblings would consider the tuple as not locked and move on. 
When pg_upgrading a database containing tuples marked like that, the new code would error out, because during the 9.3 multixact work we considered that it was dangerous to silently allow tuples to be marked by values we didn't keep track of, so we made it an error instead, per
https://www.postgresql.org/message-id/20111204122027.GA10035%40tornado.leadboat.com

Some cases are allowed to be downgraded to DEBUG, when allow_old is true. I think that was a good choice in general so that possibly-data-eating bugs could be reported, but there's a problem in the specific case of tuples carried over by pg_upgrade whose multixact is "further in the future" compared to the nextMultiXactId counter. I think it's pretty clear that we should let that error be downgraded to DEBUG too, like the other checks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
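As an aside for readers following this thread: the decision being discussed here — hard error vs. DEBUG vs. silently reporting "no members" — hinges on a range test against the next multixact counter. A standalone sketch of that test (illustrative C only; the names and the flat comparison are my simplifications, not the actual GetMultiXactIdMembers() logic, which also has to cope with modular wraparound comparisons):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactId;

/*
 * Hypothetical simplification of the check under discussion: a multixact
 * found in a tuple's xmax is "from the future" if it is >= the next value
 * the counter will assign.  On 9.2 such leftovers were silently treated as
 * having no members; on 9.3+ the same condition raises an error unless the
 * tuple's infomask marks it as a pre-9.3 carryover (allow_old).
 */
typedef enum { MXID_OK, MXID_EMPTY, MXID_FUTURE_ERROR } MxidVerdict;

static MxidVerdict
check_multixact(MultiXactId mxid, MultiXactId next_mxid, bool allow_old)
{
    if (mxid >= next_mxid)      /* apparent wraparound / future value */
        return allow_old ? MXID_EMPTY : MXID_FUTURE_ERROR;
    return MXID_OK;
}
```

The proposal above amounts to routing the pg_upgrade carryover case through the MXID_EMPTY branch (with a DEBUG message) instead of the error branch.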
Re: [HACKERS] [PATCH v12] GSSAPI encryption support
Michael Paquier writes: > On Thu, Apr 7, 2016 at 8:20 AM, Tom Lane wrote: >> Robbie Harwood writes: >>> Tom Lane writes: >>> Wait a second. So the initial connection-request packet is necessarily unencrypted under this scheme? >> >>> Yes, by necessity. The username must be sent in the clear, even if >>> only as part of the GSSAPI handshake (i.e., the GSSAPI username will >>> appear in plaintext in the GSSAPI blobs which are otherwise >>> encrypted). GSSAPI performs authentication before it can start >>> encryption. >> >> Ugh. I had thought we were putting work into this because it >> represented something we could recommend as best practice, but now >> you're telling me that it's always going to be inferior to what we >> have already. > > It does not seem necessary to have an equivalent of > pqsecure_open_client, just some extra handling in fe-connect.c to set > up the initial context with a proper message handling... Not that > direct anyway. So should the patch be marked as returned with feedback > at this stage? I think in order to satisfy Tom's (valid) concern, there does need to be a separate handshake - i.e., GSSAPI support in pqsecure_open_client(). If I were to continue as I have been - using the plaintext connection and auth negotiation path - then at the time of startup the client has no way of knowing whether to send connection parameters or not. Personally, I would be in favor of not frontloading these connection parameters over insecure connections, but it is my impression that the project does not want to go this way (which is fine). The way I'm seeing this, when a connection comes in, we take the 'G' character for GSSAPI much as for SSL. At that time, we need to perform an *authentication* handshake (because GSSAPI will not do encryption before authenticating). I expect to use a consistent format for all GSSAPI packets - four bytes for length, and a payload.
(I would prefer tagging them, but previously preference for not doing this has been expressed.) Once GSSAPI authentication is complete, the normal handshake process can be tunneled through a GSSAPI encryption layer, as is done with TLS. The server will need to retain some of the earlier authentication data (e.g., to check that the presented user-name matches GSSAPI credentials), but there will be no authentication packets exchanged (more specifically, it will resemble the anonymous case). Authorization will be checked as normal, and we then proceed in the usual fashion, all over the GSSAPI tunnel. On the server, I'll need to implement `hostgss` (by analogy to `hostssl`), and we'll want to lock authentication on those connections to GSSAPI-only. Clients will explicitly probe for GSSAPI support as they do for TLS support (I look forward to the bikeshed on the order of these) and should have a parameter to require said support. One thing I'm not clear on is what our behavior should be when the user doesn't explicitly request GSSAPI and doesn't have a ccache - do we prompt? Skip probing? I'm not sure what the best option there is. Before I implement this design, does anyone have any additional concerns or feedback on it? Thanks, --Robbie signature.asc Description: PGP signature
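For concreteness, the framing Robbie describes — a four-byte length followed by an untagged payload — could look like the following minimal sketch. The names gss_pack/gss_unpack and the network-byte-order choice are my assumptions for illustration, not part of the patch:

```c
#include <arpa/inet.h>   /* htonl / ntohl */
#include <stdint.h>
#include <string.h>

/* Write a length-prefixed packet into buf; returns total bytes written.
 * Caller must ensure buf has room for 4 + len bytes. */
static size_t
gss_pack(uint8_t *buf, const uint8_t *payload, uint32_t len)
{
    uint32_t nlen = htonl(len);         /* length in network byte order */

    memcpy(buf, &nlen, 4);
    memcpy(buf + 4, payload, len);
    return 4 + (size_t) len;
}

/* Read the 4-byte header; returns payload length and points *payload
 * at the payload bytes inside buf. */
static uint32_t
gss_unpack(const uint8_t *buf, const uint8_t **payload)
{
    uint32_t nlen;

    memcpy(&nlen, buf, 4);
    *payload = buf + 4;
    return ntohl(nlen);
}
```

Because there is no type tag, both sides must agree from protocol state on what each payload means — which is exactly the trade-off mentioned above.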
Re: [HACKERS] New design for FK-based join selectivity estimation
Tomas Vondra writes: > Attached is a reworked patch, mostly following the new design proposal > from this thread. I've whacked this around quite a bit and am now reasonably happy with it. The issue of planning speed hit should be pretty much gone, although I've not done testing to prove that. I've rearranged the actual selectivity calculation some too, because tests I did here did not look very good for anything but the plain-inner-join case. It may be that more work is needed there, but at least there's reasonable commentary about what we're doing and why. I have not adopted the plan of ignoring single-column FKs. While it would only take a couple more lines to do so, I think the argument that there will be a material planning speed hit no longer has much force. And I see a couple of arguments in favor of allowing this code to trigger on single-column FKs. First, it should work well even when pg_statistic data is missing or out of date, which would be an improvement; and second, we are likely not going to get much beta testing of the behavior if we restrict it to multi-col FKs. So I think we should ship it like this for beta even if we end up adding a filter against single-column FKs for release. Comments and testing appreciated. 
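To make the estimation idea concrete before diving into the patch: for an inner join across a matched foreign key, every referencing row joins to exactly one referenced row, so the join selectivity is about 1/(rows in the referenced table), whereas multiplying per-column selectivities underestimates badly for multi-column keys. A toy sketch of that arithmetic (illustrative only — the function names are mine, and this is not the patch's actual costing code):

```c
#include <math.h>

/*
 * Old-style estimate: treat the key columns as independent and multiply
 * per-column selectivities, each approximated as 1/ndistinct.
 */
static double
sel_per_column(const double *ndistinct, int ncols)
{
    double s = 1.0;

    for (int i = 0; i < ncols; i++)
        s /= ndistinct[i];
    return s;
}

/*
 * FK-aware estimate: each referencing row matches exactly one row in the
 * referenced table, so selectivity is simply 1/(referenced row count).
 */
static double
sel_foreign_key(double referenced_rows)
{
    return 1.0 / referenced_rows;
}
```

With a two-column key over a 1000-row referenced table where each column has 1000 distinct values, the per-column estimate is 1e-6 while the FK-aware estimate is 1e-3 — a factor-of-1000 difference in estimated join size, which is the kind of error the patch targets.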
regards, tom lane

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 08ed990..8e5b33c 100644
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "nodes/plannodes.h"
  #include "nodes/relation.h"
  #include "utils/datum.h"
+ #include "utils/rel.h"
  
  /*
*************** _copyValue(const Value *from)
*** 4252,4257 ****
--- 4253,4276 ----
  	return newnode;
  }
  
+ 
+ static ForeignKeyCacheInfo *
+ _copyForeignKeyCacheInfo(const ForeignKeyCacheInfo *from)
+ {
+ 	ForeignKeyCacheInfo *newnode = makeNode(ForeignKeyCacheInfo);
+ 
+ 	COPY_SCALAR_FIELD(conrelid);
+ 	COPY_SCALAR_FIELD(confrelid);
+ 	COPY_SCALAR_FIELD(nkeys);
+ 	/* COPY_SCALAR_FIELD might work for these, but let's not assume that */
+ 	memcpy(newnode->conkey, from->conkey, sizeof(newnode->conkey));
+ 	memcpy(newnode->confkey, from->confkey, sizeof(newnode->confkey));
+ 	memcpy(newnode->conpfeqop, from->conpfeqop, sizeof(newnode->conpfeqop));
+ 
+ 	return newnode;
+ }
+ 
+ 
  /*
   * copyObject
   *
*************** copyObject(const void *from)
*** 5050,5055 ****
--- 5069,5081 ----
  			retval = _copyRoleSpec(from);
  			break;
  
+ 			/*
+ 			 * MISCELLANEOUS NODES
+ 			 */
+ 		case T_ForeignKeyCacheInfo:
+ 			retval = _copyForeignKeyCacheInfo(from);
+ 			break;
+ 
  		default:
  			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from));
  			retval = 0;			/* keep compiler quiet */
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c7b4153..eaaa370 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
***************
*** 30,35 ****
--- 30,36 ----
  #include "nodes/plannodes.h"
  #include "nodes/relation.h"
  #include "utils/datum.h"
+ #include "utils/rel.h"
  
  /*
*************** _outPlannerInfo(StringInfo str, const Pl
*** 2048,2053 ****
--- 2049,2055 ----
  	WRITE_NODE_FIELD(append_rel_list);
  	WRITE_NODE_FIELD(rowMarks);
  	WRITE_NODE_FIELD(placeholder_list);
+ 	WRITE_NODE_FIELD(fkey_list);
  	WRITE_NODE_FIELD(query_pathkeys);
  	WRITE_NODE_FIELD(group_pathkeys);
  	WRITE_NODE_FIELD(window_pathkeys);
*************** _outIndexOptInfo(StringInfo str, const I
*** 2138,2143 ****
--- 2140,2174 ----
  }
  
  static void
+ _outForeignKeyOptInfo(StringInfo str, const ForeignKeyOptInfo *node)
+ {
+ 	int			i;
+ 
+ 	WRITE_NODE_TYPE("FOREIGNKEYOPTINFO");
+ 
+ 	WRITE_UINT_FIELD(con_relid);
+ 	WRITE_UINT_FIELD(ref_relid);
+ 	WRITE_INT_FIELD(nkeys);
+ 	appendStringInfoString(str, " :conkey");
+ 	for (i = 0; i < node->nkeys; i++)
+ 		appendStringInfo(str, " %d", node->conkey[i]);
+ 	appendStringInfoString(str, " :confkey");
+ 	for (i = 0; i < node->nkeys; i++)
+ 		appendStringInfo(str, " %d", node->confkey[i]);
+ 	appendStringInfoString(str, " :conpfeqop");
+ 	for (i = 0; i < node->nkeys; i++)
+ 		appendStringInfo(str, " %u", node->conpfeqop[i]);
+ 	WRITE_INT_FIELD(nmatched);
+ 	/* for compactness, just print the number of matches per column: */
+ 	appendStringInfoString(str, " :eclass");
+ 	for (i = 0; i < node->nkeys; i++)
+ 		appendStringInfo(str, " %d", (node->eclass[i] != NULL));
+ 	appendStringInfoString(str, " :rinfos");
+ 	for (i = 0; i < node->nkeys; i++)
+ 		appendStringInfo(str, " %d", list_length(node->rinfos[i]));
+ }
+ 
+ static void
  _outEquivalenceClass(StringInfo str, const EquivalenceClass *node)
  {
  /*
*************** _outConstraint(StringInfo str, const Con
*** 3207,3212 ****
--- 3238,3264 ----
  	}
  }
  
+ static void
+ _outForeignKeyCacheInfo(StringInfo str, const ForeignKeyCacheInfo *node)
+ {
+ 	int			i;
+ 
+ 	WRITE_NODE_TYPE("FOREIGNKEYCACHEINFO");
+ 
+ 	WRITE_OID_FIELD(conrelid);
+ 	WRITE_OID_FIELD
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On 2016-06-15 16:58:25 -0500, Kevin Grittner wrote: > On Wed, Jun 15, 2016 at 3:25 PM, Andres Freund wrote: > > On 2016-06-15 14:24:58 -0500, Kevin Grittner wrote: > >> On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund wrote: > >> > >> > We might fetch a toast tuple which > >> > since have been re-purposed for a datum of a different type. > >> > >> How would that happen? > > > > Autovac vacuums toast and heap tables independently. Once a toast datum > > isn't used anymore, the oid used can be reused (because it doesn't > > conflict via GetNewOidWithIndex() anymore). If analyze then detoasts a > > datum, which hasn't been removed, the contents of that toast id, might > > actually be for something different. > > What prevents that from happening now, without STO? Afaics we shouldn't ever look (i.e. detoast) at a "dead for everyone" tuple in autovacuum (or anywhere else). There's one minor exception to that, and that's enum datums in indexes, which is why we currently have weird transactional requirements for them. I'm not entirely sure this can be hit, but it's worth checking.

Andres
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 2:40 PM, Alvaro Herrera wrote: > Kevin Grittner wrote: >> On Wed, Jun 15, 2016 at 1:59 PM, Alvaro Herrera >> wrote: > >> > We actually go quite some lengths to support this case, even when it's >> > the opinion of many that we shouldn't. For example VACUUM doesn't try >> > to find index entries using the values in each deleted tuple; instead we >> > remember the TIDs and then scan the indexes (possibly many times) to >> > find entries that match those TIDs -- which is much slower. Yet we do >> > it this way to protect the case that somebody is doing the >> > not-really-IMMUTABLE function. >> > >> > In other words, I don't think we consider the position you argued as >> > acceptable. >> >> What are you saying is unacceptable, and what behavior would be >> acceptable instead? > > The answer "we don't support the situation where you have an index using > an IMMUTABLE function that isn't actually immutable" is not acceptable. > The acceptable solution would be a design that doesn't have that > property as a requisite. > > I think having various executor(/heapam) checks that raise errors when > queries are executed from within ANALYZE is acceptable. 
Here is an example of a test case showing that:

-- connection 1
drop table if exists t1;
create table t1 (c1 int not null);
drop table if exists t2;
create table t2 (c1 int not null);
insert into t1 select generate_series(1, 10000);
drop function mysq(i int);
create function mysq(i int) returns int language plpgsql immutable
as $mysq$
begin
  return (i * (select c1 from t2));
end
$mysq$;
insert into t2 values (1);
create index t1_c1sq on t1 ((mysq(c1)));
begin transaction isolation level repeatable read;
select 1;

-- connection 2
vacuum analyze verbose t1;
delete from t1 where c1 between 1000 and 1999;
delete from t1 where c1 = 8000;
update t2 set c1 = 1;

-- connection 1
analyze verbose t1;  -- when run after threshold, STO error occurs

The tail end of that, running the analyze once immediately and once after the threshold, is:

test=# -- connection 1
test=# analyze verbose t1;  -- when run after threshold, STO error occurs
INFO:  analyzing "public.t1"
INFO:  "t1": scanned 45 of 45 pages, containing 8999 live rows and 1001 dead rows; 8999 rows in sample, 8999 estimated total rows
ANALYZE
test=# -- connection 1
test=# analyze verbose t1;  -- when run after threshold, STO error occurs
INFO:  analyzing "public.t1"
INFO:  "t1": scanned 45 of 45 pages, containing 8999 live rows and 1001 dead rows; 8999 rows in sample, 8999 estimated total rows
ERROR:  snapshot too old
CONTEXT:  SQL statement "SELECT (i * (select c1 from t2))"
PL/pgSQL function mysq(integer) line 3 at RETURN

Is there some other behavior which would be preferred?

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 3:25 PM, Andres Freund wrote: > On 2016-06-15 14:24:58 -0500, Kevin Grittner wrote: >> On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund wrote: >> >> > We might fetch a toast tuple which >> > since have been re-purposed for a datum of a different type. >> >> How would that happen? > > Autovac vacuums toast and heap tables independently. Once a toast datum > isn't used anymore, the oid used can be reused (because it doesn't > conflict via GetNewOidWithIndex() anymore. If analyze then detoasts a > datum, which hasn't been removed, the contents of that toast id, might > actually be for something different. What prevents that from happening now, without STO? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] parallel.c is not marked as test covered
On Wed, Jun 15, 2016 at 4:59 PM, Peter Eisentraut wrote: > On 6/14/16 12:37 PM, Robert Haas wrote: >> ERROR: can't generate random numbers because you haven't specified a seed >> >> ...to which the user will reply, "oh yes I did; in fact I ran SELECT >> magic_setseed(42) just before I ran the offending query!". They'll >> then contact an expert (hopefully) who will very possibly spend a lot >> of time troubleshooting the wrong thing. If the message says: >> >> ERROR: can't generate random numbers because you haven't specified a seed >> CONTEXT: parallel worker, PID 62413 >> >> ...then the expert has a very good chance of guessing what has gone >> wrong right away. > > My updated opinion is to keep the context but remove the PID. I can see > that as this feature is being introduced, we will want to know that > something is happening inside a parallel worker. Maybe in the future we'll > have it all figured out and won't need the context anymore. But the PID is > not useful by the time you see the message. The server log should contain a > trace of the PIDs if you need to do deep digging. I'm comfortable with that. Feel free to make it so, unless you want me to do it for some reason. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] parallel.c is not marked as test covered
On 6/14/16 12:37 PM, Robert Haas wrote:
> ERROR: can't generate random numbers because you haven't specified a seed
>
> ...to which the user will reply, "oh yes I did; in fact I ran SELECT
> magic_setseed(42) just before I ran the offending query!". They'll
> then contact an expert (hopefully) who will very possibly spend a lot
> of time troubleshooting the wrong thing. If the message says:
>
> ERROR: can't generate random numbers because you haven't specified a seed
> CONTEXT: parallel worker, PID 62413
>
> ...then the expert has a very good chance of guessing what has gone
> wrong right away.

My updated opinion is to keep the context but remove the PID. I can see that as this feature is being introduced, we will want to know that something is happening inside a parallel worker. Maybe in the future we'll have it all figured out and won't need the context anymore. But the PID is not useful by the time you see the message. The server log should contain a trace of the PIDs if you need to do deep digging.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
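Purely to illustrate the two message shapes under discussion, here is a standalone sketch (not the actual errcontext() machinery; format_worker_context is a hypothetical helper, and the flag mirrors "keep the PID" vs. "drop it"):

```c
#include <stdbool.h>
#include <stdio.h>

/* Format the CONTEXT line either with or without the worker's PID. */
static void
format_worker_context(char *out, size_t outsz, int pid, bool with_pid)
{
    if (with_pid)
        snprintf(out, outsz, "CONTEXT: parallel worker, PID %d", pid);
    else
        snprintf(out, outsz, "CONTEXT: parallel worker");
}
```

Peter's suggestion corresponds to the with_pid=false branch: the reader still learns the error came from a worker, while the PID stays in the server log where it can be correlated later.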
Re: [HACKERS] increase message string buffer size of watch command of psql
Tom Lane wrote: > I wrote: > > Well, we did part of that, but it's still using asctime(). Should we > > change that to strftime(..."%c"...) to be less English-centric? > > (The result seems to be the same in C locale. pg_controldata has done > > it that way for a long time, with few complaints.) If we want to do so, > > now would be the time, since 9.6 already whacked around the format > > of \watch output. > > I take it from the vast silence that nobody particularly cares one way > or the other. On reflection I think that this would be a good change > to make, so I'll go do so unless I hear complaints soon. +1 to strftime("%c"). If we wanted to provide additional flexibility we could have a \pset option to change the strftime format string to something other than %c, but I don't think there's enough demand to justify it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On 2016-06-15 14:24:58 -0500, Kevin Grittner wrote: > On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund wrote: > > > We might fetch a toast tuple which > > since have been re-purposed for a datum of a different type. > > How would that happen? Autovac vacuums toast and heap tables independently. Once a toast datum isn't used anymore, the oid used can be reused (because it doesn't conflict via GetNewOidWithIndex() anymore). If analyze then detoasts a datum which hasn't been removed, the contents of that toast id might actually be for something different. That's not super likely to happen (given how rare oid wraparounds usually are), but it appears to be possible.

Regards,

Andres
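The reuse hazard Andres describes follows from OIDs being 32-bit counters that wrap around. A toy illustration of the wraparound arithmetic (FirstNormalObjectId matches the real reserved boundary; the rest is a simplification that ignores the duplicate-probing GetNewOidWithIndex() actually performs):

```c
#include <stdint.h>

#define FirstNormalObjectId 16384u  /* OIDs below this are reserved */

/* Advance a 32-bit OID counter, skipping the reserved range on wrap. */
static uint32_t
next_oid(uint32_t oid)
{
    oid++;
    if (oid < FirstNormalObjectId)  /* wrapped past 2^32 - 1 */
        oid = FirstNormalObjectId;
    return oid;
}
```

After a wrap, the counter re-enters the normal range and previously used values become assignable again — which is why a stale toast pointer held across a wraparound can end up naming a chunk that now belongs to a different datum.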
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 3:40 PM, Alvaro Herrera wrote: > Kevin Grittner wrote: > > On Wed, Jun 15, 2016 at 1:59 PM, Alvaro Herrera > > wrote: > > > > We actually go quite some lengths to support this case, even when it's > > > the opinion of many that we shouldn't. For example VACUUM doesn't try > > > to find index entries using the values in each deleted tuple; instead > we > > > remember the TIDs and then scan the indexes (possibly many times) to > > > find entries that match those TIDs -- which is much slower. Yet we do > > > it this way to protect the case that somebody is doing the > > > not-really-IMMUTABLE function. > > > > > > In other words, I don't think we consider the position you argued as > > > acceptable. > > > > What are you saying is unacceptable, and what behavior would be > > acceptable instead? > > The answer "we don't support the situation where you have an index using > an IMMUTABLE function that isn't actually immutable" is not acceptable. > The acceptable solution would be a design that doesn't have that > property as a requisite. > Yes, a much better solution would be for PostgreSQL to examine the body of every function and determine on its own the proper volatility - or lacking that to "sandbox" (for lack of a better term) function execution so it simply cannot do things that conflict with its user specified marking. But the prevailing opinion on this list is that such an effort is not worthy of resources and that "let the user keep both pieces" is the more expedient policy. That this patch is being defended using that argument is consistent to policy and thus quite acceptable. The technical details here are just beyond my reach ATM but I think Robert's meta-points are spot on. 
Though to be fair we are changing a fundamental assumption underlying how the system and transactions operate - the amount of code whose assumptions are now being stressed is non-trivial; and for a feature that will generally have less use in production - and likely in much higher-stakes arenas - having a professionally hostile approach will help to ensure that what is released has been thoroughly vetted. These edge cases should be thought of, discussed, and ideally documented somewhere so that future coders can see and understand that said edges have been considered even if the answer is: "well, we don't blow up and at worst have some slightly off statistics, that seems fine". David J.
Re: [HACKERS] WIP: Data at rest encryption
On 06/14/2016 09:59 PM, Jim Nasby wrote: On 6/12/16 2:13 AM, Ants Aasma wrote: On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote: > 1. Instead of doing the entire database files encryption, how about > providing user an option to protect only some particular tables that > wants the encryption at table/tablespace level. This not only provides > an option to the user, it reduces the performance impact on tables > that doesn't need any encryption. The problem with this approach > is that every xlog record needs to validate to handle the encryption > /decryption, instead of at page level. Is there a real need for this? The customers I have talked to want to encrypt the whole database and my goal is to make the feature fast enough to make that feasible for pretty much everyone. I guess switching encryption off per table would be feasible, but the key setup would still need to be done at server startup. Per record encryption would result in some additional information leakage though. Overall I thought it would not be worth it, but I'm willing to have my mind changed on this. I actually design with this in mind. Tables that contain sensitive info go into designated schemas, partly so that you can blanket move all of those to an encrypted tablespace (or safer would be to move things not in those schemas to an unencrypted tablespace). Since that can be done with an encrypted filesystem maybe that's good enough. (It's not really clear to me what this buys us over an encrypted FS, other than a feature comparison checkmark...) the reason why this is needed is actually very simple: security guidelines and legal requirements ... we have dealt with a couple of companies recently, who explicitly demanded PostgreSQL level encryption in a transparent way to fulfill some internal or legal requirements. this is especially true for financial stuff. and yes, sure ... you can do a lot of stuff with filesystem encryption. 
The core idea of this entire thing is however to have a counterpart on the database level. If you don't have the key you cannot start the instance, and if you happen to get access to the filesystem you are still not able to fire up the DB. As I said: requirements by ever bigger companies.

As far as benchmarking is concerned: I did a quick test yesterday (not with the final AES implementation yet) and I got pretty good results. With a reasonably well cached database in a typical application I expect to lose around 10-20%. If everything fits in memory there is 0 loss of course. The worst I got with the standard AES (no hardware support used yet) was a loss of around 45% or so, but this requires a value as low as 32 MB of shared buffers or so.

many thanks,

hans

-- 
Hans-Jürgen Schönig
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 2:59 PM, Alvaro Herrera wrote: > Kevin Grittner wrote: >> On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas wrote: >> >> > The expression index case is the one to worry about; if there is a >> > problem, that's where it is. What bothers me is that a function used >> > in an expression index could do anything at all - it can read any >> > table in the database. >> >> It *can*, but then you are lying to the database when you call it >> IMMUTABLE. Such an index can easily become corrupted through >> normal DML. Without DML the ANALYZE has no problem. So you seem >> to be concerned that if someone is lying to the database engine to >> force it accept a function as IMMUTABLE when it actually isn't, and >> then updating the referenced rows (which is very likely to render >> the index corrupted), that statistics might also become stale. > > We actually go quite some lengths to support this case, even when it's > the opinion of many that we shouldn't. For example VACUUM doesn't try > to find index entries using the values in each deleted tuple; instead we > remember the TIDs and then scan the indexes (possibly many times) to > find entries that match those TIDs -- which is much slower. Yet we do > it this way to protect the case that somebody is doing the > not-really-IMMUTABLE function. > > In other words, I don't think we consider the position you argued as > acceptable. Well, I actually don't think there's a giant problem here. I'm just trying to follow the chain of the argument to its (illogical) conclusion. I mean, if ANALYZE itself fails to see a tuple subjected to early pruning, that should be fine. And if some query run by a supposedly-but-not-actually immutable function errors out because snapshot_too_old is set, that also seems more or less fine. 
The statistics might not get updated, but oh well: either make your supposedly-immutable function actually immutable, or else turn off snapshot_too_old, or else live with the fact that ANALYZE will fail some percentage of the time. Supporting people who cheat and do things that technically aren't allowed is one thing; saying that every new feature must never have any downsides for such people is something else. If we took the latter approach, parallel query would be right out, because you sure can break things by mislabeling functions as PARALLEL SAFE. I *do* think it *must* be possible to get an ANALYZE to do something funky - either error out, or give wrong answers - if you set up a strange enough set of circumstances, but I don't think that's necessarily something we need to do anything about. I think this whole discussion of snapshot too old has provoked far too many bitter emails -- on all sides. I entirely believe that there are legitimate reasons to have concerns about this feature, and I entirely suspect that it has problems we haven't found yet, and I also entirely believe that there will be some future bugs that stem from this feature that we would not have had otherwise. I think it is entirely legitimate to have concerns about those things. On the other hand, I *also* believe that Kevin is a pretty careful guy and that he's done his best to make this patch safe and that he did not just go off and commit something half-baked without due reflection. We have to expect that if people who are committers don't get much review of their patches, they will eventually commit them anyway. The "I can't believe you committed this" reactions seem out of line to me. This feature is not perfect. Nor is it the worst thing anybody's ever committed. But it's definitely provoked more ire than most. 
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Choosing the cheapest optimizer cost
Right now, the optimizer chooses the path with the cheapest cost. However, do we take into account the behavior of the plan in handling mis-estimated row counts? For example, if a path has a log(n) behavior for changes in the row count, and another plan that is slightly cheaper has a log(n^2) behavior, should we choose the former, knowing that the row counts are often inaccurate? I suppose one approach would be to track not only the path costs, but also each path's sensitivity to mis-estimated row counts, and account for that in the final path choice. Do we already do this by giving less stable plans higher costs? Does that have the same effect?

-- 
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
Kevin Grittner wrote: > On Wed, Jun 15, 2016 at 1:59 PM, Alvaro Herrera > wrote: > > We actually go quite some lengths to support this case, even when it's > > the opinion of many that we shouldn't. For example VACUUM doesn't try > > to find index entries using the values in each deleted tuple; instead we > > remember the TIDs and then scan the indexes (possibly many times) to > > find entries that match those TIDs -- which is much slower. Yet we do > > it this way to protect the case that somebody is doing the > > not-really-IMMUTABLE function. > > > > In other words, I don't think we consider the position you argued as > > acceptable. > > What are you saying is unacceptable, and what behavior would be > acceptable instead? The answer "we don't support the situation where you have an index using an IMMUTABLE function that isn't actually immutable" is not acceptable. The acceptable solution would be a design that doesn't have that property as a requisite. I think having various executor(/heapam) checks that raise errors when queries are executed from within ANALYZE is acceptable. I don't know about the TOAST related angle Andres just raised.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund wrote: > We might fetch a toast tuple which > since have been re-purposed for a datum of a different type. How would that happen? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On 2016-06-15 14:50:46 -0400, Robert Haas wrote: > On Wed, Jun 15, 2016 at 2:45 PM, Alvaro Herrera > wrote: > > Robert Haas wrote: > >> On Wed, Jun 15, 2016 at 1:44 PM, Kevin Grittner > >> wrote: > >> > The test I showed creates a situation which (to ANALYZE) is > >> > identical to what you describe -- ANALYZE sees a page with an LSN > >> > recent enough that it could have been (and actually has been) > >> > pruned. Why would it be better for the ANALYZE to fail than to > >> > complete? > >> > >> As I understand it, the reason we need to sometimes give "ERROR: > >> snapshot too old" after early pruning is because we might otherwise > >> give the wrong answer. > > > > So what constitutes "the wrong answer"? A regular transaction reading a > > page and not finding a tuple that should have been there but was > > removed, is a serious problem and should be aborted. For ANALYZE it may > > not matter a great deal. Some very old tuple that might have been > > chosen for the sample is not there; a different tuple is chosen instead, > > so the stats might be slightly difference. No big deal. > > > > Maybe it is possible to get into trouble if you're taking a sample for > > an expression index. > > The expression index case is the one to worry about; if there is a > problem, that's where it is. What bothers me is that a function used > in an expression index could do anything at all - it can read any > table in the database. Isn't that also a problem around fetching toast tuples? As we don't TestForOldSnapshot_impl() for toast, We might fetch a toast tuple which since have been re-purposed for a datum of a different type. Which can have arbitrarily bad consequences afaics. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parallel.c is not marked as test covered
On Wed, Jun 15, 2016 at 5:23 AM, Amit Kapila wrote: >> > Considering above analysis is correct, we have below options: >> > a. Modify the test such that it actually generates an error and to hide >> > the >> > context, we can exception block and raise some generic error. >> > b. Modify the test such that it actually generates an error and to hide >> > the >> > context, we can use force_parallel_mode = regress; >> >> Either of those sounds okay. No need to raise a generic error; one can >> raise >> SQLERRM to keep the main message and not the context. I lean toward (a) >> so we >> have nonzero test coverage of force_parallel_mode=on. > > Patch implementing option (a) attached with this mail. OK, committed. I also changed "select" to "perform" per your analysis. I wonder if we need to revisit the choices I made inside PL/pgsql and see why CURSOR_OPT_PARALLEL_OK is not being set here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
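For readers following the thread, the shape of fix (a) is roughly the following sketch (the function name and the statement it wraps are hypothetical; the real test lives in the regression suite). The point is that re-raising only SQLERRM keeps the primary error message while dropping the CONTEXT line that would otherwise reveal whether the query ran in a parallel worker:

```sql
CREATE OR REPLACE FUNCTION hide_parallel_context() RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    -- PERFORM (not SELECT) discards the result inside plpgsql; the
    -- statement is expected to raise an error under parallel execution.
    PERFORM statement_that_errors();  -- hypothetical
EXCEPTION WHEN OTHERS THEN
    -- Re-raise only the message text, so the regression output is
    -- stable whether or not a parallel worker produced the error.
    RAISE EXCEPTION '%', SQLERRM;
END;
$$;
```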
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 1:59 PM, Alvaro Herrera wrote: > Kevin Grittner wrote: >> On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas wrote: >> >>> The expression index case is the one to worry about; if there is a >>> problem, that's where it is. What bothers me is that a function used >>> in an expression index could do anything at all - it can read any >>> table in the database. >> >> It *can*, but then you are lying to the database when you call it >> IMMUTABLE. Such an index can easily become corrupted through >> normal DML. Without DML the ANALYZE has no problem. So you seem >> to be concerned that if someone is lying to the database engine to >> force it accept a function as IMMUTABLE when it actually isn't, and >> then updating the referenced rows (which is very likely to render >> the index corrupted), that statistics might also become stale. > > We actually go quite some lengths to support this case, even when it's > the opinion of many that we shouldn't. For example VACUUM doesn't try > to find index entries using the values in each deleted tuple; instead we > remember the TIDs and then scan the indexes (possibly many times) to > find entries that match those TIDs -- which is much slower. Yet we do > it this way to protect the case that somebody is doing the > not-really-IMMUTABLE function. > > In other words, I don't think we consider the position you argued as > acceptable. What are you saying is unacceptable, and what behavior would be acceptable instead? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 1:54 PM, Alvaro Herrera wrote: > Robert Haas wrote: >> The expression index case is the one to worry about; if there is a >> problem, that's where it is. What bothers me is that a function used >> in an expression index could do anything at all - it can read any >> table in the database. > > Hmm, but if it does that, then the code that actually implements that > query would run the STO checks, right? The analyze code doesn't need > to. Right. In the described case, you would get a "snapshot too old" error inside the expression which is trying to generate the index value. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
Kevin Grittner wrote: > On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas wrote: > > > The expression index case is the one to worry about; if there is a > > problem, that's where it is. What bothers me is that a function used > > in an expression index could do anything at all - it can read any > > table in the database. > > It *can*, but then you are lying to the database when you call it > IMMUTABLE. Such an index can easily become corrupted through > normal DML. Without DML the ANALYZE has no problem. So you seem > to be concerned that if someone is lying to the database engine to > force it accept a function as IMMUTABLE when it actually isn't, and > then updating the referenced rows (which is very likely to render > the index corrupted), that statistics might also become stale. We actually go quite some lengths to support this case, even when it's the opinion of many that we shouldn't. For example VACUUM doesn't try to find index entries using the values in each deleted tuple; instead we remember the TIDs and then scan the indexes (possibly many times) to find entries that match those TIDs -- which is much slower. Yet we do it this way to protect the case that somebody is doing the not-really-IMMUTABLE function. In other words, I don't think we consider the position you argued as acceptable. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas wrote: > The expression index case is the one to worry about; if there is a > problem, that's where it is. What bothers me is that a function used > in an expression index could do anything at all - it can read any > table in the database. It *can*, but then you are lying to the database when you call it IMMUTABLE. Such an index can easily become corrupted through normal DML. Without DML the ANALYZE has no problem. So you seem to be concerned that if someone is lying to the database engine to force it accept a function as IMMUTABLE when it actually isn't, and then updating the referenced rows (which is very likely to render the index corrupted), that statistics might also become stale. They might. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
Robert Haas wrote: > On Wed, Jun 15, 2016 at 2:45 PM, Alvaro Herrera > wrote: > > Maybe it is possible to get into trouble if you're taking a sample for > > an expression index. > > The expression index case is the one to worry about; if there is a > problem, that's where it is. What bothers me is that a function used > in an expression index could do anything at all - it can read any > table in the database. Hmm, but if it does that, then the code that actually implements that query would run the STO checks, right? The analyze code doesn't need to. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 2:45 PM, Alvaro Herrera wrote: > Robert Haas wrote: >> On Wed, Jun 15, 2016 at 1:44 PM, Kevin Grittner >> wrote: >> > The test I showed creates a situation which (to ANALYZE) is >> > identical to what you describe -- ANALYZE sees a page with an LSN >> > recent enough that it could have been (and actually has been) >> > pruned. Why would it be better for the ANALYZE to fail than to >> > complete? >> >> As I understand it, the reason we need to sometimes give "ERROR: >> snapshot too old" after early pruning is because we might otherwise >> give the wrong answer. > > So what constitutes "the wrong answer"? A regular transaction reading a > page and not finding a tuple that should have been there but was > removed, is a serious problem and should be aborted. For ANALYZE it may > not matter a great deal. Some very old tuple that might have been > chosen for the sample is not there; a different tuple is chosen instead, > so the stats might be slightly difference. No big deal. > > Maybe it is possible to get into trouble if you're taking a sample for > an expression index. The expression index case is the one to worry about; if there is a problem, that's where it is. What bothers me is that a function used in an expression index could do anything at all - it can read any table in the database. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 1:45 PM, Alvaro Herrera wrote: > Maybe it is possible to get into trouble if you're taking a sample for > an expression index. Maybe -- if you are using a function for an index expression which does an explicit SELECT against some database table rather than only using values from the row itself. In such a case you would have had to mark a function as IMMUTABLE which depends on table contents. I say you get to keep both pieces. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
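The scenario Kevin describes, a supposedly immutable index function that secretly reads another table, looks roughly like this (table, column, and function names are invented for illustration):

```sql
-- Falsely labeled IMMUTABLE: the result depends on table contents,
-- so the database's assumptions about the index are violated.
CREATE FUNCTION rate_for(rate_id int) RETURNS numeric
LANGUAGE sql IMMUTABLE AS
$$ SELECT rate FROM rates WHERE rates.id = $1 $$;

-- The index stores the looked-up values; later UPDATEs to "rates"
-- silently leave stale index entries behind. As Kevin puts it,
-- "you get to keep both pieces".
CREATE INDEX orders_rate_idx ON orders (rate_for(rate_id));
```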
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
Robert Haas wrote: > On Wed, Jun 15, 2016 at 1:44 PM, Kevin Grittner > wrote: > > The test I showed creates a situation which (to ANALYZE) is > > identical to what you describe -- ANALYZE sees a page with an LSN > > recent enough that it could have been (and actually has been) > > pruned. Why would it be better for the ANALYZE to fail than to > > complete? > > As I understand it, the reason we need to sometimes give "ERROR: > snapshot too old" after early pruning is because we might otherwise > give the wrong answer. So what constitutes "the wrong answer"? A regular transaction reading a page and not finding a tuple that should have been there but was removed, is a serious problem and should be aborted. For ANALYZE it may not matter a great deal. Some very old tuple that might have been chosen for the sample is not there; a different tuple is chosen instead, so the stats might be slightly different. No big deal. Maybe it is possible to get into trouble if you're taking a sample for an expression index. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 1:29 PM, Robert Haas wrote: > On Wed, Jun 15, 2016 at 1:44 PM, Kevin Grittner > wrote:>> So what happens in this scenario: >>> 1. ANALYZE runs really slowly - maybe the user-defined function it's >>> running for the expression index is extremely long-running. >>> 2. Eventually, the snapshot for ANALYZE is older than the configured >>> value of snapshot_too_old. >>> 3. Then, ANALYZE selects a page with an LSN new enough that it might >>> have been pruned. >>> >>> Presumably, the ANALYZE ought to error out in this scenario, just as >>> it would in any other situation where an old snapshot sees a new page. >>> No? >> >> The test I showed creates a situation which (to ANALYZE) is >> identical to what you describe -- ANALYZE sees a page with an LSN >> recent enough that it could have been (and actually has been) >> pruned. Why would it be better for the ANALYZE to fail than to >> complete? > > As I understand it, the reason we need to sometimes give "ERROR: > snapshot too old" after early pruning is because we might otherwise > give the wrong answer. > > Maybe I'm confused. In the scenario that you describe, ANALYZE is coming up with statistics to use in query planning, and the numbers are not expected to always be 100% accurate. I can think of conditions which might prevail when seeing the recent LSN. (1) The recent LSN is from a cause having nothing to do with the STO feature, like DML. As things stand, the behavior is the same as without the patch -- the rows are counted just the same as always. If we did as you suggest, we instead would abort ANALYZE and have stale statistics. (2) The recent LSN is related to STO pruning. The dead rows are gone forever, and will not be counted. This seems more correct than counting them, even if it were possible, and also superior to aborting the ANALYZE and leaving stale statistics. Of course, a subset of (1) is the case that the rows can be early-pruned at the next opportunity. 
In such a case ANALYZE is still counting them according to the rules that we had before the snapshot too old feature. If someone wants to argue that those rules are wrong, that seems like material for a separate patch. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 1:44 PM, Kevin Grittner wrote:>> So what happens in this scenario: >> 1. ANALYZE runs really slowly - maybe the user-defined function it's >> running for the expression index is extremely long-running. >> 2. Eventually, the snapshot for ANALYZE is older than the configured >> value of snapshot_too_old. >> 3. Then, ANALYZE selects a page with an LSN new enough that it might >> have been pruned. >> >> Presumably, the ANALYZE ought to error out in this scenario, just as >> it would in any other situation where an old snapshot sees a new page. >> No? > > The test I showed creates a situation which (to ANALYZE) is > identical to what you describe -- ANALYZE sees a page with an LSN > recent enough that it could have been (and actually has been) > pruned. Why would it be better for the ANALYZE to fail than to > complete? As I understand it, the reason we need to sometimes give "ERROR: snapshot too old" after early pruning is because we might otherwise give the wrong answer. Maybe I'm confused. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rename max_parallel_degree?
On 15/06/2016 18:14, Julien Rouhaud wrote: > On 15/06/2016 17:49, Robert Haas wrote: >> On Tue, Jun 14, 2016 at 7:10 AM, Julien Rouhaud >> wrote: I don't entirely like the new logic in RegisterDynamicBackgroundWorker. >>> I wonder if we can't drive this off of a couple of counters, instead of having the process registering the background worker iterate over every slot. [...] >> >> I think we should go that way. Some day we might try to make the >> process of finding a free slot more efficient than it is today; I'd >> rather not double down on linear search. >> > > Ok. > >> Are you going to update this patch? >> > > Sure, I'll post a new patch ASAP. > Attached v4 implements the design you suggested, I hope everything's ok. I didn't do anything for the statistics part, because I'm not sure having the counters separately is useful (instead of just having the current number of parallel workers), and if it's useful I'd find it strange to have these counters get reset at restart instead of being stored and accumulated as other stats, and that looks like too much for 9.6 material. I'd be happy to work on this though. -- Julien Rouhaud http://dalibo.com - http://dalibo.org diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index e0e5a1e..c661c7a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2018,6 +2018,23 @@ include_dir 'conf.d' + + max_parallel_workers (integer) + +max_parallel_workers configuration parameter + + + + + Sets the maximum number of workers that can be launched at the same + time for the whole server. This parameter allows the administrator to + reserve background worker slots for third-party dynamic background + workers. The default value is 4. Setting this value to 0 disables + parallel query execution. 
+ + + + backend_flush_after (integer) diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c index ab5ef25..b429474 100644 --- a/src/backend/access/transam/parallel.c +++ b/src/backend/access/transam/parallel.c @@ -453,7 +453,8 @@ LaunchParallelWorkers(ParallelContext *pcxt) snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d", MyProcPid); worker.bgw_flags = - BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION; + BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION + | BGWORKER_IS_PARALLEL_WORKER; worker.bgw_start_time = BgWorkerStart_ConsistentState; worker.bgw_restart_time = BGW_NEVER_RESTART; worker.bgw_main = ParallelWorkerMain; diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index cc8ba61..2bcd86b 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -719,9 +719,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel) } /* -* In no case use more than max_parallel_workers_per_gather workers. +* In no case use more than max_parallel_workers or +* max_parallel_workers_per_gather workers. */ - parallel_workers = Min(parallel_workers, max_parallel_workers_per_gather); + parallel_workers = Min(max_parallel_workers, Min(parallel_workers, + max_parallel_workers_per_gather)); /* If any limit was set to zero, the user doesn't want a parallel scan. 
*/ if (parallel_workers <= 0) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index e7f63f4..fd62126 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -113,6 +113,7 @@ int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE; Cost disable_cost = 1.0e10; +intmax_parallel_workers = 4; intmax_parallel_workers_per_gather = 2; bool enable_seqscan = true; diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 07b925e..66e65c8 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -246,8 +246,9 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams) IsUnderPostmaster && dynamic_shared_memory_type != DSM_IMPL_NONE && parse->commandType == CMD_SELECT && !parse->hasModifyingCTE && parse->utilityStmt == NULL && max_parallel_workers_per_gather > 0 && - !IsParallelWorker() && !IsolationIsSerializable() && - !has_parallel_hazard((Node *) parse, true); + max_parallel_workers > 0 && !IsParallelWorker() && + !IsolationIsSerializable() && !has_parallel_hazard((Node *) parse,
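Stripped of planner context, the clamping the patch adds to create_plain_partial_paths() can be read as the following standalone restatement (not the actual code path; function name and test values are illustrative):

```c
#include <assert.h>

/* Same Min() idiom PostgreSQL uses internally. */
#define Min(a,b) ((a) < (b) ? (a) : (b))

/*
 * The requested worker count is limited both by the per-gather cap
 * (max_parallel_workers_per_gather) and by the new server-wide cap
 * (max_parallel_workers); the smallest of the three wins.
 */
static int
clamp_parallel_workers(int requested, int per_gather_cap, int server_cap)
{
    return Min(server_cap, Min(requested, per_gather_cap));
}
```

With the defaults from the patch (per-gather cap 2, server-wide cap 4), a request for 8 workers is clamped to 2, and setting either cap to 0 disables a parallel scan entirely, which is what the `parallel_workers <= 0` check below the hunk relies on.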
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Wed, Jun 15, 2016 at 10:46 AM, Robert Haas wrote: > On Sat, Jun 11, 2016 at 11:29 AM, Kevin Grittner wrote: >> I have reviewed the code and run tests to try to find something >> here which could be considered a bug, without finding any problem. >> When reading pages for the random sample for ANALYZE (or >> auto-analyze) there is not an age check; so ANALYZE completes >> without error, keeping statistics up-to-date. >> >> There really is no difference in behavior except in the case that: >> >> (1) old_snapshot_threshold >= 0 to enable the "snapshot too old" >>feature, and >> (2) there were tuples that were dead as the result of completed >>transactions, and >> (3) those tuples became older than the threshold, and >> (4) those tuples were pruned or vacuumed away, and >> (5) an ANALYZE process would have read those dead tuples had they >>not been removed. >> >> In such a case the irrevocably dead, permanently removed tuples are >> not counted in the statistics. I have trouble seeing a better >> outcome than that. 
Among my tests, I specifically checked for an >> ANALYZE of a table having an index on an expression, using an old >> snapshot: >> >> -- connection 1 >> drop table if exists t1; >> create table t1 (c1 int not null); >> drop table if exists t2; >> create table t2 (c1 int not null); >> insert into t1 select generate_series(1, 1); >> drop function mysq(i int); >> create function mysq(i int) >> returns int >> language plpgsql >> immutable >> as $mysq$ >> begin >> return (i * i); >> end >> $mysq$; >> create index t1_c1sq on t1 ((mysq(c1))); >> begin transaction isolation level repeatable read; >> select 1; >> >> -- connection 2 >> vacuum analyze verbose t1; >> delete from t1 where c1 between 1000 and 1999; >> delete from t1 where c1 = 8000; >> insert into t2 values (1); >> select pg_sleep_for('2min'); >> vacuum verbose t1; -- repeat if necessary to see the dead rows >> disappear >> >> -- connection 1 >> analyze verbose t1; >> >> This runs to completion, as I would want and expect. >> >> I am closing this item on the "PostgreSQL 9.6 Open Items" page. If >> anyone feels that I've missed something, please provide a test to >> show the problem, or a clear description of the problem and how you >> feel behavior should be different. > > So what happens in this scenario: > > 1. ANALYZE runs really slowly - maybe the user-defined function it's > running for the expression index is extremely long-running. > 2. Eventually, the snapshot for ANALYZE is older than the configured > value of snapshot_too_old. > 3. Then, ANALYZE selects a page with an LSN new enough that it might > have been pruned. > > Presumably, the ANALYZE ought to error out in this scenario, just as > it would in any other situation where an old snapshot sees a new page. > No? The test I showed creates a situation which (to ANALYZE) is identical to what you describe -- ANALYZE sees a page with an LSN recent enough that it could have been (and actually has been) pruned. 
Why would it be better for the ANALYZE to fail than to complete? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_isready features
yup, it does for (1) :-) On Wed, Jun 15, 2016 at 9:53 AM Imre Samu wrote: > >Why I need it? There is tiny window between postgres being ready to > accept connections > >and final scripts which create initial user/database. > >Ideally having option to say "postgres is ready after specific user can > login to specific database" would be ideal. > > temporary - the official docker-postgres solution is not OK? > see : > https://github.com/docker-library/postgres/blob/master/9.5/docker-entrypoint.sh#L50 > > *# internal start of server in order to allow set-up using psql-client * > *# does not listen on external TCP/IP and waits until start finishes* > *gosu postgres pg_ctl -D "$PGDATA" -o "-c listen_addresses='localhost'" * > > *.* > > > more info: https://github.com/docker-library/postgres/issues/146 > > > 2016-06-15 18:09 GMT+02:00 Jimmy : > >> Not sure if this was discussed in the past and decided it does not belong >> to pg_isready utility >> >> I am using pg_isready in dockerized development environment as part of >> docker-compose. Simply say I have POSTGRES container and APP container. I >> use pg_isready inside app container and wait till postgres is ready >> to accept connections before app starts. >> >> There are two features which would make this a bit smoother and better. >> >> >> *1. enforce login* >> This could be optional and turned off by default. Basically if user >> supplies username/database as part of pg_isready call and the login fails >> (for whatever reason), pg_isready should fail. >> >> Why I need it? There is tiny window between postgres being ready to >> accept connections and final scripts which create initial user/database. >> Ideally having option to say "postgres is ready after specific user can >> login to specific database" would be ideal. Again, this can be optional and >> turned off by default. >> >> >> *2. retry* >> This is something I can do via unix script, but ideally it would be nice >> if there would be retry mechanism. 
For example if I say retry=30 then >> pg_isready would try 30x with x seconds pause in between and fail only >> after 30 retries. This could use default retry=0 (current behavior) >> >> >> thanks a lot! >> >> >> >
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
We need to reach a consensus here, since there is no way to say "I don't know". I'm inclined to agree with you that returning false is better in such a case. That will point the user to the source of the problem. Here is a patch: now the phrase operation returns false if there is no position information. If this behavior looks more reasonable, I'll commit it. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase_no_fallback.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
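The case under discussion can be reproduced with strip(), which discards position information from a tsvector (no expected results shown, since which answer is "right" is exactly what is being decided):

```sql
-- Positions present: the phrase operator can check adjacency normally.
SELECT phraseto_tsquery('simple', 'blue sky')
       @@ to_tsvector('simple', 'blue sky');

-- strip() removes positions; with the proposed patch the phrase
-- operator returns false here instead of guessing, signalling to the
-- user that the tsvector cannot support phrase search.
SELECT phraseto_tsquery('simple', 'blue sky')
       @@ strip(to_tsvector('simple', 'blue sky'));
```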
Re: [HACKERS] pg_isready features
>Why I need it? There is tiny window between postgres being ready to accept connections >and final scripts which create initial user/database. >Ideally having option to say "postgres is ready after specific user can login to specific database" would be ideal. temporary - the official docker-postgres solution is not OK? see : https://github.com/docker-library/postgres/blob/master/9.5/docker-entrypoint.sh#L50 *# internal start of server in order to allow set-up using psql-client * *# does not listen on external TCP/IP and waits until start finishes* *gosu postgres pg_ctl -D "$PGDATA" -o "-c listen_addresses='localhost'" * *.* more info: https://github.com/docker-library/postgres/issues/146 2016-06-15 18:09 GMT+02:00 Jimmy : > Not sure if this was discussed in the past and decided it does not belong > to pg_isready utility > > I am using pg_isready in dockerized development environment as part of > docker-compose. Simply say I have POSTGRES container and APP container. I > use pg_isready inside app container and wait till postgres is ready > to accept connections before app starts. > > There are two features which would make this a bit smoother and better. > > > *1. enforce login* > This could be optional and turned off by default. Basically if user > supplies username/database as part of pg_isready call and the login fails > (for whatever reason), pg_isready should fail. > > Why I need it? There is tiny window between postgres being ready to accept > connections and final scripts which create initial user/database. Ideally > having option to say "postgres is ready after specific user can login to > specific database" would be ideal. Again, this can be optional and turned > off by default. > > > *2. retry* > This is something I can do via unix script, but ideally it would be nice > if there would be retry mechanism. For example if I say retry=30 then > pg_isready would try 30x with x seconds pause in between and fail only > after 30 retries. 
This could use default retry=0 (current behavior) > > > thanks a lot! > > >
Re: [HACKERS] pg_isready features
On 06/15/2016 09:30 AM, David G. Johnston wrote: On Wed, Jun 15, 2016 at 12:09 PM, Jimmy *2. retry* This is something I can do via unix script, but ideally it would be nice if there would be retry mechanism. For example if I say retry=30 then pg_isready would try 30x with x seconds pause in between and fail only after 30 retries. This could use default retry=0 (current behavior) And the value of this instead of setting a timeout 30x higher is? Simplicity. JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_isready features
(1) I can (and do) use psql, pg_isready seems lighter and since it already supports credentials and DB name, I see very little harm for pg_isready to say back whether user logged IN or not using these credentials. (2) timeout is not the same - timeout does not retry, its a simple timeout in case connection hangs, try it On Wed, Jun 15, 2016 at 9:30 AM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Wed, Jun 15, 2016 at 12:09 PM, Jimmy wrote: > >> Not sure if this was discussed in the past and decided it does not belong >> to pg_isready utility >> >> I am using pg_isready in dockerized development environment as part of >> docker-compose. Simply say I have POSTGRES container and APP container. I >> use pg_isready inside app container and wait till postgres is ready >> to accept connections before app starts. >> >> There are two features which would make this a bit smoother and better. >> >> >> *1. enforce login* >> This could be optional and turned off by default. Basically if user >> supplies username/database as part of pg_isready call and the login fails >> (for whatever reason), pg_isready should fail. >> >> Why I need it? There is tiny window between postgres being ready to >> accept connections and final scripts which create initial user/database. >> Ideally having option to say "postgres is ready after specific user can >> login to specific database" would be ideal. Again, this can be optional and >> turned off by default. >> >> > It shouldn't have to enforce login if there is a way for the server to > communicate ready or not ready for login without requiring credentials to > actually be supplied. I guess it would be more effort and invasive. Are > you trying to avoid psql here? > > >> >> *2. retry* >> This is something I can do via unix script, but ideally it would be nice >> if there would be retry mechanism. For example if I say retry=30 then >> pg_isready would try 30x with x seconds pause in between and fail only >> after 30 retries. 
This could use default retry=0 (current behavior) >> >> > And the value of this instead of setting a timeout 30x higher is? > > >
Re: [HACKERS] pg_isready features
On Wed, Jun 15, 2016 at 12:09 PM, Jimmy wrote: > Not sure if this was discussed in the past and decided it does not belong > to pg_isready utility > > I am using pg_isready in dockerized development environment as part of > docker-compose. Simply say I have POSTGRES container and APP container. I > use pg_isready inside app container and wait till postgres is ready > to accept connections before app starts. > > There are two features which would make this a bit smoother and better. > > > *1. enforce login* > This could be optional and turned off by default. Basically if user > supplies username/database as part of pg_isready call and the login fails > (for whatever reason), pg_isready should fail. > > Why I need it? There is tiny window between postgres being ready to accept > connections and final scripts which create initial user/database. Ideally > having option to say "postgres is ready after specific user can login to > specific database" would be ideal. Again, this can be optional and turned > off by default. > > It shouldn't have to enforce login if there is a way for the server to communicate ready or not ready for login without requiring credentials to actually be supplied. I guess it would be more effort and invasive. Are you trying to avoid psql here? > > *2. retry* > This is something I can do via unix script, but ideally it would be nice > if there would be retry mechanism. For example if I say retry=30 then > pg_isready would try 30x with x seconds pause in between and fail only > after 30 retries. This could use default retry=0 (current behavior) > > And the value of this instead of setting a timeout 30x higher is?
Re: [HACKERS] An extra error for client disconnection on Windows
On Thu, Jun 2, 2016 at 4:51 AM, Kyotaro HORIGUCHI wrote: > After a process termination without PQfinish() of a client, > server emits the following log message not seen on Linux boxes. > >> LOG: could not receive data from client: An existing connection was >> forcibly closed by the remote host. > > This is because pgwin32_recv returns an ECONNRESET error for this > situation instead of returning a non-error EOF as recv(2) does. > > This patch translates WSAECONNRESET of WSARecv to an EOF so that > pgwin32_recv behaves the same way as on Linux. > > The attached patch does this. Please add this to the next CommitFest so it gets reviewed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
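The mapping being proposed can be sketched in isolation: a receive that fails with WSAECONNRESET is reported as a zero-byte read (EOF), the way recv(2) reports a cleanly closed peer on Unix. This is only an illustration of the translation, not the actual pgwin32_recv code; the value 10054 is Winsock's WSAECONNRESET.

```c
#include <assert.h>

/* Winsock's WSAECONNRESET; defined here so the sketch compiles anywhere. */
#ifndef WSAECONNRESET
#define WSAECONNRESET 10054
#endif

/*
 * Map a failed receive to the Unix convention: a peer that vanished with a
 * connection reset is treated as end-of-stream (0 bytes read) instead of an
 * error, so callers see an ordinary EOF rather than logging a failure,
 * exactly as when the client closed the socket cleanly.
 */
static int
translate_recv_result(int rc, int wsa_error)
{
    if (rc < 0 && wsa_error == WSAECONNRESET)
        return 0;               /* pretend the peer closed cleanly: EOF */
    return rc;                  /* data received, or some other error */
}
```

Other Winsock errors (e.g. WSAECONNABORTED) would still surface as errors; only the reset-by-peer case is folded into EOF.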
Re: [HACKERS] Rename max_parallel_degree?
On 15/06/2016 17:49, Robert Haas wrote: > On Tue, Jun 14, 2016 at 7:10 AM, Julien Rouhaud > wrote: >>> I don't entirely like the new logic in >>> RegisterDynamicBackgroundWorker. >> >>> I wonder if we can't drive this off >>> of a couple of counters, instead of having the process registering the >>> background worker iterate over every slot. [...] >>> > > I think we should go that way. Some day we might try to make the > process of finding a free slot more efficient than it is today; I'd > rather not double down on linear search. > Ok. > Are you going to update this patch? > Sure, I'll post a new patch ASAP. -- Julien Rouhaud http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_isready features
Not sure if this was discussed in the past and decided that it does not belong in the pg_isready utility. I am using pg_isready in a dockerized development environment as part of docker-compose. Simply put, I have a POSTGRES container and an APP container. I use pg_isready inside the app container to wait until postgres is ready to accept connections before the app starts. There are two features which would make this a bit smoother and better. *1. enforce login* This could be optional and turned off by default. Basically, if the user supplies a username/database as part of the pg_isready call and the login fails (for whatever reason), pg_isready should fail. Why do I need it? There is a tiny window between postgres being ready to accept connections and the final scripts which create the initial user/database. Having an option to say "postgres is ready once a specific user can log in to a specific database" would be ideal. Again, this can be optional and turned off by default. *2. retry* This is something I can do via a unix script, but ideally it would be nice if there were a retry mechanism. For example, if I say retry=30 then pg_isready would try 30 times with an x-second pause in between and fail only after 30 retries. This could use a default of retry=0 (the current behavior). thanks a lot!
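The requested retry behaviour is simple enough to sketch. The helper below is hypothetical (pg_isready has no such option today); it re-runs a readiness check up to a fixed number of attempts with a pause in between, which is the same loop one would otherwise write in a shell script around pg_isready. The check is an injected callback so the loop can be exercised on its own; a real implementation would call libpq's PQping() there.

```c
#include <unistd.h>

/*
 * A stand-in readiness probe for the sketch: reports "ready" once the
 * counter that `arg` points to has counted down to zero.  A real pg_isready
 * would invoke its existing PQping()-based probe instead.
 */
static int
stub_check(void *arg)
{
    int *failures_left = (int *) arg;

    if (*failures_left > 0)
    {
        (*failures_left)--;
        return 0;               /* not ready yet */
    }
    return 1;                   /* ready */
}

/*
 * Run check(arg) until it reports ready (non-zero) or `retries` attempts
 * have been made, sleeping `interval_secs` between attempts.  Returns 1 on
 * success, 0 if every attempt failed -- i.e. the proposed retry=N option.
 */
static int
retry_until_ready(int (*check)(void *), void *arg,
                  int retries, unsigned interval_secs)
{
    for (int attempt = 1; attempt <= retries; attempt++)
    {
        if (check(arg))
            return 1;
        if (attempt < retries)
            sleep(interval_secs);   /* no pause after the final failure */
    }
    return 0;
}
```

The difference from a plain 30x-larger timeout is visible here: each attempt is a fresh connection check, so a refused connection (server not yet listening) is retried rather than reported immediately.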
Re: [HACKERS] Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
What about a word with several infinitives? select to_tsvector('en', 'leavings'); to_tsvector 'leave':1 'leavings':1 (1 row) select to_tsvector('en', 'leavings') @@ 'leave <0> leavings'::tsquery; ?column? -- t (1 row) The second example is not correct: select phraseto_tsquery('en', 'leavings') will produce 'leave | leavings' and select phraseto_tsquery('en', 'leavings cats') will produce 'leave <-> cat | leavings <-> cat', which seems correct, and we don't need any special treatment of <0>. This brings up something else that I am not very sold on: to wit, do we really want the "less than or equal" distance behavior at all? The documentation gives the example that phraseto_tsquery('cat ate some rats') produces ( 'cat' <-> 'ate' ) <2> 'rat' because "some" is a stopword. However, that pattern will also match "cat ate rats", which seems surprising and unexpected to me; certainly it would surprise a user who did not realize that "some" is a stopword. So I think there's a reasonable case for decreeing that <N> should only match lexemes *exactly* N apart. If we did that, we would no longer have the misbehavior that Jean-Pierre is complaining about, and we'd not need to argue about whether <0> needs to be treated specially. Agreed, that seems easy to change. I thought that I saw an issue with hyphenated words but, fortunately, I had forgotten that hyphenated words don't share a position: # select to_tsvector('foo-bar'); to_tsvector - 'bar':3 'foo':2 'foo-bar':1 # select phraseto_tsquery('foo-bar'); phraseto_tsquery --- ( 'foo-bar' <-> 'foo' ) <-> 'bar' and # select to_tsvector('foo-bar') @@ phraseto_tsquery('foo-bar'); ?column? -- t Patch is attached. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ phrase_exact_distance.patch Description: binary/octet-stream
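The semantic question in this message reduces to one comparison. In 'cat ate some rats' the lexeme positions are cat=1, ate=2, rat=4 (the stopword "some" holds position 3), so the generated query ends in <2> 'rat'. Whether that query also accepts "cat ate rats" (rat at position 3) depends on whether <N> means "at most N apart" or "exactly N apart". A minimal sketch of the two rules, with only the position arithmetic and none of the real tsquery machinery:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Does lexeme B at position pos_b satisfy 'A <dist> B' relative to lexeme A
 * at position pos_a?  `exact` selects the proposed semantics (exactly dist
 * positions apart) instead of the "less than or equal" behaviour under
 * discussion.
 */
static bool
phrase_dist_matches(int pos_a, int pos_b, int dist, bool exact)
{
    int d = pos_b - pos_a;

    if (d < 0)
        return false;           /* B must not precede A */
    return exact ? (d == dist) : (d <= dist);
}
```

Under the exact rule the surprising match disappears: "cat ate rats" puts 'rat' only one position after 'ate', which no longer satisfies <2>.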
Re: [HACKERS] 10.0
On 6/15/16 9:04 AM, Merlin Moncure wrote: We could stand to be more explicit here. The docs for version() >> > indicated >> > the server_version_num should be used for "machine processing". On a somewhat related note, any objection to adding server_version_num to pg_config? It's common to need to know what version you're handling in a Makefile, and today that's pretty ugly (especially when something is stamped as beta, since it breaks assumptions about numeric). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rename max_parallel_degree?
On Tue, Jun 14, 2016 at 7:10 AM, Julien Rouhaud wrote: >> I don't entirely like the new logic in >> RegisterDynamicBackgroundWorker. > > I'm not that happy with it either. We can avoid iterating over every slot > if the feature isn't activated though (max_parallel_workers >= > max_worker_processes). > >> I wonder if we can't drive this off >> of a couple of counters, instead of having the process registering the >> background worker iterate over every slot. Suppose we add two >> counters to BackgroundWorkerArray, parallel_register_count and >> parallel_terminate_count. Whenever a backend successfully registers a >> parallel worker, it increments parallel_register_count. Whenever the >> postmaster marks a parallel worker slot as no longer in use, it >> increments parallel_terminate_count. Then, the number of active >> parallel workers is just parallel_register_count - >> parallel_terminate_count. (We can't have the postmaster and the >> backends share the same counter, because then it would need locking, >> and the postmaster can't try to take spinlocks - can't even use >> atomics, because those might be emulated using spinlocks.) >> > > I wanted to maintain counters at first, but it seemed more invasive, and > I thought that max_parallel_workers would be useful in environments > where lots of parallel workers and dynamic workers are used, so > finding a free slot would require iterating over most of the slots most > of the time anyway. I'm of course also ok with maintaining counters. I think we should go that way. Some day we might try to make the process of finding a free slot more efficient than it is today; I'd rather not double down on linear search. Are you going to update this patch? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
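The counter scheme sketched in the mail — derive the number of active parallel workers from two monotonically increasing counters instead of scanning every slot — can be illustrated without any of the shared-memory plumbing. The struct and field names below follow the discussion but are otherwise illustrative; note that unsigned subtraction keeps the difference correct even after either counter wraps around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the two counters proposed for BackgroundWorkerArray. */
typedef struct
{
    uint32_t parallel_register_count;   /* bumped by a backend on register */
    uint32_t parallel_terminate_count;  /* bumped by postmaster on slot free */
} WorkerCounters;

/*
 * Active workers: register count minus terminate count.  Unsigned
 * arithmetic makes this correct modulo 2^32, so counter wraparound is
 * harmless as long as the two counters never drift more than 2^32 apart.
 */
static uint32_t
active_parallel_workers(const WorkerCounters *c)
{
    return c->parallel_register_count - c->parallel_terminate_count;
}

/*
 * A new parallel worker may be registered while under the limit -- an O(1)
 * check, with no need to iterate over the slot array.
 */
static bool
can_register_parallel_worker(const WorkerCounters *c,
                             uint32_t max_parallel_workers)
{
    return active_parallel_workers(c) < max_parallel_workers;
}
```

Keeping the counters separate, as the mail notes, is what lets the backends and the postmaster each write only their own counter without shared locking.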
Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold
On Sat, Jun 11, 2016 at 11:29 AM, Kevin Grittner wrote: > I have reviewed the code and run tests to try to find something > here which could be considered a bug, without finding any problem. > When reading pages for the random sample for ANALYZE (or > auto-analyze) there is not an age check; so ANALYZE completes > without error, keeping statistics up-to-date. > > There really is no difference in behavior except in the case that: > > (1) old_snapshot_threshold >= 0 to enable the "snapshot too old" >feature, and > (2) there were tuples that were dead as the result of completed >transactions, and > (3) those tuples became older than the threshold, and > (4) those tuples were pruned or vacuumed away, and > (5) an ANALYZE process would have read those dead tuples had they >not been removed. > > In such a case the irrevocably dead, permanently removed tuples are > not counted in the statistics. I have trouble seeing a better > outcome than that. Among my tests, I specifically checked for an > ANALYZE of a table having an index on an expression, using an old > snapshot: > > -- connection 1 > drop table if exists t1; > create table t1 (c1 int not null); > drop table if exists t2; > create table t2 (c1 int not null); > insert into t1 select generate_series(1, 1); > drop function mysq(i int); > create function mysq(i int) > returns int > language plpgsql > immutable > as $mysq$ > begin > return (i * i); > end > $mysq$; > create index t1_c1sq on t1 ((mysq(c1))); > begin transaction isolation level repeatable read; > select 1; > > -- connection 2 > vacuum analyze verbose t1; > delete from t1 where c1 between 1000 and 1999; > delete from t1 where c1 = 8000; > insert into t2 values (1); > select pg_sleep_for('2min'); > vacuum verbose t1; -- repeat if necessary to see the dead rows > disappear > > -- connection 1 > analyze verbose t1; > > This runs to completion, as I would want and expect. > > I am closing this item on the "PostgreSQL 9.6 Open Items" page. 
If > anyone feels that I've missed something, please provide a test to > show the problem, or a clear description of the problem and how you > feel behavior should be different. So what happens in this scenario: 1. ANALYZE runs really slowly - maybe the user-defined function it's running for the expression index is extremely long-running. 2. Eventually, the snapshot for ANALYZE is older than the configured value of snapshot_too_old. 3. Then, ANALYZE selects a page with an LSN new enough that it might have been pruned. Presumably, the ANALYZE ought to error out in this scenario, just as it would in any other situation where an old snapshot sees a new page. No? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Should pg_export_snapshot() and currtid() be tagged parallel-unsafe?
On Tue, Jun 14, 2016 at 5:01 PM, Andreas Seltenreich wrote: > Digging through the sqlsmith logging db, I noticed the following errors: > > ERROR: cannot update SecondarySnapshot during a parallel operation > ERROR: cannot assign XIDs during a parallel operation > > Queries raising the first one always contain calls to currtid() or > currtid2(). Queries raising the second one always contain a call to > pg_export_snapshot(). Patch to tag them as unsafe attached. Thanks, committed. None of these are exercised by 'make check', or this would have been caught sooner. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
On Wed, Jun 15, 2016 at 03:02:15PM +0300, Teodor Sigaev wrote: > On Wed, Jun 15, 2016 at 02:54:33AM -0400, Noah Misch wrote: > > On Mon, Jun 13, 2016 at 10:44:06PM -0400, Noah Misch wrote: > > > On Fri, Jun 10, 2016 at 03:10:40AM -0400, Noah Misch wrote: > > > > [Action required within 72 hours. This is a generic notification.] > > > > > > > > The above-described topic is currently a PostgreSQL 9.6 open item. > > > > Teodor, > > > > since you committed the patch believed to have created it, you own this > > > > open > > > > item. If some other commit is more relevant or if this does not belong > > > > as a > > > > 9.6 open item, please let us know. Otherwise, please observe the > > > > policy on > > > > open item ownership[1] and send a status update within 72 hours of this > > > > message. Include a date for your subsequent status update. Testers may > > > > discover new open items at any time, and I want to plan to get them all > > > > fixed > > > > well in advance of shipping 9.6rc1. Consequently, I will appreciate > > > > your > > > > efforts toward speedy resolution. Thanks. > > > > > > > > [1] > > > > http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com > > > > > > This PostgreSQL 9.6 open item is past due for your status update. Kindly > > > send > > > a status update within 24 hours, and include a date for your subsequent > > > status > > > update. Refer to the policy on open item ownership: > > > http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com > > > >IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 9.6 open item is long past due > >for your status update. Please reacquaint yourself with the policy on open > >item ownership[1] and then reply immediately. If I do not hear from you by > >2016-06-16 07:00 UTC, I will transfer this item to release management team > >ownership without further notice. 
> > > >[1] > >http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com > > I'm working on it right now. That is good news, but it is not a valid status update. In particular, it does not specify a date for your next update. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 9:59 AM, Amit Kapila wrote: >> That just kicks the can down the road. Then you have PD_ALL_VISIBLE >> clear but the VM bit is still set. > > I mean to say clear both as we are doing currently in code: > if (PageIsAllVisible(BufferGetPage(buffer))) > { > all_visible_cleared = true; > PageClearAllVisible(BufferGetPage(buffer)); > visibilitymap_clear(relation, BufferGetBlockNumber(buffer), > vmbuffer); > } Sure, but without emitting a WAL record, that's just broken. You could have the heap page get flushed to disk and the VM page not get flushed to disk, and then crash, and now you have the classic VM corruption scenario. >> And you still haven't WAL-logged >> anything. > > Yeah, I think WAL requirement is more difficult to meet and I think > releasing the lock on buffer before writing WAL could lead to flush of such > a buffer before WAL. > > I feel this is an existing-bug and should go to Older Bugs Section in open > items page. It does seem to be an existing bug, but the freeze map makes the problem more serious, I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 10:03 AM, Masahiko Sawada wrote: >> I'm not sure what to do about this: this part of the heap_update() >> logic has been like this forever, and I assume that if it were easy to >> refactor this away, somebody would have done it by now. > > How about changing collect_corrupt_items to acquire > AccessExclusiveLock for safely checking? Well, that would make it a lot less likely for pg_check_{visible,frozen} to detect the bug in heap_update(), but it wouldn't fix the bug in heap_update(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety tagging of extension functions
On 06/14/2016 07:51 PM, Robert Haas wrote: On Tue, Jun 14, 2016 at 6:37 AM, Andreas Karlsson wrote: I have rebased all my patches on the current master now (and skipped the extensions I previously listed). Thanks, this is really helpful. It was starting to get hard to keep track of what hadn't been applied yet. I decided to prioritize getting committed the patches where the extension version had already been bumped by 749a787c5b25ae33b3d4da0ef12aa05214aa73c7, so I've now committed the patches for cube, hstore, intarray, ltree, pg_trgm, and seg. However, in pg_trgm, I changed some of the functions that you had marked as PARALLEL RESTRICTED to be PARALLEL SAFE, because I didn't see any reason why they needed to be PARALLEL RESTRICTED. It's OK for a parallel-safe function to depend on GUC values, because those are synchronized from the leader to all worker processes. Random global variables won't necessarily be kept in sync, but anything controlled by the GUC mechanism will be. If there's some other reason you think those functions aren't parallel-safe, please let me know. Nah, this is a leftover from before I realized that GUCs are safe. I thought I went through all the code and fixed this misunderstanding. Thanks for spotting this. I'm still in favor of rejecting the adminpack patch. To me, that just seems like attaching a larger magazine to the gun pointed at your foot. I can't deny that in a strict sense those functions are parallel-safe, but I think they are better left alone. Making them parallel restricted should prevent them from being a footgun, but I also do not see any huge benefit from doing so (while making them safe seems dangerous). I do not care either way. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 9:56 PM, Robert Haas wrote: > On Wed, Jun 15, 2016 at 2:41 AM, Thomas Munro > wrote: >> I spent some time chasing down the exact circumstances. I suspect >> that there may be an interlocking problem in heap_update. Using the >> line numbers from cae1c788 [1], I see the following interaction >> between the VACUUM, UPDATE and SELECT (pg_check_visible) backends, all >> in reference to the same block number: >> >> [VACUUM] sets all visible bit >> >> [UPDATE] heapam.c:3931 HeapTupleHeaderSetXmax(oldtup.t_data, >> xmax_old_tuple); >> [UPDATE] heapam.c:3938 LockBuffer(buffer, BUFFER_LOCK_UNLOCK); >> >> [SELECT] LockBuffer(buffer, BUFFER_LOCK_SHARE); >> [SELECT] observes VM_ALL_VISIBLE as true >> [SELECT] observes tuple in HEAPTUPLE_DELETE_IN_PROGRESS state >> [SELECT] barfs >> >> [UPDATE] heapam.c:4116 visibilitymap_clear(...) > > Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2, > and CTID without logging anything or clearing the all-visible flag and > then releases the lock on the heap page to go do some more work that > might even ERROR out. Only if that other work all goes OK do we > relock the page and perform the WAL-logged actions. > > That doesn't seem like a good idea even in existing releases, because > you've taken a tuple on an all-visible page and made it not > all-visible, and you've made a page modification that is not > necessarily atomic without logging it. This is is particularly bad in > 9.6, because if that page is also all-frozen then XMAX will eventually > be pointing into space and VACUUM will never visit the page to > re-freeze it the way it would have done in earlier releases. However, > even in older releases, I think there's a remote possibility of data > corruption. Backend #1 makes these changes to the page, releases the > lock, and errors out. Backend #2 writes the page to the OS. DBA > takes a hot backup, tearing the page in the middle of XMAX. Oops. 
> > I'm not sure what to do about this: this part of the heap_update() > logic has been like this forever, and I assume that if it were easy to > refactor this away, somebody would have done it by now. > How about changing collect_corrupt_items to acquire AccessExclusiveLock for safely checking? Regards, -- Masahiko Sawada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 10.0
On Wed, Jun 15, 2016 at 8:59 AM, David G. Johnston wrote: > On Wed, Jun 15, 2016 at 9:38 AM, Merlin Moncure wrote: >> >> On Tue, Jun 14, 2016 at 5:48 PM, David G. Johnston >> wrote: >> > On Tue, Jun 14, 2016 at 5:58 PM, Merlin Moncure >> > wrote: >> >> >> >> On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby >> >> wrote: >> >> > On 6/14/16 3:56 PM, Tom Lane wrote: >> >> >> >> >> >> Jim Nasby writes: >> >> >>> >> >> >>> On 6/14/16 3:01 PM, Robert Haas wrote: >> >> >> >> This seems kind of silly, because anybody who is writing code that >> >> might have to run against an existing version of the database >> >> won't >> >> be >> >> able to use it. The one thing that absolutely has to be >> >> cross-version >> >> is the method of determining which version you're running against. >> >> >> >> >> >> >> >> >>> We're talking about a function that doesn't currently exist anyway. >> >> >> >> >> >> >> >> >> Huh? We're talking about PQserverVersion(), comparisons to >> >> >> PG_VERSION_NUM, >> >> >> and related APIs. Those most certainly exist now, and trying to >> >> >> supplant >> >> >> them seems like a giant waste of time. >> >> >> >> >> >> On the other hand, parsing fields out of version() mechanically has >> >> >> been >> >> >> deprecated for as long as those other APIs have existed (which is >> >> >> since >> >> >> 8.0 or so). version() is only meant for human consumption, so I see >> >> >> no >> >> >> reason it shouldn't just start returning "10.0", "10.1", etc. If >> >> >> that >> >> >> breaks anyone's code, well, they should have switched to one of the >> >> >> easier methods years ago. >> >> > >> >> > >> >> > The original post was: >> >> > >> >> >> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= >> >> >> 9.3 >> >> > >> >> > and \df *version* on my HEAD doesn't show any other options. >> >> >> >> Right. It's the only way to handle things on the SQL level well, >> >> that, and pg_settings approaches. In other words, there is no easier >> >> way. 
I think it's pretty reasonable to assume there's a lot more code >> >> interfacing with the database from SQL than from C. >> > >> > >> > We could stand to be more explicit here. The docs for version() >> > indicated >> > the server_version_num should be used for "machine processing". >> >> whoop! I didn't know about that setting. I guess it dismantles a lot >> of the case for more aggressive action. That said, maybe it's a good >> idea to construct the versioning change so that 10.x releases have a >> server_version_num > 9.x releases and leave the other functions alone >> (assuming that wasn't already the plan). >> >> > Option E: Give the DBA control. If they know they have one or more >> > mis-behaving applications but it is out their control to patch the code >> > to >> > work properly they can change this supposedly human-readable output to >> > conform the historical x.y.z format. This would disabled by default. >> > Humans can easily interpret both versions so please save the comment >> > about >> > not having GUCs that influence user-visible behavior. If your argument >> > for >> > changing the format outright is "its for human consumption only" then >> > apparently this output should not be considered important enough to >> > adhere >> > to that rule. Non-humans depending on its format are subject to our, or >> > the >> > DBA's, whims. >> >> Nah -- my argument could be restated as "I wasn't aware of the machine >> variant of the version #". Do you think it's a good idea to have the >> machine version number be 10 for version 10.0? What would 10.1 >> be? 100100 or 11? > > > Sleeping on it I too came to the conclusion that the GUC was largely an > undesirable option. > > IIRC the plan is to have the machine version behave as if the middle number > is present and always zero. It would be (the?) one place that the > historical behavior remains visible but it is impossible to have a totally > clean break. 
> > Tom's comment back on May 14th (different thread) was that we'd basically > redefine the minor portion to be 4-digits instead of 2. That makes sense -- I'm good then. Thanks for engaging merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 10.0
On Wed, Jun 15, 2016 at 9:38 AM, Merlin Moncure wrote: > On Tue, Jun 14, 2016 at 5:48 PM, David G. Johnston > wrote: > > On Tue, Jun 14, 2016 at 5:58 PM, Merlin Moncure > wrote: > >> > >> On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby > >> wrote: > >> > On 6/14/16 3:56 PM, Tom Lane wrote: > >> >> > >> >> Jim Nasby writes: > >> >>> > >> >>> On 6/14/16 3:01 PM, Robert Haas wrote: > >> > >> This seems kind of silly, because anybody who is writing code that > >> might have to run against an existing version of the database won't > >> be > >> able to use it. The one thing that absolutely has to be > >> cross-version > >> is the method of determining which version you're running against. > >> >> > >> >> > >> >>> We're talking about a function that doesn't currently exist anyway. > >> >> > >> >> > >> >> Huh? We're talking about PQserverVersion(), comparisons to > >> >> PG_VERSION_NUM, > >> >> and related APIs. Those most certainly exist now, and trying to > >> >> supplant > >> >> them seems like a giant waste of time. > >> >> > >> >> On the other hand, parsing fields out of version() mechanically has > >> >> been > >> >> deprecated for as long as those other APIs have existed (which is > since > >> >> 8.0 or so). version() is only meant for human consumption, so I see > no > >> >> reason it shouldn't just start returning "10.0", "10.1", etc. If > that > >> >> breaks anyone's code, well, they should have switched to one of the > >> >> easier methods years ago. > >> > > >> > > >> > The original post was: > >> > > >> >> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3 > >> > > >> > and \df *version* on my HEAD doesn't show any other options. > >> > >> Right. It's the only way to handle things on the SQL level well, > >> that, and pg_settings approaches. In other words, there is no easier > >> way. I think it's pretty reasonable to assume there's a lot more code > >> interfacing with the database from SQL than from C. 
> > > > > > We could stand to be more explicit here. The docs for version() > indicated > > the server_version_num should be used for "machine processing". > > whoop! I didn't know about that setting. I guess it dismantles a lot > of the case for more aggressive action. That said, maybe it's a good > idea to construct the versioning change so that 10.x releases have a > server_version_num > 9.x releases and leave the other functions alone > (assuming that wasn't already the plan). > > > Option E: Give the DBA control. If they know they have one or more > > mis-behaving applications but it is out their control to patch the code > to > > work properly they can change this supposedly human-readable output to > > conform the historical x.y.z format. This would disabled by default. > > Humans can easily interpret both versions so please save the comment > about > > not having GUCs that influence user-visible behavior. If your argument > for > > changing the format outright is "its for human consumption only" then > > apparently this output should not be considered important enough to > adhere > > to that rule. Non-humans depending on its format are subject to our, or > the > > DBA's, whims. > > Nah -- my argument could be restated as "I wasn't aware of the machine > variant of the version #". Do you think it's a good idea to have the > machine version number be 10 for version 10.0? What would 10.1 > be? 100100 or 11? Sleeping on it I too came to the conclusion that the GUC was largely an undesirable option. IIRC the plan is to have the machine version behave as if the middle number is present and always zero. It would be (the?) one place that the historical behavior remains visible but it is impossible to have a totally clean break. Tom's comment back on May 14th (different thread) was that we'd basically redefine the minor portion to be 4-digits instead of 2. David J.
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 7:13 PM, Robert Haas wrote: > > On Wed, Jun 15, 2016 at 9:43 AM, Amit Kapila wrote: > > On Wed, Jun 15, 2016 at 6:26 PM, Robert Haas wrote: > >> On Wed, Jun 15, 2016 at 2:41 AM, Thomas Munro > >> wrote: > >> > I spent some time chasing down the exact circumstances. I suspect > >> > that there may be an interlocking problem in heap_update. Using the > >> > line numbers from cae1c788 [1], I see the following interaction > >> > between the VACUUM, UPDATE and SELECT (pg_check_visible) backends, all > >> > in reference to the same block number: > >> > > >> > [VACUUM] sets all visible bit > >> > > >> > [UPDATE] heapam.c:3931 HeapTupleHeaderSetXmax(oldtup.t_data, > >> > xmax_old_tuple); > >> > [UPDATE] heapam.c:3938 LockBuffer(buffer, BUFFER_LOCK_UNLOCK); > >> > > >> > [SELECT] LockBuffer(buffer, BUFFER_LOCK_SHARE); > >> > [SELECT] observes VM_ALL_VISIBLE as true > >> > [SELECT] observes tuple in HEAPTUPLE_DELETE_IN_PROGRESS state > >> > [SELECT] barfs > >> > > >> > [UPDATE] heapam.c:4116 visibilitymap_clear(...) > >> > >> Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2, > >> and CTID without logging anything or clearing the all-visible flag and > >> then releases the lock on the heap page to go do some more work that > >> might even ERROR out. > > > > Can't we clear the all-visible flag before releasing the lock? We can use > > logic of already_marked as it is currently used in code to clear it just > > once. > > That just kicks the can down the road. Then you have PD_ALL_VISIBLE > clear but the VM bit is still set. I mean to say clear both as we are doing currently in code: if (PageIsAllVisible(BufferGetPage(buffer))) { all_visible_cleared = true; PageClearAllVisible(BufferGetPage(buffer)); visibilitymap_clear(relation, BufferGetBlockNumber(buffer), vmbuffer); } > > And you still haven't WAL-logged > anything. 
Yeah, I think the WAL requirement is more difficult to meet: releasing the lock on the buffer before writing WAL could allow such a buffer to be flushed before its WAL. I feel this is an existing bug and should go to the Older Bugs section of the open items page. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 9:43 AM, Amit Kapila wrote: > On Wed, Jun 15, 2016 at 6:26 PM, Robert Haas wrote: >> On Wed, Jun 15, 2016 at 2:41 AM, Thomas Munro >> wrote: >> > I spent some time chasing down the exact circumstances. I suspect >> > that there may be an interlocking problem in heap_update. Using the >> > line numbers from cae1c788 [1], I see the following interaction >> > between the VACUUM, UPDATE and SELECT (pg_check_visible) backends, all >> > in reference to the same block number: >> > >> > [VACUUM] sets all visible bit >> > >> > [UPDATE] heapam.c:3931 HeapTupleHeaderSetXmax(oldtup.t_data, >> > xmax_old_tuple); >> > [UPDATE] heapam.c:3938 LockBuffer(buffer, BUFFER_LOCK_UNLOCK); >> > >> > [SELECT] LockBuffer(buffer, BUFFER_LOCK_SHARE); >> > [SELECT] observes VM_ALL_VISIBLE as true >> > [SELECT] observes tuple in HEAPTUPLE_DELETE_IN_PROGRESS state >> > [SELECT] barfs >> > >> > [UPDATE] heapam.c:4116 visibilitymap_clear(...) >> >> Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2, >> and CTID without logging anything or clearing the all-visible flag and >> then releases the lock on the heap page to go do some more work that >> might even ERROR out. > > Can't we clear the all-visible flag before releasing the lock? We can use > logic of already_marked as it is currently used in code to clear it just > once. That just kicks the can down the road. Then you have PD_ALL_VISIBLE clear but the VM bit is still set. And you still haven't WAL-logged anything. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 6:26 PM, Robert Haas wrote: > > On Wed, Jun 15, 2016 at 2:41 AM, Thomas Munro > wrote: > > I spent some time chasing down the exact circumstances. I suspect > > that there may be an interlocking problem in heap_update. Using the > > line numbers from cae1c788 [1], I see the following interaction > > between the VACUUM, UPDATE and SELECT (pg_check_visible) backends, all > > in reference to the same block number: > > > > [VACUUM] sets all visible bit > > > > [UPDATE] heapam.c:3931 HeapTupleHeaderSetXmax(oldtup.t_data, xmax_old_tuple); > > [UPDATE] heapam.c:3938 LockBuffer(buffer, BUFFER_LOCK_UNLOCK); > > > > [SELECT] LockBuffer(buffer, BUFFER_LOCK_SHARE); > > [SELECT] observes VM_ALL_VISIBLE as true > > [SELECT] observes tuple in HEAPTUPLE_DELETE_IN_PROGRESS state > > [SELECT] barfs > > > > [UPDATE] heapam.c:4116 visibilitymap_clear(...) > > Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2, > and CTID without logging anything or clearing the all-visible flag and > then releases the lock on the heap page to go do some more work that > might even ERROR out. > Can't we clear the all-visible flag before releasing the lock? We can use logic of already_marked as it is currently used in code to clear it just once. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_dump vs. idle_in_transaction_session_timeout
Bernd Helmle writes: > Currently pg_dump doesn't turn off idle_in_transaction_session_timeout. Definitely an oversight, since it turns off other timeout settings. > Okay, the window of failure here is very narrow (on my machine it breaks > with an insane setting of 1ms only), Probably if you were using parallel dump or restore, it could be a lot more fragile, since an individual worker session might have nothing to do for some period of time. Will fix if no one beats me to it. regards, tom lane
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 5:48 PM, David G. Johnston wrote: > On Tue, Jun 14, 2016 at 5:58 PM, Merlin Moncure wrote: >> >> On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby >> wrote: >> > On 6/14/16 3:56 PM, Tom Lane wrote: >> >> >> >> Jim Nasby writes: >> >>> >> >>> On 6/14/16 3:01 PM, Robert Haas wrote: >> >> This seems kind of silly, because anybody who is writing code that >> might have to run against an existing version of the database won't >> be >> able to use it. The one thing that absolutely has to be >> cross-version >> is the method of determining which version you're running against. >> >> >> >> >> >>> We're talking about a function that doesn't currently exist anyway. >> >> >> >> >> >> Huh? We're talking about PQserverVersion(), comparisons to >> >> PG_VERSION_NUM, >> >> and related APIs. Those most certainly exist now, and trying to >> >> supplant >> >> them seems like a giant waste of time. >> >> >> >> On the other hand, parsing fields out of version() mechanically has >> >> been >> >> deprecated for as long as those other APIs have existed (which is since >> >> 8.0 or so). version() is only meant for human consumption, so I see no >> >> reason it shouldn't just start returning "10.0", "10.1", etc. If that >> >> breaks anyone's code, well, they should have switched to one of the >> >> easier methods years ago. >> > >> > >> > The original post was: >> > >> >> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3 >> > >> > and \df *version* on my HEAD doesn't show any other options. >> >> Right. It's the only way to handle things on the SQL level well, >> that, and pg_settings approaches. In other words, there is no easier >> way. I think it's pretty reasonable to assume there's a lot more code >> interfacing with the database from SQL than from C. > > > We could stand to be more explicit here. The docs for version() indicated > the server_version_num should be used for "machine processing". whoop! I didn't know about that setting. 
I guess it dismantles a lot of the case for more aggressive action. That said, maybe it's a good idea to construct the versioning change so that 10.x releases have a server_version_num > 9.x releases and leave the other functions alone (assuming that wasn't already the plan). > Option E: Give the DBA control. If they know they have one or more > mis-behaving applications but it is out their control to patch the code to > work properly they can change this supposedly human-readable output to > conform the historical x.y.z format. This would disabled by default. > Humans can easily interpret both versions so please save the comment about > not having GUCs that influence user-visible behavior. If your argument for > changing the format outright is "its for human consumption only" then > apparently this output should not be considered important enough to adhere > to that rule. Non-humans depending on its format are subject to our, or the > DBA's, whims. Nah -- my argument could be restated as "I wasn't aware of the machine variant of the version #". Do you think it's a good idea to have the machine version number be 10 for version 10.0? What would 10.1 be? 100100 or 11? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
Michael Paquier writes: > To put it short, it should not be possible to drop a NOT NULL > constraint on a child relation when its parent table is using it, this > should only be possible from the parent. Attached is a patch handling > this problem by adding a new function in pg_inherits.c to fetch the > list of parent relations for a given relation OID, and did some > refactoring to stick with what is done when scanning child relations. This doesn't sound like the right approach; in particular, it won't really help for deciding whether to propagate a DROP NOT NULL on a parent rel to its children. What we've discussed in the past is to store NOT NULL constraints in pg_constraint, much like CHECK constraints are already, and use use-count logic identical to the CHECK case to keep track of whether NOT NULL constraints are inherited or not. My feeling is that we'd keep the pg_attribute.attnotnull field and continue to drive actual enforcement off that, but it would just reflect a summary of the pg_constraint state. IIRC, Alvaro posted a WIP patch for that awhile back. Not sure what the current state is. regards, tom lane
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Tue, Jun 14, 2016 at 4:48 PM, Amit Kapila wrote: > On Mon, Jun 13, 2016 at 8:11 PM, Tom Lane wrote: > > > I do > > not share your confidence that using apply_projection_to_path within > > create_grouping_paths is free of such a hazard. > > > > Fair enough, let me try to explain the problem and someways to solve it in > some more detail. The main thing that got missed by me in the patch > related to commit-04ae11f62 is that the partial path list of rel also needs > to have a scanjoin_target. I was under assumption that > create_grouping_paths will take care of assigning appropriate Path targets > for the parallel paths generated by it. If we see, create_grouping_paths() > do take care of adding the appropriate path targets for the paths generated > by that function. However, it doesn't do anything for partial paths. The > patch sent by me yesterday [1] which was trying to assign > partial_grouping_target to partial paths won't be the right fix, because > (a) partial_grouping_target includes Aggregates (refer > make_partialgroup_input_target) which we don't want for partial paths; (b) > it is formed using grouping_target passed in function > create_grouping_paths() whereas we need the path target formed from > final_target for partial paths (as partial paths are scanjoin relations) > as is done for main path list (in grouping_planner(), /* Forcibly apply > that target to all the Paths for the scan/join rel.*/). Now, I think we > have following ways to solve it: > > (a) check whether the scanjoin_target contains any parallel-restricted > clause before applying the same to partial path list in grouping_planner. > However it could lead to duplicate checks in some cases for > parallel-restricted clause, once in apply_projection_to_path() for main > pathlist and then again before applying to partial paths. 
I think we can > avoid that by having an additional parameter in apply_projection_to_path() > which can indicate if the check for parallel-safety is done inside that > function and what is the result of the same. > > (b) generate the appropriate scanjoin_target for partial path list in > create_grouping_paths using final_target. However I think this might lead > to some duplicate code in create_grouping_paths() as we might have to do > something similar to what we have done in grouping_planner for generating such > a target. I think if we want we can pass scanjoin_target to > create_grouping_paths() as well. Again, we have to check for > parallel-safety for scanjoin_target before applying it to partial paths, > but we need to do that only when grouped_rel is considered parallel-safe. > > One more idea could be to have a flag (say parallel_safe) in PathTarget and set it once we have ensured that the expressions in the target don't contain any parallel-restricted entry. In create_pathtarget()/set_pathtarget_cost_width(), if the parallelModeOK flag is set, then we can evaluate target expressions for parallel-restricted expressions and set the flag accordingly. Now, this has the benefit that we don't need to evaluate the expressions of targets for parallel-restricted clauses again and again. I think this way if the flag is set once for final_target in grouping_planner, we don't need to evaluate it again for other targets (grouping_target, scanjoin_target, etc.) as those are derived from final_target. Similarly, I think we need to set RelOptInfo->reltarget and others as required. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 2:41 AM, Thomas Munro wrote: > I spent some time chasing down the exact circumstances. I suspect > that there may be an interlocking problem in heap_update. Using the > line numbers from cae1c788 [1], I see the following interaction > between the VACUUM, UPDATE and SELECT (pg_check_visible) backends, all > in reference to the same block number: > > [VACUUM] sets all visible bit > > [UPDATE] heapam.c:3931 HeapTupleHeaderSetXmax(oldtup.t_data, > xmax_old_tuple); > [UPDATE] heapam.c:3938 LockBuffer(buffer, BUFFER_LOCK_UNLOCK); > > [SELECT] LockBuffer(buffer, BUFFER_LOCK_SHARE); > [SELECT] observes VM_ALL_VISIBLE as true > [SELECT] observes tuple in HEAPTUPLE_DELETE_IN_PROGRESS state > [SELECT] barfs > > [UPDATE] heapam.c:4116 visibilitymap_clear(...) Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2, and CTID without logging anything or clearing the all-visible flag and then releases the lock on the heap page to go do some more work that might even ERROR out. Only if that other work all goes OK do we relock the page and perform the WAL-logged actions. That doesn't seem like a good idea even in existing releases, because you've taken a tuple on an all-visible page and made it not all-visible, and you've made a page modification that is not necessarily atomic without logging it. This is is particularly bad in 9.6, because if that page is also all-frozen then XMAX will eventually be pointing into space and VACUUM will never visit the page to re-freeze it the way it would have done in earlier releases. However, even in older releases, I think there's a remote possibility of data corruption. Backend #1 makes these changes to the page, releases the lock, and errors out. Backend #2 writes the page to the OS. DBA takes a hot backup, tearing the page in the middle of XMAX. Oops. 
I'm not sure what to do about this: this part of the heap_update() logic has been like this forever, and I assume that if it were easy to refactor this away, somebody would have done it by now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: Should phraseto_tsquery('simple', 'blue blue') @@ to_tsvector('simple', 'blue') be true ?
IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 9.6 open item is long past due for your status update. Please reacquaint yourself with the policy on open item ownership[1] and then reply immediately. If I do not hear from you by 2016-06-16 07:00 UTC, I will transfer this item to release management team ownership without further notice. [1] http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com I'm working on it right now. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] Bug in to_timestamp().
On Monday, 13 June 2016 9:57 PM, Robert Haas wrote: >I think a space in the format string should skip a whitespace >character in the input string, but not a non-whitespace character. +1; at the very least, the benefit is that we give correct output. PFA patch proposing this fix. Regards, Amul Sul. 0001-RM37358-space-in-the-format-string-should-skip-a-whi.patch Description: Binary data
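As a hedged illustration of the proposed rule (my reading of the quoted proposal, not verified against the attached patch): a space in the format string would consume at most one whitespace character of input, and would no longer silently swallow a non-whitespace one:

```sql
-- Proposed behavior (illustrative):
SELECT to_timestamp('2016-06-13 21:57:00', 'YYYY-MM-DD HH24:MI:SS');
-- ok: the format's space matches the single space in the input

SELECT to_timestamp('2016-06-13x21:57:00', 'YYYY-MM-DD HH24:MI:SS');
-- would now fail: 'x' is not whitespace, so the space no longer skips it
```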
[HACKERS] pg_dump vs. idle_in_transaction_session_timeout
Currently pg_dump doesn't turn off idle_in_transaction_session_timeout. Okay, the window of failure here is very narrow (on my machine it breaks with an insane setting of 1ms only), but for the sake of reliable backups and protection against an over-motivated DBA it looks better to me to turn that off, no? -- Thanks Bernd
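For reference, a sketch of the session setup such a fix could issue, mirroring the timeout resets pg_dump already performs (the last SET would presumably be guarded by a server-version check, since the GUC only exists from 9.6 onward):

```sql
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;  -- the missing reset
```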
Re: [HACKERS] parallel.c is not marked as test covered
On Wed, Jun 15, 2016 at 12:07 PM, Noah Misch wrote: > > On Wed, Jun 15, 2016 at 11:50:33AM +0530, Amit Kapila wrote: > > In short, this test doesn't serve it's purpose which is to generate an > > error from worker. > > That's bad. Thanks for figuring out the problem. > > > do $$begin > > Perform stringu1::int2 from tenk1 where unique1 = 1; > > end$$; > > > > ERROR: invalid input syntax for integer: "BA" > > CONTEXT: parallel worker, PID 4460 > > SQL statement "SELECT stringu1::int2 from tenk1 where unique1 = 1" > > PL/pgSQL function inline_code_block line 2 at PERFORM > > > > Considering above analysis is correct, we have below options: > > a. Modify the test such that it actually generates an error and to hide the > > context, we can exception block and raise some generic error. > > b. Modify the test such that it actually generates an error and to hide the > > context, we can use force_parallel_mode = regress; > > Either of those sounds okay. No need to raise a generic error; one can raise > SQLERRM to keep the main message and not the context. I lean toward (a) so we > have nonzero test coverage of force_parallel_mode=on. Patch implementing option (a) attached with this mail. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com fix_parallel_query_test_v1.patch Description: Binary data
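For concreteness, one way option (a) with Noah's SQLERRM suggestion might look (a sketch against the regression database, not necessarily what the attached patch does):

```sql
do $$
begin
  perform stringu1::int2 from tenk1 where unique1 = 1;
exception
  when others then
    -- re-raise only the message, dropping the machine-dependent
    -- "parallel worker, PID nnnn" CONTEXT line
    raise exception '%', sqlerrm;
end$$;
```

This still forces the error to originate in a worker, so coverage of force_parallel_mode=on is preserved while the expected output stays stable.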
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
Hello, At Tue, 14 Jun 2016 21:24:58 +0900, Michael Paquier wrote in > On Tue, Jun 14, 2016 at 8:31 PM, Kyotaro HORIGUCHI > wrote: > >> +# Take a second backup of the standby while the master is offline. > >> +$node_master->stop; > >> +$node_standby_1->backup('my_backup_2'); > >> +$node_master->start; > > > > I'm not sure that adding the test case for a particular bug like > > this is appropriate. But it would be acceptable because it > > doesn't take long time and it is separate from standard checks. > > We already take a backup from a standby when master is connected, it > should not cost much in terms of time. Agreed. > > It seems to me that we could more agressively advance the > > minRecoveryPoint (but must not let it go too far..), but it is > > right for it to aim a bit smaller than the ideal location. > > It may be risky to propose such a change for a backpatch. Anyway, in > any case there is no guarantee that when using the low-level backup > routines pg_start/stop_backup with a custom backup method the minimum > recovery point will be correct.. pg_basebackup does that a bit more > carefully if I recall correctly (too lazy to look at the code now :)). Yes, as written in another mail, minRecoveryPoint seems to be correctly updated per its definition. The problem is that only the redo point of the last restartpoint is updated, even if no update happened. But just inhibiting the update of the redo point for that case in turn causes an annoying repetition of restartpoints. In addition to the TAP test above, the following SQL commands on the master cause the same problem without disconnection. =# checkpoint; =# select pg_switch_xlog(); This is somewhat artificial but the same situation could also arise naturally. The exact condition for this is replaying a checkpoint record having no buffer modification since the preceding checkpoint in the previous WAL segments.
Returning to the discussion on which way to choose, we agreed on using replayEndRecPtr instead of minRecoveryPoint in do_pg_stop_backup(), as in the patch attached to the following message. https://www.postgresql.org/message-id/cab7npqtv5gmkqcndofgtgqoqxz2xlz4rrw247oqojzztvy6...@mail.gmail.com regards, -- Kyotaro Horiguchi NTT Open Source Software Center
[HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
Hi all, A couple of months back the $subject has been mentioned, though nobody actually wrote a patch to prevent that: http://www.postgresql.org/message-id/21633.1448383...@sss.pgh.pa.us So here is one.. To put it short, it should not be possible to drop a NOT NULL constraint on a child relation when its parent table is using it, this should only be possible from the parent. Attached is a patch handling this problem by adding a new function in pg_inherits.c to fetch the list of parent relations for a given relation OID, and did some refactoring to stick with what is done when scanning child relations. And here is what this patch can do: =# create table parent (a int not null); CREATE TABLE =# create table child (a int not null); CREATE TABLE =# alter table child inherit parent ; ALTER TABLE =# alter table child alter COLUMN a drop not null; -- would work on HEAD ERROR: 42P16: cannot drop NOT NULL constraint for attribute "a" DETAIL: The same column in parent relation "parent" is marked NOT NULL LOCATION: ATExecDropNotNull, tablecmds.c:5281 =# alter table parent alter COLUMN a drop not null; -- works on parent ALTER TABLE =# \d child Table "public.child" Column | Type | Modifiers +-+--- a | integer | Inherits: parent I have added a new index to pg_inherits, so that's not something that could be back-patched, still it would be good to fix this weird behavior on HEAD. I am adding that to the next CF. Regards, -- Michael child-drop-not-null-v1.patch Description: invalid/octet-stream
[HACKERS] Partition-wise join for join between (declaratively) partitioned tables
Amit Langote is working on supporting declarative partitioning in PostgreSQL [1]. I have started working on supporting partition-wise join. This mail describes a very high-level design and some insight into the performance improvements. An equi-join between two partitioned tables can be broken down into pair-wise joins between their partitions. This technique is called partition-wise join. Partition-wise joins between similarly partitioned tables with an equi-join condition can be efficient because [2] 1. Each provably non-empty partition-wise join is smaller. All such joins collectively might be more efficient than the join between their parents. 2. Such joins are able to exploit properties of partitions like indexes, their storage etc. 3. An N-way partition-wise join may have different efficient join orders compared to the efficient join order between the parent tables. A partition-wise join is processed in the following stages [2], [3]. 1. Applicability testing: This phase checks if the join conditions match the partitioning scheme. A partition-wise join is efficient if there is an equi-join on the partition keys. E.g. a join between tables R and S partitioned by columns a and b resp. can be broken down into partition-wise joins if there exists a join condition R.a = S.b. Or in other words the number of provably non-empty partition-wise joins is O(N) where N is the number of partitions. 2. Matching: This phase determines which joins between the partitions of R and S can potentially produce tuples in the join and prunes empty joins between partitions. 3. Clustering: This phase aims at reducing the number of partition-wise joins by clubbing together partitions from the joining relations. E.g. clubbing multiple partitions from either of the partitioned relations which can join to a single partition from the other partitioned relation. 4. Path/plan creation: This phase creates multiple paths for each partition-wise join.
It also creates Append path/s representing the union of partition-wise joins. The work here focuses on a subset of use-cases discussed in [2]. It only considers partition-wise join for a join between similarly partitioned tables with the same number of partitions with the same properties, thus producing at most as many partition-wise joins as there are partitions. It should be possible to apply the partition-wise join technique (with some special handling for OUTER joins) if both relations have some extra partitions with non-overlapping partition conditions, apart from the matching partitions. But I am not planning to implement this optimization in the first cut. The attached patch is a POC implementation of partition-wise join. It is based on the set of patches posted on 23rd May 2016 by Amit Langote for declarative partitioning. The patch gives an idea about the approach used. It has several TODOs, which I am working on. Attached is a script with output which measures potential performance improvement because of partition-wise join. The script uses a GUC enable_partition_wise_join to disable/enable this feature for performance measurement. The script measures performance improvement of a join between two tables partitioned by range on an integer column. Each table contains 50K rows. Each table has an integer and a varchar column. It shows around 10-15% reduction in execution time when partition-wise join is used. Accompanied with parallel query and FDWs, it opens up avenues for further improvements for joins between partitioned tables. [1]. https://www.postgresql.org/message-id/55d3093c.5010...@lab.ntt.co.jp [2]. https://users.cs.duke.edu/~shivnath/papers/sigmod295-herodotou.pdf [3].
https://users.cs.duke.edu/~shivnath/tmp/paqo_draft.pdf -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company partitioned_join.out Description: Binary data \set num_samples 100 drop table t1 cascade; drop table t2 cascade; create table t1 (a int, b varchar) partition by range(a); create table t1_p1 partition of t1 for values start (1) end (10000) inclusive; create table t1_p2 partition of t1 for values start (10001) end (20000) inclusive; create table t1_p3 partition of t1 for values start (20001) end (30000) inclusive; create table t1_p4 partition of t1 for values start (30001) end (40000) inclusive; create table t1_p5 partition of t1 for values start (40001) end (50000) inclusive; insert into t1 select i, to_char(i, 'FM00000') from generate_series(1, 50000) i; create table t2 (a int, b varchar) partition by range(a); create table t2_p1 partition of t2 for values start (1) end (10000) inclusive; create table t2_p2 partition of t2 for values start (10001) end (20000) inclusive; create table t2_p3 partition of t2 for values start (20001) end (30000) inclusive; create table t2_p4 partition of t2 for values start (30001) end (40000) inclusive; create table t2_p5 partition of t2 for values start (40001) end (50000) inclusive; insert into t2 select i, to_char(i, 'FM00000') from generate_series(1, 50000
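The core rewrite the patch exploits can be sketched on the script's tables: with identically range-partitioned tables and an equi-join on the partition key, the join is equivalent to a union of per-partition joins (illustrative SQL, not the planner's internal form):

```sql
-- SELECT * FROM t1 JOIN t2 ON t1.a = t2.a  is equivalent to:
SELECT * FROM t1_p1 JOIN t2_p1 ON t1_p1.a = t2_p1.a
UNION ALL
SELECT * FROM t1_p2 JOIN t2_p2 ON t1_p2.a = t2_p2.a
UNION ALL
SELECT * FROM t1_p3 JOIN t2_p3 ON t1_p3.a = t2_p3.a
UNION ALL
SELECT * FROM t1_p4 JOIN t2_p4 ON t1_p4.a = t2_p4.a
UNION ALL
SELECT * FROM t1_p5 JOIN t2_p5 ON t1_p5.a = t2_p5.a;
```

Each branch joins only matching partitions (cross-partition pairs such as t1_p1/t2_p2 are provably empty under the equi-join), which is why at most N per-partition joins survive.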
Re: [HACKERS] parallel.c is not marked as test covered
On Wed, Jun 15, 2016 at 12:07 PM, Noah Misch wrote: > > On Wed, Jun 15, 2016 at 11:50:33AM +0530, Amit Kapila wrote: > > do $$begin > > Perform stringu1::int2 from tenk1 where unique1 = 1; > > end$$; > > > > ERROR: invalid input syntax for integer: "BA" > > CONTEXT: parallel worker, PID 4460 > > SQL statement "SELECT stringu1::int2 from tenk1 where unique1 = 1" > > PL/pgSQL function inline_code_block line 2 at PERFORM > > > > Considering above analysis is correct, we have below options: > > a. Modify the test such that it actually generates an error and to hide the > > context, we can exception block and raise some generic error. > > b. Modify the test such that it actually generates an error and to hide the > > context, we can use force_parallel_mode = regress; > > Either of those sounds okay. No need to raise a generic error; one can raise > SQLERRM to keep the main message and not the context. I lean toward (a) so we > have nonzero test coverage of force_parallel_mode=on. > Do you mean to say nonzero test coverage of force_parallel_mode=on for error paths? I see that for force_parallel_mode=on, we have another test in select_parallel.sql set force_parallel_mode=1; explain (costs off) select stringu1::int2 from tenk1 where unique1 = 1; With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com