Re: [HACKERS] Supporting huge pages on Windows
On 2017-04-05 04:25:41 +0000, Tsunakawa, Takayuki wrote:
> From: Craig Ringer [mailto:craig.rin...@2ndquadrant.com]
> > On 5 April 2017 at 10:37, Tsunakawa, Takayuki wrote:
> > > OTOH, I tried again to leave the DISABLE_MAX_PRIVILEGE as is and add
> > > Lock Pages in Memory, using the attached pg_ctl.c. Please see
> > > EnableLockPagesPrivilege() and its call site. But pg_ctl -w start
> > > fails emitting the following message:
> >
> > That won't work. You'd have to pass 0 to the flags of
> > CreateRestrictedToken and instead supply a PrivilegesToDelete array.
> > You'd probably GetTokenInformation and AND with a mask of the ones you
> > wanted to retain.
>
> Uh, that's inconvenient. We can't determine what privileges to delete,
> and we must be aware of new privileges added in future versions of
> Windows.
>
> Then, I have to say the last patch (v12) is the final answer.

As I asked before, why can't we delete all privs and add the explicitly
needed ones back (using AdjustTokenPrivileges)?

- Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
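For reference, a minimal sketch of the Win32 calls involved in the approach
Andres asks about. The helper name and error handling are hypothetical;
only the API calls themselves are real. One caveat worth noting:
AdjustTokenPrivileges() can only enable or disable a privilege that is
still present on the token. A privilege deleted outright (as
CreateRestrictedToken() with DISABLE_MAX_PRIVILEGE does) cannot be added
back this way, which is the constraint the thread keeps running into.

#include <windows.h>

/* Hypothetical helper: enable "Lock Pages in Memory" on a token,
 * assuming the privilege is still present (possibly disabled) on it. */
static BOOL
pg_enable_lock_pages(HANDLE hToken)
{
	TOKEN_PRIVILEGES tp;
	LUID		luid;

	/* Look up the LUID for SeLockMemoryPrivilege */
	if (!LookupPrivilegeValue(NULL, SE_LOCK_MEMORY_NAME, &luid))
		return FALSE;

	tp.PrivilegeCount = 1;
	tp.Privileges[0].Luid = luid;
	tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

	/* Enable just this one privilege on the token */
	if (!AdjustTokenPrivileges(hToken, FALSE, &tp, 0, NULL, NULL))
		return FALSE;

	/* AdjustTokenPrivileges() reports partial success through
	 * GetLastError(): ERROR_NOT_ALL_ASSIGNED means the privilege
	 * was not held by the token at all. */
	return GetLastError() == ERROR_SUCCESS;
}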
Re: [HACKERS] [pgsql-www] Small issue in online devel documentation build
On Thu, Mar 23, 2017 at 11:21:39PM -0400, Peter Eisentraut wrote: > I think the fix belongs into the web site CSS, so there is nothing to > commit into PostgreSQL here. I will close the commit fest entry, but I > have added a section to the open items list so we keep track of it. > (https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items#Documentation_tool_chain) [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Peter, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: extended stats not friendly towards ANALYZE with subset of columns
On Tue, Mar 28, 2017 at 10:30:03PM +1300, David Rowley wrote:
> I'm just reviewing Tomas' code for the dependencies part of the stats
> when I saw something that looked a bit unusual.
>
> I tested with:
>
> CREATE TABLE ab1 (a INTEGER, b INTEGER);
> ALTER TABLE ab1 ALTER a SET STATISTICS 0;
> INSERT INTO ab1 SELECT a, a%23 FROM generate_series(1, 1000) a;
> CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1;
> ANALYZE ab1;
>
> And got:
>
> ERROR: extended statistics could not be collected for column "a" of
> relation public.ab1
> HINT: Consider ALTER TABLE "public"."ab1" ALTER "a" SET STATISTICS -1
>
> I don't think the error is useful here, as it means nothing gets done.
> Probably better to just not (re)build those stats.
>
> Another option would be to check for extended stats before deciding
> which rows to ANALYZE, then still gather the columns required for
> MV stats, but I think if the user asks for a subset of columns to be
> analyzed, and that causes a column to be missing for an extended
> statistic, it would be pretty surprising if we rebuilt the extended
> stats anyway.
>
> Perhaps the SET STATISTICS 0 for a column still needs to gather data
> for extended statistics, though. Perhaps a debate should ensue about
> how that should work exactly.
>
> I've attached a patch which fixes the problem above, but it does
> nothing to change the analyze behaviour for 0 statistics columns.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Álvaro,
since you committed the patch believed to have created it, you own this
open item. If some other commit is more relevant or if this does not
belong as a v10 open item, please let us know. Otherwise, please observe
the policy on open item ownership[1] and send a status update within
three calendar days of this message. Include a date for your subsequent
status update. Testers may discover new open items at any time, and I
want to plan to get them all fixed well in advance of shipping v10.
Consequently, I will appreciate your efforts toward speedy resolution.
Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output
On Fri, Mar 24, 2017 at 12:26:33PM +0900, Michael Paquier wrote:
> On Thu, Mar 23, 2017 at 1:10 AM, Stephen Frost wrote:
> > * Andrew Dunstan (andrew.duns...@2ndquadrant.com) wrote:
> >> On 03/22/2017 11:39 AM, Stephen Frost wrote:
> >> > * Andrew Dunstan (and...@dunslane.net) wrote:
> >> >> Sync pg_dump and pg_dumpall output
> >> > This probably should have adjusted all callers of pg_dump in the
> >> > regression tests to use the --no-sync option, otherwise we'll end up
> >> > spending possibly a good bit of time calling fsync() during the
> >> > regression tests unnecessarily.
> >>
> >> All of them? The impact is not likely to be huge in most cases
> >> (possibly different on Windows). On crake, the bin-check stage
> >> actually took less time after the change than before, so I suspect
> >> that the impact will be pretty small.
> >
> > Well, perhaps not all, but I'd think --no-sync would be better to have
> > in most cases. We turn off fsync for all of the TAP tests and all
> > initdb calls, so it seems like we should here too. Perhaps it won't be
> > much overhead on an unloaded system, but not all of the buildfarm
> > animals seem to be on unloaded systems, nor are they particularly fast
> > in general, or have fast I/O...
>
> Agreed.
>
> >> Still I agree that we should have tests for both cases.
> >
> > Perhaps, though if I recall correctly, we've basically had zero calls
> > for fsync() until this. If we don't feel like we need to test that in
> > the backend then it seems a bit silly to feel like we need it for
> > pg_dump's regression coverage.
> >
> > That said, perhaps the right answer is to have a couple tests for both
> > the backend and for pg_dump which do exercise the fsync-enabled paths.
>
> And agreed.
>
> The patch attached uses --no-sync in most places for pg_dump, except
> in 4 places of 002_pg_dump.pl to stress the sync code path as well.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Andrew,
since you committed the patch believed to have created it, you own this
open item. If some other commit is more relevant or if this does not
belong as a v10 open item, please let us know. Otherwise, please observe
the policy on open item ownership[1] and send a status update within
three calendar days of this message. Include a date for your subsequent
status update. Testers may discover new open items at any time, and I
want to plan to get them all fixed well in advance of shipping v10.
Consequently, I will appreciate your efforts toward speedy resolution.
Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)
On Fri, Mar 10, 2017 at 11:49:46AM -0800, Andres Freund wrote: > On 2017-03-09 13:34:22 -0500, Robert Haas wrote: > > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane wrote: > > > Andres Freund writes: > > >> Wonder if we there's an argument to be made for implementing this > > >> roughly similarly to split_pathtarget_at_srf - instead of injecting a > > >> ProjectSet node we'd add a FunctionScan node below a Result node. > > > > > > Yeah, possibly. That would have the advantage of avoiding an ExecProject > > > step when the SRFs aren't buried, which would certainly be the expected > > > case. > > > > > > If you don't want to make ExecInitExpr responsible, then the planner would > > > have to do something like split_pathtarget_at_srf anyway to decompose the > > > expressions, no matter which executor representation we use. > > > > Did we do anything about this? Are we going to? > > Working on a patch. [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Andres, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Quorum commit for multiple synchronous replication.
On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
> Regarding this feature, there are some loose ends. We should work on
> and complete them before the release.
>
> (1)
> Which synchronous replication method, priority or quorum, should be
> chosen when neither FIRST nor ANY is specified in s_s_names? Right now,
> priority-based sync replication is chosen, to keep backward
> compatibility. However some hackers argued to change this decision
> so that a quorum commit is chosen instead, because they think that
> most users prefer a quorum.
>
> (2)
> There are still many source comments and documentation sections that
> we need to update, for example in high-availability.sgml. We need to
> check and update them thoroughly.
>
> (3)
> A priority value is assigned to each standby listed in s_s_names
> even in quorum commit, though those priority values are not used at
> all. Users can see those priority values in pg_stat_replication.
> Isn't this confusing? If yes, it might be better to always assign 1 as
> the priority, for example.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Fujii,
since you committed the patch believed to have created it, you own this
open item. If some other commit is more relevant or if this does not
belong as a v10 open item, please let us know. Otherwise, please observe
the policy on open item ownership[1] and send a status update within
three calendar days of this message. Include a date for your subsequent
status update. Testers may discover new open items at any time, and I
want to plan to get them all fixed well in advance of shipping v10.
Consequently, I will appreciate your efforts toward speedy resolution.
Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
> From: pgsql-hackers-ow...@postgresql.org >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii >> I have done tests using pgproto. One thing I noticed a strange behavior. >> Below is an output of pgproto. The test first set the timeout to 3 seconds, >> and parse/bind for "SELECT pg_sleep(2)" then set timeout to 1 second using >> extended query. Subsequent Execute emits a statement timeout error as >> expected, but next "SELECT pg_sleep(2)" >> call using extended query does not emit a statement error. The test for >> this is "007-timeout-twice". Attached is the test cases including this. > > What's the handling of transactions like in pgproto? I guess the first > statement timeout error rolled back the effect of "SET statement_timeout = > '1s'", and the timeout reverted to 3s or some other value. Since pgproto is a dumb protocol machine, it does not start a transaction automatically (user needs to explicitly send a start transaction command via either simple or extended query). In this particular case no explicit transaction has started. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
On Wed, Apr 5, 2017 at 8:39 AM, Robert Haas wrote:
> On Tue, Apr 4, 2017 at 10:22 AM, Ashutosh Bapat wrote:
>> Yes, I agree. For an inner join, the partition key types need to
>> "shrink" and for outer join they need to be "widened". I don't know if
>> there is a way to know the "wider" or "shorter" of two given types. We
>> might have to implement a method to merge partition keys to produce the
>> partition key of the join, which may be different from either of the
>> partition keys. So, after all we may have to abandon the idea of a
>> canonical partition scheme. I haven't included this change in the
>> attached set of patches.
>
> I think this is why you need to regard the partitioning scheme as
> something more like an equivalence class - possibly the partitioning
> scheme should actually contain (or be?) an equivalence class. Suppose
> this is the query:
>
> SELECT * FROM i4 INNER JOIN i8 ON i4.x = i8.x;
>
> ...where i4 (x) is an int4 partitioning key and i8 (x) is an int8
> partitioning key. It's meaningless to ask whether the result of the
> join is partitioned by int4 or int8. It's partitioned by the
> equivalence class that contains both i4.x and i8.x. If the result of
> this join were joined to another table on either of those two
> columns, a second partition-wise join would be theoretically possible.
> If you insist on knowing the type of the partitioning scheme, rather
> than just the opfamily, you've boxed yourself into a corner from which
> there's no good escape.

Only inner join conditions have equivalence classes associated with
them. Outer join conditions create single-element equivalence classes.
So, we can not associate equivalence classes as they are with the
partition scheme. If we could do that, it would make life much easier,
since checking whether an equi-join between all partition keys exists
would simply mean looking up the equivalence classes that cover the
joining relations and finding the em_member corresponding to the
partition keys.

It looks like we should only keep strategy, partnatts, partopfamily and
parttypcoll in PartitionScheme. A partition-wise join between two
relations would be possible if all those match. When matching partition
bounds of joining relations, we should rely on partopfamily to give us a
comparison function based on the types of the partition keys being
joined. In that context it looks like the partition bound comparison
functions which accept a partition key were not written keeping this use
case in mind. They will need to be rewritten to accept strategy,
partnatts, partopfamily and parttypcoll.

There's a relevant comment in 0006, build_joinrel_partition_info()
(probably that name needs to change, but I will do that once we have
settled on the design):

+   /*
+    * Construct partition keys for the join.
+    *
+    * An INNER join between two partitioned relations is partitioned by
+    * key expressions from both the relations. For tables A and B
+    * partitioned by a and b respectively, (A INNER JOIN B ON A.a = B.b)
+    * is partitioned by both A.a and B.b.
+    *
+    * An OUTER join like (A LEFT JOIN B ON A.a = B.b) may produce rows
+    * with B.b NULL. These rows may not fit the partitioning conditions
+    * imposed on B.b. Hence, strictly speaking, the join is not
+    * partitioned by B.b. Strictly speaking, partition keys of an OUTER
+    * join should include partition key expressions from the OUTER side
+    * only. Consider a join like (A LEFT JOIN B ON A.a = B.b) LEFT JOIN
+    * C ON B.b = C.c. If we do not include B.b as a partition key
+    * expression for (AB), it prohibits us from using partition-wise
+    * join when joining (AB) with C, as there is no equi-join between
+    * the partition keys of the joining relations. But two NULL values
+    * are never equal and no two rows from mis-matching partitions can
+    * join. Hence it's safe to include B.b as a partition key expression
+    * for (AB), even though rows in (AB) are not strictly partitioned
+    * by B.b.
+    */

I think that also needs to be reviewed carefully. Partition-wise joins
may be happy including partition keys from all sides, but partition-wise
aggregates may not be, esp. when pushing complete aggregation down to
partitions. In that case, rows with a NULL partition key, which fall on
the nullable side of the join, will be spread across multiple
partitions. Probably, we should separate nullable and non-nullable
partition key expressions.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
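A minimal sketch of the reduced PartitionScheme that Ashutosh describes,
keeping only the type-independent properties so that, e.g., int4 and int8
keys belonging to one opfamily can share a scheme. The field names here
are assumed from the discussion, not taken from the actual patch:

/* Hypothetical shape of a canonical partition scheme that records the
 * operator family rather than the key's data type. */
typedef struct PartitionSchemeData
{
	char		strategy;		/* list or range partitioning */
	int16		partnatts;		/* number of partition key columns */
	Oid		   *partopfamily;	/* operator family OID per key column */
	Oid		   *parttypcoll;	/* collation OID per key column */
} PartitionSchemeData;

Two relations would then be candidates for a partition-wise join when all
of these fields match, with partopfamily supplying the comparison function
used to match their partition bounds.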
Re: [HACKERS] multivariate statistics (v25)
Thanks Tomas and David for hacking on this patch.

On 04.04.2017 20:19, Tomas Vondra wrote:
> I'm not sure we still need the min_group_size, when evaluating
> dependencies. It was meant to deal with 'noisy' data, but I think after
> switching to the 'degree' it might actually be a bad idea.
>
> Consider this:
>
> create table t (a int, b int);
> insert into t select 1, 1 from generate_series(1, 10000) s(i);
> insert into t select i, i from generate_series(2, 20000) s(i);
> create statistics s with (dependencies) on (a,b) from t;
> analyze t;
>
> select stadependencies from pg_statistic_ext ;
>          stadependencies
> [{1 => 2 : 0.333344}, {2 => 1 : 0.333344}]
> (1 row)
>
> So the degree of the dependency is just ~0.333 although it's obviously
> a perfect dependency, i.e. knowledge of 'a' determines 'b'. The reason
> is that we discard 2/3 of rows, because those groups are only a single
> row each, except for the one large group (1/3 of rows).

Just for me to follow the comments better: is "dependency" roughly the
same as what statisticians call "conditional probability"?

Sven

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii > I have done tests using pgproto. One thing I noticed a strange behavior. > Below is an output of pgproto. The test first set the timeout to 3 seconds, > and parse/bind for "SELECT pg_sleep(2)" then set timeout to 1 second using > extended query. Subsequent Execute emits a statement timeout error as > expected, but next "SELECT pg_sleep(2)" > call using extended query does not emit a statement error. The test for > this is "007-timeout-twice". Attached is the test cases including this. What's the handling of transactions like in pgproto? I guess the first statement timeout error rolled back the effect of "SET statement_timeout = '1s'", and the timeout reverted to 3s or some other value. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
>> What do you think? I've not really tested this with the extended
>> protocol, so I'd appreciate if you could rerun your test from the
>> older thread.
>
> The patch looks good and cleaner. It looks like the code works as
> expected. As before, I ran one INSERT statement with PgJDBC, with gdb's
> breakpoints set on enable_timeout() and disable_timeout(). I confirmed
> that enable_timeout() is called just once at Parse message, and
> disable_timeout() is called just once at Execute message.
>
> I'd like to wait for Tatsuo-san's thorough testing with pgproto.

I have done tests using pgproto. One thing I noticed is a strange
behavior. Below is an output of pgproto. The test first sets the timeout
to 3 seconds, and does parse/bind for "SELECT pg_sleep(2)", then sets the
timeout to 1 second using the extended query protocol. The subsequent
Execute emits a statement timeout error as expected, but the next
"SELECT pg_sleep(2)" call using the extended query protocol does not emit
a statement timeout error. The test for this is "007-timeout-twice".
Attached are the test cases including this.

FE=> Query(query="SET statement_timeout = '3s'")
<= BE CommandComplete(SET)
<= BE ReadyForQuery(I)
FE=> Parse(stmt="S1", query="SELECT pg_sleep(2)")
FE=> Bind(stmt="S1", portal="S2")
FE=> Parse(stmt="", query="SET statement_timeout = '1s'")
FE=> Bind(stmt="", portal="")
FE=> Execute(portal="")
FE=> Execute(portal="S2")
FE=> Sync
<= BE ParseComplete
<= BE BindComplete
<= BE ParseComplete
<= BE BindComplete
<= BE CommandComplete(SET)
<= BE ErrorResponse(S ERROR V ERROR C 57014 M canceling statement due to statement timeout F postgres.c L 2968 R ProcessInterrupts )
<= BE ReadyForQuery(I)
FE=> Parse(stmt="S3", query="SELECT pg_sleep(2)")
FE=> Bind(stmt="S3", portal="S2")
FE=> Execute(portal="S2")
FE=> Sync
<= BE ParseComplete
<= BE BindComplete
<= BE DataRow
<= BE CommandComplete(SELECT 1)
<= BE ReadyForQuery(I)
FE=> Terminate

tests.tar.gz
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN cost estimate
> Interested to hear comments on this.

I don't have a chance to test it right now, but I am sure it would be an
improvement over what we have right now. There is no single correct
equation with so many unknowns as we have.

> *indexTotalCost += (numTuples * *indexSelectivity) *
>     (cpu_index_tuple_cost + qual_op_cost);

Have you preserved this line intentionally? This would make the BRIN
index scan cost even higher, but the primary property of BRIN is to be
cheap to scan. Shouldn't the bitmap heap scan node account for the cost
of checking the tuples instead?
Re: [HACKERS] Adding support for Default partition in partitioning
On 2017/04/05 6:22, Keith Fiske wrote:
> On Tue, Apr 4, 2017 at 9:30 AM, Rahila Syed wrote:
>> Please find attached an updated patch.
>> Following has been accomplished in this update:
>>
>> 1. A new partition can be added after default partition if there are
>> no conflicting rows in default partition.
>> 2. Solved the crash reported earlier.
>
> Installed and compiled against commit
> 60a0b2ec8943451186dfa22907f88334d97cb2e0 (Date: Tue Apr 4 12:36:15 2017
> -0400) without any issue
>
> However, running your original example, I'm getting this error
>
> keith@keith=# CREATE TABLE list_partitioned (
> keith(#     a int
> keith(# ) PARTITION BY LIST (a);
> CREATE TABLE
> Time: 4.933 ms
> keith@keith=# CREATE TABLE part_default PARTITION OF list_partitioned
> FOR VALUES IN (DEFAULT);
> CREATE TABLE
> Time: 3.492 ms
> keith@keith=# CREATE TABLE part_1 PARTITION OF list_partitioned FOR
> VALUES IN (4,5);
> ERROR:  unrecognized node type: 216

It seems like the new ExecPrepareCheck should be used in place of
ExecPrepareExpr in the code added in check_new_partition_bound().

> Also, I'm still of the opinion that denying future partitions of values
> in the default would be a cause of confusion. In order to move the data
> out of the default and into a proper child it would require first
> removing that data from the default, storing it elsewhere, creating the
> child, then moving it back. If it's only a small amount of data it may
> not be a big deal, but if it's a large amount, that could cause quite a
> lot of contention if done in a single transaction. Either that or the
> user would have to deal with data existing in the table, disappearing,
> then reappearing again.
>
> This also makes it harder to migrate an existing table easily.
> Essentially no child tables for a large, existing data set could ever
> be created without going through one of the two situations above.

I thought of the following possible way to allow future partitions when
the default partition exists which might contain rows that belong to the
newly created partition (unfortunately, nothing that we could implement
at this point for v10):

Suppose you want to add a new partition which will accept key=3 rows.

1. If no default partition exists, we're done; no key=3 rows would have
   been accepted by any of the table's existing partitions, so no need
   to move any rows

2. Default partition exists which might contain key=3 rows, which we'll
   need to move. If we do this in the same transaction, as you say, it
   might result in unnecessary unavailability of table's data. So, it's
   better to delegate that responsibility to a background process. The
   current transaction will only add the new key=3 partition, so any
   key=3 rows will be routed to the new partition from this point on.
   But we haven't updated the default partition's constraint yet to say
   that it no longer contains key=3 rows (constraint that the planner
   consumes), so it will continue to be scanned for queries that request
   key=3 rows (there should be some metadata to indicate that the
   default partition's constraint is invalid), along with the new
   partition.

3. A background process receives a "work item" requesting it to move all
   key=3 rows from the default partition heap to the new partition's
   heap. The movement occurs without causing the table to become
   unavailable; once all rows have been moved, we momentarily lock the
   table to update the default partition's constraint to mark it valid,
   so that it will no longer be accessed by queries that want to see
   key=3 rows.

Regarding 2, there is a question of whether it should not be possible
for the row movement to occur in the same transaction. Somebody may want
that to happen because they chose to run the command during a
maintenance window, when the table's becoming unavailable is not an
issue. In that case, we need to think of the interface more carefully.

Regarding 3, I think the new autovacuum work items infrastructure added
by the following commit looks very promising:

* BRIN auto-summarization *
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7526e10224f0792201e99631567bbe44492bbde4

> However, thinking through this, I'm not sure I can see a solution
> without the global index support. If this restriction is removed,
> there's still an issue of data duplication after the necessary child
> table is added. So I guess it's a matter of deciding which user
> experience is better for the moment?

I'm not sure about the fate of this particular patch for v10, but until
we implement a solution to move rows and design an appropriate interface
for the same, we could error out if moving rows is required at all, like
the patch does.

Could you briefly elaborate why you think the lack of global index
support would be a problem in this regard? I agree that some design is
required here to implement a solution for redistribution of rows, not
only in the context of supporting the notion of default partitions.
Re: [HACKERS] partitioned tables and contrib/sepgsql
Peter Eisentraut writes:
> On 4/5/17 00:58, Tom Lane wrote:
>> Another issue is whether you won't get compiler complaints about
>> redefinition of the "true" and "false" macros. But those would
>> likely only be warnings, not flat-out errors.
> The complaint about bool is also just a warning.

Really?

$ cat test.c
typedef char bool;
typedef char bool;
$ gcc -c test.c
test.c:2: error: redefinition of typedef 'bool'
test.c:1: note: previous declaration of 'bool' was here

This is with gcc 4.4.7.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partitioned tables and contrib/sepgsql
On 4/5/17 00:58, Tom Lane wrote: > Another issue is whether you won't get compiler complaints about > redefinition of the "true" and "false" macros. But those would > likely only be warnings, not flat-out errors. The complaint about bool is also just a warning. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of 'Andres Freund' > Attached. I did not like that the previous patch had the timeout handling > duplicated in the individual functions, I instead centralized it into > start_xact_command(). Given that it already activated the timeout in the > most common cases, that seems to make more sense to me. In your version > we'd have called enable_statement_timeout() twice consecutively (which > behaviourally is harmless). > > What do you think? I've not really tested this with the extended protocol, > so I'd appreciate if you could rerun your test from the older thread. The patch looks good and cleaner. It looks like the code works as expected. As before, I ran one INSERT statement with PgJDBC, with gdb's breakpoints set on enable_timeout() and disable_timeout(). I confirmed that enable_timeout() is called just once at Parse message, and disable_timeout() is called just once at Execute message. I'd like to wait for Tatsuo-san's thorough testing with pgproto. Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
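A sketch of the centralized shape being reviewed here, using the timeout
API from utils/timeout.h; the helper names and the flag are assumptions,
and the committed patch may differ:

/* Assumed sketch: arm the statement timeout once when a command
 * starts, and disarm it when the command finishes, so that a
 * Parse/Bind/Execute sequence shares a single timeout. */
static bool stmt_timeout_active = false;

static void
enable_statement_timeout(void)
{
	if (!stmt_timeout_active && StatementTimeout > 0)
	{
		enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);
		stmt_timeout_active = true;
	}
}

static void
disable_statement_timeout(void)
{
	if (stmt_timeout_active)
	{
		disable_timeout(STATEMENT_TIMEOUT, false);
		stmt_timeout_active = false;
	}
}

With this shape, start_xact_command() calls enable_statement_timeout(),
so calling it twice in a row is harmless (as noted above), and
finish_xact_command() calls disable_statement_timeout().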
Re: [HACKERS] partitioned tables and contrib/sepgsql
Robert Haas writes:
> On Tue, Apr 4, 2017 at 6:56 PM, Joe Conway wrote:
>> Any objections?
> I'm guessing Tom's going to have a strong feeling about whether 0001a
> is the right way to address the stdbool issue,

I will?  [ looks ... ]  Yup, you're right.

I doubt that works at all, TBH. What I'd expect to happen with a
typical compiler is a complaint about redefinition of typedef bool,
because c.h already declared it and here this fragment is doing so
again. It'd make sense to me to do

+ #ifdef bool
+ #undef bool
+ #endif

to get rid of the macro definition of bool that stdbool.h is supposed
to provide. But there should be no reason to declare our typedef a
second time.

Another issue is whether you won't get compiler complaints about
redefinition of the "true" and "false" macros. But those would
likely only be warnings, not flat-out errors.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
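One plausible shape of the whole dance being discussed, as a sketch
rather than project code: let <stdbool.h> define its macros, then drop
the ones that would collide with c.h's own definitions.

#include <stdbool.h>

/* In C99, <stdbool.h> defines bool, true and false as macros; remove
 * them so that c.h's "typedef char bool" and its own true/false
 * macros can be declared without redefinition errors or warnings. */
#ifdef bool
#undef bool
#endif
#ifdef true
#undef true
#endif
#ifdef false
#undef false
#endif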
Re: [HACKERS] Compiler warning in costsize.c
On Wed, Apr 5, 2017 at 2:54 AM, Tom Lane wrote:
> Michael Paquier writes:
>> In builds where USE_ASSERT_CHECKING is not enabled, costsize.c can
>> generate warnings. Here is an example with MSVC:
>> src/backend/optimizer/path/costsize.c(4520): warning C4101: 'rte' :
>> unreferenced local variable [C:\Users\ioltas\git\postgres\postgres.vcxproj]
>> src/backend/optimizer/path/costsize.c(4640): warning C4101: 'rte' :
>> unreferenced local variable [C:\Users\ioltas\git\postgres\postgres.vcxproj]
>
>> a9c074ba7 has made an effort, but a bit more is needed, as attached.
>
> That doesn't look right at all:
>
> +#ifdef USE_ASSERT_CHECKING
>  RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;
> +#endif
>
> The entire point of the PG_USED_FOR_ASSERTS_ONLY annotation is to
> suppress this type of warning without a need for an explicit #ifdef
> like that one.
>
> We should either fix PG_USED_FOR_ASSERTS_ONLY to work on MSVC as well,
> or decide that we don't care about suppressing such warnings on MSVC,
> or give up on PG_USED_FOR_ASSERTS_ONLY and remove it everywhere in
> favor of plain #ifdefs.

Visual Studio does not have any equivalent of __attribute__((unused)),
at least none that I could find after looking around. A trick that I can
think of would be to change PG_USED_FOR_ASSERTS_ONLY to be roughly a
macro like this:

#if defined(__GNUC__)
#define PG_ASSERT_ONLY(x) ASSERT_ONLY_ ## x __attribute__((unused))
#else
#define PG_ASSERT_ONLY(x) ((void) x)
#endif

But that's ugly...

> (I'm personally not that much in love with PG_USED_FOR_ASSERTS_ONLY,
> because it tends to confuse pgindent.)

I would be inclined to just do that; any other solution I can think of
is uglier than that.

--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
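For reference, a minimal sketch of how such an annotation is typically
defined, which also shows why MSVC still warns: outside GCC-compatible
compilers it expands to nothing.

/* Sketch: expands to GCC's unused attribute where available, and to
 * nothing elsewhere (e.g. under MSVC, hence the C4101 warnings). */
#if defined(__GNUC__) || defined(__clang__)
#define PG_USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define PG_USED_FOR_ASSERTS_ONLY
#endif

/* Usage:
 *     RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY;
 */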
Re: [HACKERS] Statement timeout behavior in extended queries
Andres Freund writes: > On 2017-04-05 10:05:19 +0900, Tatsuo Ishii wrote: >> What's your point of the question? What kind of problem do you expect >> if the timeout starts only once at the first parse meesage out of >> bunch of parse messages? > It's perfectly valid to send a lot of Parse messages without > interspersed Sync or Bind/Execute message. There'll be one timeout > covering all of those Parse messages, which can thus lead to a timeout, > even though nothing actually takes long individually. It might well be reasonable to redefine statement_timeout as limiting the total time from the first client input to the response to Sync ... but if that's what we're doing, let's make sure we do it consistently. I haven't read the patch, but the comments in this thread make me fear that it's introducing some ad-hoc, inconsistent behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN cost estimate
On 3 April 2017 at 03:05, Emre Hasegeli wrote:
> Unfortunately, I am on vacation for two weeks without my computer. I
> can post another version after the 18th. I know we are under time
> pressure for release. I wouldn't mind if you or Alvaro would change it
> any way you like.

I've made some changes. Actually, I completely changed how the estimates
work. I find this method more self-explanatory. Basically, we work out
the total index ranges, then work out how many of those we'd touch in a
perfectly correlated scenario. We then work out how many ranges we'll
probably visit based on the correlation estimates from the stats, and
assume the selectivity is probableRanges / totalIndexRanges.

I've attached a spreadsheet that compares Emre's method to mine. Mine
seems to favour the BRIN index less when the table is small. I think
this is pretty natural, since if there is only 1 range and we narrow the
result to one of them, then we might as well have performed a seqscan.
My method seems to favour BRIN a bit longer when the correlation is
between about 1% and 100%, but penalises BRIN much more when correlation
is less than around 1%. This may be better: my way is certainly smarter
than the unpatched version, but still holds on a bit longer, which may
be more favourable if a BRIN index actually exists. It might be more
annoying for a user to have added a BRIN index and never have the
planner choose it. My method also never suffers from estimating > 100%
of the table.

I was a bit worried that Emre's method would penalise BRIN too much when
the correlation is not so high.

Interested to hear comments on this.

Please feel free to play with the spreadsheet by changing rows 1-3 in
column B.

--
David Rowley
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

brin-correlation-drowley_v1.patch
Description: Binary data

BRIN_estimates2.ods
Description: application/vnd.oasis.opendocument.spreadsheet

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
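A rough illustration of the estimate described above. The exact
interpolation by correlation is assumed here for the sake of the example;
it is not lifted from the patch:

#include <math.h>

/* Rough sketch: estimate the fraction of BRIN ranges visited,
 * interpolating between the perfectly-correlated minimum and
 * visiting every range as |correlation| falls towards zero. */
static double
brin_selectivity_sketch(double totalIndexRanges, double minRanges,
						double correlation)
{
	double		probableRanges;

	probableRanges = minRanges +
		(1.0 - fabs(correlation)) * (totalIndexRanges - minRanges);

	/* by construction this never exceeds totalIndexRanges, so the
	 * estimate never goes above 100% of the table */
	return probableRanges / totalIndexRanges;
}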
Re: [HACKERS] Supporting huge pages on Windows
From: Craig Ringer [mailto:craig.rin...@2ndquadrant.com]
> On 5 April 2017 at 10:37, Tsunakawa, Takayuki wrote:
> > OTOH, I tried again to leave the DISABLE_MAX_PRIVILEGE as is and add
> > Lock Pages in Memory, using the attached pg_ctl.c. Please see
> > EnableLockPagesPrivilege() and its call site. But pg_ctl -w start
> > fails emitting the following message:
>
> That won't work. You'd have to pass 0 to the flags of
> CreateRestrictedToken and instead supply a PrivilegesToDelete array.
> You'd probably GetTokenInformation and AND with a mask of the ones you
> wanted to retain.

Uh, that's inconvenient. We can't determine what privileges to delete,
and we must be aware of new privileges added in future versions of
Windows.

Then, I have to say the last patch (v12) is the final answer.

Regards
Takayuki Tsunakawa

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of SASLprep for SCRAM-SHA-256
On Wed, Apr 5, 2017 at 7:05 AM, Heikki Linnakangas wrote:
> I will continue tomorrow, but I wanted to report on what I've done so
> far. Attached is a new patch version, quite heavily modified. Notable
> changes so far:

Great, thanks!

> * Use Unicode codepoints, rather than UTF-8 bytes packed in 32-bit
> ints. IMHO this makes the tables easier to read (to a human), and they
> are also packed slightly more tightly (see next two points), as you can
> fit more codepoints in a 16-bit integer.

Using codepoints directly is not very consistent with the existing
backend, but for the sake of packing things more tightly, OK.

> * Reorganize the decomposition table. Instead of a separate
> binary-searched table for each "size", have one table that's ordered by
> codepoint, which contains a length and offset into another array, where
> all the decomposed sequences are. This makes the recomposition step
> somewhat slower, as it now has to grovel through an array that contains
> all decompositions, rather than just the array that contains
> decompositions of two characters. But this isn't performance-critical,
> readability and tight packing of the tables matter more.

Okay, no objection to that.

> * In the lookup table, store singleton decompositions that decompose
> to a single character, and the character fits in a uint16, directly in
> the main lookup table, instead of storing an index into the other
> table. This makes the tables somewhat smaller, as there are a lot of
> such decompositions.

Indeed.

> * Use uint32 instead of pg_wchar to represent unicode codepoints.
> pg_wchar suggests something that holds multi-byte characters in the OS
> locale, used by functions like wcscmp, so avoid that. I added a
> "typedef uint32 Codepoint" alias, though, to make it more clear which
> variables hold Unicode codepoints rather than e.g. indexes into arrays.
>
> * The Unicode consortium publishes a comprehensive normalization test
> suite, as a text file, NormalizationTest.txt. I wrote a small perl and
> C program to run all the tests from that test suite, with the
> normalization function. That uncovered a number of bugs in the
> recomposition algorithm, which I then fixed:

I was looking for something like that for some time, thanks! That's here
actually:
http://www.unicode.org/Public/8.0.0/ucd/NormalizationTest.txt

> * decompositions that map to ascii characters cannot be excluded.
> * non-canonical and non-starter decompositions need to be excluded.
> They are now flagged in the lookup table, so that we only use them
> during the decomposition phase, not when recomposing.

Okay on that.

> TODOs / Discussion points:
>
> * I'm not sure I like the way the code is organized, I feel that we're
> littering src/common with these. Perhaps we should add a
> src/common/unicode_normalization directory for all this.

pg_utf8_islegal() and pg_utf_mblen() should be moved into their own file
as well, I think, and wchar.c can use that.

> * The list of characters excluded from recomposition is currently
> hard-coded in utf_norm_generate.pl. However, that list is available in
> machine-readable format, in file CompositionExclusions.txt. Since we're
> reading most of the data from UnicodeData.txt, would be good to read
> the exclusion table from a file, too.

Ouch. Those are present here...
http://www.unicode.org/reports/tr41/tr41-19.html#Exclusions
Definitely it makes more sense to read them from a file.

> * SASLPrep specifies normalization form KC, but it also specifies that
> some characters are mapped to space or nothing. Should do those
> mappings, too.

Ah, right. Those ones are here:
https://tools.ietf.org/html/rfc3454#appendix-B.1

The conversion tables need some extra tweaking...

+	else if ((*utf & 0xf0) == 0xe0)
+	{
+		if (utf[1] == 0 || utf[2] == 0)
+			goto invalid;
+		cp = (*utf++) & 0x0F;
+		cp = (cp << 6) | ((*utf++) & 0x3F);
+		cp = (cp << 6) | ((*utf++) & 0x3F);
+	}
+	else if ((*utf & 0xf8) == 0xf0)
+	{
+		if (utf[1] == 0 || utf[2] == 0 || utf[3] == 0)
+			goto invalid;
+		cp = (*utf++) & 0x07;
+		cp = (cp << 6) | ((*utf++) & 0x3F);
+		cp = (cp << 6) | ((*utf++) & 0x3F);
+		cp = (cp << 6) | ((*utf++) & 0x3F);
+	}

I find something like pg_utf2wchar_with_len() more readable, but well...

Some typos:
s/fore/for/
s/reprsented/represented/

You seem to be fully on it... I can help out with any of those items as
needed.

--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)
On 4/4/17 12:55, Ashutosh Sharma wrote:
> As I am not seeing any response from Tomas for the last 2-3 days and
> since the commitfest is coming towards its end, I have planned to work
> on the review comments that I had given a few days back and submit the
> updated patch. PFA new version of patch that takes care of all the
> review comments given by me. I have also run pgindent on btreefuncs.c
> and have run some basic test cases. All looks fine to me now!

committed

--
Peter Eisentraut    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On 4/4/17 22:47, Amit Kapila wrote: >> Committed first part to allow internal representation change (only). >> >> No commitment yet to increasing wal-segsize in the way this patch has it. >> > > What part of patch you don't like and do you have any suggestions to > improve the same? I think there are still some questions and disagreements about how it should behave. I suggest the next step is to dial up the allowed segment size in configure and run some tests about what a reasonable maximum value could be. I did a little bit of that, but somewhere around 256 MB, things got really slow. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Wed, Apr 5, 2017 at 8:42 AM, Robert Haas wrote:
> On Tue, Apr 4, 2017 at 10:21 PM, Pavan Deolasee wrote:
> > On Thu, Mar 30, 2017 at 7:55 PM, Robert Haas wrote:
> >> but try to access the TOAST table would be fatal; that probably
> >> would have deadlock hazards among other problems.
> >
> > Hmm. I think you're right. We could make a copy of the heap tuple,
> > drop the lock and then access TOAST to handle that. Would that work?
>
> Yeah, but it might suck. :-)

Well, better than causing a deadlock ;-)

Let's see if we want to go down the path of blocking WARM when tuples
have toasted attributes. I submitted a patch yesterday, but having slept
over it, I think I made mistakes there. It might not be enough to look
at the caller-supplied new tuple, because that may not have any toasted
values, but the final tuple that gets written to the heap may be
toasted. We could look at the new tuple's attributes to find if any
indexed attributes are toasted, but that might suck as well.

Or we can simply block WARM if the old or the new tuple has external
attributes, i.e. HeapTupleHasExternal() returns true. That could be
overly restrictive, because irrespective of whether the indexed
attributes are toasted or just some other attribute is toasted, we will
block WARM on such updates. Maybe that's not a problem.

We will also need to handle the case where some older tuple in the chain
has a toasted value and that tuple is presented to recheck (I think we
can handle that case fairly easily, but it's not done in the code yet)
because of a subsequent WARM update, and the tuples updated by WARM did
not have any toasted values (and hence were allowed).

Thanks,
Pavan

--
Pavan Deolasee
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
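A rough sketch of the conservative check Pavan mentions, using heapam's
HeapTupleHasExternal() macro; the function and variable names here are
hypothetical:

#include "access/htup_details.h"

/* Hypothetical guard in the update path: refuse a WARM update
 * whenever either tuple version carries toasted (external)
 * attributes, trading some missed WARM opportunities for never
 * having to reach into TOAST while holding the buffer lock. */
static bool
warm_update_allowed(HeapTuple oldtup, HeapTuple newtup)
{
	if (HeapTupleHasExternal(oldtup) ||
		HeapTupleHasExternal(newtup))
		return false;
	return true;
}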
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Tue, Apr 4, 2017 at 10:21 PM, Pavan Deolasee wrote: > On Thu, Mar 30, 2017 at 7:55 PM, Robert Haas wrote: >> but >> try to access the TOAST table would be fatal; that probably would have >> deadlock hazards among other problems. > > Hmm. I think you're right. We could make a copy of the heap tuple, drop the > lock and then access TOAST to handle that. Would that work? Yeah, but it might suck. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
On Tue, Apr 4, 2017 at 10:22 AM, Ashutosh Bapat wrote:
> Yes, I agree. For an inner join, the partition key types need to
> "shrink" and for outer join they need to be "widened". I don't know if
> there is a way to know the "wider" or "shorter" of two given types. We
> might have to implement a method to merge partition keys to produce the
> partition key of the join, which may be different from either of the
> partition keys. So, after all we may have to abandon the idea of a
> canonical partition scheme. I haven't included this change in the
> attached set of patches.

I think this is why you need to regard the partitioning scheme as
something more like an equivalence class - possibly the partitioning
scheme should actually contain (or be?) an equivalence class. Suppose
this is the query:

SELECT * FROM i4 INNER JOIN i8 ON i4.x = i8.x;

...where i4 (x) is an int4 partitioning key and i8 (x) is an int8
partitioning key. It's meaningless to ask whether the result of the
join is partitioned by int4 or int8. It's partitioned by the
equivalence class that contains both i4.x and i8.x. If the result of
this join were joined to another table on either of those two columns, a
second partition-wise join would be theoretically possible. If you
insist on knowing the type of the partitioning scheme, rather than just
the opfamily, you've boxed yourself into a corner from which there's no
good escape.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Supporting huge pages on Windows
On 5 April 2017 at 10:37, Tsunakawa, Takayuki wrote:
> Good point! As I said earlier in this thread, I think managing
> privileges (adding/revoking privileges from the user account) is the
> DBA's or sysadmin's duty, and PG's removing all privileges feels like
> overkill.

I think it's a sensible alternative to refusing to run as a highly
privileged role, which is what we used to do IIRC.

> OTOH, I tried again to leave the DISABLE_MAX_PRIVILEGE as is and add
> Lock Pages in Memory, using the attached pg_ctl.c. Please see
> EnableLockPagesPrivilege() and its call site. But pg_ctl -w start fails
> emitting the following message:

That won't work. You'd have to pass 0 to the flags of
CreateRestrictedToken and instead supply a PrivilegesToDelete array.
You'd probably GetTokenInformation and AND with a mask of the ones you
wanted to retain.

--
Craig Ringer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics (v25)
At Tue, 4 Apr 2017 20:19:39 +0200, Tomas Vondra wrote in <56f40b20-c464-fad2-ff39-06b668fac...@2ndquadrant.com> > On 04/04/2017 09:55 AM, David Rowley wrote: > > On 1 April 2017 at 04:25, David Rowley > > wrote: > >> I've attached an updated patch. > > > > I've made another pass at this and ended up removing the tryextstats > > variable. We now only try to use extended statistics when > > clauselist_selectivity() is given a valid RelOptInfo with rtekind == > > RTE_RELATION, and of course, it must also have some extended stats > > defined too. > > > > I've also cleaned up a few more comments, many of which I managed to > > omit updating when I refactored how the selectivity estimates ties > > into clauselist_selectivity() > > > > I'm quite happy with all of this now, and would also be happy for > > other people to take a look and comment. > > > > As a reviewer, I'd be marking this ready for committer, but I've moved > > a little way from just reviewing this now, having spent two weeks > > hacking at it. > > > > The latest patch is attached. > > > > Thanks David, I agree the reworked patch is much cleaner that the last > version I posted. Thanks for spending your time on it. > > Two minor comments: > > 1) DEPENDENCY_MIN_GROUP_SIZE > > I'm not sure we still need the min_group_size, when evaluating > dependencies. It was meant to deal with 'noisy' data, but I think it > after switching to the 'degree' it might actually be a bad idea. > > Consider this: > > create table t (a int, b int); > insert into t select 1, 1 from generate_series(1, 1) s(i); > insert into t select i, i from generate_series(2, 2) s(i); > create statistics s with (dependencies) on (a,b) from t; > analyze t; > > select stadependencies from pg_statistic_ext ; > stadependencies > > [{1 => 2 : 0.44}, {2 => 1 : 0.44}] > (1 row) > > So the degree of the dependency is just ~0.333 although it's obviously > a perfect dependency, i.e. a knowledge of 'a' determines 'b'. The > reason is that we discard 2/3 of rows, because those groups are only a > single row each, except for the one large group (1/3 of rows). > > Without the mininum group size limitation, the dependencies are: > > test=# select stadependencies from pg_statistic_ext ; > stadependencies > > [{1 => 2 : 1.00}, {2 => 1 : 1.00}] > (1 row) > > which seems way more reasonable, I think. I think the same. Quite large part of functional dependency in reality is in this kind. > 2) A minor detail is that instead of this > > if (estimatedclauses != NULL && > bms_is_member(listidx, estimatedclauses)) > continue; > > perhaps we should do just this: > > if (bms_is_member(listidx, estimatedclauses)) > continue; > > bms_is_member does the same NULL check right at the beginning, so I > don't think this might make a measurable difference. I have some other comments. == - The comment for clauselist_selectivity, | + * When 'rel' is not null and rtekind = RTE_RELATION, we'll try to apply | + * selectivity estimates using any extended statistcs on 'rel'. The 'rel' is actually a parameter but rtekind means rel->rtekind so this might be better be such like the following. | When a relation of RTE_RELATION is given as 'rel', we try | extended statistcs on the relation. Then the following line doesn't seem to be required. | + * If we identify such extended statistics exist, we try to apply them. = The following comment in the same function, | +if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL) | +{ | +/* | + * Try to estimate with multivariate functional dependency statistics. 
| + * | + * The function will supply an estimate for the clauses which it | + * estimated for. Any clauses which were unsuitible were ignored. | + * Clauses which were estimated will have their 0-based list index set | + * in estimatedclauses. We must ignore these clauses when processing | + * the remaining clauses later. | + */ (Note that I'm not a good writer.) This might be better as the following: | dependencies_clauselist_selectivity gives the selectivity over | clauses to which functional dependencies on the given relation are | applicable. The 0-based index numbers of consumed clauses are | returned in the bitmap set estimatedclauses so that the estimation | hereafter can ignore them. = | +s1 *= dependencies_clauselist_selectivity(root, clauses, varRelid, | + jointype, sjinfo, rel, &estimatedclauses); The name prefix "dependencies_" means "functional_dependencies" here, and omitting "functional" is confusing to me. On the other hand, "functional_dependency" is quite long as a prefix. Could we use "func_dependency" or something that is shorter?
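To make the min_group_size trade-off concrete, here is a minimal illustrative sketch of such a degree computation. It is not the patch's actual dependency_degree() code, and the SortedRow type is invented for the example. Rows are assumed sorted by (a, b); a group of rows sharing 'a' supports the dependency a => b only if all of them also share the same 'b':

#include <stdbool.h>

typedef struct SortedRow        /* hypothetical stand-in for sample rows */
{
    int a;
    int b;
} SortedRow;

/*
 * degree(a => b) = fraction of sampled rows that fall in groups whose
 * 'a' value maps to a single 'b' value.  With no minimum group size,
 * single-row groups trivially support the dependency, which is why the
 * example above yields 1.00 instead of ~0.333.
 */
static double
dependency_degree_sketch(const SortedRow *rows, int nrows)
{
    int     n_supporting = 0;
    int     group_start = 0;

    for (int i = 1; i <= nrows; i++)
    {
        /* group boundary: 'a' changes, or we ran off the end */
        if (i == nrows || rows[i].a != rows[group_start].a)
        {
            bool    consistent = true;

            for (int j = group_start + 1; j < i; j++)
            {
                if (rows[j].b != rows[group_start].b)
                {
                    consistent = false;
                    break;
                }
            }

            if (consistent)
                n_supporting += i - group_start;
            group_start = i;
        }
    }

    return nrows > 0 ? (double) n_supporting / nrows : 0.0;
}

A minimum group size filter would simply add "i - group_start >= min_group_size" to the supporting-group condition above, which is exactly what discards the single-row groups in Tomas's example.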
Re: [HACKERS] identity columns
On 4/3/17, Peter Eisentraut wrote: > On 3/30/17 22:57, Vitaly Burovoy wrote: >> Why do you still want to leave "ADD IF NOT EXISTS" instead of using >> "SET IF NOT EXISTS"? >> If someone wants to follow the standard, he can simply not use "IF >> NOT EXISTS" at all. Without it, an error should be raised. > > As I tried to mention earlier, it is very difficult to implement the IF > NOT EXISTS behavior here, because we need to run the commands that create > the sequence before we know whether we will need it.

I understand. You did not mention the reason. But now you have the "get_attidentity" function to check whether the column is an identity or not. If it is not, create a sequence; otherwise skip the creation. I'm afraid that after the feature freeze it is impossible to do "major reworking of things", but after the release we will have to support the "ALTER COLUMN col ADD" grammar.

> -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >

Here are my next nitpickings on the latest patch. I tried to find places lacking variable initialization. Everything else seems good to me now. I'll continue to review the patch while you're fixing the current notes. Unfortunately I don't know the PG internals deeply enough to judge the correctness of the changes in the executor (what Tom and Andres are talking about).

0. There is a small conflict when applying to current master because of 60a0b2e.

1. First of all, I think the previous usage of the constant "ATTRIBUTE_IDENTITY_NONE" (for setting a value) is better than just '\0'. Omitting the constant in comparisons is OK.

2. "SET IDENTITY" and "RESTART" are missing from the "Compatibility" chapter in "alter_table.sgml": "The forms ADD (without USING INDEX), DROP, SET DEFAULT, and SET DATA TYPE (without USING) conform with the SQL standard." Also, ADD IDENTITY is an extension (a temporary one, I hope), but from the above sentence it looks like a standard feature.

3. It would be better if tab-completion had the ability to complete the "RESTART" keyword, like: ALTER TABLE x1 ALTER COLUMN i RESTART tab-complete.c:8898 - COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "ADD", "DROP"); + COMPLETE_WITH_LIST6("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

4. The block in "gram.y": /* ALTER TABLE ALTER [COLUMN] DROP IDENTITY */ does not contain "n->missing_ok = false;". I think it should.

5. In "pg_dump.c", in the "getTableAttrs" function at row 7959, "tbinfo->needs_override" is used (as an operand of the "||" operator), but it is never initially set to "false" anywhere.

6. And finally... a segfault when pg_dump-ing pre-9.6 databases: the SQL query in the "getTables" function of "pg_dump.c" has the "is_identity_sequence" column only in the "if (fout->remoteVersion >= 90600)" block (row 5536), whereas 9.6 does not actually support it (but OK, for 9.6 it is always FALSE, so there is no sense in creating a new "if" block); in the other blocks, however, no ", FALSE AS is_identity_sequence" is added. The same mistake is in the "getTableAttrs" function. Moreover, is "ORDER BY a.attrelid, a.attnum" really necessary ("else if (fout->remoteVersion >= 90200)" has only "ORDER BY a.attnum")? -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
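Point 6 above is the classic pg_dump version-skew pattern: every server-version branch of a catalog query must project the same column list, with literal constants standing in for columns that older servers lack. A hedged sketch of the shape of the fix, where the function and the column expression for the new branch are illustrative inventions, not the actual patch:

#include "pqexpbuffer.h"

/*
 * Sketch only: keep the projected columns identical across version
 * branches so that later PQfnumber()/PQgetvalue() calls never index a
 * column that the query for an older server forgot to emit.
 */
static void
append_identity_seq_column(PQExpBuffer query, int remoteVersion)
{
	if (remoteVersion >= 100000)
	{
		/* stand-in for the real catalog expression used on v10+ */
		appendPQExpBufferStr(query,
							 ", s.is_identity AS is_identity_sequence");
	}
	else
	{
		/* older servers: project a constant with the same name */
		appendPQExpBufferStr(query,
							 ", false AS is_identity_sequence");
	}
}

With that, PQgetvalue(res, i, PQfnumber(res, "is_identity_sequence")) is well defined against every supported server version, which is exactly what the missing ", FALSE AS is_identity_sequence" would have guaranteed.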
Re: [HACKERS] increasing the default WAL segment size
On Wed, Apr 5, 2017 at 3:36 AM, Simon Riggs wrote: > On 27 March 2017 at 15:36, Beena Emerson wrote: > >> 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes >> the internal representation of max_wal_size and min_wal_size to mb. > > Committed first part to allow internal representation change (only). > > No commitment yet to increasing wal-segsize in the way this patch has it. > Which part of the patch don't you like, and do you have any suggestions for improving it? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 5 April 2017 at 08:23, Craig Ringer wrote: > On 5 April 2017 at 08:00, Craig Ringer wrote: > >> Taking a look at this now. > > Rebased to current master with conflicts and whitespace errors fixed. > Review pending. This patch fails to update the documentation at all. https://www.postgresql.org/docs/devel/static/spi.html The patch crashes in initdb with --enable-cassert builds: performing post-bootstrap initialization ... TRAP: FailedAssertion("!(myState->pub.mydest == DestSQLFunction)", File: "functions.c", Line: 800) sh: line 1: 30777 Aborted (core dumped) "/home/craig/projects/2Q/postgres/tmp_install/home/craig/pg/10/bin/postgres" --single -F -O -j -c search_path=pg_catalog -c exit_on_error=true template1 > /dev/null child process exited with exit code 134 Backtrace attached. Details on patch 1: missing newline +} +int +SPI_execute_callback(const char *src, bool read_only, long tcount, +/* Execute a previously prepared plan with a callback Destination */ caps "Destination" +// XXX throw error if callback is set ^^ +static DestReceiver spi_callbackDR = { +donothingReceive, donothingStartup, donothingCleanup, donothingCleanup, +DestSPICallback +}; Is the callback destreceiver supposed to be a blackhole? Why? Its names, spi_callbackDR and DestSPICallback, don't seem to indicate that it drops its input. Presumably that's a default destination you're then supposed to modify with your own callbacks? There aren't any comments to indicate what's going on here. Comments on patch 2: If this is the "bare minimum" patch, what is pending? Does it impose any downsides or limits? +/* Get switch execution contexts */ +extern PLyExecutionContext *PLy_switch_execution_context(PLyExecutionContext *new); The comment makes no sense to me. This seems to be something like a memory context switch, where you supply the new and return the old. But the comment doesn't explain it at all. +void PLy_CSStartup(DestReceiver *self, int operation, TupleDesc typeinfo); +void PLy_CSDestroy(DestReceiver *self); These are declared in plpy_spi.c. Why aren't these static? +volatile MemoryContext oldcontext; +volatile ResourceOwner oldowner; int rv; -volatile MemoryContext oldcontext; -volatile ResourceOwner oldowner; Unnecessary code movement. In PLy_Callback_New, I think your use of a sub-context is sensible. Is it necessary to palloc0 though? +static CallbackState * +PLy_Callback_Free(CallbackState *callback) The code here looks like it could be part of spi.c's callback support, rather than plpy_spi specific, since you provide a destroy callback in the SPI callback struct. +/* We need to store this because the TupleDesc the receive function gets has no names. */ +myState->desc = typeinfo; Is it safe to just store the pointer to the TupleDesc here? What's the lifetime? + * will clean it up when the time is right. XXX This might result in a leak + * if an error happens and the result doesn't get dereferenced. + */ +MemoryContextSwitchTo(TopMemoryContext); +result->tupdesc = CreateTupleDescCopy(typeinfo); especially given this XXX comment... The patch needs bug fixes, docs updates, and fixes for the issues marked in comments. But the overall approach looks sensible enough. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services #0 0x7f9d5d64ea28 in raise () from /lib64/libc.so.6 No symbol table info available. #1 0x7f9d5d65062a in abort () from /lib64/libc.so.6 No symbol table info available.
#2 0x0083a521 in ExceptionalCondition (conditionName=conditionName@entry=0x9b68d0 "!(myState->pub.mydest == DestSQLFunction)", errorType=errorType@entry=0x882d29 "FailedAssertion", fileName=fileName@entry=0x9b6920 "functions.c", lineNumber=lineNumber@entry=800) at assert.c:54 No locals. #3 0x006166e8 in postquel_start (fcache=0x27772c0, es=0x284eb58) at functions.c:800 myState = 0x284eeb8 dest = 0x284eeb8 #4 fmgr_sql (fcinfo=0x27a7a28) at functions.c:1149 fcache = 0x27772c0 sqlerrcontext = {previous = 0x0, callback = 0x614910 , arg = 0x27a79d0} randomAccess = 0 '\000' lazyEvalOK = is_first = pushed_snapshot = 0 '\000' es = 0x284eb58 slot = result = eslist = eslc = 0x284eb90 __func__ = "fmgr_sql" #5 0x0060810b in ExecInterpExpr (state=0x27a7528, econtext=0x27a4ce0, isnull=) at execExprInterp.c:670 fcinfo = 0x27a7a28 argnull = 0x27a7d68 "" argno = op = 0x27a76f8 resultslot = 0x27a7130 innerslot = outerslot = 0x27a5930 scanslot = 0x0 dispatch_table = {0x608960 , 0x607e00 , 0x607e20 , 0x607e38 , 0x607c90 , 0x607cb0 , 0x607cf8 , 0x607d14 , 0x607d58 , 0x607d73 , 0x607e50 , 0x607ea8 , 0x607ee0 , 0x607f18 , 0x607f30 , 0x607f80 , 0x607fb8
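For context on what the review is poking at: a DestReceiver (src/include/tcop/dest.h) is just a struct of four callbacks plus a CommandDest tag, so a caller-supplied receiver for SPI_execute_callback() could look roughly like the sketch below. This is illustrative only, not code from the patch; the CounterState type and function names are invented, and DestSPICallback is the tag the patch itself adds.

#include "postgres.h"
#include "tcop/dest.h"
#include "executor/tuptable.h"

typedef struct CounterState
{
	DestReceiver pub;			/* must be the first field */
	uint64		processed;		/* rows seen so far */
} CounterState;

static bool
counter_receive(TupleTableSlot *slot, DestReceiver *self)
{
	/* inspect 'slot' here instead of materializing it in a tuplestore */
	((CounterState *) self)->processed++;
	return true;				/* true = keep sending rows */
}

static void
counter_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
{
	/* typeinfo describes the result columns; stash it if needed */
}

static void
counter_shutdown(DestReceiver *self)
{
}

static void
counter_destroy(DestReceiver *self)
{
	pfree(self);
}

static DestReceiver *
make_counter_receiver(void)
{
	CounterState *self = (CounterState *) palloc0(sizeof(CounterState));

	self->pub.receiveSlot = counter_receive;
	self->pub.rStartup = counter_startup;
	self->pub.rShutdown = counter_shutdown;
	self->pub.rDestroy = counter_destroy;
	self->pub.mydest = DestSPICallback;
	return &self->pub;
}

If that is indeed the intended usage, spi_callbackDR being a do-nothing template makes some sense, but the review's point stands: a comment saying so is needed.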
Re: [HACKERS] Supporting huge pages on Windows
From: Craig Ringer [mailto:craig.rin...@2ndquadrant.com] > TBH, anyone who cares about security and runs Win7 or Win2k8 or newer should > be using virtual service accounts and managed service accounts. > > https://technet.microsoft.com/en-us/library/dd548356 > > > Those are more like Unix service accounts. Notably they don't need a password, > getting rid of some of the management pain that led us to abandon the > 'postgres' system user on Windows. > > Now that older platforms are EoL and even the oldest that added this feature > are also near EoL or in extended maintenance, I think installers should > switch to these by default instead of using NETWORK SERVICE. > > Then the issue of priv dropping would be a lesser concern anyway.

Good point! And as I said earlier in this thread, I think managing privileges (adding/revoking privileges from the user account) is the DBA's or sysadmin's duty, and PG's removing all privileges feels like overkill.

OTOH, I tried again to leave the DISABLE_MAX_PRIVILEGE as is and add Lock Pages in Memory, using the attached pg_ctl.c. Please see EnableLockPagesPrivilege() and its call site. But pg_ctl -w start fails, emitting the following message: error code 1300 waiting for server to start... FATAL: could not enable "Lock pages in memory" privilege HINT: Assign "Lock pages in memory" privilege to the Windows user account which runs PostgreSQL. LOG: database system is shut down stopped waiting pg_ctl: could not start server Examine the log output.

Error code 1300 is ERROR_NOT_ALL_ASSIGNED, which means AdjustTokenPrivileges() could not enable the Lock Pages in Memory privilege. It seems that the privilege cannot be enabled once it has been removed with CreateRestrictedToken(DISABLE_MAX_PRIVILEGE). Regards Takayuki Tsunakawa /*- * * pg_ctl --- start/stops/restarts the PostgreSQL server * * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group * * src/bin/pg_ctl/pg_ctl.c * *- */ #ifdef WIN32 /* * Need this to get defines for restricted tokens and jobs. And it * has to be set before any header from the Win32 API is loaded. */ #define _WIN32_WINNT 0x0501 #endif #include "postgres_fe.h" #include "libpq-fe.h" #include "pqexpbuffer.h" #include #include #include #include #include #include #include #include #ifdef HAVE_SYS_RESOURCE_H #include #include #endif #include "getopt_long.h" #include "miscadmin.h" /* PID can be negative for standalone backend */ typedef long pgpid_t; typedef enum { SMART_MODE, FAST_MODE, IMMEDIATE_MODE } ShutdownMode; typedef enum { NO_COMMAND = 0, INIT_COMMAND, START_COMMAND, STOP_COMMAND, RESTART_COMMAND, RELOAD_COMMAND, STATUS_COMMAND, PROMOTE_COMMAND, KILL_COMMAND, REGISTER_COMMAND, UNREGISTER_COMMAND, RUN_AS_SERVICE_COMMAND } CtlCommand; #define DEFAULT_WAIT 60 static bool do_wait = false; static bool del_service_rid = false; static bool wait_set = false; static int wait_seconds = DEFAULT_WAIT; static bool wait_seconds_arg = false; static bool silent_mode = false; static ShutdownMode shutdown_mode = FAST_MODE; static int sig = SIGINT; /* default */ static CtlCommand ctl_command = NO_COMMAND; static char *pg_data = NULL; static char *pg_config = NULL; static char *pgdata_opt = NULL; static char *post_opts = NULL; static const char *progname; static char *log_file = NULL; static char *exec_path = NULL; static char *event_source = NULL; static char *register_servicename = "PostgreSQL"; /* FIXME: + version ID?
*/ static char *register_username = NULL; static char *register_password = NULL; static char *argv0 = NULL; static bool allow_core_files = false; static time_t start_time; static char postopts_file[MAXPGPATH]; static char version_file[MAXPGPATH]; static char pid_file[MAXPGPATH]; static char backup_file[MAXPGPATH]; static char recovery_file[MAXPGPATH]; static char promote_file[MAXPGPATH]; #ifdef WIN32 static DWORD pgctl_start_type = SERVICE_AUTO_START; static SERVICE_STATUS status; static SERVICE_STATUS_HANDLE hStatus = (SERVICE_STATUS_HANDLE) 0; static HANDLE shutdownHandles[2]; static pid_t postmasterPID = -1; #define shutdownEvent shutdownHandles[0] #define postmasterProcess shutdownHandles[1] #endif static void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2); static void do_advice(void); static void do_help(void); static void set_mode(char *modeopt); static void set_sig(char *signame); static void do_init(void); static void do_start(void); static void do_stop(void); static void do_restart(void); static void do_reload(void); static void do_status(void); static void do_promote(void); static void do_kill(pgpid_t pid); static void print_msg(const char *msg); static void adjust_data
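The attachment above ends before EnableLockPagesPrivilege() itself, so for reference, here is a hedged reconstruction of what such a function presumably looks like. It is the standard Win32 token-adjustment pattern, not Tsunakawa-san's exact code, and error paths are trimmed:

#include <windows.h>

static int
enable_lock_pages_privilege_sketch(void)
{
	HANDLE		hToken;
	TOKEN_PRIVILEGES tp;
	LUID		luid;

	if (!OpenProcessToken(GetCurrentProcess(),
						  TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &hToken))
		return 0;

	if (!LookupPrivilegeValue(NULL, SE_LOCK_MEMORY_NAME, &luid))
		return 0;

	tp.PrivilegeCount = 1;
	tp.Privileges[0].Luid = luid;
	tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

	if (!AdjustTokenPrivileges(hToken, FALSE, &tp, 0, NULL, NULL))
		return 0;

	/*
	 * AdjustTokenPrivileges() reports success even when it assigned
	 * nothing; GetLastError() == ERROR_NOT_ALL_ASSIGNED (1300) is the
	 * failure reported above.  A privilege that was stripped from the
	 * token by CreateRestrictedToken(DISABLE_MAX_PRIVILEGE) is gone for
	 * good and cannot be re-enabled this way, which matches the
	 * observed behavior.
	 */
	if (GetLastError() == ERROR_NOT_ALL_ASSIGNED)
		return 0;

	CloseHandle(hToken);
	return 1;
}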
Re: [HACKERS] Parallel Append implementation
On Wed, Apr 5, 2017 at 1:43 AM, Andres Freund wrote: > On 2017-04-04 08:01:32 -0400, Robert Haas wrote: > > On Tue, Apr 4, 2017 at 12:47 AM, Andres Freund > wrote: > > > I don't think the parallel seqscan is comparable in complexity with the > > > parallel append case. Each worker there does the same kind of work, and > > > if one of them is behind, it'll just do less. But correct sizing will > > > be more important with parallel-append, because with non-partial > > > subplans the work is absolutely *not* uniform. > > > > Sure, that's a problem, but I think it's still absolutely necessary to > > ramp up the maximum "effort" (in terms of number of workers) > > logarithmically. If you just do it by costing, the winning number of > > workers will always be the largest number that we think we'll be able > > to put to use - e.g. with 100 branches of relatively equal cost we'll > > pick 100 workers. That's not remotely sane. > > I'm quite unconvinced that just throwing a log() in there is the best > way to combat that. Modeling the issue of starting more workers through > tuple transfer, locking, startup overhead costing seems better to me. > > If the goal is to compute the results of the query as fast as possible, > and to not use more than max_parallel_per_XXX, and it's actually > beneficial to use more workers, then we should. Because otherwise you > really can't use the resources available. > +1. I had expressed a similar opinion earlier, but yours is better articulated. Thanks. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
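For readers following the argument: the logarithmic ramp-up Robert describes amounts to something like the sketch below. This is purely an illustration of the idea under debate, not code from any posted patch.

#include <math.h>

/*
 * Illustrative only: cap a Parallel Append's planned worker count
 * logarithmically in the number of subplans rather than letting costing
 * alone pick it.  With 100 roughly equal-cost subplans this yields
 * 1 + floor(log2(100)) = 7 workers instead of 100.
 */
static int
parallel_append_workers_sketch(int num_subplans, int max_workers)
{
	int			workers = 1 + (int) floor(log2((double) num_subplans));

	return (workers > max_workers) ? max_workers : workers;
}

Andres's counter-position is that the cap should instead fall out of honest costing of worker startup, tuple transfer, and locking overhead, in which case no explicit log() clamp would be needed.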
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
On Tue, Apr 4, 2017 at 10:52 PM, Peter Eisentraut wrote: > On 4/3/17 11:32, Andres Freund wrote: >> That doesn't strike me as particularly future-proof. We intentionally >> leave objects behind pg_regress runs, but that only works if we actually >> run them... > > I generally agree with the sentiments expressed later in this thread. > But just to clarify what I meant here: We don't need to run a, say, > 1-minute serial test to load a few "left behind" objects for the > pg_upgrade test, if we can load the same set of objects using dedicated > scripting in say 2 seconds. This would make both the pg_upgrade tests > faster and would reduce the hidden dependencies in the main tests about > which kinds of objects need to be left behind. Making the tests run shorter while maintaining the current code coverage is nice. But this makes test suite maintenance more complicated, as it needs either a dedicated regression schedule or an extra test suite where objects are created just for the sake of pg_upgrade. This increases the risk of the test suite rotting over time if patch makers and reviewers are not careful. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewriting the test of pg_upgrade as a TAP test
On Tue, Apr 4, 2017 at 9:30 PM, Stephen Frost wrote: > * Michael Paquier (michael.paqu...@gmail.com) wrote: >> On Tue, Apr 4, 2017 at 7:38 AM, Stephen Frost wrote: >> The patch presented here does not lower the coverage we have now. > > I assume (perhaps mistakenly) that this statement was based on an > analysis of before-and-after 'make coverage' runs. Here you are saying > that you're *not* lowering the coverage. My apologies here; when I used the word "coverage" in my previous emails it apparently implied that I had used the coverage tooling. But I did not, because the proposed TAP test does exactly what test.sh is doing: 1) Initialize the old cluster and start it. 2) Create a bunch of databases with the full range of ASCII characters. 3) Run the regression tests. 4) Take a dump of the old cluster. 5) Stop the old cluster. 6) Initialize the new cluster. 7) Run pg_upgrade. 8) Start the new cluster. 9) Take a dump from it. 10) Run the deletion script (oops, forgot this part!) > I understand how the current pg_upgrade tests work. I don't see > off-hand why the TAP tests would reduce the code coverage of pg_upgrade, > but if they do, we should be able to figure out why and correct it. The good news is that this patch at least does not lower the bar. >> So in short I don't think that this lack of infrastructure should be a >> barrier for what is basically a cleanup but... I just work here. > > I didn't mean to imply that this patch needs to address the > cross-version testing challenge, was merely mentioning that there's been > some work in this area already by Andrew and that if you're interested > in working on that problem that you should probably coordinate with him. Sure. > What I do think is a barrier to this patch moving forward is if it > reduces our current code coverage testing (with the same-version > pg_upgrade that's run in the regular regression tests). If it doesn't, > then great, but if it does, then the patch should be updated to fix > that. I did not do a coverage test first, but surely this patch needs numbers, so here you go. Without the patch, here is the coverage of src/bin/pg_upgrade: lines..: 57.7% (1311 of 2273 lines) functions..: 85.3% (87 of 102 functions) And with the patch: lines..: 58.8% (1349 of 2294 lines) functions..: 85.6% (89 of 104 functions) The coverage gets a bit higher as a couple of basic code paths like pg_upgrade --help get looked at. -- Michael pgupgrade-tap-test-v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
On Thu, Mar 30, 2017 at 7:55 PM, Robert Haas wrote: > but > try to access the TOAST table would be fatal; that probably would have > deadlock hazards among other problems. Hmm. I think you're right. We could make a copy of the heap tuple, drop the lock and then access TOAST to handle that. Would that work? Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
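A hedged sketch of the copy-first approach being floated here; the names are illustrative and this is not from the WARM patch itself:

#include "postgres.h"
#include "access/htup_details.h"
#include "access/tuptoaster.h"
#include "storage/bufmgr.h"

/*
 * Illustrative only: copy the tuple while the buffer content lock is
 * held, drop the lock, and only then expand TOASTed attributes.  TOAST
 * access acquires its own buffer locks on the TOAST relation, so doing
 * it while still holding the original content lock is what creates the
 * deadlock hazard Robert describes.
 */
static HeapTuple
copy_then_detoast_sketch(Buffer buf, HeapTuple tup, TupleDesc tupdesc)
{
	HeapTuple	copy;

	/* caller holds the buffer content lock at this point */
	copy = heap_copytuple(tup);
	LockBuffer(buf, BUFFER_LOCK_UNLOCK);

	/* safe now: may fetch TOAST chunks from the TOAST relation */
	return toast_flatten_tuple(copy, tupdesc);
}

The extra heap_copytuple() is just a memcpy into palloc'd memory, so the cost is modest; whether the lock can safely be released at that point in the WARM code path is exactly Pavan's question.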
Re: [HACKERS] Statement timeout behavior in extended queries
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii > Hmm. IMO, that could happen even with the current statement timeout > implementation as well. > > Or we could start/stop the timeout in exec_execute_message() only. This > could avoid the problem above. Also this is more consistent with > log_duration/log_min_duration_statement behavior than it is now. I think it's better to include Parse and Bind, as we do now. Parse or Bind could take a long time when the table has many partitions, the query is complex and/or very long, some DDL statement is running against a target table, or the system load is high. Firing the statement timeout during or after many Parses is not a problem, because the first Parse has started running some statement and it has not finished. Plus, Andres confirmed that major client drivers don't use such a pattern. There may be odd behavior purely from the perspective of the extended query protocol; that's a compromise, which is what Andres meant by "not perfect but." Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
> On 2017-04-04 16:56:26 -0700, 'Andres Freund' wrote: >> On 2017-04-04 23:52:28 +, Tsunakawa, Takayuki wrote: >> > From: Andres Freund [mailto:and...@anarazel.de] >> > > Looks to me like npgsql doesn't do that either. None of libpq, pgjdbs >> > > and >> > > npgsql doing it seems like some evidence that it's ok. >> > >> > And psqlODBC now uses always libpq. >> > >> > Now time for final decision? >> >> I'll send an updated patch in a bit, and then will wait till tomorrow to >> push. Giving others a chance to chime in seems fair. > > Attached. I did not like that the previous patch had the timeout > handling duplicated in the individual functions, I instead centralized > it into start_xact_command(). Given that it already activated the > timeout in the most common cases, that seems to make more sense to > me. In your version we'd have called enable_statement_timeout() twice > consecutively (which behaviourally is harmless). > > What do you think? I've not really tested this with the extended > protocol, so I'd appreciate if you could rerun your test from the > older thread. Ok, I will look into your patch and test it out using pgproto. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
On 2017-04-04 16:56:26 -0700, 'Andres Freund' wrote: > On 2017-04-04 23:52:28 +, Tsunakawa, Takayuki wrote: > > From: Andres Freund [mailto:and...@anarazel.de] > > > Looks to me like npgsql doesn't do that either. None of libpq, pgjdbs and > > > npgsql doing it seems like some evidence that it's ok. > > > > And psqlODBC now uses always libpq. > > > > Now time for final decision? > > I'll send an updated patch in a bit, and then will wait till tomorrow to > push. Giving others a chance to chime in seems fair. Attached. I did not like that the previous patch had the timeout handling duplicated in the individual functions, I instead centralized it into start_xact_command(). Given that it already activated the timeout in the most common cases, that seems to make more sense to me. In your version we'd have called enable_statement_timeout() twice consecutively (which behaviourally is harmless). What do you think? I've not really tested this with the extended protocol, so I'd appreciate if you could rerun your test from the older thread. - Andres >From c47d904725f3ef0272a4540b6b5bcf0be5c1dea6 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Tue, 4 Apr 2017 18:44:42 -0700 Subject: [PATCH] Rearm statement_timeout after each executed query. Previously statement_timeout, in the extended protocol, affected all messages till a Sync message. For clients that pipeline/batch query execution that's problematic. Instead disable timeout after each Execute message, and enable, if necessary, the timer in start_xact_command(). Author: Tatsuo Ishii, editorialized by Andres Freund Reviewed-By: Takayuki Tsunakawa, Andres Freund Discussion: https://postgr.es/m/20170222.115044.1665674502985097185.t-is...@sraoss.co.jp --- src/backend/tcop/postgres.c | 77 ++--- 1 file changed, 65 insertions(+), 12 deletions(-) diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index a2282058c0..4ab3bd7f07 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -149,6 +149,11 @@ static bool doing_extended_query_message = false; static bool ignore_till_sync = false; /* + * Flag to keep track of whether statement timeout timer is active. + */ +static bool stmt_timeout_active = false; + +/* * If an unnamed prepared statement exists, it's stored here. * We keep it separate from the hashtable kept by commands/prepare.c * in order to reduce overhead for short-lived queries. @@ -188,6 +193,8 @@ static bool IsTransactionStmtList(List *pstmts); static void drop_unnamed_stmt(void); static void SigHupHandler(SIGNAL_ARGS); static void log_disconnections(int code, Datum arg); +static void enable_statement_timeout(void); +static void disable_statement_timeout(void); /* @@ -1234,7 +1241,8 @@ exec_parse_message(const char *query_string, /* string to execute */ /* * Start up a transaction command so we can run parse analysis etc. (Note * that this will normally change current memory context.) Nothing happens - * if we are already in one. + * if we are already in one. This also arms the statement timeout if + * necessary. */ start_xact_command(); @@ -1522,7 +1530,8 @@ exec_bind_message(StringInfo input_message) /* * Start up a transaction command so we can call functions etc. (Note that * this will normally change current memory context.) Nothing happens if - * we are already in one. + * we are already in one. This also arms the statement timeout if + * necessary. 
*/ start_xact_command(); @@ -2014,6 +2023,9 @@ exec_execute_message(const char *portal_name, long max_rows) * those that start or end a transaction block. */ CommandCounterIncrement(); + + /* full command has been executed, reset timeout */ + disable_statement_timeout(); } /* Send appropriate CommandComplete to client */ @@ -2443,25 +2455,27 @@ start_xact_command(void) { StartTransactionCommand(); - /* Set statement timeout running, if any */ - /* NB: this mustn't be enabled until we are within an xact */ - if (StatementTimeout > 0) - enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout); - else - disable_timeout(STATEMENT_TIMEOUT, false); - xact_started = true; } + + /* + * Start statement timeout if necessary. Note that this'll intentionally + * not reset the clock on an already started timeout, to avoid the timing + * overhead when start_xact_command() is invoked repeatedly, without an + * interceding finish_xact_command() (e.g. parse/bind/execute). If that's + * not desired, the timeout has to be disabled explicitly. + */ + enable_statement_timeout(); } static void finish_xact_command(void) { + /* cancel active statement timeout after each command */ + disable_statement_timeout(); + if (xact_started) { - /* Cancel any active statement timeout before committing */ - disable_timeout(STATEMENT_TIMEOUT, false); - Co
Re: [HACKERS] partitioned tables and contrib/sepgsql
On Tue, Apr 4, 2017 at 6:56 PM, Joe Conway wrote: > On 04/04/2017 10:02 AM, Joe Conway wrote: >> On 04/04/2017 09:55 AM, Mike Palmiotto wrote: >>> After some discussion off-list, I've rebased and udpated the patches. >>> Please see attached for further review. >> >> Thanks -- will have another look and test on a machine with selinux >> setup. Robert, did you want me to take responsibility to commit on this >> or just provide review/feedback? > > I did some editorializing on these. > > In particular I did not like the approach to fixing "warning: ‘tclass’ > may be used uninitialized" and ended up just doing it the same as was > done elsewhere in relation.c already (set tclass = 0 in the variable > declaration). Along the way I also changed one instance of tclass from > uint16 to uint16_t for the sake of consistency. > > Interestingly we figured out that the warning was present with -Og, but > not present with -O0, -O2, or -O3. > > If you want to test, apply 0001a and 0001b before 0002. > > Any objections? I'm guessing Tom's going to have a strong feeling about whether 0001a is the right way to address the stdbool issue, but I don't personally know what the right thing to do is so I will avoid opining on that topic. So, nope, no objections here to you committing those. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
>> What's the point of your question? What kind of problem do you expect >> if the timeout starts only once at the first parse message out of a >> bunch of parse messages? > > It's perfectly valid to send a lot of Parse messages without > interspersed Sync or Bind/Execute messages. There'll be one timeout > covering all of those Parse messages, which can thus lead to a timeout, > even though nothing actually takes long individually. Hmm. IMO, that could happen even with the current statement timeout implementation as well. Or we could start/stop the timeout in exec_execute_message() only. This could avoid the problem above. Also this is more consistent with log_duration/log_min_duration_statement behavior than it is now. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE command progress checker
On Wed, Apr 5, 2017 at 1:49 AM, Robert Haas wrote: > On Tue, Apr 4, 2017 at 4:57 AM, Amit Langote > wrote: >> Hmm, you're right. It could be counted with a separate variable >> initialized to 0 and incremented every time we decide to add a row to the >> final set of sampled rows, although different implementations of >> AcquireSampleRowsFunc have different ways of deciding if a given row will >> be part of the final set of sampled rows. >> >> On the other hand, if we decide to count progress in terms of blocks as >> you suggested, I'm afraid that FDWs won't be able to report the >> progress. > > I think it may be time to push this patch out to v11. It was > submitted one day before the start of the last CommitFest, the design > wasn't really right, and it's not clear even now that we know what the > right design is. And we're pretty much out of time. > +1 We're still running into design issues, and it will take more time to find the right design, including FDW support. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
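For the record, the row-based counting idea Amit describes would be a one-line addition at each point where a sampling function keeps a row. A hedged sketch only, since no such counter was committed for v10 and the PROGRESS_ANALYZE_ROWS_SAMPLED constant is invented here:

/*
 * Illustrative only: inside acquire_sample_rows(), report each row that
 * enters the sample.  The reservoir-replacement path would need the
 * same treatment, and, as noted above, third-party AcquireSampleRowsFunc
 * implementations in FDWs would not report anything at all, which is
 * the core of the design problem.
 */
if (numrows < targrows)
{
	rows[numrows++] = heap_copytuple(&targtuple);	/* existing logic */
	pgstat_progress_update_param(PROGRESS_ANALYZE_ROWS_SAMPLED, numrows);
}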
Re: [HACKERS] Statement timeout behavior in extended queries
On 2017-04-05 10:05:19 +0900, Tatsuo Ishii wrote: > > Hm. I started to edit it, but I'm halfway coming back to my previous > > view that this isn't necessarily ready. > > > > If a client were to prepare a large number of prepared statements > > (i.e. a lot of parse messages), this'll only start the timeout once, at > > the first statement sent. It's not an issue for libpq which sends a > > sync message after each PQprepare, nor does it look to be an issue for > > pgjdbc based on a quick look. > > > > Does anybody think there might be a driver out there that sends a bunch > > of 'parse' messages, without syncs? > > What's the point of your question? What kind of problem do you expect > if the timeout starts only once at the first parse message out of a > bunch of parse messages? It's perfectly valid to send a lot of Parse messages without interspersed Sync or Bind/Execute messages. There'll be one timeout covering all of those Parse messages, which can thus lead to a timeout, even though nothing actually takes long individually. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Outdated comments around HandleFunctionRequest
Hi, PostgresMain() has the following blurb for fastpath functions: /* * Note: we may at this point be inside an aborted * transaction. We can't throw error for that until we've * finished reading the function-call message, so * HandleFunctionRequest() must check for it after doing so. * Be careful not to do anything that assumes we're inside a * valid transaction here. */ and in HandleFunctionRequest() there's: * INPUT: * In protocol version 3, postgres.c has already read the message body * and will pass it in msgBuf. * In old protocol, the passed msgBuf is empty and we must read the * message here. which is not true anymore. Followed by: /* * Now that we've eaten the input message, check to see if we actually * want to do the function call or not. It's now safe to ereport(); we * won't lose sync with the frontend. */ which is also not really meaningful, because there's no previous code in the function. This largely seems to be damage from commit 2b3a8b20c2da9f39ffecae25ab7c66974fbc0d3b Author: Heikki Linnakangas Date: 2015-02-02 17:08:45 +0200 Be more careful to not lose sync in the FE/BE protocol. Heikki? - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
> Hm. I started to edit it, but I'm halfway coming back to my previous > view that this isn't necessarily ready. > > If a client were to prepare a large number of prepared statements > (i.e. a lot of parse messages), this'll only start the timeout once, at > the first statement sent. It's not an issue for libpq which sends a > sync message after each PQprepare, nor does it look to be an issue for > pgjdbc based on a quick look. > > Does anybody think there might be a driver out there that sends a bunch > of 'parse' messages, without syncs? What's the point of your question? What kind of problem do you expect if the timeout starts only once at the first parse message out of a bunch of parse messages? Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - allow to store select results into variables
>> Please find attached a v8 which hopefully fixes these two issues. > Looks good to me, marking as ready for committer. I have looked into this a little bit. It seems the new feature \gset doesn't work with tables having non-ASCII column names: $ src/bin/pgbench/pgbench -t 1 -f /tmp/f test starting vacuum...end. gset: invalid variable name: "数字" client 0 file 0 command 0 compound 0: error storing into var 数字 transaction type: /tmp/f scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 1 number of transactions actually processed: 0/1 This is because pgbench variable names are limited to the ASCII range. IMO the limitation is unnecessary and should be removed. (I know that the limitation was brought in by someone a long time ago and the patch author is not responsible for that.) Anyway, now that the feature reveals the undocumented limitation, we should at least document the limitation for \gset. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
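The relaxation hinted at here is small: the variable-name check just needs to accept bytes with the high bit set, so that multibyte UTF-8 names pass while ASCII punctuation is still rejected. A hedged sketch of such a predicate, modeled on but not identical to pgbench's actual check:

#include <stdbool.h>

/*
 * Illustrative only: treat any high-bit-set byte as a legal
 * variable-name character, letting names such as "数字" through.
 */
static bool
legal_variable_char(unsigned char c)
{
	return (c >= 0x80) ||		/* any byte of a multibyte character */
		(c >= 'a' && c <= 'z') ||
		(c >= 'A' && c <= 'Z') ||
		(c >= '0' && c <= '9') ||
		c == '_';
}

This deliberately does not validate the multibyte sequences themselves; it merely stops rejecting them, which is all the \gset case above needs.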
Re: [HACKERS] [PATCH] Reduce src/test/recovery verbosity
On Wed, Apr 5, 2017 at 12:39 AM, Stephen Frost wrote: > Committed, with those additions. Thanks for the commit. The final result is nice. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 5 April 2017 at 08:00, Craig Ringer wrote: > Taking a look at this now. Rebased to current master with conflicts and whitespace errors fixed. Review pending. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 71b6163734934db3160de81fd064e7d113b9122f Mon Sep 17 00:00:00 2001 From: Jim Nasby Date: Wed, 1 Mar 2017 15:45:51 -0600 Subject: [PATCH 1/2] Add SPI_execute_callback() and callback-based DestReceiver. Instead of placing results in a tuplestore, this method of execution uses the supplied callback when creating the Portal for a query. --- src/backend/executor/spi.c | 79 -- src/backend/tcop/dest.c| 11 +++ src/include/executor/spi.h | 4 +++ src/include/tcop/dest.h| 1 + 4 files changed, 85 insertions(+), 10 deletions(-) diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 3a89ccd..a0af31a 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -55,7 +55,8 @@ static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan); static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, Snapshot snapshot, Snapshot crosscheck_snapshot, - bool read_only, bool fire_triggers, uint64 tcount); + bool read_only, bool fire_triggers, uint64 tcount, + DestReceiver *callback); static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes, Datum *Values, const char *Nulls); @@ -321,7 +322,34 @@ SPI_execute(const char *src, bool read_only, long tcount) res = _SPI_execute_plan(&plan, NULL, InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); + + _SPI_end_call(true); + return res; +} +int +SPI_execute_callback(const char *src, bool read_only, long tcount, + DestReceiver *callback) +{ + _SPI_plan plan; + int res; + + if (src == NULL || tcount < 0) + return SPI_ERROR_ARGUMENT; + + res = _SPI_begin_call(true); + if (res < 0) + return res; + + memset(&plan, 0, sizeof(_SPI_plan)); + plan.magic = _SPI_PLAN_MAGIC; + plan.cursor_options = 0; + + _SPI_prepare_oneshot_plan(src, &plan); + + res = _SPI_execute_plan(&plan, NULL, + InvalidSnapshot, InvalidSnapshot, + read_only, true, tcount, callback); _SPI_end_call(true); return res; @@ -355,7 +383,34 @@ SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls, _SPI_convert_params(plan->nargs, plan->argtypes, Values, Nulls), InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); + + _SPI_end_call(true); + return res; +} + +/* Execute a previously prepared plan with a callback Destination */ +int +SPI_execute_plan_callback(SPIPlanPtr plan, Datum *Values, const char *Nulls, + bool read_only, long tcount, DestReceiver *callback) +{ + int res; + + if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0) + return SPI_ERROR_ARGUMENT; + + if (plan->nargs > 0 && Values == NULL) + return SPI_ERROR_PARAM; + + res = _SPI_begin_call(true); + if (res < 0) + return res; + + res = _SPI_execute_plan(plan, + _SPI_convert_params(plan->nargs, plan->argtypes, +Values, Nulls), + InvalidSnapshot, InvalidSnapshot, + read_only, true, tcount, callback); _SPI_end_call(true); return res; @@ -384,7 +439,7 @@ SPI_execute_plan_with_paramlist(SPIPlanPtr plan, ParamListInfo params, res = _SPI_execute_plan(plan, params, InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); _SPI_end_call(true); return res; @@ -425,7 +480,7 @@ SPI_execute_snapshot(SPIPlanPtr plan, _SPI_convert_params(plan->nargs, plan->argtypes, Values, Nulls), snapshot, 
crosscheck_snapshot, - read_only, fire_triggers, tcount); + read_only, fire_triggers, tcount, NULL); _SPI_end_call(true); return res; @@ -472,7 +527,7 @@ SPI_execute_with_args(const char *src, res = _SPI_execute_plan(&plan, paramLI, InvalidSnapshot, InvalidSnapshot, - read_only, true, tcount); + read_only, true, tcount, NULL); _SPI_end_call(true); return res; @@ -1907,7 +1962,8 @@ _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan) static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, Snapshot snapshot, Snapshot crosscheck_snapshot, - bool read_only, bool fire_triggers, uint64 tcount) + bool read_only, bool fire_triggers, uint64 tcount, + DestReceiver *callback) { int my_res = 0; uint64 my_processed = 0; @@ -1918,6 +1974,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI, ErrorContextCallback spierrcontext; CachedPlan *cplan = NULL; ListCell *lc1; + DestReceiver *dest = callback; /* * Setup error traceback support for ereport() @@ -2037,7 +2094,6 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamLi
Re: [HACKERS] Speedup twophase transactions
On Tue, Mar 28, 2017 at 3:10 PM, Michael Paquier wrote: > OK, done. I have just noticed that Simon has marked himself as a > committer of this patch 24 hours ago. For the archive's sake, this has been committed as 728bd991. Thanks Simon! -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Wed, Apr 5, 2017 at 11:49 AM, Kevin Grittner wrote: > Worked on the docs some more and then pushed it. > > Nice job cutting the number of *.[ch] lines by 30 while adding support for > the other three core PLs. :-) Great. Thanks. I wonder if there is some way we can automatically include code fragments in the documentation without keeping them in sync manually. Now it looks like I have no more excuses for putting off reading the papers you've shared[1][2] about incremental matview algorithms! Looking forward to helping out with the next steps in that project if I can. [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.2254&rep=rep1&type=pdf [2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.3208&rep=rep1&type=pdf -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Faster methods for getting SPI results (460% improvement)
On 6 March 2017 at 05:09, Jim Nasby wrote: > On 2/28/17 9:42 PM, Jim Nasby wrote: >>> >>> >>> I'll post a plpython patch that doesn't add the output format control. >> >> >> I've attached the results of that. Unfortunately the speed improvement >> is only 27% at this point (with 999 tuples). Presumably that's >> because it's constructing a brand new dictionary from scratch for each >> tuple. Taking a look at this now. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
On 2017-04-04 23:52:28 +0000, Tsunakawa, Takayuki wrote: > From: Andres Freund [mailto:and...@anarazel.de] > > Looks to me like npgsql doesn't do that either. None of libpq, pgjdbc and > > npgsql doing it seems like some evidence that it's ok. > > And psqlODBC now always uses libpq. > > Now time for a final decision? I'll send an updated patch in a bit, and then will wait till tomorrow to push. Giving others a chance to chime in seems fair. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
From: Andres Freund [mailto:and...@anarazel.de] > Looks to me like npgsql doesn't do that either. None of libpq, pgjdbc and > npgsql doing it seems like some evidence that it's ok. And psqlODBC now always uses libpq. Now time for a final decision? Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On Mon, Apr 3, 2017 at 7:16 PM, Thomas Munro wrote: > On Tue, Apr 4, 2017 at 3:41 AM, Kevin Grittner wrote: >> On Mon, Apr 3, 2017 at 8:59 AM, Tom Lane wrote: >>> Thomas Munro writes: Or perhaps the code to inject trigger data transition tables into SPI (a near identical code block in these three patches) should be somewhere common so that each PL would only need to call a function. If so, where should that go? >>> >>> spi.c? >> >> Until now, trigger.c didn't know about SPI, and spi.c didn't know >> about triggers. The intersection was left to referencing code, like >> PLs. Is there any other common code among the PLs dealing with this >> intersection? If so, maybe a new triggerspi.c file (or >> spitrigger.c?) would make sense. Possibly it could make sense from >> a code structure PoV even for a single function, but it seems kinda >> iffy for just this function. As far as I can see it comes down to >> adding it to spi.c or creating a new file -- or just duplicating >> these 30-some lines of code to every PL. > > Ok, how about SPI_register_trigger_data(TriggerData *)? Or any name > you prefer... I didn't suggest something as specific as > SPI_register_transition_tables because I think it's plausible that > someone might want to implement SQL standard REFERENCING OLD/NEW ROW > AS and make that work in all PLs too one day, and this would be the > place to do that. > > See attached, which adds the call to all four built-in PLs. Thoughts? Worked on the docs some more and then pushed it. Nice job cutting the number of *.[ch] lines by 30 while adding support for the other three core PLs. :-) -- Kevin Grittner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
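With the API as committed, a C trigger function's use of it looks roughly like this minimal sketch (error handling abbreviated; "newtab" stands for whatever name the trigger's REFERENCING NEW TABLE AS clause chose):

#include "postgres.h"
#include "fmgr.h"
#include "commands/trigger.h"
#include "executor/spi.h"

PG_FUNCTION_INFO_V1(count_transition_rows);

Datum
count_transition_rows(PG_FUNCTION_ARGS)
{
	TriggerData *trigdata = (TriggerData *) fcinfo->context;

	if (SPI_connect() != SPI_OK_CONNECT)
		elog(ERROR, "SPI_connect failed");

	/* expose the transition tables to SPI under their REFERENCING names */
	if (SPI_register_trigger_data(trigdata) != SPI_OK_TD_REGISTER)
		elog(ERROR, "SPI_register_trigger_data failed");

	/* the transition table is now visible like any other relation */
	SPI_execute("SELECT count(*) FROM newtab", true, 0);

	SPI_finish();
	return PointerGetDatum(NULL);	/* AFTER trigger: return value ignored */
}

Each of the four built-in PLs now does the equivalent of this internally, which is where the 30-lines-per-PL duplication went.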
Re: [HACKERS] Statement timeout behavior in extended queries
On 2017-04-04 16:38:53 -0700, Andres Freund wrote: > On 2017-04-04 16:10:32 +0900, Tatsuo Ishii wrote: > > >> If what Tatsuo-san said to Tom is correct (i.e. each Parse/Bind/Execute > > >> starts and stops the timer), then it's a concern and the patch should > > >> not be ready for committer. However, the current patch is not like that > > >> -- it seems to do what others in this thread are expecting. > > > > > > Oh, interesting - I kind of took the author's statement as, uh, > > > authoritative ;). A quick look over the patch confirms your > > > understanding. > > > > Yes, Tsunakawa-san is correct. Sorry for the confusion. > > > > > I think the code needs a few clarifying comments around this, but > > > otherwise seems good. Not restarting the timeout in those cases > > > obviously isn't entirely "perfect"/"correct", but a tradeoff - the > > > comments should note that. > > > > > > Tatsuo-san, do you want to change those, and push? I can otherwise. > > > > Andres, > > > > If you don't mind, could you please fix the comments and push it. > > Hm. I started to edit it, but I'm halfway coming back to my previous > view that this isn't necessarily ready. > > If a client were to prepare a large number of prepared statements > (i.e. a lot of parse messages), this'll only start the timeout once, at > the first statement sent. It's not an issue for libpq which sends a > sync message after each PQprepare, nor does it look to be an issue for > pgjdbc based on a quick look. > > Does anybody think there might be a driver out there that sends a bunch > of 'parse' messages, without syncs? Looks to me like npgsql doesn't do that either. None of libpq, pgjdbc and npgsql doing it seems like some evidence that it's ok. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
On 2017-04-05 08:34:43 +0900, Tatsuo Ishii wrote: > Andres, > > >> I think the code needs a few clarifying comments around this, but > >> otherwise seems good. Not restarting the timeout in those cases > >> obviously isn't entirely "perfect"/"correct", but a tradeoff - the > >> comments should note that. > >> > >> Tatsuo-san, do you want to change those, and push? I can otherwise. > > > > Andres, > > > > If you don't mind, could you please fix the comments and push it. > > I have changed the comments as you suggested. If it's ok, I can push > the patch myself (today I have time to work on this). I'm working on the patch, and I've edited it more heavily, so please hold off. Changes: I don't think the debugging statements are a good idea, the !xact_started should be an assert, and disable_timeout should be called from within enable_statement_timeout independent of stmt_timer_started. But more importantly I had just sent a question that I think merits discussion. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
On 2017-04-04 16:10:32 +0900, Tatsuo Ishii wrote: > >> If what Tatsuo-san said to Tom is correct (i.e. each Parse/Bind/Execute > >> starts and stops the timer), then it's a concern and the patch should not > >> be ready for committer. However, the current patch is not like that -- it > >> seems to do what others in this thread are expecting. > > > > Oh, interesting - I kind of took the author's statement as, uh, > > authoritative ;). A quick look over the patch confirms your > > understanding. > > Yes, Tsunakawa-san is correct. Sorry for the confusion. > > > I think the code needs a few clarifying comments around this, but > > otherwise seems good. Not restarting the timeout in those cases > > obviously isn't entirely "perfect"/"correct", but a tradeoff - the > > comments should note that. > > > > Tatsuo-san, do you want to change those, and push? I can otherwise. > > Andres, > > If you don't mind, could you please fix the comments and push it. Hm. I started to edit it, but I'm halfway coming back to my previous view that this isn't necessarily ready. If a client were to prepare a large number of prepared statements (i.e. a lot of parse messages), this'll only start the timeout once, at the first statement sent. It's not an issue for libpq, which sends a sync message after each PQprepare, nor does it look to be an issue for pgjdbc based on a quick look. Does anybody think there might be a driver out there that sends a bunch of 'parse' messages, without syncs? - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statement timeout behavior in extended queries
Andres, >> I think the code needs a few clarifying comments around this, but >> otherwise seems good. Not restarting the timeout in those cases >> obviously isn't entirely "perfect"/"correct", but a tradeoff - the >> comments should note that. >> >> Tatsuo-san, do you want to change those, and push? I can otherwise. > > Andres, > > If you don't mind, could you please fix the comments and push it. I have changed the comments as you suggested. If it's ok, I can push the patch myself (today I have time to work on this). Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index a2282058..1fd93359 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -149,6 +149,11 @@ static bool doing_extended_query_message = false; static bool ignore_till_sync = false; /* + * Flag to keep track of whether we have started statement timeout timer. + */ +static bool stmt_timer_started = false; + +/* * If an unnamed prepared statement exists, it's stored here. * We keep it separate from the hashtable kept by commands/prepare.c * in order to reduce overhead for short-lived queries. @@ -188,6 +193,8 @@ static bool IsTransactionStmtList(List *pstmts); static void drop_unnamed_stmt(void); static void SigHupHandler(SIGNAL_ARGS); static void log_disconnections(int code, Datum arg); +static void enable_statement_timeout(void); +static void disable_statement_timeout(void); /* @@ -1239,6 +1246,15 @@ exec_parse_message(const char *query_string, /* string to execute */ start_xact_command(); /* + * Set statement timeout running if it's not already set. The timer will + * be reset in a subsequent execute message. Ideally the timer should be + * reset in this function so that each parse/bind/execute message catches + * the timeout, but it needs gettimeofday() call for each, which is too + * expensive. + */ + enable_statement_timeout(); + + /* * Switch to appropriate context for constructing parsetrees. * * We have two strategies depending on whether the prepared statement is @@ -1526,6 +1542,15 @@ exec_bind_message(StringInfo input_message) */ start_xact_command(); + /* + * Set statement timeout running if it's not already set. The timer will + * be reset in a subsequent execute message. Ideally the timer should be + * reset in this function so that each parse/bind/execute message catches + * the timeout, but it needs gettimeofday() call for each, which is too + * expensive. + */ + enable_statement_timeout(); + /* Switch back to message context */ MemoryContextSwitchTo(MessageContext); @@ -1942,6 +1967,11 @@ exec_execute_message(const char *portal_name, long max_rows) start_xact_command(); /* + * Set statement timeout running if it's not already set. + */ + enable_statement_timeout(); + + /* * If we re-issue an Execute protocol request against an existing portal, * then we are only fetching more rows rather than completely re-executing * the query from the start. atStart is never reset for a v3 portal, so we @@ -2014,6 +2044,11 @@ exec_execute_message(const char *portal_name, long max_rows) * those that start or end a transaction block. */ CommandCounterIncrement(); + + /* + * We need to reset statement timeout if already set. 
+ */ + disable_statement_timeout(); } /* Send appropriate CommandComplete to client */ @@ -2443,14 +2478,10 @@ start_xact_command(void) { StartTransactionCommand(); - /* Set statement timeout running, if any */ - /* NB: this mustn't be enabled until we are within an xact */ - if (StatementTimeout > 0) - enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout); - else - disable_timeout(STATEMENT_TIMEOUT, false); - xact_started = true; + + /* Set statement timeout running, if any */ + enable_statement_timeout(); } } @@ -2460,7 +2491,7 @@ finish_xact_command(void) if (xact_started) { /* Cancel any active statement timeout before committing */ - disable_timeout(STATEMENT_TIMEOUT, false); + disable_statement_timeout(); CommitTransactionCommand(); @@ -4511,3 +4542,53 @@ log_disconnections(int code, Datum arg) port->user_name, port->database_name, port->remote_host, port->remote_port[0] ? " port=" : "", port->remote_port))); } + +/* + * Set statement timeout if any. + */ +static void +enable_statement_timeout(void) +{ + if (!stmt_timer_started) + { + if (StatementTimeout > 0) + { + + /* + * Sanity check + */ + if (!xact_started) + { +ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("Transaction must have been already started to set statement timeout"))); + } + + ereport(DEBUG3, + (errmsg_internal("Set statement timeout"))); + + enable_timeout_after(STATEMENT_TIMEOUT, St
Re: [HACKERS] partitioned tables and contrib/sepgsql
On 04/04/2017 10:02 AM, Joe Conway wrote: > On 04/04/2017 09:55 AM, Mike Palmiotto wrote: >> After some discussion off-list, I've rebased and udpated the patches. >> Please see attached for further review. > > Thanks -- will have another look and test on a machine with selinux > setup. Robert, did you want me to take responsibility to commit on this > or just provide review/feedback? I did some editorializing on these. In particular I did not like the approach to fixing "warning: ‘tclass’ may be used uninitialized" and ended up just doing it the same as was done elsewhere in relation.c already (set tclass = 0 in the variable declaration). Along the way I also changed one instance of tclass from uint16 to uint16_t for the sake of consistency. Interestingly we figured out that the warning was present with -Og, but not present with -O0, -O2, or -O3. If you want to test, apply 0001a and 0001b before 0002. Any objections? Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c index 1a8f884..320c098 100644 *** a/contrib/sepgsql/label.c --- b/contrib/sepgsql/label.c *** *** 10,15 --- 10,25 */ #include "postgres.h" + #include + /* + * selinux/label.h includes stdbool.h, which redefines bool, so + * revert to the postgres definition of bool from c.h + */ + #ifdef bool + #undef bool + typedef char bool; + #endif + #include "access/heapam.h" #include "access/htup_details.h" #include "access/genam.h" *** *** 37,44 #include "sepgsql.h" - #include - /* * Saved hook entries (if stacked) */ --- 47,52 diff --git a/contrib/sepgsql/relation.c b/contrib/sepgsql/relation.c index ab98a9b..2ea6bfb 100644 *** a/contrib/sepgsql/relation.c --- b/contrib/sepgsql/relation.c *** sepgsql_relation_post_create(Oid relOid) *** 243,249 HeapTuple tuple; Form_pg_class classForm; ObjectAddress object; ! uint16 tclass; char *scontext; /* subject */ char *tcontext; /* schema */ char *rcontext; /* relation */ --- 243,249 HeapTuple tuple; Form_pg_class classForm; ObjectAddress object; ! uint16_t tclass; char *scontext; /* subject */ char *tcontext; /* schema */ char *rcontext; /* relation */ *** sepgsql_relation_drop(Oid relOid) *** 413,419 { ObjectAddress object; char *audit_name; ! uint16_t tclass; char relkind; relkind = get_rel_relkind(relOid); --- 413,419 { ObjectAddress object; char *audit_name; ! uint16_t tclass = 0; char relkind; relkind = get_rel_relkind(relOid); diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c index 1a8f884..4dda82a 100644 *** a/contrib/sepgsql/label.c --- b/contrib/sepgsql/label.c *** exec_object_restorecon(struct selabel_ha *** 779,785 case RelationRelationId: relForm = (Form_pg_class) GETSTRUCT(tuple); ! if (relForm->relkind == RELKIND_RELATION) objtype = SELABEL_DB_TABLE; else if (relForm->relkind == RELKIND_SEQUENCE) objtype = SELABEL_DB_SEQUENCE; --- 779,786 case RelationRelationId: relForm = (Form_pg_class) GETSTRUCT(tuple); ! if (relForm->relkind == RELKIND_RELATION || ! relForm->relkind == RELKIND_PARTITIONED_TABLE) objtype = SELABEL_DB_TABLE; else if (relForm->relkind == RELKIND_SEQUENCE) objtype = SELABEL_DB_SEQUENCE; diff --git a/contrib/sepgsql/relation.c b/contrib/sepgsql/relation.c index ab98a9b..f8689c0 100644 *** a/contrib/sepgsql/relation.c --- b/contrib/sepgsql/relation.c *** sepgsql_attribute_post_create(Oid relOid *** 54,65 ObjectAddress object; Form_pg_attribute attForm; StringInfoData audit_name; /* ! 
* Only attributes within regular relation have individual security ! * labels. */ ! if (get_rel_relkind(relOid) != RELKIND_RELATION) return; /* --- 54,66 ObjectAddress object; Form_pg_attribute attForm; StringInfoData audit_name; + char relkind = get_rel_relkind(relOid); /* ! * Only attributes within regular relations or partition relations have ! * individual security labels. */ ! if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE) return; /* *** sepgsql_attribute_drop(Oid relOid, AttrN *** 135,142 { ObjectAddress object; char *audit_name; ! if (get_rel_relkind(relOid) != RELKIND_RELATION) return; /* --- 136,144 { ObjectAddress object; char *audit_name; + char relkind = get_rel_relkind(relOid); ! if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE) return; /* *** sepgsql_attribute_relabel(Oid relOid, At *** 167,174 { ObjectAddress object; char *audit_name; ! if (get_rel_rel
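[The rest of the diff is cut off in the archive, but the recurring hunk it applies throughout relation.c is small; condensed:]

/*
 * Partitioned tables are now treated like regular relations when
 * deciding whether an object gets its own security label.
 */
char		relkind = get_rel_relkind(relOid);

if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
	return;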
Re: [HACKERS] increasing the default WAL segment size
On 27 March 2017 at 15:36, Beena Emerson wrote: > 02-increase-max-wal-segsize.patch - Increases the wal-segsize and changes > the internal representation of max_wal_size and min_wal_size to mb. Committed first part to allow internal representation change (only). No commitment yet to increasing wal-segsize in the way this patch has it. -- Simon Riggs    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Implementation of SASLprep for SCRAM-SHA-256
On 04/04/2017 01:52 PM, Heikki Linnakangas wrote: On 03/31/2017 10:10 AM, Michael Paquier wrote: On Wed, Mar 8, 2017 at 10:39 PM, Robert Haas wrote: On Tue, Mar 7, 2017 at 10:01 PM, Michael Paquier wrote: I kinda hope Heikki is going to step up to the plate here, because I think he understands most of it already. The second person who knows a good deal about NFKC is Ishii-san. Attached is a rebased patch. Thanks, I'm looking at this now. I will continue tomorrow, but I wanted to report on what I've done so far. Attached is a new patch version, quite heavily modified. Notable changes so far: * Use Unicode codepoints, rather than UTF-8 bytes packed in a 32-bit ints. IMHO this makes the tables easier to read (to a human), and they are also packed slightly more tightly (see next two points), as you can fit more codepoints in a 16-bit integer. * Reorganize the decomposition table. Instead of a separate binary-searched table for each "size", have one table that's ordered by codepoint, which contains a length and offset into another array, where all the decomposed sequences are. This makes the recomposition step somewhat slower, as it now has to grovel through an array that contains all decompositions, rather than just the array that contains decompositions of two characters. But this isn't performance-critical, readability and tight packing of the tables matter more. * In the lookup table, store singleton decompositions that decompose to a single character, and the character fits in a uint16, directly in the main lookup table, instead of storing an index into the other table. This makes the tables somewhat smaller, as there are a lot of such decompositions. * Use uint32 instead of pg_wchar to represent unicode codepoints. pg_wchar suggests something that holds multi-byte characters in the OS locale, used by functions like wcscmp, so avoid that. I added a "typedef uint32 Codepoint" alias, though, to make it more clear which variables hold Unicode codepoints rather than e.g. indexes into arrays. * Unicode consortium publishes a comprehensive normalization test suite, as a text file, NormalizationTest.txt. I wrote a small perl and C program to run all the tests from that test suite, with the normalization function. That uncovered a number of bugs in the recomposition algorithm, which I then fixed: * decompositions that map to ascii characters cannot be excluded. * non-canonical and non-starter decompositions need to be excluded. They are now flagged in the lookup table, so that we only use them during the decomposition phase, not when recomposing. TODOs / Discussion points: * I'm not sure I like the way the code is organized, I feel that we're littering src/common with these. Perhaps we should add a src/common/unicode_normalization directory for all this. * The list of characters excluded from recomposition is currently hard-coded in utf_norm_generate.pl. However, that list is available in machine-readable format, in file CompositionExclusions.txt. Since we're reading most of the data from UnicodeData.txt, would be good to read the exclusion table from a file, too. * SASLPrep specifies normalization form KC, but it also specifies that some characters are mapped to space or nothing. Should do those mappings, too. - Heikki implement-SASLprep-2.patch.gz Description: application/gzip -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
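[To visualize the reorganized decomposition table described above, here is a rough, self-contained sketch; the type and field names are illustrative, not the actual generated header:]

#include <stdint.h>
#include <stddef.h>

/*
 * One entry per decomposable codepoint, ordered by codepoint so a plain
 * binary search finds it.  dec_index normally points into
 * decomp_sequences; a singleton decomposition that fits in 16 bits is
 * instead stored inline, flagged in dec_size_flags.
 */
typedef struct
{
	uint32_t	codepoint;
	uint8_t		dec_size_flags;	/* sequence length plus flag bits */
	uint16_t	dec_index;		/* offset into sequences, or inline value */
} unicode_decomposition;

static const uint32_t decomp_sequences[] = {0};	/* generated data */
static const unicode_decomposition decomp_main[] = {{0, 0, 0}};	/* generated */

static const unicode_decomposition *
lookup_decomposition(uint32_t code)
{
	size_t		lo = 0;
	size_t		hi = sizeof(decomp_main) / sizeof(decomp_main[0]);

	while (lo < hi)
	{
		size_t		mid = (lo + hi) / 2;

		if (decomp_main[mid].codepoint < code)
			lo = mid + 1;
		else if (decomp_main[mid].codepoint > code)
			hi = mid;
		else
			return &decomp_main[mid];
	}
	return NULL;
}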
Re: [HACKERS] bug in SlabAlloc / VALGRIND_MAKE_MEM_DEFINED
On 2017-04-04 23:23:30 +0200, Tomas Vondra wrote: > On 04/04/2017 10:42 PM, Tomas Vondra wrote: > > Hi, > > > > Andres nagged me about valgrind runs taking much longer since > > 9fab40ad introduced the SlabContext into reorderbuffer.c. And by > > "longer" I mean hours instead of minutes. > > > > After a bit of investigation I stumbled on this line in slab.c: > > > > VALGRIND_MAKE_MEM_DEFINED(chunk, SlabChunkGetPointer(chunk)); > > > > which is wrong, because the second parameter should be a size and not > > another pointer. This essentially marks a lot of memory as defined, > > which results in the extreme test runtime. > > > > Instead, the line should be: > > > > VALGRIND_MAKE_MEM_DEFINED(SlabChunkGetPointer(chunk), sizeof(int32)); > > > > Patch attached. > > > Turns out SlabCheck() was missing VALGRIND_MAKE_MEM_DEFINED, resulting in a > valgrind failure with --enable-assert. Updated patch version attached, > passing all tests in test_decoding. Pushed, and re-enabled TestDecoding on skink. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in SlabAlloc / VALGRIND_MAKE_MEM_DEFINED
On 04/04/2017 10:42 PM, Tomas Vondra wrote: Hi, Andres nagged me about valgrind runs taking much longer since 9fab40ad introduced the SlabContext into reorderbuffer.c. And by "longer" I mean hours instead of minutes. After a bit of investigation I stumbled on this line in slab.c: VALGRIND_MAKE_MEM_DEFINED(chunk, SlabChunkGetPointer(chunk)); which is wrong, because the second parameter should be a size and not another pointer. This essentially marks a lot of memory as defined, which results in the extreme test runtime. Instead, the line should be: VALGRIND_MAKE_MEM_DEFINED(SlabChunkGetPointer(chunk), sizeof(int32)); Patch attached. Turns out SlabCheck() was missing VALGRIND_MAKE_MEM_DEFINED, resulting in a valgrind failure with --enable-assert. Updated patch version attached, passing all tests in test_decoding. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services slab-valgrind-fix-v2.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Adding support for Default partition in partitioning
On Tue, Apr 4, 2017 at 9:30 AM, Rahila Syed wrote: > Hello, > > Please find attached an updated patch. > Following has been accomplished in this update: > > 1. A new partition can be added after default partition if there are no > conflicting rows in default partition. > 2. Solved the crash reported earlier. > > Thank you, > Rahila Syed > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > > Installed and compiled against commit 60a0b2ec8943451186dfa22907f88334d97cb2e0 (Date: Tue Apr 4 12:36:15 2017 -0400) without any issue. However, running your original example, I'm getting this error: keith@keith=# CREATE TABLE list_partitioned ( keith(# a int keith(# ) PARTITION BY LIST (a); CREATE TABLE Time: 4.933 ms keith@keith=# CREATE TABLE part_default PARTITION OF list_partitioned FOR VALUES IN (DEFAULT); CREATE TABLE Time: 3.492 ms keith@keith=# CREATE TABLE part_1 PARTITION OF list_partitioned FOR VALUES IN (4,5); ERROR: unrecognized node type: 216 Time: 0.979 ms Also, I'm still of the opinion that refusing to create new partitions for values already present in the default would be a cause of confusion. Moving the data out of the default and into a proper child would require first removing that data from the default, storing it elsewhere, creating the child, then moving it back. If it's only a small amount of data it may not be a big deal, but if it's a large amount, that could cause quite a lot of contention if done in a single transaction. Either that or the user would have to deal with data existing in the table, disappearing, then reappearing again. This also makes it harder to migrate an existing table: essentially no child tables for a large, existing data set could ever be created without going through one of the two situations above. However, thinking through this, I'm not sure I can see a solution without global index support. If this restriction is removed, there's still an issue of data duplication after the necessary child table is added. So I guess it's a matter of deciding which user experience is better for the moment? -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. http://www.keithf4.com
Re: [HACKERS] PATCH: recursive json_populate_record()
On 04/03/2017 05:17 PM, Andres Freund wrote: > On 2017-03-21 14:31:08 -0400, Andrew Dunstan wrote: >> >> On 03/21/2017 01:37 PM, David Steele wrote: >>> On 3/16/17 11:54 AM, David Steele wrote: On 2/1/17 12:53 AM, Michael Paquier wrote: > On Thu, Jan 26, 2017 at 6:49 AM, Tom Lane wrote: >> Nikita Glukhov writes: >>> On 25.01.2017 23:58, Tom Lane wrote: I think you need to take a second look at the code you're producing and realize that it's not so clean either. This extract from populate_record_field, for example, is pretty horrid: >>> But what if we introduce some helper macros like this: >>> #define JsValueIsNull(jsv) \ >>> ((jsv)->is_json ? !(jsv)->val.json.str \ >>> : !(jsv)->val.jsonb || (jsv)->val.jsonb->type == jbvNull) >>> #define JsValueIsString(jsv) \ >>> ((jsv)->is_json ? (jsv)->val.json.type == JSON_TOKEN_STRING \ >>> : (jsv)->val.jsonb && (jsv)->val.jsonb->type == jbvString) >> Yeah, I was wondering about that too. I'm not sure that you can make >> a reasonable set of helper macros that will fix this, but if you want >> to try, go for it. >> >> BTW, just as a stylistic thing, I find "a?b:c||d" unreadable: I have >> to go back to the manual every darn time to convince myself whether >> that means (a?b:c)||d or a?b:(c||d). It's better not to rely on >> the reader (... or the author) having memorized C's precedence rules >> in quite that much detail. Extra parens help. > Moved to CF 2017-03 as discussion is going on and more review is > needed on the last set of patches. > The current patches do not apply cleanly at cccbdde: $ git apply ../other/0001-introduce-JsonContainerXxx-macros-v04.patch error: patch failed: src/backend/utils/adt/jsonb_util.c:328 error: src/backend/utils/adt/jsonb_util.c: patch does not apply error: patch failed: src/backend/utils/adt/jsonfuncs.c:1266 error: src/backend/utils/adt/jsonfuncs.c: patch does not apply error: patch failed: src/include/utils/jsonb.h:218 error: src/include/utils/jsonb.h: patch does not apply In addition, it appears a number of suggestions have been made by Tom that warrant new patches. Marked "Waiting on Author". >>> This thread has been idle for months since Tom's review. >>> >>> The submission has been marked "Returned with Feedback". Please feel >>> free to resubmit to a future commitfest. >>> >>> >> Please revive. I am planning to look at this later this week. > Since that was 10 days ago - you're still planning to get this in before > the freeze? > Hoping to, other demands have intervened a bit, but I might be able to. cheers andrew -- Andrew Dunstan    https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
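[Incidentally, the precedence question raised in the quoted review resolves like this in C, which supports the point about adding parens:]

/* ?: binds more loosely than ||, so a ? b : c || d parses as: */
result = a ? b : (c || d);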
Re: [HACKERS] psql - add special variable to reflect the last query status
2017-04-04 22:05 GMT+02:00 Fabien COELHO : > > After some discussions about what could be useful since psql scripts now > accepts tests, this patch sets a few variables which can be used by psql > after a "front door" (i.e. actually typed by the user) query: > > - RESULT_STATUS: the status of the query > - ERROR: whether the query failed > - ERROR_MESSAGE: ... > - ROW_COUNT: #rows affected > > SELECT * FROM ; > \if :ERROR >\echo oops >\q > \endif > > I'm not sure that the names are right. Maybe STATUS would be better than > RESULT_STATUS. good ideas Regards Pavel > > > -- > Fabien. > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >
Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument
On Tue, Apr 4, 2017 at 1:32 PM, Andres Freund wrote: > s/Avoid/Allow to avoid/ WFM. >> + * >> + * Callers cannot rely on memory for tuple in returned slot remaining valid >> + * past any subsequent manipulation of the sorter, such as another fetch of >> + * input from sorter. (The sorter may recycle memory.) >> */ > > It's not just the sorter, right? I'd just rephrase that the caller > cannot rely on tuple to stay valid beyond a single call. It started that way, but changed on the basis of Tom's feedback. I would be equally happy with either wording. >> static bool >> tuplesort_gettuple_common(Tuplesortstate *state, bool forward, >> @@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, >> bool forward, >> * NULL value in leading attribute will set abbreviated value to zeroed >> * representation, which caller may rely on in abbreviated inequality check. >> * >> - * The slot receives a copied tuple (sometimes allocated in caller memory >> - * context) that will stay valid regardless of future manipulations of the >> - * tuplesort's state. >> + * If copy is TRUE, the slot receives a copied tuple that will stay valid >> + * regardless of future manipulations of the tuplesort's state. Memory is >> + * owned by the caller. If copy is FALSE, the slot will just receive a >> + * pointer to a tuple held within the tuplesort, which is more efficient, >> + * but only safe for callers that are prepared to have any subsequent >> + * manipulation of the tuplesort's state invalidate slot contents. >> */ > > Hm. Do we need to note anything about how long caller memory needs to > live? I think we'd get into trouble if the caller does a memory context > reset, without also clearing the slot? Since we arrange to have caller explicitly own memory (it's always in their own memory context (current context), which is not the case with other similar routines), it's 100% the caller's problem. As I assume you know, convention in executor around resource management of memory like this is pretty centralized within ExecStoreTuple(), and nearby routines. In general, the executor is more or less used to having this be its problem alone, and is pessimistic about memory lifetime unless otherwise noted. > Other than these minimal adjustments, this looks good to go to me. Cool. I'll try to get out a revision soon, maybe later today, including an updated 0002-* (Valgrind suppression), which I have not forgotten about. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] bug in SlabAlloc / VALGRIND_MAKE_MEM_DEFINED
Hi, Andres nagged me about valgrind runs taking much longer since 9fab40ad introduced the SlabContext into reorderbuffer.c. And by "longer" I mean hours instead of minutes. After a bit of investigation I stumbled on this line in slab.c: VALGRIND_MAKE_MEM_DEFINED(chunk, SlabChunkGetPointer(chunk)); which is wrong, because the second parameter should be a size and not another pointer. This essentially marks a lot of memory as defined, which results in the extreme test runtime. Instead, the line should be: VALGRIND_MAKE_MEM_DEFINED(SlabChunkGetPointer(chunk), sizeof(int32)); Patch attached. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services slab-valgrind-fix.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
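[The macro's contract makes the mistake easy to see; paraphrased from valgrind's memcheck.h:]

/*
 * VALGRIND_MAKE_MEM_DEFINED(addr, len) marks "len" bytes starting at
 * "addr" as addressable and defined.  Passing a pointer as the second
 * argument therefore marks a region whose size is the pointer's numeric
 * value, i.e. potentially gigabytes, hence the blown-up runtimes.
 */
VALGRIND_MAKE_MEM_DEFINED(SlabChunkGetPointer(chunk), sizeof(int32));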
Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument
On 2017-03-13 18:14:07 -0700, Peter Geoghegan wrote: > From 5351b5db257cb39832647d9117465c0217e6268b Mon Sep 17 00:00:00 2001 > From: Peter Geoghegan > Date: Thu, 13 Oct 2016 10:54:31 -0700 > Subject: [PATCH 1/2] Avoid copying within tuplesort_gettupleslot(). s/Avoid/Allow to avoid/ > Add a "copy" argument to make it optional to receive a copy of caller > tuple that is safe to use following a subsequent manipulating of > tuplesort's state. This is a performance optimization. Most existing > tuplesort_gettupleslot() callers are made to opt out of copying. > Existing callers that happen to rely on the validity of tuple memory > beyond subsequent manipulations of the tuplesort request their own copy. > > This brings tuplesort_gettupleslot() in line with > tuplestore_gettupleslot(). In the future, a "copy" tuplesort_getdatum() > argument may be added, that similarly allows callers to opt out of > receiving their own copy of tuple. > > In passing, clarify assumptions that callers of other tuplesort fetch > routines may make about tuple memory validity, per gripe from Tom Lane. > --- > src/backend/executor/nodeAgg.c | 10 +++--- > src/backend/executor/nodeSort.c| 5 +++-- > src/backend/utils/adt/orderedsetaggs.c | 5 +++-- > src/backend/utils/sort/tuplesort.c | 28 +--- > src/include/utils/tuplesort.h | 2 +- > 5 files changed, 31 insertions(+), 19 deletions(-) > > diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c > index aa08152..b650f71 100644 > --- a/src/backend/executor/nodeAgg.c > +++ b/src/backend/executor/nodeAgg.c > @@ -570,6 +570,10 @@ initialize_phase(AggState *aggstate, int newphase) > * Fetch a tuple from either the outer plan (for phase 0) or from the sorter > * populated by the previous phase. Copy it to the sorter for the next phase > * if any. > + * > + * Callers cannot rely on memory for tuple in returned slot remaining valid > + * past any subsequent manipulation of the sorter, such as another fetch of > + * input from sorter. (The sorter may recycle memory.) > */ It's not just the sorter, right? I'd just rephrase that the caller cannot rely on tuple to stay valid beyond a single call. > static bool > tuplesort_gettuple_common(Tuplesortstate *state, bool forward, > @@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool > forward, > * NULL value in leading attribute will set abbreviated value to zeroed > * representation, which caller may rely on in abbreviated inequality check. > * > - * The slot receives a copied tuple (sometimes allocated in caller memory > - * context) that will stay valid regardless of future manipulations of the > - * tuplesort's state. > + * If copy is TRUE, the slot receives a copied tuple that will stay valid > + * regardless of future manipulations of the tuplesort's state. Memory is > + * owned by the caller. If copy is FALSE, the slot will just receive a > + * pointer to a tuple held within the tuplesort, which is more efficient, > + * but only safe for callers that are prepared to have any subsequent > + * manipulation of the tuplesort's state invalidate slot contents. > */ Hm. Do we need to note anything about how long caller memory needs to live? I think we'd get into trouble if the caller does a memory context reset, without also clearing the slot? 
> bool > -tuplesort_gettupleslot(Tuplesortstate *state, bool forward, > +tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy, > TupleTableSlot *slot, Datum *abbrev) > { > MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext); > @@ -2113,8 +2117,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool > forward, > if (state->sortKeys->abbrev_converter && abbrev) > *abbrev = stup.datum1; > > - stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple); > - ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true); > + if (copy) > + stup.tuple = heap_copy_minimal_tuple((MinimalTuple) > stup.tuple); > + > + ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy); > return true; Other than these minimal adjustments, this looks good to go to me. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
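[To make the documented contract concrete, a hypothetical caller of the patched API could look like this; process_one() is a stand-in, not a real function:]

/*
 * With copy=false the returned tuple lives in tuplesort-owned memory,
 * so it must be fully consumed before the next fetch, which may
 * recycle that memory.
 */
while (tuplesort_gettupleslot(state, true, false, slot, NULL))
{
	/* forward = true, copy = false: slot points into tuplesort memory */
	process_one(slot);	/* must finish with the tuple before the next fetch */
}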
Re: [HACKERS] Logical decoding on standby
On 2017-04-04 22:32:40 +0800, Craig Ringer wrote: > I'm much happier with this. I'm still fixing some issues in the tests > for 03 and tidying them up, but 03 should allow 01 and 02 to be > reviewed in their proper context now. To me this very clearly is too late for v10, and now should be moved to the next CF. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Append implementation
On 2017-04-04 08:01:32 -0400, Robert Haas wrote: > On Tue, Apr 4, 2017 at 12:47 AM, Andres Freund wrote: > > I don't think the parallel seqscan is comparable in complexity with the > > parallel append case. Each worker there does the same kind of work, and > > if one of them is behind, it'll just do less. But correct sizing will > > be more important with parallel-append, because with non-partial > > subplans the work is absolutely *not* uniform. > > Sure, that's a problem, but I think it's still absolutely necessary to > ramp up the maximum "effort" (in terms of number of workers) > logarithmically. If you just do it by costing, the winning number of > workers will always be the largest number that we think we'll be able > to put to use - e.g. with 100 branches of relatively equal cost we'll > pick 100 workers. That's not remotely sane. I'm quite unconvinced that just throwing a log() in there is the best way to combat that. Modeling the cost of starting more workers through tuple transfer, locking, and startup overhead seems better to me. If the goal is to compute the results of the query as fast as possible, and to not use more than max_parallel_per_XXX, and it's actually beneficial to use more workers, then we should. Because otherwise you really can't use the resources available. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
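[For concreteness, the logarithmic ramp being debated looks roughly like this; illustrative only, neither proposal necessarily uses this exact formula:]

/* Cap the worker count at about log2(number of subplans) + 1. */
static int
append_worker_cap(int nsubplans)
{
	int			cap = 1;

	while (nsubplans > 1)
	{
		nsubplans >>= 1;
		cap++;
	}
	return cap;					/* e.g. 100 subplans -> 7 workers, not 100 */
}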
[HACKERS] psql - add special variable to reflect the last query status
After some discussions about what could be useful since psql scripts now accepts tests, this patch sets a few variables which can be used by psql after a "front door" (i.e. actually typed by the user) query: - RESULT_STATUS: the status of the query - ERROR: whether the query failed - ERROR_MESSAGE: ... - ROW_COUNT: #rows affected SELECT * FROM ; \if :ERROR \echo oops \q \endif I'm not sure that the names are right. Maybe STATUS would be better than RESULT_STATUS. -- Fabien.diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 3b86612..7006f23 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -3449,6 +3449,24 @@ bar + ERROR + + + Whether the last query failed. + + + + + + ERROR_MESSAGE + + + If the last query failed, the associated error message. + + + + + FETCH_COUNT @@ -3653,6 +3671,25 @@ bar + RESULT_STATUS + + + Text status of the last query. + + + + + + ROW_COUNT + + + When appropriate, how many rows were returned or affected by the + last query. + + + + + QUIET diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index a2f1259..74d22fb 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -1213,6 +1213,57 @@ PrintQueryResults(PGresult *results) return success; } +/* + * Set special variables for "front door" queries + * - RESULT_STATUS: last query status + * - ERROR: TRUE/FALSE, whether an error occurred + * - ERROR_MESSAGE: message if an error occured + * - ROW_COUNT: how many rows were returned or affected, if appropriate + */ +static void +SetResultVariables(PGresult *results) +{ + bool success; + ExecStatusType restat = PQresultStatus(results); + + SetVariable(pset.vars, "RESULT_STATUS", PQresStatus(restat)); + + switch (restat) + { + case PGRES_EMPTY_QUERY: + case PGRES_TUPLES_OK: + case PGRES_COMMAND_OK: + case PGRES_COPY_OUT: + case PGRES_COPY_IN: + case PGRES_COPY_BOTH: + case PGRES_SINGLE_TUPLE: + success = true; + SetVariable(pset.vars, "ERROR", "FALSE"); + SetVariable(pset.vars, "ERROR_MESSAGE", NULL); + break; + case PGRES_BAD_RESPONSE: + case PGRES_NONFATAL_ERROR: + case PGRES_FATAL_ERROR: + success = false; + SetVariable(pset.vars, "ERROR", "TRUE"); + SetVariable(pset.vars, "ERROR_MESSAGE", PQerrorMessage(pset.db)); + break; + default: + /* dead code */ + success = false; + psql_error("unexpected PQresultStatus: %d\n", restat); + break; + } + + if (success) + { + /* set variable if non empty */ + char *ntuples = PQcmdTuples(results); + SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : NULL); + } + else + SetVariable(pset.vars, "ROW_COUNT", NULL); +} /* * SendQuery: send the query string to the backend @@ -1346,6 +1397,9 @@ SendQuery(const char *query) elapsed_msec = INSTR_TIME_GET_MILLISEC(after); } + /* set special variables to reflect the result status */ + SetResultVariables(results); + /* but printing results isn't: */ if (OK && results) OK = PrintQueryResults(results); diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out index d602aee..3ee96b5 100644 --- a/src/test/regress/expected/psql.out +++ b/src/test/regress/expected/psql.out @@ -2904,6 +2904,36 @@ bar 'bar' "bar" \echo 'should print #8-1' should print #8-1 \endif +-- special result variables +SELECT 1 UNION SELECT 2; + ?column? 
+-- +1 +2 +(2 rows) + +\echo 'result status: ' :RESULT_STATUS +result status: PGRES_TUPLES_OK +\echo 'number of rows: ' :ROW_COUNT +number of rows: 2 +SELECT 1 UNION; +ERROR: syntax error at or near ";" +LINE 1: SELECT 1 UNION; + ^ +\echo 'result status: ' :RESULT_STATUS +result status: PGRES_FATAL_ERROR +\if :ERROR + \echo 'Oops, an error occured...' +Oops, an error occured... + \echo 'error message: ' :ERROR_MESSAGE +error message: ERROR: syntax error at or near ";" +LINE 1: SELECT 1 UNION; + ^ + +\endif +; +\echo 'result status: ' :RESULT_STATUS +result status: PGRES_EMPTY_QUERY -- SHOW_CONTEXT \set SHOW_CONTEXT never do $$ diff --git a/src/test/regress/sql/psql.sql b/src/test/regress/sql/psql.sql index b56a05f..716fe06 100644 --- a/src/test/regress/sql/psql.sql +++ b/src/test/regress/sql/psql.sql @@ -526,6 +526,21 @@ select \if false \\ (bogus \else \\ 42 \endif \\ forty_two; \echo 'should print #8-1' \endif +-- special result variables +SELECT 1 UNION SELECT 2; +\echo 'result status: ' :RESULT_STATUS +\echo 'number of rows: ' :ROW_COUNT + +SELECT 1 UNION; +\echo 'result status: ' :RESULT_STATUS +\if :ERROR + \echo 'Oops, an error occured...' + \echo 'error message: ' :ERROR_ME
Re: [HACKERS] Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)
On 4/4/17 12:55 PM, Ashutosh Sharma wrote: > > As I am not seeing any response from Tomas for the last 2-3 days and since > the commit-fest is coming towards its end, I have planned to work on the > review comments that I had given a few days back and submit the updated > patch. PFA the new version of the patch, which takes care of all the review > comments given by me. I have also run pgindent on the btreefuncs.c file > and have run some basic test cases. All looks fine to me now! Awesome, Ashutosh! -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Covering + unique indexes.
On Tue, Apr 4, 2017 at 3:07 AM, Anastasia Lubennikova wrote: >> * I think that we should store this (the number of attributes), and >> use it directly when comparing, per my remarks to Tom over on that >> other thread. We should also use the free bit within >> IndexTupleData.t_info, to indicate that the IndexTuple was truncated, >> just to make it clear to everyone that might care that that's how >> these truncated IndexTuples need to be represented. >> >> Doing this would have no real impact on your patch, because for you >> this will be 100% redundant. It will help external tools, and perhaps >> another, more general suffix truncation patch that comes in the >> future. We should try very hard to have a future-proof on-disk >> representation. I think that this is quite possible. > > To be honest, I think that it'll make the patch overcomplicated, because this > exact patch has nothing to do with suffix truncation. > Although, we can add any necessary flags if this work is continued in > the future. Yes, doing things that way would mean adding a bit more complexity to your patch, but IMV it would be worth it to have the on-disk format be compatible with what a full suffix truncation patch will eventually require. Obviously I disagree with what you say here -- I think that your patch *does* have plenty in common with suffix truncation. But, you don't have to even agree with me on that to see why what I propose is still a good idea. Tom Lane had a specific objection to this patch -- catalog metadata is currently necessary to interpret internal page IndexTuples [1]. However, by storing the true number of columns in the case of truncated tuples, we can make the situation with IndexTuples similar enough to the existing situation with heap tuples, where the number of attributes is available right in the header as "natts". We don't have to rely on something like catalog metadata from a great distance, where some caller may forget to pass through the metadata to a lower level. So, presumably doing things this way addresses Tom's exact objection to the truncation aspect of this patch [2]. We have the capacity to store something like natts "for free" -- let's use it. The lack of any direct source of metadata was called "dangerous". As much as anything else, I want to remove any danger. > There is already an assertion. > Assert(IndexTupleSize(newitup) <= IndexTupleSize(olditup)); > Do you think it is not enough? I think that a "can't happen" check will work better in the future, when user-defined code could be involved in truncation. Any extra overhead will be paid relatively infrequently, and will be very low. >> * I see a small bug. You forgot to teach _bt_findsplitloc() about >> truncation. It does this currently, which you did not update: >> >> /* >> * The first item on the right page becomes the high key of the left page; >> * therefore it counts against left space as well as right space. >> */ >> leftfree -= firstrightitemsz; >> >> I think that this accounting needs to be fixed. > > Could you explain what's wrong with this accounting? We may expect to take > more space on the left page than will be taken after highkey truncation. > But I don't see any problem here. Obviously it would at least be slightly better to have the actual truncated high key size where that's expected -- not the would-be untruncated high key size. The code as it stands might lead to a bad choice of split point in edge cases. At the very least, you should change the comments to note the issue.
I think it's highly unlikely that this could ever result in a failure to find a split point (there are many defenses against that already), but I would find that difficult to prove. The intent of the code is almost as important as the code, at least in my opinion. [1] postgr.es/m/CAH2-Wz=vmdh8pfazx9wah9bn5ast5vrna0xsz+gsfrs12bp...@mail.gmail.com [2] postgr.es/m/11895.1490983884%40sss.pgh.pa.us -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
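[For reference, t_info packs the tuple size into its low 13 bits alongside three flag bits, one of which is currently unused; the proposal is to claim it. The new flag name here is hypothetical:]

#define INDEX_SIZE_MASK			0x1FFF	/* low 13 bits: tuple size */
#define INDEX_TRUNCATED_MASK	0x2000	/* hypothetical: suffix truncated */
#define INDEX_VAR_MASK			0x4000	/* has variable-width attributes */
#define INDEX_NULL_MASK			0x8000	/* has nulls */

With the flag set, the true attribute count would also be stored explicitly in the tuple, so external tools could interpret internal pages without consulting catalog metadata.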
Re: [HACKERS] Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)
Thanks. I planned to look into this today, but you've been faster ;-) regards Tomas On 04/04/2017 06:55 PM, Ashutosh Sharma wrote: Hi, As I am not seeing any response from Tomas for the last 2-3 days and since the commit-fest is coming towards its end, I have planned to work on the review comments that I had given a few days back and submit the updated patch. PFA the new version of the patch, which takes care of all the review comments given by me. I have also run pgindent on the btreefuncs.c file and have run some basic test cases. All looks fine to me now! Please note that this patch still belongs to Tomas, not me; I still remain the reviewer of this patch. The same thing has been very clearly mentioned in the attached patch. The only intention behind Ashutosh (the reviewer) working on this patch is to ensure that we do not miss things that can easily get committed in this commit-fest. Thanks. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics (v25)
On 04/04/2017 09:55 AM, David Rowley wrote: On 1 April 2017 at 04:25, David Rowley wrote: I've attached an updated patch. I've made another pass at this and ended up removing the tryextstats variable. We now only try to use extended statistics when clauselist_selectivity() is given a valid RelOptInfo with rtekind == RTE_RELATION, and of course, it must also have some extended stats defined too. I've also cleaned up a few more comments, many of which I managed to omit updating when I refactored how the selectivity estimation ties into clauselist_selectivity(). I'm quite happy with all of this now, and would also be happy for other people to take a look and comment. As a reviewer, I'd be marking this ready for committer, but I've moved a little way from just reviewing this now, having spent two weeks hacking at it. The latest patch is attached. Thanks David, I agree the reworked patch is much cleaner than the last version I posted. Thanks for spending your time on it. Two minor comments: 1) DEPENDENCY_MIN_GROUP_SIZE I'm not sure we still need the min_group_size when evaluating dependencies. It was meant to deal with 'noisy' data, but I think that after switching to the 'degree' it might actually be a bad idea. Consider this: create table t (a int, b int); insert into t select 1, 1 from generate_series(1, 1) s(i); insert into t select i, i from generate_series(2, 2) s(i); create statistics s with (dependencies) on (a,b) from t; analyze t; select stadependencies from pg_statistic_ext ; stadependencies [{1 => 2 : 0.44}, {2 => 1 : 0.44}] (1 row) So the degree of the dependency is just ~0.333 although it's obviously a perfect dependency, i.e. knowledge of 'a' determines 'b'. The reason is that we discard 2/3 of the rows, because those groups are only a single row each, except for the one large group (1/3 of the rows). Without the minimum group size limitation, the dependencies are: test=# select stadependencies from pg_statistic_ext ; stadependencies [{1 => 2 : 1.00}, {2 => 1 : 1.00}] (1 row) which seems way more reasonable, I think. 2) A minor detail is that instead of this if (estimatedclauses != NULL && bms_is_member(listidx, estimatedclauses)) continue; perhaps we should do just this: if (bms_is_member(listidx, estimatedclauses)) continue; bms_is_member does the same NULL check right at the beginning, so I don't think this will make a measurable difference. kind regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
So this is what I came up with on plane. Generalized the loading functionality into load_library_function which that can load either known postgres functions or call load_external_function. I am not quite sure if fmgr.c is best place to put it, but I didn't want to include stuff from executor in bgworker.c. I was thinking about putting it to dfmgr.c (as that's where load_external_function) but again seemed weird to including executor there. As with previous patch, 9.6 will need quite different patch as we need to keep compatibility there, but since I am traveling I think it's better to share what I have so far. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel index db9ac3d..b360887 100644 --- a/src/backend/access/transam/README.parallel +++ b/src/backend/access/transam/README.parallel @@ -198,7 +198,7 @@ pattern looks like this: EnterParallelMode(); /* prohibit unsafe state changes */ - pcxt = CreateParallelContext(entrypoint, nworkers); + pcxt = CreateParallelContext("library_name", "function_name", nworkers); /* Allow space for application-specific data here. */ shm_toc_estimate_chunk(&pcxt->estimator, size); diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c index b3d3853..326d4f9 100644 --- a/src/backend/access/transam/parallel.c +++ b/src/backend/access/transam/parallel.c @@ -61,7 +61,7 @@ #define PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0x0006) #define PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0x0007) #define PARALLEL_KEY_TRANSACTION_STATE UINT64CONST(0x0008) -#define PARALLEL_KEY_EXTENSION_TRAMPOLINE UINT64CONST(0x0009) +#define PARALLEL_KEY_ENTRYPOINTUINT64CONST(0x0009) /* Fixed-size parallel state. */ typedef struct FixedParallelState @@ -77,9 +77,6 @@ typedef struct FixedParallelState pid_t parallel_master_pid; BackendId parallel_master_backend_id; - /* Entrypoint for parallel workers. */ - parallel_worker_main_type entrypoint; - /* Mutex protects remaining fields. */ slock_t mutex; @@ -109,7 +106,6 @@ static dlist_head pcxt_list = DLIST_STATIC_INIT(pcxt_list); /* Private functions. */ static void HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg); -static void ParallelExtensionTrampoline(dsm_segment *seg, shm_toc *toc); static void WaitForParallelWorkersToExit(ParallelContext *pcxt); @@ -119,7 +115,8 @@ static void WaitForParallelWorkersToExit(ParallelContext *pcxt); * destroyed before exiting the current subtransaction. */ ParallelContext * -CreateParallelContext(parallel_worker_main_type entrypoint, int nworkers) +CreateParallelContext(const char *library_name, const char *function_name, + int nworkers) { MemoryContext oldcontext; ParallelContext *pcxt; @@ -152,7 +149,8 @@ CreateParallelContext(parallel_worker_main_type entrypoint, int nworkers) pcxt = palloc0(sizeof(ParallelContext)); pcxt->subid = GetCurrentSubTransactionId(); pcxt->nworkers = nworkers; - pcxt->entrypoint = entrypoint; + pcxt->library_name = pstrdup(library_name); + pcxt->function_name = pstrdup(function_name); pcxt->error_context_stack = error_context_stack; shm_toc_initialize_estimator(&pcxt->estimator); dlist_push_head(&pcxt_list, &pcxt->node); @@ -164,33 +162,6 @@ CreateParallelContext(parallel_worker_main_type entrypoint, int nworkers) } /* - * Establish a new parallel context that calls a function provided by an - * extension. 
This works around the fact that the library might get mapped - * at a different address in each backend. - */ -ParallelContext * -CreateParallelContextForExternalFunction(char *library_name, - char *function_name, - int nworkers) -{ - MemoryContext oldcontext; - ParallelContext *pcxt; - - /* We might be running in a very short-lived memory context. */ - oldcontext = MemoryContextSwitchTo(TopTransactionContext); - - /* Create the context. */ - pcxt = CreateParallelContext(ParallelExtensionTrampoline, nworkers); - pcxt->library_name = pstrdup(library_name); - pcxt->function_name = pstrdup(function_name); - - /* Restore previous memory context. */ - MemoryContextSwitchTo(oldcontext); - - return pcxt; -} - -/* * Establish the dynamic shared memory segment for a parallel context and * copy state and other bookkeeping information that will be needed by * parallel workers into it. @@ -249,15 +220,10 @@ InitializeParallelDSM(ParallelContext *pcxt) pcxt->nworkers)); shm_toc_estimate_keys(&pcxt->estimator, 1); - /* Estimate how much we'll need for extension entrypoint info. */ - if (pcxt->library_name != NULL) - { - Assert(pcxt->entrypoint == ParallelExtensionTrampoline); - Assert(pcxt->function_name != NULL); - shm_toc_estimate_chunk(&pcxt->estimator, strlen(pcxt->library_name) -
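[The diff is cut off above; the shape of the new helper, as described, is roughly the following. lookup_internal_function is a stand-in name for however the table of known core entry points is consulted:]

typedef void (*bgworker_main_type) (Datum main_arg);

static bgworker_main_type
load_library_function(const char *libname, const char *funcname)
{
	/* known in-core entry points can be resolved without dlopen() */
	if (strcmp(libname, "postgres") == 0)
		return lookup_internal_function(funcname);	/* stand-in */

	/*
	 * Otherwise resolve from a shared library, which may be mapped at a
	 * different address in each backend, so only the names can be passed
	 * through shared memory.
	 */
	return (bgworker_main_type)
		load_external_function(libname, funcname, true, NULL);
}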
Re: [HACKERS] Compiler warning in costsize.c
Michael Paquier writes: > In builds where USE_ASSERT_CHECKING is not enabled, costsize.c can > generate warnings. Here is for example with MSVC: > src/backend/optimizer/path/costsize.c(4520): warning C4101: 'rte' : > unreferenced local variable [C:\Users\ioltas\git\postgres\postgres.vcxproj] > src/backend/optimizer/path/costsize.c(4640): warning C4101: 'rte' : > unreferenced local variable [C:\Users\ioltas\git\postgres\postgres.vcxproj] > a9c074ba7 has done an effort, but a bit more is needed as the attached. That doesn't look right at all: +#ifdef USE_ASSERT_CHECKING RangeTblEntry *rte PG_USED_FOR_ASSERTS_ONLY; +#endif The entire point of the PG_USED_FOR_ASSERTS_ONLY annotation is to suppress this type of warning without a need for an explicit #ifdef like that one. We should either fix PG_USED_FOR_ASSERTS_ONLY to work on MSVC as well, or decide that we don't care about suppressing such warnings on MSVC, or give up on PG_USED_FOR_ASSERTS_ONLY and remove it everywhere in favor of plain #ifdefs. (I'm personally not that much in love with PG_USED_FOR_ASSERTS_ONLY, because it tends to confuse pgindent.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
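[For context, the annotation amounts to roughly this on GCC-compatible compilers (see c.h for the real definition), which is exactly why MSVC still warns: there it expands to nothing:]

#ifdef __GNUC__
#define PG_USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define PG_USED_FOR_ASSERTS_ONLY		/* no MSVC equivalent */
#endif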
Re: [HACKERS] strange parallel query behavior after OOM crashes
On 04/04/2017 06:52 PM, Robert Haas wrote: On Mon, Apr 3, 2017 at 6:08 AM, Kuntal Ghosh wrote: On Fri, Mar 31, 2017 at 6:50 PM, Robert Haas wrote: On Thu, Mar 30, 2017 at 4:35 PM, Kuntal Ghosh wrote: 2. the server restarts automatically and initializes BackgroundWorkerData->parallel_register_count and BackgroundWorkerData->parallel_terminate_count in shared memory. After that, it calls ForgetBackgroundWorker, which increments parallel_terminate_count. Hmm. So this seems like the root of the problem. Presumably those things need to be reset AFTER forgetting any background workers from before the crash. IMHO, the fix would be not to increase the terminated parallel worker count whenever ForgetBackgroundWorker is called due to a bgworker crash. I've attached a patch for the same. PFA. While I'm not opposed to that approach, I don't think this is a good way to implement it. If you want to pass an explicit flag to ForgetBackgroundWorker telling it whether or not it should perform the increment, fine. But with what you've got here, you're essentially relying on "spooky action at a distance". It would be easy for future code changes to break this, not realizing that somebody's got a hard dependency on 0 having a specific meaning. I'm probably missing something, but I don't quite understand how these values actually survive the crash. I mean, what I observed is OOM followed by a restart, so shouldn't BackgroundWorkerShmemInit() simply reset the values back to 0? Or do we call ForgetBackgroundWorker() after the crash for some reason? In any case, the comment right before BackgroundWorkerArray says this: * These counters can of course overflow, but it's not important here * since the subtraction will still give the right number. which means that this assert + Assert(BackgroundWorkerData->parallel_register_count >= + BackgroundWorkerData->parallel_terminate_count); is outright broken, just like any other attempt to rely on simple comparisons of these two counters, no? BTW, if this isn't on the open items list, it should be. It's presumably the fault of b460f5d6693103076dc554aa7cbb96e1e53074f9. Unrelated, but perhaps we should also add this to open items: https://www.postgresql.org/message-id/flat/57bbc52c-5d40-bb80-2992-7e1027d0b67c%402ndquadrant.com (although that's more 9.6 material). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
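[A small illustration of why the quoted comment and the new Assert cannot both be right, using values where the register counter has wrapped:]

uint32		reg = 2;				/* logically UINT32_MAX + 3 */
uint32		term = UINT32_MAX - 2;	/* has not wrapped yet */
uint32		active = reg - term;	/* == 5: subtraction is still correct */
bool		sane = (reg >= term);	/* false: the Assert would fire */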
Re: [HACKERS] logical decoding of two-phase transactions
On 2017-04-04 13:06:13 +0300, Stas Kelvich wrote: > That is just an argument against Andres' concern that a prepared transaction > is able to deadlock with the decoding process — at least there are no such cases in > the regression tests. There are few longer / adverse xacts in there; that doesn't say much. > And that concern is the main thing blocking this patch. Except for explicit catalog > locks in a prepared tx, nobody has yet found such cases, and it is hard to address > or argue about. I doubt that's the case. But even if it were so, it's absolutely not acceptable that a plain user can cause such deadlocks. So I don't think this argument buys you anything. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] identity columns
Peter Eisentraut writes: > On 4/3/17 14:19, Andres Freund wrote: > + *op->resvalue = > Int64GetDatum(nextval_internal(op->d.nextvalueexpr.seqid, false)); >> Is it guaranteed that the caller expects an int64? I saw that >> nextvalueexpr's have a typeid field. > It expects one of the integer types. We could cast the result of > Int64GetDatum() to the appropriate type, but that wouldn't actually do > anything. Uh, really? On 32-bit platforms, int64 and int32 datums have entirely different representations. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
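[A sketch of the distinction Tom is pointing at, assuming a 32-bit build where Datum is 4 bytes and int64 is therefore pass-by-reference:]

/*
 * On such a platform Int64GetDatum() palloc's space for the value and
 * returns its address, while Int32GetDatum() returns the (widened)
 * value itself.  Reading one representation as the other yields
 * garbage: a pointer, not 42.
 */
Datum		d64 = Int64GetDatum((int64) 42);	/* pointer to palloc'd int64 */
Datum		d32 = Int32GetDatum((int32) 42);	/* the value 42 itself */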
Re: [HACKERS] logical decoding of two-phase transactions
On Tue, Apr 4, 2017 at 7:06 PM, Stas Kelvich wrote: > >> On 4 Apr 2017, at 04:23, Masahiko Sawada wrote: >> >> >> I reviewed this patch, but when I tried to build contrib/test_decoding >> I got the following error. >> > > Thanks! > > Yes, it seems that 18ce3a4a changed the ProcessUtility_hook signature. > Updated. > >> There is still some unnecessary code in the v5 patch. > Thank you for updating the patch! > Actually, the second diff isn't intended to be part of the patch; I've just shared > the way I ran the regression test suite through the 2PC decoding, changing > all commits to prepare/commit, where the commit happens only after decoding > of the prepare is finished (more details in my previous message in this thread). Understood. Sorry for the noise. > > That is just an argument against Andres' concern that a prepared transaction > is able to deadlock with the decoding process — at least there are no such cases in > the regression tests. > > And that concern is the main thing blocking this patch. Except for explicit catalog > locks in a prepared tx, nobody has yet found such cases, and it is hard to address > or argue about. > Hmm, I also have not found such a deadlock case yet. Other than that issue, the current patch still could not pass the 'make check' test of contrib/test_decoding: *** 154,167 (4 rows) :get_with2pc ! data ! - ! BEGIN ! table public.test_prepared1: INSERT: id[integer]:5 ! table public.test_prepared1: INSERT: id[integer]:6 data[text]:'frakbar' ! PREPARE TRANSACTION 'test_prepared#3'; ! COMMIT PREPARED 'test_prepared#3'; ! (5 rows) -- make sure stuff still works INSERT INTO test_prepared1 VALUES (8); --- 154,162 (4 rows) :get_with2pc ! data ! -- ! (0 rows) -- make sure stuff still works INSERT INTO test_prepared1 VALUES (8); I guess this part is an unexpected result and should be fixed, right? - *** 215,222 -- If we try to decode it now we'll deadlock SET statement_timeout = '10s'; :get_with2pc_nofilter ! -- FIXME we expect a timeout here, but it actually works... ! ERROR: statement timed out RESET statement_timeout; -- we can decode past it by skipping xacts with catalog changes --- 210,222 -- If we try to decode it now we'll deadlock SET statement_timeout = '10s'; :get_with2pc_nofilter ! data ! ! BEGIN ! table public.test_prepared1: INSERT: id[integer]:10 data[text]:'othercol' ! table public.test_prepared1: INSERT: id[integer]:11 data[text]:'othercol2' ! PREPARE TRANSACTION 'test_prepared_lock' ! (4 rows) RESET statement_timeout; -- we can decode past it by skipping xacts with catalog changes Probably we can ignore this part for now. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Instead of DROP function use UPDATE pg_proc in an upgrade extension script
2017-04-04 15:40 GMT+02:00 Vicky Vergara : > Thanks, > > > > It is not safe due to views - they are saved in post-analyze form. > > > What is post-analyze form? Any link that you can give me to read about it? > The query pipeline is: parsing, analysis, optimization, execution. When you change an API, the analysis stage would have to be processed again - but views are stored as post-analysis serialized data. You cannot redo this process without the source code. Regards Pavel > > Thanks > > -- > *From:* Pavel Stehule > *Sent:* Monday, April 3, 2017 11:21 p.m. > *To:* Vicky Vergara > *Cc:* pgsql-hackers@postgresql.org > *Subject:* Re: [HACKERS] Instead of DROP function use UPDATE pg_proc in an > upgrade extension script > > > > 2017-04-04 6:17 GMT+02:00 Vicky Vergara : > >> >> Hello, >> >> >> When creating an extension upgrade sql script, because the function does >> not have the same parameter names and/or parameter types, and/or the result >> type changes, there is a need to drop the function, because otherwise the >> CREATE OR REPLACE of the new signature will fail. >> >> >> So for example, having the following function: >> >> >> SELECT proallargtypes, proargmodes, proargnames FROM pg_proc WHERE >> proallargtypes = '{25,20,20,16,23,23,20,20}' AND proname = >> 'pgr_edgedisjointpaths'; >> -[ RECORD 1 ]--+- >> proallargtypes | {25,20,20,16,23,23,20,20} >> proargmodes | {i,i,i,i,o,o,o,o} >> proargnames | {"","","",directed,seq,path_seq,node,edge} >> >> >> When adding extra OUT parameters, because the result types (& names) >> change, the function needs a DROP: >> >> -- Row type defined by OUT parameters is different >> >> ALTER EXTENSION pgrouting DROP FUNCTION pgr_edgedisjointpaths(text,bigint,bigint,boolean); >> >> DROP FUNCTION IF EXISTS pgr_edgedisjointpaths(text,bigint,bigint,boolean); >> >> >> But doing that, objects that depend on the function, like a view, get >> dropped when using CASCADE in the ALTER EXTENSION, and functions that use >> pgr_edgedisjointpaths internally don't get dropped. >> >> >> So, I must say that I experimented: instead of doing the drop, I made: >> >> >> UPDATE pg_proc SET >> proallargtypes = '{25,20,20,16,23,23,23,20,20,701,701}', >> proargmodes = '{i,i,i,i,o,o,o,o,o,o,o}', >> proargnames = '{"","","","directed","seq","path_id","path_seq","node","edge","cost","agg_cost"}' >> WHERE proallargtypes = '{25,20,20,16,23,23,20,20}' AND proname = >> 'pgr_edgedisjointpaths'; >> >> >> And CASCADE was not needed, and the view remained intact. >> >> >> So, I want to know how "safe" you can consider the second method, and >> what kind of other objects I need to test besides views. >> > It is not safe due to views - they are saved in post-analyze form. > > Regards > > Pavel >> My plan is to use the second method: >> >> - when the current names of the OUT parameters don't change, and there is >> an additional OUT parameter >> >> - when the current names of the IN parameters don't change, and there is >> an additional IN parameter with a default value >> >> >> Thanks >> >> >> Vicky Vergara
Re: [HACKERS] pg_partman 3.0.0 - real-world usage of native partitioning and a case for native default
On Mon, Apr 3, 2017 at 11:33 PM, Ashutosh Bapat
<ashutosh.ba...@enterprisedb.com> wrote:

>>>> Thankfully since native partitioning still uses inheritance internally
>>>> for the most part, pg_partman works pretty well without nearly as much
>>>> change as I thought I would need. The biggest deficiency I'm seeing has
>>>> to do with not having a "default" partition to put data that doesn't
>>>> match any children. The fact that it throws an error is a concern, but
>>>> it's not where I see the main problem. Where this really comes into play
>>>> is when someone wants to make an existing table into a partitioned
>>>> table. There's really no easy way to do this outside of making a
>>>> completely brand-new partition set and copying/moving the data from the
>>>> old to the new.
>>>
>>> If there are multiple partitions, there is likely to be more data that
>>> needs to be moved than is retained in the old table. So, creating a
>>> completely new partition set and copying/moving data is annoying, but
>>> not as much as it sounds. Obviously, if we could avoid it, we should
>>> try to.
>>
>> Not sure I follow what you're saying here. With pg_partman, making the
>> old table the parent while it still contains all the data has caused no
>> issues when I've migrated clients to it, nor has anyone reported any
>> issues to me with this setup. New data goes to the child tables as it
>> should, and old data is then moved when convenient. It makes things work
>> very smoothly, and the only outage encountered is a brief lock at
>> creation time.
>
> In partitioning, the partitioned table doesn't have any storage. Data that
> belongs to a given partition is expected to be there from day one.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company

I understand that with native partitioning the parent has no data and never
will. I'm just using pg_partman's method as an example. The DEFAULT would
take the place of the parent table in this situation. Yes, in an ideal
world, all the data would be in the children right from the beginning when
you declare the parent. But that hardly seems realistic, especially when you
have to partition an existing billion-plus-row table and keep it running at
the same time.

Yes, the basic purpose of the default is to catch data that gets inserted
outside the bounds covered by the existing children. That's what I also do
with the parent in pg_partman. But it can also serve as a way to ease
migration, as the parent does in pg_partman's trigger-based method.

Keith
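For reference, the pg_partman-style migration being described works roughly like this (a condensed sketch with invented table and column names; the tool itself generates the children and the routing trigger):

    -- the existing, data-filled table simply becomes the inheritance parent
    CREATE TABLE events_p2017_04 (
        CHECK (created_at >= '2017-04-01' AND created_at < '2017-05-01')
    ) INHERITS (events);
    -- a trigger on "events" routes new inserts to the children; pre-existing
    -- rows stay in the parent and are moved batch by batch when convenient:
    WITH moved AS (
        DELETE FROM ONLY events
        WHERE created_at >= '2017-04-01' AND created_at < '2017-05-01'
        RETURNING *
    )
    INSERT INTO events_p2017_04 SELECT * FROM moved;

A native DEFAULT partition would play the same role in the declarative world: a place for the bulk of the legacy rows to live while they are drained into proper partitions.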
Re: [HACKERS] partitioned tables and contrib/sepgsql
On 04/04/2017 09:55 AM, Mike Palmiotto wrote:
> On Tue, Apr 4, 2017 at 10:19 AM, Joe Conway wrote:
>> On 04/04/2017 06:45 AM, Robert Haas wrote:
>>> On Mon, Apr 3, 2017 at 12:02 PM, Joe Conway wrote:
>>>>> 0002 looks extremely straightforward, but I wonder if we could get
>>>>> one of the people on this list who knows about sepgsql to have a look?
>>>>> (Stephen Frost, Joe Conway, KaiGai Kohei...)
>>>>
>>>> Will have a look later today.
>>>
>>> I think it is now tomorrow, unless your time zone is someplace very
>>> far away. :-)
>>
>> OBE -- I have scheduled time in 30 minutes from now, after I have gotten
>> my first cup of coffee ;-)
>
> After some discussion off-list, I've rebased and updated the patches.
> Please see attached for further review.

Thanks -- will have another look and test on a machine with selinux set up.

Robert, did you want me to take responsibility to commit this, or just
provide review/feedback?

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

signature.asc
Description: OpenPGP digital signature
Re: [HACKERS] partitioned tables and contrib/sepgsql
On Tue, Apr 4, 2017 at 10:19 AM, Joe Conway wrote:
> On 04/04/2017 06:45 AM, Robert Haas wrote:
>> On Mon, Apr 3, 2017 at 12:02 PM, Joe Conway wrote:
>>>> 0002 looks extremely straightforward, but I wonder if we could get one
>>>> of the people on this list who knows about sepgsql to have a look?
>>>> (Stephen Frost, Joe Conway, KaiGai Kohei...)
>>>
>>> Will have a look later today.
>>
>> I think it is now tomorrow, unless your time zone is someplace very
>> far away. :-)
>
> OBE -- I have scheduled time in 30 minutes from now, after I have gotten
> my first cup of coffee ;-)

After some discussion off-list, I've rebased and updated the patches.
Please see attached for further review.

Thanks,

--
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com

From be692f8cc94d74033494d140c310e932c705e785 Mon Sep 17 00:00:00 2001
From: Mike Palmiotto
Date: Wed, 29 Mar 2017 14:59:37 +0000
Subject: [PATCH 2/2] Add partitioned table support to sepgsql

Account for RELKIND_PARTITIONED_TABLE in sepgsql and treat the objects
like regular relations. This allows for proper object_access hook
behavior for partitioned tables.
---
 contrib/sepgsql/label.c    |  3 ++-
 contrib/sepgsql/relation.c | 33 +++++++++++++++++++++++----------
 2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c
index 8a72503..c66581a 100644
--- a/contrib/sepgsql/label.c
+++ b/contrib/sepgsql/label.c
@@ -787,7 +787,8 @@ exec_object_restorecon(struct selabel_handle * sehnd, Oid catalogId)
 			case RelationRelationId:
 				relForm = (Form_pg_class) GETSTRUCT(tuple);
-				if (relForm->relkind == RELKIND_RELATION)
+				if (relForm->relkind == RELKIND_RELATION ||
+					relForm->relkind == RELKIND_PARTITIONED_TABLE)
 					objtype = SELABEL_DB_TABLE;
 				else if (relForm->relkind == RELKIND_SEQUENCE)
 					objtype = SELABEL_DB_SEQUENCE;
diff --git a/contrib/sepgsql/relation.c b/contrib/sepgsql/relation.c
index 4dc48a0..4fcebc1 100644
--- a/contrib/sepgsql/relation.c
+++ b/contrib/sepgsql/relation.c
@@ -54,12 +54,14 @@ sepgsql_attribute_post_create(Oid relOid, AttrNumber attnum)
 	ObjectAddress object;
 	Form_pg_attribute attForm;
 	StringInfoData audit_name;
+	char		relkind;

 	/*
 	 * Only attributes within regular relation have individual security
 	 * labels.
 	 */
-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	relkind = get_rel_relkind(relOid);
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		return;

 	/*
@@ -135,8 +137,10 @@ sepgsql_attribute_drop(Oid relOid, AttrNumber attnum)
 {
 	ObjectAddress object;
 	char	   *audit_name;
+	char		relkind;

-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	relkind = get_rel_relkind(relOid);
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		return;

 	/*
@@ -167,8 +171,11 @@ sepgsql_attribute_relabel(Oid relOid, AttrNumber attnum,
 {
 	ObjectAddress object;
 	char	   *audit_name;
+	char		relkind;

-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	relkind = get_rel_relkind(relOid);
+
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot set security label on non-regular columns")));
@@ -209,8 +216,11 @@ sepgsql_attribute_setattr(Oid relOid, AttrNumber attnum)
 {
 	ObjectAddress object;
 	char	   *audit_name;
+	char		relkind;
+
+	relkind = get_rel_relkind(relOid);

-	if (get_rel_relkind(relOid) != RELKIND_RELATION)
+	if (relkind != RELKIND_RELATION && relkind != RELKIND_PARTITIONED_TABLE)
 		return;

 	/*
@@ -290,6 +300,7 @@ sepgsql_relation_post_create(Oid relOid)

 	switch (classForm->relkind)
 	{
+		case RELKIND_PARTITIONED_TABLE:
 		case RELKIND_RELATION:
 			tclass = SEPG_CLASS_DB_TABLE;
 			break;
@@ -336,7 +347,8 @@ sepgsql_relation_post_create(Oid relOid)
 									  true);

 	/*
-	 * Assign the default security label on the new relation
+	 * Assign the default security label on the new relation or partitioned
+	 * table.
 	 */
 	object.classId = RelationRelationId;
 	object.objectId = relOid;
@@ -344,10 +356,10 @@ sepgsql_relation_post_create(Oid relOid)
 	SetSecurityLabel(&object, SEPGSQL_LABEL_TAG, rcontext);

 	/*
-	 * We also assigns a default security label on columns of the new regular
-	 * tables.
+	 * We also assign a default security label on columns of a new table.
 	 */
-	if (classForm->relkind == RELKIND_RELATION)
+	if (classForm->relkind == RELKIND_RELATION ||
+		classForm->relkind == RELKIND_PARTITIONED_TABLE)
 	{
 		Relation	arel;
 		ScanKeyData akey;
@@ -422,6 +434,7 @@ sepgsql_relation_drop(Oid relOid)
 	relkind = get_rel_relkind(relOid);
 	switch (relkind)
 	{
+		case RELKIND_PARTITIONED_TABLE:
 		case RELKIND_RELATION:
 			tclass = SEPG_CLASS_DB_TABLE;
 			break;
@@ -485,7 +498,7 @@ sepgsql_relation_drop(Oid relOid)
 	/*
 	 * check db_column:{drop} permission
 	 */
-	if (relkind == RELKIND_RELATION)
+	if (relkind == RE
Re: [HACKERS] Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)
Hi,

As I have not seen any response from Tomas for the last 2-3 days, and since
the commitfest is coming to an end, I planned to work on the review comments
that I had given a few days back and submit the updated patch. PFA the new
version of the patch, which takes care of all the review comments given by
me. I have also run pgindent on the btreefuncs.c file and run some basic
test cases. All looks fine to me now!

Please note that this patch still belongs to Tomas, not me; I remain the
reviewer of this patch. The same thing is clearly stated in the attached
patch. The only intention behind Ashutosh (the reviewer) working on this
patch is to ensure that we do not miss something that could easily get
committed in this commitfest. Thanks.

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

On Tue, Apr 4, 2017 at 7:23 PM, David Steele wrote:
> On 4/4/17 9:43 AM, Robert Haas wrote:
>> On Tue, Apr 4, 2017 at 9:32 AM, David Steele wrote:
>>> My goal is to help people focus on patches that have a chance. At this
>>> point I think that includes poking authors who are not being responsive
>>> using the limited means at my disposal.
>>
>> +1. Pings on specific threads can help clear things up when, for
>> example, the author and reviewer are each waiting for the other. And,
>> as you say, they also help avoid the situation where a patch just
>> falls off the radar and misses the release for no especially good
>> reason, which naturally causes people frustration.
>>
>> I think your pings have been quite helpful, although I think it would
>> have been better in some cases if you'd done them sooner. Pinging
>> after a week, with a 3-day deadline, when there are only a few days
>> left in the CommitFest isn't really leaving a lot of room to maneuver.
>
> Thanks for the feedback! My thinking is that I don't want to bug people
> too soon, but there's a maximum number of days a thread should be idle.
> Over the course of the CF that has gone from 10 days, to 7, 5, and 3.
>
> I don't look at all patches every day, so it can be a bit uneven, i.e.,
> all patches are allowed a certain amount of idle time, but some may get
> a bit more depending on when I check up on them.
>
> Thanks,
> --
> -David
> da...@pgmasters.net

From 6d9814182eb03521f093d687eb8024c9df1c11d5 Mon Sep 17 00:00:00 2001
From: ashu
Date: Tue, 4 Apr 2017 21:38:42 +0530
Subject: [PATCH] pageinspect: Add bt_page_items function with bytea v6

Author: Tomas Vondra
Reviewed-by: Ashutosh Sharma
---
 contrib/pageinspect/btreefuncs.c              | 198 +++++++++++++--------
 contrib/pageinspect/expected/btree.out        |  13 ++
 contrib/pageinspect/pageinspect--1.5--1.6.sql |  14 ++
 contrib/pageinspect/sql/btree.sql             |   4 +
 doc/src/sgml/pageinspect.sgml                 |  31 ++++
 src/include/access/nbtree.h                   |   1 +
 6 files changed, 210 insertions(+), 51 deletions(-)

diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 3d69aa9..02440ec 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -41,6 +41,7 @@

 PG_FUNCTION_INFO_V1(bt_metap);
 PG_FUNCTION_INFO_V1(bt_page_items);
+PG_FUNCTION_INFO_V1(bt_page_items_bytea);
 PG_FUNCTION_INFO_V1(bt_page_stats);

 #define IS_INDEX(r) ((r)->rd_rel->relkind == RELKIND_INDEX)
@@ -235,14 +236,6 @@ bt_page_stats(PG_FUNCTION_ARGS)
 	PG_RETURN_DATUM(result);
 }

-/*---
- * bt_page_items()
- *
- * Get IndexTupleData set in a btree page
- *
- * Usage: SELECT * FROM bt_page_items('t1_pkey', 1);
- *---
- */

 /*
  * cross-call data structure for SRF
@@ -253,14 +246,72 @@
 struct user_args
 {
 	OffsetNumber offset;
 };

+/*---
+ * bt_page_print_tuples()
+ *
+ * Form a tuple describing index tuple at a given offset
+ * --
+ */
+static Datum
+bt_page_print_tuples(FuncCallContext *fctx, Page page, OffsetNumber offset)
+{
+	char	   *values[6];
+	HeapTuple	tuple;
+	ItemId		id;
+	IndexTuple	itup;
+	int			j;
+	int			off;
+	int			dlen;
+	char	   *dump;
+	char	   *ptr;
+
+	id = PageGetItemId(page, offset);
+
+	if (!ItemIdIsValid(id))
+		elog(ERROR, "invalid ItemId");
+
+	itup = (IndexTuple) PageGetItem(page, id);
+
+	j = 0;
+	values[j++] = psprintf("%d", offset);
+	values[j++] = psprintf("(%u,%u)",
+						   ItemPointerGetBlockNumberNoCheck(&itup->t_tid),
+						   ItemPointerGetOffsetNumberNoCheck(&itup->t_tid));
+	values[j++] = psprintf("%d", (int) IndexTupleSize(itup));
+	values[j++] = psprintf("%c", IndexTupleHasNulls(itup) ? 't' : 'f');
+	values[j++] = psprintf("%c", IndexTupleHasVarwidths(itup) ? 't' : 'f');
+
+	ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info);
+	dlen = IndexTupleSize(itup) - IndexInfoFindDataOffset(itup->t_info);
+	dump = palloc0(dlen * 3 + 1);
+	values[j] = dump;
+
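For context, the new bytea variant is meant to pair with get_raw_page(), per the SQL-level signature the 1.5--1.6 upgrade script adds:

    -- classic form reads the page from the index itself:
    SELECT * FROM bt_page_items('t1_pkey', 1);
    -- new form decodes a previously fetched raw page image, e.g. one saved
    -- from a standby or a damaged cluster:
    SELECT * FROM bt_page_items(get_raw_page('t1_pkey', 1));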
Re: [HACKERS] strange parallel query behavior after OOM crashes
On Mon, Apr 3, 2017 at 6:08 AM, Kuntal Ghosh wrote:
> On Fri, Mar 31, 2017 at 6:50 PM, Robert Haas wrote:
>> On Thu, Mar 30, 2017 at 4:35 PM, Kuntal Ghosh wrote:
>>> 2. the server restarts automatically and initializes
>>> BackgroundWorkerData->parallel_register_count and
>>> BackgroundWorkerData->parallel_terminate_count in shared memory.
>>> After that, it calls ForgetBackgroundWorker, which increments
>>> parallel_terminate_count.
>>
>> Hmm. So this seems like the root of the problem. Presumably those
>> things need to be reset AFTER forgetting any background workers from
>> before the crash.
>>
> IMHO, the fix would be not to increase the terminated parallel worker
> count whenever ForgetBackgroundWorker is called due to a bgworker
> crash. I've attached a patch for the same; PFA.

While I'm not opposed to that approach, I don't think this is a good way
to implement it. If you want to pass an explicit flag to
ForgetBackgroundWorker telling it whether or not it should perform the
increment, fine. But with what you've got here, you're essentially relying
on "spooky action at a distance". It would be easy for future code changes
to break this, not realizing that somebody's got a hard dependency on 0
having a specific meaning.

BTW, if this isn't on the open items list, it should be. It's presumably
the fault of b460f5d6693103076dc554aa7cbb96e1e53074f9.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
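For concreteness, the explicit-flag approach Robert describes might look roughly like this (a sketch only, not the committed fix; the real function takes just the slist_mutable_iter, and the extra parameter is an invention for illustration):

    /*
     * Sketch: the caller states whether this worker should count toward
     * parallel_terminate_count, instead of ForgetBackgroundWorker()
     * inferring it from shared state.
     */
    void
    ForgetBackgroundWorker(slist_mutable_iter *cur, bool count_terminated)
    {
        RegisteredBgWorker *rw;

        rw = slist_container(RegisteredBgWorker, rw_lnode, cur->cur);

        if (count_terminated &&
            (rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
            BackgroundWorkerData->parallel_terminate_count++;

        /* ... existing cleanup: free the slot, slist_delete_current(), etc. */
    }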
Re: [HACKERS] sequence data type
On 3/30/17 22:47, Vitaly Burovoy wrote:
> It seemed not very hard to fix it.
> Please find attached a patch to be applied on top of yours.
>
> I've added more tests to cover different cases of changing bounds when
> the data type is changed.

Committed all that. Thanks!

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE command progress checker
On Tue, Apr 4, 2017 at 4:57 AM, Amit Langote wrote:
> Hmm, you're right. It could be counted with a separate variable
> initialized to 0 and incremented every time we decide to add a row to the
> final set of sampled rows, although different implementations of
> AcquireSampleRowsFunc have different ways of deciding if a given row will
> be part of the final set of sampled rows.
>
> On the other hand, if we decide to count progress in terms of blocks as
> you suggested, I'm afraid that FDWs won't be able to report their
> progress.

I think it may be time to push this patch out to v11. It was submitted one
day before the start of the last CommitFest, the design wasn't really
right, and it's not clear even now that we know what the right design is.
And we're pretty much out of time.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Instead of DROP function use UPDATE pg_proc in an upgrade extension script
On Tue, Apr 4, 2017 at 9:07 AM, Vicky Vergara wrote:
> You answered so fast that I know I am stepping onto dangerous ground.
>
> But I would like to know more about your experience.
>
> Any links that you can give me to read about the code and/or issues
> regarding the ip4r experience?

I can't comment on that, but in general I don't think there's an issue if
(1) your UPDATE statement contains no bugs and (2) the DROP statement
would have succeeded without CASCADE. The problem is when there's stuff
depending on the existing function definition - such as views. Even then
it may work if the dependencies are such that the new definition is
compatible enough for the purposes of the dependent objects, but if not,
then you've got trouble.

To put this another way, if it were safe for CREATE OR REPLACE to succeed
here, we would have made it succeed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
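As a reminder of why CREATE OR REPLACE refuses in the first place, a minimal sketch:

    CREATE FUNCTION g(i int) RETURNS int AS $$ SELECT i $$ LANGUAGE sql;
    CREATE OR REPLACE FUNCTION g(i int) RETURNS bigint
        AS $$ SELECT i::bigint $$ LANGUAGE sql;
    -- ERROR:  cannot change return type of existing function

The error exists precisely because dependent objects may have the old result type baked in; the pg_proc UPDATE trick bypasses that safety check without making the underlying problem go away.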
Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional
On 4/4/17 11:42 AM, Stephen Frost wrote:
> * David Steele (da...@pgmasters.net) wrote:
>> On 3/22/17 4:42 PM, Peter Eisentraut wrote:
>>> On 3/22/17 15:14, Stephen Frost wrote:
>>>>> -SELECT * FROM pg_stop_backup(false);
>>>>> +SELECT * FROM pg_stop_backup(false [, true ]);
>>>>>
>>>>> I think that it's better to get rid of "[" and "]" from the above
>>>>> because IMO this should be the command example that users actually
>>>>> can run.
>>>>
>>>> Using the '[' and ']' are how all of the optional arguments are
>>>> specified in the documentation, see things like current_setting() in
>>>> our existing documentation:
>>>
>>> In the synopsis, but not in concrete examples.
>>
>> Wasn't quite sure what the protocol was in this case, and figured it
>> was easier to subtract than to add.
>
> Forgot to close this out sooner, apologies.
>
> This has been committed now.

Thank you, Stephen!

--
-David
da...@pgmasters.net

signature.asc
Description: OpenPGP digital signature
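For the archive, the concrete runnable form of that example, with the new optional second argument (wait_for_archive) spelled out rather than bracketed, is:

    -- finish a non-exclusive backup and wait for the WAL to be archived:
    SELECT * FROM pg_stop_backup(false, true);

Since the second argument defaults to true, plain pg_stop_backup(false) behaves the same way; passing false instead skips the archive wait.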
Re: [HACKERS] Proposal: Local indexes for partitioned table
On 01.03.2017 13:53, Maksim Milyutin wrote:
> Hi hackers!
>
> As I've understood from thread [1], the main issue with creating local
> indexes for partitions is supporting REINDEX and DROP INDEX operations on
> the parent partitioned tables. Furthermore, Robert Haas mentioned the
> problem of creating an index on a key that is represented in partitions
> by a single value (or a primitive interval) [1], i.e. under list
> partitioning or range partitioning with a unit interval.
>
> I would like to propose the following solution:
>
> 1. Create indexes for the hierarchy of partitioned tables and partitions
> recursively. Don't create a relfilenode for indexes on parents, only
> entries in the catalog (much like the partitioned table's storage
> elimination in [2]). The abstract index on a partitioned table exists
> only to reference the indexes of the child tables, so that REINDEX and
> DROP INDEX operations can be performed through it.
>
> 2. Record the created indexes in the pg_depend table so that indexes of
> child tables depend on the corresponding indexes of parent tables with
> dependency type DEPENDENCY_NORMAL, so that an index can be removed
> separately for partitions and recursively/separately for partitioned
> tables.
>
> 3. REINDEX on an index of a partitioned table would perform this
> operation on the existing indexes of the corresponding partitions. Here
> it is necessary to consider operations such as REINDEX SCHEMA | DATABASE
> | SYSTEM so that partitions' indexes aren't re-indexed multiple times in
> a row.
>
> Any thoughts?
>
> 1. https://www.postgresql.org/message-id/CA+TgmoZUwj=qynak+f7xef4w_e2g3xxdmnsnzmzjuinhrco...@mail.gmail.com
> 2. https://www.postgresql.org/message-id/2b0d42f2-3a53-763b-c9c2-47139e4b1c2e%40lab.ntt.co.jp

I want to present the first version of the patches that implement local
indexes for partitioned tables, and to discuss some technical details of
that implementation.

1. I have added a new relkind for local indexes, named RELKIND_LOCAL_INDEX
(literal 'l'). This was done because the physical storage is created in
the 'heap_create' function, and we need to skip creating storage there, as
is done for partitioned tables. Since the information that an index
belongs to a partitioned table is not available in the 'heap_create'
function (the pg_index entry for the index has not been created yet), I
chose the least painful way and added a specific relkind for indexes on
partitioned tables. I suppose this will require integrating the new
relkind into various places in the source code, so I'm ready to consider
other proposals on this point.

2. My implementation doesn't support concurrent creation of a local index
(CREATE INDEX CONCURRENTLY). As I understand it, this operation involves
nontrivial manipulation of snapshots, and I don't know how to implement
the concurrent creation of multiple indexes. On this point I ask the
community for help.

3. As I noted earlier, the pg_depend table is used for cascaded deletion
of indexes on a partitioned table and its children. I also use pg_depend
to determine the relationship between parent and child indexes when
REINDEX executes recursively on the child indexes. Perhaps pg_depend is
not a good way to determine the relationship between parent and child
indexes, because the kind of this relationship is not recorded there. I
could propose adding to the pg_index table a specific field of type
'oidvector' that specifies the OIDs of the dependent indexes of the
current local index.

At this stage I want to discuss only the technical details of the local
index implementation. The problems related to merging existing partition
indexes into the local index tree, determining uniqueness of a field in
the global sense through a local index, and syntax notes, I want to raise
later.

CC welcome!

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index cc5ac8b..bec3983 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -154,7 +154,8 @@ index_open(Oid relationId, LOCKMODE lockmode)

 	r = relation_open(relationId, lockmode);

-	if (r->rd_rel->relkind != RELKIND_INDEX)
+	if (r->rd_rel->relkind != RELKIND_INDEX &&
+		r->rd_rel->relkind != RELKIND_LOCAL_INDEX)
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("\"%s\" is not an index",
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index fc088b2..26a10e9 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1107,7 +1107,8 @@ doDeletion(const ObjectAddress *object, int flags)
 			{
 				char		relKind = get_rel_relkind(object->objectId);

-				if (relKind == RELKIND_INDEX)
+				if (relKind == RELKIND_INDEX ||
+					relKind == RELKIND_LOCAL_INDEX)
 				{
 					bool		concurrent = ((flags & PERFORM_DELETION_CONCURRENTLY) != 0);
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 36917c8..91ac740 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -293,6 +293,7 @@ heap_create(c
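To make the intended behavior concrete, here is what the proposal aims for at the SQL level (an illustrative sketch of the target behavior; the table and index names are inventions):

    CREATE TABLE measurement (logdate date, value int)
        PARTITION BY RANGE (logdate);
    CREATE TABLE measurement_2017 PARTITION OF measurement
        FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
    -- creates a catalog-only "abstract" index on the parent plus a real
    -- index on each partition, linked through pg_depend:
    CREATE INDEX measurement_logdate_idx ON measurement (logdate);
    -- recurses to the partitions' indexes via those links:
    REINDEX INDEX measurement_logdate_idx;
    -- drops the whole index tree through the normal dependency machinery:
    DROP INDEX measurement_logdate_idx;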
Re: [HACKERS] Making clausesel.c Smarter
On Tue, Apr 4, 2017 at 8:21 AM, David Rowley wrote:
> On 4 April 2017 at 13:35, Claudio Freire wrote:
>> On Mon, Apr 3, 2017 at 9:19 PM, Claudio Freire wrote:
>>> On Mon, Apr 3, 2017 at 8:52 PM, David Rowley wrote:
>>>>> One last observation:
>>>>>
>>>>> +/*
>>>>> + * An IS NOT NULL test is a no-op if there's any other strict quals,
>>>>> + * so if that's the case, then we'll only apply this, otherwise we'll
>>>>> + * ignore it.
>>>>> + */
>>>>> +else if (cslist->selmask == CACHESEL_NOTNULLTEST)
>>>>> +    s1 *= cslist->notnullsel;
>>>>>
>>>>> In some paths, the range selectivity estimation code punts and
>>>>> returns DEFAULT_RANGE_INEQ_SEL. In those cases, if a not-null test
>>>>> was present, it should still be applied. It could make a big
>>>>> difference if the selectivity for the null test is high.
>>>>
>>>> I'd say that the location that applies the DEFAULT_RANGE_INEQ_SEL
>>>> should apply the nullfrac. It seems wrong to rely on an IS NOT NULL
>>>> test existing to estimate that properly.
>>>>
>>>> I don't think that needs to be done as part of this patch. I'd rather
>>>> limit the scope, since "returned with feedback" is already knocking
>>>> at the door of this patch.
>>>
>>> I agree, but this patch regresses for those cases if the user
>>> compensated with a not-null test, leaving little recourse, so I'd fix
>>> it in this patch to avoid the regression.
>>>
>>> Maybe you're right that the null fraction should be applied regardless
>>> of whether there's an explicit null check, though.
>>
>> The more I read that code, the more I'm convinced there's no way to
>> get a DEFAULT_RANGE_INEQ_SEL if (rqlist->hibound != DEFAULT_INEQ_SEL &&
>> rqlist->lobound != DEFAULT_INEQ_SEL) other than from contradictory
>> clauses, a case the current code handles very poorly, returning
>> DEFAULT_RANGE_INEQ_SEL and ignoring nullfrac, something that could
>> give way-off estimates if the table had lots of nulls.
>>
>> Say, consider if "value" was mostly null and you write:
>>
>> SELECT * FROM testdata WHERE value BETWEEN 10 AND 20 AND value BETWEEN
>> 50 AND 60;
>>
>> All other cases I can think of would end with either hibound or
>> lobound being equal to DEFAULT_INEQ_SEL.
>>
>> It seems to me, doing a git blame, that the checks in the else branch
>> were left only as a precaution, one that seems unnecessary.
>>
>> If you ask me, I'd just leave:
>>
>> + if (rqlist->hibound == DEFAULT_INEQ_SEL ||
>> +     rqlist->lobound == DEFAULT_INEQ_SEL)
>> + {
>> +     /* No point in checking null selectivity, DEFAULT_INEQ_SEL
>> +      * implies we have no stats */
>> +     s2 = DEFAULT_RANGE_INEQ_SEL;
>> + }
>> + else
>> + {
>> ...
>> +     s2 = rqlist->hibound + rqlist->lobound - 1.0;
>> +     nullsel = nulltestsel(root, IS_NULL, rqlist->var, varRelid);
>> +     notnullsel = 1.0 - nullsel;
>> +
>> +     /* Adjust for double-exclusion of NULLs */
>> +     s2 += nullsel;
>> +     if (s2 <= 0.0)
>> +     {
>> +         if (s2 <= (-1.0e-4 * notnullsel - 1.0e-6))
>> +         {
>> +             /* Most likely contradictory clauses found */
>> +             s2 = (1.0e-10 < notnullsel) ? 1.0e-10 : notnullsel;
>> +         }
>> +         else
>> +         {
>> +             /* Could be a rounding error */
>> +             s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
>> +         }
>> +     }
>> + }
>>
>> Where (-1.0e-4 * notnullsel - 1.0e-6) is just a very rough (and overly
>> cautious) estimation of the amount of rounding error that could be
>> there with 32-bit floats.
>>
>> The above assumes a non-DEFAULT_INEQ_SEL value in lobound/hibound is
>> an estimation based on stats, maybe approximate, but one that makes
>> sense (i.e. preserves the monotonicity of the CPF), and as such
>> negative values are either a sign of a contradiction or of rounding
>> error.
>>
>> The previous code left the possibility of negatives coming out of
>> default selectivities creeping up on non-DEFAULT_INEQ_SEL estimates,
>> but that doesn't seem possible coming out of scalarineqsel.
>>
>> But I'd like it if Tom could comment on that, since he's the one that
>> did that in commit 547bb4a7f2bdccad9253a99211ce84b3f9de485a (way back
>> in 2004).
>>
>> Barring that, I'd just change:
>>
>>     s2 = DEFAULT_RANGE_INEQ_SEL;
>>
>> to:
>>
>>     s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
>>
>> which makes a lot more sense.
>
> I think to fix this properly the selfuncs would have to return the
> estimate and the nullfrac separately, so the nullfrac could just be
> applied once per expression. There are already hacks in there to get rid
> of the double null filtering. I'm not proposing that as something for
> this patch. It would be pretty invasive to change this.

There's no need; you can compute the nullfrac with nulltestsel(). While
there's some rework involved, it's not a big deal (it just reads the stats
tuple), and it's a clean solution.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
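For readers following the arithmetic, here is a small worked example of the double-exclusion correction (all numbers invented): suppose nullfrac = 0.4, and among the non-null 60% of rows, 75% satisfy value > 10 and 75% satisfy value < 20.

    lobound = P(value > 10) = 0.6 * 0.75 = 0.45   -- NULLs already excluded
    hibound = P(value < 20) = 0.6 * 0.75 = 0.45   -- NULLs already excluded
    s2 = hibound + lobound - 1.0 = -0.10          -- NULLs now excluded twice
    s2 += nullsel (0.4)   =>   s2 = 0.30          -- the correct fraction

A genuinely contradictory pair of bounds instead leaves s2 clearly negative even after adding nullsel back, which is what the (-1.0e-4 * notnullsel - 1.0e-6) cutoff tries to separate from mere rounding error.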
Re: [HACKERS] Making clausesel.c Smarter
On Tue, Apr 4, 2017 at 8:12 AM, David Rowley wrote:
> Result Comparison
>
>          Master median tps   Patch median tps   comparison
> Test 1   6993.7              6714.3             104.16%
> Test 2   7053.1              6921.6             101.90%
> Test 3   5137.2              4954.2             103.69%
> Test 4   27.1                19.4               139.72%
> Test 5   54.1                51.4               105.28%
> Test 6   9328.1              9478.2             98.42%
>
> Results Analyzed:
>
> Test 1 has caused planning to slow down 4.16%. There's quite a bit of
> noise in the results, but I think this indicates there is some overhead
> to having to add items to the cslist and to searching the cslist when
> new quals are seen.
>
> Test 2 has a lesser slowdown than test 1, as this test will exercise the
> existing rqlist caching in master too. Patched does a little more work,
> adding the equality condition to the list too.
>
> Test 3 is similar to test 1.
>
> Test 4 adds quite an overhead and causes 0.5 million comparisons to find
> the expressions in the cslist.
>
> Test 5 shows less overhead than test 4, since the master code also has
> to do the expression caching and searching.
>
> Test 6 is a control test.

That's consistent with the operation being quadratic. And while I was
about to point it out, the old code was quadratic too, so it sucks in both
versions. This version only adds other opexprs to the list and makes the
quadratic cost easier to run into, but it's nothing new.

I don't think there's an easy fix for that. You'd have to build a hash
table of expression nodes or something of the sort to avoid the quadratic
cost. If you can do that, by all means, but that's a much bigger patch.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
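A sketch of what such a hash table might look like (hypothetical code: nodeToString(), hash_any(), and equal() are real backend facilities, but the structure and its use inside clausesel.c are inventions for illustration):

    /* One bucket per expression hash; within a bucket, equal() confirms
     * true matches, exactly as the linear cslist scan does today. */
    typedef struct CslistHashEntry
    {
        uint32      hashval;    /* hash of the serialized expression */
        List       *cslist;     /* cached-selectivity items that hashed here */
    } CslistHashEntry;

    static uint32
    cslist_expr_hash(Node *expr)
    {
        char       *str = nodeToString(expr);
        uint32      h = DatumGetUInt32(hash_any((unsigned char *) str,
                                                strlen(str)));

        pfree(str);
        return h;
    }

Lookups then become amortized O(1) instead of a scan over every previously seen qual, at the cost of serializing each expression once.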
Re: [HACKERS] postgres_fdw: support parameterized foreign joins
On 23.03.2017 15:45, Etsuro Fujita wrote:
> Done. Also, I added regression tests and revised the code and comments a
> bit. (As for create_foreignscan_path(), I haven't done anything about
> that yet.) Please find attached a new version created on top of [1].

Thank you! I hadn't noticed that it is necessary to apply the
"epqpath-for-foreignjoin" patch first.

I have a few comments.

  * innersortkeys are the sort pathkeys for the inner side of the mergejoin
+ * req_outer_list is a list of sensible parameterizations for the join rel
  */

I think it would be better if the comment explained what type is stored in
req_outer_list. So the following comment would be good: "req_outer_list is
a list of Relids of sensible parameterizations for the join rel".

! 	Assert(foreignrel->reloptkind == RELOPT_JOINREL);

Here the new macro IS_JOIN_REL() can be used.

! 	/* Get the remote and local conditions */
! 	remote_conds = list_concat(list_copy(remote_param_join_conds),
! 							   fpinfo->remote_conds);
! 	local_param_join_exprs =
! 		get_actual_clauses(local_param_join_conds);
! 	local_exprs = list_concat(list_copy(local_param_join_exprs),
! 							  fpinfo->local_conds);

Is this code correct? 'remote_conds' and 'local_exprs' are initialized
above, when 'scan_clauses' is separated. Maybe it is better to use
'remote_conds' and 'local_exprs' instead of 'fpinfo->remote_conds' and
'fpinfo->local_conds' respectively?

And the last point: the patch needs rebasing, because the new macros
IS_JOIN_REL() and IS_UPPER_REL() were added, and the patch currently
applies with errors.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional
* David Steele (da...@pgmasters.net) wrote:
> On 3/22/17 4:42 PM, Peter Eisentraut wrote:
>> On 3/22/17 15:14, Stephen Frost wrote:
>>>> -SELECT * FROM pg_stop_backup(false);
>>>> +SELECT * FROM pg_stop_backup(false [, true ]);
>>>>
>>>> I think that it's better to get rid of "[" and "]" from the above
>>>> because IMO this should be the command example that users actually
>>>> can run.
>>>
>>> Using the '[' and ']' are how all of the optional arguments are
>>> specified in the documentation, see things like current_setting() in
>>> our existing documentation:
>>
>> In the synopsis, but not in concrete examples.
>
> Wasn't quite sure what the protocol was in this case, and figured it
> was easier to subtract than to add.

Forgot to close this out sooner, apologies.

This has been committed now.

Thanks!

Stephen

signature.asc
Description: Digital signature
Re: [HACKERS] [PATCH] Reduce src/test/recovery verbosity
Greetings,

* Craig Ringer (cr...@2ndquadrant.com) wrote:
> On 31 March 2017 at 04:29, Stephen Frost wrote:
>> Unless people wish to object, I'll use Michael's patch to remove
>> --verbose from the top level tomorrow.
>
> Sounds good.
>
> Maybe also add the following to src/test/perl/README:
>
>   To get more detailed output from tests, set PROVE_FLAGS='--verbose' or
>   examine the detailed test output in tmp_check/logs.

Committed, with those additions. Thanks!

Stephen

signature.asc
Description: Digital signature
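For anyone looking for the knob later, a typical invocation is shown below (the recovery suite is just one example; the other TAP test directories work the same way):

    PROVE_FLAGS='--verbose' make -C src/test/recovery check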