Re: pg_dumpall --exclude-database option
On Wed, Oct 31, 2018 at 05:44:26PM +0100, Fabien COELHO wrote:
> Patch v4 applies cleanly, compiles, doc generation ok, global & local tests ok.

+# also fails for -r and -t, but it seems pointless to add more tests for those.
+command_fails_like(
+    [ 'pg_dumpall', '--exclude-database=foo', '--globals-only' ],
+    qr/\Qpg_dumpall: option --exclude-database cannot be used together with -g\/--globals-only\E/,
+    'pg_dumpall: option --exclude-database cannot be used together with -g/--globals-only');

Usually testing all combinations is preferred, as well as having one error message for each pattern, which is also more consistent with all the other sanity checks in pg_dumpall.c and such.
--
Michael
Re: Strange failure in LWLock on skink in REL9_5_STABLE
On Tue, Sep 25, 2018 at 06:22:19PM +1200, Thomas Munro wrote:
> On Sat, Sep 22, 2018 at 4:52 PM Amit Kapila wrote:
>> I went through and checked the original proposal [1] to see if any use case is mentioned there, but nothing related has been discussed. I couldn't think of much use of this facility except maybe for something like parallelizing correlated sub-queries where the size of the outer var can change across executions and we might need to resize the initially allocated memory. This is just a wild thought, I don't have any concrete idea about this. Having said that, I don't object to removing this, especially because the implementation doesn't seem to be complete. In future, if someone needs such a facility, they can first develop a complete version of this API.
>
> Thanks for looking into that. Here's a pair of draft patches to disable and then remove dsm_resize() and dsm_map().

Hm. Don't we need to worry about anybody potentially using these APIs in a custom module on platforms where it was actually working? I imagine that their reaction is not going to be nice if any code breaks suddenly after a minor release. No issues with removing the interface on HEAD, of course.
--
Michael
Re: move PartitionBoundInfo creation code
On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
> Done a few moments ago. :)

Judging by the file sizes, this move is actually a net negative. From what I can see partcache decreases to 400 lines, while partbounds increases to 3k lines. There are a couple of things that this patch is doing:
1) Move the functions comparing two bounds into partbounds.c.
2) Move the chunk in charge of building PartitionBoundData into partbounds.c for each method: list, hash and range.

From what I can see, this patch actually brings more confusion by doing more things than just building all the PartitionBound structures, as it fills in each structure and then builds a mapping which is used to save each partition OID into the correct mapping position. Wouldn't it be better to go with logic that avoids this mapping, so that the partition OIDs are directly part of PartitionBound? It looks wrong to me to have build_partition_boundinfo create not only partdesc->boundinfo but also partdesc->oids, as the new routine is here to fill in data for the former, not the latter. The first phase building the bounds should switch to a switch/case like the second phase. PartitionHashBound & friends can become structures local to partbounds.c as they are used only there.

To be more consistent with all the other routines, like partition_bounds_equal/copy, wouldn't it be better to call the new routine partition_bounds_build or partition_bounds_create?
--
Michael
Re: Speeding up INSERTs and UPDATEs to partitioned tables
On 2018/11/04 19:07, David Rowley wrote: > On 1 November 2018 at 22:39, Amit Langote > wrote: > I've attached v13 which hopefully addresses these. Thank you for updating the patch. >> The macro naming discussion got me thinking today about the macro itself. >> It encapsulates access to the various PartitionTupleRouting arrays >> containing the maps, but maybe we've got the interface of tuple routing a >> bit (maybe a lot given this thread!) wrong to begin with. Instead of >> ExecFindPartition returning indexes into various PartitionTupleRouting >> arrays and its callers then using those indexes to fetch various objects >> from those arrays, why doesn't it return those objects itself? Although >> we made sure that the callers don't need to worry about the meaning of >> these indexes changing with this patch, it still seems a bit odd for them >> to have to go back to those arrays to get various objects. >> >> How about we change ExecFindPartition's interface so that it returns the >> ResultRelInfo, the two maps, and the partition slot? So, the arrays >> simply become a cache for ExecFindPartition et al and are no longer >> accessed outside execPartition.c. Although that makes the interface of >> ExecFindPartition longer, I think it reduces overall complexity. > > I don't really think stuffing values into a bunch of output parameters > is much of an improvement. I'd rather allow callers to fetch what they > need using the index we return. Most callers don't need to know about > the child to parent maps, so it seems nicer for those places not to > have to invent a dummy variable to pass along to ExecFindPartition() > so it can needlessly populate it for them. Well, if a caller finds a partition using ExecFindPartition, it's going to need to fetch those other objects anyway. Both of its callers that exist today, CopyFrom and ExecPrepareTupleRouting, fetch both maps and the slot in the same code block as ExecFindPartition. > Perhaps a better design would be to instead of having random special > partitioned-table-only fields in ResultRelInfo, just have an extra > struct there that contains the extra partition information which would > include the translation maps and then just return the ResultRelInfo > and allow callers to lookup any extra details they require. IIUC, you're saying that we could introduce a new struct that contains auxiliary information needed by partition pruning (maps, slot, etc. for tuple conversion) and add a new ResultRelInfo member of that struct type. That way, there is no need to return them separately or return an index to access them from their arrays. I guess we won't even need the arrays we have now. I think that might be a good idea and simplifies things significantly. It reminds me of how ResultRelInfo grew a ri_onConflict member of type OnConflictSetState [1]. We decided to go that way, as opposed to the earlier approach of having arrays of num_partitions length in ModifyTableState or PartitionTupleRouting that contained ON CONFLICT related objects for individual partitions. > I've not > looked into this in detail, but I think the committer work that's > required for the patch as it is today is already quite significant. > I'm not keen on warding any willing one off by making the commit job > any harder. I agree that it would be good to stabilise the API for > all this partitioning code sometime, but I don't believe it needs to > be done all in one commit. My intent here is to improve performance or > INSERT and UPDATE on partitioned tables. 
> Perhaps we can shape some API redesign later in the release cycle. What do you think?

I do suspect that simplifying ExecFindPartition's interface as part of the patch will make a committer's life easier, as the resulting interface is simpler, especially if we revise it like you suggest above.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=555ee77a9
Re: "Writing" output lines during make
On Mon, Nov 5, 2018 at 01:01:47PM +0700, John Naylor wrote:
> On 11/5/18, Bruce Momjian wrote:
> > During make --silent, I occasionally see these lines:
> >
> > Writing postgres.bki
> > Writing schemapg.h
> > Writing postgres.description
> > Writing postgres.shdescription
> >
> > I can't seem to find where these lines are being output, perhaps from a Perl library. I would like to suppress them but I don't know how.
>
> Those should be gone as of pg11. Otherwise, those messages come from backend/catalog/genbki.pl

Oh, I see it now in src/backend/catalog/Catalog.pm:

    print "Writing $final_name\n";

I do make in parallel, and couldn't find a pattern for when the lines appear, so it was hard to debug. I thought the lines were new but I guess my --silent usage was somehow new. Thanks.
--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Re: ToDo: show size of partitioned table
On 2018/11/04 4:58, Pavel Stehule wrote:
> here is a patch

Thank you, Pavel. Here are some comments.

I mentioned it during the last review, but maybe you missed it due to the other discussion.

+the pattern are listed. If the form \dP+
+is used, a sum of size of related partitions and a description
+are also displayed.

+the pattern are listed. If the form \dPi+
+is used, a sum of size of related indexes and a description
+are also displayed.

+the pattern are listed. If the form \dPt+
+is used, a sum of size of related indexes and a description
+are also displayed.

I suggest: "is used, the sum of sizes of related partitions / index partitions / table partitions and associated description are also displayed." Note that I also fixed the typo (indexes -> tables) in the last one.

Also, I wonder if we should mention in the description of \dP+ that the displayed size considers the sizes of both the tables and indexes on the individual partitions, because in the code, I see pg_total_relation_size being used. So, the text should be something like: "is used, the sum of sizes of related partitions (including the table and indexes, if any) and associated description are also displayed."

The code itself looks to me to be in good shape, except you seem to have also missed Michael's comment upthread:

+/* PostgreSQL 11 has pg_partition_tree function */
/* PostgreSQL 12 has pg_partition_tree function */

Thanks again.

Regards,
Amit
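For readers who have not followed the thread, a rough sketch of how the proposed meta-commands would be exercised follows; the table names are hypothetical and the exact columns psql prints depend on the final version of the patch, so treat this only as an illustration of the intent:

    -- hypothetical example schema
    create table measurement (logdate date, peaktemp int)
        partition by range (logdate);
    create table measurement_y2018 partition of measurement
        for values from ('2018-01-01') to ('2019-01-01');
    create index on measurement (logdate);

    \dP          -- list partitioned relations matching a pattern
    \dP+         -- also show the summed size (per the patch, via pg_total_relation_size)
    \dPi+        -- restrict the listing to partitioned indexes
    \dPt+        -- restrict the listing to partitioned tables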
Re: "Writing" output lines during make
On 11/5/18, Bruce Momjian wrote:
> During make --silent, I occasionally see these lines:
>
> Writing postgres.bki
> Writing schemapg.h
> Writing postgres.description
> Writing postgres.shdescription
>
> I can't seem to find where these lines are being output, perhaps from a Perl library. I would like to suppress them but I don't know how.

Those should be gone as of pg11. Otherwise, those messages come from backend/catalog/genbki.pl

-John Naylor
Re: "Writing" output lines during make
Bruce Momjian writes: > During make --silent, I occasionally see these lines: > Writing postgres.bki > Writing schemapg.h > Writing postgres.description > Writing postgres.shdescription > I can't seem to find where these lines are being output, perhaps from a > Perl library. I would like to suppress them but I don't know how. That's gone as of v11 & HEAD, no? regards, tom lane
"Writing" output lines during make
During make --silent, I occasionally see these lines: Writing postgres.bki Writing schemapg.h Writing postgres.description Writing postgres.shdescription I can't seem to find where these lines are being output, perhaps from a Perl library. I would like to suppress them but I don't know how. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: partitioned tables referenced by FKs
On Fri, Nov 2, 2018 at 7:42 PM Alvaro Herrera wrote:
> Here's a patch to allow partitioned tables to be referenced by foreign keys. Current state is WIP, but everything should work; see below for the expected exception.
>
> The design is very simple: have one pg_constraint row for each partition on each side, each row pointing to the topmost table on the other side; triggers appear on each leaf partition (and naturally they don't appear on any intermediate partitioned table).

This is an important and much-needed feature!

Based on my extremely naive reading of this code, I have two perhaps equally naive questions:

1. It seems that we will continue to do per-row RI checks for inserts and updates. However, there already exists a bulk check in RI_Initial_Check(). Could we modify this bulk check to do RI checks on a per-statement basis rather than a per-row basis?

2. If #1 is possible, is the overhead of transition tables too great for the single-row case?
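For concreteness, the kind of schema the patch is meant to make possible looks roughly like the sketch below. The names are hypothetical, and on servers without the patch the REFERENCES clause is rejected because a foreign key cannot point at a partitioned table:

    -- sketch of the feature under discussion: a foreign key whose
    -- referenced table is partitioned
    create table items (id int primary key) partition by range (id);
    create table items_low partition of items for values from (1) to (1000000);

    create table orders (
        order_id bigint primary key,
        item_id  int not null references items (id)  -- the new capability
    );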
ON COMMIT actions and inheritance
Hi, Michael pointed out a problem with specifying different ON COMMIT actions on a temporary inheritance parent and its children: https://www.postgresql.org/message-id/20181102051804.GV1727%40paquier.xyz The problem is that when PreCommit_on_commit_actions() executes an ON COMMIT DROP action on the parent, it will drop its children as well. It doesn't however remove the children's own actions, especially ON COMMIT DELETE ROWS, from the list and when it gets around to executing them, the children are already gone. That causes the heap_open in heap_truncate to fail with an error like this: ERROR: XX000: could not open relation with OID 16420 LOCATION: relation_open, heapam.c:1138 One way to fix that is to remove the tables that no longer exist from the list that's passed to heap_truncate(), which the attached patch implements. Thanks, Amit From 37c97e6b9879b130a80305fd794a9e721332cb98 Mon Sep 17 00:00:00 2001 From: amit Date: Fri, 2 Nov 2018 15:26:37 +0900 Subject: [PATCH] Check relation still exists before applying ON COMMIT action --- src/backend/commands/tablecmds.c | 17 + src/test/regress/expected/temp.out | 23 +++ src/test/regress/sql/temp.sql | 15 +++ 3 files changed, 55 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 6048334c75..5b17a8679d 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -13353,6 +13353,23 @@ PreCommit_on_commit_actions(void) } if (oids_to_truncate != NIL) { + List *tmp = oids_to_truncate; + + /* +* Check that relations we're about to truncate still exist. Some of +* them might be inheritance children, which would be gone as a result +* of their parent being dropped due to ONCOMMIT_DROP action of its +* own. +*/ + oids_to_truncate = NIL; + foreach(l, tmp) + { + Oid relid = lfirst_oid(l); + + if (SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid))) + oids_to_truncate = lappend_oid(oids_to_truncate, relid); + } + heap_truncate(oids_to_truncate); CommandCounterIncrement(); /* XXX needed? */ } diff --git a/src/test/regress/expected/temp.out b/src/test/regress/expected/temp.out index a769abe9bb..70be46ca1d 100644 --- a/src/test/regress/expected/temp.out +++ b/src/test/regress/expected/temp.out @@ -216,3 +216,26 @@ select * from temp_parted_oncommit; (0 rows) drop table temp_parted_oncommit; +-- Another case where a child table is gone by the time it's ON COMMIT +-- action is executed due to the ON COMMIT DROP action on its parent +-- There are tests for both a partitioned table and regular inheritance. 
+begin; +create temp table temp_parted_oncommit_test (a int) partition by list (a) on commit drop; +create temp table temp_parted_oncommit_test1 partition of temp_parted_oncommit_test for values in (1) on commit delete rows; +insert into temp_parted_oncommit_test values (1); +create temp table temp_inh_oncommit_test (a int) on commit drop; +create temp table temp_inh_oncommit_test1 () inherits(temp_inh_oncommit_test) on commit delete rows; +insert into temp_inh_oncommit_test1 values (1); +commit; +NOTICE: drop cascades to table temp_inh_oncommit_test1 +-- zero rows in both cases +select relname from pg_class where relname like 'temp_parted_oncommit_test%'; + relname +- +(0 rows) + +select relname from pg_class where relname like 'temp_inh_oncommit_test%'; + relname +- +(0 rows) + diff --git a/src/test/regress/sql/temp.sql b/src/test/regress/sql/temp.sql index 1074c7cfac..ad8c5cb85a 100644 --- a/src/test/regress/sql/temp.sql +++ b/src/test/regress/sql/temp.sql @@ -165,3 +165,18 @@ commit; -- partitions are emptied by the previous commit select * from temp_parted_oncommit; drop table temp_parted_oncommit; + +-- Another case where a child table is gone by the time it's ON COMMIT +-- action is executed due to the ON COMMIT DROP action on its parent +-- There are tests for both a partitioned table and regular inheritance. +begin; +create temp table temp_parted_oncommit_test (a int) partition by list (a) on commit drop; +create temp table temp_parted_oncommit_test1 partition of temp_parted_oncommit_test for values in (1) on commit delete rows; +insert into temp_parted_oncommit_test values (1); +create temp table temp_inh_oncommit_test (a int) on commit drop; +create temp table temp_inh_oncommit_test1 () inherits(temp_inh_oncommit_test) on commit delete rows; +insert into temp_inh_oncommit_test1 values (1); +commit; +-- zero rows in both cases +select relname from pg_class where relname like 'temp_parted_oncommit_test%'; +select relname from pg_class where relname like 'temp_inh_on
Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT
On Mon, Nov 5, 2018 at 5:49 AM Michael Paquier wrote: > On Fri, Nov 02, 2018 at 04:39:07PM +0900, Amit Langote wrote: > > Agreed that they're two independent issues, although it wouldn't be such > a > > bad idea to fix them in one go, as they're both issues related to the > > handling of ON COMMIT actions on tables in inheritance trees. > > I have pushed 0001 which fixes the bug reported on this thread down to > v10 > Thanks Michael, Amit.
Re: Hooks to Modify Execution Flow and Query Planner
Hi, On 2018/11/03 17:28, Vincent Mirian wrote: > Hi Amit, > > Thank you for your response. Chapters 51, 57 and 59 (Overview of PostgreSQL > Internals, Writing A Foreign Data Wrapper and Writing A Custom Scan > Provider) seem to be relevant. Aside from the source code snippets in the > document, is there functional source code that can be used as reference? You can look at contrib/postgres_fdw for an example of a quite functional foreign data wrapper. For examples of a custom scan provider, you can look at Citus [1], pg-strom [2]. Citus, for example, implements plan nodes like MultiJoin, Job, etc. that are not present in vanilla PostgreSQL to implement additional query execution functionality. Similarly, you can find custom nodes like GpuJoin in pg-strom. > Also, aside from this mailing list, is there an interactive medium for > asking questions? There are IRC and Slack channels which you can access via the following page: https://www.postgresql.org/community/ Thanks, Amit [1] https://github.com/citusdata/citus [2] https://github.com/heterodb/pg-strom
Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT
On 2018/11/05 9:19, Michael Paquier wrote: > On Fri, Nov 02, 2018 at 04:39:07PM +0900, Amit Langote wrote: >> Agreed that they're two independent issues, although it wouldn't be such a >> bad idea to fix them in one go, as they're both issues related to the >> handling of ON COMMIT actions on tables in inheritance trees. > > I have pushed 0001 which fixes the bug reported on this thread down to > v10, after tweaking a bit the patch after more review. I was testing > heap_truncate_one_rel() a bit more deeply and it actually happens that > it can work with matviews. Re-reading the thread and sleeping on it, > Tom was been actually suggesting to move the check one level down to > heap_truncate_one_rel(), which actually makes more sense. So I have > changed the patch so as a check on RELKIND_PARTITIONED_TABLE is done > instead of RELKIND_RELATION which is what has been proposed until now. Okay, it's good that heap_truncate_one_rel() continues to work for all relation types that can have storage. Thanks for making the changes yourself and committing. > Regarding the second patch, could you start a second thread? The scope > is not only related to partitioned tables but also to inheritance trees > so this goes way back in time, and I think that we could attract a > better audience about the problem. I don't mind starting a thread > myself, not without your authorization as you wrote a patch to deal with > the problem. My apologies, I have not looked at what you are proposing > yet. Okay, no problem. I will post that patch in a new thread detailing what the problem is and how the proposed patch fixes it. Thanks, Amit
Re: Vacuum Full does not release the disk size space after delete from table
Thank Tom! We will check it. On Fri, Nov 2, 2018 at 10:35 PM Tom Lane wrote: > Haozhou Wang writes: > > We meet a corner case that related to the behavior of Vacuum Full. > > ... > > If we run both sql scripts on same database in parallel, the "VACUUM FULL > > a;" will not release the disk space. > > I think what's happening is that the delete in script 1 happens after the > "pg_sleep" in script 2 starts. Then the pg_sleep has an open snapshot > that could potentially see the deleted rows, so they can't be removed yet. > > You could check this theory by changing the vacuum to use VERBOSE, and > seeing what it says about rows that can't be removed yet. > > regards, tom lane > -- Regards, Haozhou
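A minimal way to check Tom's theory is indeed to look at the VERBOSE output; a sketch, using the table "a" from the original scripts:

    -- session 1: delete, then vacuum with VERBOSE to see why space is kept
    DELETE FROM a;
    VACUUM (FULL, VERBOSE) a;
    -- If the report shows a large number of dead row versions that
    -- "cannot be removed yet", an older open snapshot (such as the
    -- session still running the pg_sleep() query in the second script)
    -- is holding them back, which is why VACUUM FULL cannot shrink the
    -- relation at that point.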
Re: zheap: a new storage format for PostgreSQL
On Sat, Nov 3, 2018 at 9:30 AM Amit Kapila wrote: > On Fri, Nov 2, 2018 at 6:41 PM Tomas Vondra > wrote: > > I'm sure > > it's not the only place where we do something like this, and the other > > places don't trigger the valgrind warning, so how do those places do > > this? heapam seems to call fetch_att in the end, which essentially calls > > Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same > > trick here? > > > > This is because, in zheap, we have omitted all alignment padding for > pass-by-value types. See the description in my previous email [1]. I > think here we need to initialize ret_datum at the beginning of the > function unless you have some better idea. > I have pushed a fix on the above lines in zheap-branch, but I am open to change it if you have better ideas for the same. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_dump multi VALUES INSERT
Greetings, * Michael Paquier (mich...@paquier.xyz) wrote: > On Wed, Oct 17, 2018 at 03:05:28PM -0400, Stephen Frost wrote: > > The point of it is that it makes loading into other RDBMS faster. Yes, > > it has many of the same issues as our COPY does, but we support it > > because it's much faster. The same is true here, just for other > > databases, so I'm +1 on the general idea. > > Well, the patch author has mentioned that he cares about also being able > to detect errors when processing the dump, which multi inserts make that > easier to check for. However, even if you specify --data-only you still > need to worry about the first SET commands ahead, which requires manual > handling of the dump... That's hardly a serious complication.. > I am honestly not convinced that it is worth complicating pg_dump for > that, as there is no guarantee either that the DDLs generated by pg_dump > will be compatible with what other systems expect. This kind of > compatibility for fetch and reload can also be kind of tricky with > portability issues, so I'd rather let this stuff being handled correctly > by other tools like pgloader or others rather than expecting to get this > stuff half-baked within Postgres core tools. I can see an argument for not wanting to complicate pg_dump, but we've explicitly stated that the purpose of --inserts is to facilitate restoring into other database systems and I don't agree that we should just punt on that entirely. For better or worse, there's an awful lot of weight put behind things which are in core and we should take care to do what we can to make those things better, especially when someone is proposing a patch to improve the situation. Sure, the patch might need work or have other issues, but that's an opportunity for us to provide feedback to the author and encourage them to improve the patch. As for the other things that make it difficult to use pg_dump to get a result out that can be loaded into other database systems- let's try to improve on that too. Having an option to skip the 'setup' bits, such as the SET commands, certainly wouldn't be hard. I certainly don't see us adding code to pg_dump to handle 'fetch and reload' into some non-PG system, or, really, even into a PG system, and that certainly isn't something the patch does, so I don't think it's a particularly interesting argument. Users can handle that as needed themselves. In other words, none of the arguments put forth really seem to be a reason to reject the general idea of this patch, so I'm still +1 on that. Having just glanced over the patch quickly, I think I would have done something like '--inserts=100' as the way to enable it instead of adding a new option though. Not that I feel very strongly about it. Thanks! Stephen signature.asc Description: PGP signature
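To make the proposal concrete, the dump output being discussed would look roughly like the sketch below. The option spelling is only the suggestion made above, and the table is hypothetical; the number of rows per statement would be whatever the user asks for:

    -- e.g. "pg_dump --inserts=100 --data-only mydb" would emit
    -- multi-row statements such as:
    INSERT INTO public.t (id, val) VALUES
        (1, 'a'),
        (2, 'b'),
        (3, 'c');
    -- instead of one single-row INSERT per tuple, which loads much
    -- faster into systems that have no equivalent of COPY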
Re: pg_dump multi VALUES INSERT
On Wed, Oct 17, 2018 at 03:05:28PM -0400, Stephen Frost wrote: > The point of it is that it makes loading into other RDBMS faster. Yes, > it has many of the same issues as our COPY does, but we support it > because it's much faster. The same is true here, just for other > databases, so I'm +1 on the general idea. Well, the patch author has mentioned that he cares about also being able to detect errors when processing the dump, which multi inserts make that easier to check for. However, even if you specify --data-only you still need to worry about the first SET commands ahead, which requires manual handling of the dump... I am honestly not convinced that it is worth complicating pg_dump for that, as there is no guarantee either that the DDLs generated by pg_dump will be compatible with what other systems expect. This kind of compatibility for fetch and reload can also be kind of tricky with portability issues, so I'd rather let this stuff being handled correctly by other tools like pgloader or others rather than expecting to get this stuff half-baked within Postgres core tools. -- Michael signature.asc Description: PGP signature
Re: Unused entry in pg_opfamily
On Sun, Nov 04, 2018 at 08:02:25PM -0500, Tom Lane wrote:
> We should, therefore, at least add an opr_sanity check for opfamilies without opclasses. I wonder how far it's worth going to check for other orphaned catalog entries.

+1.
--
Michael
Unused entry in pg_opfamily
I happened to notice that the pg_opfamily entry with OID 4035 (jsonb_ops for GIST) is not referenced anywhere; apparently it was put in in anticipation of support that never materialized. We still pass make check-world if it's removed.

Getting rid of it seems like a no-brainer, but it surprised me a bit that none of our automated crosschecks had noticed this. It looks like the reason is that all the amvalidate logic is driven off pg_opclass entries, and so an opfamily that contains no opclass is never examined. We should, therefore, at least add an opr_sanity check for opfamilies without opclasses. I wonder how far it's worth going to check for other orphaned catalog entries.

regards, tom lane
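The proposed check can be approximated with a catalog query; a rough sketch of what opr_sanity could look for (the real check would likely also consider pg_amop/pg_amproc references before flagging an entry):

    -- operator families that are not referenced by any operator class;
    -- the orphaned GiST jsonb_ops entry would show up here
    SELECT f.oid, f.opfname, am.amname
    FROM pg_opfamily f
        JOIN pg_am am ON am.oid = f.opfmethod
    WHERE NOT EXISTS (SELECT 1 FROM pg_opclass c WHERE c.opcfamily = f.oid);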
Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT
On Fri, Nov 02, 2018 at 04:39:07PM +0900, Amit Langote wrote:
> Agreed that they're two independent issues, although it wouldn't be such a bad idea to fix them in one go, as they're both issues related to the handling of ON COMMIT actions on tables in inheritance trees.

I have pushed 0001, which fixes the bug reported on this thread, down to v10, after tweaking the patch a bit after more review. I was testing heap_truncate_one_rel() a bit more deeply and it actually happens that it can work with matviews. Re-reading the thread and sleeping on it, Tom was actually suggesting to move the check one level down to heap_truncate_one_rel(), which actually makes more sense. So I have changed the patch so that a check on RELKIND_PARTITIONED_TABLE is done instead of RELKIND_RELATION, which is what has been proposed until now.

Regarding the second patch, could you start a second thread? The scope is not only related to partitioned tables but also to inheritance trees, so this goes way back in time, and I think that we could attract a better audience about the problem. I don't mind starting a thread myself, but not without your authorization, as you wrote a patch to deal with the problem. My apologies, I have not looked at what you are proposing yet.
--
Michael
Re: Delta Materialized View Refreshes?
denty wrote
> (Seems I can't attach via the web interface, so copy/paste patch below.)
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Sending attachments from this web site (which is not an official Postgres website) has been disabled as requested by Postgres community members (attachments were stored locally in Nabble but not sent); see https://www.postgresql-archive.org/postgresql-archive-org-td6035059i20.html#a6035631
Re: Delta Materialized View Refreshes?
Hi folks,

I had a crack at this, and it was pretty simple to get something working to play around with, and it seems like it might be useful. I developed it against 10.1, as that's what I happened to be working with at the time. The patch is pretty small, and I hoped it would apply cleanly against 11. Unfortunately it doesn't, but I doubt the issues are substantial. If there is interest in moving this forward, I'll update and re-share.

The patch enables pretty much exactly what Jeremy suggests — something like "refresh materialized view concurrently testview where type = 'main';" — with fairly obvious semantics.

Welcome comments on the patch or approach.

denty.

(Seems I can't attach via the web interface, so copy/paste patch below.)
Re: Ordered Partitioned Table Scans
On 1 November 2018 at 22:05, Antonin Houska wrote: > I think these conditions are too restrictive: > > /* > * Determine if these pathkeys match the partition order, or reverse > * partition order. It can't match both, so only go to the trouble of > * checking the reverse order when it's not in ascending partition > * order. > */ > partition_order = pathkeys_contained_in(pathkeys, > partition_pathkeys); > partition_order_desc = !partition_order && > pathkeys_contained_in(pathkeys, > > partition_pathkeys_desc); > > > In the example above I wanted to show that your new feature should work even > if "pathkeys" is not contained in "partition_pathkeys". Okay, after a bit more time looking at this I see what you're saying now and I agree, but; see below. >> > Another problem I see is that build_partition_pathkeys() continues even if >> > it >> > fails to create a pathkey for some partitioning column. In the example >> > above >> > it would mean that the table can have "partition_pathkeys" equal to {j} >> > instead of {i, j} on some EC-related conditions. However such a key does >> > not >> > correspond to reality - this is easier to imagine if another partition is >> > considered. >> > >> > partition 3: >> > >> > i | j | k >> > ---+---+--- >> > 1 | 0 | 1 >> > 1 | 0 | 0 >> > >> > So I think no "partition_pathkeys" should be generated in that case. On the >> > other hand, if the function returned the part of the list it could >> > construct >> > so far, it'd be wrong because such incomplete pathkeys could pass the >> > checks I >> > proposed above for reasons unrelated to the partitioning scheme. >> >> Oops. That's a mistake. We should return what we have so far if we >> can't make one of the pathkeys. Will fix. > > I think no "partition_pathkeys" should be created in this case, but before we > can discuss this in detail there needs to be an agreement on the evaluation of > the relationship between "pathkeys" and "partition_pathkeys", see above. The problem with doing that is that if the partition keys are better than the pathkeys then we'll most likely fail to generate any partition keys at all due to lack of any existing eclass to use for the pathkeys. It's unsafe to use just the prefix in this case as the eclass may not have been found due to, for example one of the partition keys having a different collation than the required sort order of the query. In other words, we can't rely on a failure to create the pathkey meaning that a more strict sort order is not required. I'm a bit unsure on how safe it would be to pass "create_it" as true to make_pathkey_from_sortinfo(). We might be building partition path keys for some sub-partitioned table. In this case the eclass should likely have a its member added with em_is_child = true. The existing code always sets em_is_child to false. It's not that clear to me that setting up a new eclass with a single em_is_child = true member is correct. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: date_trunc() in a specific time zone
On Mon, Oct 29, 2018 at 04:18:23PM +0100, Vik Fearing wrote: > A use case that I see quite a lot of is needing to do reports and other > calculations on data per day/hour/etc but in the user's time zone. The > way to do that is fairly trivial, but it's not obvious what it does so > reading queries becomes just a little bit more difficult. > > Attached is a patch to create a function for it, based off 5953c99697. +1 In a slightly related matter, at some point, we also need to come up with a timestamptz2 or some such which preserves the input time zone. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
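For anyone who has not seen the idiom, the "fairly trivial but not obvious" workaround and the proposed replacement look roughly like this; the table and column names are hypothetical, and the exact signature is whatever the patch settles on:

    -- today: truncate to the day boundary in the user's zone, then convert back
    SELECT date_trunc('day', ts AT TIME ZONE 'Europe/Paris') AT TIME ZONE 'Europe/Paris'
    FROM events;

    -- with the proposed function: the same thing, stated directly
    SELECT date_trunc('day', ts, 'Europe/Paris')
    FROM events;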
Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"
I wrote: > BTW, it looks to me like ExecuteCallStmt trashes the passed-in CallStmt > in cases where expand_function_arguments is not a no-op. Is that > really safe? Seems to me it'd cause problems if, for example, dealing > with a CallStmt that's part of a prepared stmt or cached plan. I dug into why this wasn't falling over like mad, and the answer is that the only reason it doesn't fall over is that plancache.c *always* decides to build a new custom plan for any cached plan consisting strictly of utility statements. That's because cached_plan_cost() ignores utility statements altogether, causing both the generic and custom plan costs to be exactly zero, and choose_custom_plan() figures it might as well break a tie in favor of generating a custom plan. Therefore, even when dealing with a cached CALL statement, ExecuteCallStmt always sees a virgin new parse tree, and the fact that it corrupts it isn't a problem. If I adjust the plancache logic so that it's capable of deciding to use a generic cached plan, repeated execution of a CALL statement with named parameters crashes at the seventh iteration, which is the first one that would actually try to re-use a cached plan tree. Nonetheless, make check-world passes, indicating that there is no place in our regression tests that stresses this type of situation. (On the other hand, if I set plan_cache_mode = force_generic_plan, kaboom.) This leaves me worried that there are probably other places besides CallStmt that think they can scribble on a utility statement's parse tree at execution time. We're clearly not testing that situation enough. For the moment, I'm going to go fix ExecuteCallStmt so it doesn't do this, but I'm wondering how we could detect such problems more generically. Also, it's just silly that the plancache won't use a generic plan for utility statements, so I think we should change that (in HEAD only). The easiest mod is to adjust cached_plan_cost() so that it charges some nonzero amount for "planning" of utility statements. regards, tom lane
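For reference, a hypothetical sketch of the kind of statement that exercises this path: a CALL with a named INOUT argument executed repeatedly from PL/pgSQL, so that it goes through the plan cache, with plan_cache_mode = force_generic_plan making the cached CallStmt actually get reused. The procedure itself is just the docs example; none of this is from the committed fix:

    CREATE PROCEDURE triple(INOUT x int) LANGUAGE plpgsql AS
    $$ BEGIN x := x * 3; END $$;

    SET plan_cache_mode = force_generic_plan;

    DO $$
    DECLARE v int;
    BEGIN
        FOR i IN 1..10 LOOP
            v := i;
            CALL triple(x => v);   -- named notation, so expand_function_arguments
                                   -- is not a no-op for this statement
        END LOOP;
    END $$;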
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
On Sun, Nov 4, 2018 at 8:21 AM Andrey Lepikhov wrote: > I mean that your code have not any problems that I can detect by > regression tests and by the retail index tuple deletion patch. > Difference in amount of dropped objects is not a problem. It is caused > by pos 2293 - 'else if (thisobj->objectSubId == 0)' - at the file > catalog/dependency.c and it is legal behavior: column row object deleted > without any report because we already decided to drop its whole table. The behavior implied by using ASC heap TID order is always "legal", but it may cause a regression in certain functionality -- something that an ordinary user might complain about. There were some changes when DESC heap TID order is used too, of course, but those were safe to ignore (it seemed like nobody could ever care). It might have been okay to just use DESC order, but since it now seems like I must use ASC heap TID order for performance reasons, I have to tackle a couple of these issues head-on (e.g. 'cannot drop trigger trg1'). > Also, I checked the triggers test. Difference in the ERROR message > 'cannot drop trigger trg1' is caused by different order of tuples in the > relation with the dependDependerIndexId relid. It is legal behavior and > we can simply replace test results. Let's look at this specific "trg1" case: """ create table trigpart (a int, b int) partition by range (a); create table trigpart1 partition of trigpart for values from (0) to (1000); create trigger trg1 after insert on trigpart for each row execute procedure trigger_nothing(); ... drop trigger trg1 on trigpart1; -- fail -ERROR: cannot drop trigger trg1 on table trigpart1 because trigger trg1 on table trigpart requires it -HINT: You can drop trigger trg1 on table trigpart instead. +ERROR: cannot drop trigger trg1 on table trigpart1 because table trigpart1 requires it +HINT: You can drop table trigpart1 instead. """ The original hint suggests "you need to drop the object on the partition parent instead of its child", which is useful. The new hint suggests "instead of dropping the trigger on the partition child, maybe drop the child itself!". That's almost an insult to the user. Now, I suppose that I could claim that it's not my responsibility to fix this, since we get the useful behavior only due to accidental implementation details. I'm not going to take that position, though. I think that I am obliged to follow both the letter and the spirit of the law. I'm almost certain that this regression test was written because somebody specifically cared about getting the original, useful message. The underlying assumptions may have been a bit shaky, but we all know how common it is for software to evolve to depend on implementation-defined details. We've all written code that does it, but hopefully it didn't hurt us much because we also wrote regression tests that exercised the useful behavior. > May be you know any another problems of the patch? Just the lack of pg_upgrade support. That is progressing nicely, though. I'll probably have that part in the next revision of the patch. I've found what looks like a workable approach, though I need to work on a testing strategy for pg_upgrade. -- Peter Geoghegan
Re: using index or check in ALTER TABLE SET NOT NULL
Hello

> This patch went through the last three commit fests without any noticeable activity

Well, the last two CFs. During the March commitfest the patch had activity and was marked as Ready for Committer. But by the end of the July CF there was no further activity, and the patch was moved back to "Needs review" with this reason [1]:

> It's not clear that there is consensus that this is wanted.

I can't say anything more here.

> If not properly cataloguing NOT NULL constraints would be fixed, can it potentially conflict with the current patch or not?

We are already doing the same thing for "alter table attach partition", and in this patch I use exactly that routine. If proper cataloguing would conflict with my patch, it would conflict with the "attach partition" validation too. I think proper cataloguing can be implemented without conflicting with the proposed feature.

regards, Sergei

[1]: https://www.postgresql.org/message-id/c4c7556d-a046-ac29-2549-bdef0078b6fe%402ndQuadrant.com
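For readers new to the thread, the usage pattern the patch is aimed at is roughly the sketch below (hypothetical table and column names). With the patch, the final SET NOT NULL can skip its full-table scan because the already-validated check constraint proves there are no NULLs:

    -- add the proof as a NOT VALID constraint, then validate it; the
    -- validation scan runs under a weaker lock than a bare SET NOT NULL
    ALTER TABLE big_table ADD CONSTRAINT big_table_col_not_null
        CHECK (col IS NOT NULL) NOT VALID;
    ALTER TABLE big_table VALIDATE CONSTRAINT big_table_col_not_null;

    -- with the patch this no longer rescans the table
    ALTER TABLE big_table ALTER COLUMN col SET NOT NULL;
    ALTER TABLE big_table DROP CONSTRAINT big_table_col_not_null;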
Re: [PATCH] Improvements to "Getting started" tutorial for Google Code-in task
On 04.11.18 02:53, LAM JUN RONG wrote: Hi, I have made some changes based on Andreas’ suggestions. > + then one or more of the packages PostgreSQL requires is not installed. > + See for the required packages. > > How about: > then packages which are required to build > PostgreSQL are missing. > See for a list of requirements. This seems better than mine, I have included it. Ok. > If you are not sure whether PostgreSQL > - is already available or whether you can use it for your > - experimentation then you can install it yourself. Doing so is not > + is already available for your experimentation, > + you can install it yourself. Doing so is not > hard and it can be a good exercise. > > This change does not make much sense, given that you leave the > second part of the sentence. How’s this: If you are not sure whether PostgreSQL is already available for your experimentation, you can install it yourself, which is not hard. "you can install it by yourself, which is not complicated"? I don't like the "hard" there. > As is typical of client/server applications, the client and the > - server can be on different hosts. In that case they communicate > + server can be on different machines or networks. In that case they communicate > over a TCP/IP network connection. You should keep this in mind, > I think "hosts" is preferred here. The differentiation between machines > and networks is not necessary. > > > > - The path at your site might be different. Contact your site > + The path at your site's server might be different. Contact your site > administrator or check the installation instructions to > Dunno if this change improves the wording, or the understanding. > How about replacing "site" with "installation"? For the 2 points above, I decided that the original documentation seems fine. Ok. > PostgreSQL allows you to create any > - number of databases at a given site. Database names must have an > - alphabetic first character and are limited to 63 bytes in > - length. A convenient choice is to create a database with the same > - name as your current user name. Many tools assume that database > - name as the default, so it can save you some typing. To create > - that database, simply type: > + number of databases at a given site. Database names are limited to 63 bytes in > + length. Database names longer than 63 bytes will be truncated. A convenient > + choice is to create a database with the same name as your current user name. > + Many tools assume that database name as the default, so it > + can save you some typing. To create that database, simply type: > The part about "truncate" is correct, maybe you can include NAMEDATALEN here as well. > The part about the first character is correct - except you quote the name. > Then any name is allowed. If you rewrite this part, maybe include this as well. I’ve made some changes to include the part about NAMEDATALEN and quoting: However, if you would like to create databases with names that do not start with an alphabetic character, you will need to quote it like so: "1234". Database names are limited to 63 bytes in length. Database names longer than 63 bytes will be truncated. You can change this limit by modifying the NAMEDATALEN variable, but that would require recompiling PostgreSQL. you will need to quote the the identifier, like "1st database". Database names are limited to 63 bytes in length, or more specifically to the length defined in NAMEDATALEN. Changing this value requires recompiling the database. 
Names which are longer than that are truncated. A convenient choice ... Regards, -- Andreas 'ads' Scherbaum German PostgreSQL User Group European PostgreSQL User Group - Board of Directors Volunteer Regional Contact, Germany - PostgreSQL Project
Re: wal_dump output on CREATE DATABASE
On Fri, Nov 2, 2018 at 08:37, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> > Example on CREATE DATABASE (without defining a template database):
> > rmgr: Database len (rec/tot): 42/42, tx: 568, lsn: 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
> >
> > It comes out (to me) it may be more consistent to use the same template than the other operations in pg_waldump. I propose to swap the parameters positions for the copy dir operation output.
> >
> > You'll find a patch file included which does the switching job:
> > rmgr: Database len (rec/tot): 42/42, tx: 568, lsn: 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384
>
> I see your point. I suspect this was mainly implemented that way because that's how the fields occur in the underlying xl_dbase_create_rec structure. (But that structure also has the target location first, so it's not entirely consistent that way either.) We could switch the fields around in the output, but I wonder whether we couldn't make the whole thing a bit more readable like this:
>
> desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384
>
> or maybe like this
>
> desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384

Thank you Peter for your review and proposal. I like the last one, which gives the fields' semantics in a concise way. Pushing the idea a bit farther, we could think of applying that description to any other operation that involves the ts/db/oid fields. What do you think? For example, in nbtdesc.c we could add the "(ts/db/oid)" information to help the DBA identify the objects:

case XLOG_BTREE_REUSE_PAGE:
{
    xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;

    appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u; latestRemovedXid %u",
                     xlrec->node.spcNode, xlrec->node.dbNode,
                     xlrec->node.relNode, xlrec->latestRemovedXid);
    break;
}

This may help DBAs better identify the objects related to the messages. Any thoughts/comments/suggestions?

~~~ Side story ~~~
Well, to be clear, my first proposal was born while I was writing a side script to textually identify which objects are involved in pg_waldump operations (if those objects still exist at script run time). I'm wondering whether it could be interesting to include such a "textual decoding" in pg_waldump or not (as a command line switch). Anyway, my script would be available as a proof of concept first. We have time to discuss that last point in another thread.

Thank you
Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"
Pavel Stehule writes:
> I am not sure how safe it is to read argmodes from syscache after procedure execution. Theoretically, the procedure's pg_proc tuple can be modified from the procedure, and can be committed from the procedure. Isn't it better to save argmodes before execution?

Hm. That would mean throwing non-writable-arg errors before rather than after, but that's probably fine.

regards, tom lane
Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"
On Sun, Nov 4, 2018 at 17:14, Tom Lane wrote:
> Pavel Stehule writes:
> > On Sun, Nov 4, 2018 at 16:54, Tom Lane wrote:
> >> In short, I think it's a bug that we allow the above. If you want to keep the must-be-a-variable error then it should apply in this case too.
> >
> > I agree. This should be prohibited from PLpgSQL.
>
> OK. In that case I'll run with my patch. The attached is what I had code-wise, but I've not changed the regression tests to match ... gotta go fix the docs too I guess.

I am not sure how safe it is to read argmodes from syscache after procedure execution. Theoretically, the procedure's pg_proc tuple can be modified from the procedure, and can be committed from the procedure. Isn't it better to save argmodes before execution?

regards

Pavel

> regards, tom lane
Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
On 04.11.2018 9:31, Peter Geoghegan wrote: On Sat, Nov 3, 2018 at 8:52 PM Andrey Lepikhov wrote: I applied your patches at top of master. After tests corrections (related to TID ordering in index relations DROP...CASCADE operation) 'make check-world' passed successfully many times. In the case of 'create view' regression test - 'drop cascades to 62 other objects' problem - I verify an Álvaro Herrera hypothesis [1] and it is true. You can verify it by tracking the object_address_present_add_flags() routine return value. I'll have to go and fix the problem directly, so that ASC sort order can be used. Some doubts, however, may be regarding the 'triggers' test. May you specify the test failures do you mean? Not sure what you mean. The order of items that are listed in the DETAIL for a cascading DROP can have an "implementation defined" order. I think that this is an example of the more general problem -- what you call the 'drop cascades to 62 other objects' problem is a more specific subproblem, or, if you prefer, a more specific symptom of the same problem. I mean that your code have not any problems that I can detect by regression tests and by the retail index tuple deletion patch. Difference in amount of dropped objects is not a problem. It is caused by pos 2293 - 'else if (thisobj->objectSubId == 0)' - at the file catalog/dependency.c and it is legal behavior: column row object deleted without any report because we already decided to drop its whole table. Also, I checked the triggers test. Difference in the ERROR message 'cannot drop trigger trg1' is caused by different order of tuples in the relation with the dependDependerIndexId relid. It is legal behavior and we can simply replace test results. May be you know any another problems of the patch? -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company
Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"
Pavel Stehule writes: > ne 4. 11. 2018 v 16:54 odesilatel Tom Lane napsal: >> In short, I think it's a bug that we allow the above. If you >> want to keep the must-be-a-variable error then it should apply in >> this case too. > I agree. This should be prohibited from PLpgSQL. OK. In that case I'll run with my patch. The attached is what I had code-wise, but I've not changed the regression tests to match ... gotta go fix the docs too I guess. regards, tom lane diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index 4552638..4e6d5b1 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *** *** 30,35 --- 30,36 #include "funcapi.h" #include "miscadmin.h" #include "nodes/nodeFuncs.h" + #include "optimizer/clauses.h" #include "optimizer/planner.h" #include "parser/parse_coerce.h" #include "parser/scansup.h" *** exec_stmt_call(PLpgSQL_execstate *estate *** 2158,2260 plpgsql_create_econtext(estate); } if (SPI_processed == 1) { SPITupleTable *tuptab = SPI_tuptable; /* ! * Construct a dummy target row based on the output arguments of the ! * procedure call. */ if (!stmt->target) { Node *node; - ListCell *lc; FuncExpr *funcexpr; ! int i; ! HeapTuple tuple; Oid *argtypes; char **argnames; char *argmodes; MemoryContext oldcontext; PLpgSQL_row *row; int nfields; /* ! * Get the original CallStmt */ ! node = linitial_node(Query, ((CachedPlanSource *) linitial(plan->plancache_list))->query_list)->utilityStmt; ! if (!IsA(node, CallStmt)) ! elog(ERROR, "returned row from not a CallStmt"); ! funcexpr = castNode(CallStmt, node)->funcexpr; /* ! * Get the argument modes */ ! tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcexpr->funcid)); ! if (!HeapTupleIsValid(tuple)) ! elog(ERROR, "cache lookup failed for function %u", funcexpr->funcid); ! get_func_arg_info(tuple, &argtypes, &argnames, &argmodes); ! ReleaseSysCache(tuple); /* ! * Construct row */ oldcontext = MemoryContextSwitchTo(estate->func->fn_cxt); ! row = palloc0(sizeof(*row)); row->dtype = PLPGSQL_DTYPE_ROW; row->refname = "(unnamed row)"; row->lineno = -1; ! row->varnos = palloc(sizeof(int) * FUNC_MAX_ARGS); nfields = 0; i = 0; ! foreach(lc, funcexpr->args) { Node *n = lfirst(lc); ! if (argmodes && argmodes[i] == PROARGMODE_INOUT) { if (IsA(n, Param)) { ! Param *param = castNode(Param, n); /* paramid is offset by 1 (see make_datum_param()) */ row->varnos[nfields++] = param->paramid - 1; } - else if (IsA(n, NamedArgExpr)) - { - NamedArgExpr *nexpr = castNode(NamedArgExpr, n); - Param *param; - - if (!IsA(nexpr->arg, Param)) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("argument %d is an output argument but is not writable", i + 1))); - - param = castNode(Param, nexpr->arg); - - /* - * Named arguments must be after positional arguments, - * so we can increase nfields. - */ - row->varnos[nexpr->argnumber] = param->paramid - 1; - nfields++; - } else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("argument %d is an output argument but is not writable", i + 1))); } i++; } row->nfields = nfields; - MemoryContextSwitchTo(oldcontext); - stmt->target = (PLpgSQL_variable *) row; } --- 2159,2268 plpgsql_create_econtext(estate); } + /* + * Check result rowcount; if there's one row, assign procedure's output + * values back to the appropriate variables. + */ if (SPI_processed == 1) { SPITupleTable *tuptab = SPI_tuptable; /* ! * If first time through, construct a DTYPE_ROW datum representing the ! * plpgsql variables associated with the procedure's output arguments. ! 
* Then we can use exec_move_row() to do the assignments. */ if (!stmt->target) { Node *node; FuncExpr *funcexpr; ! HeapTuple func_tuple; ! List *funcargs; Oid *argtypes; char **argnames; char *argmodes; MemoryContext oldcontext; PLpgSQL_row *row; int nfields; + ListCell *lc; + int i; /* ! * Get the parsed CallStmt, and look up the called procedure */ ! node = linitial_node(Query, ! ((CachedPlanSource *) linitial(plan->plancache_list))->query_list)->utilityStmt; ! if (node == NULL || !IsA(node, CallStmt)) ! elog(ERROR, "query for CALL statement is not a CallStmt"); ! funcexpr = ((CallStmt *) node)->funcexpr; ! ! func_tuple = SearchSysCache1(PROCOID, ! ObjectIdGetDatum(funce
Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"
ne 4. 11. 2018 v 16:54 odesílatel Tom Lane napsal: > Pavel Stehule writes: > > so 3. 11. 2018 v 22:47 odesílatel Tom Lane napsal: > >> So while looking at that ... isn't the behavior for non-writable output > >> parameters basically insane? It certainly fails to accord with the > >> plpgsql documentation, which shows an example that would throw an error: > >> > >> CREATE PROCEDURE triple(INOUT x int) > >> ... > >> CALL triple(5); > >> > >> It's even weirder that you can get away with not supplying a writable > >> target value for an output argument so long as it has a default. > >> > >> I think the behavior here ought to be "if the actual argument is a > plpgsql > >> variable, assign the output back to it, otherwise do nothing". That's > >> much closer to the behavior of OUT arguments in other old-school > >> programming languages. > > > I don't agree. The constant can be used only for IN parameter. Safe > > languages like Ada does copy result to variable used as INOUT parameter. > > PL/SQL doesn't allow it too. > > Well, my main point is that that ISN'T what our current code does, nor > does your patch. The reason I'm complaining is that after I rewrote the > code to use expand_function_arguments, it started rejecting this existing > regression test case: > > CREATE PROCEDURE test_proc8c(INOUT a int, INOUT b int, INOUT c int DEFAULT > 11) > ... > > CALL test_proc8c(_a, _b); > > I do not think you can argue that that's not broken while simultaneously > claiming that this error check promotes safe coding. > Hard to say what is correct in this case. When we use CALL statement directly from top level, then it is clean. But the semantic from PLpgSQL is difficult. Maybe we can disallow this case when procedure is called from PL/pgSQL. > > I looked into SQL:2011 to see what it has to say about this. In > 10.4 , syntax rule 9) g) iii) says > > For each SQL parameter Pi, 1 (one) ≤ i ≤ SRNP, that is an output SQL > parameter or both an input SQL parameter and an output SQL parameter, > XAi shall be a . > > The immediately preceding rules make it clear that XAi is the actual > argument corresponding to parameter Pi *after* default-insertion and > named-argument reordering. So our existing behavior here clearly > contradicts the spec: DEFAULT is not a . > > I couldn't find any really clear statement of what PL/SQL does > in this situation, but the docs I did find say that the actual > argument for an OUT parameter has to be a variable, with no > exceptions mentioned. > > In short, I think it's a bug that we allow the above. If you > want to keep the must-be-a-variable error then it should apply in > this case too. > I agree. This should be prohibited from PLpgSQL. Regards Pavel > > regards, tom lane >
Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"
Pavel Stehule writes:
> so 3. 11. 2018 v 22:47 odesílatel Tom Lane napsal:
>> So while looking at that ... isn't the behavior for non-writable output
>> parameters basically insane?  It certainly fails to accord with the
>> plpgsql documentation, which shows an example that would throw an error:
>>
>> CREATE PROCEDURE triple(INOUT x int)
>> ...
>> CALL triple(5);
>>
>> It's even weirder that you can get away with not supplying a writable
>> target value for an output argument so long as it has a default.
>>
>> I think the behavior here ought to be "if the actual argument is a plpgsql
>> variable, assign the output back to it, otherwise do nothing".  That's
>> much closer to the behavior of OUT arguments in other old-school
>> programming languages.

> I don't agree. The constant can be used only for an IN parameter. Safe
> languages like Ada copy the result to the variable used as an INOUT
> parameter. PL/SQL doesn't allow it either.

Well, my main point is that that ISN'T what our current code does, nor
does your patch.  The reason I'm complaining is that after I rewrote the
code to use expand_function_arguments, it started rejecting this existing
regression test case:

CREATE PROCEDURE test_proc8c(INOUT a int, INOUT b int, INOUT c int DEFAULT 11)
...
CALL test_proc8c(_a, _b);

I do not think you can argue that that's not broken while simultaneously
claiming that this error check promotes safe coding.

I looked into SQL:2011 to see what it has to say about this.  In
10.4 <routine invocation>, syntax rule 9) g) iii) says

    For each SQL parameter Pi, 1 (one) ≤ i ≤ SRNP, that is an output SQL
    parameter or both an input SQL parameter and an output SQL parameter,
    XAi shall be a <target specification>.

The immediately preceding rules make it clear that XAi is the actual
argument corresponding to parameter Pi *after* default-insertion and
named-argument reordering.  So our existing behavior here clearly
contradicts the spec: DEFAULT is not a <target specification>.

I couldn't find any really clear statement of what PL/SQL does in this
situation, but the docs I did find say that the actual argument for an OUT
parameter has to be a variable, with no exceptions mentioned.

In short, I think it's a bug that we allow the above.  If you want to keep
the must-be-a-variable error then it should apply in this case too.

			regards, tom lane
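To make the two behaviors being debated concrete, here is a minimal sketch
(my illustration, not code from the patch or the thread) based on the
documentation's triple() example; the procedure body is my own guess at the
elided "...", and the last CALL is the case a must-be-a-variable rule rejects:

CREATE PROCEDURE triple(INOUT x int)
LANGUAGE plpgsql
AS $$
BEGIN
    x := x * 3;
END;
$$;

CALL triple(5);        -- top-level SQL: fine, the result row is simply returned

DO $$
DECLARE
    n int := 5;
BEGIN
    CALL triple(n);    -- plpgsql variable: 15 is assigned back to n
    RAISE NOTICE 'n = %', n;
    CALL triple(5);    -- constant argument: rejected under the must-be-a-variable rule
END;
$$;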
Re: Optimizing nested ConvertRowtypeExpr execution
> On Sun, 4 Nov 2018 at 15:48, Andrew Gierth > wrote: > > > "Dmitry" == Dmitry Dolgov <9erthali...@gmail.com> writes: > > Dmitry> This patch went through the last tree commit fests without any > Dmitry> noticeable activity, but cfbot says it still applies and > Dmitry> doesn't break any tests. The patch itself is rather small, and > Dmitry> I could reproduce ~20% of performance improvements while > Dmitry> running the same scripts under pgbench (although not in all > Dmitry> cases), but probably we need to find someone to take over it. > Dmitry> Does anyone wants to do so, maybe Kyotaro? > > I'll deal with it. Thanks!
Re: Optimizing nested ConvertRowtypeExpr execution
> "Dmitry" == Dmitry Dolgov <9erthali...@gmail.com> writes: Dmitry> This patch went through the last tree commit fests without any Dmitry> noticeable activity, but cfbot says it still applies and Dmitry> doesn't break any tests. The patch itself is rather small, and Dmitry> I could reproduce ~20% of performance improvements while Dmitry> running the same scripts under pgbench (although not in all Dmitry> cases), but probably we need to find someone to take over it. Dmitry> Does anyone wants to do so, maybe Kyotaro? I'll deal with it. -- Andrew (irc:RhodiumToad)
tickling the lesser contributor's withering ego
I wouldn't mind if this page: https://www.postgresql.org/community/contributors/ contained a link to (contributors v11): https://www.postgresql.org/docs/11/static/release-11.html#id-1.11.6.5.6 and to (contributors v10) https://www.postgresql.org/docs/current/static/release-11.html#id-1.11.6.5.6 In fact, I think I would like it. I hope you agree. thanks, Erik Rijkers
Re: using index or check in ALTER TABLE SET NOT NULL
> On Sun, 15 Apr 2018 at 09:09, Sergei Kornilov wrote:
>
> Attached updated patch follows recent Reorganize partitioning code commit.
> regards, Sergei

This patch went through the last three commit fests without any noticeable
activity, but cfbot says it still applies and doesn't break any tests. The
patch itself is rather small; the only significant objection I see from the
thread is that a "SET NOT NULL NOT VALID" feature would probably be a more
appropriate way of fixing the original problem, but it looks like no one is
working on that (the only related thread I could find was this one [1]). If
the lack of proper cataloguing of NOT NULL constraints were eventually fixed,
could that conflict with the current patch or not?

[1]: https://www.postgresql.org/message-id/flat/CAASwCXcOokBqHf8BXECbDgOLkXxJyL_Q_twhg2dGWSvgMzz%3DxA%40mail.gmail.com
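For readers new to the thread, a hedged sketch of the optimization under
review (my example, assuming the check-constraint variant of the patch): an
existing, validated CHECK constraint already proves the column holds no
NULLs, so SET NOT NULL can skip its own full-table verification scan.

CREATE TABLE t (a int);
-- ... bulk load ...
ALTER TABLE t ADD CONSTRAINT t_a_not_null CHECK (a IS NOT NULL);
ALTER TABLE t ALTER COLUMN a SET NOT NULL;  -- with the patch: no second scan,
                                            -- the constraint serves as proof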
Re: Optimizing nested ConvertRowtypeExpr execution
> On Mon, 9 Apr 2018 at 14:16, Ashutosh Bapat wrote:
>
> On Mon, Apr 9, 2018 at 5:34 PM, Kyotaro HORIGUCHI wrote:
> >
> > The new code doesn't seem to work as written.
> >
> >>    arg = eval_const_expressions_mutator((Node *) cre->arg,
> >>                                         context);
> >>
> >>    /*
> >>     * In case of a nested ConvertRowtypeExpr, we can convert the
> >>     * leaf row directly to the topmost row format without any
> >>     * intermediate conversions.
> >>     */
> >>    while (arg != NULL && IsA(arg, ConvertRowtypeExpr))
> >>        arg = (Node *) castNode(ConvertRowtypeExpr, arg)->arg;
> >
> > This runs depth-first, so the while loop seems to run at most
> > once. I suppose that the "arg =" assignment and the while loop were
> > meant to be transposed.
>
> Yes. I have modelled it after the RelabelType case a few lines above in
> the same function.

This patch went through the last three commit fests without any noticeable
activity, but cfbot says it still applies and doesn't break any tests. The
patch itself is rather small, and I could reproduce ~20% performance
improvement while running the same scripts under pgbench (although not in
all cases), but we probably need to find someone to take it over. Does
anyone want to do so, maybe Kyotaro?
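For context, a hedged example (mine, not from the thread) of a query shape
where nested ConvertRowtypeExpr conversions can show up: a whole-row
reference on a multi-level partition hierarchy, where each leaf row has to be
converted up through every intermediate level to the root's row type.

CREATE TABLE root (a int, b text) PARTITION BY RANGE (a);
CREATE TABLE mid PARTITION OF root
    FOR VALUES FROM (0) TO (1000) PARTITION BY RANGE (a);
CREATE TABLE leaf PARTITION OF mid FOR VALUES FROM (0) TO (1000);
INSERT INTO leaf SELECT i, 'x' FROM generate_series(1, 100) i;

-- The whole-row reference "root" is built from leaf tuples, so each row is
-- converted leaf -> mid -> root; the patch collapses such chained conversions.
SELECT root FROM root WHERE a < 10;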
Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)
> On Sun, 1 Apr 2018 at 19:58, Yura Sokolov wrote:
>
> I didn't change the serialized format, therefore there is no need to change
> SerializeSnapshot. But the in-memory representation was changed, so
> RestoreSnapshot is changed.

This patch went through the last three commit fests without any noticeable
activity, but cfbot says it still applies and doesn't break any tests. Taking
into account the potential performance improvements, I believe it would be a
pity to stop at this point. Yura, what are your plans for it? If I understand
correctly, there are still some review comments from the last few messages
that were not addressed. At the same time, could any of the active reviewers
(Tomas, Amit) take a look so we can agree on what should be done to push it
forward?
Re: chained transactions
Hello Peter,

Attached is a rebased patch set. No functionality changes.

Patch applies cleanly, compiles, global "make check" ok, doc gen ok.

Shouldn't psql tab completion be updated as well?

About the code: I must admit that I do not much like the three global
variables & Save/Restore functions. I'd suggest saving directly into 3 local
variables in function CommitTransactionCommand, and restoring them when
needed. The code should not be any longer. I would not bother to skip the
saving when not chaining.

Copying & comparing of nodes is updated. Should making, outing and reading
nodes also be updated?

About the tests: I'd suggest using more options in the different tests,
e.g. SERIALIZABLE, READ ONLY… Also ISTM that the tests should show the
transaction_read_only value as well.

-- Fabien.
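To make that testing suggestion concrete, a rough sketch of such a test case
(my example, assuming the SQL-standard AND CHAIN syntax the patch implements;
the chained transaction should keep its characteristics, and showing
transaction_read_only makes that visible):

BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT 1;
COMMIT AND CHAIN;              -- starts a new transaction with the same characteristics
SHOW transaction_isolation;    -- expected: serializable
SHOW transaction_read_only;    -- expected: on
COMMIT;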
Re: pg_dump multi VALUES INSERT
The attached patch adds an additional option for multi-values INSERT
statements, with a default of 100 rows per statement, so that the rows lost
on an error are at most 100 rather than the entire table.

The patch does not seem to apply anymore, could you rebase?

-- Fabien.
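For illustration only (this is not actual pg_dump output, and the option name
is not shown in this message), the proposed format batches rows roughly like
this, so a statement that fails discards at most one batch of 100 rows:

INSERT INTO public.t (id, val) VALUES
    (1, 'a'),
    (2, 'b'),
    /* ... up to 100 rows per statement ... */
    (100, 'dj');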
Re: Speeding up INSERTs and UPDATEs to partitioned tables
On 1 November 2018 at 22:39, Amit Langote wrote:
> On 2018/11/01 10:30, David Rowley wrote:
>> It's great to know the patch is now so perfect that we've only the
>> macro naming left to debate ;-)
>
> I looked over v12 again and noticed a couple minor issues.
>
> + * table then we store the index into parenting
> + * PartitionTupleRouting 'partition_dispatch_info' array. An
>
> s/PartitionTupleRouting/PartitionTupleRouting's/g
>
> Also, I got a bit concerned about "parenting". Does it mean something
> like "enclosing", because the PartitionDispatch is a member of
> PartitionTupleRouting? I got concerned because using "parent" like this
> may be confusing as this is the partitioning code we're talking about,
> where "parent" is generally used to mean "parent" table.
>
> + * the partitioned table that's the target of the command. If we must
> + * route tuple via some sub-partitioned table, then the PartitionDispatch
> + * for those is only built the first time it's required.
>
> ... via some sub-partitioned table"s"
>
> Or perhaps rewrite a bit as:
>
> If we must route the tuple via some sub-partitioned table, then its
> PartitionDispatch is built the first time it's required.

I've attached v13 which hopefully addresses these.

> The macro naming discussion got me thinking today about the macro itself.
> It encapsulates access to the various PartitionTupleRouting arrays
> containing the maps, but maybe we've got the interface of tuple routing a
> bit (maybe a lot given this thread!) wrong to begin with. Instead of
> ExecFindPartition returning indexes into various PartitionTupleRouting
> arrays and its callers then using those indexes to fetch various objects
> from those arrays, why doesn't it return those objects itself? Although
> we made sure that the callers don't need to worry about the meaning of
> these indexes changing with this patch, it still seems a bit odd for them
> to have to go back to those arrays to get various objects.
>
> How about we change ExecFindPartition's interface so that it returns the
> ResultRelInfo, the two maps, and the partition slot? So, the arrays
> simply become a cache for ExecFindPartition et al and are no longer
> accessed outside execPartition.c. Although that makes the interface of
> ExecFindPartition longer, I think it reduces overall complexity.

I don't really think stuffing values into a bunch of output parameters is
much of an improvement. I'd rather allow callers to fetch what they need
using the index we return. Most callers don't need to know about the
child-to-parent maps, so it seems nicer for those places not to have to
invent a dummy variable to pass along to ExecFindPartition() so it can
needlessly populate it for them.

Perhaps a better design would be, instead of having random special
partitioned-table-only fields in ResultRelInfo, to just have an extra struct
there that contains the extra partition information, which would include the
translation maps, and then just return the ResultRelInfo and allow callers to
look up any extra details they require. I've not looked into this in detail,
but I think the committer work that's required for the patch as it is today
is already quite significant. I'm not keen on warding off any willing
committer by making the commit job any harder. I agree that it would be good
to stabilise the API for all this partitioning code sometime, but I don't
believe it needs to be done all in one commit. My intent here is to improve
the performance of INSERT and UPDATE on partitioned tables.
Perhaps we can shape some API redesign later in the release cycle. What do you think? -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services v13-0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch Description: Binary data
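For readers skimming the thread, a hedged illustration (mine, not from the
patch or its tests) of the kind of workload this patch targets: a bulk INSERT
into a partitioned table, where every row must be routed to a leaf partition,
so any per-row or up-front routing setup cost is multiplied by the row count.

CREATE TABLE measurements (ts timestamptz, val numeric) PARTITION BY RANGE (ts);
CREATE TABLE measurements_2018_10 PARTITION OF measurements
    FOR VALUES FROM ('2018-10-01') TO ('2018-11-01');
CREATE TABLE measurements_2018_11 PARTITION OF measurements
    FOR VALUES FROM ('2018-11-01') TO ('2018-12-01');

-- Every row below is routed at execution time to one of the partitions.
INSERT INTO measurements
SELECT timestamptz '2018-10-15' + (i || ' minutes')::interval, random()::numeric
FROM generate_series(0, 50000) AS i;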
Re: WIP: Avoid creation of the free space map for small tables
On 10/31/18, Robert Haas wrote:
> It seems important to me that before anybody thinks
> about committing this, we construct some kind of destruction case
> where repeated scans of the whole table are triggered as frequently as
> possible, and then run that test with varying thresholds. I might be
> totally wrong, but I bet with a value as large as 32 you will be able
> to find cases where it regresses in a big way.

Here's an attempt at a destruction case: Lobotomize the heap insert logic
such that it never checks the cached target block and has to call the free
space logic for every single insertion, like this:

index ff13c03083..5d5b36af29 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -377,7 +377,7 @@ RelationGetBufferForTuple(Relation relation, Size len,
 	else if (bistate && bistate->current_buf != InvalidBuffer)
 		targetBlock = BufferGetBlockNumber(bistate->current_buf);
 	else
-		targetBlock = RelationGetTargetBlock(relation);
+		targetBlock = InvalidBlockNumber;
 
 	if (targetBlock == InvalidBlockNumber && use_fsm)
 	{

(with the threshold patch I had to do additional work)

With the small tuples used in the attached v2 test, this means the free space
logic is called ~225 times per block. The test tables are pre-filled with one
tuple and vacuumed so that the FSMs are already created when testing the
master branch. The patch branch is compiled with a threshold of 8, but
testing inserts of 4 pages will effectively simulate a threshold of 4, etc.
As before, trimmed average of 10 runs, loading to 100 tables each:

 # blocks    master    patch
        2    25.1ms    30.3ms
        4    40.7ms    48.1ms
        6    56.6ms    64.7ms
        8    73.1ms    82.0ms

Without this artificial penalty, the 8 block case was about 50ms for both
branches. So if I calculated right, of that 50 ms, master is spending ~0.10ms
looking for free space, and the patch is spending about ~0.15ms. So, from
that perspective, the difference is trivial. Of course, this is a single
client, so not entirely realistic. I think that shared buffer considerations
are most important for deciding the threshold.

> We also need to think about what happens on the standby, where the FSM
> is updated in a fairly different way.

Were you referring to performance or just functionality? Because the
threshold works on the standby, but I don't know about the performance there.

-John Naylor

fsm-copy-test-v2.sql Description: application/sql
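Not the attached script, just a hedged sketch of the described setup for a
single test table and a 4-block load (schema and exact tuple counts assumed),
to make the methodology easier to follow:

CREATE TABLE fsm_test_1 (a int);
INSERT INTO fsm_test_1 VALUES (1);   -- pre-fill with one tuple
VACUUM fsm_test_1;                   -- so the FSM already exists on master
-- timed step: load roughly 4 blocks of small tuples (~225 tuples per block)
INSERT INTO fsm_test_1 SELECT generate_series(2, 225 * 4);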