Re: pg_dumpall --exclude-database option

2018-11-04 Thread Michael Paquier
On Wed, Oct 31, 2018 at 05:44:26PM +0100, Fabien COELHO wrote:
> Patch v4 applies cleanly, compiles, doc generation ok, global & local tests
> ok.

+# also fails for -r and -t, but it seems pointless to add more tests for those.
+command_fails_like(
+   [ 'pg_dumpall', '--exclude-database=foo', '--globals-only' ],
+   qr/\Qpg_dumpall: option --exclude-database cannot be used together with -g\/--globals-only\E/,
+   'pg_dumpall: option --exclude-database cannot be used together with -g/--globals-only');

Usually testing all combinations is preferred, as well as having one
error message for each pattern, which is also more consistent with all
the other sanity checks in pg_dumpall.c and such.
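For illustration, covering the remaining combinations could look something
like this (an untested sketch for the same TAP file, not the patch's actual
tests):

    foreach my $opt ('-g', '-r', '-t')
    {
        command_fails_like(
            [ 'pg_dumpall', '--exclude-database=foo', $opt ],
            qr/\Qpg_dumpall: option --exclude-database cannot be used together with $opt\E/,
            "pg_dumpall: --exclude-database rejected with $opt");
    }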
--
Michael


signature.asc
Description: PGP signature


Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-11-04 Thread Michael Paquier
On Tue, Sep 25, 2018 at 06:22:19PM +1200, Thomas Munro wrote:
> On Sat, Sep 22, 2018 at 4:52 PM Amit Kapila  wrote:
>> I went through and checked the original proposal [1] to see if any use
>> case is mentioned there, but nothing related has been discussed.  I
>> couldn't think of much use of this facility except maybe for something
>> like parallelizing correlated sub-queries where the size of outer var
>> can change across executions and we might need to resize the initially
>> allocated memory.  This is just a wild thought, I don't have any
>> concrete idea about this.   Having said that, I don't object to
>> removing this especially because the implementation doesn't seem to be
>> complete.  In future, if someone needs such a facility, they can first
>> develop a complete version of this API.
> 
> Thanks for looking into that.  Here's a pair of draft patches to
> disable and then remove dsm_resize() and dsm_map().

Hm.  Don't we need to worry about anybody potentially using these APIs
in a custom module on platforms where it was actually working?  I
imagine that their reaction is not going to be nice if any code breaks
suddenly after a minor release.  No issues with removing the interface
on HEAD of course.
--
Michael


signature.asc
Description: PGP signature


Re: move PartitionBoundInfo creation code

2018-11-04 Thread Michael Paquier
On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
> Done a few moments ago. :)

From the file size this move is actually negative.  From what I can see
partcache decreases to 400 lines, while partbounds increases to 3k
lines.

There are a couple of things that this patch is doing:
1) Move the functions comparing two bounds into partbounds.c.
2) Move the chunk in charge of building PartitionBoundData into
partbounds.c for each method: list, hash and values.

From what I can see, this patch actually adds confusion by doing more than
just building all the PartitionBound structures: it fills in each structure
and then builds a mapping which is used to save each partition OID into the
correct mapping position.  Wouldn't it be better to use a logic without
this mapping, so that the partition OIDs are directly part of
PartitionBound?  It looks wrong to me to have build_partition_boundinfo
create not only partdesc->boundinfo but also partdesc->oids; the new
routine is here to fill in data for the former, not the latter.

The first phase, which builds the bounds, should use a switch/case like
the second phase does.
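
Roughly something like this, just to sketch the shape (not the patch's code):

    switch (key->strategy)
    {
        case PARTITION_STRATEGY_HASH:
            /* collect the modulus/remainder pair of each partition */
            break;
        case PARTITION_STRATEGY_LIST:
            /* collect and de-duplicate the list values of all partitions */
            break;
        case PARTITION_STRATEGY_RANGE:
            /* collect the lower and upper bounds of all partitions */
            break;
        default:
            elog(ERROR, "unexpected partition strategy: %d",
                 (int) key->strategy);
    }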

PartitionHashBound & friends can become structures local to
partbounds.c as they are used only there.

To be more consistent with all the other routines, like
partition_bounds_equal/copy, wouldn't it be better to call the new
routine partition_bounds_build or partition_bounds_create?
--
Michael


signature.asc
Description: PGP signature


Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-11-04 Thread Amit Langote
On 2018/11/04 19:07, David Rowley wrote:
> On 1 November 2018 at 22:39, Amit Langote  
> wrote:
> I've attached v13 which hopefully addresses these.

Thank you for updating the patch.

>> The macro naming discussion got me thinking today about the macro itself.
>> It encapsulates access to the various PartitionTupleRouting arrays
>> containing the maps, but maybe we've got the interface of tuple routing a
>> bit (maybe a lot given this thread!) wrong to begin with.  Instead of
>> ExecFindPartition returning indexes into various PartitionTupleRouting
>> arrays and its callers then using those indexes to fetch various objects
>> from those arrays, why doesn't it return those objects itself?  Although
>> we made sure that the callers don't need to worry about the meaning of
>> these indexes changing with this patch, it still seems a bit odd for them
>> to have to go back to those arrays to get various objects.
>>
>> How about we change ExecFindPartition's interface so that it returns the
>> ResultRelInfo, the two maps, and the partition slot?  So, the arrays
>> simply become a cache for ExecFindPartition et al and are no longer
>> accessed outside execPartition.c.  Although that makes the interface of
>> ExecFindPartition longer, I think it reduces overall complexity.
> 
> I don't really think stuffing values into a bunch of output parameters
> is much of an improvement. I'd rather allow callers to fetch what they
> need using the index we return. Most callers don't need to know about
> the child to parent maps, so it seems nicer for those places not to
> have to invent a dummy variable to pass along to ExecFindPartition()
> so it can needlessly populate it for them.

Well, if a caller finds a partition using ExecFindPartition, it's going to
need to fetch those other objects anyway.  Both of its callers that exist
today, CopyFrom and ExecPrepareTupleRouting, fetch both maps and the slot
in the same code block as ExecFindPartition.

> Perhaps a better design would be to instead of having random special
> partitioned-table-only fields in ResultRelInfo, just have an extra
> struct there that contains the extra partition information which would
> include the translation maps and then just return the ResultRelInfo
> and allow callers to lookup any extra details they require.

IIUC, you're saying that we could introduce a new struct that contains
auxiliary information needed by tuple routing (maps, slot, etc. for
tuple conversion) and add a new ResultRelInfo member of that struct type.
That way, there is no need to return them separately or return an index to
access them from their arrays.  I guess we won't even need the arrays we
have now.  I think that might be a good idea and simplifies things
significantly.

It reminds me of how ResultRelInfo grew a ri_onConflict member of type
OnConflictSetState [1].  We decided to go that way, as opposed to the
earlier approach of having arrays of num_partitions length in
ModifyTableState or PartitionTupleRouting that contained ON CONFLICT
related objects for individual partitions.
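
To sketch what such a struct might look like (the names here are
hypothetical, not taken from any patch):

    /* Hypothetical per-partition routing info, hung off ResultRelInfo */
    typedef struct PartitionRoutingInfo
    {
        /* map to convert tuples from the root parent's rowtype, if needed */
        TupleConversionMap *pi_RootToPartitionMap;
        /* map to convert tuples back to the root parent's rowtype, if needed */
        TupleConversionMap *pi_PartitionToRootMap;
        /* dedicated slot to hold tuples converted to this partition's rowtype */
        TupleTableSlot *pi_PartitionTupleSlot;
    } PartitionRoutingInfo;

ResultRelInfo would then only grow a single member, say
'PartitionRoutingInfo *ri_PartitionInfo', left NULL for relations that are
not tuple routing targets.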

> I've not
> looked into this in detail, but I think the committer work that's
> required for the patch as it is today is already quite significant.
> I'm not keen on warding any willing one off by making the commit job
> any harder.  I agree that it would be good to stabilise the API for
> all this partitioning code sometime, but I don't believe it needs to
> be done all in one commit. My intent here is to improve performance of
> INSERT and UPDATE on partitioned tables. Perhaps we can shape some API
> redesign later in the release cycle. What do you think?

I do suspect that simplifying ExecFindPartition's interface as part of this
patch will make a committer's life easier, as the resulting interface is
simpler, especially if we revise it like you suggest above.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=555ee77a9




Re: "Writing" output lines during make

2018-11-04 Thread Bruce Momjian
On Mon, Nov  5, 2018 at 01:01:47PM +0700, John Naylor wrote:
> On 11/5/18, Bruce Momjian  wrote:
> > During make --silent, I occasionally see these lines:
> >
> > Writing postgres.bki
> > Writing schemapg.h
> > Writing postgres.description
> > Writing postgres.shdescription
> >
> > I can't seem to find where these lines are being output, perhaps from a
> > Perl library.  I would like to suppress them but I don't know how.
> 
> Those should be gone as of pg11. Otherwise, those messages come from
> genbki.pl in backend/catalog.

Oh, I see it now in src/backend/catalog/Catalog.pm:

print "Writing $final_name\n";

I do make in parallel, and couldn't find a pattern for when the lines
appear, so it was hard to debug.  I thought the lines were new but I
guess my --silent usage was somehow new.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: ToDo: show size of partitioned table

2018-11-04 Thread Amit Langote
On 2018/11/04 4:58, Pavel Stehule wrote:
> here is a patch

Thank you, Pavel.

Here are some comments.

I mentioned it during the last review, but maybe you missed it due to the
other discussion.

+the pattern are listed.  If the form \dP+
+is used, a sum of size of related partitions and a description
+are also displayed.

+the pattern are listed.  If the form \dPi+
+is used, a sum of size of related indexes and a description
+are also displayed.

+the pattern are listed.  If the form \dPt+
+is used, a sum of size of related indexes and a description
+are also displayed.

I suggest:

"is used, the sum of sizes of related partitions / index partitions /
table partitions and associated description are also displayed."

Note that I also fixed the typo (indexes -> tables) in the last one.

Also, I wonder if we should mention in the description of \dP+ that the
displayed size considers the sizes of both the tables and indexes on the
individual partitions, because in the code, I see pg_total_relation_size
being used.  So, the text should be something like:

"is used, the sum of size of related partitions (including the table and
indexes, if any) and associated description are also displayed."
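
For reference, the size shown by \dP+ would effectively boil down to a
query of this shape (a rough sketch, not the patch's exact SQL):

    SELECT c.relname AS "Name",
           pg_size_pretty(sum(pg_total_relation_size(p.relid))) AS "Total size",
           obj_description(c.oid, 'pg_class') AS "Description"
      FROM pg_class c,
           LATERAL pg_partition_tree(c.oid) p
     WHERE c.relkind = 'p'
     GROUP BY c.oid, c.relname;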

Code itself looks to me to be in good shape, except you seem to have also
missed Michael's comment upthread:

+/* PostgreSQL 11 has pg_partition_tree function */

/* PostgreSQL 12 has pg_partition_tree function */

Thanks again.

Regards,
Amit




Re: "Writing" output lines during make

2018-11-04 Thread John Naylor
On 11/5/18, Bruce Momjian  wrote:
> During make --silent, I occasionally see these lines:
>
>   Writing postgres.bki
>   Writing schemapg.h
>   Writing postgres.description
>   Writing postgres.shdescription
>
> I can't seem to find where these lines are being output, perhaps from a
> Perl library.  I would like to suppress them but I don't know how.

Those should be gone as of pg11. Otherwise, those messages come from
genbki.pl in backend/catalog.

-John Naylor



Re: "Writing" output lines during make

2018-11-04 Thread Tom Lane
Bruce Momjian  writes:
> During make --silent, I occasionally see these lines:
>   Writing postgres.bki
>   Writing schemapg.h
>   Writing postgres.description
>   Writing postgres.shdescription

> I can't seem to find where these lines are being output, perhaps from a
> Perl library.  I would like to suppress them but I don't know how.

That's gone as of v11 & HEAD, no?

regards, tom lane



"Writing" output lines during make

2018-11-04 Thread Bruce Momjian
During make --silent, I occasionally see these lines:

Writing postgres.bki
Writing schemapg.h
Writing postgres.description
Writing postgres.shdescription

I can't seem to find where these lines are being output, perhaps from a
Perl library.  I would like to suppress them but I don't know how.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: partitioned tables referenced by FKs

2018-11-04 Thread Corey Huinker
On Fri, Nov 2, 2018 at 7:42 PM Alvaro Herrera 
wrote:

> Here's a patch to allow partitioned tables to be referenced by foreign
> keys.  Current state is WIP, but everything should work; see below for
> the expected exception.
>
> The design is very simple: have one pg_constraint row for each partition
> on each side, each row pointing to the topmost table on the other side;
> triggers appear on each leaf partition (and naturally they don't appear
> on any intermediate partitioned table).
>

This is an important and much needed feature!

Based on my extremely naive reading of this code, I have two perhaps
equally naive questions:

1. It seems that we will continue to do per-row RI checks for inserts and
updates. However, there already exists a bulk check in RI_Initial_Check().
Could we modify this bulk check to do RI checks on a per-statement basis
rather than a per-row basis?

2. If #1 is possible, is the overhead of transition tables too great for
the single-row case?
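
(For context, the bulk check in RI_Initial_Check() is essentially a single
set-oriented anti-join; simplified, and with made-up table and column
names, it has this shape:)

    SELECT fk.*
      FROM ONLY fk_table fk
      LEFT JOIN ONLY pk_table pk ON pk.id = fk.pk_id
     WHERE fk.pk_id IS NOT NULL
       AND pk.id IS NULL;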


ON COMMIT actions and inheritance

2018-11-04 Thread Amit Langote
Hi,

Michael pointed out a problem with specifying different ON COMMIT actions
on a temporary inheritance parent and its children:

https://www.postgresql.org/message-id/20181102051804.GV1727%40paquier.xyz

The problem is that when PreCommit_on_commit_actions() executes an ON
COMMIT DROP action on the parent, it will drop its children as well.  It
doesn't however remove the children's own actions, especially ON COMMIT
DELETE ROWS, from the list and when it gets around to executing them, the
children are already gone.  That causes the heap_open in heap_truncate to
fail with an error like this:

ERROR:  XX000: could not open relation with OID 16420
LOCATION:  relation_open, heapam.c:1138

One way to fix that is to remove the tables that no longer exist from the
list that's passed to heap_truncate(), which the attached patch implements.

Thanks,
Amit
From 37c97e6b9879b130a80305fd794a9e721332cb98 Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 2 Nov 2018 15:26:37 +0900
Subject: [PATCH] Check relation still exists before applying ON COMMIT action

---
 src/backend/commands/tablecmds.c   | 17 +
 src/test/regress/expected/temp.out | 23 +++
 src/test/regress/sql/temp.sql  | 15 +++
 3 files changed, 55 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6048334c75..5b17a8679d 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13353,6 +13353,23 @@ PreCommit_on_commit_actions(void)
}
if (oids_to_truncate != NIL)
{
+   List   *tmp = oids_to_truncate;
+
+   /*
+    * Check that relations we're about to truncate still exist.  Some of
+    * them might be inheritance children, which would be gone as a result
+    * of their parent being dropped due to ONCOMMIT_DROP action of its
+    * own.
+    */
+   oids_to_truncate = NIL;
+   foreach(l, tmp)
+   {
+       Oid     relid = lfirst_oid(l);
+
+       if (SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
+           oids_to_truncate = lappend_oid(oids_to_truncate, relid);
+   }
+
heap_truncate(oids_to_truncate);
CommandCounterIncrement();  /* XXX needed? */
}
diff --git a/src/test/regress/expected/temp.out 
b/src/test/regress/expected/temp.out
index a769abe9bb..70be46ca1d 100644
--- a/src/test/regress/expected/temp.out
+++ b/src/test/regress/expected/temp.out
@@ -216,3 +216,26 @@ select * from temp_parted_oncommit;
 (0 rows)
 
 drop table temp_parted_oncommit;
+-- Another case where a child table is gone by the time its ON COMMIT
+-- action is executed due to the ON COMMIT DROP action on its parent.
+-- There are tests for both a partitioned table and regular inheritance.
+begin;
+create temp table temp_parted_oncommit_test (a int) partition by list (a) on commit drop;
+create temp table temp_parted_oncommit_test1 partition of temp_parted_oncommit_test for values in (1) on commit delete rows;
+insert into temp_parted_oncommit_test values (1);
+create temp table temp_inh_oncommit_test (a int) on commit drop;
+create temp table temp_inh_oncommit_test1 () inherits(temp_inh_oncommit_test) on commit delete rows;
+insert into temp_inh_oncommit_test1 values (1);
+commit;
+NOTICE:  drop cascades to table temp_inh_oncommit_test1
+-- zero rows in both cases
+select relname from pg_class where relname like 'temp_parted_oncommit_test%';
+ relname 
+-
+(0 rows)
+
+select relname from pg_class where relname like 'temp_inh_oncommit_test%';
+ relname 
+-
+(0 rows)
+
diff --git a/src/test/regress/sql/temp.sql b/src/test/regress/sql/temp.sql
index 1074c7cfac..ad8c5cb85a 100644
--- a/src/test/regress/sql/temp.sql
+++ b/src/test/regress/sql/temp.sql
@@ -165,3 +165,18 @@ commit;
 -- partitions are emptied by the previous commit
 select * from temp_parted_oncommit;
 drop table temp_parted_oncommit;
+
+-- Another case where a child table is gone by the time its ON COMMIT
+-- action is executed due to the ON COMMIT DROP action on its parent.
+-- There are tests for both a partitioned table and regular inheritance.
+begin;
+create temp table temp_parted_oncommit_test (a int) partition by list (a) on commit drop;
+create temp table temp_parted_oncommit_test1 partition of temp_parted_oncommit_test for values in (1) on commit delete rows;
+insert into temp_parted_oncommit_test values (1);
+create temp table temp_inh_oncommit_test (a int) on commit drop;
+create temp table temp_inh_oncommit_test1 () inherits(temp_inh_oncommit_test) on commit delete rows;
+insert into temp_inh_oncommit_test1 values (1);
+commit;
+-- zero rows in both cases
+select relname from pg_class where relname like 'temp_parted_oncommit_test%';
+select relname from pg_class where relname like 'temp_inh_on

Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT

2018-11-04 Thread Rajkumar Raghuwanshi
On Mon, Nov 5, 2018 at 5:49 AM Michael Paquier  wrote:

> On Fri, Nov 02, 2018 at 04:39:07PM +0900, Amit Langote wrote:
> > Agreed that they're two independent issues, although it wouldn't be such
> a
> > bad idea to fix them in one go, as they're both issues related to the
> > handling of ON COMMIT actions on tables in inheritance trees.
>
> I have pushed 0001 which fixes the bug reported on this thread down to
> v10
>
Thanks Michael, Amit.


Re: Hooks to Modify Execution Flow and Query Planner

2018-11-04 Thread Amit Langote
Hi,

On 2018/11/03 17:28, Vincent Mirian wrote:
> Hi Amit,
> 
> Thank you for your response. Chapters 51, 57 and 59 (Overview of PostgreSQL
> Internals, Writing A Foreign Data Wrapper and Writing A Custom Scan
> Provider) seem to be relevant. Aside from the source code snippets in the
> document, is there functional source code that can be used as reference?

You can look at contrib/postgres_fdw for an example of a quite functional
foreign data wrapper.

For examples of a custom scan provider, you can look at Citus [1],
pg-strom [2].

Citus, for example, implements plan nodes like MultiJoin, Job, etc. that
are not present in vanilla PostgreSQL to implement additional query
execution functionality.

Similarly, you can find custom nodes like GpuJoin in pg-strom.

> Also, aside from this mailing list, is there an interactive medium for
> asking questions?

There are IRC and Slack channels which you can access via the following page:

https://www.postgresql.org/community/

Thanks,
Amit

[1] https://github.com/citusdata/citus
[2] https://github.com/heterodb/pg-strom




Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT

2018-11-04 Thread Amit Langote
On 2018/11/05 9:19, Michael Paquier wrote:
> On Fri, Nov 02, 2018 at 04:39:07PM +0900, Amit Langote wrote:
>> Agreed that they're two independent issues, although it wouldn't be such a
>> bad idea to fix them in one go, as they're both issues related to the
>> handling of ON COMMIT actions on tables in inheritance trees.
> 
> I have pushed 0001 which fixes the bug reported on this thread down to
> v10, after tweaking the patch a bit following more review.  I was testing
> heap_truncate_one_rel() a bit more deeply and it turns out that
> it can work with matviews.  Re-reading the thread and sleeping on it,
> Tom was actually suggesting to move the check one level down to
> heap_truncate_one_rel(), which makes more sense.  So I have
> changed the patch so that a check on RELKIND_PARTITIONED_TABLE is done
> instead of RELKIND_RELATION, which is what had been proposed until now.

Okay, it's good that heap_truncate_one_rel() continues to work for all
relation types that can have storage.  Thanks for making the changes
yourself and committing.

> Regarding the second patch, could you start a second thread?  The scope
> is not only related to partitioned tables but also to inheritance trees
> so this goes way back in time, and I think that we could attract a
> better audience for the problem.  I don't mind starting a thread
> myself, but not without your authorization, as you wrote a patch to deal
> with the problem.  My apologies, I have not looked at what you are proposing
> yet.

Okay, no problem.  I will post that patch in a new thread detailing what
the problem is and how the proposed patch fixes it.

Thanks,
Amit




Re: Vacuum Full does not release the disk size space after delete from table

2018-11-04 Thread Haozhou Wang
Thanks, Tom!
We will check it.

On Fri, Nov 2, 2018 at 10:35 PM Tom Lane  wrote:

> Haozhou Wang  writes:
> > We meet a corner case that related to the behavior of Vacuum Full.
> > ...
> > If we run both sql scripts on same database in parallel, the "VACUUM FULL
> > a;" will not release the disk space.
>
> I think what's happening is that the delete in script 1 happens after the
> "pg_sleep" in script 2 starts.  Then the pg_sleep has an open snapshot
> that could potentially see the deleted rows, so they can't be removed yet.
>
> You could check this theory by changing the vacuum to use VERBOSE, and
> seeing what it says about rows that can't be removed yet.
>
> regards, tom lane
>


-- 
Regards,
Haozhou


Re: zheap: a new storage format for PostgreSQL

2018-11-04 Thread Amit Kapila
On Sat, Nov 3, 2018 at 9:30 AM Amit Kapila  wrote:
> On Fri, Nov 2, 2018 at 6:41 PM Tomas Vondra
>  wrote:
> > I'm sure
> > it's not the only place where we do something like this, and the other
> > places don't trigger the valgrind warning, so how do those places do
> > this? heapam seems to call fetch_att in the end, which essentially calls
> > Int32GetDatum/Int16GetDatum/CharGetDatum, so why not to use the same
> > trick here?
> >
>
> This is because, in zheap, we have omitted all alignment padding for
> pass-by-value types.  See the description in my previous email [1].  I
> think here we need to initialize ret_datum at the beginning of the
> function unless you have some better idea.
>

I have pushed a fix on the above lines in zheap-branch, but I am open
to change it if you have better ideas for the same.
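
For illustration, the shape of the fix is simply to zero the Datum before
copying the unaligned, possibly short value into it (a sketch only, not the
actual zheap code; names are made up and byte-order issues are ignored):

    /* hypothetical helper: fetch a pass-by-value attribute stored without padding */
    static Datum
    fetch_unaligned_byval(const char *src, int attlen)
    {
        Datum       ret_datum = (Datum) 0;  /* initialize all bytes */

        /* only attlen (1, 2 or 4) bytes are copied; the rest stay zero */
        memcpy(&ret_datum, src, attlen);
        return ret_datum;
    }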

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pg_dump multi VALUES INSERT

2018-11-04 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Wed, Oct 17, 2018 at 03:05:28PM -0400, Stephen Frost wrote:
> > The point of it is that it makes loading into other RDBMS faster.  Yes,
> > it has many of the same issues as our COPY does, but we support it
> > because it's much faster.  The same is true here, just for other
> > databases, so I'm +1 on the general idea.
> 
> Well, the patch author has mentioned that he cares about also being able
> to detect errors when processing the dump, which multi inserts make
> easier to check for.  However, even if you specify --data-only you still
> need to worry about the first SET commands ahead, which requires manual
> handling of the dump...

That's hardly a serious complication..

> I am honestly not convinced that it is worth complicating pg_dump for
> that, as there is no guarantee either that the DDLs generated by pg_dump
> will be compatible with what other systems expect.  This kind of
> compatibility for fetch and reload can also be kind of tricky with
> portability issues, so I'd rather let this stuff be handled correctly
> by other tools like pgloader or others rather than expecting to get this
> stuff half-baked within Postgres core tools.

I can see an argument for not wanting to complicate pg_dump, but we've
explicitly stated that the purpose of --inserts is to facilitate
restoring into other database systems and I don't agree that we should
just punt on that entirely.  For better or worse, there's an awful lot
of weight put behind things which are in core and we should take care to
do what we can to make those things better, especially when someone is
proposing a patch to improve the situation.

Sure, the patch might need work or have other issues, but that's an
opportunity for us to provide feedback to the author and encourage them
to improve the patch.

As for the other things that make it difficult to use pg_dump to get a
result out that can be loaded into other database systems- let's try to
improve on that too.  Having an option to skip the 'setup' bits, such as
the SET commands, certainly wouldn't be hard.

I certainly don't see us adding code to pg_dump to handle 'fetch and
reload' into some non-PG system, or, really, even into a PG system, and
that certainly isn't something the patch does, so I don't think it's a
particularly interesting argument.  Users can handle that as needed
themselves.

In other words, none of the arguments put forth really seem to be a
reason to reject the general idea of this patch, so I'm still +1 on
that.  Having just glanced over the patch quickly, I think I would have
done something like '--inserts=100' as the way to enable it instead of
adding a new option though.  Not that I feel very strongly about it.
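
For example, with that interface the usage would look something like this
(hypothetical syntax, only to illustrate the suggestion):

    $ pg_dump --data-only --inserts=100 mydb > mydb_data.sql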

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: pg_dump multi VALUES INSERT

2018-11-04 Thread Michael Paquier
On Wed, Oct 17, 2018 at 03:05:28PM -0400, Stephen Frost wrote:
> The point of it is that it makes loading into other RDBMS faster.  Yes,
> it has many of the same issues as our COPY does, but we support it
> because it's much faster.  The same is true here, just for other
> databases, so I'm +1 on the general idea.

Well, the patch author has mentioned that he cares about also being able
to detect errors when processing the dump, which multi inserts make
easier to check for.  However, even if you specify --data-only you still
need to worry about the first SET commands ahead, which requires manual
handling of the dump...

I am honestly not convinced that it is worth complicating pg_dump for
that, as there is no guarantee either that the DDLs generated by pg_dump
will be compatible with what other systems expect.  This kind of
compatibility for fetch and reload can also be kind of tricky with
portability issues, so I'd rather let this stuff be handled correctly
by other tools like pgloader or others rather than expecting to get this
stuff half-baked within Postgres core tools.
--
Michael


signature.asc
Description: PGP signature


Re: Unused entry in pg_opfamily

2018-11-04 Thread Michael Paquier
On Sun, Nov 04, 2018 at 08:02:25PM -0500, Tom Lane wrote:
> We should, therefore, at least add an oprsanity check for
> opfamilies without opclasses.  I wonder how far it's worth
> going to check for other orphaned catalog entries.

+1.
--
Michael


signature.asc
Description: PGP signature


Unused entry in pg_opfamily

2018-11-04 Thread Tom Lane
I happened to notice that the pg_opfamily entry with OID 4035
(jsonb_ops for GIST) is not referenced anywhere; apparently it
was put in in anticipation of support that never materialized.
We still pass make check-world if it's removed.

Getting rid of it seems like a no-brainer, but it surprised me
a bit that none of our automated crosschecks had noticed this.
It looks like the reason is that all the amvalidate logic is
driven off pg_opclass entries, and so an opfamily that contains
no opclass is never examined.

We should, therefore, at least add an oprsanity check for
opfamilies without opclasses.  I wonder how far it's worth
going to check for other orphaned catalog entries.
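
(The check itself is cheap; something along these lines, sketched here only
to show the idea, would flag such orphans:)

    SELECT f.oid, f.opfname, f.opfmethod
      FROM pg_opfamily f
     WHERE NOT EXISTS (SELECT 1 FROM pg_opclass c WHERE c.opcfamily = f.oid);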

regards, tom lane



Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT

2018-11-04 Thread Michael Paquier
On Fri, Nov 02, 2018 at 04:39:07PM +0900, Amit Langote wrote:
> Agreed that they're two independent issues, although it wouldn't be such a
> bad idea to fix them in one go, as they're both issues related to the
> handling of ON COMMIT actions on tables in inheritance trees.

I have pushed 0001 which fixes the bug reported on this thread down to
v10, after tweaking the patch a bit following more review.  I was testing
heap_truncate_one_rel() a bit more deeply and it turns out that
it can work with matviews.  Re-reading the thread and sleeping on it,
Tom was actually suggesting to move the check one level down to
heap_truncate_one_rel(), which makes more sense.  So I have
changed the patch so that a check on RELKIND_PARTITIONED_TABLE is done
instead of RELKIND_RELATION, which is what had been proposed until now.

Regarding the second patch, could you start a second thread?  The scope
is not only related to partitioned tables but also to inheritance trees
so this goes way back in time, and I think that we could attract a
better audience for the problem.  I don't mind starting a thread
myself, but not without your authorization, as you wrote a patch to deal
with the problem.  My apologies, I have not looked at what you are proposing
yet.
--
Michael


signature.asc
Description: PGP signature


Re: Delta Materialized View Refreshes?

2018-11-04 Thread legrand legrand
denty wrote
> (Seems I can't attach via the web interface, so copy/paste patch below.)
> 
> --
> Sent from:
> http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Sending attachments from this web site (which is not an official postgres
website) has been disabled, as requested by postgres community members
(attachments were stored locally in nabble but not sent).

see
https://www.postgresql-archive.org/postgresql-archive-org-td6035059i20.html#a6035631





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Delta Materialized View Refreshes?

2018-11-04 Thread denty
Hi folks,

I had a crack at this, and it was pretty simple to get something working to
play around with, and it seems like it might be useful.

I developed it against 10.1, as that's what I happened to be working with at
the time. The patch is pretty small, and I hoped it would apply cleanly
against 11. Unfortunately it doesn't, but I doubt the issues are
substantial. If there is interest in moving this forward, I'll update and
re-share.

The patch enables pretty much exactly what Jeremy suggests — something like
"refresh materialized view concurrently testview where type = 'main';" —
with fairly obvious semantics.

Welcome comments on the patch or approach.

denty.

(Seems I can't attach via the web interface, so copy/paste patch below.)





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: Ordered Partitioned Table Scans

2018-11-04 Thread David Rowley
On 1 November 2018 at 22:05, Antonin Houska  wrote:
> I think these conditions are too restrictive:
>
> /*
>  * Determine if these pathkeys match the partition order, or reverse
>  * partition order.  It can't match both, so only go to the trouble of
>  * checking the reverse order when it's not in ascending partition
>  * order.
>  */
> partition_order = pathkeys_contained_in(pathkeys,
> partition_pathkeys);
> partition_order_desc = !partition_order &&
> pathkeys_contained_in(pathkeys,
> 
> partition_pathkeys_desc);
>
>
> In the example above I wanted to show that your new feature should work even
> if "pathkeys" is not contained in "partition_pathkeys".

Okay, after a bit more time looking at this I see what you're saying
now and I agree, but; see below.

>> > Another problem I see is that build_partition_pathkeys() continues even if 
>> > it
>> > fails to create a pathkey for some partitioning column. In the example 
>> > above
>> > it would mean that the table can have "partition_pathkeys" equal to {j}
>> > instead of {i, j} on some EC-related conditions. However such a key does 
>> > not
>> > correspond to reality - this is easier to imagine if another partition is
>> > considered.
>> >
>> > partition 3:
>> >
>> >  i | j | k
>> > ---+---+---
>> >  1 | 0 | 1
>> >  1 | 0 | 0
>> >
>> > So I think no "partition_pathkeys" should be generated in that case. On the
>> > other hand, if the function returned the part of the list it could 
>> > construct
>> > so far, it'd be wrong because such incomplete pathkeys could pass the 
>> > checks I
>> > proposed above for reasons unrelated to the partitioning scheme.
>>
>> Oops. That's a mistake. We should return what we have so far if we
>> can't make one of the pathkeys. Will fix.
>
> I think no "partition_pathkeys" should be created in this case, but before we
> can discuss this in detail there needs to be an agreement on the evaluation of
> the relationship between "pathkeys" and "partition_pathkeys", see above.

The problem with doing that is that if the partition keys are better
than the pathkeys then we'll most likely fail to generate any
partition keys at all due to lack of any existing eclass to use for
the pathkeys. It's unsafe to use just the prefix in this case as the
eclass may not have been found due to, for example one of the
partition keys having a different collation than the required sort
order of the query. In other words, we can't rely on a failure to
create the pathkey meaning that a more strict sort order is not
required.

I'm a bit unsure on how safe it would be to pass "create_it" as true
to make_pathkey_from_sortinfo(). We might be building partition path
keys for some sub-partitioned table. In this case the eclass should
likely have its member added with em_is_child = true.  The existing
code always sets em_is_child to false. It's not that clear to me that
setting up a new eclass with a single em_is_child = true member is
correct.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: date_trunc() in a specific time zone

2018-11-04 Thread David Fetter
On Mon, Oct 29, 2018 at 04:18:23PM +0100, Vik Fearing wrote:
> A use case that I see quite a lot of is needing to do reports and other
> calculations on data per day/hour/etc but in the user's time zone.  The
> way to do that is fairly trivial, but it's not obvious what it does so
> reading queries becomes just a little bit more difficult.
> 
> Attached is a patch to create a function for it, based off 5953c99697.

+1

In a slightly related matter, at some point, we also need to come up
with a timestamptz2 or some such which preserves the input time zone.
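
To illustrate the use case, the new function wraps the double AT TIME ZONE
conversion one writes today (a sketch of the intended usage; the table and
column names are made up):

    -- today: truncate in the user's zone, then convert back
    SELECT date_trunc('day', created_at AT TIME ZONE 'Europe/Paris')
               AT TIME ZONE 'Europe/Paris'
      FROM events;

    -- with the proposed function
    SELECT date_trunc('day', created_at, 'Europe/Paris')
      FROM events;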

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-04 Thread Tom Lane
I wrote:
> BTW, it looks to me like ExecuteCallStmt trashes the passed-in CallStmt
> in cases where expand_function_arguments is not a no-op.  Is that
> really safe?  Seems to me it'd cause problems if, for example, dealing
> with a CallStmt that's part of a prepared stmt or cached plan.

I dug into why this wasn't falling over like mad, and the answer is that
the only reason it doesn't fall over is that plancache.c *always* decides
to build a new custom plan for any cached plan consisting strictly of
utility statements.  That's because cached_plan_cost() ignores utility
statements altogether, causing both the generic and custom plan costs
to be exactly zero, and choose_custom_plan() figures it might as well
break a tie in favor of generating a custom plan.  Therefore, even when
dealing with a cached CALL statement, ExecuteCallStmt always sees a
virgin new parse tree, and the fact that it corrupts it isn't a problem.

If I adjust the plancache logic so that it's capable of deciding to
use a generic cached plan, repeated execution of a CALL statement with
named parameters crashes at the seventh iteration, which is the first
one that would actually try to re-use a cached plan tree.

Nonetheless, make check-world passes, indicating that there is no place
in our regression tests that stresses this type of situation.  (On the
other hand, if I set plan_cache_mode = force_generic_plan, kaboom.)

This leaves me worried that there are probably other places besides
CallStmt that think they can scribble on a utility statement's parse
tree at execution time.  We're clearly not testing that situation
enough.

For the moment, I'm going to go fix ExecuteCallStmt so it doesn't do
this, but I'm wondering how we could detect such problems more
generically.

Also, it's just silly that the plancache won't use a generic plan for
utility statements, so I think we should change that (in HEAD only).
The easiest mod is to adjust cached_plan_cost() so that it charges
some nonzero amount for "planning" of utility statements.
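
For what it's worth, the kind of case being described can be sketched like
this (hypothetical names, not a tested reproducer; with plan_cache_mode =
force_generic_plan the cached CALL parse tree gets reused):

    CREATE PROCEDURE bump(INOUT a int) LANGUAGE plpgsql
    AS $$ BEGIN a := a + 1; END $$;

    SET plan_cache_mode = force_generic_plan;

    DO $$
    DECLARE x int := 0;
    BEGIN
      FOR i IN 1..10 LOOP
        CALL bump(a => x);  -- named-parameter CALL, re-executed from a cached plan
      END LOOP;
    END $$;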

regards, tom lane



Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

2018-11-04 Thread Peter Geoghegan
On Sun, Nov 4, 2018 at 8:21 AM Andrey Lepikhov
 wrote:
> I mean that your code does not have any problems that I can detect by
> regression tests or by the retail index tuple deletion patch.
> The difference in the number of dropped objects is not a problem. It is
> caused by position 2293 - 'else if (thisobj->objectSubId == 0)' - in the
> file catalog/dependency.c, and it is legal behavior: the column row object
> is deleted without any report because we already decided to drop its whole
> table.

The behavior implied by using ASC heap TID order is always "legal",
but it may cause a regression in certain functionality -- something
that an ordinary user might complain about. There were some changes
when DESC heap TID order is used too, of course, but those were safe
to ignore (it seemed like nobody could ever care). It might have been
okay to just use DESC order, but since it now seems like I must use
ASC heap TID order for performance reasons, I have to tackle a couple
of these issues head-on (e.g.  'cannot drop trigger trg1').

> Also, I checked the triggers test. The difference in the ERROR message
> 'cannot drop trigger trg1' is caused by a different order of tuples in the
> relation with the DependDependerIndexId relid. It is legal behavior and
> we can simply update the test results.

Let's look at this specific "trg1" case:

"""
 create table trigpart (a int, b int) partition by range (a);
 create table trigpart1 partition of trigpart for values from (0) to (1000);
 create trigger trg1 after insert on trigpart for each row execute
procedure trigger_nothing();
 ...
 drop trigger trg1 on trigpart1; -- fail
-ERROR:  cannot drop trigger trg1 on table trigpart1 because trigger
trg1 on table trigpart requires it
-HINT:  You can drop trigger trg1 on table trigpart instead.
+ERROR:  cannot drop trigger trg1 on table trigpart1 because table
trigpart1 requires it
+HINT:  You can drop table trigpart1 instead.
"""

The original hint suggests "you need to drop the object on the
partition parent instead of its child", which is useful. The new hint
suggests "instead of dropping the trigger on the partition child,
maybe drop the child itself!". That's almost an insult to the user.

Now, I suppose that I could claim that it's not my responsibility to
fix this, since we get the useful behavior only due to accidental
implementation details. I'm not going to take that position, though. I
think that I am obliged to follow both the letter and the spirit of
the law. I'm almost certain that this regression test was written
because somebody specifically cared about getting the original, useful
message. The underlying assumptions may have been a bit shaky, but we
all know how common it is for software to evolve to depend on
implementation-defined details. We've all written code that does it,
but hopefully it didn't hurt us much because we also wrote regression
tests that exercised the useful behavior.

> May be you know any another problems of the patch?

Just the lack of pg_upgrade support. That is progressing nicely,
though. I'll probably have that part in the next revision of the
patch. I've found what looks like a workable approach, though I need
to work on a testing strategy for pg_upgrade.

-- 
Peter Geoghegan



Re: using index or check in ALTER TABLE SET NOT NULL

2018-11-04 Thread Sergei Kornilov
Hello

> This patch went through the last three commit fests without any noticeable
> activity
Well, the last two CFs. During the March commitfest the patch had activity and was
marked as ready for committer. But by the end of the July CF there was no further
activity and the patch was reverted back to "needs review" with the reason [1]:
> It's not clear that there is consensus that this is wanted.
I can't say anything more here.

> If not properly cataloguing NOT NULL constraints would be fixed, can it
> potentially conflict with the current patch or not?
We are already doing the same thing for "alter table attach partition", and in this
patch I use exactly that routine. If proper cataloguing would conflict with my
patch, it would conflict with the "attach partition" validation too.
I think proper cataloguing can be implemented without conflicting with the proposed
feature.
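
To illustrate the proposed behavior with a small example (hypothetical
table and column names):

    CREATE TABLE t (a int);
    -- ... populate t ...
    ALTER TABLE t ADD CONSTRAINT t_a_not_null CHECK (a IS NOT NULL);
    -- with the patch, the valid CHECK constraint proves there are no NULLs,
    -- so this can skip the full-table verification scan:
    ALTER TABLE t ALTER COLUMN a SET NOT NULL;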

regards, Sergei

[1]: 
https://www.postgresql.org/message-id/c4c7556d-a046-ac29-2549-bdef0078b6fe%402ndQuadrant.com



Re: [PATCH] Improvements to "Getting started" tutorial for Google Code-in task

2018-11-04 Thread Andreas 'ads' Scherbaum

On 04.11.18 02:53, LAM JUN RONG wrote:


Hi,

I have made some changes based on Andreas’ suggestions.

> +    then one or more of the packages PostgreSQL requires is not installed.
> +    See  for the required packages.
>
> How about:
> then packages which are required to build
> PostgreSQL are missing.
> See  for a list of requirements.

This seems better than mine, I have included it.



Ok.



>  If you are not sure whether PostgreSQL
> -    is already available or whether you can use it for your
> -    experimentation then you can install it yourself.  Doing so is not
> +    is already available for your experimentation,
> +    you can install it yourself. Doing so is not
>  hard and it can be a good exercise.
>
> This change does not make much sense, given that you leave the
> second part of the sentence.

How's this:

If you are not sure whether PostgreSQL
is already available for your experimentation,
you can install it yourself, which is not hard.

"you can install it by yourself, which is not complicated"?
I don't like the "hard" there.




>  As is typical of client/server applications, the client and the
> -    server can be on different hosts.  In that case they communicate
> +    server can be on different machines or networks.  In that case they communicate
>  over a TCP/IP network connection.  You should keep this in mind,
>
> I think "hosts" is preferred here. The differentiation between machines
> and networks is not necessary.
>
> -    The path at your site might be different.  Contact your site
> +    The path at your site's server might be different.  Contact your site
>  administrator or check the installation instructions to
>
> Dunno if this change improves the wording, or the understanding.
> How about replacing "site" with "installation"?

For the 2 points above, I decided that the original documentation seems fine.




Ok.



>   PostgreSQL allows you to create any
> -    number of databases at a given site.  Database names must have an
> -    alphabetic first character and are limited to 63 bytes in
> -    length.  A convenient choice is to create a database with the same
> -    name as your current user name. Many tools assume that database
> -    name as the default, so it can save you some typing.  To create
> -    that database, simply type:
> +    number of databases at a given site.  Database names are limited to 63 bytes in
> +    length. Database names longer than 63 bytes will be truncated. A convenient
> +    choice is to create a database with the same name as your current user name.
> +    Many tools assume that database name as the default, so it
> +    can save you some typing. To create that database, simply type:
>
> The part about "truncate" is correct, maybe you can include NAMEDATALEN here as well.
> The part about the first character is correct - except you quote the name.
> Then any name is allowed. If you rewrite this part, maybe include this as well.

I've made some changes to include the part about NAMEDATALEN and quoting:

However, if you would like to create databases with
names that do not start with an alphabetic character, you will need to
quote it like so: "1234".
Database names are limited to 63 bytes in length. Database names longer than 63 bytes will be
truncated. You can change this limit by modifying the NAMEDATALEN variable,
but that would require recompiling PostgreSQL.



you will need to quote the identifier, like "1st database".

Database names are limited to 63 bytes in length, or more specifically
to the length defined in NAMEDATALEN. Changing this value requires
recompiling the database. Names which are longer than that are
truncated.

A convenient choice ...


Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project



Re: wal_dump output on CREATE DATABASE

2018-11-04 Thread Jean-Christophe Arnu
On Fri, Nov 2, 2018 at 08:37, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:

> On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
> > Exemple on CREATE DATABASE (without defining a template database) :
> > rmgr: Databaselen (rec/tot): 42/42, tx:568, lsn:
> > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
> >
> > It comes out (to me) it may be more consistent to use the same template
> > than the other operations in pg_waldump.
> > I propose to swap the parameters positions for the copy dir operation
> > output.
> >
> > You'll find a patch file included which does the switching job :
> > rmgr: Databaselen (rec/tot): 42/42, tx:568, lsn:
> > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384
>
> I see your point.  I suspect this was mainly implemented that way
> because that's how the fields occur in the underlying
> xl_dbase_create_rec structure.  (But that structure also has the target
> location first, so it's not entirely consistent that way either.)  We
> could switch the fields around in the output, but I wonder whether we
> couldn't make the whole thing a bit more readable like this:
>
> desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384
>
> or maybe like this
>
> desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384
>


Thank you Peter for your review and proposal.
I like the last one, which gives the fields' semantics in a concise way.
Pushing the idea a bit farther, we could think of applying that description
to any other operation that involves the ts/db/oid fields. What do you
think?

For example, in nbtdesc.c we could add the "(ts/db/oid)" information to help the
DBA identify the objects:
case XLOG_BTREE_REUSE_PAGE:
    {
        xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;

        appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u; latestRemovedXid %u",
                         xlrec->node.spcNode, xlrec->node.dbNode,
                         xlrec->node.relNode, xlrec->latestRemovedXid);
        break;
    }

This may help DBAs to better identify the objects related to the messages.

Any thought/comments/suggestions ?

~~~ Side story
Well, to be clear, my first proposal was born while I was writing a side
script to textually identify which objects were involved in pg_waldump
operations (if those objects still exist at script run time). I'm
wondering whether it could be interesting to include such a "textual
decoding" into pg_waldump or not (as a command line switch). Anyway, my
script would be available as a proof of concept first. We have time to
discuss that last point in another thread.
~~~

Thank you

>


Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-04 Thread Tom Lane
Pavel Stehule  writes:
> I am not sure how safe it is to read argmodes from the syscache after
> procedure execution. Theoretically, the procedure's pg_proc tuple can be
> modified from the procedure, and the change can be committed from the
> procedure. Isn't it better to save the argmodes before execution?

Hm.  That would mean throwing non-writable-arg errors before rather than
after, but that's probably fine.

regards, tom lane



Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-04 Thread Pavel Stehule
On Sun, Nov 4, 2018 at 17:14, Tom Lane  wrote:

> Pavel Stehule  writes:
> > On Sun, Nov 4, 2018 at 16:54, Tom Lane  wrote:
> >> In short, I think it's a bug that we allow the above.  If you
> >> want to keep the must-be-a-variable error then it should apply in
> >> this case too.
>
> > I agree. This should be prohibited from PLpgSQL.
>
> OK.  In that case I'll run with my patch.  The attached is what I had
> code-wise, but I've not changed the regression tests to match ... gotta
> go fix the docs too I guess.
>

I am not sure how safe it is to read argmodes from the syscache after
procedure execution. Theoretically, the procedure's pg_proc tuple can be
modified from the procedure, and the change can be committed from the
procedure. Isn't it better to save the argmodes before execution?

regards

Pavel

>
> regards, tom lane
>
>


Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

2018-11-04 Thread Andrey Lepikhov




On 04.11.2018 9:31, Peter Geoghegan wrote:

On Sat, Nov 3, 2018 at 8:52 PM Andrey Lepikhov
 wrote:

I applied your patches at top of master. After tests corrections
(related to TID ordering in index relations DROP...CASCADE operation)
'make check-world' passed successfully many times.
In the case of 'create view' regression test - 'drop cascades to 62
other objects' problem - I verify an Álvaro Herrera hypothesis [1] and
it is true. You can verify it by tracking the
object_address_present_add_flags() routine return value.


I'll have to go and fix the problem directly, so that ASC sort order
can be used.


Some doubts remain, however, regarding the 'triggers' test.
Can you specify which test failures you mean?


Not sure what you mean. The order of items that are listed in the
DETAIL for a cascading DROP can have an "implementation defined"
order. I think that this is an example of the more general problem --
what you call the 'drop cascades to 62 other objects' problem is a
more specific subproblem, or, if you prefer, a more specific symptom
of the same problem.


I mean that your code does not have any problems that I can detect by
regression tests or by the retail index tuple deletion patch.
The difference in the number of dropped objects is not a problem. It is
caused by position 2293 - 'else if (thisobj->objectSubId == 0)' - in the
file catalog/dependency.c, and it is legal behavior: the column row object
is deleted without any report because we already decided to drop its whole
table.


Also, I checked the triggers test. The difference in the ERROR message
'cannot drop trigger trg1' is caused by a different order of tuples in the
relation with the DependDependerIndexId relid. It is legal behavior and
we can simply update the test results.


May be you know any another problems of the patch?

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company



Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-04 Thread Tom Lane
Pavel Stehule  writes:
> On Sun, Nov 4, 2018 at 16:54, Tom Lane  wrote:
>> In short, I think it's a bug that we allow the above.  If you
>> want to keep the must-be-a-variable error then it should apply in
>> this case too.

> I agree. This should be prohibited from PLpgSQL.

OK.  In that case I'll run with my patch.  The attached is what I had
code-wise, but I've not changed the regression tests to match ... gotta
go fix the docs too I guess.

regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 4552638..4e6d5b1 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***
*** 30,35 
--- 30,36 
  #include "funcapi.h"
  #include "miscadmin.h"
  #include "nodes/nodeFuncs.h"
+ #include "optimizer/clauses.h"
  #include "optimizer/planner.h"
  #include "parser/parse_coerce.h"
  #include "parser/scansup.h"
*** exec_stmt_call(PLpgSQL_execstate *estate
*** 2158,2260 
  		plpgsql_create_econtext(estate);
  	}
  
  	if (SPI_processed == 1)
  	{
  		SPITupleTable *tuptab = SPI_tuptable;
  
  		/*
! 		 * Construct a dummy target row based on the output arguments of the
! 		 * procedure call.
  		 */
  		if (!stmt->target)
  		{
  			Node	   *node;
- 			ListCell   *lc;
  			FuncExpr   *funcexpr;
! 			int			i;
! 			HeapTuple	tuple;
  			Oid		   *argtypes;
  			char	  **argnames;
  			char	   *argmodes;
  			MemoryContext oldcontext;
  			PLpgSQL_row *row;
  			int			nfields;
  
  			/*
! 			 * Get the original CallStmt
  			 */
! 			node = linitial_node(Query, ((CachedPlanSource *) linitial(plan->plancache_list))->query_list)->utilityStmt;
! 			if (!IsA(node, CallStmt))
! elog(ERROR, "returned row from not a CallStmt");
  
! 			funcexpr = castNode(CallStmt, node)->funcexpr;
  
  			/*
! 			 * Get the argument modes
  			 */
! 			tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcexpr->funcid));
! 			if (!HeapTupleIsValid(tuple))
! elog(ERROR, "cache lookup failed for function %u", funcexpr->funcid);
! 			get_func_arg_info(tuple, &argtypes, &argnames, &argmodes);
! 			ReleaseSysCache(tuple);
  
  			/*
! 			 * Construct row
  			 */
  			oldcontext = MemoryContextSwitchTo(estate->func->fn_cxt);
  
! 			row = palloc0(sizeof(*row));
  			row->dtype = PLPGSQL_DTYPE_ROW;
  			row->refname = "(unnamed row)";
  			row->lineno = -1;
! 			row->varnos = palloc(sizeof(int) * FUNC_MAX_ARGS);
  
  			nfields = 0;
  			i = 0;
! 			foreach(lc, funcexpr->args)
  			{
  Node	   *n = lfirst(lc);
  
! if (argmodes && argmodes[i] == PROARGMODE_INOUT)
  {
  	if (IsA(n, Param))
  	{
! 		Param	   *param = castNode(Param, n);
  
  		/* paramid is offset by 1 (see make_datum_param()) */
  		row->varnos[nfields++] = param->paramid - 1;
  	}
- 	else if (IsA(n, NamedArgExpr))
- 	{
- 		NamedArgExpr *nexpr = castNode(NamedArgExpr, n);
- 		Param	   *param;
- 
- 		if (!IsA(nexpr->arg, Param))
- 			ereport(ERROR,
- 	(errcode(ERRCODE_SYNTAX_ERROR),
- 	 errmsg("argument %d is an output argument but is not writable", i + 1)));
- 
- 		param = castNode(Param, nexpr->arg);
- 
- 		/*
- 		 * Named arguments must be after positional arguments,
- 		 * so we can increase nfields.
- 		 */
- 		row->varnos[nexpr->argnumber] = param->paramid - 1;
- 		nfields++;
- 	}
  	else
  		ereport(ERROR,
  (errcode(ERRCODE_SYNTAX_ERROR),
!  errmsg("argument %d is an output argument but is not writable", i + 1)));
  }
  i++;
  			}
  
  			row->nfields = nfields;
  
- 			MemoryContextSwitchTo(oldcontext);
- 
  			stmt->target = (PLpgSQL_variable *) row;
  		}
  
--- 2159,2268 
  		plpgsql_create_econtext(estate);
  	}
  
+ 	/*
+ 	 * Check result rowcount; if there's one row, assign procedure's output
+ 	 * values back to the appropriate variables.
+ 	 */
  	if (SPI_processed == 1)
  	{
  		SPITupleTable *tuptab = SPI_tuptable;
  
  		/*
! 		 * If first time through, construct a DTYPE_ROW datum representing the
! 		 * plpgsql variables associated with the procedure's output arguments.
! 		 * Then we can use exec_move_row() to do the assignments.
  		 */
  		if (!stmt->target)
  		{
  			Node	   *node;
  			FuncExpr   *funcexpr;
! 			HeapTuple	func_tuple;
! 			List	   *funcargs;
  			Oid		   *argtypes;
  			char	  **argnames;
  			char	   *argmodes;
  			MemoryContext oldcontext;
  			PLpgSQL_row *row;
  			int			nfields;
+ 			ListCell   *lc;
+ 			int			i;
  
  			/*
! 			 * Get the parsed CallStmt, and look up the called procedure
  			 */
! 			node = linitial_node(Query,
!  ((CachedPlanSource *) linitial(plan->plancache_list))->query_list)->utilityStmt;
! 			if (node == NULL || !IsA(node, CallStmt))
! elog(ERROR, "query for CALL statement is not a CallStmt");
  
! 			funcexpr = ((CallStmt *) node)->funcexpr;
! 
! 			func_tuple = SearchSysCache1(PROCOID,
! 										 ObjectIdGetDatum(funcexpr->funcid));

Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-04 Thread Pavel Stehule
On Sun, Nov 4, 2018 at 16:54, Tom Lane  wrote:

> Pavel Stehule  writes:
> > On Sat, Nov 3, 2018 at 22:47, Tom Lane  wrote:
> >> So while looking at that ... isn't the behavior for non-writable output
> >> parameters basically insane?  It certainly fails to accord with the
> >> plpgsql documentation, which shows an example that would throw an error:
> >>
> >> CREATE PROCEDURE triple(INOUT x int)
> >> ...
> >> CALL triple(5);
> >>
> >> It's even weirder that you can get away with not supplying a writable
> >> target value for an output argument so long as it has a default.
> >>
> >> I think the behavior here ought to be "if the actual argument is a
> plpgsql
> >> variable, assign the output back to it, otherwise do nothing".  That's
> >> much closer to the behavior of OUT arguments in other old-school
> >> programming languages.
>
> > I don't agree. The constant can be used only for IN parameter. Safe
> > languages like Ada does copy result to variable used as INOUT parameter.
> > PL/SQL doesn't allow it too.
>
> Well, my main point is that that ISN'T what our current code does, nor
> does your patch.  The reason I'm complaining is that after I rewrote the
> code to use expand_function_arguments, it started rejecting this existing
> regression test case:
>
> CREATE PROCEDURE test_proc8c(INOUT a int, INOUT b int, INOUT c int DEFAULT
> 11)
> ...
>
>   CALL test_proc8c(_a, _b);
>
> I do not think you can argue that that's not broken while simultaneously
> claiming that this error check promotes safe coding.
>

It is hard to say what is correct in this case. When we use a CALL statement
directly from the top level, the situation is clear, but the semantics from
PL/pgSQL are more difficult.

Maybe we can disallow this case when the procedure is called from PL/pgSQL.
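
For illustration, the problematic case is roughly the following (the
procedure body is invented here; only the signature matches Tom's
regression-test example):

CREATE PROCEDURE test_proc8c(INOUT a int, INOUT b int, INOUT c int DEFAULT 11)
LANGUAGE plpgsql
AS $$
BEGIN
  a := a * 10; b := b * 10; c := c * 10;  -- body invented for illustration
END;
$$;

DO $$
DECLARE
  _a int := 1;
  _b int := 2;
BEGIN
  -- c falls back to its DEFAULT, so there is no writable target for its
  -- output value; this is the call that would be rejected.
  CALL test_proc8c(_a, _b);
  RAISE NOTICE '_a = %, _b = %', _a, _b;
END;
$$;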


>
> I looked into SQL:2011 to see what it has to say about this.  In
> 10.4 <routine invocation>, syntax rule 9) g) iii) says
>
> For each SQL parameter Pi, 1 (one) ≤ i ≤ SRNP, that is an output SQL
> parameter or both an input SQL parameter and an output SQL parameter,
> XAi shall be a <target specification>.
>
> The immediately preceding rules make it clear that XAi is the actual
> argument corresponding to parameter Pi *after* default-insertion and
> named-argument reordering.  So our existing behavior here clearly
> contradicts the spec: DEFAULT is not a <target specification>.
>
> I couldn't find any really clear statement of what PL/SQL does
> in this situation, but the docs I did find say that the actual
> argument for an OUT parameter has to be a variable, with no
> exceptions mentioned.
>
> In short, I think it's a bug that we allow the above.  If you
> want to keep the must-be-a-variable error then it should apply in
> this case too.
>

I agree. This should be prohibited in PL/pgSQL.

Regards

Pavel

>
> regards, tom lane
>


Re: bugfix: BUG #15477: Procedure call with named inout refcursor parameter - "invalid input syntax for type boolean"

2018-11-04 Thread Tom Lane
Pavel Stehule  writes:
> On Sat, Nov 3, 2018 at 22:47, Tom Lane  wrote:
>> So while looking at that ... isn't the behavior for non-writable output
>> parameters basically insane?  It certainly fails to accord with the
>> plpgsql documentation, which shows an example that would throw an error:
>> 
>> CREATE PROCEDURE triple(INOUT x int)
>> ...
>> CALL triple(5);
>> 
>> It's even weirder that you can get away with not supplying a writable
>> target value for an output argument so long as it has a default.
>> 
>> I think the behavior here ought to be "if the actual argument is a plpgsql
>> variable, assign the output back to it, otherwise do nothing".  That's
>> much closer to the behavior of OUT arguments in other old-school
>> programming languages.

> I don't agree. The constant can be used only for IN parameter. Safe
> languages like Ada does copy result to variable used as INOUT parameter.
> PL/SQL doesn't allow it too.

Well, my main point is that that ISN'T what our current code does, nor
does your patch.  The reason I'm complaining is that after I rewrote the
code to use expand_function_arguments, it started rejecting this existing
regression test case:

CREATE PROCEDURE test_proc8c(INOUT a int, INOUT b int, INOUT c int DEFAULT 11)
...

  CALL test_proc8c(_a, _b);

I do not think you can argue that that's not broken while simultaneously
claiming that this error check promotes safe coding.

I looked into SQL:2011 to see what it has to say about this.  In
10.4 <routine invocation>, syntax rule 9) g) iii) says

For each SQL parameter Pi, 1 (one) ≤ i ≤ SRNP, that is an output SQL
parameter or both an input SQL parameter and an output SQL parameter,
XAi shall be a <target specification>.

The immediately preceding rules make it clear that XAi is the actual
argument corresponding to parameter Pi *after* default-insertion and
named-argument reordering.  So our existing behavior here clearly
contradicts the spec: DEFAULT is not a <target specification>.

I couldn't find any really clear statement of what PL/SQL does
in this situation, but the docs I did find say that the actual
argument for an OUT parameter has to be a variable, with no
exceptions mentioned.

In short, I think it's a bug that we allow the above.  If you
want to keep the must-be-a-variable error then it should apply in
this case too.

regards, tom lane



Re: Optimizing nested ConvertRowtypeExpr execution

2018-11-04 Thread Dmitry Dolgov
> On Sun, 4 Nov 2018 at 15:48, Andrew Gierth  
> wrote:
>
> > "Dmitry" == Dmitry Dolgov <9erthali...@gmail.com> writes:
>
>  Dmitry> This patch went through the last three commit fests without any
>  Dmitry> noticeable activity, but cfbot says it still applies and
>  Dmitry> doesn't break any tests. The patch itself is rather small, and
>  Dmitry> I could reproduce ~20% of performance improvements while
>  Dmitry> running the same scripts under pgbench (although not in all
>  Dmitry> cases), but probably we need to find someone to take over it.
>  Dmitry> Does anyone want to do so, maybe Kyotaro?
>
> I'll deal with it.

Thanks!



Re: Optimizing nested ConvertRowtypeExpr execution

2018-11-04 Thread Andrew Gierth
> "Dmitry" == Dmitry Dolgov <9erthali...@gmail.com> writes:

 Dmitry> This patch went through the last three commit fests without any
 Dmitry> noticeable activity, but cfbot says it still applies and
 Dmitry> doesn't break any tests. The patch itself is rather small, and
 Dmitry> I could reproduce ~20% of performance improvements while
 Dmitry> running the same scripts under pgbench (although not in all
 Dmitry> cases), but probably we need to find someone to take over it.
 Dmitry> Does anyone want to do so, maybe Kyotaro?

I'll deal with it.

-- 
Andrew (irc:RhodiumToad)



tickling the lesser contributor's withering ego

2018-11-04 Thread Erik Rijkers

I wouldn't mind if this page:
  https://www.postgresql.org/community/contributors/

contained a link to (contributors v11):
  
https://www.postgresql.org/docs/11/static/release-11.html#id-1.11.6.5.6


and to (contributors v10)
  
https://www.postgresql.org/docs/current/static/release-11.html#id-1.11.6.5.6


In fact, I think I would like it.

I hope you agree.


thanks,


Erik Rijkers




Re: using index or check in ALTER TABLE SET NOT NULL

2018-11-04 Thread Dmitry Dolgov
>On Sun, 15 Apr 2018 at 09:09, Sergei Kornilov  wrote:
>
> Attached updated patch follows recent Reorganize partitioning code commit.
> regards, Sergei

This patch went through the last three commit fests without any noticeable
activity, but cfbot says it still applies and doesn't break any tests. The
patch itself is rather small; the only significant objection I see in the
thread is that a "SET NOT NULL NOT VALID" feature might be a more
appropriate way of fixing the original problem, but it looks like no one is
working on that (the only related thread I could find is this one [1]). If
the lack of proper cataloguing of NOT NULL constraints were fixed, could
that conflict with the current patch or not?


[1]: 
https://www.postgresql.org/message-id/flat/CAASwCXcOokBqHf8BXECbDgOLkXxJyL_Q_twhg2dGWSvgMzz%3DxA%40mail.gmail.com
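
For context, the pattern the patch aims to optimise looks roughly like this
(table and column names invented; as I understand the thread, the final SET
NOT NULL can skip the full-table scan when a validated CHECK constraint
already proves the column non-null):

ALTER TABLE big_table
  ADD CONSTRAINT big_table_col_not_null CHECK (col IS NOT NULL) NOT VALID;
ALTER TABLE big_table VALIDATE CONSTRAINT big_table_col_not_null;
ALTER TABLE big_table ALTER COLUMN col SET NOT NULL;  -- no full scan needed
ALTER TABLE big_table DROP CONSTRAINT big_table_col_not_null;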



Re: Optimizing nested ConvertRowtypeExpr execution

2018-11-04 Thread Dmitry Dolgov
> On Mon, 9 Apr 2018 at 14:16, Ashutosh Bapat  
> wrote:
>
> On Mon, Apr 9, 2018 at 5:34 PM, Kyotaro HORIGUCHI
>  wrote:
> >
> > The new code doesn't seem to work as written.
> >
> >>   arg = eval_const_expressions_mutator((Node *) cre->arg,
> >>context);
> >>
> >>   /*
> >>* In case of a nested ConvertRowtypeExpr, we can convert the
> >>* leaf row directly to the topmost row format without any
> >>* intermediate conversions.
> >>*/
> >>   while (arg != NULL && IsA(arg, ConvertRowtypeExpr))
> >>   arg = (Node *) castNode(ConvertRowtypeExpr, arg)->arg;
> >
> > This runs depth-first so the while loop seems to run at most
> > once. I suppose that the "arg =" and the while loop are
> > transposed as intention.
>
> Yes. I have modelled it after RelableType case few lines above in the
> same function.

This patch went through the last three commit fests without any noticeable
activity, but cfbot says it still applies and doesn't break any tests. The
patch itself is rather small, and I could reproduce ~20% of performance
improvements while running the same scripts under pgbench (although not in all
cases), but probably we need to find someone to take over it. Does anyone want
to do so, maybe Kyotaro?
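
For anyone picking this up: the pgbench scripts exercise roughly this kind of
setup (names invented), where a whole-row reference to a multi-level
partitioned table is what produces the nested conversions:

CREATE TABLE root (a int, b text) PARTITION BY RANGE (a);
CREATE TABLE mid PARTITION OF root FOR VALUES FROM (0) TO (100)
  PARTITION BY RANGE (a);
CREATE TABLE leaf PARTITION OF mid FOR VALUES FROM (0) TO (50);
INSERT INTO leaf VALUES (1, 'x');

-- The whole-row reference converts each leaf row to mid's rowtype and then
-- to root's, i.e. nested ConvertRowtypeExpr nodes.
SELECT root FROM root;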



Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)

2018-11-04 Thread Dmitry Dolgov
> On Sun, 1 Apr 2018 at 19:58, Yura Sokolov  wrote:
>
> I didn't change serialized format. Therefore is no need to change
> SerializeSnapshot.
> But in-memory representation were changed, so RestoreSnapshot is changed.

This patch went through the last three commit fests without any noticeable
activity, but cfbot says it still applies and doesn't break any tests. Taking
into account the potential performance improvements, I believe it would be a
pity to stop at this point.

Yura, what are your plans for it? If I understand correctly, there are still
some comments from the last few messages that have not been answered. At the
same time, can any of the active reviewers (Tomas, Amit) look at it and agree
on what should be done to push it forward?



Re: chained transactions

2018-11-04 Thread Fabien COELHO


Hello Peter,


Attached is a rebased patch set.  No functionality changes.


Patch applies cleanly, compile, global make check ok, doc gen ok.

Shouldn't psql tab completion be updated as well?

About the code:

I must admit that I do not much like the three global variables &
Save/Restore functions. I'd suggest saving directly into 3 local variables
in function CommitTransactionCommand, and restoring them when needed. The
code should not be longer. I would not bother to skip the saving when not
chaining.


Copying & comparing nodes are updated. Should making, outing and reading 
nodes also be updated?


About the tests: I'd suggest using more options on the different tests,
eg SERIALIZABLE, READ ONLY… Also ISTM that the tests should show the
transaction_read_only value as well.
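
Something along these lines, for instance (sketch only):

BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;
SHOW transaction_isolation;
SHOW transaction_read_only;
COMMIT AND CHAIN;  -- the chained transaction should keep both characteristics
SHOW transaction_isolation;
SHOW transaction_read_only;
ROLLBACK;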


--
Fabien.

Re: pg_dump multi VALUES INSERT

2018-11-04 Thread Fabien COELHO




The attached patch adds an additional option for multi-values INSERT
statements, with a default of 100 rows per statement, so that the rows lost
on an error are at most 100 rather than the entire table.
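
For instance, instead of one statement per row as with --inserts, the dump
would contain batched statements, hypothetically something like this (3 rows
per statement shown, table name invented):

INSERT INTO t VALUES (1, 'a');
INSERT INTO t VALUES (2, 'b');

versus

INSERT INTO t VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c');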


Patch does not seem to apply anymore, could you rebase?

--
Fabien.



Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-11-04 Thread David Rowley
On 1 November 2018 at 22:39, Amit Langote  wrote:
> On 2018/11/01 10:30, David Rowley wrote:
>> It's great to know the patch is now so perfect that we've only the
>> macro naming left to debate ;-)
>
> I looked over v12 again and noticed a couple minor issues.
>
> + *  table then we store the index into parenting
> + *  PartitionTupleRouting 'partition_dispatch_info' array.  An
>
> s/PartitionTupleRouting/PartitionTupleRouting's/g
>
> Also, I got a bit concerned about "parenting".  Does it mean something
> like "enclosing", because the PartitionDispatch is a member of
> PartitionTupleRouting?  I got concerned because using "parent" like this
> may be confusing as this is the partitioning code we're talking about,
> where "parent" is generally used to mean "parent" table.
>
> + * the partitioned table that's the target of the command.  If we must
> + * route tuple via some sub-partitioned table, then the PartitionDispatch
> + * for those is only built the first time it's required.
>
> ... via some sub-partitioned table"s"
>
> Or perhaps rewrite a bit as:
>
> If we must route the tuple via some sub-partitioned table, then its
> PartitionDispatch is built the first time it's required.

I've attached v13 which hopefully addresses these.

> The macro naming discussion got me thinking today about the macro itself.
> It encapsulates access to the various PartitionTupleRouting arrays
> containing the maps, but maybe we've got the interface of tuple routing a
> bit (maybe a lot given this thread!) wrong to begin with.  Instead of
> ExecFindPartition returning indexes into various PartitionTupleRouting
> arrays and its callers then using those indexes to fetch various objects
> from those arrays, why doesn't it return those objects itself?  Although
> we made sure that the callers don't need to worry about the meaning of
> these indexes changing with this patch, it still seems a bit odd for them
> to have to go back to those arrays to get various objects.
>
> How about we change ExecFindPartition's interface so that it returns the
> ResultRelInfo, the two maps, and the partition slot?  So, the arrays
> simply become a cache for ExecFindPartition et al and are no longer
> accessed outside execPartition.c.  Although that makes the interface of
> ExecFindPartition longer, I think it reduces overall complexity.

I don't really think stuffing values into a bunch of output parameters
is much of an improvement. I'd rather allow callers to fetch what they
need using the index we return. Most callers don't need to know about
the child to parent maps, so it seems nicer for those places not to
have to invent a dummy variable to pass along to ExecFindPartition()
so it can needlessly populate it for them.

Perhaps a better design would be, instead of having assorted
partitioned-table-only fields in ResultRelInfo, to have an extra struct
there that contains the extra partition information, including the
translation maps, and then just return the ResultRelInfo and let callers
look up any extra details they require. I've not looked into this in
detail, but I think the committer work required for the patch as it stands
today is already quite significant, and I'm not keen on warding off any
willing committer by making that job any harder.  I agree that it would be
good to stabilise the API for all this partitioning code sometime, but I
don't believe it needs to be done all in one commit. My intent here is to
improve the performance of INSERT and UPDATE on partitioned tables. Perhaps
we can shape some API redesign later in the release cycle. What do you
think?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v13-0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch
Description: Binary data


Re: WIP: Avoid creation of the free space map for small tables

2018-11-04 Thread John Naylor
On 10/31/18, Robert Haas  wrote:
> It seems important to me that before anybody thinks
> about committing this, we construct some kind of destruction case
> where repeated scans of the whole table are triggered as frequently as
> possible, and then run that test with varying thresholds.  I might be
> totally wrong, but I bet with a value as large as 32 you will be able
> to find cases where it regresses in a big way.

Here's an attempt at a destruction case: Lobotomize the heap insert
logic such that it never checks the cached target block and has to
call the free space logic for every single insertion, like this:

index ff13c03083..5d5b36af29 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -377,7 +377,7 @@ RelationGetBufferForTuple(Relation relation, Size len,
 else if (bistate && bistate->current_buf != InvalidBuffer)
 targetBlock = BufferGetBlockNumber(bistate->current_buf);
 else
-targetBlock = RelationGetTargetBlock(relation);
+targetBlock = InvalidBlockNumber;

 if (targetBlock == InvalidBlockNumber && use_fsm)
 {

(with the threshold patch I had to do additional work)
With the small tuples used in the attached v2 test, this means the
free space logic is called ~225 times per block. The test tables are
pre-filled with one tuple and vacuumed so that the FSMs are already
created when testing the master branch. The patch branch is compiled
with a threshold of 8, but testing inserts of 4 pages will effectively
simulate a threshold of 4, etc. As before, trimmed average of 10 runs,
loading to 100 tables each:

# blocks    master  patch
2   25.1ms  30.3ms
4   40.7ms  48.1ms
6   56.6ms  64.7ms
8   73.1ms  82.0ms

Without this artificial penalty, the 8 block case was about 50ms for
both branches. So if I calculated right, of that 50 ms, master is
spending ~0.10ms looking for free space, and the patch is spending
about ~0.15ms. So, from that perspective, the difference is trivial.
Of course, this is a single client, so not entirely realistic. I think
that shared buffer considerations are most important for deciding the
threshold.
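
In case it helps picture the workload, each timed table gets roughly this
treatment (simplified; the real fsm-copy-test-v2.sql attached here differs
in detail):

CREATE TABLE fsm_t1 (id int, filler text);
INSERT INTO fsm_t1 VALUES (1, 'x');  -- pre-fill with one tuple
VACUUM fsm_t1;                       -- so master already has an FSM
-- timed part: ~225 small tuples fit per block, so ~450 rows make 2 pages,
-- ~900 rows make 4 pages, and so on.
INSERT INTO fsm_t1 SELECT g, 'x' FROM generate_series(2, 451) g;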

> We also need to think about what happens on the standby, where the FSM
> is updated in a fairly different way.

Were you referring to performance or just functionality? Because the
threshold works on the standby, but I don't know about the performance
there.

-John Naylor


fsm-copy-test-v2.sql
Description: application/sql