Re: [HACKERS] INSERT ON CONFLICT and partitioned tables

2017-11-30 Thread Amit Langote
On 2017/12/01 11:01, Amit Langote wrote:
> On 2017/12/01 1:02, Robert Haas wrote:
>> Second, this would be the first place where the second argument to
>> ExecOpenIndices() is passed simply as true.  The only other caller
>> that doesn't pass constant false is in nodeModifyTable.c and looks
>> like this:
>>
>> ExecOpenIndices(resultRelInfo, mtstate->mt_onconflict != ONCONFLICT_NONE);
>>
>> The intention here appears to be to avoid the overhead of doing
>> BuildSpeculativeIndexInfo() when it isn't needed.  I think we should
>> try to do the same thing here.  As written, the patch will do that
>> work even when the query has no ON CONFLICT clause, which doesn't seem
>> like a good idea.
> 
> Agreed.  One thing though is that ExecSetupPartitionTupleRouting doesn't
> receive the mtstate today.  I've a patch pending that will re-design that
> interface (for another project) that I will post in the near future, but
> meanwhile let's just add a new parameter mtstate so that this patch can
> move forward with its business.  The future patch I'm talking about will
> keep the parameter intact, so it should be OK now to add the same.
> 
> Attached two patches:
> 
> 0001: refactoring to add the mtstate parameter
> 0002: the original patch updated to address the above comment

I forgot to consider that mtstate could be NULL in
ExecSetupPartitionTupleRouting(), which would result in dereferencing a
NULL pointer when called from CopyFrom().  That is fixed in the attached
updated patch.
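
For illustration, here is a minimal sketch of the kind of guard involved
(hypothetical names such as leaf_part_rri; this is not the actual hunk from
the attached 0002 patch): the partition's indexes are opened with
speculative-insertion support only when an ON CONFLICT clause is present,
and a NULL mtstate, as passed by CopyFrom(), is tolerated:

    /* In ExecSetupPartitionTupleRouting(), for each leaf partition: */
    ExecOpenIndices(leaf_part_rri,
                    mtstate != NULL &&
                    mtstate->mt_onconflict != ONCONFLICT_NONE);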

Thanks,
Amit
From 986676c8614d2d5954ea5d505cf729dc6dc487bd Mon Sep 17 00:00:00 2001
From: amit 
Date: Fri, 1 Dec 2017 10:44:37 +0900
Subject: [PATCH 1/2] Pass mtstate to ExecSetupPartitionTupleRouting

---
 src/backend/commands/copy.c            | 3 ++-
 src/backend/executor/execPartition.c   | 3 ++-
 src/backend/executor/nodeModifyTable.c | 3 ++-
 src/include/executor/execPartition.h   | 3 ++-
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 13eb9e34ba..bace390470 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2478,7 +2478,8 @@ CopyFrom(CopyState cstate)
        int             num_parted,
                        num_partitions;
 
-       ExecSetupPartitionTupleRouting(cstate->rel,
+       ExecSetupPartitionTupleRouting(NULL,
+                                      cstate->rel,
                                       1,
                                       estate,
                                      &partition_dispatch_info,
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 59a0ca4597..f6108a156b 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -63,7 +63,8 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
  * RowExclusiveLock mode upon return from this function.
  */
 void
-ExecSetupPartitionTupleRouting(Relation rel,
+ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
+                              Relation rel,
                               Index resultRTindex,
                               EState *estate,
                               PartitionDispatch **pd,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1e3ece9b34..afb83ed3ae 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1953,7 +1953,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
        int             num_parted,
                        num_partitions;
 
-       ExecSetupPartitionTupleRouting(rel,
+       ExecSetupPartitionTupleRouting(mtstate,
+                                      rel,
                                       node->nominalRelation,
                                       estate,
                                      &partition_dispatch_info,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 43ca9908aa..86a199d169 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -49,7 +49,8 @@ typedef struct PartitionDispatchData
 
 typedef struct PartitionDispatchData *PartitionDispatch;
 
-extern void ExecSetupPartitionTupleRouting(Relation rel,
+extern void ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
+                                          Relation rel,

Re: Commit fest 2017-11

2017-11-30 Thread Fabien COELHO


Hello Michaël,


And the last 21 patches have been classified as well. Here is the
final score for this time:
Committed: 55.
Moved to next CF: 103.
Rejected: 1.
Returned with Feedback: 47.
Total: 206.

Thanks to all the contributors for this session! The CF is now closed.


Thanks for the CF management.

Attached is a small graph of the status of patches at the end of each CF.

There is a significant "moved to next CF" set of patches, which includes:

 - patches that were ready but did not get a committer,
   (about one per committer...)

 - patches that were still waiting for a review (a lot...)

 - possibly a few that were in "waiting on author" state, although
   they tend to be switched to "returned with feedback".

There is a rolling chunk of about 50% of patches that moves from CF to CF.

Not enough review(er)s, obviously.

Also, not enough committers: as a reviewer, if the patches I review do not 
get committed once ready, spending time on reviews becomes quite 
unattractive.


--
Fabien.

Re: [HACKERS] pgbench - allow to store select results into variables

2017-11-30 Thread Fabien COELHO


Hello Michaël,


Here is a v14, after yet another rebase, and some comments added to answer
your new comments.


The last patch sent still applies, but its status of "ready for
committer" does not look adapted as this version got no reviews. So I
am switching back the patch as "needs review". Feel free to change if
that's not adapted,


Indeed, I think that is not needed, as the edits were entirely mundane 
(comments lightly edited, one const added after a wave of const changes got 
committed, and patch context diff changes, probably from a pgindent run that 
triggered the need for a rebase), which is why I did not update the status 
for this version.


So I put it back to "ready".


The patch is moved as well to next CF.


--
Fabien.

Re: Commit fest 2017-11

2017-11-30 Thread Michael Paquier
On Thu, Nov 30, 2017 at 1:51 PM, Michael Paquier
 wrote:
> Remains 22 patches as of now, exactly *one* for each committer.

And the last 21 patches have been classified as well. Here is the
final score for this time:
Committed: 55.
Moved to next CF: 103.
Rejected: 1.
Returned with Feedback: 47.
Total: 206.

Thanks to all the contributors for this session! The CF is now closed.

(I know that I am a couple of hours ahead, but doing things now fits
better with my schedule.)
-- 
Michael



Re: [HACKERS] pg_basebackup --progress output for batch execution

2017-11-30 Thread Michael Paquier
On Tue, Nov 21, 2017 at 8:24 PM, Martín Marqués  wrote:
> Thank you very much for reviewing the patch and for your valuable
> input (you made me read the Microsoft Visual C specs ;))

+   if (!isatty(fileno(stderr)))
+   fprintf(stderr, "\n");
+   else
+   fprintf(stderr, "\r");
Er, why is that not "\r\n"?

I have moved this patch to next CF. Congrats for being the last one.
-- 
Michael



Re: [HACKERS] [PATCH] Improve geometric types

2017-11-30 Thread Michael Paquier
On Thu, Nov 30, 2017 at 3:28 AM, Emre Hasegeli  wrote:
> It would at least be a dump-and-restore hazard if we don't let them in.
> The new version allows NaNs.
>
>> This gives a wrong result for NaN-containing objects.
>
> I removed the NaN aware comparisons from FP macros, and carefully
> reviewed the places that need to be NaN aware.
>
> I am sorry that it took so long for me to post the new versions.  The
> more I get into this the more problems I find.  The new versions
> include non-trivial changes.  I would be glad if you can look into
> them.

I have moved this patch to next CF, with "needs review" as status as a
new version has been posted.
-- 
Michael



Re: [HACKERS] postgres_fdw super user checks

2017-11-30 Thread Michael Paquier
On Thu, Nov 30, 2017 at 12:15 PM, Ashutosh Bapat
 wrote:
> On Wed, Nov 29, 2017 at 7:42 PM, Stephen Frost  wrote:
>> Ashutosh,
>>
>> * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote:
>>> On Wed, Nov 29, 2017 at 4:56 AM, Stephen Frost  wrote:
>>> > The "global rethink" being contemplated seems to be more about
>>> > authentication forwarding than it is about this specific change.  If
>>> > there's some 'global rethink' which is actually applicable to this
>>> > specific deviation from the usual "use the view's owner for privilege
>>> > checks", then it's unclear to me what that is.
>>>
>>> Global rethink may constitute other authentication methods like
>>> certificate based authentication. But I am not clear about global
>>> rethink in the context of owner privileges problem being discussed
>>> here.
>>
>> Right, I'm all for an independent discussion about how we can have
>> same-cluster or cross-cluster trust relationships set up to make it
>> easier for users in one cluster/database to access tables in another
>> that they should be allowed to, but that's a different topic from this.
>>
>> In other words, I think we're agreeing here. :)
>
> Yes.

I am moving this patch to next CF 2018-01.
-- 
Michael



Re: Transform for pl/perl

2017-11-30 Thread Michael Paquier
On Tue, Nov 28, 2017 at 5:14 PM, Aleksander Alekseev
 wrote:
> The new status of this patch is: Ready for Committer

Patch moved to CF 2018-01. Perhaps a committer will look at it at some point.
-- 
Michael



Re: [HACKERS] [PATCH]make pg_rewind to not copy useless WAL files

2017-11-30 Thread Michael Paquier
On Tue, Oct 3, 2017 at 1:20 AM, chenhj  wrote:
> I had filled the authors field of this patch in commitfest, and will rebase
> this patch if needed. Thank you for your help!

The documentation of the patch needs a rebase, so I am moving it to
next CF with "waiting on author" as status.

$ git diff master --check
src/bin/pg_rewind/pg_rewind.c:292: trailing whitespace.
+*
There are whitespace complaints.
-- 
Michael



Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

2017-11-30 Thread Michael Paquier
On Tue, Nov 28, 2017 at 4:10 AM, Jesper Pedersen
 wrote:
> On 11/27/2017 07:41 AM, Юрий Соколов wrote:
>> Oh... there were stupid error in previos file.
>> Attached fixed version.
>
> I can reconfirm my performance findings with this patch; system same as
> up-thread.

I have moved this patch to next CF, with "needs review" as status as a
new patch has been sent after addressing Andres' comments.
-- 
Michael



Re: [HACKERS] GUC for cleanup indexes threshold.

2017-11-30 Thread Michael Paquier
On Thu, Nov 30, 2017 at 12:06 AM, Masahiko Sawada  wrote:
> On Wed, Nov 29, 2017 at 11:05 PM, Simon Riggs  wrote:
>> Unless there is disagreement on the above, it seems we should apply
>> Yura's patch (an edited version, perhaps).
>>
>
> IIRC the patches that make the cleanup scan skippable have a problem
> pointed out by Peter[1], namely that we stash an XID when a btree page is
> deleted, which is used to determine when it's finally safe to recycle
> the page. Doesn't Yura's patch have that problem?
>
> [1] 
> https://www.postgresql.org/message-id/CAH2-Wz%3D1%3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com

The latest patch present on this thread does not apply
(https://www.postgresql.org/message-id/c4a47caef28024ab7528946bed264...@postgrespro.ru).
Please send a rebase. For now I am moving the patch to next CF, with
"waiting on author".
-- 
Michael



Re: Use of uninitialized variables in ExecFindPartition() for parent partition without leaves (HEAD only)

2017-11-30 Thread Amit Langote
On 2017/12/01 11:48, Michael Paquier wrote:
> On Thu, Nov 30, 2017 at 10:17 AM, Amit Langote
>  wrote:
>> Oops, I messed up taking the diff and mistakenly added noise to the patch.
> 
> Which is that bit:
> - * BuildSlotPartitionKeyDescription
> + * ExecBuildSlotPartitionKeyDescription

Yep.

>>  Fixed in the attached.
> 
> For information, it is easy enough to run into rather nasty behaviors
> here. Taking for example the test case in the last patch, but without
> the actual fix:
>   -- no partitions, so fail
>   insert into range_parted values ('a', 11);
> ! ERROR:  invalid memory alloc request size 2139062147

Yikes.

> So the test case you are adding is a good thing to have, and I am fine
> with the comment.

Thanks.

> I have added a CF entry for this thread by the way
> (https://commitfest.postgresql.org/16/1392/), and marked the thing as
> ready for committer as we agree about the fix. Let's track properly
> this issue until it gets committed.

Yeah, thanks.

Regards,
Amit




Re: [HACKERS] INSERT ON CONFLICT and partitioned tables

2017-11-30 Thread Amit Langote
On 2017/12/01 11:27, Simon Riggs wrote:
> On 24 November 2017 at 13:45, Amit Langote
>  wrote:
> 
>>> Why? There is no caller that needs information.
>>
>> It is to be used if and when ExecInsert() calls
>> ExecCheckIndexConstraints() in the code path to handle ON CONFLICT DO
>> NOTHING that we're intending to support in some cases.  Note that it will
>> only check conflicts for the individual leaf partitions using whatever
>> constraint-enforcing indexes they might have.
> 
> So we should have 2 patches. One for now that does DO NOTHING and
> another that adds the change that depends upon Alvaro's work.

Yes, I'd think so.

It won't be until the patch at [1] to add support to define UNIQUE indexes
on partitioned tables is committed that we could support specifying
conflict_target on partitioned tables.  Even then, it would take at least
some executor changes to actually make it work, but at least the planner
won't complain that there are no indexes.  If I try Alvaro's patch today
and see what happens when conflict_target is specified, I still get an
error, but now it's from the executor:

create table p (a int, b char) partition by list (a);
create table p1 partition of p (b unique) for values in (1);

-- on HEAD (before applying the patch on this thread)
insert into p values (1) on conflict (a) do nothing;
ERROR:  ON CONFLICT clause is not supported with partitioned tables

-- after applying the patch on this thread (no indexes yet)
insert into p values (1) on conflict (a) do nothing;
ERROR:  there is no unique or exclusion constraint matching the ON
CONFLICT specification

-- after applying Alvaro's patch at [1]
create unique index on p (a);

-- but, the executor support is missing, so...
insert into p values (1) on conflict (a) do nothing;
ERROR:  unexpected failure to find arbiter index

I will report this on that thread and we can discuss the executor changes
that would be needed to make it work there.

Thanks,
Amit

[1] https://commitfest.postgresql.org/16/1365/




Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-30 Thread David Rowley
On 1 December 2017 at 01:02, Alvaro Herrera  wrote:
> we currently (I mean after my
> patch) don't mark partitioned-table-level indexes as valid or not valid
> depending on whether all its children exist, so trying to use that in
> the planner without having a flag could cause invalid plans to be
> generated (i.e. ones that would cause nonexistent indexes to be
> referenced).

So, then this patch is only really intended as a syntax shortcut for
mass adding of indexes to each partition?

I feel like we could do better here with little extra effort. The
DETACH index feature does not really seem required for this patch. I
think just ensuring a matching index exists on each leaf partition and
creating any which don't exist before creating the index on the target
partitioned table seems like the correct solution. That way we can
make the index's indisvalid flag have a valid meaning all the time.
Some later patch can invent some way to replace a bloated index.

Perhaps later we can invent some generic way to replace a physical
leaf index for a given partitioned index, possibly in the same patch
that allows replacing an index used by a constraint, which to me
seems like a feature we should have had years ago.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Doc tweak for huge_pages?

2017-11-30 Thread Thomas Munro
On Fri, Dec 1, 2017 at 5:04 PM, Justin Pryzby  wrote:
> On Fri, Dec 01, 2017 at 04:01:24PM +1300, Thomas Munro wrote:
>> Hi hackers,
>>
>> The manual implies that only Linux can use huge pages.  That is not
>> true: FreeBSD, Illumos and probably others support larger page sizes
>> using transparent page coalescing algorithms.  On my FreeBSD box
>> procstat -v often shows PostgreSQL shared buffers in "S"-flagged
>> memory.  I think we should adjust the manual to make clear that it's
>> the *explicit request for huge pages* that is supported only on Linux
>> (and hopefully soon Windows).  Am I being too pedantic?
>
> I suggest to remove "other" and include Linux in the enumeration, since it 
> also
> supports "transparent" hugepages.

Hmm.  Yeah, it does, but apparently it's not so transparent.  So if we
mention that, we'd better indicate in the same paragraph that you
probably don't actually want to use it.  How about the attached?

-- 
Thomas Munro
http://www.enterprisedb.com


huge-pages-doc-tweak-v2.patch
Description: Binary data


Re: Protect syscache from bloating with negative cache entries

2017-11-30 Thread Tom Lane
Robert Haas  writes:
> On Wed, Nov 29, 2017 at 11:17 PM, Tom Lane  wrote:
>> The thing that makes me uncomfortable about this is that we used to have a
>> catcache size limitation mechanism, and ripped it out because it had too
>> much overhead (see commit 8b9bc234a).  I'm not sure how we can avoid that
>> problem within a fresh implementation.

> At the risk of beating a dead horse, I still think that the amount of
> wall clock time that has elapsed since an entry was last accessed is
> very relevant.

While I don't object to that statement, I'm not sure how it helps us
here.  If we couldn't afford DLMoveToFront(), doing a gettimeofday()
during each syscache access is surely right out.

regards, tom lane



Re: [HACKERS] Additional logging for VACUUM and ANALYZE

2017-11-30 Thread Michael Paquier
On Wed, Nov 8, 2017 at 12:54 AM, Fabrízio de Royes Mello
 wrote:
> Great. Changed status to ready for commiter.

The patch still applies, but no committer seems to be interested in the
topic, so moved to next CF.
-- 
Michael



Re: [HACKERS] Supporting huge pages on Windows

2017-11-30 Thread Michael Paquier
On Fri, Nov 24, 2017 at 9:28 AM, Tsunakawa, Takayuki
 wrote:
> From: Thomas Munro [mailto:thomas.mu...@enterprisedb.com]
>> I hope Tsunakawa-san doesn't mind me posting another rebased version of
>> his patch.  The last version conflicted with the change from SGML to XML
>> that just landed in commit 3c49c6fa.
>
> Thank you very much for keeping it fresh.  I hope the committer could have 
> some time.

I have moved this patch to next CF. Magnus, you are registered as a
reviewer of this patch. Are you planning to look at it and potentially
commit it?
-- 
Michael



Doc tweak for huge_pages?

2017-11-30 Thread Thomas Munro
Hi hackers,

The manual implies that only Linux can use huge pages.  That is not
true: FreeBSD, Illumos and probably others support larger page sizes
using transparent page coalescing algorithms.  On my FreeBSD box
procstat -v often shows PostgreSQL shared buffers in "S"-flagged
memory.  I think we should adjust the manual to make clear that it's
the *explicit request for huge pages* that is supported only on Linux
(and hopefully soon Windows).  Am I being too pedantic?

-- 
Thomas Munro
http://www.enterprisedb.com


huge-pages-doc-tweak.patch
Description: Binary data


Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump

2017-11-30 Thread Michael Paquier
On Wed, Nov 29, 2017 at 5:50 AM, Robert Haas  wrote:
> On Mon, Aug 7, 2017 at 11:14 AM, Fabrízio de Royes Mello
>  wrote:
>> I also test against all current supported versions (9.2 ... 9.6) and didn't
>> find any issue.
>>
>> Changed status to "ready for commiter".
>
> On a very fast read this patch looks OK to me, but I'm a bit concerned
> about whether we have consensus for it.  By my count the vote is 6-3
> in favor of proceeding.
>
> +1: Robins Tharakan, Stephen Frost, David Fetter, Fabrizio Mello,
> Michael Paquier, Robert Haas
> -1: David G. Johnston, Tom Lane, Simon Riggs
>
> I guess that's probably sufficient to go forward, but does anyone wish
> to dispute my characterization of their vote or the vote tally in
> general?  Would anyone else like to cast a vote?

As this is not settled yet, I have moved the patch to next CF.
-- 
Michael



Re: [HACKERS] pgbench more operators & functions

2017-11-30 Thread Michael Paquier
On Wed, Oct 25, 2017 at 5:57 PM, Pavel Stehule  wrote:
> 2017-10-20 18:36 GMT+02:00 Fabien COELHO :
> Here is a v13. No code changes, but TAP tests added to maintain pgbench
> coverage to green.
>>
>>
>> Here is a v14, which is just a rebase after the documentation xml-ization.
>
> all tests passed
> no problems with doc building

Moved to next CF. This is still parked as ready for committer.
-- 
Michael



Re: [HACKERS] pgbench - allow to store select results into variables

2017-11-30 Thread Michael Paquier
On Sat, Nov 4, 2017 at 8:05 PM, Fabien COELHO  wrote:
>>> Here is a v13, which is just a rebase after the documentation
>>> xml-ization.
>
> Here is a v14, after yet another rebase, and some comments added to answer
> your new comments.

The last patch sent still applies, but its status of "ready for
committer" does not look adapted as this version got no reviews. So I
am switching back the patch as "needs review". Feel free to change if
that's not adapted. The patch is moved as well to next CF.
-- 
Michael



Re: [HACKERS] pow support for pgbench

2017-11-30 Thread Michael Paquier
On Tue, Nov 7, 2017 at 1:34 AM, Fabien COELHO  wrote:
> Let's now hope that a committer gets around to considering these patches some
> day.

Which is not the case yet, so moved to CF 2018-01. Please note that
the patch proposed does not apply anymore, so its status is changed to
"waiting on author" for a rebase.
-- 
Michael



Re: [HACKERS] Assertion failure when the non-exclusive pg_stop_backup aborted.

2017-11-30 Thread Michael Paquier
On Wed, Nov 29, 2017 at 7:36 AM, Michael Paquier
 wrote:
> On Wed, Nov 29, 2017 at 6:44 AM, Robert Haas  wrote:
>> On Tue, Nov 21, 2017 at 4:32 AM, Masahiko Sawada  
>> wrote:
>>> Thank you for comments. Attached updated patch.
>>
>> I see that Michael has marked this Ready for Committer, but also that
>> he didn't update the thread, so perhaps some interested committer
>> (Fujii Masao?) might have missed the fact that Michael thinks it's
>> ready to go.
>
> Sorry for not mentioning that directly on the thread.
>
>> Anyone interested in taking a look at this one for commit?
>
> I would assume that Fujii-san would be the chosen one here as he has
> worked already on the thread.

No replies yet. So I am moving this patch to next CF.
-- 
Michael



Re: [HACKERS] Commits don't block for synchronous replication

2017-11-30 Thread Michael Paquier
On Fri, Nov 24, 2017 at 11:38 PM, Simon Riggs  wrote:
> On 23 November 2017 at 11:11, Michael Paquier  
> wrote:
>
>> This is older than the bug report of this thread. All those
>> indications point out that the patch has *not* been committed. So it
>> seems to me that you perhaps committed it to your local repository,
>> but forgot to push it to the remote. I am switching back the patch
>> status to what looks correct to me "Ready for committer". Thanks.
>
> Yes, that looks like it's my mistake. Thanks for rechecking.
>
> Will commit and backpatch when I get home.

Ping. Simon, are you planning to look at this patch, and potentially commit it?

I am moving this item to next CF for now.
-- 
Michael



Re: Use of uninitialized variables in ExecFindPartition() for parent partition without leaves (HEAD only)

2017-11-30 Thread Michael Paquier
On Thu, Nov 30, 2017 at 10:17 AM, Amit Langote
 wrote:
> Oops, I messed up taking the diff and mistakenly added noise to the patch.

Which is that bit:
- * BuildSlotPartitionKeyDescription
+ * ExecBuildSlotPartitionKeyDescription

>  Fixed in the attached.

For information, it is easy enough to run into rather nasty behaviors
here. Taking for example the test case in the last patch, but without
the actual fix:
  -- no partitions, so fail
  insert into range_parted values ('a', 11);
! ERROR:  invalid memory alloc request size 2139062147
So the test case you are adding is a good thing to have, and I am fine
with the comment.

I have added a CF entry for this thread by the way
(https://commitfest.postgresql.org/16/1392/), and marked the thing as
ready for committer as we agree about the fix. Let's track properly
this issue until it gets committed.
-- 
Michael



Re: [HACKERS] INSERT ON CONFLICT and partitioned tables

2017-11-30 Thread Simon Riggs
On 24 November 2017 at 13:45, Amit Langote
 wrote:

>> Why? There is no caller that needs information.
>
> It is to be used if and when ExecInsert() calls
> ExecCheckIndexConstraints() in the code path to handle ON CONFLICT DO
> NOTHING that we're intending to support in some cases.  Note that it will
> only check conflicts for the individual leaf partitions using whatever
> constraint-enforcing indexes they might have.

So we should have 2 patches. One for now that does DO NOTHING and
another that adds the change that depends upon Alvaro's work.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256

2017-11-30 Thread Michael Paquier
On Wed, Nov 29, 2017 at 7:42 AM, Peter Eisentraut
 wrote:
> On 11/28/17 17:33, Michael Paquier wrote:
>> 1) Have a special value in the parameter saslchannelbinding proposed
>> in patch 0001. For example by specifying "none" then no channel
>> binding is used.
>
> I was thinking if it's empty then don't use channel binding.  Right now,
> empty means the same thing as tls-unique.  In any case, some variant of
> that should be fine.  I don't think we need a separate server option
> at this point.

OK, here is a reworked version with the following changes:
- renamed saslchannelbinding to scramchannelbinding, with a default
set to tls-unique.
- An empty value of scramchannelbinding allows the client to not use
channel binding, or in short to use SCRAM-SHA-256 with the cbind-flag set
to 'n'.

While reviewing the code, I have found something a bit disturbing with
the header definitions: the libpq frontend code includes scram.h,
which references backend-side routines. So I think that the definition
of the SCRAM mechanisms as well as the channel binding types should be
moved to scram-common.h. This cleanup is included in 0001.
-- 
Michael


0001-Move-SCRAM-related-name-definitions-to-scram-common..patch
Description: Binary data


0002-Add-connection-parameter-scramchannelbinding.patch
Description: Binary data


0003-Implement-channel-binding-tls-server-end-point-for-S.patch
Description: Binary data


Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

2017-11-30 Thread Masahiko Sawada
On Fri, Dec 1, 2017 at 3:04 AM, Robert Haas  wrote:
> On Thu, Nov 30, 2017 at 6:20 AM, Masahiko Sawada  
> wrote:
>>> This code ignores the existence of multiple databases; RELEXTLOCK
>>> contains a relid, but no database OID.  That's easy enough to fix, but
>>> it actually causes no problem unless, by bad luck, you have two
>>> relations with the same OID in different databases that are both being
>>> rapidly extended at the same time -- and even then, it's only a
>>> performance problem, not a correctness problem.  In fact, I wonder if
>>> we shouldn't go further: instead of creating these RELEXTLOCK
>>> structures dynamically, let's just have a fixed number of them, say
>>> 1024.  When we get a request to take a lock, hash the <database OID,
>>> relation OID> pair and take the result modulo 1024; lock the RELEXTLOCK
>>> at that offset in the array.
>>
>> Attached the latest patch incorporated comments except for the fix of
>> the memory size for relext lock.
>
> It doesn't do anything about the comment of mine quoted above.

Sorry, I had missed that comment.

>   Since it's only possible to hold one relation extension lock at a time, we
> don't really need the hash table here at all. We can just have an
> array of 1024 or so locks and map every <database OID, relation OID> pair
> onto one of them by hashing.  The worst thing we'll get is some false
> contention,
> but that doesn't seem awful, and it would permit considerable further
> simplification of this code -- and maybe make it faster in the
> process, because we'd no longer need the hash table, or the pin count,
> or the extra LWLocks that protect the hash table.  All we would have
> is atomic operations manipulating the lock state, which seems like it
> would be quite a lot faster and simpler.

Agreed. With this change, we will have an array of structs, each holding a
lock state and a condition variable. The lock state holds the wait count
as well as the lock status.
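
For illustration, a rough sketch of the fixed-array mapping being discussed;
names such as RELEXT_LOCK_ARRAY_SIZE and RelExtLockSlot are hypothetical and
this is not the actual patch:

#include "postgres.h"
#include "port/atomics.h"
#include "storage/condition_variable.h"

#define RELEXT_LOCK_ARRAY_SIZE 1024

typedef struct RelExtLockSlot
{
    pg_atomic_uint32  state;   /* lock bit plus waiter count */
    ConditionVariable cv;      /* where waiters sleep */
} RelExtLockSlot;

/* In reality this array would live in shared memory. */
static RelExtLockSlot RelExtLockArray[RELEXT_LOCK_ARRAY_SIZE];

static inline RelExtLockSlot *
RelExtLockSlotFor(Oid dbOid, Oid relOid)
{
    /*
     * Any reasonable hash of the <database OID, relation OID> pair works;
     * a collision only causes false contention, never a wrong answer.
     */
    uint32  h = (dbOid * 0x9E3779B9u) ^ relOid;

    return &RelExtLockArray[h % RELEXT_LOCK_ARRAY_SIZE];
}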

> BTW, I think RelExtLockReleaseAll is broken because it shouldn't
> HOLD_INTERRUPTS(); I also think it's kind of silly to loop here when
> we know we can only hold one lock.  Maybe RelExtLockRelease can take
> bool force and do if (force) held_relextlock.nLocks = 0; else
> held_relextlock.nLocks--.  Or, better yet, have the caller adjust that
> value and then only call RelExtLockRelease() if we needed to release
> the lock in shared memory.  That avoids needless branching.

Agreed. I'd vote for the latter.

>  On a
> related note, is there any point in having both held_relextlock.nLocks
> and num_held_relextlocks?

num_held_relextlocks is actually unnecessary, will be removed.

> I think RelationExtensionLock should be a new type of IPC wait event,
> rather than a whole new category.

Hmm, I thought the IPC wait event type was for events where a process
communicates with other processes for a shared purpose, for example
parallel query, synchronous replication, etc. On the other hand, relation
extension locks are a kind of locking mechanism. That's why I added a new
category. But maybe they can fit under the IPC wait event type.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-11-30 Thread Michael Paquier
On Fri, Dec 1, 2017 at 4:14 AM, Robert Haas  wrote:
> I think the changes in DefineView and ATExecSetRelOptions is wrong,
> because transformRelOptions() is still using pg_strcasecmp.  With the
> patch:
>
> rhaas=# create view v(x) with ("Check_option"="local") as select 1;
> CREATE VIEW
> rhaas=# create view v(x) with ("check_option"="local") as select 1;
> ERROR:  WITH CHECK OPTION is supported only on automatically updatable views
> HINT:  Views that do not select from a single table or view are not
> automatically updatable.
>
> Oops.

I am getting the feeling that there could be other issues than this
one... If this patch ever gets integrated, I think that this should
actually be shaped as two patches:
- One patch with the set of DDL queries taking advantage of the
current grammar with pg_strcasecmp.
- A second patch that does the switch from pg_strcasecmp to strcmp,
and allows checking which paths of patch 1 are getting changed.
Patch 1 is the hard part: figuring out all the possible patterns that
could be changed.

> I am actually pretty dubious that we want to do this.  I found one bug
> already (see above), and I don't think there's any guarantee that it's
> the only one, because it's pretty hard to be sure that none of the
> remaining calls to pg_strcasecmp are being applied to any of these
> values in some other part of the code.  I'm not sure that the backward
> compatibility issue is a huge deal, but from my point of view this
> carries a significant risk of introducing new bugs, might annoy users
> who spell any of these keywords in all caps with surrounding quotation
> marks, and really has no clear benefit that I can see.  The
> originally-offered justification is that making this consistent would
> help us avoid subtle bugs, but it seems at least as likely to CREATE
> subtle bugs, and the status quo is AFAICT harming nobody.

Changing opinion here ;)
Yes, I agree that the risk of bugs may not be worth the compatibility
effort at the end. I still see value in this patch for long-term
purposes by making the code more consistent though.
-- 
Michael



Re: Combine function returning NULL unhandled?

2017-11-30 Thread David Rowley
On 24 November 2017 at 14:20, Andres Freund  wrote:
> Pushed a fix to the relevant branches, including tests of the
> trans/combine functions returns NULL cases.

Apologies for my silence here. I've been on leave and out of internet
range for two weeks.

Thank you for making the fix.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-11-30 Thread Alvaro Herrera
Alexey Kondratov wrote:

> I have rebased my branch and squashed all commits into one, since the
> patch is quite small. Everything seems to be working fine now, patch
> passes all regression tests.

On a *very* quick look, please use an enum to return from NextCopyFrom
rather than 'int'.  The chunks that change bool to int are very
odd-looking.  This would move the comment that explains the value from
copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII dashes
in the descriptions of those values; please don't.

But those are really just minor nitpicks while paging through.  After
looking at the way you're handling errors, it seems that you've chosen
to go precisely the way people were saying that was not admissible: use
a PG_TRY block, and when things go south, simply log a WARNING and move
on.  This is unsafe.  For example: what happens if the error being
raised is out-of-memory?  Failing to re-throw means you don't release
current transaction memory context.  What if the error is that WAL
cannot be written because the WAL disk ran out of space?  This would
just be reported as a warning over and over until the server log disk is
also full.  What if your insertion fired a trigger that caused an update
which got a serializability exception?  You cannot just move on, because
at that point session state (serializability session state) might be
busted until you abort the current transaction.
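
To illustrate, here is a rough sketch of the kind of pattern being warned
about (hypothetical, not the actual patch): the error is swallowed, demoted
to a WARNING, and execution continues in the same transaction, skipping the
cleanup that a real abort would perform:

PG_TRY();
{
    /* ... insert the tuple produced by NextCopyFrom() ... */
}
PG_CATCH();
{
    /*
     * Unsafe: the cleanup normally performed by transaction abort (memory
     * contexts, locks, serializable session state, ...) never happens,
     * yet we keep executing as if nothing went wrong.
     */
    FlushErrorState();
    ereport(WARNING,
            (errmsg("skipping row due to error")));
}
PG_END_TRY();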

Or maybe I misunderstood the patch completely.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Commit fest 2017-11

2017-11-30 Thread Michael Paquier
On Fri, Dec 1, 2017 at 12:44 AM, Andreas Karlsson  wrote:
> On 11/30/2017 05:51 AM, Michael Paquier wrote:
>> One thing that could be improved in my opinion is that patch authors
>> should try more to move a patch to a following commit fest once the end
>> gets close... This would slightly lighten the workload for the CFM.
>
> Thanks for the suggestion. I had no idea that I could do that.

As a patch author, you have the right to decide what you are yourself
planning to do with your own baby :)

When the commit fest comes close to the end, I always try to deal with
my own patches first. If the discussion has stalled, or if the feedback
shows that something is going to require more effort than I initially
planned and I cannot spend more time on it in the near future, I mark my
own entries as returned with feedback. If I am still planning to work on
something in the near future, then I move it to the next CF. This makes
less work for the CFM, who has fewer patches to deal with at the end.
Mentioning on the patch thread what you did with the CF entry is also
worth doing, in my opinion, to keep a record of it. When multiple
authors and/or reviewers are involved, it is sometimes good to post your
intentions some time ahead before acting.
-- 
Michael



Re: [HACKERS] Custom compression methods

2017-11-30 Thread Alvaro Herrera
Tomas Vondra wrote:

> On 11/30/2017 04:20 PM, Ildus Kurbangaliev wrote:

> > CREATE COMPRESSION METHOD ts1 FOR tsvector HANDLER
> > tsvector_compression_handler;
> 
> Understood. Good to know you've considered it, and I agree it doesn't
> need to be there from the start (which makes the patch simpler).

Just passing by, but wouldn't this fit in the ACCESS METHOD group of
commands?  So this could be simplified down to
CREATE ACCESS METHOD ts1 TYPE COMPRESSION
We have that for indexes, and there are patches flying for heap storage,
sequences, etc.  I think that's simpler than trying to invent all new
commands here.  Then (in a future patch) you can use ALTER TYPE to
define compression for that type, or even add a column-level option to
reference a specific compression method.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: using index or check in ALTER TABLE SET NOT NULL

2017-11-30 Thread Sergei Kornilov
Thank you for the review! My apologies for my errors. It seems I did not read 
the developer wiki pages carefully enough. I will reread the wiki and coding 
style guidelines, and then update the patch to address all your remarks.

> The comment /* T if we added new NOT NULL constraints */ should
> probably be changed to /* T if we should recheck NOT NULL constraints
> */ or similar.
Correct. Hm, I will probably also rename this property to verify_new_notnull 
for clarity.

> I'd suggest registering this with the next CommitFest.
Already done in https://commitfest.postgresql.org/16/1389/ ; I noticed this 
step in the wiki.



Re: IndexTupleDSize macro seems redundant

2017-11-30 Thread Peter Geoghegan
On Thu, Nov 30, 2017 at 1:48 PM, Robert Haas  wrote:
> One difference between those two macros is that IndexTupleSize
> forcibly casts the argument to IndexTuple, which means that you don't
> get any type-checking when you use that one.  I suggest that in
> addition to removing IndexTupleDSize as proposed, we also remove that
> cast.  It seems that the only place which relies on that cast
> currently is btree_xlog_split.
>
> Therefore I propose the attached patch instead.

+1 to both points. I specifically recall being annoyed by both issues,
more than once.

-- 
Peter Geoghegan



Re: [HACKERS] <> join selectivity estimate question

2017-11-30 Thread Thomas Munro
On Fri, Dec 1, 2017 at 4:05 AM, Robert Haas  wrote:
> On Wed, Nov 29, 2017 at 11:55 PM, Thomas Munro
>  wrote:
>> Thank you for the original pointer and the commit.  Everything here
>> seems to make intuitive sense and the accompanying throw-away tests
>> that I posted above seem to produce sensible results except in some
>> cases that we discussed, so I think this is progress.  There is still
>> something pretty funny about the cardinality estimates for TPCH Q21
>> which I haven't grokked though.  I suspect it is crafted to look for a
>> technique we don't know (an ancient challenge set by some long retired
>> database gurus back in 1992 that their RDBMSs know how to solve,
>> hopefully not in the manner of a certain car manufacturer's air
>> pollution tests), but I haven't yet obtained enough round tuits to dig
>> further.  I will, though.
>
> Hmm, do you have an example of the better but still-funky estimates
> handy?  Like an EXPLAIN plan?

Sure.  Here's some EXPLAIN ANALYZE output from scale 3 TPCH + a few
indexes[1].  There's a version from HEAD with and without commit
7ca25b7d.

[1] 
https://github.com/macdice/pg_sisyphus/blob/master/cluster-recipes/make-tpch-cluster.sh

-- 
Thomas Munro
http://www.enterprisedb.com

                              QUERY PLAN
------------------------------------------------------------------------
 Limit  (cost=544020.68..544020.69 rows=1 width=34) (actual time=5070.235..5070.253 rows=100 loops=1)
   ->  Sort  (cost=544020.68..544020.69 rows=1 width=34) (actual time=5070.233..5070.241 rows=100 loops=1)
         Sort Key: (count(*)) DESC, supplier.s_name
         Sort Method: top-N heapsort  Memory: 39kB
         ->  GroupAggregate  (cost=544020.65..544020.67 rows=1 width=34) (actual time=5061.289..5068.050 rows=1194 loops=1)
               Group Key: supplier.s_name
               ->  Sort  (cost=544020.65..544020.66 rows=1 width=26) (actual time=5061.275..5063.360 rows=11887 loops=1)
                     Sort Key: supplier.s_name
                     Sort Method: quicksort  Memory: 1406kB
                     ->  Nested Loop  (cost=1752.62..544020.64 rows=1 width=26) (actual time=3.134..4926.365 rows=11887 loops=1)
                           ->  Nested Loop Semi Join  (cost=1752.19..544015.98 rows=1 width=34) (actual time=3.122..4598.302 rows=24138 loops=1)
                                 ->  Gather  (cost=1751.75..544010.27 rows=1 width=34) (actual time=2.909..4354.282 rows=40387 loops=1)
                                       Workers Planned: 2
                                       Workers Launched: 2
                                       ->  Nested Loop Anti Join  (cost=751.75..543010.17 rows=1 width=34) (actual time=4.282..4720.283 rows=13462 loops=3)
                                             ->  Hash Join  (cost=751.32..442412.33 rows=99981 width=34) (actual time=3.911..3554.800 rows=151103 loops=3)
                                                   Hash Cond: (l1.l_suppkey = supplier.s_suppkey)
                                                   ->  Parallel Seq Scan on lineitem l1  (cost=0.00..431288.00 rows=2499520 width=8) (actual time=0.046..2742.912 rows=3792542 loops=3)
                                                         Filter: (l_receiptdate > l_commitdate)
                                                         Rows Removed by Filter: 2206328
                                                   ->  Hash  (cost=736.32..736.32 rows=1200 width=30) (actual time=3.690..3.690 rows=1194 loops=3)
                                                         Buckets: 2048  Batches: 1  Memory Usage: 91kB
                                                         ->  Nested Loop  (cost=25.59..736.32 rows=1200 width=30) (actual time=0.486..3.206 rows=1194 loops=3)
                                                               ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4) (actual time=0.018..0.038 rows=1 loops=3)
                                                                     Filter: (n_name = 'ALGERIA'::bpchar)
                                                                     Rows Removed by Filter: 24
                                                               ->  Bitmap Heap Scan on supplier  (cost=25.59..723.00 rows=1200 width=34) (actual time=0.460..2.809 rows=1194 loops=3)
                                                                     Recheck Cond: (s_nationkey = nation.n_nationkey)
                                                                     Heap Blocks: exact=564
                                                                     ->  Bitmap Index Scan 

Re: [HACKERS] Bug in to_timestamp().

2017-11-30 Thread Arthur Zakirov
On Thu, Nov 30, 2017 at 10:39:56AM -0500, Robert Haas wrote:
> 
> So is it now the case that all of the regression test cases in this
> patch produce the same results as they would on Oracle?
> 

Yes, exactly. All new cases give the same result as in Oracle, except
for the text of the error messages.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] postgres_fdw bug in 9.6

2017-11-30 Thread Robert Haas
On Wed, Nov 29, 2017 at 5:32 PM, Tom Lane  wrote:
> AFAICT, [1] just demonstrates that nobody had thought about the scenario
> up to that point, not that the subsequently chosen solution was a good
> one.

But have a look at this:

http://postgr.es/m/561e12d4.7040...@lab.ntt.co.jp

That shows a case where, before
5fc4c26db5120bd90348b6ee3101fcddfdf54800, a query that required the
foreign table to do an EPQ recheck produced an unambiguously wrong
answer; the query stipulates that inttab.a = ft1.a but the row
returned lacks that property.  I think this clearly shows that EPQ
rechecks are needed at least for FDW paths that are parameterized.

However, I guess that doesn't actually refute your point, which IIUC
is that maybe we don't need these checks for FDW join paths, because
we don't build parameterized paths for those yet. Now I think it would
still be a good idea to have a way of handling that case, because
somebody's likely going want to fix that /* XXX Consider parameterized
paths for the join relation */ eventually, but I guess for the
back-branches just setting epq_path = NULL instead of calling
GetExistingLocalJoinPath() might be the way to go... assuming that
your contention that this won't break anything is indeed correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Custom compression methods

2017-11-30 Thread Tomas Vondra


On 11/30/2017 04:20 PM, Ildus Kurbangaliev wrote:
> On Thu, 30 Nov 2017 00:30:37 +0100
> Tomas Vondra  wrote:
>
> ...
>
>> I can imagine other interesting use cases - for example values in
>> JSONB columns often use the same "schema" (keys, nesting, ...), so
>> can I imagine building a "dictionary" of JSON keys for the whole
>> column ...
>>
>> Ildus, is this a use case you've been aiming for, or were you aiming
>> to use the new API in a different way?
> 
> Thank you for such good overview. I agree that pglz is pretty good as
> general compression method, and there's no point to change it, at
> least now.
> 
> I see few useful use cases for compression methods, it's special
> compression methods for int[], timestamp[] for time series and yes,
> dictionaries for jsonb, for which I have even already created an
> extension (https://github.com/postgrespro/jsonbd). It's working and
> giving promising results.
> 

I understand the reluctance to put everything into core, particularly
for complex patches that evolve quickly. Also, not having to put
everything into core is kinda why we have extensions.

But perhaps some of the simpler cases would be good candidates for core,
making it possible to test the feature?

>>
>> I wonder if the patch can be improved to handle this use case better.
>> For example, it requires knowledge the actual data type, instead of
>> treating it as opaque varlena / byte array. I see tsvector compression
>> does that by checking typeid in the handler.
>>
>> But that fails for example with this example
>>
>> db=# create domain x as tsvector;
>> CREATE DOMAIN
>> db=# create table t (a x compressed ts1);
>> ERROR:  unexpected type 28198672 for tsvector compression handler
>>
>> which means it's a few brick shy to properly support domains. But I
>> wonder if this should be instead specified in CREATE COMPRESSION
>> METHOD instead. I mean, something like
>>
>> CREATE COMPRESSION METHOD ts1 HANDLER tsvector_compression_handler
>> TYPE tsvector;
>>
>> When type is no specified, it applies to all varlena values. Otherwise
>> only to that type. Also, why not to allow setting the compression as
>> the default method for a data type, e.g.
>>
>> CREATE COMPRESSION METHOD ts1 HANDLER tsvector_compression_handler
>> TYPE tsvector DEFAULT;
>>
>> would automatically add 'COMPRESSED ts1' to all tsvector columns in
>> new CREATE TABLE commands.
> 
> The initial version of the patch contained ALTER syntax that changed the
> compression method for whole types, but I decided to remove that
> functionality for now because the patch is already quite complex; it
> could be added later as a separate patch.
> 
> Syntax was:
> ALTER TYPE  SET COMPRESSION ;
> 
> Specifying the supported type for the compression method is a good idea.
> Maybe the following syntax would be better?
> 
> CREATE COMPRESSION METHOD ts1 FOR tsvector HANDLER
> tsvector_compression_handler;
> 

Understood. Good to know you've considered it, and I agree it doesn't
need to be there from the start (which makes the patch simpler).

>>
>> BTW do you expect the tsvector compression to be generally useful, or
>> is it meant to be used only by the tests? If generally useful,
>> perhaps it should be created in pg_compression by default. If only
>> for tests, maybe it should be implemented in an extension in contrib
>> (thus also serving as example how to implement new methods).
>>
>> I haven't thought about the JSONB use case very much, but I suppose
>> that could be done using the configure/drop methods. I mean,
>> allocating the dictionary somewhere (e.g. in a table created by an
>> extension?). The configure method gets the Form_pg_attribute record,
>> so that should be enough I guess.
>>
>> But the patch is not testing those two methods at all, which seems
>> like something that needs to be addresses before commit. I don't
>> expect a full-fledged JSONB compression extension, but something
>> simple that actually exercises those methods in a meaningful way.
> 
> I will move tsvector_compression_handler to a separate extension in
> the next version. I added it more as an example, but it could also be
> used to achieve better compression for tsvectors. Tests on maillists
> database ('archie' tables):
> 
> usual compression:
> 
> maillists=# select body_tsvector, subject_tsvector into t1 from
> messages; SELECT 1114213
> maillists=# select pg_size_pretty(pg_total_relation_size('t1'));
>  pg_size_pretty 
> 
>  1637 MB
> (1 row)
> 
> tsvector_compression_handler:
> maillists=# select pg_size_pretty(pg_total_relation_size('t2'));
>  pg_size_pretty 
> 
>  1521 MB
> (1 row)
> 
> lz4:
> maillists=# select pg_size_pretty(pg_total_relation_size('t3'));
>  pg_size_pretty 
> 
>  1487 MB
> (1 row)
> 
> I don't stick to tsvector_compression_handler, I think if there
> will some example that can use all the features then
> 

Re: [HACKERS] SQL procedures

2017-11-30 Thread Peter Eisentraut
On 11/29/17 14:17, Andrew Dunstan wrote:
> On 11/28/2017 10:03 AM, Peter Eisentraut wrote:
>> Here is a new patch that addresses the previous review comments.
>>
>> If there are no new comments, I think this might be ready to go.
> 
> Looks good to me. Marking ready for committer.

committed


-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] WIP: Covering + unique indexes.

2017-11-30 Thread Andrey Borodin
Hi, Peter!
> On 29 Nov 2017, at 8:45, Peter Geoghegan  wrote:
> 
> It looks like amcheck needs to be patched -- a simple oversight.
> amcheck is probably calling _bt_compare() without realizing that
> internal pages don't have the extra attributes (just leaf pages,
> although they should also not participate in comparisons in respect of
> included/extra columns). There were changes to amcheck at one point in
> the past. That must have slipped through again. I don't think it's
> that complicated.
> 
> BTW, it would probably be a good idea to use the new Github version's
> "heapallindexed" verification [1] for testing this patch. Anastasia
> will need to patch the externally maintained amcheck to do this, but
> it's probably no extra work, because this is already needed for
> contrib/amcheck, and because the heapallindexed check doesn't actually
> care about index structure at all.

It seems the patching was not a big deal; I've fixed those bits (see 
attachment). 
I've done only simple tests for now, but I'm planning to do more thorough 
testing before the next CF.
Thanks for mentioning "heapallindexed", I'll use it too.

Best regards, Andrey Borodin.


amcheck_include.diff
Description: Binary data


Re: [HACKERS] Support to COMMENT ON DATABASE CURRENT_DATABASE

2017-11-30 Thread Robert Haas
On Wed, Nov 29, 2017 at 11:35 PM, Michael Paquier
 wrote:
> On Mon, Nov 27, 2017 at 11:41 AM, Jing Wang  wrote:
>> This is a patch for current_database working on ALTER ROLE/GRANT/REVOKE
>> statements which should be applied after the previous patch
>> "comment_on_current_database_no_pgdump_v4.4.patch".
>>
>> By using the patch the CURRENT_DATABASE can working in the following SQL
>> statements:
>>
>> ALTER ROLE ... IN DATABASE CURRENT_DATABASE SET/RESET
>> configuration_parameter
>> GRANT ... ON DATABASE CURRENT_DATABASE TO role_specification ...
>> REVOKE ... ON DATABASE CURRENT_DATABASE FROM ...
>
> Moved to next CF with same status, "needs review".

Patch no longer applies.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-11-30 Thread Robert Haas
On Tue, Nov 21, 2017 at 4:36 AM, Peter Moser  wrote:
> Hi hackers,
> we like to rethink our approach...
>
> For simplicity I'll drop ALIGN for the moment and focus solely on NORMALIZE:
>
> SELECT * FROM (R NORMALIZE S ON R.x = S.y WITH (R.time, S.time)) c;
>
> Our normalization executor node needs the following input (for now
> expressed in plain SQL):
>
> SELECT R.*, p1
> FROM (SELECT *, row_id() OVER () rn FROM R) R
>  LEFT OUTER JOIN (
> SELECT y, LOWER(time) p1 FROM S
> UNION
> SELECT y, UPPER(time) p1 FROM S
>  ) S
>  ON R.x = S.y AND p1 <@ R.time
> ORDER BY rn, p1;
>
> In other words:
> 1) The left subquery adds an unique ID to each tuple (i.e., rn).
> 2) The right subquery creates two results for each input tuple: one for
>the upper and one for the lower bound of each input tuple's valid time
>column. The boundaries get put into a single (scalar) column, namely p1.
> 3) We join both subqueries if the normalization predicates hold (R.x = S.y)
>and p1 is inside the time of the current outer tuple.
> 4) Finally, we sort the result by the unique ID (rn) and p1, and give all
>columns of the outer relation, rn and p1 back.

So, if I understand correctly, there are three possible outcomes.  If
S.time and R.time are non-overlapping, then a null-extended row is
produced with the original data from R.  If S.time overlaps R.time but
is not contained within it, then this produces one result row, not
null-extended -- either the upper or lower bound will be found to be
contained within R.time, but the other will not.  If S.time is
contained within R.time, then this produces 2 result rows, one with p1
containing S.time's lower bound and one with p1 containing S.time's
upper bound.  Is that right?  Then, after all this happens, the
temporal normalizer code runs over the results and uses the p1 values
as split points for the ranges from R.

I wonder if we could instead think about R NORMALIZE S ON R.x = S.y
WITH (R.time, S.time) as an ordinary join on R.x = S.y with some extra
processing afterwards.  After finding all the join partners for a row
in R, extract all the lower and upper bounds from the S.time fields of
the join partners and use that to emit as many rows from R as may be
necessary.  The main problem that I see with that approach is that it
seems desirable to separate the extra-processing-afterwards step into
a separate executor node, and there's no way for a separate type of
plan node to know when the join advances the outer side of the join.
That's the purpose that row_id() is serving as you have it; we'd have
to invent some other kind of signalling mechanism, which might not be
entirely trivial.  :-(

If we could do it, though, it might be quite a bit more efficient,
because it would avoid scanning S twice and performing a UNION on the
results of those scans.  Also, we wouldn't necessarily need to sort
the whole set of rows from R, which I suspect is unavoidable in your
implementation.  We'd just need to sort the individual groups of rows
from S, and my guess is that in many practical cases those groups are
fairly small.

I wonder what identities hold for NORMALIZE.  It does not seem to be
commutative, but it looks like it might be associative... i.e. (R
NORMALIZE S) NORMALIZE T produces the same output as R NORMALIZE (S
NORMALIZE T), perhaps?

> Our first attempt to understand the new approach would be as follows: The
> left base rel of the inner left-outer-join can be expressed as a WindowAgg
> node. However, the right query of the join is much more difficult to build
> (maybe through hash aggregates). Both queries could be put together with a
> MergeJoin for instance. However, if we create the plan tree by hand and
> choose algorithms for it manually, how is it possible to have it optimized
> later? Or, if that is not possible, how do we choose the best algorithms
> for it?

You don't want to generate plan nodes directly like this, I think.
Generally, you create paths, and the cheapest path wins at the end of
planning.  The thing that makes this tricky is that the temporal
normalization phase can be separated from the inner left-outer-join by
other joins, and the planner doesn't really have a good way of
representing that concept.

More broadly, the very idea of creating plan nodes suggests that you
are approaching this from the point of view of sticking the logic in
near the end of the planning process, which I think is not going to
work.  Whatever is going to happen here needs to happen at path
generation time at the latest, or maybe even earlier, before the
optimizer even starts processing the join tree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] create_unique_path and GEQO

2017-11-30 Thread Robert Haas
On Fri, Mar 24, 2017 at 9:50 AM, Tom Lane  wrote:
> Yeah.  I think the code in mark_dummy_rel is newer and better-thought-out
> than what's in create_unique_path.  It probably makes sense to change over.

I did a bit of archaeology here.  create_unique_path() first appears
in commit bdfbfde1b168b3332c4cdac34ac86a80aaf4d442 (vintage 2003),
where it used GetMemoryChunkContext(rel). Commit
f41803bb39bc2949db200116a609fd242d0ec221 (vintage 2007) changed it to
use root->planner_cxt, but neither the comment changes in that patch
nor the commit message give any hint as to the motivation for the
change.  The comments do mention that it should be kept in sync with
best_inner_indexscan(), which was also switched to use
root->planner_cxt by that commit, again without any real explanation
as to why, and best_inner_indexscan() continued to use
root->planner_cxt until its demise in commit
e2fa76d80ba571d4de8992de6386536867250474 (vintage 2012).  Meanwhile,
mark_dummy_rel() didn't switch contexts at all until commit
eca75a12a27d28b972fc269c1c8813cd8eb15441 (vintage 2011) at which point
it began using GetMemoryChunkContext(rel).

So, the coding that Ashutosh is proposing here is just changing things
back to the way they were before 2007, but with the better comment
that Tom wrote in 2011 for mark_dummy_rel().  Since the 2007 change
lacks any justification and the new code has one, I'm inclined to
agree with Tom that changing this makes sense.  If it turns out to be
wrong, then mark_dummy_rel() also needs fixing, and the comments need
to explain things better.  My guess, though, is that it's right,
because the rationale offered in mark_dummy_rel() seems to make a lot
of sense.  One reason why we might want to care about this, other than
a laudable consistency, is that at one point Ashutosh had some patches
to reduce the memory usage of partition-wise join that involved doing
more incremental discarding of RelOptInfos akin to what GEQO does
today.  If we ever do anything like that, it will be good if we're
consistent (and correct) about which contexts end up containing which
data structures.
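
For reference, the coding pattern being switched back to (the same one
mark_dummy_rel() uses) looks roughly like this.  This is only a sketch
of the idea, not the actual patch hunk:

	MemoryContext oldcontext;
	UniquePath *pathnode;

	/* Allocate the new path in the same context as its parent rel. */
	oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel));

	pathnode = makeNode(UniquePath);
	/* ... fill in the path's cost and uniqueness fields here ... */

	MemoryContextSwitchTo(oldcontext);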

All of which, I think, is a long-winded way of saying that I'm going
to go commit this patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: ASCII Null control character validation

2017-11-30 Thread Tom Lane
Alexey Chernyshov  writes:
> I found that pg_verify_mbstr_len() in src/backend/utils/mb/wchar.c
> reports the ASCII Null character (\000) as invalid. In my opinion, it
> should pass validation.

This is intentional and we're not going to change it.  There is too
much code in the backend that relies on NUL-terminated strings ...

> However, the ASCII Null character currently breaks the line and the
> rest of the line goes missing; try:
> INSERT INTO mytable VALUES (E'a\001b\000c and rest of line MIA');

... like that for instance.  See (many) past discussions of this issue.
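
To spell out the failure mode, here is a trivial standalone illustration
(plain C, not backend code):

#include <stdio.h>
#include <string.h>

int
main(void)
{
	/* same payload as the INSERT above, with an embedded NUL byte */
	char	buf[] = "a\001b\000c and rest of line MIA";

	/* NUL-terminated string handling sees only "a\001b" */
	printf("%zu\n", strlen(buf));	/* prints 3 */
	return 0;
}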

regards, tom lane



Re: [HACKERS] postgres_fdw bug in 9.6

2017-11-30 Thread Tom Lane
Etsuro Fujita  writes:
> (2017/11/30 7:32), Tom Lane wrote:
>> the output of the foreign join cannot change during EPQ, since the remote
>> server already locked the rows before returning them.  The only thing that
>> can change is the output of the local scan on public.tab.  Yes, we then
>> need to re-verify that foo.a = tab.a ... but in this example, that's the
>> responsibility of the NestLoop plan node, not the foreign join.

> That's right, but is that true when the FDW uses late row locking?

An FDW using late row locking would need to work harder, yes.  But
that's true at the scan level as well as the join level.  We have
already committed to using early locking in postgres_fdw, for the
network-round-trip-cost reasons I mentioned before, and I can't see
why we'd change that decision at the join level.

Right now we've got the worst of both worlds, in that we're actually
doing early row locking on the remote, but we're paying (some of) the
code complexity costs required for late locking.

regards, tom lane



Re: [HACKERS] UPDATE of partition key

2017-11-30 Thread Amit Khandekar
While addressing Thomas's point about test scenarios not yet covered,
I observed the following ...

Suppose an UPDATE RLS policy with a WITH CHECK clause is defined on
the target table. In ExecUpdate(), the corresponding WCO qual gets
executed *before* the partition constraint check, as per existing
behaviour, and the qual succeeds. Then, because the partition key was
updated, the row is moved to another partition. Now suppose there is a
BR INSERT trigger on that partition which modifies the row such that
the resultant row would *not* pass the UPDATE RLS policy. But for this
partition, since the operation is an INSERT, only INSERT RLS WCO quals
are executed.

So effectively, from a user's perspective, a row that an RLS WITH CHECK
policy was defined to reject gets updated successfully. This is because
the policy is not re-checked *after* a row trigger in the new partition
has run.

Attached is a test case that reproduces this issue.
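
Since the test case is attached rather than inline, here is a rough
sketch of the kind of setup it involves (object names are invented, the
update-row-movement patch is assumed to be applied, and the commands are
assumed to run as a user to whom the policies apply):

CREATE TABLE t (a int, b int) PARTITION BY RANGE (a);
CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (1) TO (10);
CREATE TABLE t2 PARTITION OF t FOR VALUES FROM (10) TO (20);

ALTER TABLE t ENABLE ROW LEVEL SECURITY;
CREATE POLICY p_sel ON t FOR SELECT USING (true);
CREATE POLICY p_ins ON t FOR INSERT WITH CHECK (true);
CREATE POLICY p_upd ON t FOR UPDATE USING (true) WITH CHECK (b < 100);

-- BR INSERT trigger on the destination partition modifies the row so
-- that it no longer satisfies the UPDATE policy's WITH CHECK clause.
CREATE FUNCTION set_big_b() RETURNS trigger LANGUAGE plpgsql AS
$$ BEGIN NEW.b := 200; RETURN NEW; END $$;
CREATE TRIGGER t2_br BEFORE INSERT ON t2
FOR EACH ROW EXECUTE PROCEDURE set_big_b();

INSERT INTO t VALUES (5, 1);
-- The row moves from t1 to t2; only the INSERT WCO (true) is checked on
-- the new partition, so the UPDATE succeeds even though the final row
-- has b = 200, which the UPDATE policy was meant to reject.
UPDATE t SET a = 15 WHERE a = 5;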

I think, in case of row-movement, we should defer calling
ExecWithCheckOptions() until the row is being inserted using
ExecInsert(). And then in ExecInsert(), ExecWithCheckOptions() should
be called using WCO_RLS_UPDATE_CHECK rather than WCO_RLS_INSERT_CHECK
(I recall Amit Langote was of this opinion) as below :

--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -510,7 +510,9 @@ ExecInsert(ModifyTableState *mtstate,
  * we are looking for at this point.
  */
  if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(WCO_RLS_INSERT_CHECK,
+ExecWithCheckOptions((mtstate->operation == CMD_UPDATE ?
+ WCO_RLS_UPDATE_CHECK : WCO_RLS_INSERT_CHECK),
  resultRelInfo, slot, estate);


It can be argued that, since we always execute INSERT row triggers for
rows inserted as part of update-row-movement, we should be consistent
and execute INSERT WCOs rather than UPDATE WCOs for such rows. But note
that the row triggers we execute are defined on the leaf partitions,
whereas the RLS policies being executed are defined on the target
partitioned table, not on the leaf partition. Hence it makes sense to
execute them as per the original operation on the target table. This is
similar to why we execute UPDATE statement triggers even when the row
is eventually inserted into another partition: the UPDATE statement
trigger was defined on the target table, not the leaf partition.

Barring any objections, I am going to send a revised patch that fixes
the above issue as described.

Thanks
-Amit Khandekar


wco_rls_issue.sql
Description: Binary data


Re: [HACKERS] postgres_fdw bug in 9.6

2017-11-30 Thread Etsuro Fujita

(2017/11/30 7:32), Tom Lane wrote:

AFAICT, [1] just demonstrates that nobody had thought about the scenario
up to that point, not that the subsequently chosen solution was a good
one.  In that example,

LockRows  (cost=100.00..101.18 rows=4 width=70)
Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
->   Nested Loop  (cost=100.00..101.14 rows=4 width=70)
  Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
  Join Filter: (foo.a = tab.a)
  ->   Seq Scan on public.tab  (cost=0.00..1.01 rows=1 width=14)
Output: tab.a, tab.b, tab.ctid
  ->   Foreign Scan  (cost=100.00..100.08 rows=4 width=64)
Output: foo.*, foo.a, bar.*, bar.a
Relations: (public.foo) INNER JOIN (public.bar)
Remote SQL: SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT 
ROW(l.a9), l.a9 FROM (SELECT a a9 FROM public.foo FOR UPDATE) l) l (a1, a2) 
INNER JOIN (SELECT ROW(r.a9), r.a9 FROM (SELECT a a9 FROM public.bar FOR 
UPDATE) r) r (a1, a2) ON ((l.a2 = r.a2))

the output of the foreign join cannot change during EPQ, since the remote
server already locked the rows before returning them.  The only thing that
can change is the output of the local scan on public.tab.  Yes, we then
need to re-verify that foo.a = tab.a ... but in this example, that's the
responsibility of the NestLoop plan node, not the foreign join.


That's right, but is that true when the FDW uses late row locking?

Best regards,
Etsuro Fujita



Re: [HACKERS] INSERT .. ON CONFLICT DO SELECT [FOR ..]

2017-11-30 Thread Marko Tiikkaja
On Thu, Nov 30, 2017 at 6:39 AM, Peter Geoghegan  wrote:

> On Wed, Nov 29, 2017 at 8:34 PM, Michael Paquier
>  wrote:
> > The patch does not currently apply. I am noticing as well that Peter
> > Geoghegan has registered himself as a reviewer a couple of hours back,
> > so moved to next CF with waiting on author as status.
>
> Marko unregistered me, so I promptly reregistered. You'll have to ask him
> why.


Oops.  I didn't mean to do what I did, and I apologize.  I had my months
mixed up and I thought this commit fest had ended without you having looked
at the patch, so I assumed you would register in the next CF if you were
still interested, or someone else could pick it up.

In any case, *I* don't have any interest in pursuing this patch at this
point.  I think this is a really good feature, and as far as I know the
patch is technically solid, or at least was when it still applied.  Can
someone else take it from here?


.m


Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-30 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Nov 15, 2017 at 2:49 AM, Alvaro Herrera  
> wrote:
> > Here's the remaining bits, rebased.
> 
> At least patch 3 has conflicts with HEAD. I am moving this patch to
> next CF per a lack of reviews, switching status to waiting on author.

Thanks, will submit a new version before the start of the next
commitfest.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-11-30 Thread Alvaro Herrera
David Rowley wrote:

>  I wonder if it should be this patch's job to alter the code in
> get_relation_info() which causes the indexes not to be loaded for
> partitioned tables:
> 
> /*
> * Make list of indexes.  Ignore indexes on system catalogs if told to.
> * Don't bother with indexes for an inheritance parent, either.
> */
> if (inhparent ||
> (IgnoreSystemIndexes && IsSystemRelation(relation)))
> hasindex = false;
> else
> hasindex = relation->rd_rel->relhasindex;
> 
> A partitioned table will always go into the hasindex = false code path.
> 
> I'm kind of thinking this patch should change that, even if the patch is
> not making use of the indexes, you could argue that something
> using set_rel_pathlist_hook might want to do something there, although,
> there's likely a bunch of counter arguments too.
> 
> What do you think?

Great question.  So you're thinking that the planner might have an
interest in knowing what indexes are defined at the parent table level
for planning purposes; but for that to actually have any effect we would
need to change the planner and executor also.  And one more point, also
related to something you said before: we currently (I mean after my
patch) don't mark partitioned-table-level indexes as valid or not valid
depending on whether all their children exist, so trying to use that in
the planner without having a flag could cause invalid plans to be
generated (i.e. ones that would cause nonexistent indexes to be
referenced).
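
For what it's worth, the relaxation David is suggesting might look
roughly like this (a sketch only, not something the current patch set
does):

/*
 * Make list of indexes.  Ignore indexes on system catalogs if told to.
 * Don't bother with indexes for a plain inheritance parent, but do load
 * them for a partitioned table.
 */
if ((inhparent &&
	 relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) ||
	(IgnoreSystemIndexes && IsSystemRelation(relation)))
	hasindex = false;
else
	hasindex = relation->rd_rel->relhasindex;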

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Issues with logical replication

2017-11-30 Thread Simon Riggs
On 30 November 2017 at 11:30, Petr Jelinek  wrote:
> On 30/11/17 00:47, Andres Freund wrote:
>> On 2017-11-30 00:45:44 +0100, Petr Jelinek wrote:
>>> I don't understand. I mean sure the SnapBuildWaitSnapshot() can live
>>> with it, but the problematic logic happens inside the
>>> XactLockTableInsert() and SnapBuildWaitSnapshot() has no way of
>>> detecting the situation short of reimplementing the
>>> XactLockTableInsert() instead of calling it.
>>
>> Right. But we fairly trivially can change that. I'm remarking on it
>> because other people's, not yours, suggestions aimed at making this
>> bulletproof. I just wanted to make clear that I don't think that's
>> necessary at all.
>>
>
> Okay, then I guess we are in agreement. I can confirm that the attached
> fixes the issue in my tests. Using SubTransGetTopmostTransaction()
> instead of SubTransGetParent() means 3 more ifs in terms of extra CPU
> cost for other callers. I don't think it's worth worrying about given we
> are waiting for heavyweight lock, but if we did we can just inline the
> code directly into SnapBuildWaitSnapshot().

This will still fail an Assert in TransactionIdIsInProgress() when
snapshots are overflowed.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] logical decoding of two-phase transactions

2017-11-30 Thread Nikhil Sontakke
Hi,


>> So perhaps better approach would be to not return
>> HEAPTUPLE_DEAD if the transaction id is newer than the OldestXmin (same
>> logic we use for deleted tuples of committed transactions) in the
>> HeapTupleSatisfiesVacuum() even for aborted transactions. I also briefly
>> checked HOT pruning and AFAICS the normal HOT pruning (the one not
>> called by vacuum) also uses the xmin as authoritative even for aborted
>> txes so nothing needs to be done there probably.
>>
>> In case we are worried that this affects cleanups of for example large
>> aborted COPY transactions and we think it's worth worrying about then we
>> could limit the new OldestXmin based logic only to catalog tuples as
>> those are the only ones we need available in decoding.
>
>
> Yeah, if it's limited to catalog tuples only then that sounds good. I was
> quite concerned about how it'd impact vacuuming otherwise, but if limited to
> catalogs about the only impact should be on workloads that create lots of
> TEMPORARY tables then ROLLBACK - and not much on those.
>

Based on these discussions, I think there are two separate issues here:

1) Make HeapTupleSatisfiesVacuum() behave differently for recently
aborted catalog tuples.

2) Invent a mechanism to stop a specific logical decoding activity
midway. The reason to stop could be a concurrent abort, a global
transaction manager deciding to roll back, or something else entirely.

ISTM that, for (2), if (1) is able to leave the recently aborted tuples
around for a little while (we only really need them while the decode of
the current change record is ongoing), then we could
accomplish it via a callback. This callback should be called before
commencing decode and network send of each change record. In case of
in-core logical decoding, the callback for pgoutput could check for
the transaction having aborted (a call to TransactionIdDidAbort() or
a similar function); additional logic can be added as needed for
various scenarios. If it's aborted, we will abandon decoding and send
an ABORT to the subscribers before returning.
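
To sketch what such a callback might look like (purely illustrative: no
such hook exists today, and the name and placement are assumptions):

#include "access/transam.h"

/*
 * Called before decoding and sending each change record of the
 * transaction being streamed.  Returning false means "stop decoding";
 * the caller would then send an ABORT downstream instead of further
 * changes.
 */
static bool
continue_decoding_cb(TransactionId xid)
{
	if (TransactionIdDidAbort(xid))
		return false;		/* concurrently aborted: abandon decoding */

	return true;
}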

Regards,
Nikhils



Re: es_query_dsa is broken

2017-11-30 Thread Thomas Munro
On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas  wrote:
>> Better ideas?
>
> How about this:
>
> 1. Remove es_query_dsa altogether.
> 2. Add a dsa_area * to ExecParallelInitializeDSMContext.
> 3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate argument to
> the per-node-type function.
> 4. If the per-node-type function cares about the dsa_area *, it is
> responsible for saving a pointer to it in the PlanState node.
> 5. Also pass it down via the ParallelWorkerContext.
>
> In v10 we might need to go with a solution like what you've sketched
> here, because Tom will complain about breaking binary compatibility
> with EState (and maybe other PlanState nodes) in a released branch.

I will post both versions.  I've been stuck for a while now trying to
come up with a query that actually breaks, so I can show that it's
fixed...

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] create_unique_path and GEQO

2017-11-30 Thread Ashutosh Bapat
On Thu, Nov 30, 2017 at 9:00 AM, Ashutosh Bapat
 wrote:
> On Thu, Nov 30, 2017 at 8:50 AM, Michael Paquier
>  wrote:
>> On Fri, Mar 24, 2017 at 10:50 PM, Tom Lane  wrote:
>>> Ashutosh Bapat  writes:
> Do you have test case, which can reproduce the issue you
> explained above?
>>>
 No. It would require some surgery in standard_planner() to measure the
 memory consumed in the planner context OR build the code with
 SHOW_MEMORY_STATS defined and dump memory context statistics and check
 planner context memory usage. I don't think I can produce a testcase
 quickly right now. But then, I think the problem is quite apparent
 from the code inspection alone, esp. comparing what mark_dummy_rel()
 does with what create_unique_path() is doing.
>>>
>>> Yeah.  I think the code in mark_dummy_rel is newer and better-thought-out
>>> than what's in create_unique_path.  It probably makes sense to change over.
>>
>> This has remained unanswered for more than 8 months, so I am marking
>> this patch as returned with feedback.
>
> I am not sure what's there to answer in Tom's reply. He seems to be
> agreeing with my analysis. Correct me if I am wrong. If that's the
> case, I am expecting somebody to review the patch. If there are no
> review comments, some committer may commit it.
>

For now I have marked it as need reviewer and moved to the next commitfest.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



ASCII Null control character validation

2017-11-30 Thread Alexey Chernyshov
Hello, hackers!

I found that pg_verify_mbstr_len() in src/backend/utils/mb/wchar.c
reports the ASCII Null character (\000) as invalid. In my opinion, it
should pass validation. However, the ASCII Null character currently
breaks the line and the rest of the line goes missing; try:

INSERT INTO mytable VALUES (E'a\001b\000c and rest of line MIA');

Find patch attached. Am I wrong?

-- 
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From ee7b505792a4d7630dd5e20aca475825d11301a7 Mon Sep 17 00:00:00 2001
From: Alexey Chernyshov 
Date: Wed, 29 Nov 2017 15:35:10 +0300
Subject: [PATCH] Fix 0x00 symbol validation. Earlier it was considered as
 illegal by fast path for ASCII.

---
 src/backend/utils/mb/wchar.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/src/backend/utils/mb/wchar.c b/src/backend/utils/mb/wchar.c
index a5fdda4..62b1a7d 100644
--- a/src/backend/utils/mb/wchar.c
+++ b/src/backend/utils/mb/wchar.c
@@ -1922,18 +1922,12 @@ pg_verify_mbstr_len(int encoding, const char *mbstr, int len, bool noError)
 		int			l;
 
 		/* fast path for ASCII-subset characters */
-		if (!IS_HIGHBIT_SET(*mbstr))
+		if ((*mbstr >= 0x00) && (*mbstr <= 0x7F))
 		{
-			if (*mbstr != '\0')
-			{
-mb_len++;
-mbstr++;
-len--;
-continue;
-			}
-			if (noError)
-return -1;
-			report_invalid_encoding(encoding, mbstr, len);
+			mb_len++;
+			mbstr++;
+			len--;
+			continue;
 		}
 
 		l = (*mbverify) ((const unsigned char *) mbstr, len);
-- 
2.7.4