Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Noah Misch
On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
> On 5/31/17 23:54, Peter Eisentraut wrote:
> > On 5/29/17 22:01, Noah Misch wrote:
> >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> >>>  wrote:
>  Hi,
> 
>  so this didn't really move anywhere AFAICS, do we think the approach
>  I've chosen is good or do we want to do something else here?
> >>>
> >>> Can you add it to the open items list?
> >>
> >> [Action required within three days.  This is a generic notification.]
> > 
> > I have posted an update.  The next update will be Friday.
> 
> Andres is now working on this.  Let's give him a bit of time. ;-)

If you intended this as your soon-due status update, it is missing a mandatory
bit.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Alter subscription..SET - NOTICE message is coming for table which is already removed

2017-06-02 Thread Noah Misch
On Wed, May 31, 2017 at 10:15:50PM -0400, Peter Eisentraut wrote:
> On 5/30/17 13:25, Masahiko Sawada wrote:
> > I think the cause is that the relation status entry could be deleted
> > by ALTER SUBSCRIPTION REFRESH before the corresponding table sync worker
> > starts. Attached patch fixes the issues reported on this thread so far.
> 
> This looks like a reasonable change, but it conflicts with the change
> discussed on "logical replication - still unstable after all these
> months".  I think I'll deal with that one first.

The above-described topic is currently a PostgreSQL 10 open item.  You own
this open item.  Please observe the policy on open item ownership and send a
status update within three calendar days of this message.  Include a date for
your subsequent status update.




Re: [HACKERS] Server ignores contents of SASLInitialResponse

2017-06-02 Thread Noah Misch
On Tue, May 30, 2017 at 03:04:47AM +, Noah Misch wrote:
> On Thu, May 25, 2017 at 10:52:23AM -0400, Michael Paquier wrote:
> > On Thu, May 25, 2017 at 9:32 AM, Michael Paquier
> >  wrote:
> > > On Thu, May 25, 2017 at 8:51 AM, Heikki Linnakangas  
> > > wrote:
> > >> On 05/24/2017 11:33 PM, Michael Paquier wrote:
> > >>> I have noticed today that the server ignores completely the contents
> > >>> of SASLInitialResponse. ... Attached is a patch to fix the problem.
> > >>
> > >> Fixed, thanks!
> > >
> > > Thanks for the commit.
> > 
> > Actually, I don't think that we are completely done here. Using the
> > patch of upthread to enforce a failure on SASLInitialResponse, I see
> > that connecting without SSL causes the following error:
> > psql: FATAL:  password authentication failed for user "mpaquier"
> > But connecting with SSL returns that:
> > psql: duplicate SASL authentication request
> > 
> > I have not looked at that in detail yet, but it seems to me that we
> > should not take pg_SASL_init() twice in the scram authentication code
> > path in libpq for a single attempt.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Heikki,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
> 
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
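As background to the symptom Michael describes (pg_SASL_init() entered twice for one authentication attempt over SSL), the usual defense in a connection state machine is a guard that rejects a second initialization of the same exchange. A minimal, self-contained sketch of that pattern — the struct and field names here are hypothetical, not libpq's actual layout:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection state, loosely modeled on a libpq-style struct. */
typedef struct Conn
{
	void   *sasl_state;			/* non-NULL once a SASL exchange has started */
	int		auth_attempts;		/* diagnostic counter for this connection */
} Conn;

/* Begin a SASL exchange; fail if one is already in progress. */
static int
sasl_init_once(Conn *conn)
{
	static int dummy_state;		/* stand-in for a real SCRAM state object */

	if (conn->sasl_state != NULL)
		return -1;				/* "duplicate SASL authentication request" */

	conn->sasl_state = &dummy_state;
	conn->auth_attempts++;
	return 0;
}
```

The point is only that the second entry is detected from connection state rather than from which code path (SSL or not) reached the init function.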




Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Andres Freund
On 2017-06-02 23:27:55 -0400, Peter Eisentraut wrote:
> On 5/31/17 23:54, Peter Eisentraut wrote:
> > On 5/29/17 22:01, Noah Misch wrote:
> >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> >>>  wrote:
>  Hi,
> 
>  so this didn't really move anywhere AFAICS, do we think the approach
>  I've chosen is good or do we want to do something else here?
> >>>
> >>> Can you add it to the open items list?
> >>
> >> [Action required within three days.  This is a generic notification.]
> > 
> > I have posted an update.  The next update will be Friday.
> 
> Andres is now working on this.  Let's give him a bit of time. ;-)

Yep.

I've posted patches for most things in this thread over at
http://archives.postgresql.org/message-id/20170603002023.rcppadggfiaav377%40alap3.anarazel.de
because the proper fixes here are only possible after resolving the
discussion there.  The only thing I know of that's going to remain
afterwards is SIGHUP handling, for which there's already a patch in this
thread (although I want to prune its scope before committing).

I suspect the review for those will happen over the next few days; I
plan to push them afterwards.

- Andres




Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Peter Eisentraut
On 5/31/17 23:54, Peter Eisentraut wrote:
> On 5/29/17 22:01, Noah Misch wrote:
>> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
>>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
>>> wrote:
 Hi,

 so this didn't really move anywhere AFAICS, do we think the approach
 I've chosen is good or do we want to do something else here?
>>>
>>> Can you add it to the open items list?
>>
>> [Action required within three days.  This is a generic notification.]
> 
> I have posted an update.  The next update will be Friday.

Andres is now working on this.  Let's give him a bit of time. ;-)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-06-02 Thread Peter Eisentraut
On 6/2/17 15:08, Peter Eisentraut wrote:
> On 5/30/17 23:10, Peter Eisentraut wrote:
>> Here is a proposed solution that splits bgw_name into bgw_type and
>> bgw_name_extra.  bgw_type shows up in pg_stat_activity.backend_type.
>> Uses of application_name are removed, because they are no longer
>> necessary to identify the process type.
>>
>> This code appears to be buggy because I sometimes get NULL results of
>> the backend_type lookup, implying that it couldn't find the background
>> worker slot.  This needs another look.
> 
> I would like some more input on this proposal, especially from those
> who have engineered the extended pg_stat_activity content.
> 
> If we don't come to a quick conclusion on this, I'd be content to leave
> PG10 as is and put this patch into the next commit fest.

If there are no new insights into this by Monday, I will commit patches
that remove the setting of application_name, which was originally
complained about, and postpone the rest of this patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-06-02 Thread Peter Eisentraut
On 6/2/17 16:44, Petr Jelinek wrote:
> However, I am not sure about the bgw_name_extra. I think I would have
> preferred keeping a full bgw_name field, which would be used where the
> full name is needed, and bgw_type where only the worker type is used. The
> concatenation just doesn't sit well with me, especially if it requires
> bgw_name_extra to start with a space.

I see your point.  There are also some i18n considerations to think through.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-06-02 Thread Andres Freund
Hi,


On 2017-06-02 22:53:00 -0400, Tom Lane wrote:
> I think you've got enough on your plate.  I can take care of whatever
> we decide to do here.

Thanks.


> Andres Freund  writes:
> >> Another possibility is to say that we've broken this situation
> >> irretrievably and we should start throwing errors for SRFs in
> >> places where they'd be conditionally evaluated.  That's not real
> >> nice perhaps, but it's better than the way things are right now.
> 
> > I'd be ok with that too, but I don't really see a strong need so far.
> 
> The argument for this way is basically that it's better to break
> apps visibly than silently.

Right, I got that.


> The behavior for SRF-inside-CASE is
> not going to be the same as before even if we implement the fix
> I suggest above, and it's arguable that this new behavior is not
> at all intuitive.

Yea, I'm not a big fan of either the pre-v10 or the v10 behaviour of
SRFs inside coalesce/case.  Neither is really reasonable imo - I'm not
sure a reasonable behaviour even exists.  IIRC I'd argued in the
original SRF thread that we should just throw an error, and I think we'd
concluded that we'd not do so for now.


> I'm not really sure which way to jump, which is why I was hoping
> for some discussion here.

There not really being an intuitive behaviour seems to be a bit of a
reason to disallow it.  Another argument I can see is that it'll be
easier to allow it again later than to do the reverse.  But I think the
new behaviour can also be useful, and I suspect not that many people
will hit this...

- Andres




Re: [HACKERS] Error while creating subscription when server is running in single user mode

2017-06-02 Thread Peter Eisentraut
On 6/2/17 15:24, Andres Freund wrote:
> but that doesn't really solve the issue that
> logical rep pretty essentially requires multiple processes.

But it may be sensible to execute certain DDL commands for repair, which
is why I'm arguing for a finer-grained approach than just prohibiting
everything.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Error while creating subscription when server is running in single user mode

2017-06-02 Thread Peter Eisentraut
On 6/2/17 15:41, Tom Lane wrote:
> It's certainly plausible that we could have the latch code just ignore
> WL_POSTMASTER_DEATH if not IsUnderPostmaster.  I think that the original
> reasoning for not doing that was that the calling code should know which
> environment it's in, and not pass an unimplementable wait-exit reason;
> so silently ignoring the bit could mask a bug.  Perhaps that argument is
> no longer attractive.  Alternatively, we could fix the relevant call sites
> to do "(IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0)", and keep the strict
> behavior for the majority of call sites.

There are a lot of those call sites.  (And a lot of duplicate code for
what to do if postmaster death actually happens.)  I doubt we want to
check them all.

The attached patch fixes the reported issue for me.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 8f0e1c844aabd4a0e3c245180d9019cbf65b1601 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 2 Jun 2017 23:02:13 -0400
Subject: [PATCH] Ignore WL_POSTMASTER_DEATH latch event in single user mode

Otherwise code that uses this will abort with an assertion failure,
because postmaster_alive_fds are not initialized.

Reported-by: tushar 
---
 src/backend/storage/ipc/latch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 53e6bf2477..55959de91f 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -370,7 +370,7 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
AddWaitEventToSet(set, WL_LATCH_SET, PGINVALID_SOCKET,
  (Latch *) latch, NULL);
 
-   if (wakeEvents & WL_POSTMASTER_DEATH)
+   if (wakeEvents & WL_POSTMASTER_DEATH && IsUnderPostmaster)
AddWaitEventToSet(set, WL_POSTMASTER_DEATH, PGINVALID_SOCKET,
  NULL, NULL);
 
-- 
2.13.0
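For contrast, Tom's per-call-site alternative quoted above amounts to masking the flag at each caller rather than ignoring it centrally. A generic, self-contained sketch of that spelling — these are toy definitions, not PostgreSQL's real latch API:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the wake-event flags used by the latch code. */
#define WL_LATCH_SET		(1 << 0)
#define WL_POSTMASTER_DEATH	(1 << 1)

static bool IsUnderPostmaster = false;	/* false: single-user mode */

/*
 * Tom's suggested per-call-site spelling: only request the
 * postmaster-death event when there actually is a postmaster to watch,
 * keeping the strict "unimplementable wait-exit reason is a bug" check
 * in the latch code itself.
 */
static int
wait_events_for(int base)
{
	return base | (IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0);
}
```

Peter's objection above is precisely that this expression would have to be repeated at many call sites, which is why the committed-style fix drops the event inside WaitLatchOrSocket instead.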




Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-06-02 Thread Tom Lane
Andres Freund  writes:
> On 2017-05-28 14:03:26 -0400, Tom Lane wrote:
>> I think it would be possible to teach eval_const_expressions that
>> it must not discard CASE/COALESCE subexpressions that contain SRFs,
>> which would preserve the rule that expression simplification doesn't
>> change the query semantics.

> That sounds like a good idea.  Do you want to write up a patch, or
> should I?  I can, but I'd want to finish the walsender panic and other
> signal handling stuff first (mostly waiting for review for now).

I think you've got enough on your plate.  I can take care of whatever
we decide to do here.  What I was looking for was opinions on which
way to address it.

>> Another possibility is to say that we've broken this situation
>> irretrievably and we should start throwing errors for SRFs in
>> places where they'd be conditionally evaluated.  That's not real
>> nice perhaps, but it's better than the way things are right now.

> I'd be ok with that too, but I don't really see a strong need so far.

The argument for this way is basically that it's better to break
apps visibly than silently.  The behavior for SRF-inside-CASE is
not going to be the same as before even if we implement the fix
I suggest above, and it's arguable that this new behavior is not
at all intuitive.

I'm not really sure which way to jump, which is why I was hoping
for some discussion here.

regards, tom lane
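Tom's proposed eval_const_expressions change can be pictured with a toy constant-folder that refuses to discard a CASE arm containing an SRF. Everything below (node layout, function names) is illustrative only, not PostgreSQL's real expression code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A toy expression tree: constants, a set-returning call, and CASE. */
typedef enum { T_CONST, T_SRF, T_CASE } NodeTag;

typedef struct Expr
{
	NodeTag		 tag;
	int			 value;			/* for T_CONST */
	bool		 cond;			/* for T_CASE: WHEN condition, already known */
	struct Expr *then_arm;
	struct Expr *else_arm;
} Expr;

static bool
contains_srf(const Expr *e)
{
	if (e == NULL)
		return false;
	if (e->tag == T_SRF)
		return true;
	if (e->tag == T_CASE)
		return contains_srf(e->then_arm) || contains_srf(e->else_arm);
	return false;
}

/*
 * Simplify a CASE whose condition is already constant.  Under the
 * proposed rule, simplification must not discard a subexpression that
 * contains an SRF, so such a CASE is left intact for the executor.
 */
static Expr *
fold_case(Expr *e)
{
	if (e->tag != T_CASE)
		return e;
	if (contains_srf(e->then_arm) || contains_srf(e->else_arm))
		return e;				/* refuse to simplify */
	return e->cond ? e->then_arm : e->else_arm;
}
```

The point is that simplification stays semantics-preserving: an SRF that was conditionally evaluated before folding is still conditionally evaluated after it.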




Re: [HACKERS] logical replication - still unstable after all these months

2017-06-02 Thread Petr Jelinek
On 03/06/17 04:45, Mark Kirkwood wrote:
> On 03/06/17 11:10, Petr Jelinek wrote:
> 
>> On 02/06/17 22:29, Petr Jelinek wrote:
>>> On 02/06/17 08:55, Mark Kirkwood wrote:
 On 02/06/17 17:11, Erik Rijkers wrote:

> On 2017-06-02 00:46, Mark Kirkwood wrote:
>> On 31/05/17 21:16, Petr Jelinek wrote:
>>
>> I'm seeing a new failure with the patch applied - this time the
>> history table has missing rows. Petr, I'll put back your access :-)
> Is this error during 1-minute runs?
>
> I'm asking because I've moved back to longer (1-hour) runs (no errors
> so far), and I'd like to keep track of what the most 'vulnerable'
> parameters are.
>
 Yeah, still using your test config (with my minor modifications).

 When I got the error the 1st time, I did a complete make clean and
 rebuild... but it is still possible I've 'done it wrong' - so
 independent confirmation would be good!
>>> Well, I've seen this issue as well while I was developing the fix, but
>>> the patch I proposed fixed it for me as well as the original issue.
>>>
>> While I was testing something for a different thread I noticed that I
>> manage transactions incorrectly in this patch. Fixed here, I didn't test
>> it much yet (it takes a while as you know :) ). Not sure if it's related
>> to the issue you've seen though.
>>
>>
> Ok - I've applied this version, and running tests again. I needed to do
> a git pull to apply the patch, so getting some other changes too!
> 

Thanks, yes, I forgot to mention that I rebased it against the current
HEAD as well.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical replication busy-waiting on a lock

2017-06-02 Thread Petr Jelinek
On 03/06/17 03:25, Andres Freund wrote:
> 
>> That should solve the original problem reported here.
> 
> Did you verify that?
>

Yes, I tried to manually create multiple exported logical decoding
snapshots in parallel and read data from them, and it worked fine,
whereas it blocked before.

> 
>> @@ -1741,17 +1741,17 @@ GetSerializableTransactionSnapshotInt(Snapshot 
>> snapshot,
>>  } while (!sxact);
>>  
>>  /* Get the snapshot, or check that it's safe to use */
>> -if (!TransactionIdIsValid(sourcexid))
>> +if (!sourcevxid)
>>  snapshot = GetSnapshotData(snapshot);
>> -else if (!ProcArrayInstallImportedXmin(snapshot->xmin, sourcexid))
>> +else if (!ProcArrayInstallImportedXmin(snapshot->xmin, sourcevxid))
>>  {
>>  ReleasePredXact(sxact);
>>  LWLockRelease(SerializableXactHashLock);
>>  ereport(ERROR,
>>  
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>   errmsg("could not import the requested 
>> snapshot"),
>> -   errdetail("The source transaction %u is not running 
>> anymore.",
>> - sourcexid)));
>> +   errdetail("The source virtual transaction %d/%u is not running 
>> anymore.",
>> + sourcevxid->backendId, 
>> sourcevxid->localTransactionId)));
> 
> Hm, this is a bit harder to read.  Wonder if we should add a pid field, that'd
> make it a bit easier to interpret?
> 

Agreed, see attached. We have to pass the pid around a bit but I don't
think it's too bad.

One thing I don't like is the GetLastLocalTransactionId() that I had to
add, because we clear proc->lxid before we get to AtEOXact_Snapshot(),
but I don't have a better solution there.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


0002-Don-t-assign-xid-to-logical-decoding-snapshots.patch
Description: invalid/octet-stream


0001-Use-virtual-transaction-instead-of-normal-ones-in-ex.patch
Description: invalid/octet-stream



Re: [HACKERS] logical replication - still unstable after all these months

2017-06-02 Thread Mark Kirkwood

On 03/06/17 11:10, Petr Jelinek wrote:
> On 02/06/17 22:29, Petr Jelinek wrote:
>> On 02/06/17 08:55, Mark Kirkwood wrote:
>>> On 02/06/17 17:11, Erik Rijkers wrote:
>>>> On 2017-06-02 00:46, Mark Kirkwood wrote:
>>>>> On 31/05/17 21:16, Petr Jelinek wrote:
>>>>>
>>>>> I'm seeing a new failure with the patch applied - this time the
>>>>> history table has missing rows. Petr, I'll put back your access :-)
>>>>
>>>> Is this error during 1-minute runs?
>>>>
>>>> I'm asking because I've moved back to longer (1-hour) runs (no errors
>>>> so far), and I'd like to keep track of what the most 'vulnerable'
>>>> parameters are.
>>>
>>> Yeah, still using your test config (with my minor modifications).
>>>
>>> When I got the error the 1st time, I did a complete make clean and
>>> rebuild... but it is still possible I've 'done it wrong' - so
>>> independent confirmation would be good!
>>
>> Well, I've seen this issue as well while I was developing the fix, but
>> the patch I proposed fixed it for me as well as the original issue.
>
> While I was testing something for a different thread I noticed that I
> manage transactions incorrectly in this patch. Fixed here, I didn't test
> it much yet (it takes a while as you know :) ). Not sure if it's related
> to the issue you've seen though.


Ok - I've applied this version, and running tests again. I needed to do 
a git pull to apply the patch, so getting some other changes too!


Cheers

Mark





Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

2017-06-02 Thread Andres Freund
Hi,

On 2017-05-28 14:03:26 -0400, Tom Lane wrote:
> I think it would be possible to teach eval_const_expressions that
> it must not discard CASE/COALESCE subexpressions that contain SRFs,
> which would preserve the rule that expression simplification doesn't
> change the query semantics.

That sounds like a good idea.  Do you want to write up a patch, or
should I?  I can, but I'd want to finish the walsender panic and other
signal handling stuff first (mostly waiting for review for now).


> Another possibility is to say that we've broken this situation
> irretrievably and we should start throwing errors for SRFs in
> places where they'd be conditionally evaluated.  That's not real
> nice perhaps, but it's better than the way things are right now.

I'd be ok with that too, but I don't really see a strong need so far.

Greetings,

Andres Freund




Re: [HACKERS] ALTER SUBSCRIPTION ..SET PUBLICATION refresh is not throwing error.

2017-06-02 Thread Peter Eisentraut
On 5/27/17 06:54, Petr Jelinek wrote:
> On 27/05/17 04:00, Euler Taveira wrote:
>> 2017-05-26 21:29 GMT-03:00 Petr Jelinek > >:
>>
>>
>> Actually another possibility would be to remove the REFRESH keyword
>> completely and just have [ WITH (...) ] and have the refresh option
>> there, ie simplified version of what you have suggested (without the
>> ugliness of specifying refresh twice to disable).
>>
>>
>> It will cause confusion. It seems that WITH sets ALTER SUBSCRIPTION
>> properties. Indeed, they are REFRESH properties. I think we shouldn't
>> exclude REFRESH keyword. Syntax leaves no doubt that WITH are REFRESH
>> properties.
>>
> 
> Maybe, I don't know, it might not be that confusing when SET PUBLICATION
> and REFRESH PUBLICATION have same set of WITH options.

I'm not sure what the conclusion from the above discussion was supposed
to be, but here is a patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 697c4cbdd386a4bd856614de6cedf5af8d1b63be Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 2 Jun 2017 21:59:07 -0400
Subject: [PATCH] Fix ALTER SUBSCRIPTION grammar ambiguity

There was a grammar ambiguity between SET PUBLICATION name REFRESH and
SET PUBLICATION SKIP REFRESH, because SKIP is not a reserved word.  To
resolve that, fold the refresh choice into the WITH options.  Refreshing
is the default now.

Author: tushar 
---
 doc/src/sgml/catalogs.sgml |  2 +-
 doc/src/sgml/ref/alter_subscription.sgml   | 35 --
 src/backend/commands/subscriptioncmds.c| 34 +
 src/backend/parser/gram.y  | 14 ++--
 src/include/nodes/parsenodes.h |  1 -
 src/test/regress/expected/subscription.out |  2 +-
 src/test/regress/sql/subscription.sql  |  2 +-
 src/test/subscription/t/001_rep_changes.pl |  2 +-
 8 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index b2fae027f5..5723be744d 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6609,7 +6609,7 @@ 
pg_subscription_rel
   
This catalog only contains tables known to the subscription after running
either CREATE SUBSCRIPTION or
-   ALTER SUBSCRIPTION ... REFRESH.
+   ALTER SUBSCRIPTION ... REFRESH PUBLICATION.
   
 
   
diff --git a/doc/src/sgml/ref/alter_subscription.sgml 
b/doc/src/sgml/ref/alter_subscription.sgml
index a3471a0442..bead99622e 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -22,7 +22,7 @@
  
 
 ALTER SUBSCRIPTION name 
CONNECTION 'conninfo'
-ALTER SUBSCRIPTION name SET 
PUBLICATION publication_name [, 
...] { REFRESH [ WITH ( refresh_option [= value] [, ... ] ) ] | SKIP REFRESH }
+ALTER SUBSCRIPTION name SET 
PUBLICATION publication_name [, 
...] [ WITH ( set_publication_option [= value] [, ... ] ) ]
 ALTER SUBSCRIPTION name REFRESH 
PUBLICATION [ WITH ( refresh_option [= value] [, ... ] ) ]
 ALTER SUBSCRIPTION name ENABLE
 ALTER SUBSCRIPTION name DISABLE
@@ -80,18 +80,29 @@ Parameters
  
   Changes list of subscribed publications. See
for more information.
+  By default this command will also act like REFRESH
+  PUBLICATION.
  
 
  
-  When REFRESH is specified, this command will also act
-  like REFRESH
-  PUBLICATION.  refresh_option specifies
-  additional options for the refresh operation, as described
-  under REFRESH PUBLICATION.  When
-  SKIP REFRESH is specified, the command will not try
-  to refresh table information.  Note that
-  either REFRESH or SKIP REFRESH
-  must be specified.
+  set_publication_option specifies additional
+  options for this operation.  The supported options are:
+
+  
+   
+refresh (boolean)
+
+ 
+  When false, the command will not try to refresh table information.
+  REFRESH PUBLICATION should then be executed 
separately.
+  The default is true.
+ 
+
+   
+  
+
+  Additionally, refresh options as described
+  under REFRESH PUBLICATION may be specified.
  
 

@@ -107,7 +118,7 @@ Parameters
  
 
  
-  refresh_option specifies additional options for the
+  refresh_option specifies additional options 
for the
   refresh operation.  The supported options are:
 
   
@@ -185,7 +196,7 @@ Examples
Change the publication subscribed by a subscription to
insert_only:
 
-ALTER SUBSCRIPTION mysub SET PUBLICATION insert_only REFRESH;
+ALTER SUBSCRIPTION mysub SET PUBLICATION insert_only;
 
   
 
diff --git a/src/backend/commands/subscriptioncmds.c 
b/src/backend/commands/subscriptioncmds.c
index 86eb31df93..ad98b38efe 

Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-02 Thread Thomas Munro
On Sat, Jun 3, 2017 at 1:20 AM, Thomas Munro
 wrote:
> On Sat, Jun 3, 2017 at 12:10 AM, Thomas Munro
>  wrote:
>> I would have expected that to be handled by separate transition tables
>> at different query levels, but evidently it isn't.
>
> I am wondering if we need to wrap the execution of a modifying CTE in
> AfterTriggerBeginQuery() and AfterTriggerEndQuery().  But I'm not sure
> where, and it may be a couple of days before I can dig some more.

So, afterTriggers.query_stack is used to handle the reentrancy that
results from triggers running further statements that might fire
triggers.  It isn't used for dealing with extra ModifyTable nodes that
can appear in a plan because of wCTEs.  Could it also be used for that
purpose?  I think that would only work if it is the case that each
ModifyTable node begins and then runs to completion (ie no interleaving
of wCTE execution) and then its AS trigger fires, which I'm not sure
about.  If that is the case, perhaps AfterTriggerBeginQuery and
AfterTriggerEndQuery could become the responsibility of
nodeModifyTable.c rather than execMain.c.  I haven't tried this yet
and it may well be too ambitious at this stage.

Other ideas: (1) ban wCTEs that target relations with transition
tables in PG10, because we're out of time; (2) find an entirely new
way to keep track of the current active transition table(s) to replace
the current stack approach, such as including them in the
TransitionCaptureState object in the patch that I proposed to fix the
nearby inheritance bugs[1].

Stepping back and squinting a bit, both this and the inheritance bug
stem from a failure to handle multiple ModifyTable nodes in a plan,
but the solutions are approximately opposite: in the inheritance case,
the solution I propose is to teach them all to coordinate their tuple
capture so that it's in a common format, and in the wCTE case the
solution must surely be to ensure that their state is kept separate.
The question I'd like to figure out is whether the existing
AfterTriggerBeginQuery/AfterTriggerEndQuery stack is the right data
structure for that task, considering the control flow when CTEs are
executed, which I need to learn more about.

Thoughts?

[1] 
https://www.postgresql.org/message-id/flat/CA%2BTgmoZzTBBAsEUh4MazAN7ga%3D8SsMC-Knp-6cetts9yNZUCcg%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com
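For readers following along, the query_stack mechanism Thomas describes above is essentially a push/pop of per-query-level after-trigger state, with one begin/end pair per query level (or, in his speculation, per ModifyTable node). A self-contained toy sketch of that shape, with illustrative names only — this is not trigger.c's actual data structure:

```c
#include <assert.h>

#define MAX_QUERY_DEPTH 8

/* Per-query-level after-trigger state, one entry per nesting level. */
typedef struct QueryTriggerState
{
	int			pending_events;	/* events queued at this query level */
} QueryTriggerState;

static QueryTriggerState query_stack[MAX_QUERY_DEPTH];
static int	query_depth = -1;	/* -1: no query active */

/* Push a fresh state frame, as AfterTriggerBeginQuery does. */
static void
after_trigger_begin_query(void)
{
	assert(query_depth + 1 < MAX_QUERY_DEPTH);
	query_stack[++query_depth].pending_events = 0;
}

/* Queue an event at the current (innermost) query level. */
static void
after_trigger_queue_event(void)
{
	query_stack[query_depth].pending_events++;
}

/* Fire (here: just count and discard) this level's events, then pop. */
static int
after_trigger_end_query(void)
{
	int			fired = query_stack[query_depth].pending_events;

	query_depth--;
	return fired;
}
```

The open question in the mail is whether a wCTE's ModifyTable node can safely get its own begin/end pair on this stack, which requires that its execution not interleave with the outer query's.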




Re: [HACKERS] logical replication busy-waiting on a lock

2017-06-02 Thread Andres Freund
Hi,

On 2017-06-03 03:03:20 +0200, Petr Jelinek wrote:
> All right, here is my rough attempt at doing what Andres has suggested,
> going straight from xid to vxid, seems to work fine in my tests and
> parallel pg_dump seems happy with it as well.

Cool!


> The second patch removes the xid parameter from SnapBuildBuildSnapshot
> as it's not used for anything and skips the GetTopTransactionId() call
> as well.

Makes sense.


> That should solve the original problem reported here.

Did you verify that?


> From 4f7eb5b8fdbab2539731414e81d7a7419a83e5b1 Mon Sep 17 00:00:00 2001
> From: Petr Jelinek 
> Date: Sat, 3 Jun 2017 02:06:22 +0200
> Subject: [PATCH 1/2] Use virtual transaction instead of normal ones in
>  exported snapshots

> @@ -1817,8 +1818,10 @@ ProcArrayInstallImportedXmin(TransactionId xmin, 
> TransactionId sourcexid)
>   if (pgxact->vacuumFlags & PROC_IN_VACUUM)
>   continue;
>  
> - xid = pgxact->xid;  /* fetch just once */

Hm, I don't quite understand what the 'fetch just once' bit was trying
to achieve here (not to speak of the fact that it's simply not
guaranteed this way).  Therefore I think you're right to simply
disregard it.


> @@ -1741,17 +1741,17 @@ GetSerializableTransactionSnapshotInt(Snapshot 
> snapshot,
>   } while (!sxact);
>  
>   /* Get the snapshot, or check that it's safe to use */
> - if (!TransactionIdIsValid(sourcexid))
> + if (!sourcevxid)
>   snapshot = GetSnapshotData(snapshot);
> - else if (!ProcArrayInstallImportedXmin(snapshot->xmin, sourcexid))
> + else if (!ProcArrayInstallImportedXmin(snapshot->xmin, sourcevxid))
>   {
>   ReleasePredXact(sxact);
>   LWLockRelease(SerializableXactHashLock);
>   ereport(ERROR,
>   
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>errmsg("could not import the requested 
> snapshot"),
> -errdetail("The source transaction %u is not running 
> anymore.",
> -  sourcexid)));
> +errdetail("The source virtual transaction %d/%u is not running 
> anymore.",
> +  sourcevxid->backendId, 
> sourcevxid->localTransactionId)));

Hm, this is a bit harder to read.  Wonder if we should add a pid field, that'd
make it a bit easier to interpret?


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication busy-waiting on a lock

2017-06-02 Thread Petr Jelinek
On 02/06/17 03:38, Robert Haas wrote:
> On Thu, Jun 1, 2017 at 2:28 PM, Andres Freund  wrote:
>> On 2017-06-01 14:17:44 +0200, Petr Jelinek wrote:
>>> Thinking more about this, I am not convinced it's a good idea to change
>>> exports this late in the cycle. I still think it's best to do the xid
>>> assignment only when the snapshot is actually exported but don't assign
>>> xid when the export is only used by the local session (the new option in
>>> PG10). That's one line change which impacts only logical
>>> replication/decoding as opposed to everything else which uses exported
>>> snapshots.
>>
>> I'm not quite convinced by this argument.  Exported snapshot contents
>> are ephemeral, we can change the format at any time.  The wait is fairly
> >> annoying for every user of logical decoding.  For me the combination of
> >> those two facts implies that we should rather fix this properly.
> 
> +1.  The change Andres is proposing doesn't sound like it will be
> terribly high-impact, and I think we'll be happier in the long run if
> we install real fixes instead of kludging it.
> 

All right, here is my rough attempt at doing what Andres suggested,
going straight from xid to vxid; it seems to work fine in my tests, and
parallel pg_dump seems happy with it as well.

The second patch removes the xid parameter from SnapBuildBuildSnapshot
as it's not used for anything and skips the GetTopTransactionId() call
as well. That should solve the original problem reported here.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


0002-Don-t-assign-xid-to-logical-decoding-snapshots.patch
Description: invalid/octet-stream


0001-Use-virtual-transaction-instead-of-normal-ones-in-ex.patch
Description: invalid/octet-stream



Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-06-02 Thread Andres Freund
On 2017-06-01 17:29:12 -0700, Andres Freund wrote:
> On 2017-06-02 08:38:51 +0900, Michael Paquier wrote:
> > On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund  wrote:
> > > I'm unhappy with how this is reusing SIGINT for WalSndLastCycleHandler.
> > > Normally INT is used for cancel interrupts, and since walsender is now also
> > > working as a normal backend, this overlap is bad.  Even for plain
> > > walsender backends this seems bad, because now non-superuser replication
> > > users will terminate replication connections when they do
> > > pg_cancel_backend().  For replication=dbname users it's especially bad
> > > because there can be completely legitimate uses of pg_cancel_backend().
> >
> > Signals for WAL senders are set in WalSndSignals() which uses SIG_IGN
> > for SIGINT now in ~9.6, and StatementCancelHandler does not get set up
> > for a non-am_walsender backend. Am I missing something?
> 
> Yes, but nothing in those observations actually addresses my point.
> 
> Some points:
> 
> 1) 086221cf6b1727c2baed4703c582f657b7c5350e changes things so walsender
>backends use SIGINT for WalSndLastCycleHandler(), which is now
>triggerable by pg_cancel_backend(). Especially for logical rep
>walsenders it's not absurd sending that.
> 2) Walsenders now can run normal queries.
> 3) Afaict 086221cf6b1727c2baed4703c582f657b7c5350e doesn't really
>address the PANIC problem for database connected walsenders at all,
>because it doesn't even cancel non-replication commands.  I.e. an
>already running query can just continue to run.  Which afaict just
>entirely breaks shutdown.  If the connection is idle, or running a
>query, we'll just wait forever.
> 4) the whole logic introduced in the above commit doesn't actually
>appear to handle logical decoding senders properly - wasn't the whole
>issue at hand that those can write WAL in some case?  But
>nevertheless WalSndWaitForWal() does a
>WalSndSetState(WALSNDSTATE_STOPPING); *and then continues decoding
>and waiting* - which seems to obviate the entire point of that commit.
> 
> I'm working on a patch rejiggering things so:
> 
> a) upon shutdown checkpointer (so we can use procsignal), not
>postmaster, sends PROCSIG_WALSND_INIT_STOPPING to all walsenders (so
>we don't have to use up a normal signal handler)
> b) Upon reception walsenders immediately exit if !replication_active,
>otherwise sets got_STOPPING
> c) Logical decoding will finish if got_STOPPING is sent, *WITHOUT*, as
>currently done, moving to WALSNDSTATE_STOPPING.  Not yet quite sure
>how to best handle the WALSNDSTATE_STOPPING transition in WalSndLoop().
> d) Once all remaining walsenders are in stopping state, postmaster sends
>SIGUSR2 to trigger shutdown (basically as before)
> 
> Does that seem to make sense?

Attached is a *preliminary* patch series implementing this.  I've first
reverted the previous patch, as otherwise backpatchable versions of the
necessary patches would get too complicated, due to the signals used and
such.

This also fixes several of the issues from the somewhat related thread at
http://archives.postgresql.org/message-id/20170421014030.fdzvvvbrz4nckrow%40alap3.anarazel.de

I'm not perfectly happy with the use of XLogBackgroundFlush(), but we
don't currently expose anything else to flush all pending WAL afaics -
it's not too bad either.  Without it we can end up waiting forever if
the last XLogInserts were done by an asynchronously committing backend,
or the relevant backends exited before getting to flush out their
records, because walwriter has already been shut down at that point.

Comments?

- Andres
>From 187f99cdf98886be954cd1edda275c51b83da5ef Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Fri, 2 Jun 2017 14:14:34 -0700
Subject: [PATCH 1/4] Revert "Prevent panic during shutdown checkpoint"

This reverts commit 086221cf6b1727c2baed4703c582f657b7c5350e, which
was made to master only.

The approach implemented in the above commit has some issues.  While
this could easily be fixed incrementally, doing so would make
backpatching considerably harder, so instead first revert this patch.

Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxp...@alap3.anarazel.de
---
 doc/src/sgml/monitoring.sgml|   5 -
 src/backend/access/transam/xlog.c   |   6 --
 src/backend/postmaster/postmaster.c |   7 +-
 src/backend/replication/walsender.c | 143 
 src/include/replication/walsender.h |   1 -
 src/include/replication/walsender_private.h |   3 +-
 6 files changed, 24 insertions(+), 141 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 79ca45a156..5640c0d84a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1690,11 +1690,6 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
backup: This 

Re: [HACKERS] logical replication - still unstable after all these months

2017-06-02 Thread Petr Jelinek
On 02/06/17 22:29, Petr Jelinek wrote:
> On 02/06/17 08:55, Mark Kirkwood wrote:
>> On 02/06/17 17:11, Erik Rijkers wrote:
>>
>>> On 2017-06-02 00:46, Mark Kirkwood wrote:
 On 31/05/17 21:16, Petr Jelinek wrote:

 I'm seeing a new failure with the patch applied - this time the
 history table has missing rows. Petr, I'll put back your access :-)
>>>
>>> Is this error during 1-minute runs?
>>>
>>> I'm asking because I've moved back to longer (1-hour) runs (no errors
>>> so far), and I'd like to keep track of what the most 'vulnerable'
>>> parameters are.
>>>
>>
>> Yeah, still using your test config (with my minor modifications).
>>
>> When I got the error the 1st time, I did a complete make clean and
>> rebuild... but it is still possible I've 'done it wrong' - so
>> independent confirmation would be good!
> 
> Well, I've seen this issue as well while I was developing the fix, but
> the patch I proposed fixed it for me as well as the original issue.
> 

While I was testing something for a different thread I noticed that I
manage transactions incorrectly in this patch. That's fixed here; I
didn't test it much yet (it takes a while, as you know :) ). Not sure
if it's related to the issue you've seen though.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Improve-handover-logic-between-sync-and-apply-worker-v2.patch
Description: invalid/octet-stream



Re: [HACKERS] walsender termination error messages worse in v10

2017-06-02 Thread Petr Jelinek
On 02/06/17 23:45, Andres Freund wrote:
> Hi Petr,
> 
> On 2017-06-02 22:57:37 +0200, Petr Jelinek wrote:
>> On 02/06/17 20:51, Andres Freund wrote:
>>> I don't understand why the new block is there, nor does the commit
>>> message explain it.
>>>
>>
>> Hmm, that particular change can actually be reverted. It was needed for
>> one of those custom replication commands which were replaced by normal
>> query support. I missed it during the rewrite.
> 
> Doesn't appear to be quite that simple, I get regression test failures
> in that case.
> 

Hmm, looks like we still use it for normal COPY handling. So basically
the problem is this: if we run COPY TO STDOUT and then consume it using
libpqrcv_receive, it will end with a normal PGRES_COMMAND_OK, and we
need to call PQgetResult() in that case, otherwise libpq thinks the
command is still active and any following command will fail. But if we
call PQgetResult() on a dead connection, we get the error you complained
about.

I guess it would make sense to do a conditional exit on
(PQstatus(streamConn) == CONNECTION_BAD) like libpqrcv_PQexec does. It's
quite ugly code-wise though.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


libpqwalsender-disconnect-fix.diff
Description: invalid/octet-stream



Re: [HACKERS] walsender termination error messages worse in v10

2017-06-02 Thread Andres Freund
Hi Petr,

On 2017-06-02 22:57:37 +0200, Petr Jelinek wrote:
> On 02/06/17 20:51, Andres Freund wrote:
> > I don't understand why the new block is there, nor does the commit
> > message explain it.
> > 
> 
> Hmm, that particular change can actually be reverted. It was needed for
> one of those custom replication commands which were replaced by normal
> query support. I missed it during the rewrite.

Doesn't appear to be quite that simple, I get regression test failures
in that case.

Greetings,

Andres Freund




Re: [HACKERS] Directory pg_replslot is not properly cleaned

2017-06-02 Thread Fabrízio de Royes Mello
On Fri, Jun 2, 2017 at 6:32 PM, Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:
>
> Hi all,
>
> This week I faced an out-of-disk-space problem in an 8TB production cluster.
During the investigation we noticed that pg_replslot was the culprit, growing
more than 1TB in less than one hour.
>
> We're using PostgreSQL 9.5.6 with pglogical 1.2.2 replicating to a new
9.6 instance and planning the upgrade soon.
>
> What did I do? I freed some disk space just to start up PostgreSQL and begin
the investigation. During the 'startup recovery' the files inside pg_replslot
were simply removed entirely, so our 'out of disk space' trouble disappeared.
Then the server came up and physical slaves attached normally to the master,
but logical slaves didn't, staying stalled in the 'catchup'
state.
>
> At that moment the "pg_replslot" directory started growing fast again, which
forced us to drop the logical replication slot, and we lost the logical
slave.
>
> Googling for a while, I found this thread [1] about a similar issue reported
by Dmitriy Sarafannikov, with replies from Andres and Álvaro.
>
> I ran the test case provided by Dmitriy [1] against branches:
> - REL9_4_STABLE
> - REL9_5_STABLE
> - REL9_6_STABLE
> - master
>
> After all tests the issue remains... also when using the new logical
replication stuff (CREATE PUBLICATION/CREATE SUBSCRIPTION). Only just after a
restart was "pg_replslot" properly cleaned. The typo in ReorderBufferIterTXNInit
that Dmitriy complained about was fixed, but the issue remains.
>
> It seems no one complained about this issue again, and the thread was lost.
>
> Attached is a reworked version of Dmitriy's patch that seems to solve
the issue. I confess I don't know enough about the replication slots code to
really know whether it's the best solution.
>
> Regards,
>
> [1]
https://www.postgresql.org/message-id/1457621358.355011041%40f382.i.mail.ru
>

Just adding Dmitriy to the conversation... the previous email address I provided was wrong.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


[HACKERS] Directory pg_replslot is not properly cleaned

2017-06-02 Thread Fabrízio de Royes Mello
Hi all,

This week I faced an out-of-disk-space problem in an 8TB production cluster.
During the investigation we noticed that pg_replslot was the culprit, growing
more than 1TB in less than one hour.

We're using PostgreSQL 9.5.6 with pglogical 1.2.2 replicating to a new 9.6
instance and planning the upgrade soon.

What did I do? I freed some disk space just to start up PostgreSQL and begin
the investigation. During the 'startup recovery' the files inside pg_replslot
were simply removed entirely, so our 'out of disk space' trouble disappeared.
Then the server came up and physical slaves attached normally to the master,
but logical slaves didn't, staying stalled in the 'catchup'
state.

At that moment the "pg_replslot" directory started growing fast again, which
forced us to drop the logical replication slot, and we lost the logical
slave.

Googling for a while, I found this thread [1] about a similar issue reported
by Dmitriy Sarafannikov, with replies from Andres and Álvaro.

I ran the test case provided by Dmitriy [1] against branches:
- REL9_4_STABLE
- REL9_5_STABLE
- REL9_6_STABLE
- master

After all tests the issue remains... also when using the new logical
replication stuff (CREATE PUBLICATION/CREATE SUBSCRIPTION). Only just after a
restart was "pg_replslot" properly cleaned. The typo in ReorderBufferIterTXNInit
that Dmitriy complained about was fixed, but the issue remains.

It seems no one complained about this issue again, and the thread was lost.

Attached is a reworked version of Dmitriy's patch that seems to solve the
issue. I confess I don't know enough about the replication slots code to
really know whether it's the best solution.

Regards,

[1]
https://www.postgresql.org/message-id/1457621358.355011041%40f382.i.mail.ru

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 524946a..a538715 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1142,7 +1142,7 @@ ReorderBufferCleanupTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
 	Assert(found);
 
 	/* remove entries spilled to disk */
-	if (txn->nentries != txn->nentries_mem)
+	if (txn->nentries != txn->nentries_mem || txn->is_known_as_subxact)
 		ReorderBufferRestoreCleanup(rb, txn);
 
 	/* deallocate */



Re: [HACKERS] PG 10 release notes

2017-06-02 Thread Jim Nasby

On 4/24/17 8:31 PM, Bruce Momjian wrote:

I have committed the first draft of the Postgres 10 release notes.  They
are current as of two days ago, and I will keep them current.  Please
give me any feedback you have.

The only unusual thing is that this release has ~180 items while most
recent releases have had ~220.  The pattern I see is that there are more
large features in this release than in previous ones.


Can you change the attribution on

Allow PL/Tcl functions to return composite types and sets

to Karl Lehenbauer? He actually wrote the original patch; I just helped 
to get it through the community (something that FlightAware paid for). I 
didn't realize at the time that you could change the listed Author in 
the commitfest.

--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] walsender termination error messages worse in v10

2017-06-02 Thread Petr Jelinek
Hi Andres,

On 02/06/17 20:51, Andres Freund wrote:
> Hi,
> 
> commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920
> Author: Peter Eisentraut 
> Date:   2017-03-23 08:36:36 -0400
> 
> Logical replication support for initial data copy
> 
> made walreceiver emit worse messages in v10 than before when the master
> gets shut down.  Before 10 you'll get something like:
> 
> 2017-06-02 11:46:07.173 PDT [16143][] LOG:  replication terminated by primary 
> server
> 2017-06-02 11:46:07.173 PDT [16143][] DETAIL:  End of WAL reached on timeline 
> 1 at 0/1B7FB8C8.
> 2017-06-02 11:46:07.173 PDT [16143][] FATAL:  could not send end-of-streaming 
> message to primary: no COPY in progress
> 
> the last bit is a bit superfluous, but it still kinda makes sense.  Now
> you get:
> 2017-06-02 11:47:09.362 PDT [17061][] FATAL:  unexpected result after 
> CommandComplete: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 
> the reason is that
> 
> static int
> libpqrcv_receive(char **buffer, pgsocket *wait_fd)
> {
> 
> previously did:
> if (rawlen == -1)   /* end-of-streaming or error */
> {
> PGresult   *res;
> 
> res = PQgetResult(streamConn);
> if (PQresultStatus(res) == PGRES_COMMAND_OK ||
> PQresultStatus(res) == PGRES_COPY_IN)
> {
> PQclear(res);
> return -1;
> }
> else
> {
> PQclear(res);
> ereport(ERROR,
> (errmsg("could not receive data from WAL stream: %s",
> PQerrorMessage(streamConn;
> }
> }
> 
> and now does
> 
> if (rawlen == -1)   /* end-of-streaming or error */
> {
> PGresult   *res;
> 
> res = PQgetResult(conn->streamConn);
> if (PQresultStatus(res) == PGRES_COMMAND_OK)
> {
> PQclear(res);
> 
> /* Verify that there are no more results */
> res = PQgetResult(conn->streamConn);
> if (res != NULL)
> ereport(ERROR,
> (errmsg("unexpected result after CommandComplete: %s",
> PQerrorMessage(conn->streamConn;
> return -1;
> }
> else if (PQresultStatus(res) == PGRES_COPY_IN)
> {
> PQclear(res);
> return -1;
> }
> else
> {
> PQclear(res);
> ereport(ERROR,
> (errmsg("could not receive data from WAL stream: %s",
> pchomp(PQerrorMessage(conn->streamConn);
> }
> }
> 
> note the new /* Verify that there are no more results */ bit.
> 
> I don't understand why the new block is there, nor does the commit
> message explain it.
> 

Hmm, that particular change can actually be reverted. It was needed for
one of those custom replication commands which were replaced by normal
query support. I missed it during the rewrite.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-06-02 Thread Petr Jelinek
On 02/06/17 21:05, Peter Eisentraut wrote:
> On 6/2/17 02:31, Masahiko Sawada wrote:
>> I'd say the current patch makes it difficult for the user to
>> distinguish between the apply worker and the table sync worker.
> 
> We could arguably make apply workers and sync workers have different
> bgw_type values.  But if you are interested in that level of detail, you
> should perhaps look at pg_stat_subscription.  pg_stat_activity only
> contains the "common" data, and the process-specific data is in other views.
> 

Agreed with this.

However, I am not sure about bgw_name_extra. I think I would have
preferred keeping a full bgw_name field, which would be used where the
full name is needed, and bgw_type where only the worker type is used.
The concatenation just doesn't sit well with me, especially if it
requires bgw_name_extra to start with a space.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Adding support for Default partition in partitioning

2017-06-02 Thread Robert Haas
On Thu, Jun 1, 2017 at 3:35 PM, Jeevan Ladhe
 wrote:
> Please let me know if I have missed anything and any further comments.

+ errmsg("a default partition \"%s\" already exists",

I suggest: partition \"%s\" conflicts with existing default partition \"%s\"

The point is that's more similar to the message you get when overlap
&& !spec->is_default.

+ * If the default partition exists, it's partition constraint will change

it's -> its

+ errmsg("default partition contains row(s)
that would overlap with partition being created")));

It doesn't really sound right to talk about rows overlapping with a
partition.  Partitions can overlap with each other, but not rows.
Also, it's not really project style to use ambiguously plural forms
like "row(s)" in error messages.  Maybe something like:

new partition constraint for default partition \"%s\" would be
violated by some row

+/*
+ * InvalidateDefaultPartitionRelcache
+ *
+ * Given a parent oid, this function checks if there exists a default partition
+ * and invalidates it's relcache if it exists.
+ */
+void
+InvalidateDefaultPartitionRelcache(Oid parentOid)
+{
+Relation parent = heap_open(parentOid, AccessShareLock);
+Oid default_relid =
parent->rd_partdesc->oids[DEFAULT_PARTITION_INDEX(parent)];
+
+if (partition_bound_has_default(parent->rd_partdesc->boundinfo))
+CacheInvalidateRelcacheByRelid(default_relid);
+
+heap_close(parent, AccessShareLock);
+}

It does not seem like a good idea to put the heap_open() call inside
this function.  One of the two callers already *has* the Relation, and
we definitely want to avoid pulling the Oid out of the Relation only
to reopen it to get the Relation back.  And I think
heap_drop_with_catalog could open the parent relation instead of
calling LockRelationOid().

If DETACH PARTITION and DROP PARTITION require this, why not ATTACH
PARTITION and CREATE TABLE .. PARTITION OF?

The indentation of the changes in gram.y doesn't appear to match the
nearby code.  I'd remove this comment:

+ * Currently this is supported only for LIST partition.

Since nothing here is dependent on this working only for LIST
partitions, and since this will probably change, I think it would be
more future-proof to leave this out, lest somebody forget to update it
later.

-switch (spec->strategy)
+if (spec->is_default && (strategy == PARTITION_STRATEGY_LIST ||
+ strategy == PARTITION_STRATEGY_RANGE))

Checking strategy here appears pointless.

This is not a full review, but I'm out of time for today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] logical replication - still unstable after all these months

2017-06-02 Thread Petr Jelinek
On 02/06/17 08:55, Mark Kirkwood wrote:
> On 02/06/17 17:11, Erik Rijkers wrote:
> 
>> On 2017-06-02 00:46, Mark Kirkwood wrote:
>>> On 31/05/17 21:16, Petr Jelinek wrote:
>>>
>>> I'm seeing a new failure with the patch applied - this time the
>>> history table has missing rows. Petr, I'll put back your access :-)
>>
>> Is this error during 1-minute runs?
>>
>> I'm asking because I've moved back to longer (1-hour) runs (no errors
>> so far), and I'd like to keep track of what the most 'vulnerable'
>> parameters are.
>>
> 
> Yeah, still using your test config (with my minor modifications).
> 
> When I got the error the 1st time, I did a complete make clean and
> rebuild... but it is still possible I've 'done it wrong' - so
> independent confirmation would be good!

Well, I've seen this issue as well while I was developing the fix, but
the patch I proposed fixed it for me as well as the original issue.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] TAP: allow overriding PostgresNode in get_new_node

2017-06-02 Thread Chapman Flack
On 06/02/2017 12:50 PM, Robert Haas wrote:
> On Thu, Jun 1, 2017 at 3:36 AM, Michael Paquier
>>
>> +$pgnclass = 'PostgresNode' unless defined $pgnclass;
>> I'd rather leave any code of this kind for the module maintainers,
> 
> Craig's proposal is a standard Perl idiom, though.

Would it not be even more idiomatic to have the class, if
present, be the first argument? That is:

  my $pgnclass = 'PostgresNode';
  $pgnclass = shift if 1 < scalar @_;
  my $name = shift;

That part's a little weird (an optional FIRST argument?) in order
to preserve compatibility with callers who don't pass a class.

But what it buys you is then if your MyExtraPGNode has PostgresNode
as a base, the familiar idiom

  MyExtraPGNode->get_new_node('foo');

works, as it inserts the class as the first argument.

As a bonus, you then don't need to complicate get_new_node
with a test for (not ($node->isa("PostgresNode"))) because
if it weren't, it wouldn't have inherited get_new_node
so MyExtraPGNode->get_new_node('foo') would have failed.

-Chap




Re: [HACKERS] proposal: PLpgSQL parallel assignemnt

2017-06-02 Thread Pavel Stehule
2017-06-02 19:05 GMT+02:00 Pavel Stehule :

>
>
> 2017-06-02 10:15 GMT+02:00 Pavel Stehule :
>
>>
>>
>> 2017-06-02 10:06 GMT+02:00 Craig Ringer :
>>
>>> On 2 June 2017 at 15:51, Pavel Stehule  wrote:
>>>
>>> > a, b := fx();
>>> >
>>> > Comments, notes, ideas?
>>>
>>> I'd be pretty happy to have
>>>
>>> (a, b) = (x, y);
>>> (a, b) = f(x);
>>>
>>> which is SQL-esque.
>>>
>>
>> This is not too far from my proposal - and it is a fully adequate
>> alternative.
>>
>
> The ANSI form is related to the SET or UPDATE commands - so in this case I
> see the classic languages' style
> (https://en.wikipedia.org/wiki/Assignment_(computer_science)) as better.
> The assignment statement in PLpgSQL is not related to embedded SQL. If we
> introduce SQL syntax and SET commands for schema variables then the ( )
> syntax is perfect, but for PLpgSQL's := I am not sure.
>
> It is maybe strange, but
>
> SET (a,b) =  (SELECT a,b FROM foo)
>
> a, b := fx()
>
> are sentences from two independent worlds, and different syntaxes can be
> correct (depending on how much we want to integrate the procedural and SQL
> worlds: 100% T-SQL, 80% SQL/PSM, ... 20% PLpgSQL or 5% PL/SQL).
>

More thoughts:

1. syntax (a,b) := f() ... can mean: assign the record to a temporary
composite (a,b)
2. syntax a,b := f() ... can mean: unpack the result composite and assign to
the a, b fields

so both syntaxes make sense even without introducing a relation to SQL - on
this way:

a,b := 10, 20    -- ok .. assign a=c1, b=c2
a,b := (10,20)   -- ok .. assign a = r.c1, b = r.c2
(a,b) := (10,20) -- ok .. assign ct = rt
(a,b) := 10,20   -- ok .. assign ct = row(c1, c2)

@1 syntax says "create a composite target", @2 syntax says "unpack the
result". Both should work. Personally I prefer @1, due to fewer parentheses.

Regards

Pavel



> Regards
>
> Pavel
>
>
>>
>>
>>>
>>> But what, if anything, does Ada do?
>>>
>>
>> As far as I know, Ada does not have this statement - but the design of OUT
>> parameters in Ada is absolutely different from PostgreSQL's - so in this
>> case we cannot use the Ada language as our base :(
>>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>> --
>>>  Craig Ringer   http://www.2ndQuadrant.com/
>>>  PostgreSQL Development, 24x7 Support, Training & Services
>>>
>>
>>
>


Re: [HACKERS] Error while creating subscription when server is running in single user mode

2017-06-02 Thread Tom Lane
Peter Eisentraut  writes:
> My point is that we shouldn't be putting checks into DDL commands about
> single-user mode if the actual cause of the issue is in a lower-level
> system.  Not all uses of a particular DDL command necessary use a latch,
> for example.  Also, there could be other things that hit a latch that
> are reachable in single-user mode that we haven't found yet.

> So I think the check should either go somewhere in the latch code, or
> possibly in the libpqwalreceiver code.  Or we make the latch code work
> so that the check-for-postmaster-death code becomes a noop in
> single-user mode.  Suggestions?

It's certainly plausible that we could have the latch code just ignore
WL_POSTMASTER_DEATH if not IsUnderPostmaster.  I think that the original
reasoning for not doing that was that the calling code should know which
environment it's in, and not pass an unimplementable wait-exit reason;
so silently ignoring the bit could mask a bug.  Perhaps that argument is
no longer attractive.  Alternatively, we could fix the relevant call sites
to do "(IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0)", and keep the strict
behavior for the majority of call sites.

regards, tom lane




Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 3:48 AM, Rafia Sabih
 wrote:
> I tried following the sketch stated above, and here's what I added to
> v2 of the patch. In begin_remote_xact when savepoint is send to the
> remote server through do_sql_command I set the changing_xact_state
> flag and when it successfully returns from there the flag is unset.
> Similar behaviour is followed in pgfdw_subxact_callback for release
> savepoint. Additionally, I added this flag set-unset for start
> transaction command in begin_remote_xact. Enlighten me if I've
> missed/misunderstood something here.

You didn't catch all of the places where do_sql_command() is used to
change the transaction state, and you didn't update the comments to
match the code changes.

> I ran the regress test for
> postgres_fdw and it went on fine.

Well, that's good as far as it goes, but it doesn't prove much, since
this patch mostly changes the behavior when there are errors or
interrupts involved, and that's not going to happen in the regression
tests.

> I have a couple of concerns here. Since there is only one flag required
> for the purpose, its name changing_xact_state doesn't suit at
> some places, particularly in pgfdw_subxact_callback; not sure whether I
> should change the comment or the variable name
>
> /* Disarm abort_cleanup_incomplete if it all worked. */
> + entry->changing_xact_state = abort_cleanup_failure;

Seems pretty obvious that would need to be updated, at least insofar
as making the comment match the new structure member name.  I mean,
there's often some question regarding the degree to which existing
comments should be updated, but it hardly makes sense to add a new
comment that isn't consistent with the rest of the patch, right?
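For reference, the arm/disarm protocol being discussed amounts to the following (hypothetical minimal types, not the actual postgres_fdw ConnCacheEntry or do_sql_command()):

```c
/* Hypothetical stand-in for postgres_fdw's connection cache entry. */
typedef struct ConnCacheEntry
{
    int changing_xact_state;    /* armed while a state-changing command runs */
    int broken;                 /* set when the connection must be discarded */
} ConnCacheEntry;

/*
 * Send a transaction-state-changing command.  The flag is armed before
 * the remote call and disarmed only if it completes; an error (or
 * interrupt) between the two leaves it armed.
 */
static int
do_sql_command(ConnCacheEntry *entry, int simulate_error)
{
    entry->changing_xact_state = 1;
    if (simulate_error)
        return -1;              /* error path: flag stays armed */
    entry->changing_xact_state = 0;
    return 0;
}

/* At (sub)transaction cleanup, a connection caught mid-change is unusable. */
static void
cleanup_connection(ConnCacheEntry *entry)
{
    if (entry->changing_xact_state)
        entry->broken = 1;
}
```

The point of the review comment above is that every place that changes remote transaction state, not just SAVEPOINT and RELEASE, has to go through the armed path.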

> Also, by any chance should we add a test-case for this?

I don't see how to do that.  It could possibly be done with the TAP
framework, but that exceeds my abilities.

Here's an updated patch with a bunch of cosmetic fixes, plus I
adjusted things so that do_sql_command() is more interruptible.  I
tested it manually and it seems to work OK.  I'll commit and
back-patch this version, barring objections or further suggestions for
improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


improve-pgfdw-abort-behavior-v4.patch
Description: Binary data



Re: [HACKERS] Create subscription with `create_slot=false` and incorrect slot name

2017-06-02 Thread Peter Eisentraut
On 6/1/17 08:20, Petr Jelinek wrote:
> I think the combination of those patches is probably good enough
> solution for PG10 (I never understood the need for name validation in
> ReplicationSlotAcquire() anyway).

I have committed these two patches and will close this open item.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Error while creating subscription when server is running in single user mode

2017-06-02 Thread Andres Freund
On 2017-06-02 15:00:21 -0400, Peter Eisentraut wrote:
> On 6/1/17 21:55, Andres Freund wrote:
> > On 2017-06-01 21:42:41 -0400, Peter Eisentraut wrote:
> >> We should look at what the underlying problem is before we prohibit
> >> anything at a high level.
> > 
> > I'm not sure there's any underlying issue here, except being in single
> > user mode.
> 
> My point is that we shouldn't be putting checks into DDL commands about
> single-user mode if the actual cause of the issue is in a lower-level
> system.

But it's not really.


> Not all uses of a particular DDL command necessarily use a latch,
> for example.  Also, there could be other things that hit a latch that
> are reachable in single-user mode that we haven't found yet.

Latches work in single user mode, it's just that the new code for some
reason uses uninitialized memory as the latch.  As I pointed out above,
the new code really should just use MyLatch instead of
MyProc->procLatch.


> So I think the check should either go somewhere in the latch code, or
> possibly in the libpqwalreceiver code.  Or we make the latch code work
> so that the check-for-postmaster-death code becomes a noop in
> single-user mode.  Suggestions?

I don't think the postmaster death code is really the issue here.  Nor
is libpqwalreceiver really the issue.  We can put ERRORs in a bunch of
unrelated subsystems, sure, but that doesn't really solve the issue that
logical rep pretty essentially requires multiple processes.  We've
prevented parallelism from being used in general (cf. standard_planner),
we've not put checks in all the subsystems it uses.

Greetings,

Andres Freund




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-06-02 Thread Peter Eisentraut
On 5/30/17 23:10, Peter Eisentraut wrote:
> Here is a proposed solution that splits bgw_name into bgw_type and
> bgw_name_extra.  bgw_type shows up in pg_stat_activity.backend_type.
> Uses of application_name are removed, because they are no longer
> necessary to identify the process type.
> 
> This code appears to be buggy because I sometimes get NULL results of
> the backend_type lookup, implying that it couldn't find the background
> worker slot.  This needs another look.

I would like some more input on this proposal, especially from those
who have engineered the extended pg_stat_activity content.

If we don't come to a quick conclusion on this, I'd be content to leave
PG10 as is and put this patch into the next commit fest.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-06-02 Thread Peter Eisentraut
On 6/2/17 02:31, Masahiko Sawada wrote:
> On Wed, May 31, 2017 at 12:10 PM, Peter Eisentraut
>  wrote:
>> Here is a proposed solution that splits bgw_name into bgw_type and
>> bgw_name_extra.  bgw_type shows up in pg_stat_activity.backend_type.
>> Uses of application_name are removed, because they are no longer
>> necessary to identify the process type.
> 
> Hmm, are there any reasons why the bgw_name_extra string doesn't appear in
> pg_stat_activity?

That's the whole point:  We want to be able to group similar process
types.  The _extra part is particular to a single process, so it might
contain a specific OID or PID it is working on.  The bgw_type is common
for all workers of that kind.

> I'd say the current patch makes it difficult for the user to
> distinguish between apply workers and table sync workers.

We could arguably make apply workers and sync workers have different
bgw_type values.  But if you are interested in that level of detail, you
should perhaps look at pg_stat_subscription.  pg_stat_activity only
contains the "common" data, and the process-specific data is in other views.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Error while creating subscription when server is running in single user mode

2017-06-02 Thread Peter Eisentraut
On 6/1/17 21:55, Andres Freund wrote:
> On 2017-06-01 21:42:41 -0400, Peter Eisentraut wrote:
>> We should look at what the underlying problem is before we prohibit
>> anything at a high level.
> 
> I'm not sure there's any underlying issue here, except being in single
> user mode.

My point is that we shouldn't be putting checks into DDL commands about
single-user mode if the actual cause of the issue is in a lower-level
system.  Not all uses of a particular DDL command necessarily use a latch,
for example.  Also, there could be other things that hit a latch that
are reachable in single-user mode that we haven't found yet.

So I think the check should either go somewhere in the latch code, or
possibly in the libpqwalreceiver code.  Or we make the latch code work
so that the check-for-postmaster-death code becomes a noop in
single-user mode.  Suggestions?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table

2017-06-02 Thread Peter Eisentraut
On 5/24/17 15:14, Petr Jelinek wrote:
> All the locking works just fine the way it is in master. The issue with
> deadlock with apply comes from the wrong handling of the SIGTERM in the
> apply (we didn't set InterruptPending). I changed the SIGTERM handler in
> patch 0001 just to die which is actually the correct behavior for apply
> workers. I also moved the connection cleanup code to the
> before_shmem_exit callback (similar to walreceiver) and now that part
> works correctly.

I have committed this, in two separate parts.  This should fix the
originally reported issue.

I will continue to work through your other patches.  I notice there is
still a bit of discussion about another patch, so please let me know if
there is anything else I should be looking for.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] walsender termination error messages worse in v10

2017-06-02 Thread Andres Freund
Hi,

commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920
Author: Peter Eisentraut 
Date:   2017-03-23 08:36:36 -0400

Logical replication support for initial data copy

made walreceiver emit worse messages in v10 than before when the master
gets shut down.  Before 10 you'll get something like:

2017-06-02 11:46:07.173 PDT [16143][] LOG:  replication terminated by primary server
2017-06-02 11:46:07.173 PDT [16143][] DETAIL:  End of WAL reached on timeline 1 at 0/1B7FB8C8.
2017-06-02 11:46:07.173 PDT [16143][] FATAL:  could not send end-of-streaming message to primary: no COPY in progress

the last bit is a bit superfluous, but it still kinda makes sense.  Now
you get:
2017-06-02 11:47:09.362 PDT [17061][] FATAL:  unexpected result after CommandComplete: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

the reason is that

static int
libpqrcv_receive(char **buffer, pgsocket *wait_fd)
{

previously did:
    if (rawlen == -1)           /* end-of-streaming or error */
    {
        PGresult   *res;

        res = PQgetResult(streamConn);
        if (PQresultStatus(res) == PGRES_COMMAND_OK ||
            PQresultStatus(res) == PGRES_COPY_IN)
        {
            PQclear(res);
            return -1;
        }
        else
        {
            PQclear(res);
            ereport(ERROR,
                    (errmsg("could not receive data from WAL stream: %s",
                            PQerrorMessage(streamConn))));
        }
    }

and now does

    if (rawlen == -1)           /* end-of-streaming or error */
    {
        PGresult   *res;

        res = PQgetResult(conn->streamConn);
        if (PQresultStatus(res) == PGRES_COMMAND_OK)
        {
            PQclear(res);

            /* Verify that there are no more results */
            res = PQgetResult(conn->streamConn);
            if (res != NULL)
                ereport(ERROR,
                        (errmsg("unexpected result after CommandComplete: %s",
                                PQerrorMessage(conn->streamConn))));
            return -1;
        }
        else if (PQresultStatus(res) == PGRES_COPY_IN)
        {
            PQclear(res);
            return -1;
        }
        else
        {
            PQclear(res);
            ereport(ERROR,
                    (errmsg("could not receive data from WAL stream: %s",
                            pchomp(PQerrorMessage(conn->streamConn)))));
        }
    }

note the new /* Verify that there are no more results */ bit.

I don't understand why the new block is there, nor does the commit
message explain it.


Greetings,

Andres Freund




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Tom Lane
Teodor Sigaev  writes:
>> BTW, was the larger query plan that you showed (with a Materialize node)
>> generated by 9.6, or v10 HEAD?  Because I would be surprised if 9.6 did

> v10,
> commit acbd8375e954774181b673a31b814e9d46f436a5
> Author: Magnus Hagander 
> Date:   Fri Jun 2 11:18:24 2017 +0200

Thanks.  Meanwhile, I poked into this larger example (which Teodor had
sent me the data for off-list).  I concur with the conclusion that the
speed change is because the HEAD code inserts a Materialize node on
the inside of an inner loop even though it thinks the outside will
produce only one row.  In reality the outside produces five rows so
there's a small win from the Materialize, and because this is all in
a SubPlan that gets executed 24121 times, that adds up.

However, I still don't think that this is evidence in favor of forcing
a Materialize on the inside of a nestloop even when we think the outside
will produce just one row; and it's certainly not evidence that we should
do that accidentally in a small number of cases due to a logic error.
This query itself has got four other places where there's a nestloop
with an outer rel that's predicted to return just one row, and in those
four places the prediction is correct.  If we were to establish a policy
like that, we'd be adding useless overhead to those other places.
(Out of idle curiosity, I hacked the planner to force materialization
for all non-parameterized nestloop inners, and confirmed that that
adds back a couple hundred msec to this query.  It might have been
worse, except that two of the four other places are in SubPlans that
never get executed in this specific example.)

So I think it's just chance that this bug was a net benefit on this
query, and it's not a reason not to go ahead with the patch.

regards, tom lane




Re: [HACKERS] strcmp() tie-breaker for identical ICU-collated strings

2017-06-02 Thread Peter Geoghegan
On Fri, Jun 2, 2017 at 10:34 AM, Amit Khandekar  wrote:
> Ok. I was thinking we are doing the tie-breaker because specifically
> strcoll_l() was unexpectedly returning 0 for some cases. Now I get it:
> we do that to be compatible with texteq().

Both of these explanations are correct, in a way. See commit 656beff.

> Secondly, I was also considering if ICU especially has a way to
> customize an ICU locale by setting some attributes which dictate
> comparison or sorting rules for a set of characters. I mean, if there
> is such customized ICU locale defined in the system, and we use that
> to create PG collation, I thought we might have to strictly follow
> those rules without a tie-breaker, so as to be 100% conformant to ICU.
> I can't come up with an example, or maybe there isn't one, but, say,
> there is a locale which is supposed to sort only by lowest comparison
> strength (de@strength=1 ?? ). In that case, there might be many
> characters considered equal, but PG < operator or > operator would
> still return true for those chars.

In the terminology of the Unicode collation algorithm, PostgreSQL
"forces deterministic comparisons" [1]. There is a lot of information
on the details of that within the UCA spec.

If we ever wanted to offer a case insensitive collation feature, then
we wouldn't necessarily have to do the equivalent of a full strxfrm()
when hashing, at least with collations controlled by ICU. Perhaps we
could instead use a collator whose UCOL_STRENGTH is only UCOL_PRIMARY
to build binary sort keys, and leave the rest to a ucol_equal() call
(within texteq()) that has the usual UCOL_STRENGTH for the underlying
PostgreSQL collation.

I don't think it would be possible to implement case insensitive
collations by using some pre-existing ICU collation that is case
insensitive. Instead, an implementation might directly vary collation
strength of any given collation to achieve case insensitivity.
PostgreSQL would know that this collation was case insensitive, so
regular collations wouldn't need to change their
behavior/implementation (to use ucol_equal() within texteq(), and so
on).

[1] http://unicode.org/reports/tr10/#Forcing_Deterministic_Comparisons
-- 
Peter Geoghegan




Re: [HACKERS] pg_class.relpartbound definition overly brittle

2017-06-02 Thread Mark Dilger

> On Jun 2, 2017, at 9:59 AM, Tom Lane  wrote:
> 
> Robert Haas  writes:
>> Also, you're attacking a straw man. Accidentally storing a meaningless
>> parse location in the catalog isn't a feature, and we shouldn't
>> pretend otherwise.
> 
> It's not meaningless, and it is a feature, useful for debugging.
> Otherwise you'd have a hard time telling one occurrence of e.g. "+" from
> another when you're trying to make sense of a stored tree.  We worked out
> all this behavior ages ago for other expression node trees that are stored
> in the catalogs (default expressions, index expressions, check
> constraints, etc etc) and I don't see a reason for partition expressions
> to be different.

Ok, that makes more sense now.  Thanks for clarifying the history of how
and why the location field from parsing is getting stored.

> I agree that this means you can't just strcmp a couple of stored node tree
> strings to decide if they're equal, but you couldn't anyway --- a moment's
> perusal of equalfuncs.c will show you other special cases, each with their
> own rationale.  Maybe you'd like to complain that every one of those
> rationales is wrongheaded but I do not think you will get far.
> 
> I think the best advice for Mark is to look at pg_get_expr() output and
> see if that matches.

Yes, I'm already doing that based on the discussion so far.

Mark Dilger




Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-06-02 Thread Kevin Grittner
> Mengxing Liu wrote:

>> The CPU utilization of CheckForSerializableConflictOut/In is
>> 0.71%/0.69%.

How many cores were on the system used for this test?  The paper
specifically said that they didn't see performance degradation on
the PostgreSQL implementation until 32 concurrent connections with
24 or more cores.  The goal here is to fix *scaling* problems.  Be
sure you are testing at an appropriate scale.  (The more sockets,
cores, and clients, the better, I think.)


On Fri, Jun 2, 2017 at 10:08 AM, Alvaro Herrera
 wrote:

> Kevin mentioned during PGCon that there's a paper by some group in
> Sydney that developed a benchmark on which this scalability
> problem showed up very prominently.

https://pdfs.semanticscholar.org/6c4a/e427e6f392b7dec782b7a60516f0283af1f5.pdf

"[...] we see a much better scalability of pgSSI than the
corresponding implementations on InnoDB. Still, the overhead of
pgSSI versus standard SI increases significantly with higher MPL
than one would normally expect, reaching around 50% with MPL 128."

"Our profiling showed that PostgreSQL spend 2.3% of the overall
runtime in traversing these list, plus 10% of its runtime waiting on
the corresponding kernel mutexes."

If you cannot duplicate their results, you might want to contact the
authors for more details on their testing methodology.

Note that the locking around access to the list is likely to be a
bigger problem than the actual walking and manipulation of the list
itself.
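To illustrate the shape of the problem (hypothetical structures, not the real SSI code): every probe both takes the shared lock and walks the list, so the lock is hit once per access while the walk itself is O(N):

```c
#include <stddef.h>

/* Hypothetical rw-conflict list node and header. */
typedef struct RWConflict
{
    int         xact_id;        /* transaction on the other side of the edge */
    struct RWConflict *next;
} RWConflict;

typedef struct ConflictList
{
    RWConflict *head;
    int         lock_holds;     /* stands in for LWLockAcquire() calls */
} ConflictList;

static void
conflict_add(ConflictList *list, RWConflict *node, int xact_id)
{
    list->lock_holds++;         /* real code takes a lightweight lock here */
    node->xact_id = xact_id;
    node->next = list->head;
    list->head = node;
}

static int
conflict_exists(ConflictList *list, int xact_id)
{
    RWConflict *c;

    list->lock_holds++;         /* every probe serializes on the same lock */
    for (c = list->head; c != NULL; c = c->next)
    {
        if (c->xact_id == xact_id)
            return 1;
    }
    return 0;
}
```

With many concurrent clients, all those lock acquisitions target the same shared state, which is why the lock can dominate the cost of the O(N) walk itself.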

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/




Re: [HACKERS] strcmp() tie-breaker for identical ICU-collated strings

2017-06-02 Thread Amit Khandekar
On 2 June 2017 at 03:18, Thomas Munro  wrote:
> On Fri, Jun 2, 2017 at 9:27 AM, Peter Geoghegan  wrote:
>> On Thu, Jun 1, 2017 at 2:24 PM, Thomas Munro
>>  wrote:
>>> Why should ICU be any different than the system provider in this
>>> respect?  In both cases, we have a two-level comparison: first we use
>>> the collation-aware comparison, and then as a tie breaker, we use a
>>> binary comparison.  If we didn't do a binary comparison as a
>>> tie-breaker, wouldn't the result be logically incompatible with the =
>>> operator, which does a binary comparison?

Ok. I was thinking we are doing the tie-breaker because specifically
strcoll_l() was unexpectedly returning 0 for some cases. Now I get it:
we do that to be compatible with texteq().
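As a sketch of that two-level scheme (the callback stands in for strcoll_l()/ucol_strcoll(); this is not the real varstr_cmp()):

```c
#include <string.h>

/*
 * Collation-aware comparison first, then a binary tie-breaker, so that
 * only byte-identical strings compare equal -- matching texteq()'s
 * binary-equality semantics.
 */
static int
tiebreak_cmp(const char *a, const char *b,
             int (*collate) (const char *, const char *))
{
    int         result = collate(a, b);

    if (result == 0)
        result = strcmp(a, b);  /* binary tie-breaker */
    return result;
}

/*
 * Toy "primary strength" collator for illustration only: it compares
 * string lengths, so many distinct strings look "equal" to it.
 */
static int
length_only_collate(const char *a, const char *b)
{
    size_t      la = strlen(a);
    size_t      lb = strlen(b);

    return (la > lb) - (la < lb);
}
```

With the tie-breaker, the toy collator's many "equal" pairs still come out with a deterministic order, which is exactly the forced-deterministic behavior being discussed.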

Secondly, I was also considering if ICU especially has a way to
customize an ICU locale by setting some attributes which dictate
comparison or sorting rules for a set of characters. I mean, if there
is such customized ICU locale defined in the system, and we use that
to create PG collation, I thought we might have to strictly follow
those rules without a tie-breaker, so as to be 100% conformant to ICU.
I can't come up with an example, or maybe there isn't one, but, say,
there is a locale which is supposed to sort only by lowest comparison
strength (de@strength=1 ?? ). In that case, there might be many
characters considered equal, but PG < operator or > operator would
still return true for those chars.


-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Teodor Sigaev

BTW, was the larger query plan that you showed (with a Materialize node)
generated by 9.6, or v10 HEAD?  Because I would be surprised if 9.6 did

v10,
commit acbd8375e954774181b673a31b814e9d46f436a5
Author: Magnus Hagander 
Date:   Fri Jun 2 11:18:24 2017 +0200

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] proposal: PLpgSQL parallel assignemnt

2017-06-02 Thread Pavel Stehule
2017-06-02 10:15 GMT+02:00 Pavel Stehule :

>
>
> 2017-06-02 10:06 GMT+02:00 Craig Ringer :
>
>> On 2 June 2017 at 15:51, Pavel Stehule  wrote:
>>
>> > a, b := fx();
>> >
>> > Comments, notes, ideas?
>>
>> I'd be pretty happy to have
>>
>> (a, b) = (x, y);
>> (a, b) = f(x);
>>
>> which is SQL-esque.
>>
>
This is not too far from my proposal - and it is a fully adequate alternative.
>

The ANSI form is related to the SET or UPDATE commands - so in this case I
see the classic languages' style
https://en.wikipedia.org/wiki/Assignment_(computer_science) as better. The
assignment statement in PLpgSQL is not related to embedded SQL. If we
introduce SQL syntax and SET commands for schema variables then the ( )
syntax is perfect, but for := in PLpgSQL I am not sure.

It is maybe strange, but

SET (a,b) =  (SELECT a,b FROM foo)

a, b := fx()

are sentences from two independent worlds, and a different syntax can be
correct (depending on how much we want to integrate the procedural and SQL
worlds .. 100% T-SQL, 80% SQL/PSM, ..20% PLpgSQL or 5% PL/SQL)

Regards

Pavel


>
>
>>
>> But what, if anything, does Ada do?
>>
>
> As far as I know, no, Ada does not have this statement - but the design of OUT
> parameters in Ada is absolutely different from PostgreSQL's - so in this case we
> cannot use the Ada language as our base :(
>
> Regards
>
> Pavel
>
>
>>
>> --
>>  Craig Ringer   http://www.2ndQuadrant.com/
>>  PostgreSQL Development, 24x7 Support, Training & Services
>>
>
>


Re: [HACKERS] pg_class.relpartbound definition overly brittle

2017-06-02 Thread Tom Lane
Robert Haas  writes:
> Also, you're attacking a straw man. Accidentally storing a meaningless
> parse location in the catalog isn't a feature, and we shouldn't
> pretend otherwise.

It's not meaningless, and it is a feature, useful for debugging.
Otherwise you'd have a hard time telling one occurrence of e.g. "+" from
another when you're trying to make sense of a stored tree.  We worked out
all this behavior ages ago for other expression node trees that are stored
in the catalogs (default expressions, index expressions, check
constraints, etc etc) and I don't see a reason for partition expressions
to be different.

I agree that this means you can't just strcmp a couple of stored node tree
strings to decide if they're equal, but you couldn't anyway --- a moment's
perusal of equalfuncs.c will show you other special cases, each with their
own rationale.  Maybe you'd like to complain that every one of those
rationales is wrongheaded but I do not think you will get far.

I think the best advice for Mark is to look at pg_get_expr() output and
see if that matches.

regards, tom lane




Re: [HACKERS] Questions about upgrade standby with rsync

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 12:42 PM, Sergey Burladyan  wrote:
> Hello, all!
>
> I have problem with upgrading standby via rsync.

This mailing list is for discussion of PostgreSQL development, not
user questions.  You might try pgsql-general or pgsql-admin.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] BEFORE trigger can cause undetected partition constraint violation

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 12:51 AM, Amit Langote
 wrote:
> Attached patch makes InitResultRelInfo() *always* initialize the
> partition's constraint, that is, regardless of whether insert/copy is
> through the parent or directly on the partition.  I'm wondering if
> ExecInsert() and CopyFrom() should check if it actually needs to execute
> the constraint?  I mean it's needed if there exist BR insert triggers
> which may change the row, but not otherwise.  The patch currently does not
> implement that check.

I think it should.  I mean, otherwise we're leaving a
probably-material amount of performance on the table.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] TAP: allow overriding PostgresNode in get_new_node

2017-06-02 Thread Robert Haas
On Thu, Jun 1, 2017 at 3:36 AM, Michael Paquier
 wrote:
> On Wed, May 31, 2017 at 7:18 PM, Craig Ringer  wrote:
>> Note that you can achieve the same effect w/o patching
>> PostgresNode.pm, albeit in a somewhat ugly manner, by re-blessing the
>> returned object.
>>
>> sub get_new_mywhatever_node {
>> my $self = PostgresNode::get_new_node($name);
>> $self = bless $self, 'MyWhateverNode';
>> return $self;
>> }
>>
>> so this would be cosmetically nice, but far from functionally vital.
>
> +$pgnclass = 'PostgresNode' unless defined $pgnclass;
> I'd rather leave any code of this kind for the module maintainers,
> there is no actual reason to complicate PostgresNode.pm with code
> that's not directly useful for what is in-core.

Craig's proposal is a standard Perl idiom, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_class.relpartbound definition overly brittle

2017-06-02 Thread Robert Haas
On Thu, Jun 1, 2017 at 9:31 AM, Tom Lane  wrote:
> Let me explain the project standards in words of one syllable: user code
> should not examine the contents of node trees.  That's what pg_get_expr
> is for.  There is not, never has been, and never will be any guarantee
> that we won't whack those structures around in completely arbitrary ways,
> as long as we do a catversion bump along with it.

Many of those words have more than one syllable.

Also, you're attacking a straw man. Accidentally storing a meaningless
parse location in the catalog isn't a feature, and we shouldn't
pretend otherwise.  I agree that what Mark's doing is a bit unusual
and doesn't necessarily need to work, and he seems to agree with that,
too.  That doesn't mean that the status quo is some brilliant piece of
engineering.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Questions about upgrade standby with rsync

2017-06-02 Thread Sergey Burladyan
Hello, all!

I have problem with upgrading standby via rsync.

The documentation says:
> Verify that the "Latest checkpoint location" values match in all clusters.

But it is impossible for a "WAL archive only" standby, if I am not mistaken.

A standby can make a restartpoint only at a master checkpoint location,
because a standby cannot write WAL. When you use a WAL archive only, without
streaming replication, the last WAL segment from the master, containing the
shutdown checkpoint, is not archived. I think this may be because the WAL
archiving process has already terminated when the last WAL segment is
switched.

For example:
==== standby log ====
2017-06-01 13:09:25 GMT LOG:  recovery restart point at 0/3000790
2017-06-01 13:09:26 GMT LOG:  recovery restart point at 0/603FB10
2017-06-01 13:09:27 GMT LOG:  recovery restart point at 0/903ED28
2017-06-01 13:09:33 GMT LOG:  recovery restart point at 0/C03EFF0
2017-06-01 13:09:38 GMT LOG:  recovery restart point at 0/F040E28

now stop master and then standby:

==== master control data after shutdown ====
Latest checkpoint location:           0/1220
Prior checkpoint location:            0/110402A8
Latest checkpoint's REDO location:    0/1220
Latest checkpoint's TimeLineID:       1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/1267
Latest checkpoint's NextOID:          16393
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        1254
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Time of latest checkpoint:            Thu Jun  1 16:09:32 2017
Minimum recovery ending location:     0/0

==== standby control data after shutdown ====
Latest checkpoint location:           0/110402A8
Prior checkpoint location:            0/D067A00
Latest checkpoint's REDO location:    0/F040E28
Latest checkpoint's TimeLineID:       1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/1267
Latest checkpoint's NextOID:          24576
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        1254
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  1266
Time of latest checkpoint:            Thu Jun  1 16:09:30 2017
Minimum recovery ending location:     0/1200

As you can see, when the standby was stopped it was consistent with the master
("Minimum recovery ending location: 0/1200"), but its last restartpoint is the
previous checkpoint (from the master), because the last WAL segment
(00010012) from the master was not archived:
=== xlogdump m/9.2/pg_xlog/00010012 ===

[page:0, xlp_info:6, xlp_tli:1, xlp_pageaddr:0/1200] XLP_LONG_HEADER 
XLP_BKP_REMOVABLE 
Unexpected page info flags 0006 at offset 0
[cur:0/1220, xid:0, rmid:0(XLOG), len/tot_len:64/96, info:0, 
prev:0/11387BB8] checkpoint: redo 0/1220; tli 1; nextxid 1267;  nextoid 
16393; nextmulti 1; nextoffset 0; shutdown at 2017-06-01 16:09:32 MSK
ReadRecord: record with zero len at 0/1280

This WAL file exists only in the master's pg_xlog, not in the WAL archive.


And my second question: this rsync-based algorithm is described only starting
from PG 9.5; is it possible to use it for an upgrade from PG 9.2 to PG 9.4?

Thanks!

-- 
Sergey Burladyan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 10:19 AM, Joe Conway  wrote:
>> Yeah, that's not crazy.  I find it a bit surprising in terms of the
>> semantics, though.  SET
>> when_i_try_to_insert_into_a_specific_partition_i_dont_really_mean_it =
>> true?
>
> Maybe
>   SET partition_tuple_retry = true;
> -or-
>   SET partition_tuple_reroute = true;
> ?
>
> I like the idea of only rerouting when failing constraints although I
> can envision where there might be use cases where you essentially want
> to re-partition and therefore reroute everything, leading to:
>
>   SET partition_tuple_reroute = (none | error | all);

Personally, I think it's more elegant to make this a pg_dump option
than to make it a server GUC, but I'm not going to spend time fighting
the server GUC idea if other people like it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] sketchy partcollation handling

2017-06-02 Thread Robert Haas
If you create a partitioned table in the obvious way, partcollation ends up 0:

rhaas=# create table foo (a int, b text) partition by list (a);
CREATE TABLE
rhaas=# select * from pg_partitioned_table;
 partrelid | partstrat | partnatts | partattrs | partclass | partcollation | partexprs
-----------+-----------+-----------+-----------+-----------+---------------+-----------
     16420 | l         |         1 | 1         | 1978      |             0 |
(1 row)

You could argue that 0 is an OK value there; offhand, I'm not sure
about that.  But there's nothing in
https://www.postgresql.org/docs/10/static/catalog-pg-partitioned-table.html
which indicates that some entries can be 0 rather than a valid
collation OID.  And this is definitely not OK:

rhaas=# select * from pg_depend where objid = 16420;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16420 |        0 |       2615 |     2200 |           0 | n
    1259 | 16420 |        0 |       3456 |        0 |           0 | n
(2 rows)

We shouldn't be storing a dependency on non-existing collation 0.

I'm not sure whether the bug here is that we should have a valid
collation OID there rather than 0, or whether the bug is that we
shouldn't be recording a dependency on anything other than a real
collation OID, but something about this is definitely not right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Use of non-restart-safe storage by temp_tablespaces

2017-06-02 Thread Bruce Momjian
On Wed, May 31, 2017 at 08:28:51AM -0700, Mark Dilger wrote:
> > Uh, I thought only the sessions that created the temporary objects could
> > see them, and since they are not in WAL and autovacuum can't see them,
> > their non-existence in a temporary tablespace would not be a problem.
> 
> You are correct.  I was thinking about an extension to allow unlogged
> tablespaces on temporary filesystems, but got the words "unlogged" and
> "temporary" mixed up in my thinking and in what I wrote.  I should have
> written that unlogged tablespaces would only host unlogged tables and
> unlogged indexes, such that users are not surprised to find their data
> missing.
> 
> On reflection, I think both features are worthwhile, and not at all exclusive
> of each other, though unlogged tablespaces is probably considerably more
> work to implement.

TODO item added:

Allow tablespaces on RAM-based partitions for temporary objects 

and I wrote a blog entry about this:

https://momjian.us/main/blogs/pgblog/2017.html#June_2_2017

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] [PATCH] relocation truncated to fit: citus build failure on s390x

2017-06-02 Thread Tom Lane
Andres Freund  writes:
> On 2017-05-31 11:57:16 -0400, Alvaro Herrera wrote:
>> My vote would be to backpatch it all the way.

> +1

Done, buildfarm seems happy.

regards, tom lane




Re: [HACKERS] Do we need the gcc feature "__builtin_expect" to promote the branches prediction?

2017-06-02 Thread Andres Freund
Hi,

On 2017-06-02 16:40:56 +0800, Hao Lee wrote:
> Hi all,
>There are a lot of "if statements" in the system, and GCC provides a
> feature, "__builtin_expect", which lets the compiler know which branch is
> most likely taken. As we know, a misprediction costs performance (because
> the CPU has to throw away some instructions and fetch new ones), so we can
> tell GCC to produce more efficient code, for example as follows. I think it
> would give a performance gain. As far as I know, this feature is already
> applied in the Linux kernel.
>
> #define likely(cond)    __builtin_expect(cond, true)
> #define unlikely(cond)  __builtin_expect(cond, false)
>
> if (likely(cond)) {
> /* most likely run */
> } else {
> /* otherwise */
> }

We already do this in a few cases, that are performance critical enough
to matter.  But in most cases the CPUs branch predictor does a good
enough job on its own.

- Andres




Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-06-02 Thread Alvaro Herrera
Mengxing Liu wrote:
> Hi,  Alvaro and Kevin.
> 
> > Anyway, this is just my analysis. 
> > So I want to hack the PG and count the conflict lists' size of 
> > transactions. That would be more accurate.
> 
> Over the last week, I hacked PG to add an additional thread to count
> RWConflict list lengths, and tuned the benchmark to get more conflicts.
> But the result is still not good.

Kevin mentioned during PGCon that there's a paper by some group in
Sydney that developed a benchmark on which this scalability problem
showed up very prominently.  I think your first step should be to
reproduce their results -- my recollection is that Kevin says you
already know that paper, so please dedicate some time to analyze it and
reproduce their workload.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Tom Lane
Teodor Sigaev  writes:
>> There were old threads about considering a risk factor when estimating
>> plans, and I'm thinking this issue is the planner failing to do
>> exactly that.

> I'm afraid it's too late for v10

Yeah, we're surely not opening that can of worms for v10.  Right now
we have to be content with avoiding regressions from 9.6.

BTW, was the larger query plan that you showed (with a Materialize node)
generated by 9.6, or v10 HEAD?  Because I would be surprised if 9.6 did
it.  But this bug could well cause HEAD to insert Materialize nodes in
surprising places, because it would have the effect of making a nestloop
with a single row expected from the outer rel look cheaper with a
Materialize on the inner rel than without.

(Actually I guess 9.6 would have done that too, but only for semi/anti
join cases, limiting the visibility of the bug.)

regards, tom lane




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Claudio Freire
On Fri, Jun 2, 2017 at 11:46 AM, Teodor Sigaev  wrote:
>> There were old threads about considering a risk factor when estimating
>> plans, and I'm thinking this issue is the planner failing to do
>> exactly that.
>>
> I'm afraid it's too late for v10

Clearly




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Teodor Sigaev

There were old threads about considering a risk factor when estimating
plans, and I'm thinking this issue is the planner failing to do
exactly that.


I'm afraid it's too late for v10

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Range Merge Join v1

2017-06-02 Thread Jeff Davis
On Tue, May 30, 2017 at 11:44 PM, Andrew Borodin  wrote:
> Hi, Jeff!

Hi!

> Sorry for being late. Actually, I had several unsuccessful attempts to
> find something wrong with the patch.
> Here's my review.
>
> in pathkey.c
>
> ecs = (EquivalenceClass **) palloc(nClauses * sizeof(EquivalenceClass *));
> scores = (int *) palloc(nClauses * sizeof(int));
> range_ecs = palloc(nClauses * sizeof(bool));
>
> Third assignment has no cast.

Will fix.

> And I have few questions:
> 1. Are there any types, which could benefit from Range Merge and are
> not covered by this patch?

I thought about this for a while, and the only thing I can think of
are range joins that don't explicitly use range types.

> 2. Can Range Merge handle merge of different ranges? Like int4range()
> && int8range() ?

Right now, there aren't even casts between range types. I think the
best way to handle that at this point would be to add casts among the
numeric ranges. There may be advantages to supporting any two ranges
where the contained types are part of the same opfamily, but it seems
a little early to add that complication.
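To illustrate the general technique rather than the patch's actual code (all
names below are hypothetical), a range merge join over two inputs sorted by
range start can keep a "mark" into the inner side and rescan only the inner
ranges that can still overlap the current outer range:

```c
#include <assert.h>
#include <stddef.h>

/* A closed interval [lo, hi]. */
typedef struct { int lo; int hi; } Range;

/*
 * Count overlapping pairs between two range arrays, both sorted by lo.
 * Sketch of the merge strategy: for each outer range, advance a mark past
 * inner ranges that can never match again (their hi is left of outer.lo),
 * then scan forward until inner.lo passes outer.hi.
 */
static int
range_merge_join_count(const Range *outer, size_t nouter,
					   const Range *inner, size_t ninner)
{
	size_t		mark = 0;		/* first inner range that can still overlap */
	int			matches = 0;

	for (size_t i = 0; i < nouter; i++)
	{
		/* inner ranges entirely left of outer[i] can never match again */
		while (mark < ninner && inner[mark].hi < outer[i].lo)
			mark++;

		for (size_t j = mark; j < ninner && inner[j].lo <= outer[i].hi; j++)
		{
			if (inner[j].hi >= outer[i].lo)		/* overlap check */
				matches++;
		}
	}
	return matches;
}
```

Advancing the mark is safe because outer.lo is nondecreasing, so an inner
range whose hi is left of the current outer.lo cannot match any later outer
range either.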

> My perf test script from the previous message was broken, here's fixed
> one in the attachment.
>
> This patch implements feature, contains new tests and passes old
> tests, is documented and spec compliant. I do not see any reason why
> not mark it "Ready for committer".

Great!

I think there are a couple more things that could be done if we want
to. Let me know if you think these things should be done now, or if
they should be a separate patch later when the need arises:

* Support for r1 @> r2 joins (join on "contains" rather than "overlaps").
* Better integration with the catalog so that users could add their
own types that support range merge join.

Thank you for the review.

Regards,
 Jeff Davis




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Claudio Freire
On Fri, Jun 2, 2017 at 10:27 AM, Tom Lane  wrote:
> Teodor Sigaev  writes:
>>> Teodor, could you check if this patch fixes your real-world problem?
>
>> It works fine with the original query, thank you. But some other query
>> slows down by ~10% (9 secs vs 10 secs). Look at the following part of
>> the plan of a huge query:
>> ...
>> As you said, it removes the Materialize node, although it's useful here.
>
> Meh.  If it's expecting only one outer row, it shouldn't be using a
> Materialize on the inner side, period.  That's not sane per the cost
> model.  You haven't shown us enough to guess why the rowcount estimates
> are so far off reality in this query, but I don't think it's the fault
> of this patch if the result is slightly slower given that much error.
>
> Most likely, the answer for your real-world problem is "you need to
> ANALYZE before running the query".
>
> regards, tom lane

I don't know. Perhaps the risky part is assuming rows=1 for non-unique
inner scans. In fact a wrongly estimated rows=1 outer scan would be
just as risky.

There were old threads about considering a risk factor when estimating
plans, and I'm thinking this issue is the planner failing to do
exactly that.




Re: [HACKERS] Hash Functions

2017-06-02 Thread Joe Conway
On 06/02/2017 05:47 AM, Robert Haas wrote:
> On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis  wrote:
>> 2. I basically see two approaches to solve the problem:
>>   (a) Tom suggested at PGCon that we could have a GUC that
>> automatically causes inserts to the partition to be re-routed through
>> the parent. We could discuss whether to always route through the
>> parent, or do a recheck on the partition constraints and only reroute
>> tuples that will fail it. If the user gets into trouble, the worst
>> that would happen is a helpful error message telling them to set the
>> GUC. I like this idea.
> 
> Yeah, that's not crazy.  I find it a bit surprising in terms of the
> semantics, though.  SET
> when_i_try_to_insert_into_a_specific_partition_i_dont_really_mean_it =
> true?

Maybe
  SET partition_tuple_retry = true;
-or-
  SET partition_tuple_reroute = true;
?

I like the idea of only rerouting when failing constraints although I
can envision where there might be use cases where you essentially want
to re-partition and therefore reroute everything, leading to:

  SET partition_tuple_reroute = (none | error | all);

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-02 Thread Petr Jelinek
On 02/06/17 15:37, Amit Kapila wrote:
> On Thu, Jun 1, 2017 at 10:36 PM, Petr Jelinek wrote:
>> On 01/06/17 15:25, Tom Lane wrote:
>>> Robert Haas  writes:
 So, are you going to, perhaps, commit this?  Or who is picking this up?
>>>
 /me knows precious little about Windows.
>>>
>>> I'm not going to be the one to commit this either, but seems like someone
>>> should.
>>>
>>
>> The new code does not use any Windows-specific APIs or anything; it just
>> adds retry logic for reattaching when we do EXEC_BACKEND, which seems to
>> be the agreed way of solving this. I do have a couple of comments about
>> the code though.
>>
>> The new parameter retry_count in PGSharedMemoryReAttach() seems to be
>> used only to decide whether to log reattach issues so that we don't spam
>> the log when retrying, but this fact is not mentioned anywhere.
>>
> 
> No, it is there to avoid freeing, on a retry, memory that was never
> reserved.  See the comment:
> + * On the first try, release memory region reservation that was made by
> + * the postmaster.
> 
> Are you referring to the same function in sysv_shm.c?  If so, I can
> probably refer to the same API in win32_shmem.c, or maybe add a similar
> comment there as well?
> 

Yeah, something like that would help, but my main confusion comes from
the fact that there is a counter (and it is even named as one) while the
only relevant distinction is zero versus non-zero.  I'd like that to be
mentioned, mainly because it confused me on first read.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: retry shm attach for windows (WAS: Re: [HACKERS] OK, so culicidae is *still* broken)

2017-06-02 Thread Amit Kapila
On Thu, Jun 1, 2017 at 10:36 PM, Petr Jelinek wrote:
> On 01/06/17 15:25, Tom Lane wrote:
>> Robert Haas  writes:
>>> So, are you going to, perhaps, commit this?  Or who is picking this up?
>>
>>> /me knows precious little about Windows.
>>
>> I'm not going to be the one to commit this either, but seems like someone
>> should.
>>
>
> The new code does not use any Windows-specific APIs or anything; it just
> adds retry logic for reattaching when we do EXEC_BACKEND, which seems to
> be the agreed way of solving this. I do have a couple of comments about
> the code though.
>
> The new parameter retry_count in PGSharedMemoryReAttach() seems to be
> used only to decide whether to log reattach issues so that we don't spam
> the log when retrying, but this fact is not mentioned anywhere.
>

No, it is there to avoid freeing, on a retry, memory that was never
reserved.  See the comment:
+ * On the first try, release memory region reservation that was made by
+ * the postmaster.

Are you referring to the same function in sysv_shm.c?  If so, I can
probably refer to the same API in win32_shmem.c, or maybe add a similar
comment there as well?
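A skeleton of the retry pattern under discussion, with hypothetical
stand-ins for the real reserve/attach calls: the reattach is retried a
bounded number of times, and the reservation release happens only on the
first attempt (retry_count == 0), since on retries there is nothing to free.

```c
#include <assert.h>

#define MAX_ATTEMPTS 5

/* Hypothetical attach callback; returns nonzero on success. */
typedef int (*attach_fn) (int retry_count);

static int
reattach_with_retry(attach_fn try_attach)
{
	for (int retry_count = 0; retry_count < MAX_ATTEMPTS; retry_count++)
	{
		/*
		 * A real implementation would, when retry_count == 0 only, release
		 * the memory region reservation made by the postmaster before
		 * attaching; on retries there is no reservation to release.
		 */
		if (try_attach(retry_count))
			return retry_count;		/* success: report attempts used */
	}
	return -1;						/* gave up after MAX_ATTEMPTS */
}

/* Example attach that fails twice before succeeding (for demonstration). */
static int
flaky_attach(int retry_count)
{
	return retry_count >= 2;
}
```

The zero-vs-nonzero use of retry_count is exactly the distinction Petr found
underdocumented above.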


> Also, I am not excited about the following coding style:
>> + if (!pgwin32_ReserveSharedMemoryRegion(pi.hProcess))
>> + continue;
>> + else
>> + {
>
> Amit, if you want to avoid having to add curly braces for a single line
> while still having an else, I'd invert the expression in the if ()
> statement so that the true case comes first. It's much less ugly to have
> the curly-brace part first and the continue statement in the else block,
> IMHO.
>

I felt that it is easier to understand the code the way it is currently
written, but I can invert the check if you find it easier to read and
understand that way.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Tom Lane
Teodor Sigaev  writes:
>> Teodor, could you check if this patch fixes your real-world problem?

> It works fine with the original query, thank you. But some other query
> slows down by ~10% (9 secs vs 10 secs). Look at the following part of
> the plan of a huge query:
> ...
> As you said, it removes the Materialize node, although it's useful here.

Meh.  If it's expecting only one outer row, it shouldn't be using a
Materialize on the inner side, period.  That's not sane per the cost
model.  You haven't shown us enough to guess why the rowcount estimates
are so far off reality in this query, but I don't think it's the fault
of this patch if the result is slightly slower given that much error.

Most likely, the answer for your real-world problem is "you need to
ANALYZE before running the query".

regards, tom lane




Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-02 Thread Thomas Munro
On Sat, Jun 3, 2017 at 12:10 AM, Thomas Munro wrote:
> I would have expected that to be handled by separate transition tables
> at different query levels, but evidently it isn't.

I am wondering if we need to wrap the execution of a modifying CTE in
AfterTriggerBeginQuery() and AfterTriggerEndQuery().  But I'm not sure
where, and it may be a couple of days before I can dig some more.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE

2017-06-02 Thread Amit Kapila
On Fri, Jun 2, 2017 at 6:38 PM, Robert Haas  wrote:
> On Fri, Jun 2, 2017 at 9:01 AM, Amit Kapila  wrote:
>> Your reasoning sounds sensible to me.  I think the other way to attack
>> this problem is that we can maintain some local queue in each of the
>> workers when the shared memory queue becomes full.  Basically, we can
>> extend your "Faster processing at Gather node" patch [1] such that
>> instead of fixed sized local queue, we can extend it when the shm
>> queue become full.  I think that way we can handle both the problems
>> (worker won't stall if shm queues are full and workers can do batched
>> writes in shm queue to avoid the shm queue communication overhead) in
>> a similar way.
>
> We still have to bound the amount of memory that we use for queueing
> data in some way.
>

Yeah, probably till work_mem (or some percentage of work_mem).  If we
want to have some extendable solution then we might want to back it up
with some file, however, we might not need to go that far.  I think we
can do some experiments to see how much additional memory is
sufficient to give us maximum benefit.
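As a toy sketch of that idea (hypothetical names, not the actual shm_mq
code): a worker-local buffer that grows geometrically while the shared queue
is full, but never beyond a fixed bound standing in for work_mem, at which
point the caller must stall and flush to the shared queue instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
	char	   *buf;
	size_t		used;
	size_t		allocated;
	size_t		max_bytes;		/* hard cap on local memory */
} LocalQueue;

static void
local_queue_init(LocalQueue *q, size_t initial, size_t max_bytes)
{
	q->buf = malloc(initial);
	q->used = 0;
	q->allocated = initial;
	q->max_bytes = max_bytes;
}

/* Returns 1 if buffered locally, 0 if the caller must flush to shm first. */
static int
local_queue_put(LocalQueue *q, const void *data, size_t len)
{
	if (q->used + len > q->allocated)
	{
		size_t		want = q->allocated * 2;

		while (want < q->used + len)
			want *= 2;
		if (want > q->max_bytes)
			return 0;			/* bound reached: stall and flush instead */
		q->buf = realloc(q->buf, want);
		q->allocated = want;
	}
	memcpy(q->buf + q->used, data, len);
	q->used += len;
	return 1;
}
```

Bounding the growth is the point Robert raises above: the local queue helps
with stalls and batching, but memory use still has to stay capped.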

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 9:01 AM, Amit Kapila  wrote:
> Your reasoning sounds sensible to me.  I think the other way to attack
> this problem is that we can maintain some local queue in each of the
> workers when the shared memory queue becomes full.  Basically, we can
> extend your "Faster processing at Gather node" patch [1] such that
> instead of fixed sized local queue, we can extend it when the shm
> queue become full.  I think that way we can handle both the problems
> (worker won't stall if shm queues are full and workers can do batched
> writes in shm queue to avoid the shm queue communication overhead) in
> a similar way.

We still have to bound the amount of memory that we use for queueing
data in some way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE

2017-06-02 Thread Amit Kapila
On Thu, Jun 1, 2017 at 6:41 PM, Rafia Sabih wrote:
> On Tue, May 30, 2017 at 4:57 PM, Robert Haas  wrote:
>
>> I did a little bit of brief experimentation on this same topic a long
>> time ago and didn't see an improvement from boosting the queue size
>> beyond 64k but Rafia is testing Gather rather than Gather Merge and,
>> as I say, my test was very brief.  I think it would be a good idea to
>> try to get a complete picture here.  Does this help on any query that
>> returns many tuples through the Gather?  Only the ones that use Gather
>> Merge?  Some queries but not others with no obvious pattern?  Only
>> this query?
>>
> I did further exploration, trying other values of
> PARALLEL_TUPLE_QUEUE_SIZE and different queries, and here are my
> findings:
> - even on setting PARALLEL_TUPLE_QUEUE_SIZE to 655360, there isn't
> much improvement in q12 itself.
> - no other TPC-H query shows significant improvement at 6553600.
> There is a small improvement in q3, which also uses gather-merge.
> - as per perf analysis of q12 on head and patch, the percentage for
> ExecGatherMerge is 18% with the patch versus 98% on head, and similarly
> for gather_merge_readnext and gather_merge_writenext.
>
> As per my understanding, this increase in tuple queue size helps only
> gather-merge, particularly where the master stalls a lot in gather-merge
> because it is maintaining the sort order. In q12 the index is
> unclustered and the gather-merge sits just above a parallel index scan,
> so to maintain the order the master likely has to wait a long time for
> the next in-sequence tuple. Something like this might be happening: the
> master takes one tuple from worker 1, then say the next 10 tuples from
> worker 2, and so on, finally returning to worker 1; so once worker 1 has
> done enough to fill its queue, it sits idle. Hence, increasing the tuple
> queue size keeps the workers working longer, and this improves
> performance.
>

Your reasoning sounds sensible to me.  I think the other way to attack
this problem is that we can maintain some local queue in each of the
workers when the shared memory queue becomes full.  Basically, we can
extend your "Faster processing at Gather node" patch [1] such that
instead of fixed sized local queue, we can extend it when the shm
queue become full.  I think that way we can handle both the problems
(worker won't stall if shm queues are full and workers can do batched
writes in shm queue to avoid the shm queue communication overhead) in
a similar way.


[1] - 
https://www.postgresql.org/message-id/CAOGQiiMwhOd5-iKZnizn%2BEdzZmB0bc3xa6rKXQgvhbnQ29zCJg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Hash Functions

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis  wrote:
> 1. For range partitioning, I think it's "yes, a little". As you point
> out, there are already some weird edge cases -- the main way range
> partitioning would make the problem worse is simply by having more
> users.

I agree.

> But for hash partitioning I think the problems will become more
> substantial. Different encodings, endian issues, etc. will be a
> headache for someone, and potentially a bad day if they are urgently
> trying to restore on a new machine. We should remember that not
> everyone is a full-time postgres DBA, and users might reasonably think
> that the default options to pg_dump[all] will give them a portable
> dump.

I agree to an extent.  I think the problem will be worse for hash
partitioning but I might disagree with you on how much worse.  I think
that most people don't do encoding conversions very often, and that
those who do know (or should know) enough to expect trouble.  I think
most people do endian-ness conversions almost never, but since that's
a matter of hardware not configuration I'd like to paper over that
case if we can.
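One way to paper over endian-ness is to feed the value to the hash
byte-by-byte in a fixed order, so big- and little-endian machines agree on
the result. The sketch below uses FNV-1a purely for illustration; it is not
PostgreSQL's hash function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Endian-independent hash of a uint32: bytes are consumed in a fixed
 * (little-endian) order via shifts, so the host's byte order is irrelevant.
 */
static uint32_t
portable_hash_uint32(uint32_t value)
{
	uint32_t	hash = 2166136261u;		/* FNV offset basis */

	for (int i = 0; i < 4; i++)
	{
		hash ^= (value >> (8 * i)) & 0xFF;	/* byte 0 first on any host */
		hash *= 16777619u;				/* FNV prime */
	}
	return hash;
}
```

This addresses only byte order; cross-encoding text comparisons remain the
harder part of "portable" hashing.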

> 2. I basically see two approaches to solve the problem:
>   (a) Tom suggested at PGCon that we could have a GUC that
> automatically causes inserts to the partition to be re-routed through
> the parent. We could discuss whether to always route through the
> parent, or do a recheck on the partition constrains and only reroute
> tuples that will fail it. If the user gets into trouble, the worst
> that would happen is a helpful error message telling them to set the
> GUC. I like this idea.

Yeah, that's not crazy.  I find it a bit surprising in terms of the
semantics, though.  SET
when_i_try_to_insert_into_a_specific_partition_i_dont_really_mean_it =
true?

>   (b) I had suggested before that we could make the default text dump
> (and the default output from pg_restore, for consistency) route
> through the parent. Advanced users would dump with -Fc, and pg_restore
> would support an option to do partition-wise loading. To me, this is
> simpler, but users might forget to use (or not know about) the
> pg_restore option and then it would load more slowly. Also, the ship
> is sailing on range partitioning, so we might prefer option (a) just
> to avoid making any changes.

I think this is a non-starter.  The contents of the dump shouldn't
depend on the format chosen; that is bound to confuse somebody.  I
also do not wish to inflict a speed penalty on the users of
plain-format dumps.

>> 2. Add an option like --dump-partition-data-with-parent.  I'm not sure
>> who originally proposed this, but it seems that everybody likes it.
>> What we disagree about is the degree to which it's sufficient.  Jeff
>> Davis thinks it doesn't go far enough: what if you have an old
>> plain-format dump that you don't want to hand-edit, and the source
>> database is no longer available?  Most people involved in the
>> unconference discussion of partitioning at PGCon seemed to feel that
>> wasn't really something we should be worry about too much.  I had been
>> taking that position also, more or less because I don't see that there
>> are better alternatives.
>
> If the suggestions above are unacceptable, and we don't come up with
> anything better, then of course we have to move on. I am worrying now
> primarily because now is the best time to worry; I don't expect any
> horrible outcome.

OK.

>> 3. Implement portable hash functions (Jeff Davis or me, not sure
>> which).  Andres scoffed at this idea, but I still think it might have
>> legs.
>
> I think it reduces the problem, which has value, but it's hard to make
> it rock-solid.

I agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Do we need the gcc feature "__builtin_expect" to promote the branches prediction?

2017-06-02 Thread Hao Lee
Okay. Thanks

On Fri, Jun 2, 2017 at 7:27 PM, Julien Rouhaud wrote:

> On 02/06/2017 12:50, Craig Ringer wrote:
> >
> >
> > On 2 Jun. 2017 16:42, "Hao Lee" wrote:
> >
> > Hi all,
> >There are a lot of "if statements" in the system, and GCC provides
> > a feature, "__builtin_expect", which lets the compiler know which
> > branch is most likely taken.
> >
> >
> > Compilers and CPUs are really good at guessing this.
> >
> > Humans are wrong about it more than we'd like too.
>
> +1
>
> >
> > It's surprisingly rarely a good idea to use branch prediction hints.
> >
> > See the various Linux kernel discussions about this.
> >
> > If you find concrete sites of frequent or costly branch mis-prediction
> > please point them out, with benchmarks.
>
> And also see this thread:
> https://www.postgresql.org/message-id/CAKJS1f8yqRW3qx2CO9r4bqqvA2Vx6
> 8=3awbh8cjwtp9zxeo...@mail.gmail.com
>
> BTW Andres added support for likely/unlikely in aa3ca5e3dd6, but its
> usage is still really limited.
>
> --
> Julien Rouhaud
> http://dalibo.com - http://dalibo.org
>


Re: [HACKERS] Perfomance bug in v10

2017-06-02 Thread Teodor Sigaev


Teodor, could you check if this patch fixes your real-world problem?


It works fine with the original query, thank you. But some other query slows
down by ~10% (9 secs vs 10 secs). Look at the following part of the plan of
a huge query:


without patch:
->  Nested Loop  (cost=34.82..50.91 rows=1 width=20)
 (actual time=0.017..0.061 rows=5 loops=24121)
  ->  ...
  ->  Materialize  (cost=0.56..15.69 rows=1 width=5)
   (actual time=0.003..0.004 rows=2 loops=109061)
->  Index Scan using ... (cost=0.56..15.68 rows=1 width=5)
 (actual time=0.013..0.014 rows=2 loops=24121)

with patch:
->  Nested Loop  (cost=34.82..50.91 rows=1 width=20)
 (actual time=0.018..0.063 rows=5 loops=24121)
  -> ...
  ->  Index Scan using ...  (cost=0.56..15.68 rows=1 width=5)
(actual time=0.012..0.013 rows=2 loops=109061)

(dots hidden the same parts)

As you said, it removes the Materialize node, although it's useful here.

If you wish, I can put together a test suite (its size will be around 10MB) and send 
it by private email.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make check false success

2017-06-02 Thread Robert Haas
On Thu, Jun 1, 2017 at 10:18 AM, Sandro Santilli  wrote:
> I noticed that the `check` Makefile rule imported by PGXS is giving
> a success exit code even when it is unsupported.
>
> The attached patch fixes that.

Hmm.  I'm not 100% sure that the existing behavior is wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default Partition for Range

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 8:09 AM, Amit Kapila  wrote:
> I think if you have found spelling mistakes unrelated to this patch,
> then it is better to submit those as a separate patch in a new thread.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump

2017-06-02 Thread Robert Haas
On Thu, Jun 1, 2017 at 10:05 PM, Tom Lane  wrote:
> I dunno.  What's the actual use-case, other than as a bad workaround
> to a problem we should fix a different way?

Well, that's a fair point.  I don't have a specific use case in mind.
However, I also don't think that options for controlling what gets
dumped should be subjected to extreme vetting.  On the strength mostly
of my own experiences trying to solve database problems in the real
world, I generally think that pg_dump could benefit from significantly
more options to control what gets dumped.  The existing options are
pretty good, but it's not that hard to run into a situation that they
don't cover.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-02 Thread Thomas Munro
On Fri, Jun 2, 2017 at 10:48 PM, Marko Tiikkaja  wrote:
> Since the subject of transition tables came up, I thought I'd test how this
> case works:
>
> =# create table qwr(a int);
> CREATE TABLE
>
> =# create function qq() returns trigger as $$ begin raise notice '%',
> (select count(*) from oo); return null; end $$ language plpgsql;
> CREATE FUNCTION
>
> =# create trigger xx after insert on qwr referencing new table as oo for
> each statement execute procedure qq();
> CREATE TRIGGER
>
> =# with t as (insert into qwr values (1)) insert into qwr values (2), (3);
> NOTICE:  3
> NOTICE:  3
> INSERT 0 2
>
> to me, this means that it doesn't work.  Surely one of the trigger
> invocations should say 1, and the other 2.  Or was this intentional?

I would have expected that to be handled by separate transition tables
at different query levels, but evidently it isn't.  The following
crashes:

create table table1(a int);
create table table2(a text);

create function trigfunc() returns trigger as $$ begin raise notice
'got: %', (select oo from oo); return null; end $$ language plpgsql;

create trigger trig1 after insert on table1 referencing new table as
oo for each statement execute procedure trigfunc();

create trigger trig2 after insert on table2 referencing new table as
oo for each statement execute procedure trigfunc();

with t as (insert into table1 values (1)) insert into table2 values ('hello');

I've run out of day today but will investigate.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Default Partition for Range

2017-06-02 Thread Amit Kapila
On Fri, Jun 2, 2017 at 4:12 PM, Rafia Sabih
 wrote:
> On Wed, May 31, 2017 at 5:53 PM, Beena Emerson  
> wrote:
>
> Hi Beena,
>
> I had a look at the patch from the angle of aesthetics and there are a
> few cosmetic changes I might suggest. Please have a look at the
> attached patch and if you agree with those changes you may merge it on
> your patch. The patch also fixes a couple of more spelling mistakes
> unrelated to this patch.
>

I think if you have found spelling mistakes unrelated to this patch,
then it is better to submit those as a separate patch in a new thread.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need the gcc feature "__builtin_expect" to promote the branches prediction?

2017-06-02 Thread Julien Rouhaud
On 02/06/2017 12:50, Craig Ringer wrote:
> 
> 
> On 2 Jun. 2017 16:42, "Hao Lee"  > wrote:
> 
> Hi all, 
> There are a lot of "if statements" in the system, and GCC provides
> a feature, "__builtin_expect", which lets compilers know which branch
> is most likely taken.
> 
> 
> Compilers and CPUs are really good at guessing this.
> 
> Humans are wrong about it more than we'd like too.

+1

> 
> It's surprisingly rarely a good idea to use branch prediction hints.
>
> See the various Linux kernel discussions about this.
> 
> If you find concrete sites of frequent or costly branch mis-prediction
> please point them out, with benchmarks.

And also see this thread:
https://www.postgresql.org/message-id/CAKJS1f8yqRW3qx2CO9r4bqqvA2Vx68=3awbh8cjwtp9zxeo...@mail.gmail.com

BTW Andres added support for likely/unlikely in aa3ca5e3dd6, but its
usage is still really limited.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPDATE of partition key

2017-06-02 Thread Amit Khandekar
On 2 June 2017 at 01:17, Robert Haas  wrote:
> On Thu, Jun 1, 2017 at 7:41 AM, Amit Khandekar  wrote:
>>> Regarding the trigger issue, I can't claim to have a terribly strong
>>> opinion on this.  I think that practically anything we do here might
>>> upset somebody, but probably any halfway-reasonable thing we choose to
>>> do will be OK for most people.  However, there seems to be a
>>> discrepancy between the approach that got the most votes and the one
>>> that is implemented by the v8 patch, so that seems like something to
>>> fix.
>>
>> Yes, I have started working on updating the patch to use that approach
>> (BR and AR update triggers on source and destination partition
>> respectively, instead of delete+insert) The approach taken by the
>> patch (BR update + delete+insert triggers) didn't require any changes
>> in the way ExecDelete() and ExecInsert() were called. Now we would
>> require to skip the delete/insert triggers, so some flags need to be
>> passed to these functions, or else have stripped down versions of
>> ExecDelete() and ExecInsert() which don't do other things like
>> RETURNING handling and firing triggers.
>
> See, that strikes me as a pretty good argument for firing the
> DELETE+INSERT triggers...
>
> I'm not wedded to that approach, but "what makes the code simplest?"
> is not a bad tiebreak, other things being equal.

Yes, that sounds good to me. But I think we should wait for others'
opinions, because it is quite understandable that two triggers firing on
the same partition sounds odd.

>
>>> In terms of the approach taken by the patch itself, it seems
>>> surprising to me that the patch only calls
>>> ExecSetupPartitionTupleRouting when an update fails the partition
>>> constraint.  Note that in the insert case, we call that function at
>>> the start of execution;
>>
>>> calling it in the middle seems to involve additional hazards;
>>> for example, is it really safe to add additional
>>> ResultRelInfos midway through the operation?
>>
>> I thought since the additional ResultRelInfos go into
>> mtstate->mt_partitions which is independent of
>> estate->es_result_relations, that should be safe.
>
> I don't know.  That sounds scary to me, but it might be OK.  Probably
> needs more study.
>
>>> Is it safe to take more locks midway through the operation?
>>
>> I can imagine some rows already updated, when other tasks like ALTER
>> TABLE or CREATE INDEX happen on other partitions which are still
>> unlocked, and then for row movement we try to lock these other
>> partitions and wait for the DDL tasks to complete. But I didn't see
>> any particular issues with that. But correct me if you suspect a
>> possible issue. One issue can be if we were able to modify the table
>> attributes, but I believe we cannot do that for inherited columns.
>
> It's just that it's very unlike what we do anywhere else.  I don't
> have a real specific idea in mind about what might totally break, but
> at a minimum it could certainly cause behavior that can't happen
> today.  Today, if you run a query on some tables, it will block
> waiting for any locks at the beginning of the query, and the query
> won't begin executing until it has all of the locks.  With this
> approach, you might block midway through; you might even deadlock
> midway through.  Maybe that's not overtly broken, but it's at least
> got the possibility of being surprising.
>
> Now, I'd actually kind of like to have behavior like this for other
> cases, too.  If we're inserting one row, can't we just lock the one
> partition into which it needs to get inserted, rather than all of
> them?  But I'm wary of introducing such behavior incidentally in a
> patch whose main goal is to allow UPDATE row movement.  Figuring out
> what could go wrong and fixing it seems like a substantial project all
> of its own.

Yes, I agree it makes sense to avoid, as far as possible, introducing
something we haven't tried before in this patch.

>
>> The reason I thought it cannot be done at the start of the execution,
>> is because even if we know that update is not modifying the partition
>> key column, we are not certain that the final NEW row has its
>> partition key column unchanged, because of triggers. I understand it
>> might be weird for a user requiring to modify a partition key value,
>> but if a user does that, it will result in crash because we won't have
>> the partition routing setup, thinking that there is no partition key
>> column in the UPDATE.
>
> I think we could avoid that issue.  Suppose we select the target
> partition based only on the original NEW tuple.  If a trigger on that
> partition subsequently modifies the tuple so that it no longer
> satisfies the partition constraint for that partition, just let it
> ERROR out normally.

Ok, so you are saying, don't allow a partition trigger to initiate the
row movement. I think we should keep this as a documented restriction.
Actually it would 

Re: [HACKERS] Broken hint bits (freeze)

2017-06-02 Thread Dmitriy Sarafannikov
Thanks for all.

We found the source of the problem. It was mistake in upgrade to 9.6.

We upgrade replica with rsync as it is in the documentation:
rsync --verbose --relative --archive --hard-links --size-only old_pgdata 
new_pgdata remote_dir

We must provide 100% read-only availability of our shard (master + 2 replicas).
So we can’t stop master and both replicas, upgrade them one by one and start 
them.
We do it as follows:
Close master from load, stop master, upgrade it, stop 1st replica, upgrade it, 
start 1st replica,
stop 2nd replica, upgrade it, start 2nd replica, start master, open master.
But the upgraded replicas died under load without statistics, so we decided to 
perform
ANALYZE on the master before upgrading the replicas. In this case the statistics 
would be copied to the replicas by rsync.
The upgrade algorithm became as follows:
Close master, stop master, close master from replicas (iptables), upgrade 
master,
start master, perform analyze, stop master, stop 1st replica, upgrade 1st 
replica,
start 1st replica, stop 2nd replica, upgrade 2nd replica, start 2nd replica,
start master, open master.

If autovacuum starts vacuuming relations while we are performing analyze, WAL 
records
generated by it will not be replayed on replicas, because the next step is stopping
the master with a checkpoint, and a new redo location LSN (newer than these WAL 
records)
will appear in the pg_control file, which will then be copied by rsync to the replicas.

If it was a simple vacuum, we would most likely not see the consequences, because 
it marks
tuples as deleted, some of the next new tuples will be placed there, and due 
to FPW
the replicas will receive the correct page, identical to the master's.
But if it was a vacuum to prevent wraparound, we will see a situation like ours: 
tuples on the
master will be frozen, but on the replicas they will not be. And this will not 
change as long
as nobody updates any tuple on that page.

It’s dangerous because, if we perform a switchover to a replica, the «corrupted» page
will be delivered to all replicas after the next update of any tuple on this page.

We reproduced this case in our test environment and this assumption was 
confirmed.

Starting and stopping master after running pg_upgrade but before rsync to 
collect statistics
was a bad idea.

We know how to find such «corrupted» tuples. And we want to fix this by manually
freezing tuples via calling specially written C functions. Functions are 
«copy-pasted»
and simplified code from vacuum functions with SQL interface (see attachment).
Can you look at them? Do you think it is safe to use them for fixing corrupted 
pages,
or is there a better way not to lose data?

Regards,
Dmitriy Sarafannikov



freeze_tuple.c
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do we need the gcc feature "__builtin_expect" to promote the branches prediction?

2017-06-02 Thread Craig Ringer
On 2 Jun. 2017 16:42, "Hao Lee"  wrote:

Hi all,
   There are a lot of "if statements" in the system, and GCC provides a
feature, "__builtin_expect", which lets compilers know which branch is
most likely taken.


Compilers and CPUs are really good at guessing this.

Humans are wrong about it more than we'd like too.

It's surprisingly rarely a good idea to use branch prediction hints.

See the various Linux kernel discussions about this.

If you find concrete sites of frequent or costly branch mis-prediction
please point them out, with benchmarks.


As we know, a misprediction leads to a performance loss (because the
CPU will throw away some instructions and re-fetch new ones), so we can
tell GCC how to produce more efficient code, for example as follows.
It will gain a performance improvement, I think. As far as I know, in the
Linux kernel this feature is already applied.

 #define likely(cond) __builtin_expect(cond,true)
#define unlikely(cond)  __builtin_expect(cond,false)

if (likely(cond)) {
//most likely run.
   
} else //otherwise.
{
   
}


Best Regards.

Hom.


[HACKERS] PG10 transition tables, wCTEs and multiple operations on the same table

2017-06-02 Thread Marko Tiikkaja
Since the subject of transition tables came up, I thought I'd test how this
case works:

=# create table qwr(a int);
CREATE TABLE

=# create function qq() returns trigger as $$ begin raise notice '%',
(select count(*) from oo); return null; end $$ language plpgsql;
CREATE FUNCTION

=# create trigger xx after insert on qwr referencing new table as oo for
each statement execute procedure qq();
CREATE TRIGGER

=# with t as (insert into qwr values (1)) insert into qwr values (2), (3);
NOTICE:  3
NOTICE:  3
INSERT 0 2

to me, this means that it doesn't work.  Surely one of the trigger
invocations should say 1, and the other 2.  Or was this intentional?


.m


Re: [HACKERS] Default Partition for Range

2017-06-02 Thread Rafia Sabih
On Wed, May 31, 2017 at 5:53 PM, Beena Emerson  wrote:
> Hello,
>
> PFA the updated patch.
> Dependent patch default_partition_v17.patch [1]
> This patch adds regression tests and removes the separate function to
> get default partition qual.
>
>
> On Mon, May 29, 2017 at 3:22 PM, Jeevan Ladhe
>  wrote:
>> Hi Beena,
>>
>> I went through your patch, and here are some of my comments:
>>
>> - For generating a qual for default range partition, instead of scanning for
>> all
>> the children and collecting all the boundspecs, how about creating and
>> negating
>> an expression from the lists of lowerdatums and upperdatums in boundinfo.
>>
>
> Unlike list partitioning, range partitioning can be over multiple columns,
> and the expressions get complicated. I have used the same logic of looping
> through different partitions and negating the ORed expr. However, this
> is now done under get_qual_for_range.
>
>
>> - Wrong comment:
>> + int default_index; /* Index of the default list partition. */
>
> Comment is made generic in the dependent patch.
>
>>
>> - Suggested by Robert earlier on default list partitioning thread, instead
>> of
>> abbreviating is_def/found_def use is(found)_default etc.
>
> corrected.
>
>>
>> - unrelated change:
>> - List   *all_parts;
>> + List   *all_parts;
>>
>
> undone.
>
>> - typo: partiton should be partition, similar typo at other places too.
>> +  * A non-default range partiton table does not currently allow partition
>> keys
>>
>
> rectified.
>
>> - Useless hunk for this patch:
>> - Oiddefid;
>> + Oid defid;
>
> undone.
>
>>
>> - better to use IsA here, instead of doing explicit check:
>> + bound->content[i] = (datum->type == T_DefElem)
>> + ? RANGE_DATUM_DEFAULT
>
> Modified
>
>>
>> - It is better to use head of either lowerdatums or upperdatums list to
>> verify
>> if its a default partition and get rid of the constant PARTITION_DEFAULT
>> altogether.
>>
>
> modified this part as necessary.
>
>
>> - In the function get_qual_from_partbound() is_def is always going to be
>> false
>> for range partition, the function get_qual_for_range can be directly passed
>> false instead.
>>
>> - Following comment for function get_qual_for_range_default() implies that
>> this
>> function returns bool, but the actually it returns a List.
>> + *
>> + * If DEFAULT is the only partiton for the table then this returns TRUE.
>> + *
>>
> Updated.
>
> [1] http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg315573.html
>

Hi Beena,

I had a look at the patch from the angle of aesthetics and there are a
few cosmetic changes I might suggest. Please have a look at the
attached patch and if you agree with those changes you may merge it on
your patch. The patch also fixes a couple of more spelling mistakes
unrelated to this patch.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


cosmetic_range_default_partition.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Why restore_command is called for existing files in pg_xlog?

2017-06-02 Thread Alexander Kukushkin
Hello hackers,

There is one strange and awful thing I don't understand about
restore_command: it is always being called for every single WAL segment
postgres wants to apply (even if such segment already exists in pg_xlog)
until replica start streaming from the master.

If there is no restore_command in the recovery.conf - it perfectly works,
i.e. postgres replays existing wal segments and at some point connects to
the master and start streaming from it.

When recovery.conf is there, starting a replica can become a real
problem, especially if restore_command is slow.

Is it possible to change this behavior somehow? First look into pg_xlog, and
call restore_command only if the file is missing or "corrupted".


Regards,
---
Alexander Kukushkin


Re: [HACKERS] Bug in ExecModifyTable function and trigger issues for foreign tables

2017-06-02 Thread Etsuro Fujita

On 2017/05/16 21:36, Etsuro Fujita wrote:
One approach I came up with to fix this issue is to rewrite the 
targetList entries of an inherited UPDATE/DELETE properly using 
rewriteTargetListUD, when generating a plan for each child table in 
inheritance_planner.  Attached is a WIP patch for that.  Maybe I am 
missing something, though.


While updating the patch, I noticed the patch rewrites the UPDATE 
targetList incorrectly in some cases; rewrite_inherited_tlist I added to 
adjust_appendrel_attrs (1) removes all junk items from the targetList 
and (2) adds junk items for the child table using rewriteTargetListUD, 
but it's wrong to drop all junk items in cases where there are junk 
items for some other reasons than rewriteTargetListUD.  Consider junk 
items containing MULTIEXPR SubLink.  One way I came up with to fix this 
is to change (1) to only remove junk items with resname; since junk 
items added by rewriteTargetListUD should have resname (note: we would 
need resname to call ExecFindJunkAttributeInTlist at execution time!) 
while other junk items wouldn't have resname (see 
transformUpdateTargetList), we could correctly replace junk items added 
by rewriteTargetListUD for the parent with ones for the child, by that 
change.  I might be missing something, though.  Comments welcome.


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] comment fix in attoptcache.c

2017-06-02 Thread Magnus Hagander
On Fri, Jun 2, 2017 at 11:14 AM, Amit Langote  wrote:

> Noticed a comment copy-pasted from spccache.c, which attached patch fixes
> to match the file.
>
>  /*
>   * InitializeAttoptCache
> - *  Initialize the tablespace cache.
> + *  Initialize the attribute options cache.
>   */
>

Applied, thanks.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


[HACKERS] comment fix in attoptcache.c

2017-06-02 Thread Amit Langote
Noticed a comment copy-pasted from spccache.c, which attached patch fixes
to match the file.

 /*
  * InitializeAttoptCache
- *  Initialize the tablespace cache.
+ *  Initialize the attribute options cache.
  */

Thanks,
Amit
diff --git a/src/backend/utils/cache/attoptcache.c 
b/src/backend/utils/cache/attoptcache.c
index f7f85b53db..4b30e6bc62 100644
--- a/src/backend/utils/cache/attoptcache.c
+++ b/src/backend/utils/cache/attoptcache.c
@@ -71,7 +71,7 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 
hashvalue)
 
 /*
  * InitializeAttoptCache
- * Initialize the tablespace cache.
+ * Initialize the attribute options cache.
  */
 static void
 InitializeAttoptCache(void)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Do we need the gcc feature "__builtin_expect" to promote the branches prediction?

2017-06-02 Thread Hao Lee
Hi all,
   There are a lot of "if statements" in the system, and GCC provides a
feature, "__builtin_expect", which lets compilers know which branch is
most likely taken. As we know, a misprediction leads to a performance
loss (because the CPU will throw away some instructions and re-fetch
new ones), so we can tell GCC how to produce more efficient code, for
example as follows.
It will gain a performance improvement, I think. As far as I know, in the
Linux kernel this feature is already applied.

 #define likely(cond) __builtin_expect(cond,true)
#define unlikely(cond)  __builtin_expect(cond,false)

if (likely(cond)) {
//most likely run.
   
} else //otherwise.
{
   
}


Best Regards.

Hom.


Re: [HACKERS] proposal: PLpgSQL parallel assignemnt

2017-06-02 Thread Pavel Stehule
2017-06-02 10:06 GMT+02:00 Craig Ringer :

> On 2 June 2017 at 15:51, Pavel Stehule  wrote:
>
> > a, b := fx();
> >
> > Comments, notes, ideas?
>
> I'd be pretty happy to have
>
> (a, b) = (x, y);
> (a, b) = f(x);
>
> which is SQL-esque.
>

This is not too far from my proposal - and it is a fully adequate alternative.


>
> But what, if anything, does Ada do?
>

As far as I know, no, Ada does not have this statement - but the design of OUT
parameters in Ada is absolutely different from PostgreSQL's - so in this case we
cannot use the Ada language as our base :(

Regards

Pavel


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-06-02 Thread Mengxing Liu
Hi,  Alvaro and Kevin.

> Anyway, this is just my analysis. 
> So I want to hack the PG and count the conflict lists' size of transactions. 
> That would be more accurate.

In the last week, I hacked PG to add an additional thread to count RWConflict
list lengths,
and tuned the benchmark to generate more conflicts. But the result is still not good.

> 
> > 
> > Yeah, you need a workload that generates a longer conflict list -- if
> > you can make the tool generate a conflict list with a configurable
> > length, that's even better (say, 100 conflicts vs. 1000 conflicts).
> > Then we can see how the conflict list processing scales.
> > 
> 
> Yes, I tried to increase the read set to make more conflicts. 
> However the abort ratio will also increase. The CPU cycles consumed by 
> conflict tracking are still less than 1%.
> According to the design of PG, a transaction will be aborted if there is a 
> rw-antidependency. 
> In this case, a transaction with a longer conflict list is more likely to 
> be aborted.
> That means, the conflict list doesn't have too many chances to grow too long. 
> 

To solve this problem, I use just two kinds of transactions: read-only 
transactions and update-only transactions.
In this case, no transaction would have an In-RWConflict and an Out-RWConflict 
at the same time.
Thus transactions would not be aborted by conflict checking.

Specifically, The benchmark is like this:
The table has 10K rows. 
Read-only transactions read 1K rows and Update-only transactions update 20 
random rows of the table. 

In this benchmark, about 91% of conflict lists are shorter than 10;
6% have lengths between 10 and 20; only 2% are longer than 20.
The CPU utilization of CheckForSerializableConflictOut/In is 0.71%/0.69%.

I tried to increase the write set. As a result, conflict lists become longer, 
but the total CPU utilization decreases (to about 50%);
the CPU is not the bottleneck anymore. I'm not familiar with other parts of PG. Is 
it caused by locking? Is there any chance to get rid of this problem?

BTW, I found that email is not very convenient, especially when I have 
a problem and want to discuss it with you.
Would you mind scheduling a meeting every week via Skype, or other instant 
messaging software you like?
It would not take too much of your time; maybe one hour is enough.


Sincerely.
--
Mengxing Liu

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PLpgSQL parallel assignemnt

2017-06-02 Thread Craig Ringer
On 2 June 2017 at 15:51, Pavel Stehule  wrote:

> a, b := fx();
>
> Comments, notes, ideas?

I'd be pretty happy to have

(a, b) = (x, y);
(a, b) = f(x);

which is SQL-esque.

But what, if anything, does Ada do?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] proposal: PLpgSQL parallel assignemnt

2017-06-02 Thread Pavel Stehule
Hi

Some modern or old languages (GO, Lua, CLU) has similarly designed function
OUT parameters like PLpgSQL.

These languages usually support parallel assignment. It is not supported by
PLpgSQL - there is a workaround using SELECT FROM, but it is just a
workaround. The implementation of parallel assignment is trivial.

Current state:
==

CREATE OR REPLACE FUNCTION fx(OUT a int, OUT b int)
AS $$ ...

possibilities of CALL and assignment

recvar := fx();
rowvar := fx();

SELECT * FROM fx() INTO a, b;

Lua, Golang like proposal:

a, b := fx();

Comments, notes, ideas?

Regards

Pavel


Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

2017-06-02 Thread Rafia Sabih
On Tue, May 16, 2017 at 9:39 PM, Robert Haas  wrote:
> On Sun, May 7, 2017 at 11:54 AM, Robert Haas  wrote:
>> I'm having second thoughts based on some more experimentation I did
>> this morning.  I'll update again once I've had a bit more time to poke
>> at it.
>
> So the issue that I noticed here is that this problem really isn't
> confined to abort processing.  If we ROLLBACK TO SAVEPOINT or ABORT
> TRANSACTION on an open connection, we really do not know what the
> state of that connection is until we get an acknowledgement that the
> command completed successfully.  The patch handles that.  However, the
> same is true if we're sending a SAVEPOINT or RELEASE SAVEPOINT
> command, and the patch that I posted doesn't do anything about those
> cases.  I think it would be best to fix all transaction control
> commands in a symmetric manner.
>
> Concretely, I think we should replace the abort_cleanup_incomplete
> flag from my previous patch with a changing_xact_state flag and set
> that flag around all transaction state changes, clearing it when such
> changes have succeeded.  On error, the flag remains set, so we know
> that the state of that connection is unknown and that we must close it
> (killing outer transaction levels as needed).
>
> Thoughts?

I tried following the sketch stated above, and here's what I added to
v2 of the patch. In begin_remote_xact when savepoint is send to the
remote server through do_sql_command I set the changing_xact_state
flag and when it successfully returns from there the flag is unset.
Similar behaviour is followed in pgfdw_subxact_callback for release
savepoint. Additionally, I added this flag set-unset for start
transaction command in begin_remote_xact. Enlighten me if I've
missed/misunderstood something here. I ran the regression tests for
postgres_fdw and they went fine.

I have a couple of concerns here. Since there is only one flag required
for the purpose, its name, changing_xact_state, doesn't fit well at
some places, particularly in pgfdw_subxact_callback; I am not sure
whether to change the comment or the variable name:

/* Disarm abort_cleanup_incomplete if it all worked. */
+ entry->changing_xact_state = abort_cleanup_failure;

Also, by any chance should we add a test-case for this?

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


improve-pgfdw-abort-behavior-v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix a typo in predicate.c

2017-06-02 Thread Magnus Hagander
On Fri, Jun 2, 2017 at 7:32 AM, Masahiko Sawada 
wrote:

> Hi,
>
> While reading predicate lock source code, I found a comment typo in
> predicate.c file.
> Attached patch fixes it.
>
> s/scrach/scratch/
>

Applied, thanks.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] logical replication - still unstable after all these months

2017-06-02 Thread Mark Kirkwood

On 02/06/17 17:11, Erik Rijkers wrote:


On 2017-06-02 00:46, Mark Kirkwood wrote:

On 31/05/17 21:16, Petr Jelinek wrote:

I'm seeing a new failure with the patch applied - this time the
history table has missing rows. Petr, I'll put back your access :-)


Is this error during 1-minute runs?

I'm asking because I've moved back to longer (1-hour) runs (no errors 
so far), and I'd like to keep track of what the most 'vulnerable' 
parameters are.




Yeah, still using your test config (with my minor modifications).

When I got the error the 1st time, I did a complete make clean and 
rebuild, but it is still possible I've 'done it wrong' - so 
independent confirmation would be good!


Cheers

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

