Re: [HACKERS] walsender & parallelism

2017-06-09 Thread Petr Jelinek
On 09/06/17 20:56, Andres Freund wrote:
> On 2017-06-08 23:04:31 -0700, Noah Misch wrote:
>> On Wed, Jun 07, 2017 at 10:54:57PM -0700, Noah Misch wrote:
>>> On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote:
 On 2017-06-02 22:12:46 -0700, Noah Misch wrote:
> On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
>> On 5/31/17 23:54, Peter Eisentraut wrote:
>>> On 5/29/17 22:01, Noah Misch wrote:
 On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
>  wrote:
>> Hi,
>>
>> so this didn't really move anywhere AFAICS, do we think the approach
>> I've chosen is good or do we want to do something else here?
>
> Can you add it to the open items list?

 [Action required within three days.  This is a generic notification.]
>>>
>>> I have posted an update.  The next update will be Friday.
>>
>> Andres is now working on this.  Let's give him a bit of time. ;-)
>
> If you intended this as your soon-due status update, it is missing a 
> mandatory
> bit.

 I'm now owning this item.  I've posted patches, await review.  If none
 were to be forthcoming, I'll do another pass Monday, and commit.
>>>
>>> This PostgreSQL 10 open item is past due for your status update.  Kindly 
>>> send
>>> a status update within 24 hours, and include a date for your subsequent 
>>> status
>>> update.  Refer to the policy on open item ownership:
>>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>>
>> IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
>> for your status update.  Please reacquaint yourself with the policy on open
>> item ownership[1] and then reply immediately.  If I do not hear from you by
>> 2017-06-10 07:00 UTC, I will transfer this item to release management team
>> ownership without further notice.
> 
> The issue starting this thread is resolved, including several issues
> found in its wake, with 47fd420fb4d3e77dde960312f8672c82b14ecbad
> 6e1dd2773eb60a6ab87b27b8d9391b756e904ac3 and 
> c1abe6c786d8f00643de8519140d77644b474163
> 
> There's a remaining testing patch, and some related things left though.
> Basically patches 0001 and 0003 from
> http://archives.postgresql.org/message-id/81eaca60-3f7b-6a21-5531-fd51ffbf3f63%402ndquadrant.com
> I'd personally rather see a separate open-items entry for those, since
> this isn't related to paralellism, signal handler or such.
> 
> Petr, Peter (henceforth P^2), do you agree?

Makes sense, one could argue that 0003 could be even 2 open items with 2
patches - one with the fixes for parser and one for the other type of
message support (I am not even sure if we want to add other message
support into PG10).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-09 Thread Andres Freund
On 2017-06-08 23:04:31 -0700, Noah Misch wrote:
> On Wed, Jun 07, 2017 at 10:54:57PM -0700, Noah Misch wrote:
> > On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote:
> > > On 2017-06-02 22:12:46 -0700, Noah Misch wrote:
> > > > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
> > > > > On 5/31/17 23:54, Peter Eisentraut wrote:
> > > > > > On 5/29/17 22:01, Noah Misch wrote:
> > > > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> > > > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> > > > > >>>  wrote:
> > > > >  Hi,
> > > > > 
> > > > >  so this didn't really move anywhere AFAICS, do we think the 
> > > > >  approach
> > > > >  I've chosen is good or do we want to do something else here?
> > > > > >>>
> > > > > >>> Can you add it to the open items list?
> > > > > >>
> > > > > >> [Action required within three days.  This is a generic 
> > > > > >> notification.]
> > > > > > 
> > > > > > I have posted an update.  The next update will be Friday.
> > > > > 
> > > > > Andres is now working on this.  Let's give him a bit of time. ;-)
> > > > 
> > > > If you intended this as your soon-due status update, it is missing a 
> > > > mandatory
> > > > bit.
> > > 
> > > I'm now owning this item.  I've posted patches, await review.  If none
> > > were to be forthcoming, I'll do another pass Monday, and commit.
> > 
> > This PostgreSQL 10 open item is past due for your status update.  Kindly 
> > send
> > a status update within 24 hours, and include a date for your subsequent 
> > status
> > update.  Refer to the policy on open item ownership:
> > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> 
> IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
> for your status update.  Please reacquaint yourself with the policy on open
> item ownership[1] and then reply immediately.  If I do not hear from you by
> 2017-06-10 07:00 UTC, I will transfer this item to release management team
> ownership without further notice.

The issue starting this thread is resolved, including several issues
found in its wake, with 47fd420fb4d3e77dde960312f8672c82b14ecbad
6e1dd2773eb60a6ab87b27b8d9391b756e904ac3 and 
c1abe6c786d8f00643de8519140d77644b474163

There's a remaining testing patch, and some related things left though.
Basically patches 0001 and 0003 from
http://archives.postgresql.org/message-id/81eaca60-3f7b-6a21-5531-fd51ffbf3f63%402ndquadrant.com
I'd personally rather see a separate open-items entry for those, since
this isn't related to paralellism, signal handler or such.

Petr, Peter (henceforth P^2), do you agree?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-08 Thread Noah Misch
On Wed, Jun 07, 2017 at 10:54:57PM -0700, Noah Misch wrote:
> On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote:
> > On 2017-06-02 22:12:46 -0700, Noah Misch wrote:
> > > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
> > > > On 5/31/17 23:54, Peter Eisentraut wrote:
> > > > > On 5/29/17 22:01, Noah Misch wrote:
> > > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> > > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> > > > >>>  wrote:
> > > >  Hi,
> > > > 
> > > >  so this didn't really move anywhere AFAICS, do we think the 
> > > >  approach
> > > >  I've chosen is good or do we want to do something else here?
> > > > >>>
> > > > >>> Can you add it to the open items list?
> > > > >>
> > > > >> [Action required within three days.  This is a generic notification.]
> > > > > 
> > > > > I have posted an update.  The next update will be Friday.
> > > > 
> > > > Andres is now working on this.  Let's give him a bit of time. ;-)
> > > 
> > > If you intended this as your soon-due status update, it is missing a 
> > > mandatory
> > > bit.
> > 
> > I'm now owning this item.  I've posted patches, await review.  If none
> > were to be forthcoming, I'll do another pass Monday, and commit.
> 
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is long past due
for your status update.  Please reacquaint yourself with the policy on open
item ownership[1] and then reply immediately.  If I do not hear from you by
2017-06-10 07:00 UTC, I will transfer this item to release management team
ownership without further notice.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-07 Thread Noah Misch
On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote:
> On 2017-06-02 22:12:46 -0700, Noah Misch wrote:
> > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
> > > On 5/31/17 23:54, Peter Eisentraut wrote:
> > > > On 5/29/17 22:01, Noah Misch wrote:
> > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> > > >>>  wrote:
> > >  Hi,
> > > 
> > >  so this didn't really move anywhere AFAICS, do we think the approach
> > >  I've chosen is good or do we want to do something else here?
> > > >>>
> > > >>> Can you add it to the open items list?
> > > >>
> > > >> [Action required within three days.  This is a generic notification.]
> > > > 
> > > > I have posted an update.  The next update will be Friday.
> > > 
> > > Andres is now working on this.  Let's give him a bit of time. ;-)
> > 
> > If you intended this as your soon-due status update, it is missing a 
> > mandatory
> > bit.
> 
> I'm now owning this item.  I've posted patches, await review.  If none
> were to be forthcoming, I'll do another pass Monday, and commit.

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Andres Freund
On 2017-06-02 22:12:46 -0700, Noah Misch wrote:
> On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
> > On 5/31/17 23:54, Peter Eisentraut wrote:
> > > On 5/29/17 22:01, Noah Misch wrote:
> > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> > >>>  wrote:
> >  Hi,
> > 
> >  so this didn't really move anywhere AFAICS, do we think the approach
> >  I've chosen is good or do we want to do something else here?
> > >>>
> > >>> Can you add it to the open items list?
> > >>
> > >> [Action required within three days.  This is a generic notification.]
> > > 
> > > I have posted an update.  The next update will be Friday.
> > 
> > Andres is now working on this.  Let's give him a bit of time. ;-)
> 
> If you intended this as your soon-due status update, it is missing a mandatory
> bit.

I'm now owning this item.  I've posted patches, await review.  If none
were to be forthcoming, I'll do another pass Monday, and commit.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Noah Misch
On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote:
> On 5/31/17 23:54, Peter Eisentraut wrote:
> > On 5/29/17 22:01, Noah Misch wrote:
> >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> >>>  wrote:
>  Hi,
> 
>  so this didn't really move anywhere AFAICS, do we think the approach
>  I've chosen is good or do we want to do something else here?
> >>>
> >>> Can you add it to the open items list?
> >>
> >> [Action required within three days.  This is a generic notification.]
> > 
> > I have posted an update.  The next update will be Friday.
> 
> Andres is now working on this.  Let's give him a bit of time. ;-)

If you intended this as your soon-due status update, it is missing a mandatory
bit.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Andres Freund
On 2017-06-02 23:27:55 -0400, Peter Eisentraut wrote:
> On 5/31/17 23:54, Peter Eisentraut wrote:
> > On 5/29/17 22:01, Noah Misch wrote:
> >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek 
> >>>  wrote:
>  Hi,
> 
>  so this didn't really move anywhere AFAICS, do we think the approach
>  I've chosen is good or do we want to do something else here?
> >>>
> >>> Can you add it to the open items list?
> >>
> >> [Action required within three days.  This is a generic notification.]
> > 
> > I have posted an update.  The next update will be Friday.
> 
> Andres is now working on this.  Let's give him a bit of time. ;-)

Yep.

I've posted patches for most things in this thread over at
http://archives.postgresql.org/message-id/20170603002023.rcppadggfiaav377%40alap3.anarazel.de
because the proper fixes here are only possible after resolving the
discussion there.  The only thing I know of that's going remaining
afterwards is SIGHUP handling, of which there's already a patch in this
thread (although I want to prune its scope before committing).

I suspect that the review for those will probably happen over the next
few days, I plan to push those afterwards.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-02 Thread Peter Eisentraut
On 5/31/17 23:54, Peter Eisentraut wrote:
> On 5/29/17 22:01, Noah Misch wrote:
>> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
>>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
>>> wrote:
 Hi,

 so this didn't really move anywhere AFAICS, do we think the approach
 I've chosen is good or do we want to do something else here?
>>>
>>> Can you add it to the open items list?
>>
>> [Action required within three days.  This is a generic notification.]
> 
> I have posted an update.  The next update will be Friday.

Andres is now working on this.  Let's give him a bit of time. ;-)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-01 Thread Andres Freund
On 2017-06-01 22:17:57 -0400, Peter Eisentraut wrote:
> On 6/1/17 00:06, Andres Freund wrote:
> > On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote:
> >> I think the easiest and safest thing to do now is to just prevent
> >> parallel plans in the walsender.  See attached patch.  This prevents the
> >> hang in the select_parallel tests run under your new test setup.
> > I'm not quite sure I can buy this.  The lack of wired up signals has
> > more problems than just hurting parallelism.
> 
> Which problems are those?

For example not wiring up sigusr1, which is the cause of the paralellism
hang, breaks several things including recovery conflict interrupts.
Which means HS standby is affected. Just forbidding parallelism doesn't
address this.

> I can see from the code what they might be,
> but which ones actually happen in practice?  And are there any
> regressions from 9.6?

Yes. For example the issue that atm walsender backends don't ever quit
when executing sql statements is new.

> If someone wants to work all this out, that would be great.  But I'm
> here mainly to fix the one reported problem.

I'll deal with the issues in this thread.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-01 Thread Peter Eisentraut
On 6/1/17 00:06, Andres Freund wrote:
> On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote:
>> I think the easiest and safest thing to do now is to just prevent
>> parallel plans in the walsender.  See attached patch.  This prevents the
>> hang in the select_parallel tests run under your new test setup.
> I'm not quite sure I can buy this.  The lack of wired up signals has
> more problems than just hurting parallelism.

Which problems are those?  I can see from the code what they might be,
but which ones actually happen in practice?  And are there any
regressions from 9.6?

If someone wants to work all this out, that would be great.  But I'm
here mainly to fix the one reported problem.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-06-01 Thread Petr Jelinek
On 01/06/17 06:06, Andres Freund wrote:
> On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote:
>> I think the easiest and safest thing to do now is to just prevent
>> parallel plans in the walsender.  See attached patch.  This prevents the
>> hang in the select_parallel tests run under your new test setup.
> 
> I'm not quite sure I can buy this.  The lack of wired up signals has
> more problems than just hurting parallelism.  In fact, the USR1 thing
> seems like something that we actually should backpatch, rather than
> defer to v11.  I think there's some fair arguments to be made that we
> shouldn't do the refactoring right now - although I'm not sure about it
> - but just not fixing the bugs seems like a bad plan.
> 

I think the signal handling needs to be fixed. It does not have to be
done via large refactoring, but signals should be handled properly (= we
need to share SIGHUP/SIGUSR1 handling between postgres.c and walsender.c).

The rest can wait for PG11.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-31 Thread Andres Freund
On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote:
> I think the easiest and safest thing to do now is to just prevent
> parallel plans in the walsender.  See attached patch.  This prevents the
> hang in the select_parallel tests run under your new test setup.

I'm not quite sure I can buy this.  The lack of wired up signals has
more problems than just hurting parallelism.  In fact, the USR1 thing
seems like something that we actually should backpatch, rather than
defer to v11.  I think there's some fair arguments to be made that we
shouldn't do the refactoring right now - although I'm not sure about it
- but just not fixing the bugs seems like a bad plan.


> @@ -272,6 +273,7 @@ standard_planner(Query *parse, int cursorOptions, 
> ParamListInfo boundParams)
>*/
>   if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
>   IsUnderPostmaster &&
> + !am_walsender &&
>   dynamic_shared_memory_type != DSM_IMPL_NONE &&
>   parse->commandType == CMD_SELECT &&
>   !parse->hasModifyingCTE &&

If we were to go for this, surely this'd need a comment.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-31 Thread Craig Ringer
On 1 June 2017 at 11:51, Peter Eisentraut
 wrote:

> Unifying the signal handling and query processing further seems like a
> good idea, but the patches are pretty involved, so I suggest to put them
> into the next commit fest.

I had a quick look a the idea of just getting rid of walsenders as a
specific entity. Making them normal backends wherever possible. But it
starts running into trouble when you're allowing walsenders to stay
alive well into postmaster shutdown when normal backends are
terminated, so there's probably going to be a need for more ongoing
separation than I'd really like to stop walsenders doing things they
can't safely do during shutdown.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-31 Thread Peter Eisentraut
On 5/29/17 22:01, Noah Misch wrote:
> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
>> wrote:
>>> Hi,
>>>
>>> so this didn't really move anywhere AFAICS, do we think the approach
>>> I've chosen is good or do we want to do something else here?
>>
>> Can you add it to the open items list?
> 
> [Action required within three days.  This is a generic notification.]

I have posted an update.  The next update will be Friday.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-31 Thread Peter Eisentraut
On 5/23/17 13:57, Petr Jelinek wrote:
> On 23/05/17 19:45, Andres Freund wrote:
>>
>>
>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
>> wrote:
>>> Hi,
>>>
>>> so this didn't really move anywhere AFAICS, do we think the approach
>>> I've chosen is good or do we want to do something else here?
>>
>> Can you add it to the open items list?
>>
> 
> Done

I think the easiest and safest thing to do now is to just prevent
parallel plans in the walsender.  See attached patch.  This prevents the
hang in the select_parallel tests run under your new test setup.

Unifying the signal handling and query processing further seems like a
good idea, but the patches are pretty involved, so I suggest to put them
into the next commit fest.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 26002c4e58647c67cce97e66b34b489bb838e39a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 31 May 2017 23:27:13 -0400
Subject: [PATCH] Prevent parallel query in walsender

Parallel query in walsender currently hangs, because walsender does not
have the full SIGUSR1 signal handler to handle this.  It's safer and
easier to prevent using a parallel plan for now.  Eventually, the signal
handling in walsender should be unified with regular backends.

Reported-by: Andres Freund 
---
 src/backend/optimizer/plan/planner.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 40cb79d4cd..bb921ba29c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -50,6 +50,7 @@
 #include "parser/analyze.h"
 #include "parser/parsetree.h"
 #include "parser/parse_agg.h"
+#include "replication/walsender.h"
 #include "rewrite/rewriteManip.h"
 #include "storage/dsm_impl.h"
 #include "utils/rel.h"
@@ -272,6 +273,7 @@ standard_planner(Query *parse, int cursorOptions, 
ParamListInfo boundParams)
 */
if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
IsUnderPostmaster &&
+   !am_walsender &&
dynamic_shared_memory_type != DSM_IMPL_NONE &&
parse->commandType == CMD_SELECT &&
!parse->hasModifyingCTE &&
-- 
2.13.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-29 Thread Noah Misch
On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote:
> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
> wrote:
> >Hi,
> >
> >so this didn't really move anywhere AFAICS, do we think the approach
> >I've chosen is good or do we want to do something else here?
> 
> Can you add it to the open items list?

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-23 Thread Petr Jelinek
On 23/05/17 19:45, Andres Freund wrote:
> 
> 
> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
> wrote:
>> Hi,
>>
>> so this didn't really move anywhere AFAICS, do we think the approach
>> I've chosen is good or do we want to do something else here?
> 
> Can you add it to the open items list?
> 

Done

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-23 Thread Andres Freund


On May 23, 2017 1:42:41 PM EDT, Petr Jelinek  
wrote:
>Hi,
>
>so this didn't really move anywhere AFAICS, do we think the approach
>I've chosen is good or do we want to do something else here?

Can you add it to the open items list?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-23 Thread Petr Jelinek
Hi,

so this didn't really move anywhere AFAICS, do we think the approach
I've chosen is good or do we want to do something else here?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-05-13 Thread Andres Freund
On 2017-04-25 11:18:35 -0400, Robert Haas wrote:
> On Thu, Apr 20, 2017 at 9:40 PM, Andres Freund  wrote:
> > This'd be easier to look at if the fairly separate facility of allowing
> > walsender to execute SQL commands had been committed separately, rather
> > than as part of a fairly large commit of largely unrelated things.
> 
> Good grief, yes.  I really don't think that should have been slipped
> into a commit that was mostly about logical replication, with only a
> brief mention in the commit message and, AFAICS, no documentation
> whatsoever.

In my opinion some explanation about how this came about would be
appropriate.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-25 Thread Robert Haas
On Thu, Apr 20, 2017 at 9:40 PM, Andres Freund  wrote:
> This'd be easier to look at if the fairly separate facility of allowing
> walsender to execute SQL commands had been committed separately, rather
> than as part of a fairly large commit of largely unrelated things.

Good grief, yes.  I really don't think that should have been slipped
into a commit that was mostly about logical replication, with only a
brief mention in the commit message and, AFAICS, no documentation
whatsoever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-24 Thread Petr Jelinek
On 25/04/17 01:25, Andres Freund wrote:
> Hi,
> 
> On 2017-04-24 07:31:18 +0200, Petr Jelinek wrote:
>> The previous coding tried to run the unknown string throur lexer which
>> could fail for some valid SQL statements as the replication command
>> parser is too simple to handle all the complexities of SQL language.
>>
>> Instead just fall back to SQL parser on any unknown command.
>>
>> Also remove SHOW command handling from walsender as it's handled by the
>> simple query code path.
> 
> This break SHOW var; over the plain replication connections though
> (which was quite intentionally supported), so I don't think that's ok?
> 

Hmm I guess we don't use this anywhere in tests (which is not surprising
as it would have to be used in plain walsender connection). Fixed.

> 
> I don't like much how SHOW and walsender understanding SQL statements
> turned out, I think code like
> 
>   if (am_walsender)
>   {
>   if 
> (!exec_replication_command(query_string))
>   
> exec_simple_query(query_string);
>   }
>   else
>   exec_simple_query(query_string);
> 
> shows some of the issues here.
> 

Not sure how it's related to SHOW, but in any case, it's just result of
the parsing fallback.

> 
>> New alternative output for largeobject test has been added as the
>> replication connection does not support fastpath function calls.
> 
> I think just supporting fastpath over the replication protocol is less
> work than maintaining the alternative file.
> 
> 
>> --- a/src/Makefile.global.in
>> +++ b/src/Makefile.global.in
>> @@ -321,6 +321,7 @@ BZIP2= bzip2
>>  # Testing
>>  
>>  check: temp-install
>> +check-walsender-sql: temp-install
> 
> This doesn't strike me as something that should go into
> a/src/Makefile.global.in - I'd rather put it in
> b/src/test/regress/GNUmakefile.  check is in .global because it's used
> in a lot of different makefiles, but that's not true for this target, right?
> 

Shows how well I know our build system :)

Anyway, I gave it a try to support as much as possible and the result is
attached.

First patch adds the new make target that runs regression tests through
walsender. I had to change one test to run SHOW TIMEZONE instead of SHOW
TIME ZONE, otherwise no tests need to be changed.

The second patch unifies the sighup handling across all kinds of
processes. This is mostly replacing got_SIGHuP with new variable and
removing local SIGHUP handlers in favor of generic one where possible.

And the third patch changes walsender. There are 3 things it does a) it
overhauls signal handling there, b) it allows both fast path function
calls and extended protocol messages (once I did function fast path I
didn't really see any difference in terms of extended protocol, please
correct me if I am wrong in that) and c) improves the fallback in parser.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


0001-Add-check-walsender-sql-make-target.patch
Description: binary/octet-stream


0002-Unify-SIGHUP-hadnling-across-all-types-of-processes.patch
Description: binary/octet-stream


0003-Make-walsender-behave-more-like-normal-backend.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-24 Thread Andres Freund
Hi,

On 2017-04-24 07:31:18 +0200, Petr Jelinek wrote:
> The previous coding tried to run the unknown string throur lexer which
> could fail for some valid SQL statements as the replication command
> parser is too simple to handle all the complexities of SQL language.
> 
> Instead just fall back to SQL parser on any unknown command.
> 
> Also remove SHOW command handling from walsender as it's handled by the
> simple query code path.

This break SHOW var; over the plain replication connections though
(which was quite intentionally supported), so I don't think that's ok?


I don't like much how SHOW and walsender understanding SQL statements
turned out, I think code like

if (am_walsender)
{
if 
(!exec_replication_command(query_string))

exec_simple_query(query_string);
}
else
exec_simple_query(query_string);

shows some of the issues here.


> Checks the SQL over walsender interface by running the standard set of
> regression tests against it.

I like that approach a lot.


> New alternative output for largeobject test has been added as the
> replication connection does not support fastpath function calls.

I think just supporting fastpath over the replication protocol is less
work than maintaining the alternative file.


> --- a/src/Makefile.global.in
> +++ b/src/Makefile.global.in
> @@ -321,6 +321,7 @@ BZIP2 = bzip2
>  # Testing
>  
>  check: temp-install
> +check-walsender-sql: temp-install

This doesn't strike me as something that should go into
a/src/Makefile.global.in - I'd rather put it in
b/src/test/regress/GNUmakefile.  check is in .global because it's used
in a lot of different makefiles, but that's not true for this target, right?



Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-24 Thread Petr Jelinek
On 24/04/17 20:00, Andres Freund wrote:
> On 2017-04-24 18:29:51 +0200, Petr Jelinek wrote:
>> On 24/04/17 07:42, Andres Freund wrote:
>>>
>>>
>>> On April 23, 2017 10:31:18 PM PDT, Petr Jelinek 
>>>  wrote:
 On 24/04/17 04:31, Petr Jelinek wrote:
 So actually maybe running regression tests through it might be
 reasonable approach if we add new make target for it.
>>>
>>> That sounds like a good plan.
>>>
>>>
 Note that the first patch is huge. That's because I needed to add
 alternative output for largeobject test because it uses fastpath
 function calls which are not allowed over replication protocol.
>>>
>>> There's no need for that restriction, is there?  At least for db 
>>> walsenders...
>>>
>>
>> No, there is no real need to restring the extended protocol either but
>> we do so currently. The point of allowing SQL was to allow logical
>> replication to work, not to merge walsender completely into normal
>> backend code.
> 
> Well, that's understandable, but there's also the competing issue that
> we need something that is well defined and behaved.
> 

It's well defined, it supports simple protocol queries, that's it.

> 
>> Obviously it
>> means walsender is still special but as I said, my plan was to make it
>> work for logical replication not to merge it completely with existing
>> backends.
> 
> Yea, and I don't think that's an argument for anything on its own,
> sorry.
> 

It's not meant argument, it's plain statement of my intentions. I am not
stopping you from doing more if you want, however I don't see that it's
needed or any arguments about why it is needed.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-24 Thread Andres Freund
On 2017-04-24 18:29:51 +0200, Petr Jelinek wrote:
> On 24/04/17 07:42, Andres Freund wrote:
> > 
> > 
> > On April 23, 2017 10:31:18 PM PDT, Petr Jelinek 
> >  wrote:
> >> On 24/04/17 04:31, Petr Jelinek wrote:
> >> So actually maybe running regression tests through it might be
> >> reasonable approach if we add new make target for it.
> > 
> > That sounds like a good plan.
> > 
> > 
> >> Note that the first patch is huge. That's because I needed to add
> >> alternative output for largeobject test because it uses fastpath
> >> function calls which are not allowed over replication protocol.
> > 
> > There's no need for that restriction, is there?  At least for db 
> > walsenders...
> > 
> 
> No, there is no real need to restring the extended protocol either but
> we do so currently. The point of allowing SQL was to allow logical
> replication to work, not to merge walsender completely into normal
> backend code.

Well, that's understandable, but there's also the competing issue that
we need something that is well defined and behaved.


> Obviously it
> means walsender is still special but as I said, my plan was to make it
> work for logical replication not to merge it completely with existing
> backends.

Yea, and I don't think that's an argument for anything on its own,
sorry.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-24 Thread Petr Jelinek
On 24/04/17 07:42, Andres Freund wrote:
> 
> 
> On April 23, 2017 10:31:18 PM PDT, Petr Jelinek 
>  wrote:
>> On 24/04/17 04:31, Petr Jelinek wrote:
>> So actually maybe running regression tests through it might be
>> reasonable approach if we add new make target for it.
> 
> That sounds like a good plan.
> 
> 
>> Note that the first patch is huge. That's because I needed to add
>> alternative output for largeobject test because it uses fastpath
>> function calls which are not allowed over replication protocol.
> 
> There's no need for that restriction, is there?  At least for db walsenders...
> 

No, there is no real need to restring the extended protocol either but
we do so currently. The point of allowing SQL was to allow logical
replication to work, not to merge walsender completely into normal
backend code. And I don't know about differences well enough to go for
the full merge, especially not at this stage of PG10 dev.

>> As part of this I realized that the parser fallback that I wrote
>> originally for handling SQL in walsender is not robust enough as the
>> lexer would fail on some valid SQL statements. So there is also second
>> patch that does this using different approach which allows any SQL
>> statement.
> 
> Haven't looked at the new thing yet, but I'm not particularly surprised...  
> Wonder of there's a good way to fully integrate both grammars, without 
> exploding keywords.
> 

I think we'd need to keyword the first word of every replication command
if we wanted to merge both grammar files together. I am not sure if
there is all that much need for that, the pass-through for everything
that is not replication command seems to work just fine. Obviously it
means walsender is still special but as I said, my plan was to make it
work for logical replication not to merge it completely with existing
backends.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund


On April 23, 2017 10:31:18 PM PDT, Petr Jelinek  
wrote:
>On 24/04/17 04:31, Petr Jelinek wrote:
>So actually maybe running regression tests through it might be
>reasonable approach if we add new make target for it.

That sounds like a good plan.


>Note that the first patch is huge. That's because I needed to add
>alternative output for largeobject test because it uses fastpath
>function calls which are not allowed over replication protocol.

There's no need for that restriction, is there?  At least for db walsenders...

>As part of this I realized that the parser fallback that I wrote
>originally for handling SQL in walsender is not robust enough as the
>lexer would fail on some valid SQL statements. So there is also second
>patch that does this using different approach which allows any SQL
>statement.

Haven't looked at the new thing yet, but I'm not particularly surprised...  
Wonder of there's a good way to fully integrate both grammars, without 
exploding keywords.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
On 2017-04-24 04:27:58 +0200, Petr Jelinek wrote:
> On 24/04/17 01:43, Andres Freund wrote:
> > 
> >> BTW while looking at the code, I don't understand why we call
> >> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
> >> the SetLatch be enough (they both end up calling sendSelfPipeByte()
> >> eventually)?
> > 
> > Historic raisins - there didn't use to be a SetLatch in
> > procsignal_sigusr1_handler. That changed when I whacked around catchup &
> > notify to be based on latches ([1] and following).
> > 
> > [1] 
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47
> > 
> 
> Okay, but why call both SetLatch and latch_sigusr1_handler? What does
> that buy us?

Nothing.  It's how the code evolved, we can change that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
On 2017-04-24 04:26:16 +0200, Petr Jelinek wrote:
> > WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> > currently sets a different variable from postgres.c, but that seems like
> > a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> > WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> > an active bug to me?
> > 
> 
> Hmm I don't think good solution is to use same variable though,
> walsender needs to do slightly different thing on SIGHUP,

Hm? You mean SyncRepInitConfig()?  I don't think that matters, because
it's only needed if config is changed while streaming.

> plus the
> got_SIGHUP is not the type of variable we want to export. I wonder if it
> would be enough to just add check for got_SIGHUP somewhere at the
> beginning of exec_replication_command().

I don't think that'd be sufficient, because we really want config
changes to get picked up while idle, and that won't work from within
exec_replication_command().  And that pretty much means it'll have to
happen outside of walsender.c.

FWIW, while it's not pretty, I actually see very little reason not to
share got_SIGHUP (with a bit less generic name name) between different
types of processes (i.e. putting it in miscadmin.h or such). It's not
exactly pretty, but there's also no benefit in duplicating it
everywhere, and without it you run into the issue presented here.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Petr Jelinek
On 24/04/17 02:04, Andres Freund wrote:
> Hi,
> 
> Oh, and one more point: There desperately need to be tests exercising
> "SQL via walsender". Including the issue of parallelism here, the missed
> cancel handler, timeouts, and such.  One way to do so is to use
> pg_regress with an adjusted connection string (specifying
> replication=database), doing the same for isolationtester, or using
> dblink.
> 

Well, there are some tests (the logical replication uses that
functionality) and it's not quite clear to me what all to run there.
I assume you don't want to rerun whole regression suite against
walsender given the time it would take in normal tests (although I could
see doing that optionally somehow) but then what to pick from there.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Petr Jelinek
On 24/04/17 01:43, Andres Freund wrote:
> 
>> BTW while looking at the code, I don't understand why we call
>> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
>> the SetLatch be enough (they both end up calling sendSelfPipeByte()
>> eventually)?
> 
> Historic raisins - there didn't use to be a SetLatch in
> procsignal_sigusr1_handler. That changed when I whacked around catchup &
> notify to be based on latches ([1] and following).
> 
> [1] 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47
> 

Okay, but why call both SetLatch and latch_sigusr1_handler? What does
that buy us?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Petr Jelinek
On 24/04/17 01:59, Andres Freund wrote:
> Hi,
> 
> On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
>> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
>> does not break anything for existing walsender usage.
> 
>> diff --git a/src/backend/replication/logical/launcher.c 
>> b/src/backend/replication/logical/launcher.c
>> index 761fbfa..e9dd886 100644
>> --- a/src/backend/replication/logical/launcher.c
>> +++ b/src/backend/replication/logical/launcher.c
>> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char 
>> *subname, Oid userid,
>>  BackgroundWorkerbgw;
>>  BackgroundWorkerHandle *bgw_handle;
>>  int i;
>> -int slot;
>> +int slot = 0;
>>  LogicalRepWorker   *worker = NULL;
>>  int nsyncworkers = 0;
>>  TimestampTz now = GetCurrentTimestamp();
> 
> That seems independent and unnecessary?
> 

Yeah that leaked from different patch I was working on in parallel.

> 
>> -/* SIGUSR1: set flag to send WAL records */
>> -static void
>> -WalSndXLogSendHandler(SIGNAL_ARGS)
>> -{
>> -int save_errno = errno;
>> -
>> -latch_sigusr1_handler();
>> -
>> -errno = save_errno;
>> -}
> 
>>  pqsignal(SIGPIPE, SIG_IGN);
>> -pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
>> */
>> +pqsignal(SIGUSR1, procsignal_sigusr1_handler);
> 
> Those aren't actually entirely equivalent. I'm not sure it matters much,
> but WalSndXLogSendHandler didn't include a SetLatch(), but
> procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
> will just be elided, but what if walsender already woke up and did it's
> work?
> 

They aren't exactly same no, but I don't see harm in what
procsignal_sigusr1_handler. I mean worst case scenario is latch is set
so walsender checks for work.

> I think it'd be better to have PostgresMain() set up signals by default,
> and then have WalSndSignals() overwrites the ones it needs separate.

Hmm that sounds like a good idea.

> WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> currently sets a different variable from postgres.c, but that seems like
> a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> an active bug to me?
> 

Hmm I don't think good solution is to use same variable though,
walsender needs to do slightly different thing on SIGHUP, plus the
got_SIGHUP is not the type of variable we want to export. I wonder if it
would be enough to just add check for got_SIGHUP somewhere at the
beginning of exec_replication_command().

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-23 16:59:41 -0700, Andres Freund wrote:
> Hi,
> 
> On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
> > Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
> > does not break anything for existing walsender usage.
> 
> > diff --git a/src/backend/replication/logical/launcher.c 
> > b/src/backend/replication/logical/launcher.c
> > index 761fbfa..e9dd886 100644
> > --- a/src/backend/replication/logical/launcher.c
> > +++ b/src/backend/replication/logical/launcher.c
> > @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const 
> > char *subname, Oid userid,
> > BackgroundWorkerbgw;
> > BackgroundWorkerHandle *bgw_handle;
> > int i;
> > -   int slot;
> > +   int slot = 0;
> > LogicalRepWorker   *worker = NULL;
> > int nsyncworkers = 0;
> > TimestampTz now = GetCurrentTimestamp();
> 
> That seems independent and unnecessary?
> 
> 
> > -/* SIGUSR1: set flag to send WAL records */
> > -static void
> > -WalSndXLogSendHandler(SIGNAL_ARGS)
> > -{
> > -   int save_errno = errno;
> > -
> > -   latch_sigusr1_handler();
> > -
> > -   errno = save_errno;
> > -}
> 
> > pqsignal(SIGPIPE, SIG_IGN);
> > -   pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
> > */
> > +   pqsignal(SIGUSR1, procsignal_sigusr1_handler);
> 
> Those aren't actually entirely equivalent. I'm not sure it matters much,
> but WalSndXLogSendHandler didn't include a SetLatch(), but
> procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
> will just be elided, but what if walsender already woke up and did it's
> work?
> 
> I think it'd be better to have PostgresMain() set up signals by default,
> and then have WalSndSignals() overwrites the ones it needs separate.
> WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
> currently sets a different variable from postgres.c, but that seems like
> a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
> WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
> an active bug to me?

Oh, and one more point: There desperately need to be tests exercising
"SQL via walsender". Including the issue of parallelism here, the missed
cancel handler, timeouts, and such.  One way to do so is to use
pg_regress with an adjusted connection string (specifying
replication=database), doing the same for isolationtester, or using
dblink.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
Hi,

On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote:
> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
> does not break anything for existing walsender usage.

> diff --git a/src/backend/replication/logical/launcher.c 
> b/src/backend/replication/logical/launcher.c
> index 761fbfa..e9dd886 100644
> --- a/src/backend/replication/logical/launcher.c
> +++ b/src/backend/replication/logical/launcher.c
> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char 
> *subname, Oid userid,
>   BackgroundWorkerbgw;
>   BackgroundWorkerHandle *bgw_handle;
>   int i;
> - int slot;
> + int slot = 0;
>   LogicalRepWorker   *worker = NULL;
>   int nsyncworkers = 0;
>   TimestampTz now = GetCurrentTimestamp();

That seems independent and unnecessary?


> -/* SIGUSR1: set flag to send WAL records */
> -static void
> -WalSndXLogSendHandler(SIGNAL_ARGS)
> -{
> - int save_errno = errno;
> -
> - latch_sigusr1_handler();
> -
> - errno = save_errno;
> -}

>   pqsignal(SIGPIPE, SIG_IGN);
> - pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
> */
> + pqsignal(SIGUSR1, procsignal_sigusr1_handler);

Those aren't actually entirely equivalent. I'm not sure it matters much,
but WalSndXLogSendHandler didn't include a SetLatch(), but
procsignal_sigusr1_handler() does.  Normally redundant SetLatch() calls
will just be elided, but what if walsender already woke up and did it's
work?

I think it'd be better to have PostgresMain() set up signals by default,
and then have WalSndSignals() overwrites the ones it needs separate.
WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler
currently sets a different variable from postgres.c, but that seems like
a bad idea, because afaics we'll plainly ignore SIGHUPS unless in
WalSndLoop, WalSndWriteData,WalSndWaitForWal.  That actually seems like
an active bug to me?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-23 Thread Andres Freund
On 2017-04-21 04:20:26 +0200, Petr Jelinek wrote:
> Looks like SIGUSR1 being different is problem here - it's normally used
> to . I also noticed that we don't handle SIGINT (query cancel).

I think we really need to unify the paths between walsender and normal
backends to a much larger degree.


> BTW while looking at the code, I don't understand why we call
> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
> the SetLatch be enough (they both end up calling sendSelfPipeByte()
> eventually)?

Historic raisins - there didn't use to be a SetLatch in
procsignal_sigusr1_handler. That changed when I whacked around catchup &
notify to be based on latches ([1] and following).

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-22 Thread Petr Jelinek
On 21/04/17 04:37, Petr Jelinek wrote:
> On 21/04/17 04:32, Craig Ringer wrote:
>> On 21 April 2017 at 10:20, Petr Jelinek  wrote:
>>> On 21/04/17 03:40, Andres Freund wrote:

 Since [1] walsender (not receiver as commit message says) can execute
 SQL queries.  While doing some testing of [2] I noticed that SQL queries
 in walsender get stuck if parallelism is used - I have not investigated
 why that is yet, but it surely is an issue.  On first blush I'd suspect
 that some signalling is not wired up correctly (cf. am_walsender branches
 in PostgresMain() and such).
>>>
>>> Looks like SIGUSR1 being different is problem here - it's normally used
>>> to . I also noticed that we don't handle SIGINT (query cancel).
>>>
>>> I'll write proper patch but can you try to just use
>>> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
>>> the issue?
>>
>> That's what my recovery conflicts for logical decoding on standby
>> patch does, FWIW.
>>
>> I haven't found any issues yet..
>>
> 
> Ah I knew I've seen that change somewhere. I thought it was either in my
> patch or master which is why I thought it's working fine already.
> 

Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it
does not break anything for existing walsender usage.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From f59218135092bec9a64cf86b75f1f62a0271d448 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Fri, 21 Apr 2017 05:11:30 +0200
Subject: [PATCH] Handle signals correctly in walsender

Since we now support SQL queries in walsender, we also need to handle
SIGINT for query cancel and SIGUSR1 for IPC.
---
 src/backend/replication/logical/launcher.c |  2 +-
 src/backend/replication/walsender.c| 16 ++--
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/logical/launcher.c 
b/src/backend/replication/logical/launcher.c
index 761fbfa..e9dd886 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char 
*subname, Oid userid,
BackgroundWorkerbgw;
BackgroundWorkerHandle *bgw_handle;
int i;
-   int slot;
+   int slot = 0;
LogicalRepWorker   *worker = NULL;
int nsyncworkers = 0;
TimestampTz now = GetCurrentTimestamp();
diff --git a/src/backend/replication/walsender.c 
b/src/backend/replication/walsender.c
index b4ce75a..78369ae 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -212,7 +212,6 @@ static struct
 
 /* Signal handlers */
 static void WalSndSigHupHandler(SIGNAL_ARGS);
-static void WalSndXLogSendHandler(SIGNAL_ARGS);
 static void WalSndLastCycleHandler(SIGNAL_ARGS);
 
 /* Prototypes for private functions */
@@ -2830,17 +2829,6 @@ WalSndSigHupHandler(SIGNAL_ARGS)
errno = save_errno;
 }
 
-/* SIGUSR1: set flag to send WAL records */
-static void
-WalSndXLogSendHandler(SIGNAL_ARGS)
-{
-   int save_errno = errno;
-
-   latch_sigusr1_handler();
-
-   errno = save_errno;
-}
-
 /* SIGUSR2: set flag to do a last cycle and shut down afterwards */
 static void
 WalSndLastCycleHandler(SIGNAL_ARGS)
@@ -2869,12 +2857,12 @@ WalSndSignals(void)
/* Set up signal handlers */
pqsignal(SIGHUP, WalSndSigHupHandler);  /* set flag to read 
config

 * file */
-   pqsignal(SIGINT, SIG_IGN);  /* not used */
+   pqsignal(SIGINT, StatementCancelHandler);   /* query cancel */
pqsignal(SIGTERM, die); /* request shutdown */
pqsignal(SIGQUIT, quickdie);/* hard crash time */
InitializeTimeouts();   /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
-   pqsignal(SIGUSR1, WalSndXLogSendHandler);   /* request WAL sending 
*/
+   pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, WalSndLastCycleHandler);  /* request a last cycle 
and

 * shutdown */
 
-- 
2.7.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-20 Thread Petr Jelinek
On 21/04/17 04:32, Craig Ringer wrote:
> On 21 April 2017 at 10:20, Petr Jelinek  wrote:
>> On 21/04/17 03:40, Andres Freund wrote:
>>>
>>> Since [1] walsender (not receiver as commit message says) can execute
>>> SQL queries.  While doing some testing of [2] I noticed that SQL queries
>>> in walsender get stuck if parallelism is used - I have not investigated
>>> why that is yet, but it surely is an issue.  On first blush I'd suspect
>>> that some signalling is not wired up correctly (cf. am_walsender branches
>>> in PostgresMain() and such).
>>
>> Looks like SIGUSR1 being different is problem here - it's normally used
>> to . I also noticed that we don't handle SIGINT (query cancel).
>>
>> I'll write proper patch but can you try to just use
>> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
>> the issue?
> 
> That's what my recovery conflicts for logical decoding on standby
> patch does, FWIW.
> 
> I haven't found any issues yet..
> 

Ah I knew I've seen that change somewhere. I thought it was either in my
patch or master which is why I thought it's working fine already.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-20 Thread Craig Ringer
On 21 April 2017 at 10:20, Petr Jelinek  wrote:
> On 21/04/17 03:40, Andres Freund wrote:
>>
>> Since [1] walsender (not receiver as commit message says) can execute
>> SQL queries.  While doing some testing of [2] I noticed that SQL queries
>> in walsender get stuck if parallelism is used - I have not investigated
>> why that is yet, but it surely is an issue.  On first blush I'd suspect
>> that some signalling is not wired up correctly (cf. am_walsender branches
>> in PostgresMain() and such).
>
> Looks like SIGUSR1 being different is problem here - it's normally used
> to . I also noticed that we don't handle SIGINT (query cancel).
>
> I'll write proper patch but can you try to just use
> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
> the issue?

That's what my recovery conflicts for logical decoding on standby
patch does, FWIW.

I haven't found any issues yet..

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] walsender & parallelism

2017-04-20 Thread Petr Jelinek
On 21/04/17 03:40, Andres Freund wrote:
> 
> Since [1] walsender (not receiver as commit message says) can execute
> SQL queries.  While doing some testing of [2] I noticed that SQL queries
> in walsender get stuck if parallelism is used - I have not investigated
> why that is yet, but it surely is an issue.  On first blush I'd suspect
> that some signalling is not wired up correctly (cf. am_walsender branches
> in PostgresMain() and such).

Looks like SIGUSR1 being different is problem here - it's normally used
to . I also noticed that we don't handle SIGINT (query cancel).

I'll write proper patch but can you try to just use
procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes
the issue?

BTW while looking at the code, I don't understand why we call
latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just
the SetLatch be enough (they both end up calling sendSelfPipeByte()
eventually)?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers