Re: [HACKERS] walsender & parallelism
On 09/06/17 20:56, Andres Freund wrote: > On 2017-06-08 23:04:31 -0700, Noah Misch wrote: >> On Wed, Jun 07, 2017 at 10:54:57PM -0700, Noah Misch wrote: >>> On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote: On 2017-06-02 22:12:46 -0700, Noah Misch wrote: > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote: >> On 5/31/17 23:54, Peter Eisentraut wrote: >>> On 5/29/17 22:01, Noah Misch wrote: On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > wrote: >> Hi, >> >> so this didn't really move anywhere AFAICS, do we think the approach >> I've chosen is good or do we want to do something else here? > > Can you add it to the open items list? [Action required within three days. This is a generic notification.] >>> >>> I have posted an update. The next update will be Friday. >> >> Andres is now working on this. Let's give him a bit of time. ;-) > > If you intended this as your soon-due status update, it is missing a > mandatory > bit. I'm now owning this item. I've posted patches, await review. If none were to be forthcoming, I'll do another pass Monday, and commit. >>> >>> This PostgreSQL 10 open item is past due for your status update. Kindly >>> send >>> a status update within 24 hours, and include a date for your subsequent >>> status >>> update. Refer to the policy on open item ownership: >>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com >> >> IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due >> for your status update. Please reacquaint yourself with the policy on open >> item ownership[1] and then reply immediately. If I do not hear from you by >> 2017-06-10 07:00 UTC, I will transfer this item to release management team >> ownership without further notice. > > The issue starting this thread is resolved, including several issues > found in its wake, with 47fd420fb4d3e77dde960312f8672c82b14ecbad > 6e1dd2773eb60a6ab87b27b8d9391b756e904ac3 and > c1abe6c786d8f00643de8519140d77644b474163 > > There's a remaining testing patch, and some related things left though. > Basically patches 0001 and 0003 from > http://archives.postgresql.org/message-id/81eaca60-3f7b-6a21-5531-fd51ffbf3f63%402ndquadrant.com > I'd personally rather see a separate open-items entry for those, since > this isn't related to paralellism, signal handler or such. > > Petr, Peter (henceforth P^2), do you agree? Makes sense, one could argue that 0003 could be even 2 open items with 2 patches - one with the fixes for parser and one for the other type of message support (I am not even sure if we want to add other message support into PG10). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-06-08 23:04:31 -0700, Noah Misch wrote: > On Wed, Jun 07, 2017 at 10:54:57PM -0700, Noah Misch wrote: > > On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote: > > > On 2017-06-02 22:12:46 -0700, Noah Misch wrote: > > > > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote: > > > > > On 5/31/17 23:54, Peter Eisentraut wrote: > > > > > > On 5/29/17 22:01, Noah Misch wrote: > > > > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > > > > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > > > > > >>> wrote: > > > > > Hi, > > > > > > > > > > so this didn't really move anywhere AFAICS, do we think the > > > > > approach > > > > > I've chosen is good or do we want to do something else here? > > > > > >>> > > > > > >>> Can you add it to the open items list? > > > > > >> > > > > > >> [Action required within three days. This is a generic > > > > > >> notification.] > > > > > > > > > > > > I have posted an update. The next update will be Friday. > > > > > > > > > > Andres is now working on this. Let's give him a bit of time. ;-) > > > > > > > > If you intended this as your soon-due status update, it is missing a > > > > mandatory > > > > bit. > > > > > > I'm now owning this item. I've posted patches, await review. If none > > > were to be forthcoming, I'll do another pass Monday, and commit. > > > > This PostgreSQL 10 open item is past due for your status update. Kindly > > send > > a status update within 24 hours, and include a date for your subsequent > > status > > update. Refer to the policy on open item ownership: > > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com > > IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due > for your status update. Please reacquaint yourself with the policy on open > item ownership[1] and then reply immediately. If I do not hear from you by > 2017-06-10 07:00 UTC, I will transfer this item to release management team > ownership without further notice. The issue starting this thread is resolved, including several issues found in its wake, with 47fd420fb4d3e77dde960312f8672c82b14ecbad 6e1dd2773eb60a6ab87b27b8d9391b756e904ac3 and c1abe6c786d8f00643de8519140d77644b474163 There's a remaining testing patch, and some related things left though. Basically patches 0001 and 0003 from http://archives.postgresql.org/message-id/81eaca60-3f7b-6a21-5531-fd51ffbf3f63%402ndquadrant.com I'd personally rather see a separate open-items entry for those, since this isn't related to paralellism, signal handler or such. Petr, Peter (henceforth P^2), do you agree? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On Wed, Jun 07, 2017 at 10:54:57PM -0700, Noah Misch wrote: > On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote: > > On 2017-06-02 22:12:46 -0700, Noah Misch wrote: > > > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote: > > > > On 5/31/17 23:54, Peter Eisentraut wrote: > > > > > On 5/29/17 22:01, Noah Misch wrote: > > > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > > > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > > > > >>> wrote: > > > > Hi, > > > > > > > > so this didn't really move anywhere AFAICS, do we think the > > > > approach > > > > I've chosen is good or do we want to do something else here? > > > > >>> > > > > >>> Can you add it to the open items list? > > > > >> > > > > >> [Action required within three days. This is a generic notification.] > > > > > > > > > > I have posted an update. The next update will be Friday. > > > > > > > > Andres is now working on this. Let's give him a bit of time. ;-) > > > > > > If you intended this as your soon-due status update, it is missing a > > > mandatory > > > bit. > > > > I'm now owning this item. I've posted patches, await review. If none > > were to be forthcoming, I'll do another pass Monday, and commit. > > This PostgreSQL 10 open item is past due for your status update. Kindly send > a status update within 24 hours, and include a date for your subsequent status > update. Refer to the policy on open item ownership: > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due for your status update. Please reacquaint yourself with the policy on open item ownership[1] and then reply immediately. If I do not hear from you by 2017-06-10 07:00 UTC, I will transfer this item to release management team ownership without further notice. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On Fri, Jun 02, 2017 at 11:06:29PM -0700, Andres Freund wrote: > On 2017-06-02 22:12:46 -0700, Noah Misch wrote: > > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote: > > > On 5/31/17 23:54, Peter Eisentraut wrote: > > > > On 5/29/17 22:01, Noah Misch wrote: > > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > > > >>> wrote: > > > Hi, > > > > > > so this didn't really move anywhere AFAICS, do we think the approach > > > I've chosen is good or do we want to do something else here? > > > >>> > > > >>> Can you add it to the open items list? > > > >> > > > >> [Action required within three days. This is a generic notification.] > > > > > > > > I have posted an update. The next update will be Friday. > > > > > > Andres is now working on this. Let's give him a bit of time. ;-) > > > > If you intended this as your soon-due status update, it is missing a > > mandatory > > bit. > > I'm now owning this item. I've posted patches, await review. If none > were to be forthcoming, I'll do another pass Monday, and commit. This PostgreSQL 10 open item is past due for your status update. Kindly send a status update within 24 hours, and include a date for your subsequent status update. Refer to the policy on open item ownership: https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-06-02 22:12:46 -0700, Noah Misch wrote: > On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote: > > On 5/31/17 23:54, Peter Eisentraut wrote: > > > On 5/29/17 22:01, Noah Misch wrote: > > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > > >>> wrote: > > Hi, > > > > so this didn't really move anywhere AFAICS, do we think the approach > > I've chosen is good or do we want to do something else here? > > >>> > > >>> Can you add it to the open items list? > > >> > > >> [Action required within three days. This is a generic notification.] > > > > > > I have posted an update. The next update will be Friday. > > > > Andres is now working on this. Let's give him a bit of time. ;-) > > If you intended this as your soon-due status update, it is missing a mandatory > bit. I'm now owning this item. I've posted patches, await review. If none were to be forthcoming, I'll do another pass Monday, and commit. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On Fri, Jun 02, 2017 at 11:27:55PM -0400, Peter Eisentraut wrote: > On 5/31/17 23:54, Peter Eisentraut wrote: > > On 5/29/17 22:01, Noah Misch wrote: > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > >>> wrote: > Hi, > > so this didn't really move anywhere AFAICS, do we think the approach > I've chosen is good or do we want to do something else here? > >>> > >>> Can you add it to the open items list? > >> > >> [Action required within three days. This is a generic notification.] > > > > I have posted an update. The next update will be Friday. > > Andres is now working on this. Let's give him a bit of time. ;-) If you intended this as your soon-due status update, it is missing a mandatory bit. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-06-02 23:27:55 -0400, Peter Eisentraut wrote: > On 5/31/17 23:54, Peter Eisentraut wrote: > > On 5/29/17 22:01, Noah Misch wrote: > >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > >>> wrote: > Hi, > > so this didn't really move anywhere AFAICS, do we think the approach > I've chosen is good or do we want to do something else here? > >>> > >>> Can you add it to the open items list? > >> > >> [Action required within three days. This is a generic notification.] > > > > I have posted an update. The next update will be Friday. > > Andres is now working on this. Let's give him a bit of time. ;-) Yep. I've posted patches for most things in this thread over at http://archives.postgresql.org/message-id/20170603002023.rcppadggfiaav377%40alap3.anarazel.de because the proper fixes here are only possible after resolving the discussion there. The only thing I know of that's going remaining afterwards is SIGHUP handling, of which there's already a patch in this thread (although I want to prune its scope before committing). I suspect that the review for those will probably happen over the next few days, I plan to push those afterwards. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 5/31/17 23:54, Peter Eisentraut wrote: > On 5/29/17 22:01, Noah Misch wrote: >> On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: >>> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek >>> wrote: Hi, so this didn't really move anywhere AFAICS, do we think the approach I've chosen is good or do we want to do something else here? >>> >>> Can you add it to the open items list? >> >> [Action required within three days. This is a generic notification.] > > I have posted an update. The next update will be Friday. Andres is now working on this. Let's give him a bit of time. ;-) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-06-01 22:17:57 -0400, Peter Eisentraut wrote: > On 6/1/17 00:06, Andres Freund wrote: > > On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote: > >> I think the easiest and safest thing to do now is to just prevent > >> parallel plans in the walsender. See attached patch. This prevents the > >> hang in the select_parallel tests run under your new test setup. > > I'm not quite sure I can buy this. The lack of wired up signals has > > more problems than just hurting parallelism. > > Which problems are those? For example not wiring up sigusr1, which is the cause of the paralellism hang, breaks several things including recovery conflict interrupts. Which means HS standby is affected. Just forbidding parallelism doesn't address this. > I can see from the code what they might be, > but which ones actually happen in practice? And are there any > regressions from 9.6? Yes. For example the issue that atm walsender backends don't ever quit when executing sql statements is new. > If someone wants to work all this out, that would be great. But I'm > here mainly to fix the one reported problem. I'll deal with the issues in this thread. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 6/1/17 00:06, Andres Freund wrote: > On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote: >> I think the easiest and safest thing to do now is to just prevent >> parallel plans in the walsender. See attached patch. This prevents the >> hang in the select_parallel tests run under your new test setup. > I'm not quite sure I can buy this. The lack of wired up signals has > more problems than just hurting parallelism. Which problems are those? I can see from the code what they might be, but which ones actually happen in practice? And are there any regressions from 9.6? If someone wants to work all this out, that would be great. But I'm here mainly to fix the one reported problem. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 01/06/17 06:06, Andres Freund wrote: > On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote: >> I think the easiest and safest thing to do now is to just prevent >> parallel plans in the walsender. See attached patch. This prevents the >> hang in the select_parallel tests run under your new test setup. > > I'm not quite sure I can buy this. The lack of wired up signals has > more problems than just hurting parallelism. In fact, the USR1 thing > seems like something that we actually should backpatch, rather than > defer to v11. I think there's some fair arguments to be made that we > shouldn't do the refactoring right now - although I'm not sure about it > - but just not fixing the bugs seems like a bad plan. > I think the signal handling needs to be fixed. It does not have to be done via large refactoring, but signals should be handled properly (= we need to share SIGHUP/SIGUSR1 handling between postgres.c and walsender.c). The rest can wait for PG11. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-05-31 23:51:08 -0400, Peter Eisentraut wrote: > I think the easiest and safest thing to do now is to just prevent > parallel plans in the walsender. See attached patch. This prevents the > hang in the select_parallel tests run under your new test setup. I'm not quite sure I can buy this. The lack of wired up signals has more problems than just hurting parallelism. In fact, the USR1 thing seems like something that we actually should backpatch, rather than defer to v11. I think there's some fair arguments to be made that we shouldn't do the refactoring right now - although I'm not sure about it - but just not fixing the bugs seems like a bad plan. > @@ -272,6 +273,7 @@ standard_planner(Query *parse, int cursorOptions, > ParamListInfo boundParams) >*/ > if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 && > IsUnderPostmaster && > + !am_walsender && > dynamic_shared_memory_type != DSM_IMPL_NONE && > parse->commandType == CMD_SELECT && > !parse->hasModifyingCTE && If we were to go for this, surely this'd need a comment. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 1 June 2017 at 11:51, Peter Eisentraut wrote: > Unifying the signal handling and query processing further seems like a > good idea, but the patches are pretty involved, so I suggest to put them > into the next commit fest. I had a quick look a the idea of just getting rid of walsenders as a specific entity. Making them normal backends wherever possible. But it starts running into trouble when you're allowing walsenders to stay alive well into postmaster shutdown when normal backends are terminated, so there's probably going to be a need for more ongoing separation than I'd really like to stop walsenders doing things they can't safely do during shutdown. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 5/29/17 22:01, Noah Misch wrote: > On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: >> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek >> wrote: >>> Hi, >>> >>> so this didn't really move anywhere AFAICS, do we think the approach >>> I've chosen is good or do we want to do something else here? >> >> Can you add it to the open items list? > > [Action required within three days. This is a generic notification.] I have posted an update. The next update will be Friday. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 5/23/17 13:57, Petr Jelinek wrote: > On 23/05/17 19:45, Andres Freund wrote: >> >> >> On May 23, 2017 1:42:41 PM EDT, Petr Jelinek >> wrote: >>> Hi, >>> >>> so this didn't really move anywhere AFAICS, do we think the approach >>> I've chosen is good or do we want to do something else here? >> >> Can you add it to the open items list? >> > > Done I think the easiest and safest thing to do now is to just prevent parallel plans in the walsender. See attached patch. This prevents the hang in the select_parallel tests run under your new test setup. Unifying the signal handling and query processing further seems like a good idea, but the patches are pretty involved, so I suggest to put them into the next commit fest. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From 26002c4e58647c67cce97e66b34b489bb838e39a Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Wed, 31 May 2017 23:27:13 -0400 Subject: [PATCH] Prevent parallel query in walsender Parallel query in walsender currently hangs, because walsender does not have the full SIGUSR1 signal handler to handle this. It's safer and easier to prevent using a parallel plan for now. Eventually, the signal handling in walsender should be unified with regular backends. Reported-by: Andres Freund --- src/backend/optimizer/plan/planner.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 40cb79d4cd..bb921ba29c 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -50,6 +50,7 @@ #include "parser/analyze.h" #include "parser/parsetree.h" #include "parser/parse_agg.h" +#include "replication/walsender.h" #include "rewrite/rewriteManip.h" #include "storage/dsm_impl.h" #include "utils/rel.h" @@ -272,6 +273,7 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams) */ if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 && IsUnderPostmaster && + !am_walsender && dynamic_shared_memory_type != DSM_IMPL_NONE && parse->commandType == CMD_SELECT && !parse->hasModifyingCTE && -- 2.13.0 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On Tue, May 23, 2017 at 01:45:59PM -0400, Andres Freund wrote: > On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > wrote: > >Hi, > > > >so this didn't really move anywhere AFAICS, do we think the approach > >I've chosen is good or do we want to do something else here? > > Can you add it to the open items list? [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Peter, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 23/05/17 19:45, Andres Freund wrote: > > > On May 23, 2017 1:42:41 PM EDT, Petr Jelinek > wrote: >> Hi, >> >> so this didn't really move anywhere AFAICS, do we think the approach >> I've chosen is good or do we want to do something else here? > > Can you add it to the open items list? > Done -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On May 23, 2017 1:42:41 PM EDT, Petr Jelinek wrote: >Hi, > >so this didn't really move anywhere AFAICS, do we think the approach >I've chosen is good or do we want to do something else here? Can you add it to the open items list? Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
Hi, so this didn't really move anywhere AFAICS, do we think the approach I've chosen is good or do we want to do something else here? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-04-25 11:18:35 -0400, Robert Haas wrote: > On Thu, Apr 20, 2017 at 9:40 PM, Andres Freund wrote: > > This'd be easier to look at if the fairly separate facility of allowing > > walsender to execute SQL commands had been committed separately, rather > > than as part of a fairly large commit of largely unrelated things. > > Good grief, yes. I really don't think that should have been slipped > into a commit that was mostly about logical replication, with only a > brief mention in the commit message and, AFAICS, no documentation > whatsoever. In my opinion some explanation about how this came about would be appropriate. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On Thu, Apr 20, 2017 at 9:40 PM, Andres Freund wrote: > This'd be easier to look at if the fairly separate facility of allowing > walsender to execute SQL commands had been committed separately, rather > than as part of a fairly large commit of largely unrelated things. Good grief, yes. I really don't think that should have been slipped into a commit that was mostly about logical replication, with only a brief mention in the commit message and, AFAICS, no documentation whatsoever. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 25/04/17 01:25, Andres Freund wrote: > Hi, > > On 2017-04-24 07:31:18 +0200, Petr Jelinek wrote: >> The previous coding tried to run the unknown string throur lexer which >> could fail for some valid SQL statements as the replication command >> parser is too simple to handle all the complexities of SQL language. >> >> Instead just fall back to SQL parser on any unknown command. >> >> Also remove SHOW command handling from walsender as it's handled by the >> simple query code path. > > This break SHOW var; over the plain replication connections though > (which was quite intentionally supported), so I don't think that's ok? > Hmm I guess we don't use this anywhere in tests (which is not surprising as it would have to be used in plain walsender connection). Fixed. > > I don't like much how SHOW and walsender understanding SQL statements > turned out, I think code like > > if (am_walsender) > { > if > (!exec_replication_command(query_string)) > > exec_simple_query(query_string); > } > else > exec_simple_query(query_string); > > shows some of the issues here. > Not sure how it's related to SHOW, but in any case, it's just result of the parsing fallback. > >> New alternative output for largeobject test has been added as the >> replication connection does not support fastpath function calls. > > I think just supporting fastpath over the replication protocol is less > work than maintaining the alternative file. > > >> --- a/src/Makefile.global.in >> +++ b/src/Makefile.global.in >> @@ -321,6 +321,7 @@ BZIP2= bzip2 >> # Testing >> >> check: temp-install >> +check-walsender-sql: temp-install > > This doesn't strike me as something that should go into > a/src/Makefile.global.in - I'd rather put it in > b/src/test/regress/GNUmakefile. check is in .global because it's used > in a lot of different makefiles, but that's not true for this target, right? > Shows how well I know our build system :) Anyway, I gave it a try to support as much as possible and the result is attached. First patch adds the new make target that runs regression tests through walsender. I had to change one test to run SHOW TIMEZONE instead of SHOW TIME ZONE, otherwise no tests need to be changed. The second patch unifies the sighup handling across all kinds of processes. This is mostly replacing got_SIGHuP with new variable and removing local SIGHUP handlers in favor of generic one where possible. And the third patch changes walsender. There are 3 things it does a) it overhauls signal handling there, b) it allows both fast path function calls and extended protocol messages (once I did function fast path I didn't really see any difference in terms of extended protocol, please correct me if I am wrong in that) and c) improves the fallback in parser. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services 0001-Add-check-walsender-sql-make-target.patch Description: binary/octet-stream 0002-Unify-SIGHUP-hadnling-across-all-types-of-processes.patch Description: binary/octet-stream 0003-Make-walsender-behave-more-like-normal-backend.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
Hi, On 2017-04-24 07:31:18 +0200, Petr Jelinek wrote: > The previous coding tried to run the unknown string throur lexer which > could fail for some valid SQL statements as the replication command > parser is too simple to handle all the complexities of SQL language. > > Instead just fall back to SQL parser on any unknown command. > > Also remove SHOW command handling from walsender as it's handled by the > simple query code path. This break SHOW var; over the plain replication connections though (which was quite intentionally supported), so I don't think that's ok? I don't like much how SHOW and walsender understanding SQL statements turned out, I think code like if (am_walsender) { if (!exec_replication_command(query_string)) exec_simple_query(query_string); } else exec_simple_query(query_string); shows some of the issues here. > Checks the SQL over walsender interface by running the standard set of > regression tests against it. I like that approach a lot. > New alternative output for largeobject test has been added as the > replication connection does not support fastpath function calls. I think just supporting fastpath over the replication protocol is less work than maintaining the alternative file. > --- a/src/Makefile.global.in > +++ b/src/Makefile.global.in > @@ -321,6 +321,7 @@ BZIP2 = bzip2 > # Testing > > check: temp-install > +check-walsender-sql: temp-install This doesn't strike me as something that should go into a/src/Makefile.global.in - I'd rather put it in b/src/test/regress/GNUmakefile. check is in .global because it's used in a lot of different makefiles, but that's not true for this target, right? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 24/04/17 20:00, Andres Freund wrote: > On 2017-04-24 18:29:51 +0200, Petr Jelinek wrote: >> On 24/04/17 07:42, Andres Freund wrote: >>> >>> >>> On April 23, 2017 10:31:18 PM PDT, Petr Jelinek >>> wrote: On 24/04/17 04:31, Petr Jelinek wrote: So actually maybe running regression tests through it might be reasonable approach if we add new make target for it. >>> >>> That sounds like a good plan. >>> >>> Note that the first patch is huge. That's because I needed to add alternative output for largeobject test because it uses fastpath function calls which are not allowed over replication protocol. >>> >>> There's no need for that restriction, is there? At least for db >>> walsenders... >>> >> >> No, there is no real need to restring the extended protocol either but >> we do so currently. The point of allowing SQL was to allow logical >> replication to work, not to merge walsender completely into normal >> backend code. > > Well, that's understandable, but there's also the competing issue that > we need something that is well defined and behaved. > It's well defined, it supports simple protocol queries, that's it. > >> Obviously it >> means walsender is still special but as I said, my plan was to make it >> work for logical replication not to merge it completely with existing >> backends. > > Yea, and I don't think that's an argument for anything on its own, > sorry. > It's not meant argument, it's plain statement of my intentions. I am not stopping you from doing more if you want, however I don't see that it's needed or any arguments about why it is needed. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-04-24 18:29:51 +0200, Petr Jelinek wrote: > On 24/04/17 07:42, Andres Freund wrote: > > > > > > On April 23, 2017 10:31:18 PM PDT, Petr Jelinek > > wrote: > >> On 24/04/17 04:31, Petr Jelinek wrote: > >> So actually maybe running regression tests through it might be > >> reasonable approach if we add new make target for it. > > > > That sounds like a good plan. > > > > > >> Note that the first patch is huge. That's because I needed to add > >> alternative output for largeobject test because it uses fastpath > >> function calls which are not allowed over replication protocol. > > > > There's no need for that restriction, is there? At least for db > > walsenders... > > > > No, there is no real need to restring the extended protocol either but > we do so currently. The point of allowing SQL was to allow logical > replication to work, not to merge walsender completely into normal > backend code. Well, that's understandable, but there's also the competing issue that we need something that is well defined and behaved. > Obviously it > means walsender is still special but as I said, my plan was to make it > work for logical replication not to merge it completely with existing > backends. Yea, and I don't think that's an argument for anything on its own, sorry. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 24/04/17 07:42, Andres Freund wrote: > > > On April 23, 2017 10:31:18 PM PDT, Petr Jelinek > wrote: >> On 24/04/17 04:31, Petr Jelinek wrote: >> So actually maybe running regression tests through it might be >> reasonable approach if we add new make target for it. > > That sounds like a good plan. > > >> Note that the first patch is huge. That's because I needed to add >> alternative output for largeobject test because it uses fastpath >> function calls which are not allowed over replication protocol. > > There's no need for that restriction, is there? At least for db walsenders... > No, there is no real need to restring the extended protocol either but we do so currently. The point of allowing SQL was to allow logical replication to work, not to merge walsender completely into normal backend code. And I don't know about differences well enough to go for the full merge, especially not at this stage of PG10 dev. >> As part of this I realized that the parser fallback that I wrote >> originally for handling SQL in walsender is not robust enough as the >> lexer would fail on some valid SQL statements. So there is also second >> patch that does this using different approach which allows any SQL >> statement. > > Haven't looked at the new thing yet, but I'm not particularly surprised... > Wonder of there's a good way to fully integrate both grammars, without > exploding keywords. > I think we'd need to keyword the first word of every replication command if we wanted to merge both grammar files together. I am not sure if there is all that much need for that, the pass-through for everything that is not replication command seems to work just fine. Obviously it means walsender is still special but as I said, my plan was to make it work for logical replication not to merge it completely with existing backends. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On April 23, 2017 10:31:18 PM PDT, Petr Jelinek wrote: >On 24/04/17 04:31, Petr Jelinek wrote: >So actually maybe running regression tests through it might be >reasonable approach if we add new make target for it. That sounds like a good plan. >Note that the first patch is huge. That's because I needed to add >alternative output for largeobject test because it uses fastpath >function calls which are not allowed over replication protocol. There's no need for that restriction, is there? At least for db walsenders... >As part of this I realized that the parser fallback that I wrote >originally for handling SQL in walsender is not robust enough as the >lexer would fail on some valid SQL statements. So there is also second >patch that does this using different approach which allows any SQL >statement. Haven't looked at the new thing yet, but I'm not particularly surprised... Wonder of there's a good way to fully integrate both grammars, without exploding keywords. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-04-24 04:27:58 +0200, Petr Jelinek wrote: > On 24/04/17 01:43, Andres Freund wrote: > > > >> BTW while looking at the code, I don't understand why we call > >> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just > >> the SetLatch be enough (they both end up calling sendSelfPipeByte() > >> eventually)? > > > > Historic raisins - there didn't use to be a SetLatch in > > procsignal_sigusr1_handler. That changed when I whacked around catchup & > > notify to be based on latches ([1] and following). > > > > [1] > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47 > > > > Okay, but why call both SetLatch and latch_sigusr1_handler? What does > that buy us? Nothing. It's how the code evolved, we can change that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-04-24 04:26:16 +0200, Petr Jelinek wrote: > > WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler > > currently sets a different variable from postgres.c, but that seems like > > a bad idea, because afaics we'll plainly ignore SIGHUPS unless in > > WalSndLoop, WalSndWriteData,WalSndWaitForWal. That actually seems like > > an active bug to me? > > > > Hmm I don't think good solution is to use same variable though, > walsender needs to do slightly different thing on SIGHUP, Hm? You mean SyncRepInitConfig()? I don't think that matters, because it's only needed if config is changed while streaming. > plus the > got_SIGHUP is not the type of variable we want to export. I wonder if it > would be enough to just add check for got_SIGHUP somewhere at the > beginning of exec_replication_command(). I don't think that'd be sufficient, because we really want config changes to get picked up while idle, and that won't work from within exec_replication_command(). And that pretty much means it'll have to happen outside of walsender.c. FWIW, while it's not pretty, I actually see very little reason not to share got_SIGHUP (with a bit less generic name name) between different types of processes (i.e. putting it in miscadmin.h or such). It's not exactly pretty, but there's also no benefit in duplicating it everywhere, and without it you run into the issue presented here. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 24/04/17 02:04, Andres Freund wrote: > Hi, > > Oh, and one more point: There desperately need to be tests exercising > "SQL via walsender". Including the issue of parallelism here, the missed > cancel handler, timeouts, and such. One way to do so is to use > pg_regress with an adjusted connection string (specifying > replication=database), doing the same for isolationtester, or using > dblink. > Well, there are some tests (the logical replication uses that functionality) and it's not quite clear to me what all to run there. I assume you don't want to rerun whole regression suite against walsender given the time it would take in normal tests (although I could see doing that optionally somehow) but then what to pick from there. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 24/04/17 01:43, Andres Freund wrote: > >> BTW while looking at the code, I don't understand why we call >> latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just >> the SetLatch be enough (they both end up calling sendSelfPipeByte() >> eventually)? > > Historic raisins - there didn't use to be a SetLatch in > procsignal_sigusr1_handler. That changed when I whacked around catchup & > notify to be based on latches ([1] and following). > > [1] > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47 > Okay, but why call both SetLatch and latch_sigusr1_handler? What does that buy us? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 24/04/17 01:59, Andres Freund wrote: > Hi, > > On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote: >> Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it >> does not break anything for existing walsender usage. > >> diff --git a/src/backend/replication/logical/launcher.c >> b/src/backend/replication/logical/launcher.c >> index 761fbfa..e9dd886 100644 >> --- a/src/backend/replication/logical/launcher.c >> +++ b/src/backend/replication/logical/launcher.c >> @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char >> *subname, Oid userid, >> BackgroundWorkerbgw; >> BackgroundWorkerHandle *bgw_handle; >> int i; >> -int slot; >> +int slot = 0; >> LogicalRepWorker *worker = NULL; >> int nsyncworkers = 0; >> TimestampTz now = GetCurrentTimestamp(); > > That seems independent and unnecessary? > Yeah that leaked from different patch I was working on in parallel. > >> -/* SIGUSR1: set flag to send WAL records */ >> -static void >> -WalSndXLogSendHandler(SIGNAL_ARGS) >> -{ >> -int save_errno = errno; >> - >> -latch_sigusr1_handler(); >> - >> -errno = save_errno; >> -} > >> pqsignal(SIGPIPE, SIG_IGN); >> -pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending >> */ >> +pqsignal(SIGUSR1, procsignal_sigusr1_handler); > > Those aren't actually entirely equivalent. I'm not sure it matters much, > but WalSndXLogSendHandler didn't include a SetLatch(), but > procsignal_sigusr1_handler() does. Normally redundant SetLatch() calls > will just be elided, but what if walsender already woke up and did it's > work? > They aren't exactly same no, but I don't see harm in what procsignal_sigusr1_handler. I mean worst case scenario is latch is set so walsender checks for work. > I think it'd be better to have PostgresMain() set up signals by default, > and then have WalSndSignals() overwrites the ones it needs separate. Hmm that sounds like a good idea. > WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler > currently sets a different variable from postgres.c, but that seems like > a bad idea, because afaics we'll plainly ignore SIGHUPS unless in > WalSndLoop, WalSndWriteData,WalSndWaitForWal. That actually seems like > an active bug to me? > Hmm I don't think good solution is to use same variable though, walsender needs to do slightly different thing on SIGHUP, plus the got_SIGHUP is not the type of variable we want to export. I wonder if it would be enough to just add check for got_SIGHUP somewhere at the beginning of exec_replication_command(). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
Hi, On 2017-04-23 16:59:41 -0700, Andres Freund wrote: > Hi, > > On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote: > > Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it > > does not break anything for existing walsender usage. > > > diff --git a/src/backend/replication/logical/launcher.c > > b/src/backend/replication/logical/launcher.c > > index 761fbfa..e9dd886 100644 > > --- a/src/backend/replication/logical/launcher.c > > +++ b/src/backend/replication/logical/launcher.c > > @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const > > char *subname, Oid userid, > > BackgroundWorkerbgw; > > BackgroundWorkerHandle *bgw_handle; > > int i; > > - int slot; > > + int slot = 0; > > LogicalRepWorker *worker = NULL; > > int nsyncworkers = 0; > > TimestampTz now = GetCurrentTimestamp(); > > That seems independent and unnecessary? > > > > -/* SIGUSR1: set flag to send WAL records */ > > -static void > > -WalSndXLogSendHandler(SIGNAL_ARGS) > > -{ > > - int save_errno = errno; > > - > > - latch_sigusr1_handler(); > > - > > - errno = save_errno; > > -} > > > pqsignal(SIGPIPE, SIG_IGN); > > - pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending > > */ > > + pqsignal(SIGUSR1, procsignal_sigusr1_handler); > > Those aren't actually entirely equivalent. I'm not sure it matters much, > but WalSndXLogSendHandler didn't include a SetLatch(), but > procsignal_sigusr1_handler() does. Normally redundant SetLatch() calls > will just be elided, but what if walsender already woke up and did it's > work? > > I think it'd be better to have PostgresMain() set up signals by default, > and then have WalSndSignals() overwrites the ones it needs separate. > WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler > currently sets a different variable from postgres.c, but that seems like > a bad idea, because afaics we'll plainly ignore SIGHUPS unless in > WalSndLoop, WalSndWriteData,WalSndWaitForWal. That actually seems like > an active bug to me? Oh, and one more point: There desperately need to be tests exercising "SQL via walsender". Including the issue of parallelism here, the missed cancel handler, timeouts, and such. One way to do so is to use pg_regress with an adjusted connection string (specifying replication=database), doing the same for isolationtester, or using dblink. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
Hi, On 2017-04-22 17:53:19 +0200, Petr Jelinek wrote: > Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it > does not break anything for existing walsender usage. > diff --git a/src/backend/replication/logical/launcher.c > b/src/backend/replication/logical/launcher.c > index 761fbfa..e9dd886 100644 > --- a/src/backend/replication/logical/launcher.c > +++ b/src/backend/replication/logical/launcher.c > @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char > *subname, Oid userid, > BackgroundWorkerbgw; > BackgroundWorkerHandle *bgw_handle; > int i; > - int slot; > + int slot = 0; > LogicalRepWorker *worker = NULL; > int nsyncworkers = 0; > TimestampTz now = GetCurrentTimestamp(); That seems independent and unnecessary? > -/* SIGUSR1: set flag to send WAL records */ > -static void > -WalSndXLogSendHandler(SIGNAL_ARGS) > -{ > - int save_errno = errno; > - > - latch_sigusr1_handler(); > - > - errno = save_errno; > -} > pqsignal(SIGPIPE, SIG_IGN); > - pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending > */ > + pqsignal(SIGUSR1, procsignal_sigusr1_handler); Those aren't actually entirely equivalent. I'm not sure it matters much, but WalSndXLogSendHandler didn't include a SetLatch(), but procsignal_sigusr1_handler() does. Normally redundant SetLatch() calls will just be elided, but what if walsender already woke up and did it's work? I think it'd be better to have PostgresMain() set up signals by default, and then have WalSndSignals() overwrites the ones it needs separate. WalSndLastCycleHandler is genuinely different. WalSndSigHupHandler currently sets a different variable from postgres.c, but that seems like a bad idea, because afaics we'll plainly ignore SIGHUPS unless in WalSndLoop, WalSndWriteData,WalSndWaitForWal. That actually seems like an active bug to me? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 2017-04-21 04:20:26 +0200, Petr Jelinek wrote: > Looks like SIGUSR1 being different is problem here - it's normally used > to . I also noticed that we don't handle SIGINT (query cancel). I think we really need to unify the paths between walsender and normal backends to a much larger degree. > BTW while looking at the code, I don't understand why we call > latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just > the SetLatch be enough (they both end up calling sendSelfPipeByte() > eventually)? Historic raisins - there didn't use to be a SetLatch in procsignal_sigusr1_handler. That changed when I whacked around catchup & notify to be based on latches ([1] and following). [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59f71a0d0b56b2df48db4bf1738aece5551f7a47 - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 21/04/17 04:37, Petr Jelinek wrote: > On 21/04/17 04:32, Craig Ringer wrote: >> On 21 April 2017 at 10:20, Petr Jelinek wrote: >>> On 21/04/17 03:40, Andres Freund wrote: Since [1] walsender (not receiver as commit message says) can execute SQL queries. While doing some testing of [2] I noticed that SQL queries in walsender get stuck if parallelism is used - I have not investigated why that is yet, but it surely is an issue. On first blush I'd suspect that some signalling is not wired up correctly (cf. am_walsender branches in PostgresMain() and such). >>> >>> Looks like SIGUSR1 being different is problem here - it's normally used >>> to . I also noticed that we don't handle SIGINT (query cancel). >>> >>> I'll write proper patch but can you try to just use >>> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes >>> the issue? >> >> That's what my recovery conflicts for logical decoding on standby >> patch does, FWIW. >> >> I haven't found any issues yet.. >> > > Ah I knew I've seen that change somewhere. I thought it was either in my > patch or master which is why I thought it's working fine already. > Here is patch. I changed both SIGINT and SIGUSR1 handlers, afaics it does not break anything for existing walsender usage. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From f59218135092bec9a64cf86b75f1f62a0271d448 Mon Sep 17 00:00:00 2001 From: Petr Jelinek Date: Fri, 21 Apr 2017 05:11:30 +0200 Subject: [PATCH] Handle signals correctly in walsender Since we now support SQL queries in walsender, we also need to handle SIGINT for query cancel and SIGUSR1 for IPC. --- src/backend/replication/logical/launcher.c | 2 +- src/backend/replication/walsender.c| 16 ++-- 2 files changed, 3 insertions(+), 15 deletions(-) diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c index 761fbfa..e9dd886 100644 --- a/src/backend/replication/logical/launcher.c +++ b/src/backend/replication/logical/launcher.c @@ -254,7 +254,7 @@ logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid, BackgroundWorkerbgw; BackgroundWorkerHandle *bgw_handle; int i; - int slot; + int slot = 0; LogicalRepWorker *worker = NULL; int nsyncworkers = 0; TimestampTz now = GetCurrentTimestamp(); diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index b4ce75a..78369ae 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -212,7 +212,6 @@ static struct /* Signal handlers */ static void WalSndSigHupHandler(SIGNAL_ARGS); -static void WalSndXLogSendHandler(SIGNAL_ARGS); static void WalSndLastCycleHandler(SIGNAL_ARGS); /* Prototypes for private functions */ @@ -2830,17 +2829,6 @@ WalSndSigHupHandler(SIGNAL_ARGS) errno = save_errno; } -/* SIGUSR1: set flag to send WAL records */ -static void -WalSndXLogSendHandler(SIGNAL_ARGS) -{ - int save_errno = errno; - - latch_sigusr1_handler(); - - errno = save_errno; -} - /* SIGUSR2: set flag to do a last cycle and shut down afterwards */ static void WalSndLastCycleHandler(SIGNAL_ARGS) @@ -2869,12 +2857,12 @@ WalSndSignals(void) /* Set up signal handlers */ pqsignal(SIGHUP, WalSndSigHupHandler); /* set flag to read config * file */ - pqsignal(SIGINT, SIG_IGN); /* not used */ + pqsignal(SIGINT, StatementCancelHandler); /* query cancel */ pqsignal(SIGTERM, die); /* request shutdown */ pqsignal(SIGQUIT, quickdie);/* hard crash time */ InitializeTimeouts(); /* establishes SIGALRM handler */ pqsignal(SIGPIPE, SIG_IGN); - pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending */ + pqsignal(SIGUSR1, procsignal_sigusr1_handler); pqsignal(SIGUSR2, WalSndLastCycleHandler); /* request a last cycle and * shutdown */ -- 2.7.4 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 21/04/17 04:32, Craig Ringer wrote: > On 21 April 2017 at 10:20, Petr Jelinek wrote: >> On 21/04/17 03:40, Andres Freund wrote: >>> >>> Since [1] walsender (not receiver as commit message says) can execute >>> SQL queries. While doing some testing of [2] I noticed that SQL queries >>> in walsender get stuck if parallelism is used - I have not investigated >>> why that is yet, but it surely is an issue. On first blush I'd suspect >>> that some signalling is not wired up correctly (cf. am_walsender branches >>> in PostgresMain() and such). >> >> Looks like SIGUSR1 being different is problem here - it's normally used >> to . I also noticed that we don't handle SIGINT (query cancel). >> >> I'll write proper patch but can you try to just use >> procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes >> the issue? > > That's what my recovery conflicts for logical decoding on standby > patch does, FWIW. > > I haven't found any issues yet.. > Ah I knew I've seen that change somewhere. I thought it was either in my patch or master which is why I thought it's working fine already. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 21 April 2017 at 10:20, Petr Jelinek wrote: > On 21/04/17 03:40, Andres Freund wrote: >> >> Since [1] walsender (not receiver as commit message says) can execute >> SQL queries. While doing some testing of [2] I noticed that SQL queries >> in walsender get stuck if parallelism is used - I have not investigated >> why that is yet, but it surely is an issue. On first blush I'd suspect >> that some signalling is not wired up correctly (cf. am_walsender branches >> in PostgresMain() and such). > > Looks like SIGUSR1 being different is problem here - it's normally used > to . I also noticed that we don't handle SIGINT (query cancel). > > I'll write proper patch but can you try to just use > procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes > the issue? That's what my recovery conflicts for logical decoding on standby patch does, FWIW. I haven't found any issues yet.. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] walsender & parallelism
On 21/04/17 03:40, Andres Freund wrote: > > Since [1] walsender (not receiver as commit message says) can execute > SQL queries. While doing some testing of [2] I noticed that SQL queries > in walsender get stuck if parallelism is used - I have not investigated > why that is yet, but it surely is an issue. On first blush I'd suspect > that some signalling is not wired up correctly (cf. am_walsender branches > in PostgresMain() and such). Looks like SIGUSR1 being different is problem here - it's normally used to . I also noticed that we don't handle SIGINT (query cancel). I'll write proper patch but can you try to just use procsignal_sigusr1_handler for SIGUSR1 in walsender.c to see if it fixes the issue? BTW while looking at the code, I don't understand why we call latch_sigusr1_handler after calling SetLatch(MyLatch), shouldn't just the SetLatch be enough (they both end up calling sendSelfPipeByte() eventually)? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers