Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-28 Thread Robert Haas
On Fri, Jun 28, 2013 at 6:00 PM, Alvaro Herrera wrote: > MauMau wrote: > > Hi, > >> I did this. Please find attached the revised patch. I modified >> HandleChildCrash(). I tested the immediate shutdown, and the child >> cleanup succeeded. > > Thanks, committed. > > There are two matters pend

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-28 Thread Alvaro Herrera
MauMau wrote: Hi, > I did this. Please find attached the revised patch. I modified > HandleChildCrash(). I tested the immediate shutdown, and the child > cleanup succeeded. Thanks, committed. There are two matters pending here: 1. do we want postmaster to exit immediately after sending t

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-27 Thread MauMau
Hi, Alvaro san, From: "Alvaro Herrera" MauMau wrote: Yeah, I see that --- after removing that early exit, there are unwanted messages. And in fact there are some signals sent that weren't previously sent. Clearly we need something here: if we're in immediate shutdown handler, don't signal

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-25 Thread MauMau
From: "Alvaro Herrera" Yeah, I see that --- after removing that early exit, there are unwanted messages. And in fact there are some signals sent that weren't previously sent. Clearly we need something here: if we're in immediate shutdown handler, don't signal anyone (because they have already

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-24 Thread Alvaro Herrera
MauMau wrote: > From: "Alvaro Herrera" > >Actually, in further testing I noticed that the fast-path you introduced > >in BackendCleanup (or was it HandleChildCrash?) in the immediate > >shutdown case caused postmaster to fail to clean up properly after > >sending the SIGKILL signal, so I had t

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-22 Thread MauMau
From: "Robert Haas" On Fri, Jun 21, 2013 at 10:02 PM, MauMau wrote: I'm comfortable with 5 seconds. We are talking about the interval between sending SIGQUIT to the children and then sending SIGKILL to them. In most situations, the backends should terminate immediately. However, as I said

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-22 Thread MauMau
From: "Alvaro Herrera" MauMau wrote: I thought of adding some new state of pmState for some reason (that might be the same as your idea). But I refrained from doing that, because pmState has already many states. I was afraid adding a new pmState value for this bug fix would complicate the s

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-22 Thread Robert Haas
On Fri, Jun 21, 2013 at 10:02 PM, MauMau wrote: > I'm comfortable with 5 seconds. We are talking about the interval between > sending SIGQUIT to the children and then sending SIGKILL to them. In most > situations, the backends should terminate immediately. However, as I said a > few months ago,

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-22 Thread Alvaro Herrera
MauMau wrote: > Are you suggesting simplifying the following part in ServerLoop()? > I welcome the idea if this condition becomes simpler. However, I > cannot imagine how. > if (AbortStartTime > 0 && /* SIGKILL only once */ > (Shutdown == ImmediateShutdown || (FatalError && !SendStop)) &&

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau
From: "Robert Haas" On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane wrote: Robert Haas writes: More generally, what do we think the point is of sending SIGQUIT rather than SIGKILL in the first place, and why does that point cease to be valid after 5 seconds? Well, mostly it's about telling the c

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau
From: "Robert Haas" On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera wrote: I will go with 5 seconds, then. I'm uncomfortable with this whole concept, and particularly with such a short timeout. On a very busy system, things can take a LOT longer than they think we should; it can take 30 se

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Christopher Browne
The case where I wanted "routine" shutdown immediate (and I'm not sure I ever actually got it) was when we were using IBM HA/CMP, where I wanted a "terminate with a fair bit of prejudice". If we know we want to "switch right away now", immediate seemed pretty much right. I was fine with interrupt

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Robert Haas
On Fri, Jun 21, 2013 at 2:55 PM, Tom Lane wrote: > Robert Haas writes: >> More generally, what do we think the point is of sending SIGQUIT >> rather than SIGKILL in the first place, and why does that point cease >> to be valid after 5 seconds? > > Well, mostly it's about telling the client we're

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Tom Lane
Robert Haas writes: > More generally, what do we think the point is of sending SIGQUIT > rather than SIGKILL in the first place, and why does that point cease > to be valid after 5 seconds? Well, mostly it's about telling the client we're committing hara-kiri. Without that, there's no very good r

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Robert Haas
On Thu, Jun 20, 2013 at 12:33 PM, Alvaro Herrera wrote: > I will go with 5 seconds, then. I'm uncomfortable with this whole concept, and particularly with such a short timeout. On a very busy system, things can take a LOT longer than they think we should; it can take 30 seconds or more just to g

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau
From: "Alvaro Herrera" Actually, I think it would be cleaner to have a new state in pmState, namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT. When we're in this state, postmaster is only waiting for the timeout to expire; and when it does, it sends SIGKILL and exits. Pretty much

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread MauMau
From: "Alvaro Herrera" MauMau wrote: One concern is that umount would fail in such a situation because postgres has some open files on the filesystem, which is on the shared disk in case of traditional HA cluster. See my reply to Noah. If postmaster stays around, would this be any differ

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-21 Thread Hitoshi Harada
On Thu, Jun 20, 2013 at 3:40 PM, MauMau wrote: > > Here, "reliable" means that the database server is certainly shut >>> down when pg_ctl returns, not telling a lie that "I shut down the >>> server processes for you, so you do not have to be worried that some >>> postgres process might still rema

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
Actually, I think it would be cleaner to have a new state in pmState, namely PM_IMMED_SHUTDOWN which is entered when we send SIGQUIT. When we're in this state, postmaster is only waiting for the timeout to expire; and when it does, it sends SIGKILL and exits. Pretty much the same you have, except
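
The escalation discussed in this thread (send SIGQUIT, wait for a timeout, then send SIGKILL to whatever is left) can be sketched in isolation. This is only an illustration under stated assumptions, not the postmaster code: `spawn_child` and `stop_with_escalation` are hypothetical names, the grace period is a plain parameter rather than the 5 seconds debated here, and a child that ignores SIGQUIT stands in for a backend stuck in its signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps forever. If ignore_quit is
 * nonzero the child ignores SIGQUIT, standing in for a stuck backend.
 * The disposition is set before fork() so the child inherits it without a race. */
static pid_t spawn_child(int ignore_quit)
{
    struct sigaction sa, old;
    sa.sa_handler = ignore_quit ? SIG_IGN : SIG_DFL;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGQUIT, &sa, &old);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: make sure SIGQUIT is not blocked, then wait for signals. */
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGQUIT);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
        for (;;)
            pause();
    }
    sigaction(SIGQUIT, &old, NULL);   /* parent: restore previous disposition */
    return pid;
}

/* Hypothetical helper: send SIGQUIT, poll for up to grace_ms milliseconds,
 * then fall back to SIGKILL. Returns the signal that terminated the child,
 * 0 if it exited normally, or -1 on error. */
static int stop_with_escalation(pid_t pid, int grace_ms)
{
    int status;

    assert(grace_ms >= 0);
    if (pid <= 0)                     /* never signal a process group or "everyone" */
        return -1;
    if (kill(pid, SIGQUIT) != 0)
        return -1;

    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0 && errno != EINTR)
            return -1;
        struct timespec ts = {0, 10 * 1000 * 1000};   /* 10 ms */
        nanosleep(&ts, NULL);
    }

    /* Grace period expired: SIGKILL cannot be caught, blocked, or ignored. */
    if (kill(pid, SIGKILL) != 0)
        return -1;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `stop_with_escalation(spawn_child(1), 200)` exercises the SIGKILL fallback, while a child created with `spawn_child(0)` should already be gone after the SIGQUIT. The polling loop is a simplification chosen for the sketch; it says nothing about how the committed patch tracks the timeout.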

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
MauMau wrote: > From: "Alvaro Herrera" > One concern is that umount would fail in such a situation because > postgres has some open files on the filesystem, which is on the > shared disk in case of traditional HA cluster. See my reply to Noah. If postmaster stays around, would this be any di

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread MauMau
From: "Alvaro Herrera" I will go with 5 seconds, then. OK, I agree. My point is that there is no difference. For one thing, once we enter immediate shutdown state, and sigkill has been sent, no further action is taken. Postmaster will just sit there indefinitely until processes are gone.

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread Alvaro Herrera
MauMau wrote: > First, thank you for the review. > > From: "Alvaro Herrera" > >This seems reasonable. Why 10 seconds? We could wait 5 seconds, or 15. > >Is there a rationale behind the 10? If we said 60, that would fit > >perfectly well within the already existing 60-second loop in postmast

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-20 Thread MauMau
First, thank you for the review. From: "Alvaro Herrera" This seems reasonable. Why 10 seconds? We could wait 5 seconds, or 15. Is there a rationale behind the 10? If we said 60, that would fit perfectly well within the already existing 60-second loop in postmaster, but that seems way too lon

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-06-19 Thread Alvaro Herrera
MauMau wrote: > Could you review the patch? The summary of the change is: > 1. postmaster waits for children to terminate when it gets an > immediate shutdown request, instead of exiting. > > 2. postmaster sends SIGKILL to remaining children if all of the > child processes do not terminate wi

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-02-07 Thread MauMau
Hello, Tom-san, folks, From: "Tom Lane" I think if we want to make it bulletproof we'd have to do what the OP suggested and switch to SIGKILL. I'm not enamored of that for the reasons I mentioned --- but one idea that might dodge the disadvantages is to have the postmaster wait a few seconds a

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-02-01 Thread Tom Lane
Andres Freund writes: > On 2013-02-01 08:55:24 -0500, Peter Eisentraut wrote: >> I found an old patch that I had prepared for this, which I have >> attached. YMMV. >> +static void >> +quickdie_alarm_handler(SIGNAL_ARGS) >> +{ >> +/* >> + * We got here if ereport() was blocking, so don't

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-02-01 Thread Andres Freund
On 2013-02-01 08:55:24 -0500, Peter Eisentraut wrote: > On 1/31/13 5:42 PM, MauMau wrote: > > Thank you for sharing your experience. So you also considered making > > postmaster SIGKILL children like me, didn't you? I bet most of people > > who encounter this problem would feel like that. > > >

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-02-01 Thread Peter Eisentraut
On 1/31/13 5:42 PM, MauMau wrote: > Thank you for sharing your experience. So you also considered making > postmaster SIGKILL children like me, didn't you? I bet most of people > who encounter this problem would feel like that. > > It is definitely pg_ctl who needs to be prepared, not the users.

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-02-01 Thread Andres Freund
On 2013-01-22 22:19:25 -0500, Tom Lane wrote: > Since we've fixed a couple of relatively nasty bugs recently, the core > committee has determined that it'd be a good idea to push out PG update > releases soon. The current plan is to wrap on Monday Feb 4 for public > announcement Thursday Feb 7. I

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-31 Thread Kevin Grittner
MauMau wrote: > Just doing "pkill postgres" will unexpectedly terminate postgres > of other instances. Not if you run each instance under a different OS user, and execute pkill with the right user.  (Never use root for that!)  This is just one of the reasons that you should not run multiple clus

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-31 Thread MauMau
From: "Peter Eisentraut" On 1/30/13 9:11 AM, MauMau wrote: When I ran "pg_ctl stop -mi" against the primary, some applications connected to the primary did not stop. The cause was that the backends was deadlocked in quickdie() with some call stack like the following. I'm sorry to have left the

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-31 Thread Peter Eisentraut
On 1/30/13 9:11 AM, MauMau wrote: > When I ran "pg_ctl stop -mi" against the primary, some applications > connected to the primary did not stop. The cause was that the backends > was deadlocked in quickdie() with some call stack like the following. > I'm sorry to have left the stack trace file on

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-30 Thread Tom Lane
"MauMau" writes: > From: "Tom Lane" >> The long and the short of it is that SIGQUIT is the emergency-stop panic >> button. You don't use it for routine shutdowns --- you use it when >> there is a damn good reason to and you're prepared to do some manual >> cleanup if necessary. > How about the

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-30 Thread MauMau
From: "Tom Lane" "MauMau" writes: I think the solution is the typical one. That is, to just remember the receipt of SIGQUIT by setting a global variable and call siglongjmp() in quickdie(), and perform tasks currently done in quickdie() when sigsetjmp() returns in PostgresMain(). I think

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-30 Thread Tom Lane
Andres Freund writes: > On 2013-01-30 10:23:09 -0500, Tom Lane wrote: >> Yeah, it's a known hazard that quickdie() operates like that. > What about not translating those? The messages are static and all memory > needed by postgres should be pre-allocated. That would reduce our exposure slightly,

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-30 Thread Andres Freund
On 2013-01-30 10:23:09 -0500, Tom Lane wrote: > "MauMau" writes: > > When I ran "pg_ctl stop -mi" against the primary, some applications > > connected to the primary did not stop. ... > > The root cause is that gettext() is called in the signal handler quickdie() > > via errhint(). > > Yeah, it

Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-30 Thread Tom Lane
"MauMau" writes: > When I ran "pg_ctl stop -mi" against the primary, some applications > connected to the primary did not stop. ... > The root cause is that gettext() is called in the signal handler quickdie() > via errhint(). Yeah, it's a known hazard that quickdie() operates like that. > I t

backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)

2013-01-30 Thread MauMau
From: "Tom Lane" Since we've fixed a couple of relatively nasty bugs recently, the core committee has determined that it'd be a good idea to push out PG update releases soon. The current plan is to wrap on Monday Feb 4 for public announcement Thursday Feb 7. If you're aware of any bug fixes yo

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-29 Thread Fujii Masao
On Sun, Jan 27, 2013 at 11:38 PM, MauMau wrote: > From: "Fujii Masao" >> >> On Sun, Jan 27, 2013 at 12:17 AM, MauMau wrote: >>> >>> Although you said the fix will solve my problem, I don't feel it will. >>> The >>> discussion is about the crash when the standby "re"starts after the >>> primary >

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-27 Thread MauMau
From: "Fujii Masao" On Sun, Jan 27, 2013 at 12:17 AM, MauMau wrote: Although you said the fix will solve my problem, I don't feel it will. The discussion is about the crash when the standby "re"starts after the primary vacuums and truncates a table. On the other hand, in my case, the standb

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-27 Thread Fujii Masao
On Sun, Jan 27, 2013 at 12:17 AM, MauMau wrote: > From: "Fujii Masao" >> >> On Thu, Jan 24, 2013 at 11:53 PM, MauMau wrote: >>> >>> I'm wondering if the fix discussed in the above thread solves my problem. >>> I >>> found the following differences between Horiguchi-san's case and my case: >>> >>

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-26 Thread MauMau
From: "Fujii Masao" On Thu, Jan 24, 2013 at 11:53 PM, MauMau wrote: I'm wondering if the fix discussed in the above thread solves my problem. I found the following differences between Horiguchi-san's case and my case: (1) Horiguchi-san says the bug outputs the message: WARNING: page 0 of r

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-24 Thread Fujii Masao
On Thu, Jan 24, 2013 at 11:53 PM, MauMau wrote: > From: "Fujii Masao" >> >> On Thu, Jan 24, 2013 at 7:42 AM, MauMau wrote: >>> >>> I searched through PostgreSQL mailing lists with "WAL contains references >>> to >>> invalid pages", and i found 19 messages. Some people encountered similar >>> pr

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-24 Thread MauMau
From: "Fujii Masao" On Thu, Jan 24, 2013 at 7:42 AM, MauMau wrote: I searched through PostgreSQL mailing lists with "WAL contains references to invalid pages", and i found 19 messages. Some people encountered similar problem. There were some discussions regarding those problems (Tom and Sim

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-23 Thread Fujii Masao
On Thu, Jan 24, 2013 at 7:42 AM, MauMau wrote: > From: "Tom Lane" > >> Since we've fixed a couple of relatively nasty bugs recently, the core >> committee has determined that it'd be a good idea to push out PG update >> releases soon. The current plan is to wrap on Monday Feb 4 for public >> ann

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-23 Thread MauMau
From: "Tom Lane" Since we've fixed a couple of relatively nasty bugs recently, the core committee has determined that it'd be a good idea to push out PG update releases soon. The current plan is to wrap on Monday Feb 4 for public announcement Thursday Feb 7. If you're aware of any bug fixes yo

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-22 Thread Tom Lane
Stephen Frost writes: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> Since we've fixed a couple of relatively nasty bugs recently, the core >> committee has determined that it'd be a good idea to push out PG update >> releases soon. The current plan is to wrap on Monday Feb 4 for public >> announcem

Re: [HACKERS] Back-branch update releases coming in a couple weeks

2013-01-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Since we've fixed a couple of relatively nasty bugs recently, the core > committee has determined that it'd be a good idea to push out PG update > releases soon. The current plan is to wrap on Monday Feb 4 for public > announcement Thursday Feb 7. If you'r