Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-17 Thread Fujii Masao
Hi, On Wed, Jun 17, 2009 at 12:22 AM, Czichy, Thoralf (NSN - FI/Helsinki) thoralf.czi...@nsn.com wrote: [STONITH is not always best strategy if failures can be declared as user-space software problem only, limit STONITH to HW/OS failures] The isolation of the failing Postgres instance does not

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-16 Thread Czichy, Thoralf (NSN - FI/Helsinki)
hi, I am working together with Harald on this issue. Below some thoughts on why we think it should be possible to disable the postmaster-internal recovery attempt and instead have faults in the processes started by postmaster escalated to postmaster-exit. [Our typical embedded situation]

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-16 Thread Tom Lane
Czichy, Thoralf (NSN - FI/Helsinki) thoralf.czi...@nsn.com writes: I am working together with Harald on this issue. Below some thoughts on why we think it should be possible to disable the postmaster-internal recovery attempt and instead have faults in the processes started by postmaster

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-15 Thread Kolb, Harald (NSN - DE/Munich)
Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com writes: If you don't want to see this option as a GUC parameter, would it be acceptable to have it as a new postmaster cmd line option ? That would make two kluges, not one

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-15 Thread Alvaro Herrera
Kolb, Harald (NSN - DE/Munich) escribió: The recovery and restart feature is an excellent solution if the db is running in a standalone environment and I understand that this should not be weakened. But in a configuration where the db is only one resource among others and where you have a

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-10 Thread Fujii Masao
Hi, On Wed, Jun 10, 2009 at 4:21 AM, Simon Riggs si...@2ndquadrant.com wrote: On Tue, 2009-06-09 at 20:59 +0200, Kolb, Harald (NSN - DE/Munich) wrote: There are some good reasons why a switchover could be an appropriate means in case the DB is facing troubles. It may be that the root cause

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Kolb, Harald (NSN - DE/Munich)
Robert Haas robertmh...@gmail.com writes: I see that you've carefully not quoted Greg's remark about mechanism not policy with which I completely agree. Mechanism should exist to support useful policy. I don't believe

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Simon Riggs
On Tue, 2009-06-09 at 20:59 +0200, Kolb, Harald (NSN - DE/Munich) wrote: There are some good reasons why a switchover could be an appropriate means in case the DB is facing troubles. It may be that the root cause is not the DB itself, but used resources or other things which are going crazy

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Tom Lane
Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com writes: If you don't want to see this option as a GUC parameter, would it be acceptable to have it as a new postmaster cmd line option ? That would make two kluges, not one (we don't do options that are settable in only one way). And it does

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Kevin Grittner
Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com wrote: From: ext Tom Lane [mailto:t...@sss.pgh.pa.us] Mechanism should exist to support useful policy. I don't believe that the proposed switch has any real-world usefulness. There are some good reasons why a switchover could be an

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Greg Stark
Not really since once you fail over you may as well stop the rebuild since you'll have to restore the whole database. Moreover wouldn't that have to be a manual decision? The closest thing I can come to a use case would be if you run a very large cluster with hundreds of read-only

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com wrote: There are some good reasons why a switchover could be an appropriate means in case the DB is facing troubles. It may be that the root cause is not the DB itself, but used resources or

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com wrote: There are some good reasons why a switchover could be an appropriate means in case the DB is facing troubles. It may be that the root cause is not

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-09 Thread Simon Riggs
On Tue, 2009-06-09 at 15:48 -0500, Kevin Grittner wrote: My first reaction on hearing the request was that it might have *some* use; but in trying to recall any restart where it is what I would have wanted, I come up dry. I haven't even really come up with a good hypothetical use case.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Fujii Masao
Hi, On Fri, Jun 5, 2009 at 9:24 PM, Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com wrote: Good point. I also think that this makes a handling of failover more complicated. In other words, clusterware cannot determine whether to do failover when it detects the death of the primary postgres.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Gregory Stark
Fujii Masao masao.fu...@gmail.com writes: On the other hand, the primary postgres might *not* restart automatically. So, it's difficult for clusterware to choose whether to do failover when it detects the death of the primary postgres, I think. I think the accepted way to handle this kind of

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Fujii Masao
Hi, On Mon, Jun 8, 2009 at 6:45 PM, Gregory Stark st...@enterprisedb.com wrote: Fujii Masao masao.fu...@gmail.com writes: On the other hand, the primary postgres might *not* restart automatically. So, it's difficult for clusterware to choose whether to do failover when it detects the death of

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Tom Lane
Gregory Stark st...@enterprisedb.com writes: I think the accepted way to handle this kind of situation is called STONITH -- Shoot The Other Node In The Head. Yeah, and the reason people go to the trouble of having special hardware for that is that pure-software solutions are unreliable. I

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Simon Riggs
On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: I think the proposed don't-restart flag is exceedingly ugly and will not solve any real-world problem. Agreed.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Greg Stark
On Mon, Jun 8, 2009 at 6:58 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: I think the proposed don't-restart flag is exceedingly ugly and will not solve any real-world problem. Agreed. Hm. I'm not sure I see a solid use case for it -- in my

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Tom Lane
Greg Stark st...@enterprisedb.com writes: On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: I think the proposed don't-restart flag is exceedingly ugly and will not solve any real-world problem. Hm. I'm not sure I see a solid use case for it -- in my experience you want to be pretty sure

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Robert Haas
On Mon, Jun 8, 2009 at 4:30 PM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark st...@enterprisedb.com writes: On Mon, 2009-06-08 at 09:47 -0400, Tom Lane wrote: I think the proposed don't-restart flag is exceedingly ugly and will not solve any real-world problem. Hm. I'm not sure I see a solid

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: I see that you've carefully not quoted Greg's remark about mechanism not policy with which I completely agree. Mechanism should exist to support useful policy. I don't believe that the proposed switch has any real-world usefulness.

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Robert Haas
On Mon, Jun 8, 2009 at 7:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I see that you've carefully not quoted Greg's remark about mechanism not policy with which I completely agree. Mechanism should exist to support useful policy. I don't believe that

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-05 Thread Fujii Masao
Hi, On Fri, Jun 5, 2009 at 1:02 AM, Kolb, Harald (NSN - DE/Munich) harald.k...@nsn.com wrote: Hi, in case of a serious failure of a backend or an auxiliary process the postmaster performs a crash recovery and restarts the db automatically. Is there a possibility to deactivate the restart

Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-05 Thread Kolb, Harald (NSN - DE/Munich)
Hi, -Original Message- From: ext Fujii Masao [mailto:masao.fu...@gmail.com] Sent: Friday, June 05, 2009 8:14 AM To: Kolb, Harald (NSN - DE/Munich) Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] postmaster recovery and automatic restart suppression Hi, On Fri, Jun

[HACKERS] postmaster recovery and automatic restart suppression

2009-06-04 Thread Kolb, Harald (NSN - DE/Munich)
Hi, in case of a serious failure of a backend or an auxiliary process the postmaster performs a crash recovery and restarts the db automatically. Is there a possibility to deactivate the restart and to force the postmaster to simply exit at the end ? The background is that we will have a
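
For readers new to the thread, the behaviour being debated can be illustrated with a small sketch. It is not PostgreSQL code: the names `Action`, `on_child_crash`, `supervise` and the `auto_restart` flag are hypothetical and exist only for this illustration. It models a supervisor that, when a child process crashes, either runs recovery and restarts (the behaviour described above) or exits so that external clusterware can decide what to do (the behaviour requested).

```python
from enum import Enum
from typing import Iterable, List


class Action(Enum):
    RECOVER_AND_RESTART = "recover_and_restart"  # default behaviour described in the thread
    EXIT = "exit"                                # requested alternative: supervisor exits


def on_child_crash(auto_restart: bool) -> Action:
    """Decide what the (hypothetical) supervisor does after a child crashes."""
    return Action.RECOVER_AND_RESTART if auto_restart else Action.EXIT


def supervise(child_exit_codes: Iterable[int], auto_restart: bool) -> List[Action]:
    """Replay a sequence of child exit codes and return the actions taken.

    A non-zero exit code is treated as a crash. The loop stops at the
    first EXIT action, because the supervisor is gone after that point.
    """
    actions: List[Action] = []
    for code in child_exit_codes:
        if code == 0:
            continue  # clean exit, nothing to do
        action = on_child_crash(auto_restart)
        actions.append(action)
        if action is Action.EXIT:
            break
    return actions
```

With `auto_restart=True` every crash is followed by a restart; with `auto_restart=False` the first crash ends supervision, which is the point at which a cluster manager would see the supervisor's death and could trigger a switchover.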