Hi

Just a remark:
it may still depend on HOW corosync failed, but in any case, with a
3-node HA cluster the quorum functionality should be used
(no-quorum-policy must be set to anything other than ignore). An
"isolated" node then does not belong to a "quorate" partition, so it
should not be authorized to fence another node. So in the worst case,
where all three nodes have corosync problems (and again: this is
effectively a case of multiple failures!), none of the nodes should
fence any other.
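
To illustrate the reasoning above, here is a minimal Python sketch of the majority rule. It is a simplified model for illustration only, not Pacemaker's actual code; the function names has_quorum and may_fence are hypothetical, and only the no-quorum-policy value "ignore" is treated specially.

```python
def has_quorum(visible_nodes: int, total_nodes: int) -> bool:
    """A partition is quorate when it sees a strict majority of all nodes."""
    return visible_nodes > total_nodes // 2


def may_fence(visible_nodes: int, total_nodes: int,
              no_quorum_policy: str = "stop") -> bool:
    """Simplified model: only a quorate partition may fence.

    With no-quorum-policy=ignore the quorum check is bypassed, which is
    why that setting is discouraged on a 3-node cluster.
    """
    if no_quorum_policy == "ignore":
        return True
    return has_quorum(visible_nodes, total_nodes)


if __name__ == "__main__":
    # An isolated node in a 3-node cluster sees only itself.
    print(may_fence(1, 3))
    # Two nodes that still see each other form the quorate partition.
    print(may_fence(2, 3))
```

In this model, if all three nodes are isolated from each other, each one sees a single vote out of three, so none of them is allowed to fence.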

Regards
Alain




From:    "Ulrich Windl" <ulrich.wi...@rz.uni-regensburg.de>
To:      undisclosed-recipients:;
Cc:      "General Linux-HA mailing list" <linux-ha@lists.linux-ha.org>
Date:    04/12/2012 08:07
Subject: [Linux-HA] Antw: Re:  Corosync on cluster with 3+ nodes
Sent by: linux-ha-boun...@lists.linux-ha.org



Hi!

I think it still depends on HOW corosync failed: if just some process hung
or died, or the cluster network became unreachable, the node can still be
fenced remotely (e.g. via SBD through FC).

Regards,
Ulrich

>>> Digimer <li...@alteeve.ca> wrote on 03.12.2012 at 13:13 in message
<50bc976e.7040...@alteeve.ca>:
> As I keep saying: if corosync dies, the cluster stack on the node fails.
> So if corosync is dead, that node will not do anything; it will not
> fence, it will not restart services, nothing. If you somehow lose
> corosync on all nodes at the same time, then your cluster is dead.
> 
> Clusters protect against single points of failure. Rarely can multiple
> simultaneous failures be protected against.
> 
> On 12/03/2012 07:07 AM, Hermes Flying wrote:
> > I see, I will look into this.
> > There is one thing I am confused about.
> > If I have Node-A/Node-B/Node-C and let's say Node-A has the VIP.
> > What happens if corosync fails on all nodes? Will all 3 try to kill
> > each other?
> > 
> > 
> > ------------------------------------------------------------------------
> > *From:* Digimer <li...@alteeve.ca>
> > *To:* Hermes Flying <flyingher...@yahoo.com>
> > *Cc:* General Linux-HA mailing list <linux-ha@lists.linux-ha.org>
> > *Sent:* Monday, December 3, 2012 1:20 PM
> > *Subject:* Re: [Linux-HA] Corosync on cluster with 3+ nodes
> > 
> > Most Fujitsu servers work with fence_ipmilan. I suspect the same is true
> > with the IBM servers. Ask the hardware people if they have out-of-band
> > management on the servers (which is the generic/marketing way of saying
> > IPMI or the like).
> > 
> > On 12/03/2012 05:00 AM, Hermes Flying wrote:
> >> Some are IBM servers mentioning VMK (optional) and others are Fujitsu
> >> mentioning iRMC.
> >> Not sure what I can do in the IBM case.
> >>
> >>
> >> ------------------------------------------------------------------------
> >> *From:* Digimer <li...@alteeve.ca>
> >> *To:* Hermes Flying <flyingher...@yahoo.com>
> >> *Cc:* General Linux-HA mailing list <linux-ha@lists.linux-ha.org>
> >> *Sent:* Sunday, December 2, 2012 10:36 PM
> >> *Subject:* Re: [Linux-HA] Corosync on cluster with 3+ nodes
> >>
> >> I don't know anything about your setup or office situation, so I can't
> >> speak to your specific issues.
> >>
> >> What I can say is that almost all actual servers have remote management;
> >> Fujitsu servers have IPMI, HP servers have iLO, Dell servers have DRAC,
> >> IBM servers have RSA and so on. Most generic servers have at least basic
> >> IPMI. So it's very likely that you will already have what you need to
> >> implement fencing. If you don't, then something like the APC AP7900 is
> >> usually about $500 Canadian and will work as a fence device. I don't
> >> know what country you are in, so I can't speak more on availability,
> >> applicability or cost.
> >>
> >> You might seriously want to consider hiring someone to help you with
> >> this setup. Clustering is not a straightforward topic. The cost to hire
> >> a consultant might save you enough time and headache that it's cheaper
> >> in the end than trying to do it yourself.
> >>
> >> On 12/02/2012 03:31 PM, Hermes Flying wrote:
> >>> I am an application developer and so far I didn't care about HW specs
> >>> (not my responsibility).
> >>> Now it seems I must care, and since I am not sure what HW is being
> >>> recommended for servers I will verify tomorrow and let you know.
> >>> My concern with going to pacemaker (although it seems to be exactly
> >>> what I need) is that we would need to request new HW to migrate to
> >>> the new deployment. I don't even know if these devices are expensive.
> >>> Did you ever have an issue with this? If I have a problem with this,
> >>> should I be looking into a different solution? Any suggestion for an
> >>> alternative?
> >>>
> >>>
> >>> ------------------------------------------------------------------------
> >>> *From:* Digimer <li...@alteeve.ca>
> >>> *To:* Hermes Flying <flyingher...@yahoo.com>
> >>> *Cc:* General Linux-HA mailing list <linux-ha@lists.linux-ha.org>
> >>> *Sent:* Sunday, December 2, 2012 10:19 PM
> >>> *Subject:* Re: [Linux-HA] Corosync on cluster with 3+ nodes
> >>>
> >>> It must be an external device. If, for example, the kernel crashes hard,
> >>> or if you get a spinlock, the system may not respond to anything or a
> >>> service may never stop, blocking a reboot. You cannot trust that the
> >>> system is accessible or functioning in any way.
> >>>
> >>> Fencing *must* be an external device. Period.
> >>>
> >>> What kind of servers do you have?
> >>>
> >>> On 12/02/2012 03:15 PM, Hermes Flying wrote:
> >>>> If I have a requirement not to include external HW, is there any other
> >>>> way? I mean, I am not, by far, a Linux expert, but how come it doesn't
> >>>> do a restart or halt?
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------------
> >>>> *From:* Digimer <li...@alteeve.ca>
> >>>> *To:* Hermes Flying <flyingher...@yahoo.com>
> >>>> *Cc:* General Linux-HA mailing list <linux-ha@lists.linux-ha.org>
> >>>> *Sent:* Sunday, December 2, 2012 10:01 PM
> >>>> *Subject:* Re: [Linux-HA] Corosync on cluster with 3+ nodes
> >>>>
> >>>> On 12/02/2012 02:56 PM, Hermes Flying wrote:
> >>>>> Clear! In order to fence and crash a node, is there a specific HW
> >>>>> requirement to do this?
> >>>>
> >>>> Yes, there must be external hardware (out-of-band management counts as
> >>>> "external", despite physically being on the server's mainboard). The
> >>>> most common fence devices are IPMI, iLO, RSA, iDRAC and the like.
> >>>> Alternatives are switched PDUs, like APC's AP7900.
> >>>>
> >>>> --
> >>>> Digimer
> >>>> Papers and Projects: https://alteeve.ca/w/ 
> >>>> What if the cure for cancer is trapped in the mind of a person without
> >>>> access to education?
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Digimer
> >>> Papers and Projects: https://alteeve.ca/w/ 
> >>> What if the cure for cancer is trapped in the mind of a person without
> >>> access to education?
> >>>
> >>>
> >>
> >>
> >> --
> >> Digimer
> >> Papers and Projects: https://alteeve.ca/w/ 
> >> What if the cure for cancer is trapped in the mind of a person without
> >> access to education?
> >>
> >>
> > 
> > 
> > -- 
> > Digimer
> > Papers and Projects: https://alteeve.ca/w/ 
> > What if the cure for cancer is trapped in the mind of a person without
> > access to education?
> > 
> > 
> 



 
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
