Googling "fencing agent IPMI" helps :) This link might be useful: https://fedorahosted.org/cluster/wiki/IPMI_FencingConfig
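For reference, the DRBD side of what Digimer describes further down the thread (the 'crm-fence-peer.sh' handler plus the 'resource-and-stonith' policy) might look roughly like the fragment below. Treat it as a sketch only: the resource name r0 and the /usr/lib/drbd/ path are assumptions, so check where your packages install the script, and the unfence handler is the usual companion setting rather than something mentioned in the thread. It is only useful once stonith is configured and tested in pacemaker.

```
# Sketch of DRBD 8.3-style fencing settings; "r0" is a placeholder resource name.
resource r0 {
  disk {
    # Fencing policy named in the thread.
    fencing resource-and-stonith;
  }
  handlers {
    # Path is an assumption; locate crm-fence-peer.sh on your install.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Usual companion handler (not mentioned in the thread) that removes
    # the fencing constraint again once resync has finished.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  # ... existing device/disk/address settings stay as they are ...
}
```

The same settings can also go in the common section if every resource should use them.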
Regards
Arjun

On Wed, Oct 29, 2014 at 2:11 PM, kamal kishi <kamal.ki...@gmail.com> wrote:
> Thanks for the info, was trying to configure IPMI in the servers.
> Can you please suggest a configuration procedure for enabling and
> configuring the IPMI(Which you might have referred to).
> The sites I came across are not understandable.
> The servers I'm using is DELL POWEREDGE R320
>
> On Tue, Oct 28, 2014 at 7:55 PM, Digimer <li...@alteeve.ca> wrote:
>>
>> On 28/10/14 02:24 AM, kamal kishi wrote:
>>>
>>> Hi,
>>>
>>> I know, no fencing configuration creates issue.
>>> But the current scenario is due to fencing??
>>
>> Maybe, maybe not. I can say that *not* having it will make solving the
>> problem much more difficult. Please get it working, it's pretty easy and it
>> will make your life a lot easier.
>>
>>> The syslog isn't revealing much about the same.
>>> I would love to configure fencing but currently need some solution to
>>> overcome the current scenario, if you say fencing is the only solution
>>> then I might have to do it remotely.
>>
>> It is critical, yes. Please add it, test it and then hook DRBD into it.
>>
>>> OS -> UBUNTU 12.04 (64 bits)
>>> DRBD -> 8.3.11
>>
>> That is quite old. Can you update to 8.3.16? Also, what version is
>> pacemaker and corosync?
>>
>>> Thanks for the quick reply
>>>
>>> On Tue, Oct 28, 2014 at 11:19 AM, Digimer <li...@alteeve.ca> wrote:
>>>
>>>     On 28/10/14 01:39 AM, kamal kishi wrote:
>>>
>>>         Hi all,
>>>
>>>         Facing a strange issue which I'm not able to resolve as I'm not
>>>         sure where what is going wrong as the logs is not giving away
>>>         much to my knowledge.
>>>
>>>         Issue -
>>>         Have configured 2 Node Clustering, have attached the
>>>         configuration file(New CRM conf of BIC.txt).
>>>
>>>         If Server2 which is primary is shutdown(forcefully by turning
>>>         off the switch), Server1 restarts within few seconds and starts
>>>         the resources.
>>>         Even though the Server1 restarts and starts the resources the
>>>         time taken to recover is too long to convince the clients and
>>>         the current working is erroneous is what I feel.
>>>
>>>         Have attached the syslog with this mail.(syslog)
>>>
>>>         Do go through the same and let know a solution to resolve the
>>>         same as the setup is in clients place.
>>>
>>>         --
>>>         Regards,
>>>         Kamal Kishore B V
>>>
>>>     You really need fencing, first and foremost. This will cause the
>>>     survivor to put the lost node into a known state and then safely
>>>     begin taking over lost services. Do your nodes have IPMI (or iRMC,
>>>     iLO, DRAC, etc)? If so, setting up stonith is easy.
>>>
>>>     Once it is setup, configure DRBD to use the fence-handler
>>>     'crm-fence-peer.sh' and change the fencing policy to
>>>     'resource-and-stonith'. Without this, you will get split-brains and
>>>     fail-over will be unpredictable.
>>>
>>>     Once stonith is configured and tested in pacemaker and you've hooked
>>>     DRBD's fencing into pacemaker, see if you problem remains. If it
>>>     does, on both nodes, run: 'tail -f -n 0 /var/log/messages', kill a
>>>     node and wait for things to settle down. Share the log output here.
>>>
>>>     Please also tell us your OS, pacemaker, drbd and corosync versions.
>>>
>>>     --
>>>     Digimer
>>>     Papers and Projects: https://alteeve.ca/w/
>>>     What if the cure for cancer is trapped in the mind of a person
>>>     without access to education?
>>>
>>>     _________________________________________________
>>>     Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>     http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>>     Project Home: http://www.clusterlabs.org
>>>     Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>     Bugs: http://bugs.clusterlabs.org
>>>
>>> --
>>> Regards,
>>> Kamal Kishore B V
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without
>> access to education?
>
> --
> Regards,
> Kamal Kishore B V