Re: [Pacemaker] pacemaker dev lead suggesting s/w upgrade

Andrew Beekhof Tue, 07 May 2013 15:50:31 -0700

Please keep all questions on the mailing list... I don't have the bandwidth for 
1-1 support.
At a minimum, include (as attachments) logs from all machines.


Also, since you're upgrading, can I suggest you go with 1.1.10-rc2?
Even though it's "only" a release candidate, it's far superior to 1.1.8.

-- Andrew

On 08/05/2013, at 12:12 AM, Vinod Prabhu <vinod.pra...@ipaccess.com> wrote:

> Hi Andrew,
>  
> Greetings,
> We are using corosync/pacemaker for high availability.
> This is a 4-node HA cluster where each pair of nodes is configured for DB 
> and file system replication, so there are 2 DRBD pairs in total.
> I have attached the output of “crm configure show” for reference.
> The issue was observed after upgrading the pacemaker and corosync versions:
> Pacemaker: 1.1.5 to 1.1.8
> Corosync: 1.2.7 to 1.4.1
>  
> We use the following procedure to upgrade [the repo file is downloaded with 
> wget -O /etc/yum.repos.d/pacemaker.repo 
> http://clusterlabs.org/rpm-next/rhel-5/clusterlabs.repo]:
>  
> crm configure save > local.conf
> crm node standby <on-each-node>
> service corosync stop <on-each-node>
> yum remove -y corosync
> yum remove -y corosync-debuginfo
> yum remove -y pacemaker-debuginfo
> yum remove -y heartbeat-libs
> yum remove -y heartbeat-debuginfo
> yum install -y pacemaker corosync
> cd /root/
> rpm -ivh crmsh-1.2.5-55.3.x86_64.rpm crmsh-debuginfo-1.2.5-55.3.x86_64.rpm
> service corosync start <on-each-node>
> crm node online <on-each-node>
> crm configure load replace local.conf
>  
> After the last step, one of the nodes is frozen. Can you help with this? Are 
> there any other documents you require? Or is the upgrade procedure missing a 
> step?
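
A rough sketch of the per-node part of the procedure above as a script, in
case it helps to replay it the same way on every node. It assumes the package
names and the crmsh RPMs under /root/ exactly as listed; the node-name
argument and the DRY_RUN switch are made up for illustration. By default it
only prints what it would run. The cluster-wide "crm configure save" and
"crm configure load replace" steps are left out and would still be run once,
by hand.

```shell
#!/bin/sh
# Sketch only: replays the quoted upgrade steps for ONE node.
# DRY_RUN (hypothetical switch) defaults to 1: print commands, do not run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise run it and stop on failure.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@" || { echo "FAILED: $*" >&2; exit 1; }
    fi
}

upgrade_node() {
    node="$1"    # hypothetical argument: name of the node being upgraded

    run crm node standby "$node"
    run service corosync stop

    # Same packages, same order, as in the quoted procedure.
    for pkg in corosync corosync-debuginfo pacemaker-debuginfo \
               heartbeat-libs heartbeat-debuginfo; do
        run yum remove -y "$pkg"
    done

    run yum install -y pacemaker corosync
    run rpm -ivh /root/crmsh-1.2.5-55.3.x86_64.rpm \
                 /root/crmsh-debuginfo-1.2.5-55.3.x86_64.rpm

    run service corosync start
    run crm node online "$node"
}

upgrade_node "${1:-node1}"
```

Saved as, say, upgrade-node.sh (name is arbitrary), "sh upgrade-node.sh node1"
prints the command list; DRY_RUN=0 would execute it for real and stop at the
first command that fails, which makes it easier to see which step a node
gets stuck on.
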
>  
> Vinod
>  
> From: Babu Challa 
> Sent: Tuesday, May 07, 2013 3:17 PM
> To: Vinod Prabhu
> Subject: FW: pacemaker dev lead suggesting s/w upgrade
>  
> FYI
>  
> R
> Babu Challa
> T: +44 (0) 1954 717972 | M: +44 (0) 7912 859958| E: babu.cha...@ipaccess.com 
> | W: www.ipaccess.com
> ip.access Ltd, Building 2020, Cambourne Business Park, Cambourne, Cambridge, 
> CB23 6DW
>  
> The desire to excel is exclusive of the fact whether someone else appreciates 
> it or not. "Excellence" is a drive from inside, not outside. Excellence is 
> not for someone else to notice but for your own satisfaction and efficiency...
>  
> From: Babu Challa 
> Sent: 02 May 2013 10:27
> To: Vinod Prabhu; Michael van der Westhuizen; Dinesh Arney
> Cc: Karthik Ganesan; Gavin Stevens; Mandar Magikar
> Subject: pacemaker dev lead suggesting s/w upgrade
>  
> Hi All,
>  
> I have asked Andrew Beekhof (team leader for Pacemaker development) for 
> his input on the HA issues. He believes there was a bug in pacemaker and 
> suggests a pacemaker upgrade.
>  
> Please find the email below; his reply is in bold green.
>  
> R
> Babu Challa
>  
>  
> On 01/05/2013, at 11:55 PM, Babu Challa <babu.cha...@ipaccess.com> wrote:
>  
> Hi Andrew,
> Thanks for the reply. I have now managed to reproduce the issue. I am 
> enclosing the steps here for the pacemaker team, and would appreciate 
> their advice on resolving this issue.
>  
> Update your software.
>  
>  
> -----Original Message-----
> From: Andrew Beekhof [mailto:and...@beekhof.net] 
> Sent: 01 May 2013 01:20
> To: Babu Challa
> Cc: The Pacemaker cluster resource manager
> Subject: Re: corosync restarts service when slave node joins the cluster
>  
> Hi Andrew,
>  
> Greetings,
>  
> We are using corosync/pacemaker for high availability.
>  
> This is a 4-node HA cluster where each pair of nodes is configured for DB 
> and file system replication. We have a very tricky situation: we have 
> configured two clusters with exactly the same configuration on each, but on 
> one cluster corosync restarts the services when the slave node is rebooted 
> and re-joins the cluster.
>  
> We have tried to reproduce the issue on the other cluster with multiple HA 
> scenarios, but with no luck.
>  
> A few questions:
>  
> 1.       If the rebooted slave is the DC (Designated Controller), is there 
> any possibility of this issue?
> 2.       Is there any known issue in the pacemaker version we are currently 
> using (1.1.5) which would be resolved if we upgrade to the latest (1.1.8)?
>  
> I believe there was one, check the ChangeLog
>  
> 3.       Is there any chance that pacemaker/corosync behaves differently even 
> though the configuration is the same on each cluster?
>  
> Timing issues do occur, how identical is the hardware?
>  
> 4.       Can you please let us know if there is any possible reason for this 
> issue? That would be really helpful for reproducing and fixing it.
>  
> More than likely it has been fixed in a later version.
>  
>  
> Versions we are using:
>  
> Pacemaker version - pacemaker-1.1.5
> Corosync version - corosync-1.2.7
> heartbeat-3.0.3-2.3
>  
>  R
> Babu Challa
>  
> This message contains confidential information and may be privileged. If you 
> are not the intended recipient, please notify the sender and delete the 
> message immediately.
>  
> ip.access ltd, registration number 3400157, Building 2020, Cambourne
>  Business Park, Cambourne, Cambridge CB23 6DW, United Kingdom
>  
>  
> <nos.conf>


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
