Thanks, Ondrej, for the response. I came to the same conclusion: reducing HADR_TIMEOUT and increasing the promote timeout resolved the issue.

Regards,

Dileep V Nair
Senior AIX Administrator
Cloud Managed Services Delivery (MSD), India
IBM Cloud
E-mail: dilen...@in.ibm.com
Outer Ring Road, Embassy Manya
Bangalore, KA 560045
India

From: Ondrej Famera <ofam...@redhat.com>
To: Dileep V Nair <dilen...@in.ibm.com>
Cc: Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
Date: 02/12/2018 11:46 AM
Subject: Re: [ClusterLabs] Issues with DB2 HADR Resource Agent

On 02/01/2018 07:24 PM, Dileep V Nair wrote:
> Thanks Ondrej for the response. I have set the PEER_WINDOW to 1000 which
> I guess is a reasonable value. What I am noticing is it does not wait
> for the PEER_WINDOW. Before that itself the DB goes into a
> REMOTE_CATCHUP_PENDING state and Pacemaker gives an Error saying a DB in
> STANDBY/REMOTE_CATCHUP_PENDING/DISCONNECTED can never be promoted.
>
> Regards,
>
> *Dileep V Nair*

Hi Dileep,

sorry for the late response. DB2 should not get into the 'REMOTE_CATCHUP' phase; if it does, the DB2 resource agent will indeed not promote it. In my experience it usually gets into that state when DB2 on the standby was restarted during or after the PEER_WINDOW timeout.

When the primary DB2 fails, the standby should end up in a state that matches the pattern on line 770 of the DB2 resource agent, and then the promote operation is attempted.

    770 STANDBY/*PEER/DISCONNECTED|Standby/DisconnectedPeer)

https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/db2#L770

DB2 on the standby can get restarted when the 'promote' operation times out. So if you see that DB2 was restarted after the primary failed, try increasing the 'promote' timeout.

If DB2 was not restarted, then the question is why DB2 decided to change its status in this way.

Let me know if the above helped.

--
Ondrej Faměra
@Red Hat
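
For readers following the thread, the state check described above can be illustrated with a small shell sketch. It reuses only the case pattern quoted from line 770; the function name `hadr_state_promotable` and the printed strings are hypothetical and do not come from the resource agent, which does considerably more than this.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; NOT part of the db2 resource agent.
# It mirrors the case pattern quoted from line 770 to show which HADR
# state strings that pattern accepts.
hadr_state_promotable() {
    case "$1" in
        STANDBY/*PEER/DISCONNECTED|Standby/DisconnectedPeer)
            # Standby lost its primary while in a peer-type state.
            echo "promotable" ;;
        *)
            # Anything else, e.g. REMOTE_CATCHUP_PENDING, falls through.
            echo "not promotable" ;;
    esac
}

hadr_state_promotable "STANDBY/DISCONNECTED_PEER/DISCONNECTED"
hadr_state_promotable "STANDBY/REMOTE_CATCHUP_PENDING/DISCONNECTED"
```

The first call matches the pattern, while the second, which is the state reported in the quoted message, does not. That is consistent with the agent refusing to promote in that state.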
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org