Re: [ClusterLabs] systemd RA start/stop delays
On 08/18/2016 04:00 PM, Ken Gaillot wrote:
> On 08/17/2016 08:17 PM, TEG AMJG wrote:
>> Hi
>>
>> I am having a problem with a simple Active/Passive cluster which
>> consists of the following configuration
>>
>> [...]
Re: [ClusterLabs] systemd RA start/stop delays
On 08/17/2016 08:17 PM, TEG AMJG wrote:
> Hi
>
> I am having a problem with a simple Active/Passive cluster which
> consists of the following configuration
>
> Cluster Name: kamcluster
> Corosync Nodes:
>  kam1vs3 kam2vs3
> Pacemaker Nodes:
>  kam1vs3 kam2vs3
>
> Resources:
>  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.0.1.206 cidr_netmask=32
>   Operations: start interval=0s timeout=20s (ClusterIP-start-interval-0s)
>               stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
>               monitor interval=10s (ClusterIP-monitor-interval-10s)
>  Resource: ClusterIP2 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.0.1.207 cidr_netmask=32
>   Operations: start interval=0s timeout=20s (ClusterIP2-start-interval-0s)
>               stop interval=0s timeout=20s (ClusterIP2-stop-interval-0s)
>               monitor interval=10s (ClusterIP2-monitor-interval-10s)
>  Resource: rtpproxycluster (class=systemd type=rtpproxy)
>   Operations: monitor interval=10s (rtpproxycluster-monitor-interval-10s)
>               stop interval=0s on-fail=block (rtpproxycluster-stop-interval-0s)
>  Resource: kamailioetcfs (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd1 directory=/etc/kamailio fstype=ext4
>   Operations: start interval=0s timeout=60 (kamailioetcfs-start-interval-0s)
>               monitor interval=10s on-fail=fence (kamailioetcfs-monitor-interval-10s)
>               stop interval=0s on-fail=fence (kamailioetcfs-stop-interval-0s)
>  Clone: fence_kam2_xvm-clone
>   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
>   Resource: fence_kam2_xvm (class=stonith type=fence_xvm)
>    Attributes: port=tegamjg_kam2 pcmk_host_list=kam2vs3
>    Operations: monitor interval=60s (fence_kam2_xvm-monitor-interval-60s)
>  Master: kamailioetcclone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1
>               notify=true on-fail=fence
>   Resource: kamailioetc (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=kamailioetc
>    Operations: start interval=0s timeout=240 (kamailioetc-start-interval-0s)
>                promote interval=0s on-fail=fence (kamailioetc-promote-interval-0s)
>                demote interval=0s on-fail=fence (kamailioetc-demote-interval-0s)
>                stop interval=0s on-fail=fence (kamailioetc-stop-interval-0s)
>                monitor interval=10s (kamailioetc-monitor-interval-10s)
>  Clone: fence_kam1_xvm-clone
>   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
>   Resource: fence_kam1_xvm (class=stonith type=fence_xvm)
>    Attributes: port=tegamjg_kam1 pcmk_host_list=kam1vs3
>    Operations: monitor interval=60s (fence_kam1_xvm-monitor-interval-60s)
>  Resource: kamailiocluster (class=ocf provider=heartbeat type=kamailio)
>   Attributes: listen_address=10.0.1.206 conffile=/etc/kamailio/kamailio.cfg
>               pidfile=/var/run/kamailio.pid monitoring_ip=10.0.1.206
>               monitoring_ip2=10.0.1.207 port=5060 proto=udp
>               kamctlrc=/etc/kamailio/kamctlrc shmem=128 pkg=8
>   Meta Attrs: target-role=Stopped
>   Operations: start interval=0s timeout=60 (kamailiocluster-start-interval-0s)
>               stop interval=0s timeout=30 (kamailiocluster-stop-interval-0s)
>               monitor interval=5s (kamailiocluster-monitor-interval-5s)
>
> Stonith Devices:
> Fencing Levels:
>
> Location Constraints:
> Ordering Constraints:
>   start fence_kam1_xvm-clone then start fence_kam2_xvm-clone (kind:Mandatory)
>     (id:order-fence_kam1_xvm-clone-fence_kam2_xvm-clone-mandatory)
>   start fence_kam2_xvm-clone then promote kamailioetcclone (kind:Mandatory)
>     (id:order-fence_kam2_xvm-clone-kamailioetcclone-mandatory)
>   promote kamailioetcclone then start kamailioetcfs (kind:Optional)
>     (id:order-kamailioetcclone-kamailioetcfs-Optional)
>   Resource Sets:
>     set kamailioetcfs sequential=true (id:pcs_rsc_set_kamailioetcfs)
>     set ClusterIP ClusterIP2 sequential=false (id:pcs_rsc_set_ClusterIP_ClusterIP2)
>     set rtpproxycluster kamailiocluster sequential=true
>       (id:pcs_rsc_set_rtpproxycluster_kamailioclust
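(Not from the original mails.) For readers hitting the start/stop delays in the subject line, the usual knobs on a systemd-class resource are explicit start and stop operation timeouts. A minimal pcs sketch follows; the 100s/60s values are purely illustrative assumptions, while the monitor interval and the on-fail=block stop setting mirror the configuration quoted above:

    # Illustrative only: define the systemd-class resource with explicit
    # start/stop timeouts (the timeout values are assumptions, not taken
    # from this thread).
    pcs resource create rtpproxycluster systemd:rtpproxy \
        op monitor interval=10s \
        op start interval=0s timeout=100s \
        op stop interval=0s timeout=60s on-fail=block

Pacemaker waits on the systemd job when starting or stopping such a resource, so these operation timeouts generally need to be at least as long as the unit's own TimeoutStartSec/TimeoutStopSec.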
Re: [ClusterLabs] Failing over NFSv4/TCP exports
Thanks for the info. I only use ESXi, which likely explains why I never had issues...

Patrick Zwahlen wrote:
> Hi,
>
>> -----Original Message-----
>> From: Andreas Kurz [mailto:andreas.k...@gmail.com]
>> Sent: Wednesday, 17 August 2016 23:16
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> Subject: Re: [ClusterLabs] Failing over NFSv4/TCP exports
>>
>> This is a known problem ... have a look into the portblock RA - it has
>> the feature to send out TCP tickle ACKs to reset such hanging sessions.
>> So you can configure a portblock resource that blocks the tcp port
>> before starting the VIP, and another portblock resource that unblocks the
>> port afterwards and sends out those tickle ACKs.
>
> Thanks Andreas for pointing me to the portblock RA. I wasn't aware of it
> and will read/test.
>
> I also did some further testing with ESXi and found that the ESXi NFS
> client behaves completely differently from the Linux client, and at first
> sight it actually seems to work (where the Linux client fails).
>
> That is mainly due to two things:
>
> 1) The ESXi NFS client is much more aggressive about monitoring the server
>    and restarting sessions.
>
> 2) Every new TCP session comes from a different source port, whereas the
>    Linux client seems to stick to a single source port. This actually solves
>    the issue of failing back to a node with FIN_WAIT1 sessions.
>
> Regards, Patrick
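(Not from the original mails.) A rough pcs sketch of what Andreas describes: one portblock resource blocks the NFS TCP port before the VIP comes up, and a second one unblocks it afterwards and sends the tickle ACKs. The resource and group names, the VIP address, and the tickle_dir path below are made up, and tickle_dir has to live on storage that fails over together with the NFS service:

    # Hypothetical names/addresses - adjust to the real NFS cluster.
    pcs resource create nfs-block ocf:heartbeat:portblock \
        ip=192.168.1.200 portno=2049 protocol=tcp action=block
    pcs resource create nfs-unblock ocf:heartbeat:portblock \
        ip=192.168.1.200 portno=2049 protocol=tcp action=unblock \
        tickle_dir=/srv/nfs/tickle
    # Group order gives: block port -> bring up VIP -> export -> unblock + tickle ACKs
    # (nfs-vip and nfs-export stand in for the existing VIP and exportfs resources).
    pcs resource group add g-nfs nfs-block nfs-vip nfs-export nfs-unblock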
Re: [ClusterLabs] Failing over NFSv4/TCP exports
Hi,

> -----Original Message-----
> From: Andreas Kurz [mailto:andreas.k...@gmail.com]
> Sent: Wednesday, 17 August 2016 23:16
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] Failing over NFSv4/TCP exports
>
> This is a known problem ... have a look into the portblock RA - it has
> the feature to send out TCP tickle ACKs to reset such hanging sessions.
> So you can configure a portblock resource that blocks the tcp port
> before starting the VIP, and another portblock resource that unblocks the
> port afterwards and sends out those tickle ACKs.

Thanks Andreas for pointing me to the portblock RA. I wasn't aware of it and
will read/test.

I also did some further testing with ESXi and found that the ESXi NFS client
behaves completely differently from the Linux client, and at first sight it
actually seems to work (where the Linux client fails).

That is mainly due to two things:

1) The ESXi NFS client is much more aggressive about monitoring the server
   and restarting sessions.

2) Every new TCP session comes from a different source port, whereas the
   Linux client seems to stick to a single source port. This actually solves
   the issue of failing back to a node with FIN_WAIT1 sessions.

Regards, Patrick
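(Not from the original mails.) The stuck sessions Patrick mentions are easy to spot on the node being failed back to; something like the following, assuming the standard NFS port 2049, lists server-side connections stuck in FIN_WAIT1:

    # List TCP sessions from the local NFS server port that are stuck in FIN_WAIT1
    ss -tn state fin-wait-1 '( sport = :2049 )'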