Re: [ClusterLabs] systemd RA start/stop delays

2016-08-18 Thread Ken Gaillot
On 08/17/2016 08:17 PM, TEG AMJG wrote:
> Hi
> 
> I am having a problem with a simple active/passive cluster, which
> consists of the following configuration:
> 
> Cluster Name: kamcluster
> Corosync Nodes:
>  kam1vs3 kam2vs3
> Pacemaker Nodes:
>  kam1vs3 kam2vs3
> 
> Resources:
>  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.0.1.206 cidr_netmask=32
>   Operations: start interval=0s timeout=20s (ClusterIP-start-interval-0s)
>   stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
>   monitor interval=10s (ClusterIP-monitor-interval-10s)
>  Resource: ClusterIP2 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.0.1.207 cidr_netmask=32
>   Operations: start interval=0s timeout=20s (ClusterIP2-start-interval-0s)
>   stop interval=0s timeout=20s (ClusterIP2-stop-interval-0s)
>   monitor interval=10s (ClusterIP2-monitor-interval-10s)
>  Resource: rtpproxycluster (class=systemd type=rtpproxy)
>   Operations: monitor interval=10s (rtpproxycluster-monitor-interval-10s)
>   stop interval=0s on-fail=block
> (rtpproxycluster-stop-interval-0s)
>  Resource: kamailioetcfs (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd1 directory=/etc/kamailio fstype=ext4
>   Operations: start interval=0s timeout=60 (kamailioetcfs-start-interval-0s)
>   monitor interval=10s on-fail=fence
> (kamailioetcfs-monitor-interval-10s)
>   stop interval=0s on-fail=fence
> (kamailioetcfs-stop-interval-0s)
>  Clone: fence_kam2_xvm-clone
>   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
>   Resource: fence_kam2_xvm (class=stonith type=fence_xvm)
>Attributes: port=tegamjg_kam2 pcmk_host_list=kam2vs3
>Operations: monitor interval=60s (fence_kam2_xvm-monitor-interval-60s)
>  Master: kamailioetcclone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
> clone-node-max=1 notify=true on-fail=fence
>   Resource: kamailioetc (class=ocf provider=linbit type=drbd)
>Attributes: drbd_resource=kamailioetc
>Operations: start interval=0s timeout=240 (kamailioetc-start-interval-0s)
>promote interval=0s on-fail=fence
> (kamailioetc-promote-interval-0s)
>demote interval=0s on-fail=fence
> (kamailioetc-demote-interval-0s)
>stop interval=0s on-fail=fence (kamailioetc-stop-interval-0s)
>monitor interval=10s (kamailioetc-monitor-interval-10s)
>  Clone: fence_kam1_xvm-clone
>   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
>   Resource: fence_kam1_xvm (class=stonith type=fence_xvm)
>Attributes: port=tegamjg_kam1 pcmk_host_list=kam1vs3
>Operations: monitor interval=60s (fence_kam1_xvm-monitor-interval-60s)
>  Resource: kamailiocluster (class=ocf provider=heartbeat type=kamailio)
>   Attributes: listen_address=10.0.1.206 conffile=/etc/kamailio/kamailio.cfg
> pidfile=/var/run/kamailio.pid monitoring_ip=10.0.1.206
> monitoring_ip2=10.0.1.207 port=5060 proto=udp
> kamctlrc=/etc/kamailio/kamctlrc shmem=128 pkg=8
>   Meta Attrs: target-role=Stopped
>   Operations: start interval=0s timeout=60
> (kamailiocluster-start-interval-0s)
>   stop interval=0s timeout=30 (kamailiocluster-stop-interval-0s)
>   monitor interval=5s (kamailiocluster-monitor-interval-5s)
> 
> Stonith Devices:
> Fencing Levels:
> 
> Location Constraints:
> Ordering Constraints:
>   start fence_kam1_xvm-clone then start fence_kam2_xvm-clone
> (kind:Mandatory) (id:order-fence_kam1_xvm-clone-fence_kam2_xvm-clone-mandatory)
>   start fence_kam2_xvm-clone then promote kamailioetcclone
> (kind:Mandatory) (id:order-fence_kam2_xvm-clone-kamailioetcclone-mandatory)
>   promote kamailioetcclone then start kamailioetcfs (kind:Optional)
> (id:order-kamailioetcclone-kamailioetcfs-Optional)
>   Resource Sets:
> set kamailioetcfs sequential=true (id:pcs_rsc_set_kamailioetcfs) set
> ClusterIP ClusterIP2 sequential=false
> (id:pcs_rsc_set_ClusterIP_ClusterIP2) set rtpproxycluster
> kamailiocluster sequential=true
> (id:pcs_rsc_set_rtpproxycluster_kamailiocluster)
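
Note that the rtpproxycluster systemd resource above carries no explicit
start/stop timeouts, so Pacemaker's operation-timeout defaults apply while
it waits on systemd's own job handling. A minimal sketch of giving it
explicit operation timeouts with pcs (the 100s values are illustrative
assumptions, not tested recommendations):

    pcs resource update rtpproxycluster \
        op start interval=0s timeout=100s \
        op stop interval=0s timeout=100s on-fail=block

If the delay comes from the unit itself, systemd's TimeoutStartSec and
TimeoutStopSec settings for rtpproxy.service are worth checking alongside
the cluster-side timeouts.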

Re: [ClusterLabs] Failing over NFSv4/TCP exports

2016-08-18 Thread Dan Swartzendruber
Thanks for the info. I only use ESXi, which likely explains why I never had
issues...



Re: [ClusterLabs] Failing over NFSv4/TCP exports

2016-08-18 Thread Patrick Zwahlen
Hi,

> -Original Message-
> From: Andreas Kurz [mailto:andreas.k...@gmail.com]
> Sent: mercredi, 17 août 2016 23:16
> To: Cluster Labs - All topics related to open-source clustering welcomed
> 
> Subject: Re: [ClusterLabs] Failing over NFSv4/TCP exports
> 
> This is a known problem ... have a look at the portblock RA - it can
> send out TCP tickle ACKs to reset such hanging sessions. So you can
> configure one portblock resource that blocks the TCP port before
> starting the VIP, and another portblock resource that unblocks the
> port afterwards and sends out those tickle ACKs.

Thanks, Andreas, for pointing me to the portblock RA. I wasn't aware of it
and will read/test.
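
For the archives, here is a minimal sketch of what that block/unblock pair
might look like with pcs (the resource names, the 10.0.0.10 VIP, and the
tickle directory are assumptions on my side; `pcs resource describe
ocf:heartbeat:portblock` lists the exact parameters for your version):

    pcs resource create nfs-block ocf:heartbeat:portblock \
        protocol=tcp portno=2049 ip=10.0.0.10 action=block \
        op monitor interval=10s

    pcs resource create nfs-unblock ocf:heartbeat:portblock \
        protocol=tcp portno=2049 ip=10.0.0.10 action=unblock \
        tickle_dir=/srv/nfs/tickle reset_local_on_unblock_stop=1 \
        op monitor interval=10s

    # Group order = start order: block the port, bring up the VIP, then
    # unblock (the tickle ACKs go out when the unblock resource starts).
    # nfs-vip stands in for the existing VIP resource.
    pcs resource group add nfs-ha nfs-block nfs-vip nfs-unblock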

I also did some further testing with ESXi and found that the ESXi NFS client
behaves completely differently from the Linux client; at first sight it
actually seems to work (where the Linux client fails).

It's mainly due to two things:

1) Their NFS client is much more aggressive about monitoring the server and
restarting sessions.

2) Every new TCP session comes from a different source port, whereas the
Linux client seems to stick to a single source port. This actually avoids
the issue of failing back to a node with FIN_WAIT1 sessions.
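
As a side note, those lingering sessions are easy to see on the old node
after a failback attempt (assuming NFSv4 on TCP port 2049):

    ss -tan state fin-wait-1 '( sport = :2049 )'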

Regards, Patrick
