Re: [ClusterLabs] Q: (SLES11 SP4) lrm_rsc_op without last-run?

2018-08-23 Thread Ken Gaillot
On Thu, 2018-08-23 at 08:08 +0200, Ulrich Windl wrote: > Hi! > > Many years ago I wrote a parser that could format the CIB XML in a > flexible way. Today I used it again to print some statistics for > "exec-time". Thereby I discovered one operation that has a valid > "exec-time", a valid

Re: [ClusterLabs] Q: automaticlly remove expired location constraints

2018-08-23 Thread Ken Gaillot
On Thu, 2018-08-23 at 12:27 +0200, Ulrich Windl wrote: > Hi! > > I have a non-trivial question: How can I remove expired manual > migration requests, like the following?: > location cli-standby-rsc rsc rule -inf: #uname eq host and date lt > "2013-06-12 13:47:26Z" > > One problem is that the

[ClusterLabs] Antw: Re: Antw: Re: Spurious node loss in corosync cluster

2018-08-23 Thread Ulrich Windl
>>> Prasad Nagaraj wrote on 22.08.2018 at 19:00 >>> in message: > Hi - My systems are single core cpu VMs running on azure platform. I am OK, so you don't have any control over overprovisioning CPU power and the VM being migrated between nodes, I guess. Be aware that the CPU time you are

[ClusterLabs] Antw: Re: Antw: Re: Spurious node loss in corosync cluster

2018-08-23 Thread Ulrich Windl
>>> Prasad Nagaraj wrote on 22.08.2018 at 02:59 >>> in message: > Thanks Ken and Ulrich. There is definitely high IO on the system with > sometimes IOWAITs of up to 90% > I have come across some previous posts that IOWAIT is also considered as > CPU load by Corosync. Is this true ? Does

[ClusterLabs] Antw: Re: Q: ordering for a monitoring op only?

2018-08-23 Thread Ulrich Windl
>>> Ryan Thomas wrote on 21.08.2018 at 17:38 in message: > You could accomplish this by creating a custom RA which normally acts as a > pass-through and calls the "real" RA. However, it intercepts "monitor" > actions, checks nfs, and if nfs is down it returns success, otherwise it > passes

[ClusterLabs] Q: automaticlly remove expired location constraints

2018-08-23 Thread Ulrich Windl
Hi! I have a non-trivial question: How can I remove expired manual migration requests, like the following?: location cli-standby-rsc rsc rule -inf: #uname eq host and date lt "2013-06-12 13:47:26Z" One problem is that the date value is not a constant, and it had to be compared against the

Re: [ClusterLabs] Redundant ring not recovering after node is back

2018-08-23 Thread David Tolosa
I'm currently using an Ubuntu 18.04 server configuration with netplan. Here you have my current YAML configuration: # This file describes the network interfaces available on your system # For more information, see netplan(5). network: version: 2 renderer: networkd ethernets: eno1:

Re: [ClusterLabs] Redundant ring not recovering after node is back

2018-08-23 Thread David Tolosa
BTW, where can I download Corosync 3.x? I've only seen Corosync 2.99.3 Alpha4 at http://corosync.github.io/corosync/ 2018-08-23 9:11 GMT+02:00 David Tolosa : > I'm currently using an Ubuntu 18.04 server configuration with netplan. > > Here you have my current YAML configuration: > > # This file

Re: [ClusterLabs] Redundant ring not recovering after node is back

2018-08-23 Thread Jan Friesse
David, BTW, where can I download Corosync 3.x? I've only seen Corosync 2.99.3 Alpha4 at http://corosync.github.io/corosync/ Yes, that's Alpha 4 of Corosync 3. 2018-08-23 9:11 GMT+02:00 David Tolosa : I'm currently using an Ubuntu 18.04 server configuration with netplan. Here you have

Re: [ClusterLabs] Redundant ring not recovering after node is back

2018-08-23 Thread Jan Friesse
David, Hello, I'm going crazy over this problem, which I hope to resolve here with your help, guys: I have 2 nodes with the Corosync redundant ring feature. Each node has 2 similarly connected/configured NICs. Both nodes are connected to each other by two crossover cables. I believe this is

Re: [ClusterLabs] Antw: Re: Spurious node loss in corosync cluster

2018-08-23 Thread Jan Friesse
Prasad, Hi - My systems are single core cpu VMs running on azure platform. I am OK, now it makes sense. I don't think you get many guarantees in a cloud environment, so quite a large scheduling pause can simply happen. Also, a single-core CPU is kind of "unsupported" today. running

[ClusterLabs] Q: (SLES11 SP4) lrm_rsc_op without last-run?

2018-08-23 Thread Ulrich Windl
Hi! Many years ago I wrote a parser that could format the CIB XML in a flexible way. Today I used it again to print some statistics for "exec-time". Thereby I discovered one operation that has a valid "exec-time", a valid "last-rc-change", but no "last-run". All other operations had