On 02/18/2019 04:15 PM, Christine Caulfield wrote:
> On 15/02/2019 16:58, Edwin Török wrote:
>> On 15/02/2019 16:08, Christine Caulfield wrote:
>>> On 15/02/2019 13:06, Edwin Török wrote:
>>>> I tried again with 'debug: trace', lots of process pause messages here:
>>>> https://clbin.com/ZUHpd
>>>>
>>>> And here is an strace taken while running at realtime prio 99, showing a
>>>> LOT of epoll_wait and sendmsg calls (gz format):
>>>> https://clbin.com/JINiV
>>>>
>>>> It detects large numbers of members left, but I think this is because
>>>> the corosync on those hosts got similarly stuck:
>>>> Feb 15 12:51:07 localhost corosync[29278]: [TOTEM ] A new membership
>>>> (10.62.161.158:3152) was formed. Members left: 2 14 3 9 5 11 4 12 8 13 7 1 10
>>>> Feb 15 12:51:07 localhost corosync[29278]: [TOTEM ] Failed to receive
>>>> the leave message. failed: 2 14 3 9 5 11 4 12 8 13 7 1 10
>>>>
>>>> Looking on another host where corosync is still stuck at 100% CPU, it says:
>>>> https://clbin.com/6UOn6
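
A rough way to confirm this kind of busy loop on an affected host (a sketch
only; it assumes a single corosync process and that strace, chrt and perf
are available):

    # Per-syscall counts and times; a flood of epoll_wait calls that return
    # immediately points at a busy poll rather than blocking I/O.
    # Let it run for ~30s, then stop it with Ctrl-C to print the summary.
    strace -c -f -p $(pidof corosync)

    # Confirm the scheduling policy/priority corosync is actually running with.
    chrt -p $(pidof corosync)

    # See where the user-space cycles go (corosync itself vs. libqb).
    perf top -p $(pidof corosync)
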
>>> Thanks, that's really quite odd. I have vague recollections of a problem
>>> where corosync was spinning on epoll without reading anything, but I can't
>>> find the details at the moment, which is annoying.
>>>
>>> Some things you might be able to try that might help:
>>>
>>> 1) Is it possible to run without sbd? Sometimes too much polling from
>>> clients can cause odd behaviour.

The results without sbd might be especially interesting in light of the
issue being triggered via config reloads. Sbd has callbacks registered
(RR at 99 as well) that are kicked off by config reloads as well.

>>> 2) Is it possible to try with a different kernel? We've tried a vanilla
>>> 4.19 and it's fine, but not with the Xen patches obviously.

>> I'll try with some bare-metal upstream distros and report back the repro
>> steps if I can get it to reliably repro, hopefully early next week; it
>> is unlikely I'll get a working repro today.
>>
>>> 3) Does running corosync with the -p option help?
>>
>> Yes, with "-p" I was able to run cluster create/GFS2 plug/unplug/destroy
>> on 16 physical hosts in a loop for an hour without any crashes (previously
>> it would crash within minutes).
>>
>> I found another workaround too:
>> echo NO_RT_RUNTIME_SHARE >/sys/kernel/debug/sched_features
>>
>> This makes the 95% realtime process CPU limit from
>> sched_rt_runtime_us/sched_rt_period_us apply per core instead of
>> globally, so there is 5% time left for non-realtime tasks on each core.
>> That seems to be enough to avoid the livelock; I was not able to observe
>> corosync using a high CPU % anymore.
>> I still have more tests to run on this over the weekend, but it looks promising.
>>
>> This is a safety layer of course, to prevent the system from fencing if
>> we encounter high CPU usage in corosync/libqb. I am still interested in
>> tracking down the corosync/libqb issue as it shouldn't have happened in
>> the first place.
>>
> That's helpful to know. Does corosync still use lots of CPU time in this
> situation (without RT) or does it behave normally?

I'd expect the high load to come from some kind of busy-waiting (hidden
behind whatever complexity) on something that doesn't happen because it
is not scheduled. So under these other scheduler conditions I would, at
most, expect a short spike until the scheduler kicks in.

>>> Is there any situation where this has worked? Either with different
>>> components or different corosync.conf files?
>>>
>>> Also, and I don't think this is directly related to the issue, but I can
>>> see configuration reloads happening from 2 nodes every 5 seconds. It's
>>> very odd and maybe not what you want!

>> The configuration reloads are a way of triggering this bug reliably; I
>> should've mentioned that earlier.
>> (The problem happens during a configuration reload, but not always, and
>> by doing configuration reloads in a loop that just adds/removes one node,
>> the problem can be triggered reliably within minutes.)

> I've been trying this on my (KVM) virtual machines today but I can't
> reproduce it on a standard RHEL 7, so I'm interested to see how you get
> on with a different kernel.
>
> Chrissie

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
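
For reference, a minimal sketch of the RT-throttling workaround described
above, assuming debugfs is mounted at /sys/kernel/debug and a kernel that
still exposes the RT_RUNTIME_SHARE scheduler feature (sched_features is a
debug interface, so the setting is neither guaranteed to be available nor
persistent across reboots):

    # Default RT throttling budget: 950000 us of runtime per 1000000 us
    # period, i.e. realtime tasks may use at most 95% of CPU time.
    cat /proc/sys/kernel/sched_rt_runtime_us /proc/sys/kernel/sched_rt_period_us

    # Mount debugfs if it is not mounted already.
    mount -t debugfs none /sys/kernel/debug

    # Apply the 95% budget per core instead of letting cores borrow unused
    # RT runtime from each other, leaving 5% per core for non-RT tasks.
    echo NO_RT_RUNTIME_SHARE > /sys/kernel/debug/sched_features

    # Verify that the feature flag is now set.
    grep -o NO_RT_RUNTIME_SHARE /sys/kernel/debug/sched_features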