Thank you! I'll try 1.1.12.

--
Best regards,
Sergey Arlashin

On Jan 6, 2015, at 3:23 AM, Andrew Beekhof <and...@beekhof.net> wrote:

> Yeah, I can imagine 1.1.6 behaving like this.
> I'd highly recommend 1.1.12
>
>> On 5 Jan 2015, at 5:14 pm, Sergey Arlashin <sergeyarl.maill...@gmail.com> wrote:
>>
>> Pacemaker 1.1.6
>>
>> It runs on Ubuntu 12.04 LTS 64bit.
>>
>> Linux lb-node1 3.11.0-23-generic #40~precise1-Ubuntu SMP Wed Jun 4 22:06:36 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>>
>> --
>> Best regards,
>> Sergey Arlashin
>>
>>
>> On Jan 5, 2015, at 7:59 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>>
>>> pacemaker version? it looks familiar but it depends on the version number.
>>>
>>>> On 29 Dec 2014, at 10:24 pm, Sergey Arlashin <sergeyarl.maill...@gmail.com> wrote:
>>>>
>>>> Hi!
>>>> Recently I've noticed that one of my nodes had OFFLINE status in 'crm
>>>> status' output, but it actually was not offline. I could ssh into this
>>>> node, and I could get 'crm status' from that node's console. After some
>>>> time it became online again. The same thing has happened several times
>>>> with other nodes, without any obvious reason.
>>>>
>>>> Still no error or fatal messages in the logs.
>>>> The only warning messages I could get from corosync.log were the following:
>>>>
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1346 -> 0.233.1347 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1347 -> 0.233.1348 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1348 -> 0.233.1349 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1349 -> 0.233.1350 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1350 -> 0.233.1351 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1351 -> 0.233.1352 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1352 -> 0.233.1353 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1353 -> 0.233.1354 not applied to 0.233.1354: current "num_updates" is greater than required
>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 491 for last-failure-Cachier=1419729443 failed: Application of an update diff failed
>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 494 for fail-count-Cachier=1 failed: Application of an update diff failed
>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 497 for probe_complete=true failed: Application of an update diff failed
>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 500 for last-failure-Cachier=1419729443 failed: Application of an update diff failed
>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback: Update 503 for fail-count-Cachier=1 failed: Application of an update diff failed
>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1338 -> 0.233.1339 not applied to 0.233.1382: current "num_updates" is greater than required
>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1339 -> 0.233.1340 not applied to 0.233.1382: current "num_updates" is greater than required
>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1340 -> 0.233.1341 not applied to 0.233.1382: current "num_updates" is greater than required
>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1341 -> 0.233.1342 not applied to 0.233.1382: current "num_updates" is greater than required
>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1342 -> 0.233.1343 not applied to 0.233.1382: current "num_updates" is greater than required
>>>>
>>>> After exploring corosync processes with ps I found out that on all my
>>>> nodes there are zombie corosync procs like:
>>>>
>>>> root     13892  0.0  0.0      0     0 ?  Z    Dec26   0:04 [corosync] <defunct>
>>>> root     21793  0.0  0.0      0     0 ?  Z    Dec26   0:00 [corosync] <defunct>
>>>> root     27009  1.3  1.0 714292 10784 ?  Ssl  Dec18 223:38 /usr/sbin/corosync
>>>>
>>>> Is it ok to have zombie corosync procs on nodes? Or does it suggest that
>>>> something is going wrong?
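
A zombie (state `Z`, shown as `<defunct>`) is a process that has already exited but whose parent has not yet collected its exit status. The check done by eye in the ps listing above could be scripted roughly as follows. This is only a sketch: it assumes `ps aux`-style column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), and `find_zombies` and `SAMPLE` are hypothetical names used for illustration, not part of Pacemaker or corosync.

```python
from typing import List


def find_zombies(ps_output: str, name: str = "corosync") -> List[int]:
    """Return PIDs of zombie (STAT starts with 'Z') processes whose command mentions `name`.

    Assumes `ps aux`-style rows:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so a command containing spaces stays whole.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        stat, command = fields[7], fields[10]
        if stat.startswith("Z") and name in command:
            pids.append(int(fields[1]))
    return pids


# Rows copied from the ps output quoted in this thread.
SAMPLE = """\
root 13892 0.0 0.0 0 0 ? Z Dec26 0:04 [corosync] <defunct>
root 21793 0.0 0.0 0 0 ? Z Dec26 0:00 [corosync] <defunct>
root 27009 1.3 1.0 714292 10784 ? Ssl Dec18 223:38 /usr/sbin/corosync
"""

if __name__ == "__main__":
    print(find_zombies(SAMPLE))
```

On a live node the same function could be fed the text captured from `ps aux`. It only reports the defunct entries; it does not say why the parent has not reaped them.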
>>>>
>>>> Thanks in advance
>>>>
>>>> --
>>>> Best regards,
>>>> Sergey Arlashin

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
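
For anyone reading the `cib_process_diff` warnings quoted in this thread: going by the message text, each one reports a diff from one CIB version to the next that was skipped because the local CIB was already at a newer version. The sketch below pulls the three version triples out of such lines and shows how far behind each rejected diff was. It assumes the triples are `admin_epoch.epoch.num_updates` and that each log entry sits on a single unwrapped line; `stale_diffs`, `DIFF_RE` and `SAMPLE_LOG` are hypothetical names, not Pacemaker APIs.

```python
import re
from typing import List, Tuple

# Assumed layout of a CIB version: (admin_epoch, epoch, num_updates).
Version = Tuple[int, int, int]

DIFF_RE = re.compile(
    r"cib_process_diff: Diff (\d+)\.(\d+)\.(\d+) -> (\d+)\.(\d+)\.(\d+) "
    r"not applied to (\d+)\.(\d+)\.(\d+)"
)


def stale_diffs(log_text: str) -> List[Tuple[Version, Version, Version]]:
    """Return (diff_from, diff_to, current) for each rejected diff whose
    starting version is older than the current CIB version."""
    found = []
    for match in DIFF_RE.finditer(log_text):
        nums = tuple(int(x) for x in match.groups())
        diff_from, diff_to, current = nums[0:3], nums[3:6], nums[6:9]
        if current > diff_from:  # tuples compare field by field, left to right
            found.append((diff_from, diff_to, current))
    return found


# Two entries copied from the log excerpt in this thread.
SAMPLE_LOG = """\
Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1346 -> 0.233.1347 not applied to 0.233.1354: current "num_updates" is greater than required
Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff 0.233.1338 -> 0.233.1339 not applied to 0.233.1382: current "num_updates" is greater than required
"""

if __name__ == "__main__":
    for diff_from, diff_to, current in stale_diffs(SAMPLE_LOG):
        lag = current[2] - diff_from[2]
        print(f"diff from {diff_from} arrived {lag} updates behind current {current}")
```

This only summarizes what the log already says; it does not explain why the stale diffs were being delivered in the first place.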