On 01.01.2016 11:34, Vladislav Bogdanov wrote:
> 31.12.2015 15:33:45 CET, Bogdan Dobrelya <bdobre...@mirantis.com> wrote:
>> On 31.12.2015 14:48, Vladislav Bogdanov wrote:
>>> blackbox tracing inside pacemaker, USR1, USR2 and TRAP signals iirc,
>>> quick google search should point you to Andrew's blog with all
>>> information about that feature.
>>> Next, if you use ocf-shellfuncs in your RA, you could enable tracing
>>> for resource itself, just add 'trace_ra=1' to every operation config
>>> (start and monitor).
>>
>> Thank you, I will try to play with these things once I have the issue
>> reproduced again. Cannot provide CIB as I don't have the env now.
>>
>> But still let me ask again, do anyone know or heard of anything like
>> known/fixed bugs about corosync with pacemaker stop running monitor
>> actions for a resource at some point, while notifications are still
>> logged?
>>
>> Here is example:
>> node-16 crmd:
>> 2015-12-29T13:16:49.113679+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_monitor_27000: unknown error
>> (node=node-16.test.domain.local, call=254, rc=1, cib-update=1454,
>> confirmed=false)
>> node-17:
>> 2015-12-29T13:16:57.603834+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_monitor_103000: unknown error
>> (node=node-17.test.domain.local, call=181, rc=1, cib-update=297,
>> confirmed=false)
>> node-18:
>> 2015-12-29T13:20:16.870619+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_monitor_103000: not running
>> (node=node-18.test.domain.local, call=187, rc=7, cib-update=306,
>> confirmed=false)
>> node-20:
>> 2015-12-29T13:20:51.486219+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_monitor_30000: not running
>> (node=node-20.test.domain.local, call=180, rc=7, cib-update=308,
>> confirmed=false)
>>
>> after that point only notifications got logged for affected nodes, like
>> Operation p_rabbitmq-server_notify_0: ok
>> (node=node-20.test.domain.local, call=287, rc=0, cib-update=0,
>> confirmed=true)
>>
>> While the node-19 was not affected, and actions
>> monitor/stop/start/notify logged OK all the time, like:
>> 2015-12-29T14:30:00.973561+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_monitor_30000: not running
>> (node=node-19.test.domain.local, call=423, rc=7, cib-update=438,
>> confirmed=false)
>> 2015-12-29T14:30:01.631609+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_notify_0: ok
>> (node=node-19.test.domain.local, call=424, rc=0, cib-update=0,
>> confirmed=true)
>> 2015-12-29T14:31:19.084165+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_stop_0: ok (node=node-19.test.domain.local,
>> call=427, rc=0, cib-update=439, confirmed=true)
>> 2015-12-29T14:32:53.120157+00:00 notice: notice: process_lrm_event:
>> Operation p_rabbitmq-server_start_0: unknown error
>> (node=node-19.test.domain.local, call=428, rc=1, cib-update=441,
>> confirmed=true)
>
> Well, not running and not logged is not the same thing. I do not have access
> to code right now, but I'm pretty sure that successful recurring monitors are
> not logged after the first run. trace_ra for monitor op should prove that. If
> not, then it should be a bug. I recall something was fixed in that area
> recently.
>
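
To see which operations stopped being logged, and on which node, the
process_lrm_event lines can be summarised per (node, operation). Below is a
minimal sketch, assuming the crmd log format quoted above with each event on
a single line. The names `LRM_EVENT`, `last_seen` and `SAMPLE` are
illustrative only and not part of pacemaker; the sample lines are the node-19
entries from this thread. As noted above, a missing log line does not by
itself prove that the monitor stopped running, so this only narrows down
where to look with trace_ra.

```python
import re
from datetime import datetime

# Assumed format: "<timestamp> ... process_lrm_event: Operation
# <resource>_<op>_<interval>: <status> (node=<node>, call=N, rc=N, ...)"
LRM_EVENT = re.compile(
    r"^(?P<ts>\S+)\s.*process_lrm_event:\s+"
    r"Operation (?P<rsc>\S+?)_(?P<op>[a-z]+)_(?P<interval>\d+): "
    r"(?P<status>.+?) \(node=(?P<node>[^,]+), call=(?P<call>\d+), "
    r"rc=(?P<rc>\d+)"
)


def last_seen(lines):
    """Map (node, operation) to (timestamp, rc) of the latest logged event."""
    latest = {}
    for line in lines:
        m = LRM_EVENT.search(line)
        if not m:
            continue  # not a process_lrm_event line, skip it
        ts = datetime.fromisoformat(m.group("ts"))
        key = (m.group("node"), m.group("op"))
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, int(m.group("rc")))
    return latest


# The node-19 entries from this thread, each rejoined onto one line.
SAMPLE = [
    "2015-12-29T14:30:00.973561+00:00 notice: notice: process_lrm_event: "
    "Operation p_rabbitmq-server_monitor_30000: not running "
    "(node=node-19.test.domain.local, call=423, rc=7, cib-update=438, "
    "confirmed=false)",
    "2015-12-29T14:30:01.631609+00:00 notice: notice: process_lrm_event: "
    "Operation p_rabbitmq-server_notify_0: ok "
    "(node=node-19.test.domain.local, call=424, rc=0, cib-update=0, "
    "confirmed=true)",
    "2015-12-29T14:31:19.084165+00:00 notice: notice: process_lrm_event: "
    "Operation p_rabbitmq-server_stop_0: ok "
    "(node=node-19.test.domain.local, call=427, rc=0, cib-update=439, "
    "confirmed=true)",
    "2015-12-29T14:32:53.120157+00:00 notice: notice: process_lrm_event: "
    "Operation p_rabbitmq-server_start_0: unknown error "
    "(node=node-19.test.domain.local, call=428, rc=1, cib-update=441, "
    "confirmed=true)",
]

if __name__ == "__main__":
    for (node, op), (ts, rc) in sorted(last_seen(SAMPLE).items()):
        print(node, op, ts.isoformat(), "rc=%d" % rc)
```

Running the same function over the full logs from all nodes and comparing
the latest "monitor" timestamp against the latest "notify" timestamp per node
should show the point after which only notifications appear.
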
Is the fix you recall one of
http://bugs.clusterlabs.org/show_bug.cgi?id=5072 /
http://bugs.clusterlabs.org/show_bug.cgi?id=5063 ? I found nothing more
recent in the pacemaker commits and issues. They are not *exactly* my case,
although several promote and demote actions did take place during the test.

Btw, as I understood from the comments on bugs 5072/5063, the problem remains
unfixed for some of the reported cases, am I right?

> Best,
> Vladislav
>
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>

-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando