Hi Andrew,

I applied the three corrections you made and checked the behaviour.
To be sure, I added an abort() to every g_source_remove() call in services.c.

* I added the following abort() check at the four places that call g_source_remove():
>>>         if (g_source_remove(op->opaque->repeat_timer) == FALSE) {
>>>             abort();
>>>         }

As a result, the abort still occurred.
Your correction does not seem to have settled the problem yet.

(gdb) where
#0  0x00007fdd923e1f79 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fdd923e5388 in __GI_abort () at abort.c:89
#2  0x00007fdd92b9fe77 in crm_abort (file=file@entry=0x7fdd92bd352b "logging.c", function=function@entry=0x7fdd92bd48c0 <__FUNCTION__.23262> "crm_glib_handler", line=line@entry=73, assert_condition=assert_condition@entry=0xe20b80 "Source ID 40 was not found when attempting to remove it", do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=1) at utils.c:1195
#3  0x00007fdd92bc7ca7 in crm_glib_handler (log_domain=0x7fdd92130b6e "GLib", flags=<optimized out>, message=0xe20b80 "Source ID 40 was not found when attempting to remove it", user_data=<optimized out>) at logging.c:73
#4  0x00007fdd920f2ae1 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#5  0x00007fdd920f2d72 in g_log () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007fdd920eac5c in g_source_remove () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#7  0x00007fdd92984b55 in cancel_recurring_action (op=op@entry=0xe19b90) at services.c:365
#8  0x00007fdd92984bee in services_action_cancel (name=name@entry=0xe1d2d0 "dummy2", action=<optimized out>, interval=interval@entry=10000) at services.c:387
#9  0x000000000040405a in cancel_op (rsc_id=rsc_id@entry=0xe1d2d0 "dummy2", action=action@entry=0xe10d90 "monitor", interval=10000) at lrmd.c:1404
#10 0x000000000040614f in process_lrmd_rsc_cancel (client=0xe17290, id=74, request=0xe1be10) at lrmd.c:1468
#11 process_lrmd_message (client=client@entry=0xe17290, id=74, request=request@entry=0xe1be10) at lrmd.c:1507
#12 0x0000000000402bac in lrmd_ipc_dispatch (c=0xe169c0, data=<optimized out>, size=361) at main.c:148
#13 0x00007fdd91e4d4d9 in qb_ipcs_dispatch_connection_request () from /usr/lib/libqb.so.0
#14 0x00007fdd92bc409d in gio_read_socket (gio=<optimized out>, condition=G_IO_IN, data=0xe158a8) at mainloop.c:437
#15 0x00007fdd920ebce5 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
---Type <return> to continue, or q <return> to quit---
#16 0x00007fdd920ec048 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#17 0x00007fdd920ec30a in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#18 0x0000000000402774 in main (argc=<optimized out>, argv=0x7fff22cac268) at main.c:344

Best Regards,
Hideo Yamauchi.

----- Original Message -----
> From: "renayama19661...@ybb.ne.jp" <renayama19661...@ybb.ne.jp>
> To: Andrew Beekhof <and...@beekhof.net>; The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
> Cc:
> Date: 2014/10/10, Fri 10:55
> Subject: Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.
>
> Hi Andrew,
>
> Okay!
>
> I will test your patch and let you know the result.
>
> Many thanks!
> Hideo Yamauchi.
>
>
>
> ----- Original Message -----
>> From: Andrew Beekhof <and...@beekhof.net>
>> To: renayama19661...@ybb.ne.jp; The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
>> Cc:
>> Date: 2014/10/10, Fri 10:47
>> Subject: Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.
>>
>> Perfect!
>>
>> Can you try this:
>>
>> diff --git a/lib/services/services.c b/lib/services/services.c
>> index 8590b56..cb0f0ae 100644
>> --- a/lib/services/services.c
>> +++ b/lib/services/services.c
>> @@ -417,6 +417,7 @@ services_action_kick(const char *name, const char *action, int interval /* ms */
>>      free(id);
>>
>>      if (op == NULL) {
>> +        op->opaque->repeat_timer = 0;
>>          return FALSE;
>>      }
>>
>> @@ -425,6 +426,7 @@ services_action_kick(const char *name, const char *action, int interval /* ms */
>>      } else {
>>          if (op->opaque->repeat_timer) {
>>              g_source_remove(op->opaque->repeat_timer);
>> +            op->opaque->repeat_timer = 0;
>>          }
>>          recurring_action_timer(op);
>>          return TRUE;
>> @@ -459,6 +461,7 @@ handle_duplicate_recurring(svc_action_t * op, void (*action_callback) (svc_actio
>>      if (dup->pid != 0) {
>>          if (op->opaque->repeat_timer) {
>>              g_source_remove(op->opaque->repeat_timer);
>> +            op->opaque->repeat_timer = 0;
>>          }
>>          recurring_action_timer(dup);
>>      }
>>
>>
>> On 10 Oct 2014, at 12:16 pm, renayama19661...@ybb.ne.jp wrote:
>>
>>> Hi Andrew,
>>>
>>> I have not yet managed to set up gdb in the Ubuntu environment, so I cannot get a trace from lrmd yet.
>>> Please wait a little longer.
>>>
>>>
>>> However, I made lrmd abort when g_source_remove() in cancel_recurring_action() returned FALSE.
>>> -----
>>> gboolean
>>> cancel_recurring_action(svc_action_t * op)
>>> {
>>>     crm_info("Cancelling operation %s", op->id);
>>>
>>>     if (recurring_actions) {
>>>         g_hash_table_remove(recurring_actions, op->id);
>>>     }
>>>
>>>     if (op->opaque->repeat_timer) {
>>>         if (g_source_remove(op->opaque->repeat_timer) == FALSE) {
>>>             abort();
>>>         }
>>> (snip)
>>> -------core----
>>> #0  0x00007f30aa60ff79 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>>
>>> 56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>>> (gdb) where
>>> #0  0x00007f30aa60ff79 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>>> #1  0x00007f30aa613388 in __GI_abort () at abort.c:89
>>> #2  0x00007f30aadcde77 in crm_abort (file=file@entry=0x7f30aae0152b "logging.c", function=function@entry=0x7f30aae028c0 <__FUNCTION__.23262> "crm_glib_handler", line=line@entry=73, assert_condition=assert_condition@entry=0x19d2ad0 "Source ID 63 was not found when attempting to remove it", do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=1) at utils.c:1195
>>> #3  0x00007f30aadf5ca7 in crm_glib_handler (log_domain=0x7f30aa35eb6e "GLib", flags=<optimized out>, message=0x19d2ad0 "Source ID 63 was not found when attempting to remove it", user_data=<optimized out>) at logging.c:73
>>> #4  0x00007f30aa320ae1 in g_logv () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>> #5  0x00007f30aa320d72 in g_log () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>> #6  0x00007f30aa318c5c in g_source_remove () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>> #7  0x00007f30aabb2b55 in cancel_recurring_action (op=op@entry=0x19caa90) at services.c:363
>>> #8  0x00007f30aabb2bee in services_action_cancel (name=name@entry=0x19d0530 "dummy3", action=<optimized out>, interval=interval@entry=10000) at services.c:385
>>> #9  0x000000000040405a in cancel_op (rsc_id=rsc_id@entry=0x19d0530 "dummy3", action=action@entry=0x19cec10 "monitor", interval=10000) at lrmd.c:1404
>>> #10 0x000000000040614f in process_lrmd_rsc_cancel (client=0x19c8290, id=74, request=0x19ca8a0) at lrmd.c:1468
>>> #11 process_lrmd_message (client=client@entry=0x19c8290, id=74, request=request@entry=0x19ca8a0) at lrmd.c:1507
>>> #12 0x0000000000402bac in lrmd_ipc_dispatch (c=0x19c79c0, data=<optimized out>, size=361) at main.c:148
>>> #13 0x00007f30aa07b4d9 in qb_ipcs_dispatch_connection_request () from /usr/lib/libqb.so.0
>>> #14 0x00007f30aadf209d in gio_read_socket (gio=<optimized out>, condition=G_IO_IN, data=0x19c68a8) at mainloop.c:437
>>> #15 0x00007f30aa319ce5 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>> ---Type <return> to continue, or q <return> to quit---
>>> #16 0x00007f30aa31a048 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>> #17 0x00007f30aa31a30a in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>> #18 0x0000000000402774 in main (argc=<optimized out>, argv=0x7fffcdd90b88) at main.c:344
>>> ---------
>>>
>>> Best Regards,
>>> Hideo Yamauchi.
>>>
>>>
>>>
>>> ----- Original Message -----
>>>> From: "renayama19661...@ybb.ne.jp" <renayama19661...@ybb.ne.jp>
>>>> To: Andrew Beekhof <and...@beekhof.net>
>>>> Cc: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
>>>> Date: 2014/10/7, Tue 11:15
>>>> Subject: Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.
>>>>
>>>> Hi Andrew,
>>>>
>>>>> Not quite. Returning FALSE from the callback also removes the source from glib.
>>>>> So your test case effectively removes t1 twice: once implicitly by returning
>>>>> FALSE in timer_func1() and then again explicitly in timer_func3()
>>>>
>>>>
>>>> You are right.
>>>>
>>>>
>>>> If Pacemaker does not call g_source_remove() again on a source whose timer callback has already returned FALSE, glib does not report the error.
>>>>
>>>>
>>>> Many Thanks,
>>>> Hideo Yamauchi.
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: Andrew Beekhof <and...@beekhof.net>
>>>>> To: renayama19661...@ybb.ne.jp
>>>>> Cc: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
>>>>> Date: 2014/10/7, Tue 11:06
>>>>> Subject: Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.
>>>>>
>>>>>
>>>>> On 7 Oct 2014, at 1:03 pm, renayama19661...@ybb.ne.jp wrote:
>>>>>
>>>>>> Hi Andrew,
>>>>>>
>>>>>>>> These problems seem to be caused by the following glib change.
>>>>>>>> * https://github.com/GNOME/glib/commit/393503ba5bdc7c09cd46b716aaf3d2c63a6c7f9c
>>>>>>>
>>>>>>> The glib behaviour on Ubuntu seems reasonable, removing a source multiple times
>>>>>>> IS a valid error.
>>>>>>> I need the stack trace to know where/how this situation can occur in pacemaker.
>>>>>>
>>>>>>
>>>>>> As far as I have confirmed, Pacemaker does not remove a source more than once.
>>>>>> On Ubuntu (glib 2.40), the error occurs on the very first removal.
>>>>>
>>>>> Not quite. Returning FALSE from the callback also removes the source from glib.
>>>>> So your test case effectively removes t1 twice: once implicitly by returning
>>>>> FALSE in timer_func1() and then again explicitly in timer_func3()
>>>>>
>>>>>>
>>>>>> To avoid the error on Ubuntu, it seems necessary to check that the source exists before removing it.
>>>>>> This also works well with the glib of RHEL 6.x (and RHEL 7.0).
>>>>>>
>>>>>>         if (g_main_context_find_source_by_id (NULL, t1) != NULL) {
>>>>>>             g_source_remove(t1);
>>>>>>         }
>>>>>>
>>>>>> I will send you the stack trace once I have acquired it.
>>>>>>
>>>>>> Many Thanks!
>>>>>> Hideo Yamauchi.
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: Andrew Beekhof <and...@beekhof.net>
>>>>>>> To: renayama19661...@ybb.ne.jp; The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
>>>>>>> Cc:
>>>>>>> Date: 2014/10/7, Tue 09:44
>>>>>>> Subject: Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.
>>>>>>>
>>>>>>>
>>>>>>> On 6 Oct 2014, at 4:09 pm, renayama19661...@ybb.ne.jp wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> When I run the following sample on RHEL 6.5 (glib2-2.22.5-7.el6) and Ubuntu 14.04 (libglib2.0-0:amd64 2.40.0-2), the behaviour is different.
>>>>>>>>
>>>>>>>> * Sample : test2.c
>>>>>>>> {{{
>>>>>>>> #include <stdio.h>
>>>>>>>> #include <stdlib.h>
>>>>>>>> #include <glib.h>
>>>>>>>> #include <sys/times.h>
>>>>>>>> guint t1, t2, t3;
>>>>>>>> gboolean timer_func2(gpointer data){
>>>>>>>>         printf("TIMER EXPIRE!2\n");
>>>>>>>>         fflush(stdout);
>>>>>>>>         return FALSE;
>>>>>>>> }
>>>>>>>> gboolean timer_func1(gpointer data){
>>>>>>>>         clock_t         ret;
>>>>>>>>         struct tms buff;
>>>>>>>>
>>>>>>>>         ret = times(&buff);
>>>>>>>>         printf("TIMER EXPIRE!1 %d\n", (int)ret);
>>>>>>>>         fflush(stdout);
>>>>>>>>         return FALSE;
>>>>>>>> }
>>>>>>>> gboolean timer_func3(gpointer data){
>>>>>>>>         printf("TIMER EXPIRE 3!\n");
>>>>>>>>         fflush(stdout);
>>>>>>>>         printf("remove timer1!\n");
>>>>>>>>
>>>>>>>>         fflush(stdout);
>>>>>>>>         g_source_remove(t1);
>>>>>>>>         printf("remove timer2!\n");
>>>>>>>>         fflush(stdout);
>>>>>>>>         g_source_remove(t2);
>>>>>>>>         printf("remove timer3!\n");
>>>>>>>>         fflush(stdout);
>>>>>>>>         g_source_remove(t3);
>>>>>>>>         return FALSE;
>>>>>>>> }
>>>>>>>> int main(int argc, char** argv){
>>>>>>>>         GMainLoop *m;
>>>>>>>>         clock_t         ret;
>>>>>>>>         struct tms buff;
>>>>>>>>         gint64 t;
>>>>>>>>         m = g_main_new(FALSE);
>>>>>>>>         t1 = g_timeout_add(1000, timer_func1, NULL);
>>>>>>>>         t2 = g_timeout_add(60000, timer_func2, NULL);
>>>>>>>>         t3 = g_timeout_add(5000, timer_func3, NULL);
>>>>>>>>         ret = times(&buff);
>>>>>>>>         printf("START! %d\n", (int)ret);
>>>>>>>>         g_main_run(m);
>>>>>>>> }
>>>>>>>>
>>>>>>>> }}}
>>>>>>>> * Result
>>>>>>>> ---- RHEL6.5(glib2-2.22.5-7.el6) ----
>>>>>>>> [root@snmp1 ~]# ./test2
>>>>>>>> START! 429576012
>>>>>>>> TIMER EXPIRE!1 429576112
>>>>>>>> TIMER EXPIRE 3!
>>>>>>>> remove timer1!
>>>>>>>> remove timer2!
>>>>>>>> remove timer3!
>>>>>>>>
>>>>>>>> ---- Ubuntu14.04(libglib2.0-0:amd64 2.40.0-2) ----
>>>>>>>> root@a1be102:~# ./test2
>>>>>>>> START! 1718163089
>>>>>>>> TIMER EXPIRE!1 1718163189
>>>>>>>> TIMER EXPIRE 3!
>>>>>>>> remove timer1!
>>>>>>>>
>>>>>>>> (process:1410): GLib-CRITICAL **: Source ID 1 was not found when attempting to remove it
>>>>>>>> remove timer2!
>>>>>>>> remove timer3!
>>>>>>>>
>>>>>>>>
>>>>>>>> These problems seem to be caused by the following glib change.
>>>>>>>> * https://github.com/GNOME/glib/commit/393503ba5bdc7c09cd46b716aaf3d2c63a6c7f9c
>>>>>>>
>>>>>>> The glib behaviour on Ubuntu seems reasonable, removing a source multiple times
>>>>>>> IS a valid error.
>>>>>>> I need the stack trace to know where/how this situation can occur in pacemaker.
>>>>>>>
>>>>>>>>
>>>>>>>> Before the change, g_source_remove() allowed removing a timer that had already completed, but after the change g_source_remove() reports an error.
>>>>>>>>
>>>>>>>> As a result, we get the following crit error when Pacemaker runs with a new version of glib.
>>>>>>>>
>>>>>>>> lrmd[1632]: error: crm_abort: crm_glib_handler: Forked child 1840 to record non-fatal assert at logging.c:73 : Source ID 51 was not found when attempting to remove it
>>>>>>>> lrmd[1632]: crit: crm_glib_handler: GLib: Source ID 51 was not found when attempting to remove it
>>>>>>>>
>>>>>>>> Considering the following, Pacemaker seems to need some kind of fix:
>>>>>>>> * Distributions that use a new version of glib, including Ubuntu.
>>>>>>>> * Future glib version upgrades in RHEL.
>>>>>>>>
>>>>>>>> A similar problem has been reported on the ML.
>>>>>>>> * http://www.gossamer-threads.com/lists/linuxha/pacemaker/91333#91333
>>>>>>>> * http://www.gossamer-threads.com/lists/linuxha/pacemaker/92408
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Hideo Yamauchi.
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>
>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>> Bugs: http://bugs.clusterlabs.org
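
For reference, the stale-ID situation discussed in this thread can be sketched without GLib. The following is only a minimal model in plain C, not Pacemaker or GLib code: mock_source_add(), mock_source_remove(), mock_dispatch(), struct timer_owner, one_shot() and cancel_timer() are all hypothetical names invented for this illustration. It assumes, as described in the thread, that a callback returning FALSE removes its own source implicitly, and that removing an unknown ID is reported as an error. Under those assumptions it shows why resetting the stored ID to 0 once the source is gone avoids a second removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_SOURCES 8

/* Hypothetical stand-in for a GLib source callback: return false to stop. */
typedef bool (*source_cb)(void *data);

struct mock_source {
    unsigned id;
    source_cb cb;
    void *data;
    bool live;
};

static struct mock_source sources[MAX_SOURCES];
static unsigned next_id = 1;

/* Register a callback; returns a non-zero ID, or 0 if the table is full. */
unsigned mock_source_add(source_cb cb, void *data)
{
    assert(cb != NULL);
    for (size_t i = 0; i < MAX_SOURCES; i++) {
        if (!sources[i].live) {
            sources[i] = (struct mock_source){ next_id, cb, data, true };
            return next_id++;
        }
    }
    return 0;
}

/* Remove a source by ID; an unknown ID is an error, as with newer glib. */
bool mock_source_remove(unsigned id)
{
    for (size_t i = 0; i < MAX_SOURCES; i++) {
        if (sources[i].live && sources[i].id == id) {
            sources[i].live = false;
            return true;
        }
    }
    fprintf(stderr, "Source ID %u was not found when attempting to remove it\n", id);
    return false;
}

/* Run a source's callback once; a false return drops the source implicitly. */
void mock_dispatch(unsigned id)
{
    for (size_t i = 0; i < MAX_SOURCES; i++) {
        if (sources[i].live && sources[i].id == id) {
            if (!sources[i].cb(sources[i].data)) {
                sources[i].live = false;
            }
            return;
        }
    }
}

/* Hypothetical owner of a timer ID, loosely modelled on op->opaque. */
struct timer_owner {
    unsigned repeat_timer;
};

/* One-shot callback: forget the ID because the source is about to go away. */
bool one_shot(void *data)
{
    struct timer_owner *owner = data;
    owner->repeat_timer = 0;
    return false;
}

/* Remove the timer only if an ID is still recorded, then clear the ID. */
bool cancel_timer(struct timer_owner *owner)
{
    bool ok = true;
    if (owner->repeat_timer) {
        ok = mock_source_remove(owner->repeat_timer);
        owner->repeat_timer = 0;
    }
    return ok;
}
```

In this model, after mock_dispatch() has run one_shot() for a timer, the owner's repeat_timer is 0, so a later cancel_timer() has nothing to remove and succeeds, while calling mock_source_remove() on a saved copy of the old ID fails with the "not found" message.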