Hi Andrew, Hi Alan,

We have been working to reproduce the problem and to collect evidence of what is happening. However, we have not obtained any evidence yet. I will wait for the information from Alan.
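As one possible way to collect such evidence the next time attrd does not stop: on Linux, the signal masks in /proc/<pid>/status show whether the TERM signal is pending, blocked, ignored or caught for a process. The following is only a minimal Python sketch and is not taken from Pacemaker or Heartbeat. The function names are hypothetical, and it assumes that the pid of the remaining attrd process is known (for example from ps).

```python
import signal
import sys

# Mask fields found in /proc/<pid>/status on Linux (hexadecimal bitmaps).
FIELDS = ("SigPnd", "ShdPnd", "SigBlk", "SigIgn", "SigCgt")


def signal_state(status_text, signum=signal.SIGTERM):
    """Return {field: bool}: is signum set in each signal mask?"""
    bit = 1 << (int(signum) - 1)  # signal N is stored in bit N-1
    state = {}
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name in FIELDS:
            state[name] = bool(int(value.strip(), 16) & bit)
    return state


def read_signal_state(pid, signum=signal.SIGTERM):
    """Read /proc/<pid>/status and decode the masks for signum."""
    with open("/proc/%d/status" % pid) as f:
        return signal_state(f.read(), signum)


if __name__ == "__main__":
    # Usage: python sigstate.py <pid of attrd>
    print(read_signal_state(int(sys.argv[1])))
```

If ShdPnd or SigPnd is set together with SigBlk, the signal was delivered but is blocked. If SigCgt is set and nothing is pending, a handler is installed and either it already ran or the signal never arrived. This alone does not explain the cause, but it may help to tell "sent" from "caught".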
Best Regards,
Hideo Yamauchi.

--- On Wed, 2011/11/2, Andrew Beekhof <and...@beekhof.net> wrote:

> On Tue, Oct 18, 2011 at 12:19 PM, <renayama19661...@ybb.ne.jp> wrote:
> > Hi,
> >
> > We sometimes fail in a stop of attrd.
> >
> > Step1. start a cluster in 2 nodes
> > Step2. stop the first node.(/etc/init.d/heartbeat stop.)
> > Step3. stop the second node after time passed a little.(/etc/init.d/heartbeat stop.)
> >
> > The attrd catches the TERM signal, but does not stop.
>
> There's no evidence that it actually catches it, only that it is sent.
> I've seen it before but never figured out why it occurs.
>
> >
> > (snip)
> > Oct 5 02:37:38 hpdb0201 crmd: [12238]: info: do_exit: [crmd] stopped (0)
> > Oct 5 02:37:38 hpdb0201 cib: [12234]: WARN: send_ipc_message: IPC Channel to 12238 is not connected
> > Oct 5 02:37:38 hpdb0201 cib: [12234]: WARN: send_via_callback_channel: Delivery of reply to client 12238/0dbc9e28-d90d-4335-b9c4-9dd3fcb38163 failed
> > Oct 5 02:37:38 hpdb0201 cib: [12234]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
> > Oct 5 02:37:38 hpdb0201 heartbeat: [12223]: info: killing /usr/lib64/heartbeat/attrd process group 12237 with signal 15
> > Oct 5 02:47:03 hpdb0201 cib: [12234]: info: cib_stats: Processed 97 operations (4123.00us average, 0% utilization) in the last 10min
> > Oct 5 07:15:25 hpdb0201 ccm: [12233]: WARN: G_CH_check_int: working on IPC channel took 1010 ms (> 100 ms)
> > Oct 5 07:15:26 hpdb0201 ccm: [12233]: WARN: G_CH_check_int: working on IPC channel took 1010 ms (> 100 ms)
> > Oct 5 07:15:37 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 1030 ms (> 1010 ms) before being called (GSource: 0xd28010)
> > Oct 5 07:15:37 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch: started at 431583547 should have started at 431583444
> > Oct 5 07:15:44 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 1030 ms (> 1010 ms) before being called (GSource: 0xd27dd0)
> > Oct 5 07:15:44 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch: started at 431584254 should have started at 431584151
> > Oct 5 07:15:44 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 1030 ms (> 1010 ms) before being called (GSource: 0xd28010)
> > Oct 5 07:15:44 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch: started at 431584254 should have started at 431584151
> > Oct 5 07:16:59 hpdb0201 heartbeat: [12223]: WARN: G_CH_check_int: working on write child took 1010 ms (> 100 ms)
> > Oct 5 07:17:14 hpdb0201 stonithd: [12236]: WARN: G_CH_check_int: working on Heartbeat API channel took 1010 ms (> 100 ms)
> > Oct 5 07:19:41 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 1030 ms (> 1010 ms) before being called (GSource: 0xd27dd0)
> > Oct 5 07:19:41 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch: started at 431607988 should have started at 431607885
> > Oct 5 07:19:41 hpdb0201 heartbeat: [12223]: WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 1030 ms (> 1010 ms) before being called (GSource: 0xd28010)
> > Oct 5 07:19:41 hpdb0201 heartbeat: [12223]: info: Gmain_timeout_dispatch: started at 431607988 should have started at 431607885
> > (snip)
> >
> > We try the reproduction of the phenomenon, but do not reappear very much.
> >
> > The same phenomenon is reported by the next email.
> > However, the argument of the problem is over on the way.
> >
> > * http://www.gossamer-threads.com/lists/linuxha/pacemaker/62147
> >
> > The phenomenon occurred by the next combination.
> > * pacemaker-1.0.11
> > * resource-agents-3.9.2
> > * cluster-glue-1.0.7
> > * heartbeat-3.0.5
> >
> > I registered these contents with Bugzilla.
> > * http://bugs.clusterlabs.org/show_bug.cgi?id=5004
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> >
>