Re: [Pacemaker] Exiting corosync-notifyd results in shutting down of pacemakerd

2012-10-02 Thread Andrew Beekhof
On Wed, Oct 3, 2012 at 2:51 AM, Grüninger, Andreas (LGL Extern) wrote:
> I am currently investigating the monitoring of corosync/pacemaker with SNMP.
> crm_mon used with the OCF resource ClusterMon works as it should.
>
> But corosync-notifyd can't be used in our case.
> I start corosync-notifyd in the foreground as follows:
> corosync-notifyd -f -l -s  -m 10.50.235.1
>
> When I stop the running corosync-notifyd with CTRL-C, pacemaker shuts down 
> with the following entries in the logfile.
> Is this an error or the desired result?

Based on the logs, Pacemaker thinks Corosync died.  Did that happen?
If so, there is not much Pacemaker can do :-(
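
One way to check, as a rough sketch under assumptions: the function names in the grep pattern are taken from the log excerpt quoted below, while `corosync_loss_lines` and `corosync_state` are hypothetical helper names introduced only for illustration.

```shell
#!/bin/sh
# Read a Pacemaker log on stdin and print only the lines in which a daemon
# reports that its connection to Corosync went away. The function names in
# the pattern are copied from the log excerpt in this thread.
corosync_loss_lines() {
    grep -E 'cfg_connection_destroy|cpg_connection_destroy|pcmk_cpg_dispatch|attrd_ais_destroy|cib_ais_destroy|stonith_peer_ais_destroy'
}

# Report whether a process named exactly "corosync" exists right now.
corosync_state() {
    if pgrep -x corosync >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}
```

For example, `corosync_loss_lines < /var/log/cluster/corosync.log` (the path is an assumption and depends on your logging configuration) lists the relevant entries, and `corosync_state` run right after pressing CTRL-C shows whether the daemon itself survived.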

>
> 
> Oct 02 18:42:19 [27126] pacemakerd: error: cfg_connection_destroy: Connection destroyed
> Oct 02 18:42:19 [27126] pacemakerd: notice: pcmk_shutdown_worker: Shuting down Pacemaker
> Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping crmd: Sent -15 to process 27177
> Oct 02 18:42:19 [27126] pacemakerd: error: cpg_connection_destroy: Connection destroyed
> Oct 02 18:42:19 [27177] crmd: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> Oct 02 18:42:19 [27177] crmd: notice: crm_shutdown: Requesting shutdown, upper limit is 120ms
> Oct 02 18:42:19 [27128] stonith-ng: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
> Oct 02 18:42:19 [27177] crmd: info: do_shutdown_req: Sending shutdown request to zd-sol-s1-v61
> Oct 02 18:42:19 [27128] stonith-ng: error: stonith_peer_ais_destroy: AIS connection terminated
> Oct 02 18:42:19 [27128] stonith-ng: info: stonith_shutdown: Terminating with  1 clients
> Oct 02 18:42:19 [27130] attrd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
> Oct 02 18:42:19 [27130] attrd: crit: attrd_ais_destroy: Lost connection to Corosync service!
> Oct 02 18:42:19 [27130] attrd: notice: main: Exiting...
> Oct 02 18:42:19 [27130] attrd: notice: main: Disconnecting client 81ffc38, pid=27177...
> Oct 02 18:42:19 [27128] stonith-ng: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Oct 02 18:42:19 [27128] stonith-ng: info: crm_xml_cleanup: Cleaning up memory from libxml2
> Oct 02 18:42:19 [27130] attrd: error: attrd_cib_connection_destroy: Connection to the CIB terminated...
> Oct 02 18:42:19 [27127] cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
> Oct 02 18:42:19 [27127] cib: error: cib_ais_destroy: Corosync connection lost!  Exiting.
> Oct 02 18:42:19 [27129] lrmd: info: lrmd_ipc_destroy: LRMD client disconnecting 807e768 - name: crmd id: 1d659f61-d6e2-4ef3-f674-b9a8ba8029e8
> Oct 02 18:42:19 [27127] cib: info: terminate_cib: cib_ais_destroy: Exiting fast...
> Oct 02 18:42:19 [27127] cib: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Oct 02 18:42:19 [27127] cib: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Oct 02 18:42:19 [27127] cib: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Oct 02 18:42:19 [27126] pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=27130, rc=1)
> Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
> Oct 02 18:42:19 [27126] pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=27127, rc=64)
> Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
> Oct 02 18:42:19 [27126] pacemakerd: notice: pcmk_child_exit: Child process crmd terminated with signal 13 (pid=27177, core=0)
> Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
> Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping pengine: Sent -15 to process 27131
> Oct 02 18:42:19 [27126] pacemakerd: info: pcmk_child_exit: Child process pengine exited (pid=27131, rc=0)
> Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
> Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping lrmd: Sent -15 to process 27129
> Oct 02 18:42:19 [27129] lrmd: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> Oct 02 18:42:19 [27129] lrmd: info: lrmd_shutdown: Terminating with  0 clients
> Oct 02 18:42:19 [27129] lrmd: info: qb_ipcs_us_withdraw: withdrawing server sockets
> Oct 02 18:42:19 [27126] pacemakerd: info: pcmk_child_exit: Child process lrmd exited (pid=27129, rc=0)
> Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending me

[Pacemaker] Exiting corosync-notifyd results in shutting down of pacemakerd

2012-10-02 Thread LGL Extern
I am currently investigating the monitoring of corosync/pacemaker with SNMP.
crm_mon used with the OCF resource ClusterMon works as it should.

But corosync-notifyd can't be used in our case.
I start corosync-notifyd in the foreground as follows:
corosync-notifyd -f -l -s  -m 10.50.235.1

When I stop the running corosync-notifyd with CTRL-C, pacemaker shuts down with 
the following entries in the logfile.
Is this an error or the desired result?


Oct 02 18:42:19 [27126] pacemakerd: error: cfg_connection_destroy: Connection destroyed
Oct 02 18:42:19 [27126] pacemakerd: notice: pcmk_shutdown_worker: Shuting down Pacemaker
Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping crmd: Sent -15 to process 27177
Oct 02 18:42:19 [27126] pacemakerd: error: cpg_connection_destroy: Connection destroyed
Oct 02 18:42:19 [27177] crmd: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Oct 02 18:42:19 [27177] crmd: notice: crm_shutdown: Requesting shutdown, upper limit is 120ms
Oct 02 18:42:19 [27128] stonith-ng: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
Oct 02 18:42:19 [27177] crmd: info: do_shutdown_req: Sending shutdown request to zd-sol-s1-v61
Oct 02 18:42:19 [27128] stonith-ng: error: stonith_peer_ais_destroy: AIS connection terminated
Oct 02 18:42:19 [27128] stonith-ng: info: stonith_shutdown: Terminating with  1 clients
Oct 02 18:42:19 [27130] attrd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
Oct 02 18:42:19 [27130] attrd: crit: attrd_ais_destroy: Lost connection to Corosync service!
Oct 02 18:42:19 [27130] attrd: notice: main: Exiting...
Oct 02 18:42:19 [27130] attrd: notice: main: Disconnecting client 81ffc38, pid=27177...
Oct 02 18:42:19 [27128] stonith-ng: info: qb_ipcs_us_withdraw: withdrawing server sockets
Oct 02 18:42:19 [27128] stonith-ng: info: crm_xml_cleanup: Cleaning up memory from libxml2
Oct 02 18:42:19 [27130] attrd: error: attrd_cib_connection_destroy: Connection to the CIB terminated...
Oct 02 18:42:19 [27127] cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: 2
Oct 02 18:42:19 [27127] cib: error: cib_ais_destroy: Corosync connection lost!  Exiting.
Oct 02 18:42:19 [27129] lrmd: info: lrmd_ipc_destroy: LRMD client disconnecting 807e768 - name: crmd id: 1d659f61-d6e2-4ef3-f674-b9a8ba8029e8
Oct 02 18:42:19 [27127] cib: info: terminate_cib: cib_ais_destroy: Exiting fast...
Oct 02 18:42:19 [27127] cib: info: qb_ipcs_us_withdraw: withdrawing server sockets
Oct 02 18:42:19 [27127] cib: info: qb_ipcs_us_withdraw: withdrawing server sockets
Oct 02 18:42:19 [27127] cib: info: qb_ipcs_us_withdraw: withdrawing server sockets
Oct 02 18:42:19 [27126] pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=27130, rc=1)
Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Oct 02 18:42:19 [27126] pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=27127, rc=64)
Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Oct 02 18:42:19 [27126] pacemakerd: notice: pcmk_child_exit: Child process crmd terminated with signal 13 (pid=27177, core=0)
Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping pengine: Sent -15 to process 27131
Oct 02 18:42:19 [27126] pacemakerd: info: pcmk_child_exit: Child process pengine exited (pid=27131, rc=0)
Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping lrmd: Sent -15 to process 27129
Oct 02 18:42:19 [27129] lrmd: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Oct 02 18:42:19 [27129] lrmd: info: lrmd_shutdown: Terminating with  0 clients
Oct 02 18:42:19 [27129] lrmd: info: qb_ipcs_us_withdraw: withdrawing server sockets
Oct 02 18:42:19 [27126] pacemakerd: info: pcmk_child_exit: Child process lrmd exited (pid=27129, rc=0)
Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sending message via cpg FAILED: (rc=9) Bad handle
Oct 02 18:42:19 [27126] pacemakerd: notice: stop_child: Stopping stonith-ng: Sent -15 to process 27128
Oct 02 18:42:19 [27126] pacemakerd: notice: pcmk_child_exit: Child process stonith-ng terminated with signal 11 (pid=27128, core=128)
Oct 02 18:42:19 [27126] pacemakerd: error: send_cpg_message: Sen