On 17/05/2013, at 6:22 PM, Староверов Никита Александрович <nastarove...@kola.so-ups.ru> wrote:
> Hello, Pacemaker users and developers.
>
> First, many thanks to clusterlabs.org for their software. Pacemaker helps us
> very much!
>
> I am testing a cluster configuration based on Pacemaker+CMAN. I configured
> fencing as described in the Pacemaker documentation about CMAN-based clusters,
> and it works.
> Maybe I misunderstood something, but I can't acknowledge node fencing
> manually.
> I use fence_ipmilan as the device, and when I unplug the power cable from a
> server, stonith fails. I expected this, of course, but I don't know how to
> acknowledge manual fencing.
> When I try stonith_admin -C node_name, it does nothing.
> I see this in the logs:
>
> May 17 11:46:52 NODE1 stonith-ng[5434]: notice: stonith_manual_ack: Injecting manual confirmation that NODE2 is safely off/down
> May 17 11:46:52 NODE1 stonith-ng[5434]: notice: log_operation: Operation 'off' [0] (call 2 from stonith_admin.10959) for host 'NODE2' with device 'manual_ack' returned: 0 (OK)
> May 17 11:46:52 NODE1 stonith-ng[5434]: error: crm_abort: do_local_reply: Triggered assert at main.c:241 : client_obj->request_id
> May 17 11:46:52 NODE1 stonith-ng[5434]: error: crm_abort: crm_ipcs_sendv: Triggered assert at ipc.c:575 : header->qb.id != 0
> May 17 11:47:35 NODE1 stonith_admin[11162]: notice: crm_log_args: Invoked: stonith_admin -C NODE2
> May 17 11:47:35 NODE1 stonith-ng[5434]: notice: merge_duplicates: Merging stonith action off for node NODE2 originating from client stonith_admin.11162.b42172b1 with identical request from stonith_admin.10959@NODE1.f2048550 (0s)
> May 17 11:47:35 NODE1 stonith-ng[5434]: notice: stonith_manual_ack: Injecting manual confirmation that NODE2 is safely off/down
> May 17 11:47:35 NODE1 stonith-ng[5434]: notice: log_operation: Operation 'off' [0] (call 2 from stonith_admin.11162) for host 'NODE2' with device 'manual_ack' returned: 0 (OK)
> May 17 11:47:35 NODE1 stonith-ng[5434]: error: crm_abort: do_local_reply: Triggered assert at main.c:241 : client_obj->request_id
> May 17 11:47:35 NODE1 stonith-ng[5434]: error: crm_abort: crm_ipcs_sendv: Triggered assert at ipc.c:575 : header->qb.id != 0

Well, that's not nothing, but it certainly doesn't look right either.
I will investigate. Which version is this?

>
> Nothing happened after stonith_admin -C.
> Fenced is still trying fence_pcmk, and I see lots of "Timer expired" messages
> from stonith-ng, along with failed fence_ipmilan operations.
>
> Yes, I can run fence_ack_manual on the cman master node and then clean up the
> node state with cibadmin, but that is a very slow way to do it.
> If I lose many servers in the cluster, for example if one rack with two or
> more servers loses power, I need a way to get services running again on the
> remaining nodes as quickly as possible.
>
> I think manual fencing acknowledgement must be fast and simple, and I suppose
> that stonith_admin --confirm should do that.
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org