I am using 1.0.9.1 of Pacemaker. I have applied the fix for bug 2477, but it is not working for me.

I started with this:

    # crm_mon -n -1
    ============
    Last updated: Mon Nov 8 09:49:07 2010
    Stack: Heartbeat
    Current DC: mgraid-mkp00009010repk-0 (f4e5e15c-d06b-4e37-89b9-4621af05128f) - partition with quorum
    Version: 1.0.9-89bd754939df5150de7cd76835f98fe90851b677
    2 Nodes configured, unknown expected votes
    4 Resources configured.
    ============

    Node mgraid-mkp00009010repk-0 (f4e5e15c-d06b-4e37-89b9-4621af05128f): online
            SSMKP00009010REPK:0 (ocf::omneon:ss) Master
            icms:0 (lsb:S53icms) Started
            mgraid-stonith:0 (stonith:external/mgpstonith) Started
            omserver:0 (lsb:S49omserver) Started
    Node mgraid-mkp00009010repk-1 (856c1f72-7cd1-4906-8183-8be87eef96f2): online
            omserver:1 (lsb:S49omserver) Started
            SSMKP00009010REPK:1 (ocf::omneon:ss) Slave
            icms:1 (lsb:S53icms) Started
            mgraid-stonith:1 (stonith:external/mgpstonith) Started

This is the output I received:

    # ./crm_resource -r ms-SSMKP00009010REPK -W
    resource ms-SSMKP00009010REPK is running on: mgraid-mkp00009010repk-0
    resource ms-SSMKP00009010REPK is running on: mgraid-mkp00009010repk-1

The bug fix adds this check:

    if((the_rsc->variant == pe_native) && (the_rsc->role == RSC_ROLE_MASTER)) {
        state = "Master";
    }
    fprintf(stdout, "resource %s is running on: %s %s\n", rsc, node->details->uname, state);
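To make the condition concrete, here is a small stand-alone sketch of the same check. It is not Pacemaker code: struct mock_rsc, state_for() and print_location() are hypothetical names made up for illustration, and the enums are simplified stand-ins that only borrow the names seen in the patch. It shows only that the state string is filled in when both conditions hold (variant is pe_native and role is RSC_ROLE_MASTER) and is left empty otherwise.

```c
#include <stdio.h>

/* Simplified stand-ins for illustration only; NOT the real Pacemaker types. */
enum variant { pe_native, pe_group, pe_clone, pe_master };
enum rsc_role {
    RSC_ROLE_UNKNOWN,
    RSC_ROLE_STOPPED,
    RSC_ROLE_STARTED,
    RSC_ROLE_SLAVE,
    RSC_ROLE_MASTER
};

/* Hypothetical, cut-down resource record: only the fields the check reads. */
struct mock_rsc {
    const char *id;
    enum variant variant;
    enum rsc_role role;
};

/* Same condition as the check in the patch: "Master" only for a
 * pe_native resource whose role is RSC_ROLE_MASTER, otherwise "". */
const char *state_for(const struct mock_rsc *rsc)
{
    if ((rsc->variant == pe_native) && (rsc->role == RSC_ROLE_MASTER)) {
        return "Master";
    }
    return "";
}

/* Prints one line in the same format as the patched fprintf call. */
void print_location(const struct mock_rsc *rsc, const char *uname)
{
    fprintf(stdout, "resource %s is running on: %s %s\n",
            rsc->id, uname, state_for(rsc));
}
```

Calling print_location() on a record with variant pe_native and role RSC_ROLE_MASTER appends "Master" to the line; for any other combination of variant and role the line ends after the node name.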
When I dump the_rsc with the debugger, I see that the_rsc->variant is pe_master, not pe_native. Also, the_rsc->role is RSC_ROLE_STOPPED. The same values show up even if I use the original crm_resource.c.

The complete dump of the_rsc is:

    (gdb) print *the_rsc
    $2 = {id = 0x64d260 "ms-SSMKP00009010REPK", clone_name = 0x0,
      long_name = 0x64d280 "ms-SSMKP00009010REPK", xml = 0x634ca0,
      ops_xml = 0x0, parent = 0x0, variant_opaque = 0x64d6a0,
      variant = pe_master, fns = 0x7f8496b67f00, cmds = 0x0,
      recovery_type = recovery_stop_start, restart_type = pe_restart_ignore,
      priority = 0, stickiness = 0, sort_index = 0, failure_timeout = 0,
      effective_priority = 0, migration_threshold = 1000000, flags = 262418,
      rsc_cons_lhs = 0x0, rsc_cons = 0x0, rsc_location = 0x0, actions = 0x0,
      allocated_to = 0x0, running_on = 0x658060, known_on = 0x0,
      allowed_nodes = 0x60e2c0, role = RSC_ROLE_STOPPED,
      next_role = RSC_ROLE_MASTER, meta = 0x648990, parameters = 0x648940,
      children = 0x610280}

Any idea why this happens? Is there another fix I need for 1.0.9.1 to make this change work?

Thanks,
Bob