Could you attach /var/lib/pengine/pe-input-3802.bz2 from staging1? That would tell us why.
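In case it helps to look inside the file before attaching it: the pe-input files are bzip2-compressed XML snapshots of the CIB, so a short script can list the cluster-wide properties the policy engine actually saw for that transition. This is only a rough sketch. The function name `cluster_properties` is made up for illustration, and it assumes the usual CIB layout in which properties are stored as `<nvpair name="..." value="..."/>` elements under `<crm_config>`.

```python
import bz2
import xml.etree.ElementTree as ET


def cluster_properties(path):
    """Return {name: value} for every nvpair found under <crm_config>.

    Assumes `path` points at a bzip2-compressed CIB snapshot such as a
    pe-input-*.bz2 file (hypothetical helper, not part of Pacemaker).
    """
    # bz2.open gives a file-like object that ElementTree can parse directly.
    with bz2.open(path, "rb") as fh:
        root = ET.parse(fh).getroot()

    props = {}
    # Cluster options live in <crm_config><cluster_property_set><nvpair .../>.
    for nvpair in root.iterfind(".//crm_config//nvpair"):
        props[nvpair.get("name")] = nvpair.get("value")
    return props
```

Calling `cluster_properties("/var/lib/pengine/pe-input-3802.bz2")` on staging1 and printing the result should show the same options as the `property` section of the crm config quoted below, which makes it easy to confirm what the policy engine was working from.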

On Mon, Sep 26, 2011 at 10:28 PM, Charles Richard <chachi.rich...@gmail.com> wrote:
> Hi,
>
> I'm making some headway finally with my pacemaker install but now that
> crm_mon doesn't return errors any more and crm_verify is clear, I'm having a
> problem where my master won't get promoted. Not sure what to do with this
> one, any suggestions? Here's the log snippet and config files:
>
> Sep 26 04:06:12 staging1 crmd: [1686]: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped!
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_pe_invoke: Query 106: Requesting the current CIB: S_POLICY_ENGINE
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_pe_invoke_callback: Invoking the PE: query=106, ref=pe_calc-dc-1317020772-95, seq=2564, quorate=1
> Sep 26 04:06:12 staging1 pengine: [1685]: info: unpack_config: Startup probes: enabled
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Sep 26 04:06:12 staging1 pengine: [1685]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> Sep 26 04:06:12 staging1 pengine: [1685]: info: unpack_domains: Unpacking domains
> Sep 26 04:06:12 staging1 pengine: [1685]: info: determine_online_status: Node staging1.dev.applepeak.com is online
> Sep 26 04:06:12 staging1 pengine: [1685]: info: determine_online_status: Node staging2.dev.applepeak.com is online
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: group_print: Resource Group: mysql
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: native_print: fs_mysql#011(ocf::heartbeat:Filesystem):#011Stopped
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: native_print: ip_mysql#011(ocf::heartbeat:IPaddr2):#011Stopped
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: native_print: mysqld#011(lsb:mysqld):#011Stopped
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: clone_print: Master/Slave Set: ms_drbd_mysql
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: short_print: Stopped: [ drbd_mysql:0 drbd_mysql:1 ]
> Sep 26 04:06:12 staging1 pengine: [1685]: info: master_color: ms_drbd_mysql: Promoted 0 instances of a possible 1 to master
> Sep 26 04:06:12 staging1 pengine: [1685]: info: native_merge_weights: fs_mysql: Rolling back scores from ip_mysql
> Sep 26 04:06:12 staging1 pengine: [1685]: info: native_merge_weights: ip_mysql: Rolling back scores from mysqld
> Sep 26 04:06:12 staging1 pengine: [1685]: info: master_color: ms_drbd_mysql: Promoted 0 instances of a possible 1 to master
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource fs_mysql#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource ip_mysql#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource mysqld#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource drbd_mysql:0#011(Stopped)
> Sep 26 04:06:12 staging1 pengine: [1685]: notice: LogActions: Leave resource drbd_mysql:1#011(Stopped)
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Sep 26 04:06:12 staging1 crmd: [1686]: info: unpack_graph: Unpacked transition 72: 0 actions in 0 synapses
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_te_invoke: Processing graph 72 (ref=pe_calc-dc-1317020772-95) derived from /var/lib/pengine/pe-input-3802.bz2
> Sep 26 04:06:12 staging1 crmd: [1686]: info: run_graph: ====================================================
> Sep 26 04:06:12 staging1 crmd: [1686]: notice: run_graph: Transition 72 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-3802.bz2): Complete
> Sep 26 04:06:12 staging1 crmd: [1686]: info: te_graph_trigger: Transition 72 is now complete
> Sep 26 04:06:12 staging1 crmd: [1686]: info: notify_crmd: Transition 72 status: done - <null>
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Sep 26 04:06:12 staging1 crmd: [1686]: info: do_state_transition: Starting PEngine Recheck Timer
> Sep 26 04:06:12 staging1 pengine: [1685]: info: process_pe_message: Transition 72: PEngine Input stored in: /var/lib/pengine/pe-input-3802.bz2
> Sep 26 04:15:09 staging1 cib: [1682]: info: cib_stats: Processed 1 operations (0.00us average, 0% utilization) in the last 10min
>
> My drbd config file:
>
> resource mysqld {
>   protocol C;
>   startup { wfc-timeout 0; degr-wfc-timeout 120; }
>   disk { on-io-error detach; }
>
>   on staging1 {
>     device /dev/drbd0;
>     disk /dev/vg_staging1/lv_data;
>     meta-disk internal;
>     address 10.10.20.1:7788;
>   }
>
>   on staging2 {
>     device /dev/drbd0;
>     disk /dev/vg_staging2/lv_data;
>     meta-disk internal;
>     address 10.10.20.2:7788;
>   }
> }
>
> corosync.conf:
>
> compatibility: whitetank
>
> aisexec {
>   user: root
>   group: root
> }
>
> totem {
>   version: 2
>   secauth: off
>   threads: 0
>   interface {
>     ringnumber: 0
>     bindnetaddr: 10.10.10.0
>     mcastaddr: 226.94.1.1
>     mcastport: 5405
>   }
> }
>
> logging {
>   fileline: off
>   to_stderr: no
>   to_logfile: no
>   to_syslog: yes
>   logfile: /var/log/cluster/corosync.log
>   debug: off
>   timestamp: on
>   logger_subsys {
>     subsys: AMF
>     debug: off
>   }
> }
>
> amf {
>   mode: disabled
> }
>
> service {
>   #Load Pacemaker
>   name: pacemaker
>   ver: 0
>   use_mgmtd: yes
> }
>
> And my crm config:
>
> node staging1.dev.applepeak.com
> node staging2.dev.applepeak.com
> primitive drbd_mysql ocf:linbit:drbd \
>   params drbd_resource="mysqld" \
>   op monitor interval="15s" \
>   op start interval="0" timeout="240s" \
>   op stop interval="0" timeout="100s"
> primitive fs_mysql ocf:heartbeat:Filesystem \
>   params device="/dev/drbd0" directory="/opt/data/mysql/data/mysql" fstype="ext4" \
>   op start interval="0" timeout="60s" \
>   op stop interval="0" timeout="60s"
> primitive ip_mysql ocf:heartbeat:IPaddr2 \
>   params ip="10.10.10.31" nic="eth0"
> primitive mysqld lsb:mysqld
> group mysql fs_mysql ip_mysql mysqld
> ms ms_drbd_mysql drbd_mysql \
>   meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
> order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
> property $id="cib-bootstrap-options" \
>   dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
>   cluster-infrastructure="openais" \
>   expected-quorum-votes="2" \
>   stonith-enabled="false" \
>   last-lrm-refresh="1316961847" \
>   stop-all-resources="true" \
>   no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>   resource-stickiness="100"
>
> Thanks,
> Charles
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker