On 03/16/2016 11:20 AM, Lorand Kelemen wrote:
> Dear Ken,
>
> I already modified the startup as suggested during testing, thanks! I
> swapped the postfix ocf resource for the amavisd systemd resource, as the
> latter, as it turns out, also controls postfix startup, and having both
> resources in the mail-services group causes conflicts (postfix is detected
> as not running).
>
> Still experiencing the same behaviour: killing amavisd returns rc=7 for
> the monitoring operation on the "victim" node, which sounds logical, but
> the logs contain the same message: amavisd and virtualip cannot run
> anywhere.
>
> I made sure systemd is clean (amavisd = inactive, not running instead of
> failed) and also reset the failcount on all resources before killing
> amavisd.
>
> How can I make sure to have a clean state for the resources besides the
> above actions?
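For reference, the reset Lorand describes is typically done with commands along these lines (a sketch based on the pcs tooling shipped with CentOS 7; the resource and node names are the ones used in this thread, and subcommand syntax can vary between pcs versions):

    # One-shot status including inactive resources (-r) and fail counts (-f)
    crm_mon -1rf

    # Clear a resource's failure history so the next test starts clean
    pcs resource cleanup amavisd

    # Or inspect and reset fail counts explicitly, per resource and node
    pcs resource failcount show amavisd
    pcs resource failcount reset amavisd mail1

    # And confirm systemd agrees the unit is cleanly stopped
    systemctl is-active amavisd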
What you did is fine. I'm not sure why amavisd and virtualip can't run. Can
you show the output of "cibadmin -Q" when the cluster is in that state?

> Also note: when causing a filesystem resource to fail (e.g. with unmount),
> the failover happens successfully and all resources are started on the
> "survivor" node.
>
> Best regards,
> Lorand
>
> On Wed, Mar 16, 2016 at 4:34 PM, Ken Gaillot <kgail...@redhat.com> wrote:
>
>> On 03/16/2016 05:49 AM, Lorand Kelemen wrote:
>>> Dear Ken,
>>>
>>> Thanks for the reply! I lowered migration-threshold to 1 and rearranged
>>> constraints like you suggested:
>>>
>>> Location Constraints:
>>> Ordering Constraints:
>>>   promote mail-clone then start fs-services (kind:Mandatory)
>>>   promote spool-clone then start fs-services (kind:Mandatory)
>>>   start fs-services then start network-services (kind:Mandatory)
>>
>> Certainly not a big deal, but I would change the above constraint to
>> start fs-services then start mail-services. The IP doesn't care whether
>> the filesystems are up yet or not, but postfix does.
>>
>>>   start network-services then start mail-services (kind:Mandatory)
>>> Colocation Constraints:
>>>   fs-services with spool-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
>>>   fs-services with mail-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
>>>   network-services with mail-services (score:INFINITY)
>>>   mail-services with fs-services (score:INFINITY)
>>>
>>> Now virtualip and postfix become stopped. I guess these are the
>>> relevant lines, but I also attach the full logs:
>>>
>>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: native_color: Resource postfix cannot run anywhere
>>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: native_color: Resource virtualip-1 cannot run anywhere
>>>
>>> Interesting, will try to play around with ordering - colocation; the
>>> solution must be in these settings...
>>>
>>> Best regards,
>>> Lorand
>>>
>>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: --- 0.215.7 2 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: +++ 0.215.8 (null) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + /cib: @num_updates=8 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: ++ >>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']: >>> <lrm_rsc_op id="postfix_last_failure_0" >>> operation_key="postfix_monitor_45000" operation="monitor" >>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" >>> transition-key="86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a" >>> transition-magic="0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a" >>> on_node="mail1" call-id="1333" rc-code="7" >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> abort_transition_graph: Transition aborted by postfix_monitor_45000 >>> 'create' on mail1: Inactive graph >>> (magic=0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, cib=0.215.8, >>> source=process_graph_event:598, 1) >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> update_failcount: Updating failcount for postfix on mail1 after >> failed >>> monitor: rc=7 (update=value++, time=1458124686)
>>
>> I don't think your constraints are causing problems now; the above
>> message indicates that the postfix resource failed. Postfix may not be
>> able to run anywhere because it's already failed on both nodes, and the
>> IP would be down because it has to be colocated with postfix, and
>> postfix can't run.
>>
>> The rc=7 above indicates that the postfix agent's monitor operation
>> returned 7, which is "not running". I'd check the logs for postfix
>> errors.
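Two checks that make this diagnosis concrete: capturing the CIB Ken asks for above, and invoking the resource agent's monitor action by hand to see the OCF return code directly (a sketch; /usr/lib/ocf is the stock OCF_ROOT on CentOS 7, and the agent path comes from the resource-agents package):

    # Dump the complete live CIB while the cluster is in the failed state
    cibadmin -Q > /tmp/cib-failed-state.xml

    # Run the postfix agent's monitor by hand: exit 0 = running, 7 = not running
    export OCF_ROOT=/usr/lib/ocf
    /usr/lib/ocf/resource.d/heartbeat/postfix monitor; echo $?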
>>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> process_graph_event: Detected action (2962.86) >>> postfix_monitor_45000.1333=not running: failed >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_client_update: Expanded fail-count-postfix=value++ to 1 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Completed cib_modify operation for section status: >> OK >>> (rc=0, origin=mail1/crmd/253, version=0.215.8) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_peer_update: Setting fail-count-postfix[mail1]: (null) -> 1 from >>> mail2 >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: notice: >>> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ >>> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ] >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> write_attribute: Sent update 406 with 2 changes for >>> fail-count-postfix, id=<n/a>, set=(null) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_peer_update: Setting last-failure-postfix[mail1]: 1458124291 -> >>> 1458124686 from mail2 >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> write_attribute: Sent update 407 with 2 changes for >>> last-failure-postfix, id=<n/a>, set=(null) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Forwarding cib_modify operation for section status >> to >>> master (origin=local/attrd/406) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Forwarding cib_modify operation for section status >> to >>> master (origin=local/attrd/407) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: --- 0.215.8 2 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: +++ 0.215.9 (null) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + /cib: @num_updates=9 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: ++ >>> >> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']: >>> <nvpair id="status-1-fail-count-postfix" name="fail-count-postfix" >>> value="1"/> >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Completed cib_modify operation for section status: >> OK >>> (rc=0, origin=mail2/attrd/406, version=0.215.9) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: --- 0.215.9 2 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: +++ 0.215.10 (null) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + /cib: @num_updates=10 >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + >>> >> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix']: >>> @value=1458124686 >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_cib_callback: Update 406 for fail-count-postfix: OK (0) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>>
attrd_cib_callback: Update 406 for fail-count-postfix[mail1]=1: OK (0) >>> Mar 16 11:38:06 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Completed cib_modify operation for section status: >> OK >>> (rc=0, origin=mail2/attrd/407, version=0.215.10) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_cib_callback: Update 406 for fail-count-postfix[mail2]=(null): OK >>> (0) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_cib_callback: Update 407 for last-failure-postfix: OK (0) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_cib_callback: Update 407 for >>> last-failure-postfix[mail1]=1458124686: OK (0) >>> Mar 16 11:38:06 [7418] HWJ-626.domain.local attrd: info: >>> attrd_cib_callback: Update 407 for >>> last-failure-postfix[mail2]=1457610376: OK (0) >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> abort_transition_graph: Transition aborted by >>> status-1-fail-count-postfix, fail-count-postfix=1: Transient attribute >>> change (create cib=0.215.9, source=abort_unless_down:319, >>> >> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1'], >>> 1) >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> abort_transition_graph: Transition aborted by >>> status-1-last-failure-postfix, last-failure-postfix=1458124686: Transient >>> attribute change (modify cib=0.215.10, source=abort_unless_down:319, >>> >> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix'], >>> 1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> unpack_config: On loss of CCM Quorum: Ignore >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_online_status: Node mail1 is online >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_online_status: Node mail2 is online >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource mail:0 active in >>> master mode on mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource spool:0 active in >>> master mode on mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-spool active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-spool active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-mail active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-mail active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: warning: >>> unpack_rsc_op_failure: Processing failed op monitor for postfix on >>> mail1: not running (7) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource spool:1 active in >>> master mode on mail2 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource mail:1 active in >>> master mode on mail2 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> group_print: Resource Group: network-services >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: 
info: >>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2): >> Started >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> clone_print: Master/Slave Set: spool-clone [spool] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Masters: [ mail1 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Slaves: [ mail2 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> clone_print: Master/Slave Set: mail-clone [mail] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Masters: [ mail1 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Slaves: [ mail2 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> group_print: Resource Group: fs-services >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: fs-spool (ocf::heartbeat:Filesystem): Started >> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: fs-mail (ocf::heartbeat:Filesystem): Started >> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> group_print: Resource Group: mail-services >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: postfix (ocf::heartbeat:postfix): FAILED >> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: Promoting mail:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: mail-clone: Promoted 1 instances of a possible 1 to master >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: Promoting spool:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: spool-clone: Promoted 1 instances of a possible 1 to master >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> RecurringOp: Start recurring monitor (45s) for postfix on mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave virtualip-1 (Started mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave spool:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave spool:1 (Slave mail2) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave mail:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave mail:1 (Slave mail2) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave fs-spool (Started mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave fs-mail (Started mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> LogActions: Recover postfix (Started mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> process_pe_message: Calculated Transition 2963: >>> /var/lib/pacemaker/pengine/pe-input-330.bz2 >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> handle_response: pe_calc calculation pe_calc-dc-1458124686-5541 is >>> obsolete >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> unpack_config: On loss of CCM Quorum: Ignore >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_online_status: Node mail1 is online >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_online_status: Node mail2 is online >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: 
>>> determine_op_status: Operation monitor found resource mail:0 active in >>> master mode on mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource spool:0 active in >>> master mode on mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-spool active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-spool active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-mail active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource fs-mail active on >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: warning: >>> unpack_rsc_op_failure: Processing failed op monitor for postfix on >>> mail1: not running (7) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource spool:1 active in >>> master mode on mail2 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> determine_op_status: Operation monitor found resource mail:1 active in >>> master mode on mail2 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> group_print: Resource Group: network-services >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2): >> Started >>> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> clone_print: Master/Slave Set: spool-clone [spool] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Masters: [ mail1 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Slaves: [ mail2 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> clone_print: Master/Slave Set: mail-clone [mail] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Masters: [ mail1 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> short_print: Slaves: [ mail2 ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> group_print: Resource Group: fs-services >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: fs-spool (ocf::heartbeat:Filesystem): Started >> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: fs-mail (ocf::heartbeat:Filesystem): Started >> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> group_print: Resource Group: mail-services >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_print: postfix (ocf::heartbeat:postfix): FAILED >> mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> get_failcount_full: postfix has failed 1 times on mail1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: warning: >>> common_apply_stickiness: Forcing postfix away from mail1 after 1 >>> failures (max=1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: Promoting mail:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: mail-clone: Promoted 1 instances of a possible 1 to master >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> master_color: Promoting spool:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local 
pengine: info: >>> master_color: spool-clone: Promoted 1 instances of a possible 1 to master >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> rsc_merge_weights: fs-mail: Rolling back scores from postfix >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> rsc_merge_weights: postfix: Rolling back scores from virtualip-1 >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_color: Resource postfix cannot run anywhere >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> native_color: Resource virtualip-1 cannot run anywhere >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> LogActions: Stop virtualip-1 (mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave spool:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave spool:1 (Slave mail2) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave mail:0 (Master mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave mail:1 (Slave mail2) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave fs-spool (Started mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: info: >>> LogActions: Leave fs-mail (Started mail1) >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> LogActions: Stop postfix (mail1) >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> do_state_transition: State transition S_POLICY_ENGINE -> >>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE >>> origin=handle_response ] >>> Mar 16 11:38:06 [7419] HWJ-626.domain.local pengine: notice: >>> process_pe_message: Calculated Transition 2964: >>> /var/lib/pacemaker/pengine/pe-input-331.bz2 >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: info: >>> do_te_invoke: Processing graph 2964 (ref=pe_calc-dc-1458124686-5542) >>> derived from /var/lib/pacemaker/pengine/pe-input-331.bz2 >>> Mar 16 11:38:06 [7420] HWJ-626.domain.local crmd: notice: >>> te_rsc_command: Initiating action 5: stop postfix_stop_0 on mail1 >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: --- 0.215.10 2 >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: +++ 0.215.11 (null) >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + /cib: @num_updates=11 >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + >>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']/lrm_rsc_op[@id='postfix_last_0']: >>> @operation_key=postfix_stop_0, @operation=stop, >>> @transition-key=5:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>> @transition-magic=0:0;5:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>> @call-id=1335, @last-run=1458124686, @last-rc-change=1458124686, >>> @exec-time=435 >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: info: >>> match_graph_event: Action postfix_stop_0 (5) confirmed on mail1 (rc=0) >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Completed cib_modify operation for section status: >> OK >>> (rc=0, origin=mail1/crmd/254, version=0.215.11) >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: notice: >>> te_rsc_command: Initiating action 12: stop virtualip-1_stop_0 on >> mail1 >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: --- 0.215.11 2 >>> Mar 16 11:38:07 [7415] 
HWJ-626.domain.local cib: info: >>> cib_perform_op: Diff: +++ 0.215.12 (null) >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + /cib: @num_updates=12 >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_perform_op: + >>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='virtualip-1']/lrm_rsc_op[@id='virtualip-1_last_0']: >>> @operation_key=virtualip-1_stop_0, @operation=stop, >>> @transition-key=12:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>> @transition-magic=0:0;12:2964:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>> @call-id=1337, @last-run=1458124687, @last-rc-change=1458124687, >>> @exec-time=56 >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: info: >>> match_graph_event: Action virtualip-1_stop_0 (12) confirmed on mail1 >>> (rc=0) >>> Mar 16 11:38:07 [7415] HWJ-626.domain.local cib: info: >>> cib_process_request: Completed cib_modify operation for section status: >> OK >>> (rc=0, origin=mail1/crmd/255, version=0.215.12) >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: notice: >>> run_graph: Transition 2964 (Complete=7, Pending=0, Fired=0, Skipped=0, >>> Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-331.bz2): >> Complete >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: info: do_log: >>> FSA: Input I_TE_SUCCESS from notify_crmd() received in state >>> S_TRANSITION_ENGINE >>> Mar 16 11:38:07 [7420] HWJ-626.domain.local crmd: notice: >>> do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ >>> input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] >>> Mar 16 11:38:12 [7415] HWJ-626.domain.local cib: info: >>> cib_process_ping: Reporting our current digest to mail2: >>> ed43bc3ecf0f15853900ba49fc514870 for 0.215.12 (0x152b110 0)
>>>
>>> On Mon, Mar 14, 2016 at 6:44 PM, Ken Gaillot <kgail...@redhat.com> wrote:
>>>
>>>> On 03/10/2016 09:49 AM, Lorand Kelemen wrote:
>>>>> Dear List,
>>>>>
>>>>> After the creation and testing of a simple 2-node active-passive
>>>>> drbd+postfix cluster, nearly everything works flawlessly (standby,
>>>>> failure of a filesystem resource + failover, split-brain + manual
>>>>> recovery); however, when deliberately killing the postfix instance,
>>>>> failover does not occur after reaching the migration threshold, and the
>>>>> resources revert to the Stopped state (except the master-slave drbd
>>>>> resource, which works as expected).
>>>>>
>>>>> Ordering and colocation are configured, STONITH and quorum disabled;
>>>>> the goal is to always have one node running all the resources, and at
>>>>> any sign of error it should fail over to the passive node, nothing
>>>>> fancy.
>>>>>
>>>>> Is my configuration wrong or am I hitting a bug?
>>>>>
>>>>> All software is from the CentOS 7 + elrepo repositories.
>>>>
>>>> With these versions, you can set "two_node: 1" in
>>>> /etc/corosync/corosync.conf (which will be done automatically if you
>>>> used "pcs cluster setup" initially), and then you don't need to ignore
>>>> quorum in pacemaker.
>>>>
>>>>> Regarding STONITH: the machines are running on free ESXi instances on
>>>>> separate hosts, so the VMware fencing agents won't work because in the
>>>>> free version the API is read-only.
>>>>> Still trying to figure out a way to go; until then, manual recovery +
>>>>> huge ARP cache times on the upstream firewall...
>>>>>
>>>>> Please find the pe-input*.bz2 files attached, logs and config below.
>>>>> The situation: on node mail1 postfix was killed 3 times (migration
>>>>> threshold); it should have failed over to mail2.
>>>>> When killing a filesystem resource three times, this happens
>>>>> flawlessly.
>>>>>
>>>>> Thanks for your input!
>>>>>
>>>>> Best regards,
>>>>> Lorand
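For reference, Ken's two_node suggestion above corresponds to a quorum stanza along these lines in /etc/corosync/corosync.conf (a sketch; note that two_node: 1 also implicitly enables wait_for_all, so both nodes must be seen once before the cluster first becomes quorate):

    quorum {
        provider: corosync_votequorum
        # Two-node mode: the cluster remains quorate with a single node up
        two_node: 1
    }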
>>>>> Cluster Name: mailcluster
>>>>> Corosync Nodes:
>>>>>  mail1 mail2
>>>>> Pacemaker Nodes:
>>>>>  mail1 mail2
>>>>>
>>>>> Resources:
>>>>>  Group: network-services
>>>>>   Resource: virtualip-1 (class=ocf provider=heartbeat type=IPaddr2)
>>>>>    Attributes: ip=10.20.64.10 cidr_netmask=24 nic=lan0
>>>>>    Operations: start interval=0s timeout=20s (virtualip-1-start-interval-0s)
>>>>>                stop interval=0s timeout=20s (virtualip-1-stop-interval-0s)
>>>>>                monitor interval=30s (virtualip-1-monitor-interval-30s)
>>>>>  Master: spool-clone
>>>>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>>>>   Resource: spool (class=ocf provider=linbit type=drbd)
>>>>>    Attributes: drbd_resource=spool
>>>>>    Operations: start interval=0s timeout=240 (spool-start-interval-0s)
>>>>>                promote interval=0s timeout=90 (spool-promote-interval-0s)
>>>>>                demote interval=0s timeout=90 (spool-demote-interval-0s)
>>>>>                stop interval=0s timeout=100 (spool-stop-interval-0s)
>>>>>                monitor interval=10s (spool-monitor-interval-10s)
>>>>>  Master: mail-clone
>>>>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
>>>>>   Resource: mail (class=ocf provider=linbit type=drbd)
>>>>>    Attributes: drbd_resource=mail
>>>>>    Operations: start interval=0s timeout=240 (mail-start-interval-0s)
>>>>>                promote interval=0s timeout=90 (mail-promote-interval-0s)
>>>>>                demote interval=0s timeout=90 (mail-demote-interval-0s)
>>>>>                stop interval=0s timeout=100 (mail-stop-interval-0s)
>>>>>                monitor interval=10s (mail-monitor-interval-10s)
>>>>>  Group: fs-services
>>>>>   Resource: fs-spool (class=ocf provider=heartbeat type=Filesystem)
>>>>>    Attributes: device=/dev/drbd0 directory=/var/spool/postfix fstype=ext4 options=nodev,nosuid,noexec
>>>>>    Operations: start interval=0s timeout=60 (fs-spool-start-interval-0s)
>>>>>                stop interval=0s timeout=60 (fs-spool-stop-interval-0s)
>>>>>                monitor interval=20 timeout=40 (fs-spool-monitor-interval-20)
>>>>>   Resource: fs-mail (class=ocf provider=heartbeat type=Filesystem)
>>>>>    Attributes: device=/dev/drbd1 directory=/var/spool/mail fstype=ext4 options=nodev,nosuid,noexec
>>>>>    Operations: start interval=0s timeout=60 (fs-mail-start-interval-0s)
>>>>>                stop interval=0s timeout=60 (fs-mail-stop-interval-0s)
>>>>>                monitor interval=20 timeout=40 (fs-mail-monitor-interval-20)
>>>>>  Group: mail-services
>>>>>   Resource: postfix (class=ocf provider=heartbeat type=postfix)
>>>>>    Operations: start interval=0s timeout=20s (postfix-start-interval-0s)
>>>>>                stop interval=0s timeout=20s (postfix-stop-interval-0s)
>>>>>                monitor interval=45s (postfix-monitor-interval-45s)
>>>>>
>>>>> Stonith Devices:
>>>>> Fencing Levels:
>>>>>
>>>>> Location Constraints:
>>>>> Ordering Constraints:
>>>>>  start network-services then promote mail-clone (kind:Mandatory) (id:order-network-services-mail-clone-mandatory)
>>>>>  promote mail-clone then promote spool-clone (kind:Mandatory) (id:order-mail-clone-spool-clone-mandatory)
>>>>>  promote spool-clone then start fs-services (kind:Mandatory) (id:order-spool-clone-fs-services-mandatory)
>>>>>  start fs-services then start mail-services (kind:Mandatory) (id:order-fs-services-mail-services-mandatory)
>>>>> Colocation Constraints:
>>>>>  network-services with spool-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-network-services-spool-clone-INFINITY)
>>>>>  network-services with mail-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-network-services-mail-clone-INFINITY)
>>>>>  network-services with fs-services (score:INFINITY) (id:colocation-network-services-fs-services-INFINITY)
>>>>>  network-services with mail-services (score:INFINITY) (id:colocation-network-services-mail-services-INFINITY)
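In the reply just below, Ken suggests making each layer depend only on the layer beneath it. As a rough sketch of what that reworked chain might look like with pcs (syntax as in the pcs 0.9 series on CentOS 7; constraint ids and scores would be regenerated):

    # The filesystems sit on the DRBD masters...
    pcs constraint colocation add fs-services with master spool-clone INFINITY
    pcs constraint colocation add fs-services with master mail-clone INFINITY
    pcs constraint order promote spool-clone then start fs-services
    pcs constraint order promote mail-clone then start fs-services
    # ...postfix sits on the filesystems...
    pcs constraint colocation add mail-services with fs-services INFINITY
    pcs constraint order start fs-services then start mail-services
    # ...and the virtual IP only needs to be tied to, and ordered before, postfix
    pcs constraint colocation add network-services with mail-services INFINITY
    pcs constraint order start network-services then start mail-services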
>>>>
>>>> I'm not sure whether it's causing your issue, but I would make the
>>>> constraints reflect the logical relationships better.
>>>>
>>>> For example, network-services only needs to be colocated with
>>>> mail-services logically; it's mail-services that needs to be with
>>>> fs-services, and fs-services that needs to be with
>>>> spool-clone/mail-clone master. In other words, don't make the
>>>> highest-level resource depend on everything else; make each level
>>>> depend on the level below it.
>>>>
>>>> Also, I would guess that the virtual IP only needs to be ordered before
>>>> mail-services, and mail-clone and spool-clone could both be ordered
>>>> before fs-services, rather than ordering mail-clone before spool-clone.
>>>>
>>>>> Resources Defaults:
>>>>>  migration-threshold: 3
>>>>> Operations Defaults:
>>>>>  on-fail: restart
>>>>>
>>>>> Cluster Properties:
>>>>>  cluster-infrastructure: corosync
>>>>>  cluster-name: mailcluster
>>>>>  cluster-recheck-interval: 5min
>>>>>  dc-version: 1.1.13-10.el7_2.2-44eb2dd
>>>>>  default-resource-stickiness: infinity
>>>>>  have-watchdog: false
>>>>>  last-lrm-refresh: 1457613674
>>>>>  no-quorum-policy: ignore
>>>>>  pe-error-series-max: 1024
>>>>>  pe-input-series-max: 1024
>>>>>  pe-warn-series-max: 1024
>>>>>  stonith-enabled: false
>>>>>
>>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.15 2 >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.16 (null) >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=16 >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']/lrm_rsc_op[@id='postfix_last_failure_0']: >>>>> @transition-key=4:1234:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @transition-magic=0:7;4:1234:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @call-id=1274, @last-rc-change=1457613440 >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info: >>>>> abort_transition_graph: Transition aborted by >> postfix_monitor_45000 >>>>> 'modify' on mail1: Inactive graph >>>>> (magic=0:7;4:1234:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, cib=0.197.16, >>>>> source=process_graph_event:598, 1) >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info: >>>>> update_failcount: Updating failcount for postfix on mail1 after >>>> failed >>>>> monitor: rc=7 (update=value++, time=1457613440) >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_client_update: Expanded fail-count-postfix=value++ to 3 >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail1/crmd/196, version=0.197.16) >>>>> Mar 10
13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_peer_update: Setting fail-count-postfix[mail1]: 2 -> 3 from >>>> mail2 >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> write_attribute: Sent update 400 with 2 changes for >>>>> fail-count-postfix, id=<n/a>, set=(null) >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Forwarding cib_modify operation for section >> status >>>> to >>>>> master (origin=local/attrd/400) >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info: >>>>> process_graph_event: Detected action (1234.4) >>>>> postfix_monitor_45000.1274=not running: failed >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_peer_update: Setting last-failure-postfix[mail1]: 1457613347 >> -> >>>>> 1457613440 from mail2 >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: notice: >>>>> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ >>>>> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ] >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> write_attribute: Sent update 401 with 2 changes for >>>>> last-failure-postfix, id=<n/a>, set=(null) >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.16 2 >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.17 (null) >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=17 >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-fail-count-postfix']: >>>>> @value=3 >>>>> Mar 10 13:37:20 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail2/attrd/400, version=0.197.17) >>>>> Mar 10 13:37:20 [7420] HWJ-626.domain.local crmd: info: >>>>> abort_transition_graph: Transition aborted by >>>>> status-1-fail-count-postfix, fail-count-postfix=3: Transient attribute >>>>> change (modify cib=0.197.17, source=abort_unless_down:319, >>>>> >>>> >> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-fail-count-postfix'], >>>>> 1) >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_cib_callback: Update 400 for fail-count-postfix: OK (0) >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_cib_callback: Update 400 for fail-count-postfix[mail1]=3: OK >> (0) >>>>> Mar 10 13:37:20 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_cib_callback: Update 400 for fail-count-postfix[mail2]=(null): >> OK >>>>> (0) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Forwarding cib_modify operation for section >> status >>>> to >>>>> master (origin=local/attrd/401) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.17 2 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.18 (null) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=18 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> 
/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix']: >>>>> @value=1457613440 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail2/attrd/401, version=0.197.18) >>>>> Mar 10 13:37:21 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_cib_callback: Update 401 for last-failure-postfix: OK (0) >>>>> Mar 10 13:37:21 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_cib_callback: Update 401 for >>>>> last-failure-postfix[mail1]=1457613440: OK (0) >>>>> Mar 10 13:37:21 [7418] HWJ-626.domain.local attrd: info: >>>>> attrd_cib_callback: Update 401 for >>>>> last-failure-postfix[mail2]=1457610376: OK (0) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> abort_transition_graph: Transition aborted by >>>>> status-1-last-failure-postfix, last-failure-postfix=1457613440: >> Transient >>>>> attribute change (modify cib=0.197.18, source=abort_unless_down:319, >>>>> >>>> >> path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-postfix'], >>>>> 1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> unpack_config: On loss of CCM Quorum: Ignore >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_online_status: Node mail1 is online >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_online_status: Node mail2 is online >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource mail:0 active in >>>>> master mode on mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource spool:0 active >> in >>>>> master mode on mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource fs-spool active >> on >>>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource fs-mail active >> on >>>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning: >>>>> unpack_rsc_op_failure: Processing failed op monitor for postfix >> on >>>>> mail1: not running (7) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource spool:1 active >> in >>>>> master mode on mail2 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource mail:1 active in >>>>> master mode on mail2 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> group_print: Resource Group: network-services >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2): >>>> Started >>>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> clone_print: Master/Slave Set: spool-clone [spool] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Masters: [ mail1 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Slaves: [ mail2 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> clone_print: Master/Slave Set: mail-clone [mail] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local 
pengine: info: >>>>> short_print: Masters: [ mail1 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Slaves: [ mail2 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> group_print: Resource Group: fs-services >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: fs-spool (ocf::heartbeat:Filesystem): Started >>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: fs-mail (ocf::heartbeat:Filesystem): Started >>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> group_print: Resource Group: mail-services >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: postfix (ocf::heartbeat:postfix): FAILED >>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> get_failcount_full: postfix has failed 3 times on mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning: >>>>> common_apply_stickiness: Forcing postfix away from mail1 after 3 >>>>> failures (max=3) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: Promoting mail:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: mail-clone: Promoted 1 instances of a possible 1 to >> master >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: Promoting spool:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: spool-clone: Promoted 1 instances of a possible 1 to >> master >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> rsc_merge_weights: postfix: Rolling back scores from virtualip-1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_color: Resource virtualip-1 cannot run anywhere >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> RecurringOp: Start recurring monitor (45s) for postfix on mail2 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop virtualip-1 (mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave spool:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave spool:1 (Slave mail2) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave mail:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave mail:1 (Slave mail2) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop fs-spool (Started mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop fs-mail (Started mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop postfix (Started mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> process_pe_message: Calculated Transition 1235: >>>>> /var/lib/pacemaker/pengine/pe-input-302.bz2 >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> handle_response: pe_calc calculation pe_calc-dc-1457613441-3756 is >>>>> obsolete >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> unpack_config: On loss of CCM Quorum: Ignore >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_online_status: Node mail1 is online >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> 
determine_online_status: Node mail2 is online >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource mail:0 active in >>>>> master mode on mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource spool:0 active >> in >>>>> master mode on mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource fs-spool active >> on >>>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource fs-mail active >> on >>>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning: >>>>> unpack_rsc_op_failure: Processing failed op monitor for postfix >> on >>>>> mail1: not running (7) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource spool:1 active >> in >>>>> master mode on mail2 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> determine_op_status: Operation monitor found resource mail:1 active in >>>>> master mode on mail2 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> group_print: Resource Group: network-services >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: virtualip-1 (ocf::heartbeat:IPaddr2): >>>> Started >>>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> clone_print: Master/Slave Set: spool-clone [spool] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Masters: [ mail1 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Slaves: [ mail2 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> clone_print: Master/Slave Set: mail-clone [mail] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Masters: [ mail1 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> short_print: Slaves: [ mail2 ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> group_print: Resource Group: fs-services >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: fs-spool (ocf::heartbeat:Filesystem): Started >>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: fs-mail (ocf::heartbeat:Filesystem): Started >>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> group_print: Resource Group: mail-services >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_print: postfix (ocf::heartbeat:postfix): FAILED >>>> mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> get_failcount_full: postfix has failed 3 times on mail1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: warning: >>>>> common_apply_stickiness: Forcing postfix away from mail1 after 3 >>>>> failures (max=3) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: Promoting mail:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: mail-clone: Promoted 1 instances of a possible 1 to >> master >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: Promoting spool:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> master_color: spool-clone: 
Promoted 1 instances of a possible 1 to >> master >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> rsc_merge_weights: postfix: Rolling back scores from virtualip-1 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> native_color: Resource virtualip-1 cannot run anywhere >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> RecurringOp: Start recurring monitor (45s) for postfix on mail2 >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop virtualip-1 (mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave spool:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave spool:1 (Slave mail2) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave mail:0 (Master mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: info: >>>>> LogActions: Leave mail:1 (Slave mail2) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop fs-spool (Started mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop fs-mail (Started mail1) >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> LogActions: Stop postfix (Started mail1) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> do_state_transition: State transition S_POLICY_ENGINE -> >>>>> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE >>>>> origin=handle_response ] >>>>> Mar 10 13:37:21 [7419] HWJ-626.domain.local pengine: notice: >>>>> process_pe_message: Calculated Transition 1236: >>>>> /var/lib/pacemaker/pengine/pe-input-303.bz2 >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> do_te_invoke: Processing graph 1236 (ref=pe_calc-dc-1457613441-3757) >>>>> derived from /var/lib/pacemaker/pengine/pe-input-303.bz2 >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> te_rsc_command: Initiating action 12: stop virtualip-1_stop_0 on >>>> mail1 >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> te_rsc_command: Initiating action 5: stop postfix_stop_0 on mail1 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.18 2 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.19 (null) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=19 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='virtualip-1']/lrm_rsc_op[@id='virtualip-1_last_0']: >>>>> @operation_key=virtualip-1_stop_0, @operation=stop, >>>>> @transition-key=12:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @transition-magic=0:0;12:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @call-id=1276, @last-run=1457613441, @last-rc-change=1457613441, >>>>> @exec-time=66 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail1/crmd/197, version=0.197.19) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> match_graph_event: Action virtualip-1_stop_0 (12) confirmed on mail1 >>>>> (rc=0) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.19 2 >>>>> Mar 10 13:37:21 [7415] 
HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.20 (null) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=20 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']/lrm_rsc_op[@id='postfix_last_0']: >>>>> @operation_key=postfix_stop_0, @operation=stop, >>>>> @transition-key=5:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @transition-magic=0:0;5:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @call-id=1278, @last-run=1457613441, @last-rc-change=1457613441, >>>>> @exec-time=476 >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> match_graph_event: Action postfix_stop_0 (5) confirmed on mail1 >> (rc=0) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> te_rsc_command: Initiating action 79: stop fs-mail_stop_0 on >> mail1 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail1/crmd/198, version=0.197.20) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.20 2 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.21 (null) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=21 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='fs-mail']/lrm_rsc_op[@id='fs-mail_last_0']: >>>>> @operation_key=fs-mail_stop_0, @operation=stop, >>>>> @transition-key=79:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @transition-magic=0:0;79:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @call-id=1280, @last-run=1457613441, @last-rc-change=1457613441, >>>>> @exec-time=88, @queue-time=1 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail1/crmd/199, version=0.197.21) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> match_graph_event: Action fs-mail_stop_0 (79) confirmed on mail1 >>>> (rc=0) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> te_rsc_command: Initiating action 77: stop fs-spool_stop_0 on >> mail1 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: --- 0.197.21 2 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: Diff: +++ 0.197.22 (null) >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + /cib: @num_updates=22 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_perform_op: + >>>>> >>>> >> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='fs-spool']/lrm_rsc_op[@id='fs-spool_last_0']: >>>>> @operation_key=fs-spool_stop_0, @operation=stop, >>>>> @transition-key=77:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @transition-magic=0:0;77:1236:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, >>>>> @call-id=1282, @last-run=1457613441, @last-rc-change=1457613441, >>>>> @exec-time=86 >>>>> Mar 10 13:37:21 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_request: Completed cib_modify operation for section >> status: >>>> OK >>>>> (rc=0, origin=mail1/crmd/200, 
version=0.197.22) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >>>>> match_graph_event: Action fs-spool_stop_0 (77) confirmed on mail1 >>>> (rc=0) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: warning: >>>>> run_graph: Transition 1236 (Complete=11, Pending=0, Fired=0, >>>> Skipped=0, >>>>> Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-303.bz2): >>>>> Terminated >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: warning: >>>>> te_graph_trigger: Transition failed: terminated >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_graph: Graph 1236 with 12 actions: batch-limit=12 jobs, >>>>> network-delay=0ms >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 16]: Completed pseudo op >>>>> network-services_stopped_0 on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 15]: Completed pseudo op >>>>> network-services_stop_0 on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 12]: Completed rsc op >> virtualip-1_stop_0 >>>>> on mail1 (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 84]: Completed pseudo op >>>>> fs-services_stopped_0 on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 83]: Completed pseudo op >>>> fs-services_stop_0 >>>>> on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 77]: Completed rsc op fs-spool_stop_0 >>>>> on mail1 (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 79]: Completed rsc op fs-mail_stop_0 >>>>> on mail1 (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 90]: Completed pseudo op >>>>> mail-services_stopped_0 on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 89]: Completed pseudo op >>>>> mail-services_stop_0 on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 86]: Pending rsc op >> postfix_monitor_45000 >>>>> on mail2 (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: * [Input 85]: Unresolved dependency rsc op >>>>> postfix_start_0 on mail2 >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 5]: Completed rsc op postfix_stop_0 >>>>> on mail1 (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> print_synapse: [Action 8]: Completed pseudo op all_stopped >>>>> on N/A (priority: 0, waiting: none) >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: info: >> do_log: >>>>> FSA: Input I_TE_SUCCESS from notify_crmd() received in state >>>>> S_TRANSITION_ENGINE >>>>> Mar 10 13:37:21 [7420] HWJ-626.domain.local crmd: notice: >>>>> do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ >>>>> input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ] >>>>> Mar 10 13:37:26 [7415] HWJ-626.domain.local cib: info: >>>>> cib_process_ping: Reporting our current digest to mail2: >>>>> 3896ee29cdb6ba128330b0ef6e41bd79 for 0.197.22 (0x1544a30 
0)
>>
>
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org