On 5/8/07, Rene Purcell <[EMAIL PROTECTED]> wrote:



On 5/8/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
>
> grep ERROR logfile
>
> try this for starters:
>
> May  7 16:31:41 qclsles01 lrmd: [5020]: info: RA output: (resource_qclvmsles02:stop:stderr) Error: the domain 'resource_qclvmsles02' does not exist.
> May  7 16:31:41 qclsles01 lrmd: [5020]: info: RA output: (resource_qclvmsles02:stop:stdout) Domain resource_qclvmsles02 terminated
> May  7 16:31:41 qclsles01 crmd: [22028]: WARN: process_lrm_event:lrm.c LRM operation (35) stop_0 on resource_qclvmsles02 Error: (4) insufficient privileges


Yup, I saw that. It's weird: Heartbeat shuts down the VM and then reports
these errors, and if I clean up the resource it restarts on the correct node.
There must be something I missed.
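
For reference, the tengine ERROR line in the log below reports "target: 0 vs. rc: 4", and the "insufficient privileges" text appears to be simply the label attached to return code 4. A minimal sketch, assuming the standard OCF resource agent exit codes, that pulls the return code out of such a line and names it. The regex, the table name and the function name are illustrative only and not part of Heartbeat:

```python
import re

# Standard OCF resource agent exit codes (assumed to be what "rc" refers to).
OCF_RC = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Matches the fragment "(target: 0 vs. rc: 4)" from a tengine failure line.
RC_PATTERN = re.compile(r"target:\s*(\d+)\s+vs\.\s+rc:\s*(\d+)")


def explain_rc(line):
    """Return (expected, actual, name) for a failure line, or None if absent."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    expected, actual = int(match.group(1)), int(match.group(2))
    return expected, actual, OCF_RC.get(actual, "unknown")


if __name__ == "__main__":
    sample = (
        "May  7 16:31:41 qclsles01 tengine: [22591]: ERROR: match_graph_event: "
        "events.c Action resource_qclvmsles02_stop_0 on qclsles01 failed "
        "(target: 0 vs. rc: 4): Error"
    )
    print(explain_rc(sample))
```

So the label may say more about what the stop action returned after the "domain does not exist" error than about an actual permissions problem, but that is a guess, not something the log confirms.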


> On 5/7/07, Rene Purcell <[EMAIL PROTECTED]> wrote:
> > Has anyone tried the Novell setup described in
> > http://www.novell.com/linux/technical_library/has.pdf on an x86_64 arch?
> >
> > I've tested this setup on a classic x86 arch and everything was OK. On
> > x86_64 I double-checked my config and everything looks good, but my VM
> > never starts on its original node when that node comes back online, and
> > I can't find out why!
> >
> >
> > Here's the log from when my node1 comes back. We can see the VM shutting
> > down, and after that nothing happens on the other node.
> >
> > May  7 16:31:25 qclsles01 cib: [22024]: info:
> > cib_diff_notify:notify.cUpdate (client: 6403, call:13):
> > 0.65.1020 -> 0.65.1021 (ok)
> > May  7 16:31:25 qclsles01 tengine: [22591]: info:
> > te_update_diff:callbacks.cProcessing diff (cib_update):
> > 0.65.1020 -> 0.65.1021
> > May  7 16:31:25 qclsles01 tengine: [22591]: info:
> > extract_event:events.cAborting on transient_attributes changes
> > May  7 16:31:25 qclsles01 tengine: [22591]: info:
> update_abort_priority:
> > utils.c Abort priority upgraded to 1000000
> > May  7 16:31:25 qclsles01 tengine: [22591]: info:
> update_abort_priority:
> > utils.c Abort action 0 superceeded by 2
> > May  7 16:31:26 qclsles01 cib: [22024]: info: activateCibXml: io.c CIB
> size
> > is 161648 bytes (was 158548)
> > May  7 16:31:26 qclsles01 cib: [22024]: info:
> > cib_diff_notify:notify.cUpdate (client: 6403, call:14):
> > 0.65.1021 -> 0.65.1022 (ok)
> > May  7 16:31:26 qclsles01 haclient: on_event:evt:cib_changed
> > May  7 16:31:26 qclsles01 tengine: [22591]: info:
> > te_update_diff:callbacks.cProcessing diff (cib_update):
> > 0.65.1021 -> 0.65.1022
> > May  7 16:31:26 qclsles01 tengine: [22591]: info:
> > match_graph_event: events.cAction resource_qclvmsles02_stop_0 (9)
> > confirmed
> > May  7 16:31:26 qclsles01 cib: [25889]: info: write_cib_contents:io.cWrote
> > version 0.65.1022 of the CIB to disk (digest:
> > e71c271759371d44c4bad24d50b2421d)
> > May  7 16:31:39 qclsles01 kernel: xenbr0: port 3(vif12.0) entering
> disabled
> > state
> > May  7 16:31:39 qclsles01 kernel: device vif12.0 left promiscuous mode
> > May  7 16:31:39 qclsles01 kernel: xenbr0: port 3( vif12.0) entering
> disabled
> > state
> > May  7 16:31:39 qclsles01 logger: /etc/xen/scripts/vif-bridge: offline
> > XENBUS_PATH=backend/vif/12/0
> > May  7 16:31:40 qclsles01 logger: /etc/xen/scripts/block: remove
> > XENBUS_PATH=backend/vbd/12/768
> > May  7 16:31:40 qclsles01 logger: /etc/xen/scripts/block: remove
> > XENBUS_PATH=backend/vbd/12/832
> > May  7 16:31:40 qclsles01 logger: /etc/xen/scripts/block: remove
> > XENBUS_PATH=backend/vbd/12/5632
> > May  7 16:31:40 qclsles01 logger: /etc/xen/scripts/vif-bridge: brctl
> delif
> > xenbr0 vif12.0 failed
> > May  7 16:31:40 qclsles01 logger: /etc/xen/scripts/vif-bridge:
> ifconfig
> > vif12.0 down failed
> > May  7 16:31:40 qclsles01 logger: /etc/xen/scripts/vif-bridge:
> Successful
> > vif-bridge offline for vif12.0, bridge xenbr0.
> > May  7 16:31:40 qclsles01 logger:
> /etc/xen/scripts/xen-hotplug-cleanup:
> > XENBUS_PATH=backend/vbd/12/5632
> > May  7 16:31:40 qclsles01 logger:
> /etc/xen/scripts/xen-hotplug-cleanup:
> > XENBUS_PATH=backend/vbd/12/768
> > May  7 16:31:40 qclsles01 ifdown:     vif12.0
> > May  7 16:31:40 qclsles01 logger:
> /etc/xen/scripts/xen-hotplug-cleanup:
> > XENBUS_PATH=backend/vif/12/0
> > May  7 16:31:40 qclsles01 logger:
> /etc/xen/scripts/xen-hotplug-cleanup:
> > XENBUS_PATH=backend/vbd/12/832
> > May  7 16:31:40 qclsles01 ifdown: Interface not available and no
> > configuration found.
> > May  7 16:31:41 qclsles01 lrmd: [5020]: info: RA output: (resource_qclvmsles02:stop:stderr) Error: the domain 'resource_qclvmsles02' does not exist.
> > May  7 16:31:41 qclsles01 lrmd: [5020]: info: RA output: (resource_qclvmsles02:stop:stdout) Domain resource_qclvmsles02 terminated
> > May  7 16:31:41 qclsles01 crmd: [22028]: WARN: process_lrm_event:lrm.c LRM operation (35) stop_0 on resource_qclvmsles02 Error: (4) insufficient privileges
> > May  7 16:31:41 qclsles01 cib: [22024]: info: activateCibXml:io.c CIB
> size
> > is 164748 bytes (was 161648)
> > May  7 16:31:41 qclsles01 crmd: [22028]: info:
> > do_state_transition: fsa.cqclsles01: State transition
> > S_TRANSITION_ENGINE -> S_POLICY_ENGINE [
> > input=I_PE_CALC cause=C_IPC_MESSAGE origin=route_message ]
> > May  7 16:31:41 qclsles01 tengine: [22591]: info:
> > te_update_diff: callbacks.cProcessing diff (cib_update):
> > 0.65.1022 -> 0.65.1023
> > May  7 16:31:41 qclsles01 cib: [22024]: info:
> > cib_diff_notify:notify.cUpdate (client: 22028, call:100):
> > 0.65.1022 -> 0.65.1023 (ok)
> > May  7 16:31:41 qclsles01 crmd: [22028]: info: do_state_transition:
> fsa.c All
> > 2 cluster nodes are eligable to run resources.
> > May  7 16:31:41 qclsles01 tengine: [22591]: ERROR: match_graph_event: events.c Action resource_qclvmsles02_stop_0 on qclsles01 failed (target: 0 vs. rc: 4): Error
> > May  7 16:31:41 qclsles01 tengine: [22591]: info:
> > match_graph_event:events.cAction resource_qclvmsles02_stop_0 (10)
> > confirmed
> > May  7 16:31:41 qclsles01 tengine: [22591]: info:
> > run_graph:graph.c====================================================
> > May  7 16:31:41 qclsles01 tengine: [22591]: notice:
> > run_graph: graph.cTransition 12: (Complete=3, Pending=0, Fired=0,
> > Skipped=2, Incomplete=0)
> > May  7 16:31:41 qclsles01 haclient: on_event:evt:cib_changed
> > May  7 16:31:41 qclsles01 cib: [26190]: info: write_cib_contents: io.cWrote
> > version 0.65.1023 of the CIB to disk (digest:
> > c80326e44b5a106fe9a384240c4a3cc9)
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: process_pe_message:
> > [generation] <cib generated="true" admin_epoch="0" have_quorum="true"
> > num_peers="2" cib_feature_revision="1.3" ccm_transition="10"
> > dc_uuid="46ef9c7b-5f6e-4cc0-a0bb-94227b605170" epoch="65"
> > num_updates="1023"/>
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: unpack_config:
> unpack.c No
> > value specified for cluster preference: default_action_timeout
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config: unpack.cDefault stickiness: 1000000
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config:unpack.cDefault failure stickiness: -500
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config: unpack.cSTONITH of failed nodes is disabled
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config:unpack.cSTONITH will reboot nodes
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config: unpack.cCluster is symmetric - resources can run
> > anywhere by default
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: unpack_config:
> unpack.c On
> > loss of CCM Quorum: Stop ALL resources
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config:unpack.cOrphan resources are stopped
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config:unpack.cOrphan resource actions are stopped
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: unpack_config:
> unpack.c No
> > value specified for cluster preference: remove_after_stop
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> > unpack_config:unpack.cStopped resources are removed from the status
> > section: false
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: unpack_config:
> unpack.c By
> > default resources are managed
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> determine_online_status:
> > unpack.c Node qclsles02 is online
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> determine_online_status:
> > unpack.c Node qclsles01 is online
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: unpack_rsc_op: unpack.c Processing failed op (resource_qclvmsles02_stop_0) for resource_qclvmsles02 on qclsles01
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: unpack_rsc_op: unpack.c Handling failed stop for resource_qclvmsles02 on qclsles01
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> process_orphan_resource:
> > Orphan resource <lrm_resource id="resource_NFS" type="nfs" class="lsb"
> > provider="heartbeat">
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> process_orphan_resource:
> > Orphan resource   <lrm_rsc_op id="resource_NFS_monitor_0"
> > operation="monitor" crm-debug-origin="build_active_RAs"
> > transition_key="27:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> > transition_magic="0:0;27:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> call_id="9"
> > crm_feature_set="1.0.6" rc_code="0" op_status="0" interval="0"
> > op_digest="08b7001b97ccdaa1ca23a9f165256bc1"/>
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> process_orphan_resource:
> > Orphan resource   <lrm_rsc_op id="resource_NFS_stop_0"
> operation="stop"
> > crm-debug-origin="build_active_RAs"
> > transition_key="28:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> > transition_magic="0:0;28:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> call_id="10"
> > crm_feature_set="1.0.6" rc_code="0" op_status="0" interval="0"
> > op_digest="08b7001b97ccdaa1ca23a9f165256bc1"/>
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> process_orphan_resource:
> > Orphan resource </lrm_resource>
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN:
> process_orphan_resource:
> > unpack.c Nothing known about resource resource_NFS running on
> qclsles01
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> create_fake_resource:
> > Orphan resource <lrm_resource id="resource_NFS" type="nfs" class="lsb"
> > provider="heartbeat">
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> create_fake_resource:
> > Orphan resource   <lrm_rsc_op id="resource_NFS_monitor_0"
> > operation="monitor" crm-debug-origin="build_active_RAs"
> > transition_key="27:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> > transition_magic="0:0;27:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> call_id="9"
> > crm_feature_set="1.0.6" rc_code="0" op_status="0" interval="0"
> > op_digest="08b7001b97ccdaa1ca23a9f165256bc1"/>
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> create_fake_resource:
> > Orphan resource   <lrm_rsc_op id="resource_NFS_stop_0"
> operation="stop"
> > crm-debug-origin="build_active_RAs"
> > transition_key="28:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> > transition_magic="0:0;28:3a815bc6-ffaa-49b3-aac2-0ed46e85f085"
> call_id="10"
> > crm_feature_set="1.0.6" rc_code="0" op_status="0" interval="0"
> > op_digest="08b7001b97ccdaa1ca23a9f165256bc1"/>
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> create_fake_resource:
> > Orphan resource </lrm_resource>
> > May  7 16:31:41 qclsles01 pengine: [22592]: info:
> process_orphan_resource:
> > unpack.c Making sure orphan resource_NFS is stopped
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: resource_qclvmsles01 (heartbeat::ocf:Xen):    Started qclsles01
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: resource_qclvmsles02 (heartbeat::ocf:Xen):    Started qclsles01 (unmanaged) FAILED
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: resource_NFS (lsb:nfs):    Stopped
> > May  7 16:31:41 qclsles01 pengine: [22592]: notice: NoRoleChange: native.c Leave resource resource_qclvmsles01 (qclsles01)
> > May  7 16:31:41 qclsles01 pengine: [22592]: notice: NoRoleChange: native.c Move  resource resource_qclvmsles02    (qclsles01 -> qclsles02)
> > May  7 16:31:41 qclsles01 crmd: [22028]: info:
> > do_state_transition:fsa.cqclsles01: State transition S_POLICY_ENGINE
> > -> S_TRANSITION_ENGINE [
> > input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: custom_action: utils.c Action resource_qclvmsles02_stop_0 stop is for resource_qclvmsles02 (unmanaged)
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: custom_action: utils.c Action resource_qclvmsles02_start_0 start is for resource_qclvmsles02 (unmanaged)
> > May  7 16:31:41 qclsles01 pengine: [22592]: notice:
> > stage8:allocate.cCreated transition graph 13.
> > May  7 16:31:41 qclsles01 tengine: [22591]: info:
> > unpack_graph:unpack.cUnpacked transition 13: 0 actions in 0 synapses
> > May  7 16:31:41 qclsles01 crmd: [22028]: info:
> > do_state_transition:fsa.cqclsles01 : State transition
> > S_TRANSITION_ENGINE -> S_IDLE [
> > input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
> > May  7 16:31:41 qclsles01 pengine: [22592]: WARN: process_pe_message:
> > pengine.c No value specified for cluster preference:
> pe-input-series-max
> > May  7 16:31:41 qclsles01 tengine: [22591]: info:
> > run_graph:graph.cTransition 13: (Complete=0, Pending=0, Fired=0,
> > Skipped=0, Incomplete=0)
> > May  7 16:31:41 qclsles01 pengine: [22592]: info: process_pe_message:
> > pengine.c Transition 13: PEngine Input stored in:
> > /var/lib/heartbeat/pengine/pe-input-100.bz2
> > May  7 16:31:41 qclsles01 tengine: [22591]: info:
> > notify_crmd:actions.cTransition 13 status: te_complete - (null)
> >
> >
> > Thanks!
> >
> >
> > --
> > René Jr Purcell
> > Chargé de projet, sécurité et sytèmes
> > Techno Centre Logiciels Libres, http://www.tc2l.ca/
> > Téléphone : (418) 681-2929 #124
> > _______________________________________________
> > Linux-HA mailing list
> > Linux-HA@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
>





Ah, and how am I supposed to know which node is concerned in the log?
I can read:

"May  7 16:31:41 qclsles01 crmd: [22028]: WARN: process_lrm_event:lrm.c LRM
operation (35) stop_0 on resource_qclvmsles02 Error: (4) insufficient
privileges"

on my first node, and the same message, except for the hostname, on my second
node. So which one has a privileges problem?
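
One way to look at it: in the usual syslog layout the field right after the timestamp is the host that wrote the line, and the tengine ERROR line in the log above also names the node explicitly ("... stop_0 on qclsles01 failed"). A rough sketch, assuming that layout, which groups the WARN and ERROR lines of a log by the host that reported them. The regex and the function name are made up for illustration:

```python
import re
from collections import defaultdict

# Assumed layout: "<Mon> <day> <hh:mm:ss> <host> <daemon>: [<pid>]: <LEVEL>: <msg>"
LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+):\s+\[(?P<pid>\d+)\]:\s+"
    r"(?P<level>WARN|ERROR):\s+(?P<msg>.*)$"
)


def problems_by_host(lines):
    """Map each reporting host to its list of (level, message) pairs."""
    found = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            found[match.group("host")].append(
                (match.group("level"), match.group("msg"))
            )
    return dict(found)


if __name__ == "__main__":
    log = [
        "May  7 16:31:41 qclsles01 crmd: [22028]: WARN: process_lrm_event:lrm.c "
        "LRM operation (35) stop_0 on resource_qclvmsles02 Error: (4) "
        "insufficient privileges",
        "May  7 16:31:41 qclsles01 crmd: [22028]: info: do_state_transition: "
        "fsa.c All 2 cluster nodes are eligable to run resources.",
    ]
    for host, items in problems_by_host(log).items():
        for level, msg in items:
            print(host, level, msg)
```

This only works on unwrapped log lines, so it should be run against the log file on each node rather than against text pasted from an email.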

