Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource
On Thu, Sep 29, 2016, at 02:45 PM, Jan Pokorný wrote:
> Hello,
>
> On 29/09/16 12:41 -0400, Christopher Harvey wrote:
> > I think something is failing at the execvp() level. I'm seeing
> > useful looking trace logs in the code, but can't enable them right
> > now. I have:
> > PCMK_debug=yes
> > PCMK_logfile=/tmp/pacemaker.log
> > PCMK_logpriority=debug
> > PCMK_trace_files=services_linux.c
>
> Just in case, pacemaker needs to be restarted once this change in the
> appropriate configuration file is made.

I did restart it. Maybe I should check pacemaker stdout?

> Another try, does "crm_resource --show-metadata=ocf:acme:MsgBB-Active"
> work for you?

That works. (as root)

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource
Hello,

On 29/09/16 12:41 -0400, Christopher Harvey wrote:
> I think something is failing at the execvp() level. I'm seeing
> useful looking trace logs in the code, but can't enable them right
> now. I have:
> PCMK_debug=yes
> PCMK_logfile=/tmp/pacemaker.log
> PCMK_logpriority=debug
> PCMK_trace_files=services_linux.c

Just in case, pacemaker needs to be restarted once this change in the
appropriate configuration file is made.

Another try, does "crm_resource --show-metadata=ocf:acme:MsgBB-Active"
work for you?

--
Jan (Poki)
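One way to check whether a restarted daemon actually picked up the PCMK_ settings is to look at the environment it was started with. The following is a minimal sketch, assuming a Linux /proc filesystem and a user allowed to read the target process's environment (typically root). The helper name `pcmk_env_of_pid` is made up for illustration and is not part of Pacemaker.

```shell
#!/bin/bash
# Hypothetical helper (not part of Pacemaker): print the PCMK_* variables
# found in the start-up environment of a running process, given its PID.
# Assumes Linux, where /proc/<pid>/environ holds NUL-separated entries.
pcmk_env_of_pid() {
    local pid=$1
    # Turn the NUL separators into newlines, then keep only PCMK_ entries.
    tr '\0' '\n' < "/proc/${pid}/environ" | grep '^PCMK_'
}

# Example (assumes pidof is installed and a crmd process is running):
#   pcmk_env_of_pid "$(pidof crmd)"
```

If nothing is printed, the variables were not in the environment the process was started with, which would point at the configuration file not being read rather than at the logging code itself.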
Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource
On Thu, Sep 29, 2016, at 12:20 PM, Jan Pokorný wrote:
> On 28/09/16 16:55 -0500, Ken Gaillot wrote:
> > On 09/28/2016 04:04 PM, Christopher Harvey wrote:
> >> My corosync/pacemaker logs are seeing a bunch of messages like the
> >> following:
> >>
> >> Sep 22 14:50:36 [1346] node-132-60 crmd: info:
> >> action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
> >> exited with rc=4
>
> Another possibility is that the "execvp" call, i.e., the means to run
> this very agent, failed at a fundamental level (could also be due to
> kernel's security modules like SELinux, seccomp, etc. as already
> mentioned).

I don't have seccomp or SELinux.

> Do other agents work flawlessly for you?

I only have my custom agent. All actions work except meta-data. In fact,
I put the following at the very top of my resource agent:

#! /bin/bash
touch /tmp/yeah
echo "yeah running ${@}" >> /tmp/yeah

and the 'yeah' file gets filled with monitor/start/stop, but not
meta-data.

I think something is failing at the execvp() level. I'm seeing useful
looking trace logs in the code, but can't enable them right now. I have:

PCMK_debug=yes
PCMK_logfile=/tmp/pacemaker.log
PCMK_logpriority=debug
PCMK_trace_files=services_linux.c

but I'm not seeing the pacemaker.log anywhere, and corosync.log only has
info and higher.
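The debug preamble quoted above records the arguments but not who ran the agent, and it records nothing at all if the calling user cannot write to /tmp/yeah. A slightly more informative variant is sketched below. The function name `log_invocation`, the `RA_DEBUG_DIR` override, and the log file name are all placeholders chosen for illustration, not anything the thread or Pacemaker defines.

```shell
#!/bin/bash
# Hypothetical debug helper for the top of a resource agent.
# It writes to a per-UID file so that a file created by one user
# (e.g. root) cannot block writes when another user runs the agent.
log_invocation() {
    local dir=${RA_DEBUG_DIR:-/tmp}      # RA_DEBUG_DIR is an assumed override
    local logfile="${dir}/ra-debug.$(id -u).log"
    # Record timestamp, numeric UID, and the arguments the agent received.
    # Errors are discarded so that logging can never fail the agent itself.
    { printf '%s uid=%s args=%s\n' "$(date '+%F %T')" "$(id -u)" "$*" \
        >> "$logfile"; } 2>/dev/null || true
}

# At the very top of the agent, one would call:
#   log_invocation "$@"
```

Comparing the per-UID files afterwards may help tell "the agent was never executed for meta-data" apart from "the agent was executed as a different user".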
Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource
On 28/09/16 16:55 -0500, Ken Gaillot wrote:
> On 09/28/2016 04:04 PM, Christopher Harvey wrote:
>> My corosync/pacemaker logs are seeing a bunch of messages like the
>> following:
>>
>> Sep 22 14:50:36 [1346] node-132-60 crmd: info:
>> action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
>> exited with rc=4

Another possibility is that the "execvp" call, i.e., the means to run
this very agent, failed at a fundamental level (could also be due to
kernel's security modules like SELinux, seccomp, etc. as already
mentioned). Do other agents work flawlessly for you?

> This is the (unmodified) exit status of the process, so the resource
> agent must be returning "4" for some reason. Normally, that is used to
> indicate "insufficient privileges".
>
>> Sep 22 14:50:36 [1346] node-132-60 crmd: error:
>> generic_get_metadata: Failed to retrieve meta-data for
>> ocf:acme:MsgBB-Active
>> Sep 22 14:50:36 [1346] node-132-60 crmd: warning:
>> get_rsc_metadata: No metadata found for MsgBB-Active::ocf:acme:
>> Input/output error (-5)
>> Sep 22 14:50:36 [1346] node-132-60 crmd: error:
>> build_operation_update: No metadata for acme::ocf:MsgBB-Active
>> Sep 22 14:50:36 [1346] node-132-60 crmd: notice:
>> process_lrm_event: Operation MsgBB-Active_start_0: ok
>> (node=node-132-60, call=25, rc=0, cib-update=27, confirmed=true)
>>
>> I am able to run the meta-data command on the command line:
>
> I would suspect that your user account has some privileges that the lrmd
> user (typically hacluster:haclient) doesn't have. Try "su - hacluster"
> first and see if it's any different. Maybe directory or file
> permissions, or SELinux?

In fact, lrmd (along with stonithd) is an exception among the daemons in
that it runs as root:root, so that it can portably execute resource
agents, which naturally and in general need to run with the highest
(here: inherited) privileges.

--
Jan (Poki)
Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource
On 09/28/2016 04:04 PM, Christopher Harvey wrote:
> My corosync/pacemaker logs are seeing a bunch of messages like the
> following:
>
> Sep 22 14:50:36 [1346] node-132-60 crmd: info:
> action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
> exited with rc=4

This is the (unmodified) exit status of the process, so the resource
agent must be returning "4" for some reason. Normally, that is used to
indicate "insufficient privileges".

> Sep 22 14:50:36 [1346] node-132-60 crmd: error:
> generic_get_metadata: Failed to retrieve meta-data for
> ocf:acme:MsgBB-Active
> Sep 22 14:50:36 [1346] node-132-60 crmd: warning:
> get_rsc_metadata: No metadata found for MsgBB-Active::ocf:acme:
> Input/output error (-5)
> Sep 22 14:50:36 [1346] node-132-60 crmd: error:
> build_operation_update: No metadata for acme::ocf:MsgBB-Active
> Sep 22 14:50:36 [1346] node-132-60 crmd: notice:
> process_lrm_event: Operation MsgBB-Active_start_0: ok
> (node=node-132-60, call=25, rc=0, cib-update=27, confirmed=true)
>
> I am able to run the meta-data command on the command line:

I would suspect that your user account has some privileges that the lrmd
user (typically hacluster:haclient) doesn't have. Try "su - hacluster"
first and see if it's any different. Maybe directory or file
permissions, or SELinux?

> node-132-43 # /lib/ocf/resource.d/acme/MsgBB-Active meta-data
> [XML markup lost in the archived copy; surviving element text follows]
> 1.0
> MsgBB-Active resource (long desc)
> MsgBB-Active resource
> node-132-43 # echo $?
> 0
>
> Resource code here:
> #! /bin/bash
>
> ###
> # Initialization:
>
> : ${OCF_FUNCTIONS=${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs}
> . ${OCF_FUNCTIONS}
> : ${__OCF_ACTION=$1}
>
> ###
>
> meta_data()
> {
>     cat <<END
> [XML markup lost in the archived copy; surviving element text follows]
> 1.0
> MsgBB-Active resource (long desc)
> MsgBB-Active resource
> END
> }
>
> # don't exit on TERM, to test that lrmd makes sure that we do exit
> trap sigterm_handler TERM
> sigterm_handler() {
>     ocf_log info "They use TERM to bring us down. No such luck."
>     return
> }
>
> msgbb_usage() {
>     cat <<END
> usage: $0 {start|stop|monitor|validate-all|meta-data}
>
> Expects to have a fully populated OCF RA-compliant environment set.
> END
> }
>
> msgbb_monitor() {
>     # trimmed.
> }
>
> msgbb_stop() {
>     # trimmed.
> }
>
> msgbb_start() {
>     # trimmed.
> }
>
> msgbb_validate() {
>     # trimmed.
> }
>
> case $__OCF_ACTION in
> meta-data)      meta_data
>                 exit $OCF_SUCCESS
>                 ;;
> start)          msgbb_start;;
> stop)           msgbb_stop;;
> monitor)        msgbb_monitor;;
> reload)         ocf_log err "Reloading..."
>                 msgbb_start
>                 ;;
> validate-all)   msgbb_validate;;
> usage|help)     msgbb_usage
>                 exit $OCF_SUCCESS
>                 ;;
> *)              msgbb_usage
>                 exit $OCF_ERR_UNIMPLEMENTED
>                 ;;
> esac
> rc=$?
> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
> exit $rc
>
>
> Thanks,
> Chris
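To compare what the agent does from an interactive shell with what it does in a stripped-down environment, something like the following could be used. It is only a sketch: `run_agent_action` is a made-up helper name, and the default `OCF_ROOT=/lib/ocf` is an assumption taken from the agent path shown in the thread. It does not reproduce whatever user or environment the cluster daemons use.

```shell
#!/bin/bash
# Hypothetical helper: run one action of an OCF agent in a minimal
# environment and report its exit status on stderr.
# OCF_ROOT defaults to /lib/ocf (an assumption based on the agent path
# in the thread); export OCF_ROOT beforehand for other layouts.
run_agent_action() {
    local agent=$1 action=$2
    # env -i clears the inherited environment; only PATH and OCF_ROOT
    # are passed through to the agent.
    env -i PATH="$PATH" OCF_ROOT="${OCF_ROOT:-/lib/ocf}" "$agent" "$action"
    local rc=$?
    # env itself exits 126 or 127 when the agent cannot be executed
    # or cannot be found.
    echo "rc=${rc}" >&2
    return "$rc"
}

# Example:
#   run_agent_action /lib/ocf/resource.d/acme/MsgBB-Active meta-data
```

Running the same helper as another user, for example under `sudo -u hacluster` where sudo is available, would be one way to try the suggestion above on systems where that account has no login shell.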
[ClusterLabs] Failed to retrieve meta-data for custom ocf resource
My corosync/pacemaker logs are seeing a bunch of messages like the
following:

Sep 22 14:50:36 [1346] node-132-60 crmd: info: action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613 exited with rc=4
Sep 22 14:50:36 [1346] node-132-60 crmd: error: generic_get_metadata: Failed to retrieve meta-data for ocf:acme:MsgBB-Active
Sep 22 14:50:36 [1346] node-132-60 crmd: warning: get_rsc_metadata: No metadata found for MsgBB-Active::ocf:acme: Input/output error (-5)
Sep 22 14:50:36 [1346] node-132-60 crmd: error: build_operation_update: No metadata for acme::ocf:MsgBB-Active
Sep 22 14:50:36 [1346] node-132-60 crmd: notice: process_lrm_event: Operation MsgBB-Active_start_0: ok (node=node-132-60, call=25, rc=0, cib-update=27, confirmed=true)

I am able to run the meta-data command on the command line:

node-132-43 # /lib/ocf/resource.d/acme/MsgBB-Active meta-data
[XML markup lost in the archived copy; surviving element text follows]
1.0
MsgBB-Active resource (long desc)
MsgBB-Active resource
node-132-43 # echo $?
0

Resource code here:

#! /bin/bash

###
# Initialization:

: ${OCF_FUNCTIONS=${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs}
. ${OCF_FUNCTIONS}
: ${__OCF_ACTION=$1}

###

meta_data()
{
    cat <<END
[XML markup lost in the archived copy; surviving element text follows]
1.0
MsgBB-Active resource (long desc)
MsgBB-Active resource
END
}

# don't exit on TERM, to test that lrmd makes sure that we do exit
trap sigterm_handler TERM
sigterm_handler() {
    ocf_log info "They use TERM to bring us down. No such luck."
    return
}

msgbb_usage() {
    cat