Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-29 Thread Christopher Harvey
On Thu, Sep 29, 2016, at 02:45 PM, Jan Pokorný wrote:
> Hello,
> 
> On 29/09/16 12:41 -0400, Christopher Harvey wrote:
> > I think something is failing at the execvp() level. I'm seeing
> > useful looking trace logs in the code, but can't enable them right
> > now. I have:
> > PCMK_debug=yes
> > PCMK_logfile=/tmp/pacemaker.log
> > PCMK_logpriority=debug
> > PCMK_trace_files=services_linux.c
> 
> Just in case, pacemaker needs to be restarted once this change in the
> appropriate configuration file is made.

I did restart it. Maybe I should check pacemaker stdout?

> Another thing to try: does
> "crm_resource --show-metadata=ocf:acme:MsgBB-Active" work for you?

That works (as root).



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-29 Thread Jan Pokorný
Hello,

On 29/09/16 12:41 -0400, Christopher Harvey wrote:
> I think something is failing at the execvp() level. I'm seeing
> useful looking trace logs in the code, but can't enable them right
> now. I have:
> PCMK_debug=yes
> PCMK_logfile=/tmp/pacemaker.log
> PCMK_logpriority=debug
> PCMK_trace_files=services_linux.c

Just in case, pacemaker needs to be restarted once this change in the
appropriate configuration file is made.

Another thing to try: does
"crm_resource --show-metadata=ocf:acme:MsgBB-Active" work for you?

-- 
Jan (Poki)




Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-29 Thread Christopher Harvey
On Thu, Sep 29, 2016, at 12:20 PM, Jan Pokorný wrote:
> On 28/09/16 16:55 -0500, Ken Gaillot wrote:
> > On 09/28/2016 04:04 PM, Christopher Harvey wrote:
> >> My corosync/pacemaker logs are seeing a bunch of messages like the
> >> following:
> >> 
> >> Sep 22 14:50:36 [1346] node-132-60   crmd: info:
> >> action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
> >> exited with rc=4
> 
> Another possibility is that the "execvp" call, i.e., the means of
> running this very agent, failed at a fundamental level (which could also
> be due to kernel security modules like SELinux, seccomp, etc., as
> already mentioned).

I don't have seccomp or SELinux.

> Do other agents work flawlessly for you?

I only have my custom agent. All actions work except meta-data. In fact,
I put the following at the very top of my resource agent:
#! /bin/bash
touch /tmp/yeah
echo "yeah running ${@}" >> /tmp/yeah

and the 'yeah' file gets filled with monitor/start/stop, but not
meta-data. I think something is failing at the execvp() level. I'm
seeing useful-looking trace logs in the code, but can't enable them
right now. I have:
PCMK_debug=yes
PCMK_logfile=/tmp/pacemaker.log
PCMK_logpriority=debug
PCMK_trace_files=services_linux.c

but I'm not seeing the pacemaker.log anywhere, and corosync.log only has
info and higher.
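
For anyone following along, one way to confirm which PCMK_* settings are
actually active is to print the uncommented assignments from the file that
pacemaker reads at startup. This is only a sketch: show_pcmk_settings is a
hypothetical helper name, and the file location is an assumption that varies
by distribution (commonly /etc/sysconfig/pacemaker or /etc/default/pacemaker).

```shell
# Print the active (uncommented) PCMK_* assignments from a config file.
# The path is passed in explicitly because its location is
# distribution-specific (an assumption, not something stated in this thread).
show_pcmk_settings() {
    local conf=$1
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # Keep only lines that start with optional whitespace, then PCMK_<name>=
    grep -E '^[[:space:]]*PCMK_[A-Za-z_]+=' "$conf"
}
```

Example call: show_pcmk_settings /etc/sysconfig/pacemaker
If nothing is printed, the settings are either commented out or were put in a
file that pacemaker does not read.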




Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-29 Thread Jan Pokorný
On 28/09/16 16:55 -0500, Ken Gaillot wrote:
> On 09/28/2016 04:04 PM, Christopher Harvey wrote:
>> My corosync/pacemaker logs are seeing a bunch of messages like the
>> following:
>> 
>> Sep 22 14:50:36 [1346] node-132-60   crmd: info:
>> action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
>> exited with rc=4

Another possibility is that the "execvp" call, i.e., the means of
running this very agent, failed at a fundamental level (which could also
be due to kernel security modules like SELinux, seccomp, etc., as
already mentioned).

Do other agents work flawlessly for you?

> This is the (unmodified) exit status of the process, so the resource
> agent must be returning "4" for some reason. Normally, that is used to
> indicate "insufficient privileges".
> 
>> Sep 22 14:50:36 [1346] node-132-60   crmd:error:
>> generic_get_metadata:   Failed to retrieve meta-data for
>> ocf:acme:MsgBB-Active
>> Sep 22 14:50:36 [1346] node-132-60   crmd:  warning:
>> get_rsc_metadata:   No metadata found for MsgBB-Active::ocf:acme:
>> Input/output error (-5)
>> Sep 22 14:50:36 [1346] node-132-60   crmd:error:
>> build_operation_update: No metadata for acme::ocf:MsgBB-Active
>> Sep 22 14:50:36 [1346] node-132-60   crmd:   notice:
>> process_lrm_event:  Operation MsgBB-Active_start_0: ok
>> (node=node-132-60, call=25, rc=0, cib-update=27, confirmed=true)
>> 
>> I am able to run the meta-data command on the command line:
> 
> I would suspect that your user account has some privileges that the lrmd
> user (typically hacluster:haclient) doesn't have. Try "su - hacluster"
> first and see if it's any different. Maybe directory or file
> permissions, or SELinux?

In fact, lrmd (along with stonithd) is an exception among the daemons
in that it runs as root:root, so that it can portably execute
resources that, naturally and in general, need privileges as high
as possible (here: inherited).

-- 
Jan (Poki)




Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-28 Thread Ken Gaillot
On 09/28/2016 04:04 PM, Christopher Harvey wrote:
> My corosync/pacemaker logs are seeing a bunch of messages like the
> following:
> 
> Sep 22 14:50:36 [1346] node-132-60   crmd: info:
> action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
> exited with rc=4

This is the (unmodified) exit status of the process, so the resource
agent must be returning "4" for some reason. Normally, that is used to
indicate "insufficient privileges".
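
As a reading aid for exit codes like the one above, the standard OCF return
codes 0 through 7 can be mapped to their symbolic names. A minimal sketch:
ocf_rc_name is a hypothetical helper, not part of ocf-shellfuncs, and codes
above 7 are deliberately left out.

```shell
# Map a numeric OCF exit status to its symbolic name (codes 0-7 only).
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        *) echo "unknown ($1)" ;;
    esac
}
```

Example call: ocf_rc_name 4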

> Sep 22 14:50:36 [1346] node-132-60   crmd:error:
> generic_get_metadata:   Failed to retrieve meta-data for
> ocf:acme:MsgBB-Active
> Sep 22 14:50:36 [1346] node-132-60   crmd:  warning:
> get_rsc_metadata:   No metadata found for MsgBB-Active::ocf:acme:
> Input/output error (-5)
> Sep 22 14:50:36 [1346] node-132-60   crmd:error:
> build_operation_update: No metadata for acme::ocf:MsgBB-Active
> Sep 22 14:50:36 [1346] node-132-60   crmd:   notice:
> process_lrm_event:  Operation MsgBB-Active_start_0: ok
> (node=node-132-60, call=25, rc=0, cib-update=27, confirmed=true)
> 
> I am able to run the meta-data command on the command line:

I would suspect that your user account has some privileges that the lrmd
user (typically hacluster:haclient) doesn't have. Try "su - hacluster"
first and see if it's any different. Maybe directory or file
permissions, or SELinux?
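
A related check is to run the agent with a stripped-down environment, since
an interactive shell carries variables that a daemon-spawned process may not
have. The following is only a sketch and does not claim to reproduce how the
cluster invokes the agent: run_agent_clean is a hypothetical helper, and the
PATH value and the default OCF_ROOT of /usr/lib/ocf are assumptions.

```shell
# Run a resource agent action with an almost empty environment.
# Arguments: agent path, action name, optional OCF_ROOT (assumed default).
# The function's exit status is the agent's exit status.
run_agent_clean() {
    local agent=$1 action=$2 ocf_root=${3:-/usr/lib/ocf}
    # env -i drops every inherited variable; only PATH and OCF_ROOT are set.
    env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin OCF_ROOT="$ocf_root" \
        "$agent" "$action"
}
```

Example call, using the agent path from the original post:
run_agent_clean /lib/ocf/resource.d/acme/MsgBB-Active meta-data /lib/ocf; echo "rc=$?"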

> node-132-43 # /lib/ocf/resource.d/acme/MsgBB-Active meta-data
> 
> 
> 
> 1.0
> 
> 
> MsgBB-Active resource (long desc)
> 
> MsgBB-Active resource
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> node-132-43 # echo $?
> 0
> 
> Resource code here:
> #! /bin/bash
> 
> ###
> # Initialization:
> 
> : ${OCF_FUNCTIONS=${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs}
> . ${OCF_FUNCTIONS}
> : ${__OCF_ACTION=$1}
> 
> ###
> 
> meta_data()
> {
> cat <<END
> 
> 
> 1.0
> 
> 
> MsgBB-Active resource (long desc)
> 
> MsgBB-Active resource
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> END
> }
> 
> # don't exit on TERM, to test that lrmd makes sure that we do exit
> trap sigterm_handler TERM
> sigterm_handler() {
> ocf_log info "They use TERM to bring us down. No such luck."
> return
> }
> 
> msgbb_usage() {
> cat <<END
> usage: $0 {start|stop|monitor|validate-all|meta-data}
> 
> Expects to have a fully populated OCF RA-compliant environment set.
> END
> }
> 
> msgbb_monitor() {
> # trimmed.
> }
> 
> msgbb_stop() {
> # trimmed.
> }
> 
> msgbb_start() {
> # trimmed.
> }
> 
> msgbb_validate() {
> # trimmed.
> }
> 
> case $__OCF_ACTION in
> meta-data)      meta_data
>                 exit $OCF_SUCCESS
>                 ;;
> start)          msgbb_start;;
> stop)           msgbb_stop;;
> monitor)        msgbb_monitor;;
> reload)         ocf_log err "Reloading..."
>                 msgbb_start
>                 ;;
> validate-all)   msgbb_validate;;
> usage|help)     msgbb_usage
>                 exit $OCF_SUCCESS
>                 ;;
> *)              msgbb_usage
>                 exit $OCF_ERR_UNIMPLEMENTED
>                 ;;
> esac
> rc=$?
> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
> exit $rc
> 
> 
> Thanks,
> Chris



[ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-28 Thread Christopher Harvey
My corosync/pacemaker logs are seeing a bunch of messages like the
following:

Sep 22 14:50:36 [1346] node-132-60   crmd: info:
action_synced_wait: Managed MsgBB-Active_meta-data_0 process 15613
exited with rc=4
Sep 22 14:50:36 [1346] node-132-60   crmd:error:
generic_get_metadata:   Failed to retrieve meta-data for
ocf:acme:MsgBB-Active
Sep 22 14:50:36 [1346] node-132-60   crmd:  warning:
get_rsc_metadata:   No metadata found for MsgBB-Active::ocf:acme:
Input/output error (-5)
Sep 22 14:50:36 [1346] node-132-60   crmd:error:
build_operation_update: No metadata for acme::ocf:MsgBB-Active
Sep 22 14:50:36 [1346] node-132-60   crmd:   notice:
process_lrm_event:  Operation MsgBB-Active_start_0: ok
(node=node-132-60, call=25, rc=0, cib-update=27, confirmed=true)

I am able to run the meta-data command on the command line:

node-132-43 # /lib/ocf/resource.d/acme/MsgBB-Active meta-data



1.0


MsgBB-Active resource (long desc)

MsgBB-Active resource













node-132-43 # echo $?
0
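
The XML markup of the output above did not survive in this archive. For
orientation only, OCF meta-data output generally has the shape sketched below.
This is not the original agent's output: meta_data_sketch is a hypothetical
name, the action list is taken from the agent's usage string, the timeout
values are made-up placeholders, and the parameters section is left empty
because the real parameters are unknown.

```shell
# Emit a minimal OCF-style meta-data document (illustrative values only).
meta_data_sketch() {
    cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="MsgBB-Active">
<version>1.0</version>
<longdesc lang="en">
MsgBB-Active resource (long desc)
</longdesc>
<shortdesc lang="en">MsgBB-Active resource</shortdesc>
<parameters>
</parameters>
<actions>
<action name="start"        timeout="20" />
<action name="stop"         timeout="20" />
<action name="monitor"      timeout="20" interval="10" />
<action name="validate-all" timeout="20" />
<action name="meta-data"    timeout="5" />
</actions>
</resource-agent>
END
}
```

Example call: meta_data_sketch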

Resource code here:
#! /bin/bash

###
# Initialization:

: ${OCF_FUNCTIONS=${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs}
. ${OCF_FUNCTIONS}
: ${__OCF_ACTION=$1}

###

meta_data()
{
cat <<END


1.0


MsgBB-Active resource (long desc)

MsgBB-Active resource












END
}

# don't exit on TERM, to test that lrmd makes sure that we do exit
trap sigterm_handler TERM
sigterm_handler() {
ocf_log info "They use TERM to bring us down. No such luck."
return
}

msgbb_usage() {
cat