Hello,

I have a question about the metadata an OCF resource agent returns for the "meta-data" command, and about the behaviour of the cluster.

When I check the resource agent's metadata using crm, I get this:

# crm
crm(live)# ra
crm(live)ra#  meta cluster_oracle ocf
bla (ocf:heartbeat:cluster_oracle)

Master/Slave OCF Resource Agent for Oracle (clustered)

Parameters (* denotes required, [] the default):

oracle_role* (string): Ora role
    Required to assign the Oracle role. Must be "master" or "slave"

Operations' defaults (advisory minimum):

    start    timeout=240
    promote  timeout=90
    demote   timeout=90
    notify   timeout=90
    stop     timeout=100
    monitor  timeout=20 interval=20 depth=0
    monitor  timeout=20 interval=10 depth=0

So it seems a timeout of 100 seconds is defined for the "stop" action. But at cluster shutdown I see this in the ha-debug log:

...
Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command: Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
Jan 18 14:31:35 node11 pengine: [12848]: notice: LogActions: Leave resource oracle_secondary (Stopped)
Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop
Jan 18 14:31:35 node1 crmd: [12844]: info: do_lrm_rsc_op: Performing key=5:10:0:40ea1f42-c929-40d6-a0ed-569a7c8944bc op=oracle_primary_stop_0 )
Jan 18 14:31:35 node1 lrmd: [12841]: info: RA output: (oracle_primary:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/cluster_oracle[247]:
Jan 18 14:31:35 node1 pengine: [12848]: WARN: process_pe_message: Transition 10: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-2220.bz2
Jan 18 14:31:35 node1 pengine: [12848]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop process (PID 14386) timed out (try 1). Killing with signal SIGTERM (15).
Jan 18 14:31:55 node1 lrmd: [12841]: info: RA output: (oracle_primary:stop:stderr)
Session terminated, killing shell...
Jan 18 14:31:57 node1 lrmd: [12841]: info: RA output: (oracle_primary:stop:stderr) ...killed.

Apparently the stop action timed out after 20 seconds. But why, if the resource agent defines 100 seconds?
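
For reference, the 20 seconds can be read off the lrmd timestamps in the log excerpt above. The following is only a minimal sketch of that arithmetic; the helper name elapsed_seconds and the constants are mine for illustration, and it assumes both log lines fall on the same day.

```python
from datetime import datetime

# Timestamps copied from the ha-debug excerpt above.
STOP_STARTED = "Jan 18 14:31:35"    # lrmd: rsc:oracle_primary:7: stop
STOP_TIMED_OUT = "Jan 18 14:31:55"  # lrmd: stop process ... timed out (try 1)

# Advisory value shown by "crm ra meta" for the stop operation, in seconds.
ADVISORY_STOP_TIMEOUT = 100


def elapsed_seconds(start: str, end: str) -> int:
    """Return the seconds between two syslog timestamps on the same day.

    Only the HH:MM:SS field is parsed, so this is not valid across midnight.
    """
    fmt = "%H:%M:%S"
    t_start = datetime.strptime(start.split()[2], fmt)
    t_end = datetime.strptime(end.split()[2], fmt)
    return int((t_end - t_start).total_seconds())


elapsed = elapsed_seconds(STOP_STARTED, STOP_TIMED_OUT)
print(f"stop ran for {elapsed} s before lrmd killed it "
      f"(advisory timeout from meta-data: {ADVISORY_STOP_TIMEOUT} s)")
```

So the stop was killed well before the 100 seconds listed in the metadata had passed.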

With kind regards
Markus

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker