Re: [ClusterLabs] custom resource agent FAILED (blocked)

2018-04-12 Thread emmanuel segura
The start function needs to start the resource when monitor does NOT return
success. As written, your start only runs hdfs-ha.sh when monitor already
reports the resource as running, so the agent never actually starts anything.
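
Something along these lines should work (untested sketch, reusing the
hdfs-ha.sh helper and the active-NameNode check from your agent):

HDFSHA_start() {
    # Already running? Then start is a no-op and must return success.
    if HDFSHA_monitor; then
        return $OCF_SUCCESS
    fi
    # Not running: actually start it, then let monitor confirm the result.
    /opt/hadoop/sbin/hdfs-ha.sh start || return $OCF_ERR_GENERIC
    HDFSHA_monitor
}

HDFSHA_monitor() {
    # Report running only when this node is the active namenode,
    # and NOT_RUNNING otherwise, so Pacemaker can tell the states apart.
    active_nn=$(hdfs haadmin -getAllServiceState 2>/dev/null | grep active | cut -d":" -f 1)
    current_node=$(uname -n)
    if [ "${active_nn}" = "${current_node}" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

stop should follow the same pattern: if monitor says the resource is already
stopped, return $OCF_SUCCESS right away; only call hdfs-ha.sh stop while it
is still active, and re-check with monitor before reporting success.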

2018-04-12 23:38 GMT+02:00 Bishoy Mikhael :

> Hi All,
>
> I'm trying to create a resource agent to promote a standby HDFS namenode
> to active when the virtual IP fails over to another node.
>
> I've taken the skeleton from the Dummy OCF agent.
>
> The modifications I've done to the Dummy agent are as follows:
>
> HDFSHA_start() {
> HDFSHA_monitor
> if [ $? =  $OCF_SUCCESS ]; then
> /opt/hadoop/sbin/hdfs-ha.sh start
> return $OCF_SUCCESS
> fi
> }
>
> HDFSHA_stop() {
> HDFSHA_monitor
> if [ $? =  $OCF_SUCCESS ]; then
> /opt/hadoop/sbin/hdfs-ha.sh stop
> fi
> return $OCF_SUCCESS
> }
>
> HDFSHA_monitor() {
> # Monitor _MUST!_ differentiate correctly between running
> # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
> # That is THREE states, not just yes/no.
> active_nn=$(hdfs haadmin -getAllServiceState | grep active | cut -d":" -f 1)
> current_node=$(uname -n)
> if [[ ${active_nn} == ${current_node} ]]; then
>return $OCF_SUCCESS
> fi
> }
>
> HDFSHA_validate() {
>
> return $OCF_SUCCESS
> }
>
>
> I've created the resource as follows:
>
> # pcs resource create hdfs-ha ocf:heartbeat:HDFSHA op monitor interval=30s
>
>
> The resource fails right away as follows:
>
>
> # pcs status
>
> Cluster name: hdfs_cluster
>
> Stack: corosync
>
> Current DC: taulog (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
>
> Last updated: Thu Apr 12 03:30:57 2018
>
> Last change: Thu Apr 12 03:30:54 2018 by root via cibadmin on lingcod
>
>
> 3 nodes configured
>
> 2 resources configured
>
>
> Online: [ dentex lingcod taulog ]
>
>
> Full list of resources:
>
>
>  VirtualIP (ocf::heartbeat:IPaddr2): Started taulog
>
>  hdfs-ha (ocf::heartbeat:HDFSHA): FAILED (blocked)[ taulog dentex ]
>
>
> Failed Actions:
>
> * hdfs-ha_stop_0 on taulog 'insufficient privileges' (4): call=12,
> status=complete, exitreason='none',
>
> last-rc-change='Thu Apr 12 03:17:37 2018', queued=0ms, exec=1ms
>
> * hdfs-ha_stop_0 on dentex 'insufficient privileges' (4): call=10,
> status=complete, exitreason='none',
>
> last-rc-change='Thu Apr 12 03:17:43 2018', queued=0ms, exec=1ms
>
>
>
> Daemon Status:
>
>   corosync: active/enabled
>
>   pacemaker: active/enabled
>
>   pcsd: active/enabled
>
> I debugged the resource as follows, and every operation returns 0:
>
> # pcs resource debug-monitor hdfs-ha
>
> Operation monitor for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
>
>  >  stderr: DEBUG: hdfs-ha monitor : 0
>
>
> # pcs resource debug-stop hdfs-ha
>
> Operation stop for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
>
>  >  stderr: DEBUG: hdfs-ha stop : 0
>
>
> # pcs resource debug-start hdfs-ha
>
> Operation start for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
>
>  >  stderr: DEBUG: hdfs-ha start : 0
>
>
>
> I don't understand what I'm doing wrong!
>
>
> Regards,
>
> Bishoy Mikhael
>


___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

