Re: [ClusterLabs] custom resource agent FAILED (blocked)

2018-04-12 Thread emmanuel segura
The start function needs to start the resource when monitor does *not* return
success; as written, it only runs the start script when the resource is already active.
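
A minimal sketch of the start/monitor pair along those lines, assuming the same
hdfs-ha.sh helper and that the agent still sources ocf-shellfuncs as the Dummy
skeleton does (illustrative only, not a tested agent):

HDFSHA_start() {
    # nothing to do if monitor already reports this node as the active namenode
    HDFSHA_monitor
    if [ $? -eq $OCF_SUCCESS ]; then
        return $OCF_SUCCESS
    fi
    # otherwise actually promote, then wait (within the action timeout)
    # until monitor confirms it
    /opt/hadoop/sbin/hdfs-ha.sh start || return $OCF_ERR_GENERIC
    while ! HDFSHA_monitor; do
        sleep 1
    done
    return $OCF_SUCCESS
}

HDFSHA_monitor() {
    active_nn=$(hdfs haadmin -getAllServiceState | grep active | cut -d":" -f1)
    if [ "${active_nn}" = "$(uname -n)" ]; then
        return $OCF_SUCCESS
    fi
    # a real agent must also distinguish "failed" from "cleanly stopped";
    # simplified here to "not running"
    return $OCF_NOT_RUNNING
}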

2018-04-12 23:38 GMT+02:00 Bishoy Mikhael :

> Hi All,
>
> I'm trying to create a resource agent to promote a standby HDFS namenode
> to active when the virtual IP fails over to another node.
>
> I've taken the skeleton from the Dummy OCF agent.
>
> The modifications I've done to the Dummy agent are as follows:
>
> HDFSHA_start() {
> HDFSHA_monitor
> if [ $? =  $OCF_SUCCESS ]; then
> /opt/hadoop/sbin/hdfs-ha.sh start
> return $OCF_SUCCESS
> fi
> }
>
> HDFSHA_stop() {
> HDFSHA_monitor
> if [ $? =  $OCF_SUCCESS ]; then
> /opt/hadoop/sbin/hdfs-ha.sh stop
> fi
> return $OCF_SUCCESS
> }
>
> HDFSHA_monitor() {
> # Monitor _MUST!_ differentiate correctly between running
> # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
> # That is THREE states, not just yes/no.
> active_nn=$(hdfs haadmin -getAllServiceState | grep active | cut -d":" -f1)
> current_node=$(uname -n)
> if [[ ${active_nn} == ${current_node} ]]; then
>return $OCF_SUCCESS
> fi
> }
>
> HDFSHA_validate() {
>
> return $OCF_SUCCESS
> }
>
>
> I've created the resource as follows:
>
> # pcs resource create hdfs-ha ocf:heartbeat:HDFSHA op monitor interval=30s
>
>
> The resource fails right away as follows:
>
>
> # pcs status
>
> Cluster name: hdfs_cluster
>
> Stack: corosync
>
> Current DC: taulog (version 1.1.16-12.el7_4.8-94ff4df) - partition with
> quorum
>
> Last updated: Thu Apr 12 03:30:57 2018
>
> Last change: Thu Apr 12 03:30:54 2018 by root via cibadmin on lingcod
>
>
> 3 nodes configured
>
> 2 resources configured
>
>
> Online: [ dentex lingcod taulog ]
>
>
> Full list of resources:
>
>
>  VirtualIP (ocf::heartbeat:IPaddr2): Started taulog
>
>  hdfs-ha (ocf::heartbeat:HDFSHA): FAILED (blocked)[ taulog dentex ]
>
>
> Failed Actions:
>
> * hdfs-ha_stop_0 on taulog 'insufficient privileges' (4): call=12,
> status=complete, exitreason='none',
>
> last-rc-change='Thu Apr 12 03:17:37 2018', queued=0ms, exec=1ms
>
> * hdfs-ha_stop_0 on dentex 'insufficient privileges' (4): call=10,
> status=complete, exitreason='none',
>
> last-rc-change='Thu Apr 12 03:17:43 2018', queued=0ms, exec=1ms
>
>
>
> Daemon Status:
>
>   corosync: active/enabled
>
>   pacemaker: active/enabled
>
>   pcsd: active/enabled
>
> I debugged the resource as follows, and every operation returns 0:
>
> # pcs resource debug-monitor hdfs-ha
>
> Operation monitor for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
>
>  >  stderr: DEBUG: hdfs-ha monitor : 0
>
>
> # pcs resource debug-stop hdfs-ha
>
> Operation stop for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
>
>  >  stderr: DEBUG: hdfs-ha stop : 0
>
>
> # pcs resource debug-start hdfs-ha
>
> Operation start for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0
>
>  >  stderr: DEBUG: hdfs-ha start : 0
>
>
>
> I don't understand what I am doing wrong!
>
>
> Regards,
>
> Bishoy Mikhael
>


-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^


[ClusterLabs] Failing operations immediately when node is known to be down

2018-04-12 Thread Ryan Thomas
I'm trying to implement an HA solution which recovers very quickly when a
node fails.  In my configuration, when I reboot a node, I see in the logs
that pacemaker realizes the node is down and decides to move all resources
to the surviving node.  To do this, it initiates a 'stop' operation on each
of the resources to perform the move.  The 'stop' fails as expected after
20s (the default action timeout).  Since the node is already known to be
down, any operation sent to it is bound to fail, so I'd like to avoid this
20-second delay.  It would be nice if operations sent to a down node failed
immediately, reducing the time it takes the resource to be started on the
surviving node.  I do not want to reduce the operation timeout itself,
because the timeout is sensible when a resource moves for reasons other
than a node failure.  Is there a way to accomplish this?


Thanks for your help.


[ClusterLabs] custom resource agent FAILED (blocked)

2018-04-12 Thread Bishoy Mikhael
Hi All,

I'm trying to create a resource agent to promote a standby HDFS namenode to
active when the virtual IP fails over to another node.

I've taken the skeleton from the Dummy OCF agent.

The modifications I've done to the Dummy agent are as follows:

HDFSHA_start() {
HDFSHA_monitor
if [ $? =  $OCF_SUCCESS ]; then
/opt/hadoop/sbin/hdfs-ha.sh start
return $OCF_SUCCESS
fi
}

HDFSHA_stop() {
HDFSHA_monitor
if [ $? =  $OCF_SUCCESS ]; then
/opt/hadoop/sbin/hdfs-ha.sh stop
fi
return $OCF_SUCCESS
}

HDFSHA_monitor() {
# Monitor _MUST!_ differentiate correctly between running
# (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
# That is THREE states, not just yes/no.
active_nn=$(hdfs haadmin -getAllServiceState | grep active | cut -d":" -f 1)
current_node=$(uname -n)
if [[ ${active_nn} == ${current_node} ]]; then
   return $OCF_SUCCESS
fi
}

HDFSHA_validate() {

return $OCF_SUCCESS
}


I've created the resource as follows:

# pcs resource create hdfs-ha ocf:heartbeat:HDFSHA op monitor interval=30s


The resource fails right away as follows:


# pcs status

Cluster name: hdfs_cluster

Stack: corosync

Current DC: taulog (version 1.1.16-12.el7_4.8-94ff4df) - partition with
quorum

Last updated: Thu Apr 12 03:30:57 2018

Last change: Thu Apr 12 03:30:54 2018 by root via cibadmin on lingcod


3 nodes configured

2 resources configured


Online: [ dentex lingcod taulog ]


Full list of resources:


 VirtualIP (ocf::heartbeat:IPaddr2): Started taulog

 hdfs-ha (ocf::heartbeat:HDFSHA): FAILED (blocked)[ taulog dentex ]


Failed Actions:

* hdfs-ha_stop_0 on taulog 'insufficient privileges' (4): call=12,
status=complete, exitreason='none',

last-rc-change='Thu Apr 12 03:17:37 2018', queued=0ms, exec=1ms

* hdfs-ha_stop_0 on dentex 'insufficient privileges' (4): call=10,
status=complete, exitreason='none',

last-rc-change='Thu Apr 12 03:17:43 2018', queued=0ms, exec=1ms



Daemon Status:

  corosync: active/enabled

  pacemaker: active/enabled

  pcsd: active/enabled

I debugged the resource as follows, and every operation returns 0:

# pcs resource debug-monitor hdfs-ha

Operation monitor for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0

 >  stderr: DEBUG: hdfs-ha monitor : 0


# pcs resource debug-stop hdfs-ha

Operation stop for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0

 >  stderr: DEBUG: hdfs-ha stop : 0


# pcs resource debug-start hdfs-ha

Operation start for hdfs-ha (ocf:heartbeat:HDFSHA) returned 0

 >  stderr: DEBUG: hdfs-ha start : 0



I don't understand what I am doing wrong!


Regards,

Bishoy Mikhael


Re: [ClusterLabs] Corosync 2.4.4 is available at corosync.org!

2018-04-12 Thread Ferenc Wágner
Jan Pokorný  writes:

> On 12/04/18 14:33 +0200, Jan Friesse wrote:
>
>> This release contains a lot of fixes, including fix for
>> CVE-2018-1084.
>
> Security related updates would preferably provide more context

Absolutely, thanks for providing that!  Looking at the git log, I wonder
if c139255 (totemsrp: Implement sanity checks of received msgs) has
direct security relevance as well.  Should I include that too in the
Debian security update?  Debian stable has 2.4.2, so I'm cherry picking
into that version.
-- 
Thanks,
Feri


Re: [ClusterLabs] Re: No slave is promoted to be master

2018-04-12 Thread Ken Gaillot
On Thu, 2018-04-12 at 07:29 +, 范国腾 wrote:
> Hello,
>  
> We use the following command to create the cluster. Node2 is always
> the master when the cluster starts. Why does pacemaker not select
> node1 as the default master?
> How to configure if we want node1 to be the default master?

You can specify a location constraint giving pgsqld's master role a
positive score (not INFINITY) on node1, or a negative score (not
-INFINITY) on node2. Using a non-infinite score in a constraint tells
pacemaker that it's a preference, but not a requirement.
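
For example, a hedged sketch using the pgsql-ha resource name from the
original post (the score of 50 and the #uname rule expression are just
illustrative):

# mild preference, not a requirement: the master role prefers node1
pcs constraint location pgsql-ha rule role=master score=50 \#uname eq node1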

However, there's rarely a good reason to do that. In HA, the best
practice is that all nodes should be completely interchangeable, so
that the service can run equally on any node (since it might have to,
in a failure scenario). Such constraints can be useful temporarily,
e.g. if you need to upgrade the software on one node, or if one node is
underperforming (perhaps it's waiting on a hardware upgrade to come in,
or running some one-time job consuming a lot of CPU).
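
In the temporary case, the usual pattern is to let pcs create and later
remove the constraint for you (sketch only, same resource name as above):

# force the master role onto node1 for the duration of the maintenance
pcs resource move pgsql-ha node1 --master
# afterwards, drop the constraint that the move created
pcs resource clear pgsql-ha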

>  
> pcs cluster setup --name cluster_pgsql node1 node2
> pcs resource create pgsqld ocf:heartbeat:pgsqlms
> bindir=/usr/local/pgsql/bin pgdata=/home/postgres/data op start
> timeout=600s op stop timeout=60s op promote timeout=300s op demote
> timeout=120s op monitor interval=15s timeout=100s role="Master" op
> monitor interval=16s timeout=100s role="Slave" op notify
> timeout=60s;pcs resource master pgsql-ha pgsqld notify=true
> interleave=true;
>  
>  
> Sometimes it reports the following error, how to configure to avoid
> it?
-- 
Ken Gaillot 


Re: [ClusterLabs] Corosync 2.4.4 is available at corosync.org!

2018-04-12 Thread Jan Pokorný
On 12/04/18 14:33 +0200, Jan Friesse wrote:
> I am pleased to announce the latest maintenance release of Corosync
> 2.4.4 available immediately from our website at
> http://build.clusterlabs.org/corosync/releases/.
> 
> This release contains a lot of fixes, including fix for CVE-2018-1084.

Security-related updates would preferably provide more context
as a cue for users to evaluate the urgency of applying the update
(or a particular patch, as denoted below) and/or to consider the
risks involved.

That being said, there was this announcement at the oss-security list
earlier today: http://www.openwall.com/lists/oss-security/2018/04/12/2
from which I quote:

  An integer overflow leading to an out-of-bound read was found
  in authenticate_nss_2_3() in Corosync. An attacker could craft
  a malicious packet that would lead to a denial of service.

> Complete changelog for 2.4.4:
> 
> [...]
> 
>   totemcrypto: Check length of the packet

-- 
Poki




[ClusterLabs] Corosync 2.4.4 is available at corosync.org!

2018-04-12 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
2.4.4 available immediately from our website at
http://build.clusterlabs.org/corosync/releases/.

This release contains a lot of fixes, including fix for CVE-2018-1084.

Complete changelog for 2.4.4:

Andrey Ter-Zakhariants (1):
  corosync-notifyd: improve error handling

Bin Liu (7):
  man:fix in corosync-qdevice.8
  quorumtool: remove duplicated help message
  cfg: nodeid should be unsigned int
  coroparse: Use readdir instead of readdir_r
  wd: fix snprintf warnings
  Fix compile errors in qdevice on FreeBSD
  qdevice: mv free(str) after port validation

Ferenc Wágner (6):
  Fix various typos
  Fix typo: recomended -> recommended
  man: support SOURCE_DATE_EPOCH
  configure: add --with-initconfigdir option
  Use static case blocks to determine distro flavor
  Use RuntimeDirectory instead of tmpfiles.d

Jan Friesse (29):
  coroparse: Do not convert empty uid, gid to 0
  sam: Fix snprintf compiler warnings
  quorumtool: Use full buffer size in snprintf
  man: Add note about qdevice parallel cmds start
  sync: Remove unneeded determine sync code
  sync: Call sync_init of all services at once
  corosync.conf: publicize nodelist.node.name
  totemudp[u]: Drop truncated packets on receive
  logging: Make blackbox configurable
  logging: Close before and open blackbox after fork
  init: Quote subshell result properly
  blackbox: Quote subshell result properly
  qdevice: quote certutils scripts properly
  sam_test_agent: Remove unused assignment
  qdevice: Fix NULL pointer dereference
  quorumtool: Don't set our_flags without v_handle
  qdevice: Nodelist is set into string not array
  qdevice: Check if user_data can be dereferenced
  qdevice: Add safer wrapper of strtoll
  qdevice: Replace strtol by strtonum
  qnetd: Replace strtol by strtonum
  main: Set errno before calling of strtol
  totemcrypto: Implement bad crypto header guess
  cpg: Use list_del instead of qb_list_del
  totemcrypto: Check length of the packet
  totemsrp: Implement sanity checks of received msgs
  totemsrp: Check join and leave msg length
  totemudp: Check lenght of message to sent
  qdevice msgio: Fix reading of msg longer than i32

Jan Pokorný (3):
  logsys: Avoid redundant callsite section checking
  man: corosync-qdevice: fix formatting vs. punctuation
  man: corosync-qdevice: some more stylistics

Jonathan Davies (1):
  man: fix cpg_mcast_joined.3.in

Rytis Karpuška (5):
  libcpg: Fix issue with partial big packet assembly
  totempg: Fix fragmentation segfault
  totempg: use iovec[i].iov_len instead of copy_len
  totempg: Fix corrupted messages
  cpg: Handle fragmented message sending interrupt

Toki Winter (1):
  corosync.aug: Add missing options

yuusuke (1):
  systemd: Delete unnecessary soft_margin

Upgrade is (as usual) highly recommended.

Thanks and congratulations to all the people who contributed to achieving this
great milestone.



Re: [ClusterLabs] No slave is promoted to be master

2018-04-12 Thread Jehan-Guillaume de Rorthais
Hi,
On Thu, 12 Apr 2018 08:31:39 +
范国腾  wrote:

> Thank you very much for helping check this issue. The information is in the
> attachment.
>
> I restarted the cluster after I sent my first email. Not sure if that
> affects the "crm_simulate -sL" result you asked for.

It does...

Could you please provide files
from /var/lib/pacemaker/pengine/pe-input-2039.bz2 to  pe-input-2065.bz2 ?
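
(For readers following along: such pe-input files can be replayed offline with
something like the following, which also prints the allocation scores the
policy engine computed:)

crm_simulate -S -s -x /var/lib/pacemaker/pengine/pe-input-2039.bz2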

[...]
> Then the master is restarted and it could not start (that is OK and we know
> the reason).

Why couldn't it start ?


Re: [ClusterLabs] Pacemake/Corosync good fit for embedded product?

2018-04-12 Thread Klaus Wenninger
On 04/12/2018 04:37 AM, David Hunt wrote:
> Thanks Guys,
>
> Ideally I would like to have event driven (rather than slower polled)
> inputs into pacemaker to quickly trigger the failover. I assume
> adding event driven inputs to pacemaker isn't straightforward? If it
> was possible to add event inputs to pacemaker is pacemaker itself fast
> enough? Or is it also going to be relatively slow to switch?

I'm not aware of any systematic delays on top of what we discussed.
The time the rule-engine will need to calculate a transition will
of course depend on the complexity of your cluster and the
CPU-power you have available.

I've mentioned the delay of the DC reelection but what you
might consider as well in your calculations is fencing.
If you are using physical fencing-devices it depends on how
quickly these will react and give feedback. You might be able
to boost that by e.g. already keeping a control connection
to a fencing device open.

Regards,
Klaus

>
> It would seem, based on this discussion, that it may still work to
> use pacemaker & corosync for the initial setup and for services which can
> tolerate a slower switch-over time. For our services that require a much
> faster switch-over time it would appear we need something proprietary.
>
> Regards
> David
>
> On 12 April 2018 at 02:56, Klaus Wenninger wrote:
>
> On 04/11/2018 10:44 AM, Jan Friesse wrote:
> > David,
> >
> >> Hi,
> >>
> >> We are planning on creating a HA product in an active/standby
> >> configuration
> >> whereby the standby unit needs to take over from the active
> unit very
> >> fast
> >> (<50ms including all services restored).
> >>
> >> We are able to do very fast signaling (say 1000Hz) between the two
> >> units to
> >> detect failures so detecting a failure isn't really an issue.
> >>
> >> Pacemaker looks to be a very useful piece of software for managing
> >> resources so rather than roll our own it would make sense to reuse
> >> pacemaker.
> >>
> >> So my initial questions are:
> >>
> >>     1. Do people think pacemaker is the right thing to use?
> Everything I
> >>     read seem to be talking about multiple seconds for failure
> >> detection etc.
> >>     Feature wise it looks pretty similar to what we would want.
> >>     2. Has anyone done anything similar to this?
> >>     3. Any pointers on where/how to add additional failure
> detection
> >> inputs
> >>     to pacemaker?
> >>     4.
> >>     5. For a new design would you go with pacemaker+corosync,
> >>     pacemaker+corosync+knet or something different?
> >>
> >
> >
> > I will just share my point of view about Corosync side.
> >
> > Corosync is using it's own mechanism for detecting failure, based on
> > token rotation. The default timeout for detecting loss of the token is 1
> > second, so detecting a failure takes far more than 50ms. It can be
> > lowered, but that is not really tested.
> >
> > That means it's not currently possible to use different signaling
> > mechanism without significant Corosync change.
> >
> > So I don't think Corosync can be really used for described scenario.
> >
> > Honza
>
> On the other hand if a fail-over is triggered by losing a node or
> anything
> that is being detected by corosync this is probably already the
> fast-path
> in a pacemaker-cluster.
>
> Detection of other types of failures (like a resource failing on
> an otherwise functional node) is probably even way slower.
> When a failure is detected by corosync, pacemaker has some kind of
> an event driven way to react on that.
> We even have to add some delay to the mere corosync detection time
> mentioned by Honza as pacemaker will have to run e.g. a selection
> cycle for the designated coordinator to be able to do decisions again.
>
> For other failures the base principle is rather probing a resource
> at a
> fixed rate (multiple seconds usually) for detection of failures
> instead
> of an event-driven mechanism.
> There might be trickery possible though using attributes to achieve
> event-driven-like reaction on certain failures. But I haven't done
> anything concrete to exploit these possibilities. Others might have
> more info (which I personally would be interested in as well ;-) ).
>
> Approaches to realize event-driven mechanisms for resource-failure-
> detection are under investigation/development (systemd-resources,
> IP resources sitting on interfaces, ...) but afaik there is nothing
> available out of the box by now.
>
> Having that all said I can add some personal experiences from
> having implemented an embedded product based on a
> pacemaker-cluster myself in the past:
>
> As reaction time based on 
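
To make the attribute-based "trickery" mentioned above a little more
concrete, a rough sketch using standard tooling (the resource name,
attribute name and values are hypothetical):

# an external, event-driven health checker flags a failure on the local node
attrd_updater -n health_state -U down

# a location rule reacts as soon as the attribute changes,
# without waiting for the next polled monitor
pcs constraint location my-resource rule score=-INFINITY health_state eq down

# clear the flag again once the node is healthy
attrd_updater -n health_state -U up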

Re: [ClusterLabs] No slave is promoted to be master

2018-04-12 Thread Jehan-Guillaume de Rorthais
Hi,
On Thu, 12 Apr 2018 03:14:52 +
范国腾  wrote:
> We have three nodes in the cluster. When the master postgres resource on one
> node (db1) crashed and could not start any more, we hoped that one of the slave
> nodes (db2, db3) would be promoted to master. But that does not happen.
> 
> [cluster status screenshot attached as an inline image; not preserved in the archive]
> 
> Here is the log
> 
> db1, db2, db3: [log screenshots attached as inline images; not preserved in the archive]

Could you please provide:

* your full logs from all nodes as textual (compressed) files?
* the full setup of the cluster
* the result of "crm_simulate -sL"

Regards,


[ClusterLabs] Re: No slave is promoted to be master

2018-04-12 Thread 范国腾
Hello,

We use the following commands to create the cluster. Node2 is always the master
when the cluster starts. Why does pacemaker not select node1 as the default
master? How do we configure the cluster if we want node1 to be the default
master?

pcs cluster setup --name cluster_pgsql node1 node2

pcs resource create pgsqld ocf:heartbeat:pgsqlms \
    bindir=/usr/local/pgsql/bin pgdata=/home/postgres/data \
    op start timeout=600s op stop timeout=60s \
    op promote timeout=300s op demote timeout=120s \
    op monitor interval=15s timeout=100s role="Master" \
    op monitor interval=16s timeout=100s role="Slave" \
    op notify timeout=60s

pcs resource master pgsql-ha pgsqld notify=true interleave=true


Sometimes it reports the following error; how can we configure the cluster to avoid it?
[error screenshot attached as an inline image; not preserved in the archive]


