Re: [ClusterLabs] "VirtualDomain is active on 2 nodes" due to transient network failure

2016-08-30 Thread Andreas Kurz
Hi,

On Tue, Aug 30, 2016 at 10:03 PM, Scott Greenlese 
wrote:

> Added an appropriate subject line (was blank). Thanks...
>
>
> Scott Greenlese ... IBM z/BX Solutions Test, Poughkeepsie, N.Y.
> INTERNET: swgre...@us.ibm.com
> PHONE: 8/293-7301 (845-433-7301) M/S: POK 42HA/P966
>
> - Forwarded by Scott Greenlese/Poughkeepsie/IBM on 08/30/2016 03:59 PM
> -
>
> From: Scott Greenlese/Poughkeepsie/IBM@IBMUS
> To: Cluster Labs - All topics related to open-source clustering welcomed <
> users@clusterlabs.org>
> Date: 08/29/2016 06:36 PM
> Subject: [ClusterLabs] (no subject)
> --
>
>
>
> Hi folks,
>
> I'm assigned to system test Pacemaker/Corosync on the KVM on System Z
> platform
> with pacemaker-1.1.13-10 and corosync-2.3.4-7.
>

It would be good to see your full cluster configuration (corosync.conf and
cib) - but a first guess is: no fencing at all, and what is your
"no-quorum-policy" in Pacemaker?

Regards,
Andreas


> [...]

[ClusterLabs] "VirtualDomain is active on 2 nodes" due to transient network failure

2016-08-30 Thread Scott Greenlese


Added an appropriate subject line (was blank).  Thanks...


Scott Greenlese ... IBM z/BX Solutions Test,  Poughkeepsie, N.Y.
  INTERNET:  swgre...@us.ibm.com
  PHONE:  8/293-7301 (845-433-7301)M/S:  POK 42HA/P966

- Forwarded by Scott Greenlese/Poughkeepsie/IBM on 08/30/2016 03:59 PM
-

From:   Scott Greenlese/Poughkeepsie/IBM@IBMUS
To: Cluster Labs - All topics related to open-source clustering
welcomed 
Date:   08/29/2016 06:36 PM
Subject:[ClusterLabs] (no subject)



Hi folks,

I'm assigned to system test Pacemaker/Corosync on the KVM on System Z
platform
with pacemaker-1.1.13-10 and corosync-2.3.4-7.

I have a cluster with 5 KVM hosts, and a total of 200
ocf:heartbeat:VirtualDomain resources defined to run
across the 5 cluster nodes (symmetrical is true for this cluster).

The heartbeat network communicates over vlan1293, which hangs off a
network device, 0230.

In general, pacemaker does a good job of distributing my virtual guest
resources evenly across the hypervisors
in the cluster. These resources are a mixed bag:

- "opaque" and remote "guest nodes" managed by the cluster.
- allow-migrate=false and allow-migrate=true
- qcow2 (file based) guests and LUN based guests
- SLES and Ubuntu OS

[root@zs95kj ]# pcs status |less
Cluster name: test_cluster_2
Last updated: Mon Aug 29 17:02:08 2016 Last change: Mon Aug 29 16:37:31
2016 by root via crm_resource on zs93kjpcs1
Stack: corosync
Current DC: zs95kjpcs1 (version 1.1.13-10.el7_2.ibm.1-44eb2dd) - partition
with quorum
103 nodes and 300 resources configured

Node zs90kppcs1: standby
Online: [ zs93KLpcs1 zs93kjpcs1 zs95KLpcs1 zs95kjpcs1 ]

This morning, our system admin team performed a
"non-disruptive" (concurrent) microcode load on the OSA, which
(to our surprise) dropped the network connection for 13 seconds on the S93
CEC, from 11:18:34am to 11:18:47am, to be exact.
This temporary outage caused the two cluster nodes on S93 (zs93kjpcs1 and
zs93KLpcs1) to drop out of the cluster,
as expected.

However, pacemaker didn't handle this too well. The end result was numerous
VirtualDomain resources in FAILED state:

[root@zs95kj log]# date;pcs status |grep VirtualD |grep zs93 |grep FAILED
Mon Aug 29 12:33:32 EDT 2016
zs95kjg110104_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110092_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
zs95kjg110099_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110102_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110106_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
zs95kjg110112_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110115_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110118_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
zs95kjg110124_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110127_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110130_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
zs95kjg110136_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110139_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110142_res (ocf::heartbeat:VirtualDomain): FAILED zs93KLpcs1
zs95kjg110148_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110152_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110155_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110161_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110164_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110167_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110173_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110176_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110179_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg110185_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1
zs95kjg109106_res (ocf::heartbeat:VirtualDomain): FAILED zs93kjpcs1


As well as, several VirtualDomain resources showing "Started" on two
cluster nodes:

zs95kjg110079_res (ocf::heartbeat:VirtualDomain): Started[ zs93kjpcs1
zs93KLpcs1 ]
zs95kjg110108_res (ocf::heartbeat:VirtualDomain): Started[ zs93kjpcs1
zs93KLpcs1 ]
zs95kjg110186_res (ocf::heartbeat:VirtualDomain): Started[ zs93kjpcs1
zs93KLpcs1 ]
zs95kjg110188_res (ocf::heartbeat:VirtualDomain): Started[ zs93kjpcs1
zs93KLpcs1 ]
zs95kjg110198_res (ocf::heartbeat:VirtualDomain): Started[ zs93kjpcs1
zs93KLpcs1 ]


The virtual machines themselves were, in fact, "running" on both hosts. For
example:

[root@zs93kl ~]# virsh list |grep zs95kjg110079
70 zs95kjg110079 running

[root@zs93kj cli]# virsh list |grep zs95kjg110079
18 zs95kjg110079 running


On this particular VM, there was file corruption of this file-based qcow2
guest's image, such that you could not ping or ssh to it,
and if you opened a virsh console, you got an "initramfs" prompt.

To recover, we had to mount the volume on another VM and then run fsck to
recover it.
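
(For the archives, an alternative sketch of the same offline repair done from
the KVM host instead of another VM; the image path is hypothetical and we
assume the guest's root filesystem is the first partition:

modprobe nbd max_part=8
qemu-nbd --connect=/dev/nbd0 /guests/zs95kjg110079.qcow2   # illustrative path
fsck -y /dev/nbd0p1          # repair the guest's root filesystem offline
qemu-nbd --disconnect /dev/nbd0
)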

I walked 

Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Dimitri Maziuk
On 08/30/2016 11:15 AM, Dejan Muhamedagic wrote:

> I suppose that it is explained in enough detail here:
> 
> https://en.wikipedia.org/wiki/Shebang_(Unix)

I expect you're being deliberately obtuse.

It does not explain which program loader interprets line 1 of findif.sh
("#!/bin/sh") when it is invoked from line 69 of the IPaddr2 RA:

. ${OCF_FUNCTIONS_DIR}/findif.sh

https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/IPaddr2
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/findif.sh

Similarly, I have not read the code so I don't know who invokes IPaddr2
and how exactly they do it. If you tell me it makes the kernel look at
the magic number and spawn whatever shell's specified there, I believe you.
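
A quick way to see who wins (a sketch; the /tmp path and dash are my
assumptions, nothing from the agents themselves):

cat > /tmp/shebang_test.sh <<'EOF'
#!/bin/dash
# print the name of the interpreter actually running this file
ps -p $$ -o comm=
EOF
chmod +x /tmp/shebang_test.sh

/tmp/shebang_test.sh             # direct exec: kernel honors the shebang -> dash
bash /tmp/shebang_test.sh        # explicit interpreter: shebang is a comment -> bash
bash -c '. /tmp/shebang_test.sh' # sourced: runs in the calling shell -> bash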

> As already mentioned elsewhere in the thread, local is supported
> in most shell implementations and without it we otherwise
> wouldn't be able to maintain the software. Not sure where local
> originates, but wouldn't bet that it's bash.

Well 2 out of 3 is "most", can't argue with that.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu





Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Lars Ellenberg
On Tue, Aug 30, 2016 at 06:15:49PM +0200, Dejan Muhamedagic wrote:
> On Tue, Aug 30, 2016 at 10:08:00AM -0500, Dmitri Maziuk wrote:
> > On 2016-08-30 03:44, Dejan Muhamedagic wrote:
> > 
> > >The kernel reads the shebang line and it is what defines the
> > >interpreter which is to be invoked to run the script.
> > 
> > Yes, and does the kernel read when the script is source'd or executed via
> > any of the mechanisms that have the executable specified in the call,
> > explicitly or implicitly?
> 
> I suppose that it is explained in enough detail here:
> 
> https://en.wikipedia.org/wiki/Shebang_(Unix)
> 
> In particular:
> 
> https://en.wikipedia.org/wiki/Shebang_(Unix)#Magic_number
> 
> > >None of the /bin/sh RAs requires bash.
> > 
> > Yeah, only "local".
> 
> As already mentioned elsewhere in the thread, local is supported
> in most shell implementations and without it we otherwise
> wouldn't be able to maintain the software. Not sure where local
> originates, but wouldn't bet that it's bash.

Let's just agree that as currently implemented,
our collection of /bin/sh scripts won't run on ksh as shipped with
solaris (while there likely are ksh derivatives in *BSD somewhere
that would be mostly fine with them).

And before this turns even more into a "yes, I'm that old, too" thread,
may I suggest we document that we expect a
"dash-compatible" /bin/sh, and that we expect scripts
to have a bash shebang (or as appropriate) if they go beyond that.

Then check for incompatible shells in ocf-shellfuncs,
and just exit early if we detect incompatibilities.

For a suggestion on checking for a proper "local" see below.
(Add more checks later, if someone feels like it.)

Though, if someone rewrites not the current agents, but the "lib/ocf*"
helper stuff to be sourced by shell-based agents in a way that would
support RAs in all bash, dash, ash, ksh, whatever,
and the result turns out not too much worse than what we have now,
I'd have no problem with that...

Cheers,

Lars


And for the "typeset" crowd,
if you think s/local/typeset/ was all that was necessary
to support function local variables in ksh, think again:

ksh -c '
function a {
echo "start of a: x=$x"
typeset x=a
echo "before b: x=$x"
b
echo "end of a: x=$x"
}
function b {
echo "start of b: x=$x ### HAHA guess this one was unexpected 
to all but ksh users"
typeset x=b
echo "end of b: x=$x"
}
x=x
echo "before a: x=$x"
a
echo "after a: x=$x"
'

Try the same with bash.
Also remember that sometimes we set a "local" variable in a function
and expect it to be visible in nested functions, but also set a new
value in a nested function and expect that value to be reflected
in the outer scope (up to the last "local").
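
For comparison, a sketch of the same experiment in bash (not from the
original mail): bash's "local" is dynamically scoped, so b does see a's x:

bash -c '
a() {
	echo "start of a: x=$x"
	local x=a
	echo "before b: x=$x"
	b
	echo "end of a: x=$x"
}
b() {
	echo "start of b: x=$x ### bash prints x=a here: dynamic scoping"
	local x=b
	echo "end of b: x=$x"
}
x=x
echo "before a: x=$x"
a
echo "after a: x=$x"
'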




diff --git a/heartbeat/ocf-shellfuncs.in b/heartbeat/ocf-shellfuncs.in
index 6d9669d..4151630 100644
--- a/heartbeat/ocf-shellfuncs.in
+++ b/heartbeat/ocf-shellfuncs.in
@@ -920,3 +920,37 @@ ocf_is_true "$OCF_TRACE_RA" && ocf_start_trace
 if ocf_is_true "$HA_use_logd"; then
: ${HA_LOGD:=yes}
 fi
+
+# We use a lot of function local variables with the "local" keyword.
+# Which works fine with dash and bash,
+# but does not work with e.g. ksh.
+# Fail cleanly with a sane error message,
+# if the current shell does not feel compatible.
+
+__ocf_check_for_incompatible_shell_l2()
+{
+   [ $__ocf_check_for_incompatible_shell_k = v1 ] || return 1
+   local __ocf_check_for_incompatible_shell_k=v2
+   [ $__ocf_check_for_incompatible_shell_k = v2 ] || return 1
+   return 0
+}
+
+__ocf_check_for_incompatible_shell_l1()
+{
+   [ $__ocf_check_for_incompatible_shell_k = v0 ] || return 1
+   local __ocf_check_for_incompatible_shell_k=v1
+   __ocf_check_for_incompatible_shell_l2
+   [ $__ocf_check_for_incompatible_shell_k = v1 ] || return 1
+   return 0
+}
+
+__ocf_check_for_incompatible_shell()
+{
+   local __ocf_check_for_incompatible_shell_k=v0
+   __ocf_check_for_incompatible_shell_l1
+   [ $__ocf_check_for_incompatible_shell_k = v0 ] && return 0
+   ocf_exit_reason "Current shell seems to be incompatible. We suggest dash or bash (compatible)."
+   exit $OCF_ERR_GENERIC
+}
+
+__ocf_check_for_incompatible_shell
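
The guard can be exercised standalone, too (a minimal sketch of the same
technique, not the shipped file):

check_local() {
	# a shell without "local" fails here; error output is suppressed
	local _k=v1 2>/dev/null || return 1
	[ "$_k" = v1 ] || return 1
}
if check_local; then echo "local works here"; else echo "no usable local"; fi

dash and bash print "local works here"; a ksh93 without "local" should take
the other branch.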




Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Dejan Muhamedagic
On Tue, Aug 30, 2016 at 10:08:00AM -0500, Dmitri Maziuk wrote:
> On 2016-08-30 03:44, Dejan Muhamedagic wrote:
> 
> >The kernel reads the shebang line and it is what defines the
> >interpreter which is to be invoked to run the script.
> 
> Yes, and does the kernel read when the script is source'd or executed via
> any of the mechanisms that have the executable specified in the call,
> explicitly or implicitly?

I suppose that it is explained in enough detail here:

https://en.wikipedia.org/wiki/Shebang_(Unix)

In particular:

https://en.wikipedia.org/wiki/Shebang_(Unix)#Magic_number

> >None of the /bin/sh RAs requires bash.
> 
> Yeah, only "local".

As already mentioned elsewhere in the thread, local is supported
in most shell implementations and without it we otherwise
wouldn't be able to maintain the software. Not sure where local
originates, but wouldn't bet that it's bash.

Thanks,

Dejan

> Dima
> 
> 


[ClusterLabs] Service pacemaker start kills my cluster and other NFS HA issues

2016-08-30 Thread Pablo Pines Leon
Hello,

I have set up a DRBD-Corosync-Pacemaker cluster following the instructions from
https://wiki.ubuntu.com/ClusterStack/Natty, adapting them to CentOS 7 (e.g.
using systemd). After testing it in virtual machines it seemed to be working
fine, so it is now implemented on physical machines, and I have noticed that
the failover works fine as long as I kill the master by pulling the AC cable,
but not if I issue the halt, reboot or shutdown commands, which leaves the
cluster in a situation like this:

Last updated: Tue Aug 30 16:55:58 2016  Last change: Tue Aug 23 
11:49:43 2016 by hacluster via crmd on nfsha2
Stack: corosync
Current DC: nfsha2 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 9 resources configured

Online: [ nfsha1 nfsha2 ]

 Master/Slave Set: ms_drbd_export [res_drbd_export]
     Masters: [ nfsha2 ]
     Slaves: [ nfsha1 ]
 Resource Group: rg_export
     res_fs (ocf::heartbeat:Filesystem): Started nfsha2
     res_exportfs_export1 (ocf::heartbeat:exportfs): FAILED nfsha2 (unmanaged)
     res_ip (ocf::heartbeat:IPaddr2): Stopped
 Clone Set: cl_nfsserver [res_nfsserver]
     Started: [ nfsha1 ]
 Clone Set: cl_exportfs_root [res_exportfs_root]
     res_exportfs_root (ocf::heartbeat:exportfs): FAILED nfsha2
     Started: [ nfsha1 ]

Migration Summary:
* Node 2:
   res_exportfs_export1: migration-threshold=100 fail-count=100 last-failure='Tue Aug 30 16:55:50 2016'
   res_exportfs_root: migration-threshold=100 fail-count=1 last-failure='Tue Aug 30 16:55:48 2016'
* Node 1:

Failed Actions:
* res_exportfs_export1_stop_0 on nfsha2 'unknown error' (1): call=134,
status=Timed Out, exitreason='none',
last-rc-change='Tue Aug 30 16:55:30 2016', queued=0ms, exec=20001ms
* res_exportfs_root_monitor_3 on nfsha2 'not running' (7): call=126,
status=complete, exitreason='none',
last-rc-change='Tue Aug 30 16:55:48 2016', queued=0ms, exec=0ms

This of course blocks it, because the IP and the NFS exports are down. It 
doesn't even recognize that the other node is down. I am then forced to do 
"crm_resource -P" to get it back to a working state.

Even when unplugging the master and booting it up again, trying to get it back
into the cluster by executing "service pacemaker start" on the node that was
unplugged will sometimes just cause the exportfs_root resource on the slave to
fail (but the service stays up):

 Master/Slave Set: ms_drbd_export [res_drbd_export]
     Masters: [ nfsha1 ]
     Slaves: [ nfsha2 ]
 Resource Group: rg_export
     res_fs (ocf::heartbeat:Filesystem): Started nfsha1
     res_exportfs_export1 (ocf::heartbeat:exportfs): Started nfsha1
     res_ip (ocf::heartbeat:IPaddr2): Started nfsha1
 Clone Set: cl_nfsserver [res_nfsserver]
     Started: [ nfsha1 nfsha2 ]
 Clone Set: cl_exportfs_root [res_exportfs_root]
     Started: [ nfsha1 nfsha2 ]

Migration Summary:
* Node nfsha2:
   res_exportfs_root: migration-threshold=100 fail-count=1 last-failure='Tue Aug 30 17:18:17 2016'
* Node nfsha1:

Failed Actions:
* res_exportfs_root_monitor_3 on nfsha2 'not running' (7): call=34,
status=complete, exitreason='none',
last-rc-change='Tue Aug 30 17:18:17 2016', queued=0ms, exec=33ms

BTW I notice that the node attributes are changed:

Node Attributes:
* Node nfsha1:
+ master-res_drbd_export: 1
* Node nfsha2:
+ master-res_drbd_export: 1000

Usually both would have the same weight (1), so running "crm_resource -P" 
restores that.

Some other times it will instead cause a service disruption:

Online: [ nfsha1 nfsha2 ]

 Master/Slave Set: ms_drbd_export [res_drbd_export]
     Masters: [ nfsha2 ]
     Slaves: [ nfsha1 ]
 Resource Group: rg_export
     res_fs (ocf::heartbeat:Filesystem): Started nfsha2
     res_exportfs_export1 (ocf::heartbeat:exportfs): FAILED (unmanaged) [ nfsha2 nfsha1 ]
     res_ip (ocf::heartbeat:IPaddr2): Stopped
 Clone Set: cl_nfsserver [res_nfsserver]
     Started: [ nfsha1 nfsha2 ]
 Clone Set: cl_exportfs_root [res_exportfs_root]
     Started: [ nfsha1 nfsha2 ]

Migration Summary:
* Node nfsha2:
   res_exportfs_export1: migration-threshold=100 fail-count=100 last-failure='Tue Aug 30 17:31:01 2016'
* Node nfsha1:
   res_exportfs_export1: migration-threshold=100 fail-count=100 last-failure='Tue Aug 30 17:31:01 2016'
   res_exportfs_root: migration-threshold=100 fail-count=1 last-failure='Tue Aug 30 17:31:11 2016'

Failed Actions:
* res_exportfs_export1_stop_0 on nfsha2 'unknown error' (1): call=86,
status=Timed Out, exitreason='none',
last-rc-change='Tue Aug 30 17:30:41 2016', queued=0ms, exec=20002ms
* res_exportfs_export1_stop_0 on nfsha1 'unknown error' (1): call=32,
status=Timed Out, exitreason='none',
last-rc-change='Tue Aug 30 17:30:41 2016', queued=0ms, exec=20002ms
* res_exportfs_root_monitor_3 on nfsha1 'not running' (7): call=29, 

Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Dmitri Maziuk

On 2016-08-30 03:44, Dejan Muhamedagic wrote:


The kernel reads the shebang line and it is what defines the
interpreter which is to be invoked to run the script.


Yes, and does the kernel read when the script is source'd or executed 
via any of the mechanisms that have the executable specified in the 
call, explicitly or implicitly?



None of the /bin/sh RAs requires bash.


Yeah, only "local".

Dima




Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Gabriele Bulfon
illumos (and Solaris 11) delivers ksh93, which is fully Bourne compatible, but
not with the bash extension of "local" variables, which is not Bourne shell.
The equivalent is supported in ksh93 with the "typeset" builtin instead of "local".
"local" is what is used inside the "ocf-*" scripts.
Gabriele

Sonicle S.r.l.
:
http://www.sonicle.com
Music:
http://www.gabrielebulfon.com
Quantum Mechanics :
http://www.cdbaby.com/cd/gabrielebulfon
--
From: Dejan Muhamedagic
To: gbul...@sonicle.com Cluster Labs - All topics related to open-source
clustering welcomed
Date: 30 August 2016 12.20.19 CEST
Subject: Re: [ClusterLabs] ocf scripts shell and local variables
Hi,
On Mon, Aug 29, 2016 at 05:08:35PM +0200, Gabriele Bulfon wrote:
Sure, in fact I can change all shebangs to point to /bin/bash and it's OK.
The question is about current shebang /bin/sh which may go into trouble (as if 
one would point to a generic python but uses many specific features of a 
version of python).
Also, the question is about bash being a good option for RAs, being much more 
heavy.
I'd really suggest installing a smaller shell such as /bin/dash
and using that as /bin/sh. Isn't there a Bourne shell in Solaris?
If you modify the RAs it could be trouble on subsequent updates.
Thanks,
Dejan
Gabriele

Sonicle S.r.l.
:
http://www.sonicle.com
Music:
http://www.gabrielebulfon.com
Quantum Mechanics :
http://www.cdbaby.com/cd/gabrielebulfon
--
From: Dejan Muhamedagic
To: kgail...@redhat.com Cluster Labs - All topics related to open-source
clustering welcomed
Date: 29 August 2016 16.43.52 CEST
Subject: Re: [ClusterLabs] ocf scripts shell and local variables
Hi,
On Mon, Aug 29, 2016 at 08:47:43AM -0500, Ken Gaillot wrote:
On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:
Hi Ken,
I have been talking with the illumos guys about the shell problem.
They all agreed that ksh (and especially the ksh93 used in illumos) is
absolutely Bourne-compatible, and that the "local" variables used in the
ocf shells are not Bourne syntax, but probably bash-specific.
This means that pointing the scripts to "#!/bin/sh" is portable as long
as the scripts are really Bourne-shell only syntax, as any Unix variant
may link whatever Bourne-shell they like.
In this case, it should point to "#!/bin/bash" or whatever shell the
script was written for.
Also, in this case, the starting point is not the ocf-* script, but the
original RA (IPaddr, but almost all of them).
What about making the code base of RA and ocf-* portable?
It may be just by changing them to point to bash, or with some kind of
configure modifier to be able to specify the shell to use.
Meanwhile, changing the scripts by hands into #!/bin/bash worked like a
charm, and I will start patching.
Gabriele
Interesting, I thought local was POSIX, but it's not. It seems everyone
but Solaris implemented it:
http://stackoverflow.com/questions/18597697/posix-compliant-way-to-scope-variables-to-a-function-in-a-shell-script
Please open an issue at:
https://github.com/ClusterLabs/resource-agents/issues
The simplest solution would be to require #!/bin/bash for all RAs that
use local,
This issue was raised many times, but note that /bin/bash is a
shell not famous for being lean: it's great for interactive use,
but not so great if you need to run a number of scripts. The
complexity in bash, which is superfluous for our use case,
doesn't go well with the basic principles of HA clusters.
but I'm not sure that's fair to the distros that support
local in a non-bash default shell. Another possibility would be to
modify all RAs to avoid local entirely, by using unique variable
prefixes per function.
I doubt that we could write moderately complex shell scripts
without the capability of limiting the variables' scope and retaining
sanity at the same time.
Or, it may be possible to guard every instance of
local with a check for ksh, which would use typeset instead. Raising the
issue will allow some discussion of the possibilities.
Just to mention that this is the first time someone reported
running a shell which doesn't support local. Perhaps there's an
option that they install a shell which does.
Thanks,
Dejan

*Sonicle S.r.l. *: http://www.sonicle.com
*Music: *http://www.gabrielebulfon.com
*Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
--
From: Ken Gaillot
To: gbul...@sonicle.com Cluster Labs - All topics related to open-source
clustering welcomed
Date: 26 August 2016 15.56.02 CEST
Subject: Re: ocf scripts shell and local variables
On 08/26/2016 08:11 AM, Gabriele Bulfon 

Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Gabriele Bulfon
Not the RAs, but the ocf-* scripts do, because of their use of the "local" builtin.

Sonicle S.r.l.
:
http://www.sonicle.com
Music:
http://www.gabrielebulfon.com
Quantum Mechanics :
http://www.cdbaby.com/cd/gabrielebulfon
--
From: Dejan Muhamedagic
To: Cluster Labs - All topics related to open-source clustering welcomed
Date: 30 August 2016 10.44.49 CEST
Subject: Re: [ClusterLabs] ocf scripts shell and local variables
Hi,
On Mon, Aug 29, 2016 at 10:13:18AM -0500, Dmitri Maziuk wrote:
On 2016-08-29 04:06, Gabriele Bulfon wrote:
Thanks, though this does not work :)
Uhm... right. Too many languages, sorry: perl's system() will call the login
shell, system system() uses /bin/sh, and exec()s will run whatever the
programmer tells them to. The point is none of them cares what shell's in
the shebang line AFAIK.
The kernel reads the shebang line and it is what defines the
interpreter which is to be invoked to run the script.
But anyway, you're correct; a lot of linux "shell" scripts are bash-only and
pacemaker RAs are no exception.
None of the /bin/sh RAs requires bash.
Thanks,
Dejan
Dima


Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Dejan Muhamedagic
Hi,

On Mon, Aug 29, 2016 at 05:08:35PM +0200, Gabriele Bulfon wrote:
> Sure, in fact I can change all shebangs to point to /bin/bash and it's OK.
> The question is about current shebang /bin/sh which may go into trouble (as 
> if one would point to a generic python but uses many specific features of a 
> version of python).
> Also, the question is about bash being a good option for RAs, being much more 
> heavy.

I'd really suggest installing a smaller shell such as /bin/dash
and using that as /bin/sh. Isn't there a Bourne shell in Solaris?
If you modify the RAs it could be trouble on subsequent updates.
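
(A rough sketch of that route; the package name is hypothetical and the
paths depend on the illumos distribution:

pkg install shell/dash           # hypothetical package name
mv /usr/bin/sh /usr/bin/sh.orig  # keep the original shell around
ln -s dash /usr/bin/sh           # make /bin/sh resolve to dash
)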

Thanks,

Dejan

> Gabriele
> 
> Sonicle S.r.l.
> :
> http://www.sonicle.com
> Music:
> http://www.gabrielebulfon.com
> Quantum Mechanics :
> http://www.cdbaby.com/cd/gabrielebulfon
> --
> From: Dejan Muhamedagic
> To: kgail...@redhat.com Cluster Labs - All topics related to open-source
> clustering welcomed
> Date: 29 August 2016 16.43.52 CEST
> Subject: Re: [ClusterLabs] ocf scripts shell and local variables
> Hi,
> On Mon, Aug 29, 2016 at 08:47:43AM -0500, Ken Gaillot wrote:
> On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:
> Hi Ken,
> I have been talking with the illumos guys about the shell problem.
> They all agreed that ksh (and especially the ksh93 used in illumos) is
> absolutely Bourne-compatible, and that the "local" variables used in the
> ocf shells are not Bourne syntax, but probably bash-specific.
> This means that pointing the scripts to "#!/bin/sh" is portable as long
> as the scripts are really Bourne-shell only syntax, as any Unix variant
> may link whatever Bourne-shell they like.
> In this case, it should point to "#!/bin/bash" or whatever shell the
> script was written for.
> Also, in this case, the starting point is not the ocf-* script, but the
> original RA (IPaddr, but almost all of them).
> What about making the code base of RA and ocf-* portable?
> It may be just by changing them to point to bash, or with some kind of
> configure modifier to be able to specify the shell to use.
> Meanwhile, changing the scripts by hands into #!/bin/bash worked like a
> charm, and I will start patching.
> Gabriele
> Interesting, I thought local was POSIX, but it's not. It seems everyone
> but Solaris implemented it:
> http://stackoverflow.com/questions/18597697/posix-compliant-way-to-scope-variables-to-a-function-in-a-shell-script
> Please open an issue at:
> https://github.com/ClusterLabs/resource-agents/issues
> The simplest solution would be to require #!/bin/bash for all RAs that
> use local,
> This issue was raised many times, but note that /bin/bash is a
> shell not famous for being lean: it's great for interactive use,
> but not so great if you need to run a number of scripts. The
> complexity in bash, which is superfluous for our use case,
> doesn't go well with the basic principles of HA clusters.
> but I'm not sure that's fair to the distros that support
> local in a non-bash default shell. Another possibility would be to
> modify all RAs to avoid local entirely, by using unique variable
> prefixes per function.
> I doubt that we could write moderately complex shell scripts
> without the capability of limiting the variables' scope and retaining
> sanity at the same time.
> Or, it may be possible to guard every instance of
> local with a check for ksh, which would use typeset instead. Raising the
> issue will allow some discussion of the possibilities.
> Just to mention that this is the first time someone reported
> running a shell which doesn't support local. Perhaps there's an
> option that they install a shell which does.
> Thanks,
> Dejan
> 
> *Sonicle S.r.l. *: http://www.sonicle.com
> *Music: *http://www.gabrielebulfon.com
> *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
> --
> From: Ken Gaillot
> To: gbul...@sonicle.com Cluster Labs - All topics related to open-source
> clustering welcomed
> Date: 26 August 2016 15.56.02 CEST
> Subject: Re: ocf scripts shell and local variables
> On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
> I tried adding some debug in ocf-shellfuncs, showing env and ps
> -ef into
> the corosync.log
> I suspect it's always using ksh, because in the env output I
> produced I
> find this: KSH_VERSION=.sh.version
> This is normally not present in the environment, unless ksh is running
> the shell.
> The RAs typically start with #!/bin/sh, so whatever that points to on
> your system is what will be used.
> I also tried modifying all ocf shells with "#!/usr/bin/bash" at the
> beginning, no way, same output.
> You'd have to change the RA that includes them.
> Any idea how can I change the used shell to support "local" variables?

Re: [ClusterLabs] systemd RA start/stop delays

2016-08-30 Thread Dejan Muhamedagic
Hi,

On Thu, Aug 18, 2016 at 09:00:24AM -0500, Ken Gaillot wrote:
> On 08/17/2016 08:17 PM, TEG AMJG wrote:
> > Hi
> > 
> > I am having a problem with a simple Active/Passive cluster which
> > consists in the next configuration
> > 
> > Cluster Name: kamcluster
> > Corosync Nodes:
> >  kam1vs3 kam2vs3
> > Pacemaker Nodes:
> >  kam1vs3 kam2vs3
> > 
> > Resources:
> >  Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
> >   Attributes: ip=10.0.1.206 cidr_netmask=32
> >   Operations: start interval=0s timeout=20s (ClusterIP-start-interval-0s)
> >   stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
> >   monitor interval=10s (ClusterIP-monitor-interval-10s)
> >  Resource: ClusterIP2 (class=ocf provider=heartbeat type=IPaddr2)
> >   Attributes: ip=10.0.1.207 cidr_netmask=32
> >   Operations: start interval=0s timeout=20s (ClusterIP2-start-interval-0s)
> >   stop interval=0s timeout=20s (ClusterIP2-stop-interval-0s)
> >   monitor interval=10s (ClusterIP2-monitor-interval-10s)
> >  Resource: rtpproxycluster (class=systemd type=rtpproxy)
> >   Operations: monitor interval=10s (rtpproxycluster-monitor-interval-10s)
> >   stop interval=0s on-fail=block
> > (rtpproxycluster-stop-interval-0s)
> >  Resource: kamailioetcfs (class=ocf provider=heartbeat type=Filesystem)
> >   Attributes: device=/dev/drbd1 directory=/etc/kamailio fstype=ext4
> >   Operations: start interval=0s timeout=60 (kamailioetcfs-start-interval-0s)
> >   monitor interval=10s on-fail=fence
> > (kamailioetcfs-monitor-interval-10s)
> >   stop interval=0s on-fail=fence
> > (kamailioetcfs-stop-interval-0s)
> >  Clone: fence_kam2_xvm-clone
> >   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
> >   Resource: fence_kam2_xvm (class=stonith type=fence_xvm)
> >Attributes: port=tegamjg_kam2 pcmk_host_list=kam2vs3
> >Operations: monitor interval=60s (fence_kam2_xvm-monitor-interval-60s)
> >  Master: kamailioetcclone
> >   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
> > clone-node-max=1 notify=true on-fail=fence
> >   Resource: kamailioetc (class=ocf provider=linbit type=drbd)
> >Attributes: drbd_resource=kamailioetc
> >Operations: start interval=0s timeout=240 (kamailioetc-start-interval-0s)
> >promote interval=0s on-fail=fence
> > (kamailioetc-promote-interval-0s)
> >demote interval=0s on-fail=fence
> > (kamailioetc-demote-interval-0s)
> >stop interval=0s on-fail=fence (kamailioetc-stop-interval-0s)
> >monitor interval=10s (kamailioetc-monitor-interval-10s)
> >  Clone: fence_kam1_xvm-clone
> >   Meta Attrs: interleave=true clone-max=2 clone-node-max=1
> >   Resource: fence_kam1_xvm (class=stonith type=fence_xvm)
> >Attributes: port=tegamjg_kam1 pcmk_host_list=kam1vs3
> >Operations: monitor interval=60s (fence_kam1_xvm-monitor-interval-60s)
> >  Resource: kamailiocluster (class=ocf provider=heartbeat type=kamailio)
> >   Attributes: listen_address=10.0.1.206
> > conffile=/etc/kamailio/kamailio.cfg pidfile=/var/run/kamailio.pid
> > monitoring_ip=10.0.1.206 monitoring_ip2=10.0.1.207 port=5060 proto=udp
> > kamctlrc=/etc/kamailio/kamctlrc shmem=128 pkg=8
> >   Meta Attrs: target-role=Stopped
> >   Operations: start interval=0s timeout=60
> > (kamailiocluster-start-interval-0s)
> >   stop interval=0s timeout=30 (kamailiocluster-stop-interval-0s)
> >   monitor interval=5s (kamailiocluster-monitor-interval-5s)
> > 
> > Stonith Devices:
> > Fencing Levels:
> > 
> > Location Constraints:
> > Ordering Constraints:
> >   start fence_kam1_xvm-clone then start fence_kam2_xvm-clone
> > (kind:Mandatory) (id:order-fence_kam1_xvm-clone-fence_kam2_xvm-clone-mandatory)
> >   start fence_kam2_xvm-clone then promote kamailioetcclone
> > (kind:Mandatory) (id:order-fence_kam2_xvm-clone-kamailioetcclone-mandatory)
> >   promote kamailioetcclone then start kamailioetcfs (kind:Optional)
> > (id:order-kamailioetcclone-kamailioetcfs-Optional)
> >   Resource Sets:
> > set kamailioetcfs sequential=true (id:pcs_rsc_set_kamailioetcfs) set
> > ClusterIP ClusterIP2 sequential=false
> > (id:pcs_rsc_set_ClusterIP_ClusterIP2) set 

Re: [ClusterLabs] ip clustering strange behaviour

2016-08-30 Thread Klaus Wenninger
Then it is probably the default for no-quorum-policy (=stop)
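
For a two-node cluster like this, one would typically relax that; a sketch in
crm shell syntax (matching the configuration quoted below; the value is a
suggestion, not something from this thread):

crm configure property no-quorum-policy=ignore

With corosync 2.x the usual alternative is "two_node: 1" in the quorum
section of corosync.conf, which also implies wait_for_all.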

On 08/30/2016 08:52 AM, Gabriele Bulfon wrote:
> Sorry for reiterating, but my main question was:
>
> why does node 1 removes its own IP if I shut down node 2 abruptly?
> I understand that it does not take the node 2 IP (because the
> ssh-fencing has no clue about what happened on the 2nd node), but I
> wouldn't expect it to shut down its own IP...this would kill any
> service on both nodes... where am I wrong?
>
> 
> *Sonicle S.r.l. *: http://www.sonicle.com 
> *Music: *http://www.gabrielebulfon.com 
> *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
>
> 
>
>
> *From:* Gabriele Bulfon 
> *To:* kwenn...@redhat.com Cluster Labs - All topics related to
> open-source clustering welcomed 
> *Date:* 29 August 2016 17.37.36 CEST
> *Subject:* Re: [ClusterLabs] ip clustering strange behaviour
>
>
> Ok, got it, I hadn't gracefully shut pacemaker on node2.
> Now I restarted, everything was up, stopped pacemaker service on
> host2 and I got host1 with both IPs configured. ;)
>
> But, though I understand that if I halt host2 with no grace shut
> of pacemaker, it will not move the IP2 to Host1, I don't expect
> host1 to lose its own IP! Why?
>
> Gabriele
>
> 
> 
> *Sonicle S.r.l. *: http://www.sonicle.com 
> *Music: *http://www.gabrielebulfon.com
> 
> *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
>
>
>
> 
> --
>
> From: Klaus Wenninger 
> To: users@clusterlabs.org
> Date: 29 August 2016 17.26.49 CEST
> Subject: Re: [ClusterLabs] ip clustering strange behaviour
>
> On 08/29/2016 05:18 PM, Gabriele Bulfon wrote:
> > Hi,
> >
> > now that I have IPaddr work, I have a strange behaviour on
> my test
> > setup of 2 nodes, here is my configuration:
> >
> > ===STONITH/FENCING===
> >
> > primitive xstorage1-stonith stonith:external/ssh-sonicle op
> monitor
> > interval="25" timeout="25" start-delay="25" params
> hostlist="xstorage1"
> >
> > primitive xstorage2-stonith stonith:external/ssh-sonicle op
> monitor
> > interval="25" timeout="25" start-delay="25" params
> hostlist="xstorage2"
> >
> > location xstorage1-stonith-pref xstorage1-stonith -inf:
> xstorage1
> > location xstorage2-stonith-pref xstorage2-stonith -inf:
> xstorage2
> >
> > property stonith-action=poweroff
> >
> >
> >
> > ===IP RESOURCES===
> >
> >
> > primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params
> ip="1.2.3.4"
> > cidr_netmask="255.255.255.0" nic="e1000g1"
> > primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params
> ip="1.2.3.5"
> > cidr_netmask="255.255.255.0" nic="e1000g1"
> >
> > location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
> > location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
> >
> > ===
> >
> > So I plumbed e1000g1 with unconfigured IP on both machines
> and started
> > corosync/pacemaker, and after some time I got all nodes
> online and
> > started, with IP configured as virtual interfaces (e1000g1:1 and
> > e1000g1:2) one in host1 and one in host2.
> >
> > Then I halted host2, and I expected to have host1 started
> with both
> > IPs configured on host1.
> > Instead, I got host1 started with the IP stopped and removed
> (only
> > e1000g1 unconfigured), host2 stopped saying IP started (!?).
> > Not exactly what I expected...
> > What's wrong?
>
> How did you stop host2? Graceful shutdown of pacemaker? If not ...
> Anyway ssh-fencing is just working if the machine is still
> running ...
> So it will stay unclean and thus pacemaker is thinking that
> the IP might still be running on it. So this is actually the
> expected
> behavior.
> You might add a watchdog via sbd if you don't have other fencing
> hardware at hand ...
> >
> > Here is the crm status after I stopped host 2:
> >
> > 2 nodes and 4 resources configured
> >
> > Node xstorage2: UNCLEAN (offline)
> > Online: 

Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Dejan Muhamedagic
Hi,

On Mon, Aug 29, 2016 at 10:13:18AM -0500, Dmitri Maziuk wrote:
> On 2016-08-29 04:06, Gabriele Bulfon wrote:
> >Thanks, though this does not work :)
> 
> Uhm... right. Too many languages, sorry: perl's system() will call the login
> shell, system system() uses /bin/sh, and exec()s will run whatever the
> programmer tells them to. The point is none of them cares what shell's in
> the shebang line AFAIK.

The kernel reads the shebang line and it is what defines the
interpreter which is to be invoked to run the script.

> But anyway, you're correct; a lot of linux "shell" scripts are bash-only and
> pacemaker RAs are no exception.

None of the /bin/sh RAs requires bash.

Thanks,

Dejan

> 
> Dima
> 
> 


Re: [ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Dejan Muhamedagic
On Tue, Aug 30, 2016 at 08:09:34AM +0200, Ulrich Windl wrote:
>>> Dejan Muhamedagic  wrote on 29.08.2016 at 16:37 in
> message <20160829143700.GA1538@tuttle.homenet>:
> > Hi,
> > 
> > On Mon, Aug 29, 2016 at 02:58:11PM +0200, Gabriele Bulfon wrote:
> >> I think the main issue is the usage of the "local" operator in ocf*
> >> I'm not an expert on this operator (never used!), don't know how hard it 
> >> is 
> > to replace it with a standard version.
> > 
> > Unfortunately, there's no command defined in POSIX which serves
> > the purpose of local, i.e. setting variables' scope. "local" is,
> 
> Isn't it "typeset"?

I don't think that /bin/dash supports typeset. Anyway, supporting
typeset, which covers much more than limiting the scope, would
also invite people to use it for that other stuff. If it's there,
someone's sure to use it ;-)
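
To illustrate the prefix-based alternative Ken floated elsewhere in the
thread (names invented for the example, nothing from the code base):

findif_check() {
	# emulate a function-local variable with a unique prefix instead of "local"
	__findif_check_nic=$1
	[ -n "$__findif_check_nic" ]
	__findif_check_rc=$?
	unset __findif_check_nic
	return $__findif_check_rc
}

Ugly, but it runs the same in dash, bash and ksh93.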

Thanks,

Dejan

> 
> > however, supported in almost all shells (including most versions
> > of ksh, but apparently not the one you run) and hence we
> > tolerated that in /bin/sh resource agents.
> > 
> > Thanks,
> > 
> > Dejan
> 
> 
> 
> 


Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Dejan Muhamedagic
Hi,

On Tue, Aug 30, 2016 at 09:32:54AM +0200, Kristoffer Grönlund wrote:
> Jehan-Guillaume de Rorthais  writes:
> 
> > On Mon, 29 Aug 2016 10:02:28 -0500
> > Ken Gaillot  wrote:
> >
> >> On 08/29/2016 09:43 AM, Dejan Muhamedagic wrote:
> > ...
> >>> I doubt that we could write moderately complex shell scripts
> >>> without the capability of limiting the variables' scope and retaining
> >>> sanity at the same time.
> >> 
> >> This prefixing approach would definitely be ugly, and it would violate
> >> best practices on shells that do support local, but it should be feasible.
> >> 
> >> I'd argue that anything moderately complex should be converted to python
> >> (maybe after we have OCF 2.0, and some good python bindings ...).
> >
> > For what it's worth, I already raised this discussion some months ago as we
> > wrote
> > some perl modules equivalent to ocf-shellfuncs, ocf-returncodes and
> > ocf-directories. See: 
> >
> >   Subject: [ClusterLabs Developers] Perl Modules for resource agents (was:
> > Resource Agent language discussion) 
> >   Date: Thu, 26 Nov 2015 01:13:36 +0100
> >
> > I don't want to start a flameware about languages here, this is not about 
> > that.
> > Maybe it would be a good time to include various libraries for different
> > languages in official source? At least for ocf-directories which is quite
> > simple, but often tied to the configure options in various distros. We had to
> > make an ugly wrapper around the ocf-directories library at build time to
> > produce our OCF_Directories.pm module on various distros.
> >
> 
> I don't know Perl so I can't be very helpful reviewing it, but please do
> submit your Perl OCF library to resource-agents if you have one! I don't
> think anyone is opposed to including a Perl (or Python) API on
> principle, it's just a matter of someone actually sitting down to do the
> work putting it together.

Sure, I doubt that anybody would oppose that. Looking at the
amount of the existing supporting code, it is not going to be a
small feat.

***

This is off-topic and I really do hope not to start a discussion
about it so I'll keep it short: I often heard shell programming
being bashed up to the point that it should be banned. While it
is true that there are numerous scripts executed in a way leaving
much to be desired, the opposite is certainly possible too. Shell
does fit well a certain type of task and resource agents
(essentially more robust init scripts) belong to that realm.

Thanks,

Dejan


> Cheers,
> Kristoffer
> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com
> 


Re: [ClusterLabs] ocf scripts shell and local variables

2016-08-30 Thread Kristoffer Grönlund
Jehan-Guillaume de Rorthais  writes:

> On Mon, 29 Aug 2016 10:02:28 -0500
> Ken Gaillot  wrote:
>
>> On 08/29/2016 09:43 AM, Dejan Muhamedagic wrote:
> ...
>>> I doubt that we could write moderately complex shell scripts
>>> without the capability of limiting the variables' scope and retaining
>>> sanity at the same time.
>> 
>> This prefixing approach would definitely be ugly, and it would violate
>> best practices on shells that do support local, but it should be feasible.
>> 
>> I'd argue that anything moderately complex should be converted to python
>> (maybe after we have OCF 2.0, and some good python bindings ...).
>
> For what it's worth, I already raised this discussion some months ago as we wrote
> some perl modules equivalent to ocf-shellfuncs, ocf-returncodes and
> ocf-directories. See: 
>
>   Subject: [ClusterLabs Developers] Perl Modules for resource agents (was:
> Resource Agent language discussion) 
>   Date: Thu, 26 Nov 2015 01:13:36 +0100
>
> I don't want to start a flameware about languages here, this is not about 
> that.
> Maybe it would be a good time to include various libraries for different
> languages in official source? At least for ocf-directories which is quite
> simple, but often tied to the configure options in various distros. We had to
> make an ugly wrapper around the ocf-directories library at build time to
> produce our OCF_Directories.pm module on various distros.
>

I don't know Perl so I can't be very helpful reviewing it, but please do
submit your Perl OCF library to resource-agents if you have one! I don't
think anyone is opposed to including a Perl (or Python) API on
principle, it's just a matter of someone actually sitting down to do the
work putting it together.

Cheers,
Kristoffer

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] ip clustering strange behaviour

2016-08-30 Thread Gabriele Bulfon
Sorry for reiterating, but my main question was:
why does node 1 removes its own IP if I shut down node 2 abruptly?
I understand that it does not take the node 2 IP (because the ssh-fencing has 
no clue about what happened on the 2nd node), but I wouldn't expect it to shut 
down its own IP... this would kill any service on both nodes... where am I wrong?

Sonicle S.r.l.
:
http://www.sonicle.com
Music:
http://www.gabrielebulfon.com
Quantum Mechanics :
http://www.cdbaby.com/cd/gabrielebulfon
From:
Gabriele Bulfon
To:
kwenn...@redhat.com Cluster Labs - All topics related to open-source clustering
welcomed
Date:
29 August 2016 17.37.36 CEST
Subject:
Re: [ClusterLabs] ip clustering strange behaviour
Ok, got it, I hadn't gracefully shut pacemaker on node2.
Now I restarted, everything was up, stopped pacemaker service on host2 and I 
got host1 with both IPs configured. ;)
But, though I understand that if I halt host2 with no grace shut of pacemaker, 
it will not move the IP2 to Host1, I don't expect host1 to lose its own IP!
Why?
Gabriele

Sonicle S.r.l.
:
http://www.sonicle.com
Music:
http://www.gabrielebulfon.com
Quantum Mechanics :
http://www.cdbaby.com/cd/gabrielebulfon
--
From: Klaus Wenninger
To: users@clusterlabs.org
Date: 29 August 2016 17.26.49 CEST
Subject: Re: [ClusterLabs] ip clustering strange behaviour
On 08/29/2016 05:18 PM, Gabriele Bulfon wrote:
Hi,
now that I have IPaddr work, I have a strange behaviour on my test
setup of 2 nodes, here is my configuration:
===STONITH/FENCING===
primitive xstorage1-stonith stonith:external/ssh-sonicle op monitor
interval="25" timeout="25" start-delay="25" params hostlist="xstorage1"
primitive xstorage2-stonith stonith:external/ssh-sonicle op monitor
interval="25" timeout="25" start-delay="25" params hostlist="xstorage2"
location xstorage1-stonith-pref xstorage1-stonith -inf: xstorage1
location xstorage2-stonith-pref xstorage2-stonith -inf: xstorage2
property stonith-action=poweroff
===IP RESOURCES===
primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params ip="1.2.3.4"
cidr_netmask="255.255.255.0" nic="e1000g1"
primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params ip="1.2.3.5"
cidr_netmask="255.255.255.0" nic="e1000g1"
location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
===
So I plumbed e1000g1 with unconfigured IP on both machines and started
corosync/pacemaker, and after some time I got all nodes online and
started, with IP configured as virtual interfaces (e1000g1:1 and
e1000g1:2) one in host1 and one in host2.
Then I halted host2, and I expected to have host1 started with both
IPs configured on host1.
Instead, I got host1 started with the IP stopped and removed (only
e1000g1 unconfigured), host2 stopped saying IP started (!?).
Not exactly what I expected...
What's wrong?
How did you stop host2? Graceful shutdown of pacemaker? If not ...
Anyway ssh-fencing is just working if the machine is still running ...
So it will stay unclean and thus pacemaker is thinking that
the IP might still be running on it. So this is actually the expected
behavior.
You might add a watchdog via sbd if you don't have other fencing
hardware at hand ...
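
(A minimal watchdog-only sbd sketch, assuming the sbd package and a
sufficiently recent pacemaker/sbd combination; the config file location
varies by distro:

# /etc/sysconfig/sbd
SBD_WATCHDOG_DEV=/dev/watchdog
SBD_WATCHDOG_TIMEOUT=5

# then tell pacemaker it may rely on watchdog self-fencing:
crm configure property stonith-watchdog-timeout=10s
)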
Here is the crm status after I stopped host 2:
2 nodes and 4 resources configured
Node xstorage2: UNCLEAN (offline)
Online: [ xstorage1 ]
Full list of resources:
xstorage1-stonith (stonith:external/ssh-sonicle): Started xstorage2
(UNCLEAN)
xstorage2-stonith (stonith:external/ssh-sonicle): Stopped
xstorage1_wan1_IP (ocf::heartbeat:IPaddr): Stopped
xstorage2_wan2_IP (ocf::heartbeat:IPaddr): Started xstorage2 (UNCLEAN)
Gabriele

*Sonicle S.r.l. *: http://www.sonicle.com
*Music: *http://www.gabrielebulfon.com
*Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon

[ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Ulrich Windl
>>> Lars Ellenberg  wrote on 29.08.2016 at 22:07 in
message <20160829200739.GQ5268@soda.linbit>:

[...]
> And no, I strongly do not think that we should "fall back" to the
> "art" of shell syntax and idioms that was force on you by the original"
> choose-your-brand-and-year-and-version shell, just because some
> "production systems" still have /bin/sh point to whatever it was
> their oldest ancestor system shipped with in the 19sixties...

Yes, those with a PDP-11 and 128kB "core" memory using a 300 baud acoustic
coupler for dial-in ;-)

[...]

Ulrich






[ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Ulrich Windl
>>> Jehan-Guillaume de Rorthais  wrote on 29.08.2016 at 18:04 in
message <20160829180440.5b7f1a2e@firost>:
> On Mon, 29 Aug 2016 10:02:28 -0500
> Ken Gaillot  wrote:
> 
>> On 08/29/2016 09:43 AM, Dejan Muhamedagic wrote:
> ...
>>> I doubt that we could write moderately complex shell scripts
>>> without the capability of limiting the variables' scope and retaining
>>> sanity at the same time.
>> 
>> This prefixing approach would definitely be ugly, and it would violate
>> best practices on shells that do support local, but it should be feasible.
>> 
>> I'd argue that anything moderately complex should be converted to python
>> (maybe after we have OCF 2.0, and some good python bindings ...).
> 
> For what it's worth, I already raised this discussion some months ago when
> we wrote some Perl modules equivalent to ocf-shellfuncs, ocf-returncodes and
> ocf-directories. See:
> 
>   Subject: [ClusterLabs Developers] Perl Modules for resource agents (was:
> Resource Agent language discussion) 
>   Date: Thu, 26 Nov 2015 01:13:36 +0100
> 
> I don't want to start a flamewar about languages here; this is not about
> that.
> Maybe it would be a good time to include various libraries for different
> languages in the official source? At least for ocf-directories, which is
> quite simple but often tied to the configure options of the various distros.
> We had to put an ugly wrapper around the ocf-directories library at build
> time to produce our OCF_Directories.pm module on various distros.

I'd like to write RAs in Perl, too. It's much safer (more tests, more 
warnings, a better language) and possibly more efficient as well.

> 
> Regards,
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 





___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Ulrich Windl
>>> Dmitri Maziuk wrote on 29.08.2016 at 17:13 in
message <161a5209-166c-e837-dcb0-169b8a4b3...@gmail.com>:
> On 2016-08-29 04:06, Gabriele Bulfon wrote:
>> Thanks, though this does not work :)
> 
> Uhm... right. Too many languages, sorry: Perl's system() will call the 
> login shell, the C library's system() uses /bin/sh, and the exec() family 
> will run whatever the programmer tells it to. The point is that none of 
> them cares what shell is in the shebang line, AFAIK.

Nonsense: if you use the exec system call, it honors the shebang; if you 
explicitly call the script with a shell, that shell is used. Read the docs 
of the language or library!
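As a quick illustration of that point, a sketch (it assumes /bin/ksh exists
on the system):

    # write a two-line test script whose shebang points at ksh
    printf '%s\n' '#!/bin/ksh' 'echo "KSH_VERSION is: ${KSH_VERSION:-unset}"' > /tmp/which-shell
    chmod +x /tmp/which-shell
    /tmp/which-shell       # direct exec: the kernel honors the shebang -> ksh
    sh /tmp/which-shell    # explicit interpreter: the shebang is just a comment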

> 
> But anyway, you're correct; a lot of linux "shell" scripts are bash-only 
> and pacemaker RAs are no exception.
> 
> Dima
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 





___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Ulrich Windl
>>> Gabriele Bulfon wrote on 29.08.2016 at 17:08 in
message <14184995.610.1472483315074.JavaMail.sonicle@www>:
> Sure, in fact I can change all the shebangs to point to /bin/bash and it's OK.
> The question is about the current /bin/sh shebang, which may run into trouble
> (as if one pointed to a generic python but used many features specific to one 
> version of python).
> Also, the question is whether bash is a good option for RAs at all, given 
> that it is much heavier.

A completely different idea: What about defining an OCF shell function that 
tells you the category of the shell, like "bourne|posix|bash"? So some 
pseudo-code like this could be used:
if [ `shell_class` != bash ]; then
  # report a problem
  exit ...
fi
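
A hypothetical implementation of such a helper might probe the version
variables the common shells export (only a sketch; older ksh93 exposes only
${.sh.version}, which cannot be tested portably):

    shell_class() {
        if [ -n "${BASH_VERSION:-}" ]; then
            echo bash
        elif [ -n "${KSH_VERSION:-}" ]; then
            echo ksh
        else
            echo posix   # some other POSIX-ish shell; assume the baseline
        fi
    }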

Regards,
Ulrich

> Gabriele
> 
> Sonicle S.r.l.
> :
> http://www.sonicle.com 
> Music:
> http://www.gabrielebulfon.com 
> Quantum Mechanics :
> http://www.cdbaby.com/cd/gabrielebulfon 
> --
> From: Dejan Muhamedagic
> To: kgail...@redhat.com Cluster Labs - All topics related to open-source 
> clustering welcomed
> Date: 29 August 2016 16:43:52 CEST
> Subject: Re: [ClusterLabs] ocf scripts shell and local variables
> Hi,
> On Mon, Aug 29, 2016 at 08:47:43AM -0500, Ken Gaillot wrote:
> On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:
> Hi Ken,
> I have been talking with the illumos guys about the shell problem.
> They all agreed that ksh (and especially the ksh93 used in illumos) is
> absolutely Bourne-compatible, and that the "local" variables used in the
> ocf shells are not Bourne syntax, but probably bash-specific.
> This means that pointing the scripts to "#!/bin/sh" is portable as long
> as the scripts really use Bourne-shell-only syntax, since any Unix variant
> may link in whatever Bourne shell it likes.
> Otherwise, they should point to "#!/bin/bash" or whatever shell the
> script was written for.
> Also, in this case, the starting point is not the ocf-* scripts, but the
> original RAs (IPaddr, but almost all of them).
> What about making the code base of the RAs and ocf-* portable?
> It may be just by changing them to point to bash, or with some kind of
> configure modifier to specify the shell to use.
> Meanwhile, changing the scripts by hand to #!/bin/bash worked like a
> charm, and I will start patching.
> Gabriele
> Interesting, I thought local was POSIX, but it's not. It seems everyone
> but Solaris implemented it:
> http://stackoverflow.com/questions/18597697/posix-compliant-way-to-scope-variables-to-a-function-in-a-shell-script
> Please open an issue at:
> https://github.com/ClusterLabs/resource-agents/issues 
> The simplest solution would be to require #!/bin/bash for all RAs that
> use local,
> This issue was raised many times, but note that /bin/bash is a
> shell not famous for being lean: it's great for interactive use,
> but not so great if you need to run a number of scripts. The
> complexity in bash, which is superfluous for our use case,
> doesn't go well with the basic principles of HA clusters.
> but I'm not sure that's fair to the distros that support
> local in a non-bash default shell. Another possibility would be to
> modify all RAs to avoid local entirely, by using unique variable
> prefixes per function.
> I doubt that we could write moderately complex shell scripts
> without the capability of limiting the variables' scope and retaining
> sanity at the same time.
> Or, it may be possible to guard every instance of
> local with a check for ksh, which would use typeset instead. Raising the
> issue will allow some discussion of the possibilities.
> Just to mention that this is the first time someone has reported
> running a shell which doesn't support local. Perhaps an option is for
> them to install a shell which does.
> Thanks,
> Dejan
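
To make the "unique variable prefixes" idea quoted above concrete, a
hypothetical sketch (the function, the prefix convention and the use of
$OCF_RESKEY_nic / $OCF_ERR_GENERIC are illustrative assumptions):

    # emulate "local" by giving every function its own variable prefix
    ip_start() {
        ip_start_rc=0
        ifconfig "$OCF_RESKEY_nic" >/dev/null 2>&1 || ip_start_rc=$OCF_ERR_GENERIC
        return $ip_start_rc
    }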
> 
> *Sonicle S.r.l. *: http://www.sonicle.com 
> *Music: *http://www.gabrielebulfon.com 
> *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon 
> --
> From: Ken Gaillot
> To: gbul...@sonicle.com Cluster Labs - All topics related to open-source
> clustering welcomed
> Date: 26 August 2016 15:56:02 CEST
> Subject: Re: ocf scripts shell and local variables
> On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
> I tried adding some debug output in ocf-shellfuncs, showing env and
> ps -ef in the corosync.log.
> I suspect it's always using ksh, because in the env output I produced
> I find this: KSH_VERSION=.sh.version
This is normally not present in the environment unless ksh is the shell
that is running.
> The RAs typically start with #!/bin/sh, so whatever that points to on
> your system is what will be used.
I also tried modifying all the ocf shells with "#!/usr/bin/bash" at the
beginning, 
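
One way to check which interpreter is actually executing a script,
regardless of the shebang, is a sketch like this (/proc is Linux-only and
ps options vary across platforms):

    echo "pid $$ runs as: $(ps -p $$ -o comm=)"
    [ -r /proc/$$/exe ] && ls -l /proc/$$/exe   # Linux only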

Re: [ClusterLabs] Node Fencing and STONITH in Amazon Web Services

2016-08-30 Thread Enno Gröper
Hi,

On 26.08.2016 at 22:32, Jason A Ramsey wrote:
> No users
> would be connecting to the severed instances, but background and system
> tasks would proceed as normal, potentially writing new data to the
> databases making rejoining the nodes to the cluster a little bit tricky
> to say the least, especially if the severed side’s network comes back up
> and both systems come to realize that they’re not consistent.
I think you forgot a potential routing problem. The network is not simply
on/off.
It could be that you have a split-brain scenario where your two nodes
don't see each other, yet at the same time both nodes are reachable by and
interacting with some of your users (depending on their location/routing).
So it isn't just some background tasks potentially writing data to the
databases; it may be real user data.

I'm interested in this topic, as this is a general cloud problem: to my
knowledge you simply don't have any out-of-band messaging channel
between your nodes to avoid split brain or to make STONITH possible.
At least in OpenStack (which is what I know), everything running on the
node (VM) has to go through the client network, which may be exactly what
is malfunctioning. Even if it is, in theory, possible to issue API calls
to shut down the other node, those calls would still have to go through
the same channel (the client network).

A solution could be to use DBaaS (a reliable database provided by the
cloud service provider). I don't know whether any CSP provides database
replication across different sites.
I simply don't see a way to reliably solve this with Pacemaker alone
(without an out-of-band messaging channel and/or reliable STONITH).
IMHO, for DBaaS you need to look carefully at the SLA and specs to see
whether it really provides what you want.

My 2 cents
Enno



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Ulrich Windl
>>> Dejan Muhamedagic wrote on 29.08.2016 at 16:37 in
message <20160829143700.GA1538@tuttle.homenet>:
> Hi,
> 
> On Mon, Aug 29, 2016 at 02:58:11PM +0200, Gabriele Bulfon wrote:
>> I think the main issue is the usage of the "local" operator in ocf*.
>> I'm not an expert on this operator (never used it!), so I don't know
>> how hard it is to replace it with a standard version.
> 
> Unfortunately, there's no command defined in POSIX which serves
> the purpose of local, i.e. setting variables' scope. "local" is,

Isn't it "typeset"?


> however, supported in almost all shells (including most versions
> of ksh, but apparently not the one you run) and hence we have
> tolerated it in /bin/sh resource agents.
> 
> Thanks,
> 
> Dejan
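
For what it's worth, typeset in ksh93 gives function-local scope only in
the "function name { ... }" definition form, not in the POSIX
"name() { ... }" form, so it is not a drop-in replacement; a sketch of
the difference:

    function f { typeset x=inner; }   # ksh form: x stays local to f
    g() { typeset x=inner; }          # POSIX form: typeset hits the global x
    x=outer; f; echo "$x"             # prints: outer
    x=outer; g; echo "$x"             # prints: inner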




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: ocf scripts shell and local variables

2016-08-30 Thread Ulrich Windl
>>> Ken Gaillot wrote on 29.08.2016 at 15:47 in message
<8ca22867-2103-b2da-316b-ac97234f8...@redhat.com>:
> On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:

[...]
> 
> Interesting, I thought local was POSIX, but it's not. It seems everyone
> but Solaris implemented it:

Their vi was unable to handle 8-bit characters (as of maybe 10 years ago),
let alone Unicode, and their NTP daemon doesn't report its version. I'm not
a Solaris user, but that's what I hear from co-workers...

Ulrich




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org