Hi,

On Mon, Aug 29, 2016 at 08:47:43AM -0500, Ken Gaillot wrote:
> On 08/29/2016 04:17 AM, Gabriele Bulfon wrote:
> > Hi Ken,
> >
> > I have been talking with the illumos guys about the shell problem.
> > They all agreed that ksh (and especially the ksh93 used in illumos) is
> > absolutely Bourne-compatible, and that the "local" variables used in
> > the ocf shell scripts are not Bourne syntax, but probably a bash
> > extension.
> > This means that pointing the scripts at "#!/bin/sh" is portable only
> > as long as the scripts really use Bourne-shell-only syntax, since any
> > Unix variant may link whatever Bourne-compatible shell it likes.
> > Otherwise, they should point at "#!/bin/bash" or whatever shell the
> > script was written for.
> > Also, in this case, the starting point is not the ocf-* scripts, but
> > the original RAs (IPaddr, but almost all of them).
> >
> > What about making the code base of the RAs and ocf-* scripts portable?
> > It may be just a matter of changing them to point at bash, or adding
> > some kind of configure option to specify the shell to use.
> >
> > Meanwhile, changing the scripts by hand to #!/bin/bash worked like a
> > charm, and I will start patching.
> >
> > Gabriele
>
> Interesting, I thought local was POSIX, but it's not. It seems everyone
> but Solaris implemented it:
>
> http://stackoverflow.com/questions/18597697/posix-compliant-way-to-scope-variables-to-a-function-in-a-shell-script
>
> Please open an issue at:
>
> https://github.com/ClusterLabs/resource-agents/issues
>
> The simplest solution would be to require #!/bin/bash for all RAs that
> use local,
This issue has been raised many times, but note that /bin/bash is not a
shell famous for being lean: it's great for interactive use, but not so
great if you need to run a large number of scripts. The complexity of
bash, which is superfluous for our use case, doesn't sit well with the
basic principles of HA clusters.

> but I'm not sure that's fair to the distros that support
> local in a non-bash default shell. Another possibility would be to
> modify all RAs to avoid local entirely, by using unique variable
> prefixes per function.

I doubt that we could write moderately complex shell scripts without
the ability to limit a variable's scope and retain our sanity at the
same time.

> Or, it may be possible to guard every instance of
> local with a check for ksh, which would use typeset instead. Raising the
> issue will allow some discussion of the possibilities.

Just to mention that this is the first time someone has reported running
a shell which doesn't support local. Perhaps there's the option that
they install a shell which does.

Thanks,

Dejan

> >
> > ----------------------------------------------------------------------------------------
> > *Sonicle S.r.l. 
> > *: http://www.sonicle.com <http://www.sonicle.com/>
> > *Music: *http://www.gabrielebulfon.com <http://www.gabrielebulfon.com/>
> > *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon
> >
> > ----------------------------------------------------------------------------------
> >
> > From: Ken Gaillot <kgail...@redhat.com>
> > To: gbul...@sonicle.com, Cluster Labs - All topics related to
> > open-source clustering welcomed <users@clusterlabs.org>
> > Date: 26 August 2016 15.56.02 CEST
> > Subject: Re: ocf scripts shell and local variables
> >
> > On 08/26/2016 08:11 AM, Gabriele Bulfon wrote:
> > > I tried adding some debug in ocf-shellfuncs, showing env and ps -ef
> > > output in the corosync.log.
> > > I suspect it's always using ksh, because in the env output I
> > > produced I find this: KSH_VERSION=.sh.version
> > > This is normally not present in the environment, unless ksh is
> > > running the shell.
> >
> > The RAs typically start with #!/bin/sh, so whatever that points to on
> > your system is what will be used.
> >
> > > I also tried modifying all ocf shell scripts with "#!/usr/bin/bash"
> > > at the beginning; no way, same output.
> >
> > You'd have to change the RA that includes them.
> >
> > > Any idea how I can change the shell used, to support "local"
> > > variables?
> >
> > You can either edit the #!/bin/sh line at the top of each RA, or
> > figure out how to point /bin/sh to a Bourne-compatible shell. ksh
> > isn't Bourne-compatible, so I'd expect lots of #!/bin/sh scripts to
> > fail with it as the default shell.
> >
> > > Gabriele
> > >
> > > ------------------------------------------------------------------------
> > >
> > > *From:* Gabriele Bulfon <gbul...@sonicle.com>
> > > *To:* kgail...@redhat.com, Cluster Labs - All topics related to
> > > open-source clustering welcomed <users@clusterlabs.org>
> > > *Date:* 26 August 2016 10.12.13 CEST
> > > *Subject:* Re: [ClusterLabs] ocf::heartbeat:IPaddr
> > >
> > > I looked around what you suggested, inside ocf-binaries and
> > > ocf-shellfuncs etc.
> > > So I also found these logs in corosync.log:
> > >
> > > Aug 25 17:50:33 [2250] crmd: notice: process_lrm_event:
> > > xstorage1-xstorage2_wan2_IP_start_0:22 [
> > > /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
> > > such file or
> > > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[354]: local:
> > > not found [No such file or
> > > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[355]: local:
> > > not found [No such file or
> > > directory]\n/usr/lib/ocf/resource.d/heartbeat/IPaddr[356]: local:
> > > not found [No such file or directory]\nocf-exit-reason:Setup
> > > problem: coul
> > >
> > > Aug 25 17:50:33 [2246] lrmd: notice: operation_finished:
> > > xstorage2_wan2_IP_start_0:3613:stderr [
> > > /usr/lib/ocf/resource.d/heartbeat/IPaddr[71]: local: not found [No
> > > such file or directory] ]
> > >
> > > Looks like the shell is not happy with the "local" variable
> > > definitions.
> > > I tried running ocf-shellfuncs manually with sh and bash, and they
> > > all ran without errors.
> > > How can I see what shell is running these scripts?
> > >
> > > ----------------------------------------------------------------------------------
> > >
> > > From: Ken Gaillot <kgail...@redhat.com>
> > > To: users@clusterlabs.org
> > > Date: 25 August 2016 18.07.42 CEST
> > > Subject: Re: [ClusterLabs] ocf::heartbeat:IPaddr
> > >
> > > On 08/25/2016 10:51 AM, Gabriele Bulfon wrote:
> > > > Hi,
> > > >
> > > > I'm advancing with this monster cluster on XStreamOS/illumos ;)
> > > >
> > > > In the previous older tests I used heartbeat, and I had these
> > > > lines to take care of the swapping public IP addresses:
> > > >
> > > > primitive xstorage1_wan1_IP ocf:heartbeat:IPaddr params ip="1.2.3.4"
> > > > cidr_netmask="255.255.255.0" nic="e1000g1"
> > > > primitive xstorage2_wan2_IP ocf:heartbeat:IPaddr params ip="1.2.3.5"
> > > > cidr_netmask="255.255.255.0" nic="e1000g1"
> > > >
> > > > location xstorage1_wan1_IP_pref xstorage1_wan1_IP 100: xstorage1
> > > > location xstorage2_wan2_IP_pref xstorage2_wan2_IP 100: xstorage2
> > > >
> > > > They get configured, but then I get this in crm status:
> > > >
> > > > xstorage1_wan1_IP (ocf::heartbeat:IPaddr): Stopped
> > > > xstorage2_wan2_IP (ocf::heartbeat:IPaddr): Stopped
> > > >
> > > > Failed Actions:
> > > > * xstorage1_wan1_IP_start_0 on xstorage1 'not installed' (5): call=20,
> > > > status=complete, exitreason='Setup problem: couldn't find command:
> > > > /usr/bin/gawk',
> > > > last-rc-change='Thu Aug 25 17:50:32 2016', queued=1ms, exec=158ms
> > > > * xstorage2_wan2_IP_start_0 on xstorage1 'not installed' (5): call=22,
> > > > status=complete, exitreason='Setup problem: couldn't find command:
> > > > /usr/bin/gawk',
> > > > last-rc-change='Thu Aug 25 17:50:33 2016', queued=1ms, exec=29ms
> > > > * xstorage1_wan1_IP_start_0 on xstorage2 'not installed' (5): call=22,
> > > > status=complete, exitreason='Setup problem: couldn't find command:
> > > > /usr/bin/gawk',
> > > > last-rc-change='Thu Aug 25 17:50:30 2016', queued=1ms, exec=36ms
> > > > * xstorage2_wan2_IP_start_0 on xstorage2 'not installed' (5): call=20,
> > > > status=complete, exitreason='Setup problem: couldn't find command:
> > > > /usr/bin/gawk',
> > > > last-rc-change='Thu Aug 25 17:50:29 2016', queued=0ms, exec=150ms
> > > >
> > > > The crm configure process already checked for the presence of the
> > > > required IPaddr script, and it was OK.
> > > > Now it looks like it's looking for "/usr/bin/gawk", and that is
> > > > actually there!
> > > > Is there any known incompatibility with the mixed heartbeat ocf?
> > > > Should I use corosync-specific ocf files or something else?
> > >
> > > "heartbeat" in this case is just an OCF provider name, and has
> > > nothing to do with the heartbeat messaging layer, other than having
> > > its origin in the same project. There has actually been a recent
> > > proposal to rename the provider to "clusterlabs" to better reflect
> > > the current reality.
> > >
> > > The "couldn't find command" message comes from the ocf-binaries
> > > shell functions. If you look at have_binary() there, it uses sed
> > > and which, and I'm guessing that fails on your OS somehow. You may
> > > need to patch it.
> > >
> > > > Thanks again!
> > > >
> > > > Gabriele
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
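P.S. The "guard local with a check for ksh, which would use typeset
instead" idea discussed above could be sketched roughly as below. This
is a hypothetical fragment, not the actual resource-agents code: the
ip_validate function is an invented example, and note the ksh93 caveat
that typeset only gives function-local scope inside functions declared
with the "function name { ... }" syntax, so a plain alias is not a
complete fix for POSIX-style function definitions.

```shell
#!/bin/sh
# Sketch: if this shell has no "local" builtin (e.g. ksh93 as /bin/sh),
# alias local to typeset so that RA code using "local" still parses.
# ksh93 caveat: typeset is truly function-local only in functions
# declared with the "function name { ... }" syntax.
if ! type local >/dev/null 2>&1; then
    alias local=typeset
fi

# Invented example function, standing in for RA helpers that use local.
ip_validate() {
    # In shells with local (or the typeset alias), "addr" does not
    # leak into the caller's scope.
    local addr
    addr="$1"
    case "$addr" in
        *.*.*.*) return 0 ;;   # crude dotted-quad check, for the sketch
        *)       return 1 ;;
    esac
}

ip_validate "1.2.3.4" && echo "valid"   # prints "valid"
```

The alternative Ken mentions, avoiding local entirely via unique
per-function variable prefixes (e.g. ip_validate_addr instead of a
local addr), needs no guard at all, at the cost of more verbose code.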