Re: [Linux-HA] Are the Resource Agents POSIX compliant?

2011-01-24 Thread Michele Codutti
Hi Florian, I've looked for bashisms in the Debian package (1.0.3-3.1),
and this is the result:
$ for i in * .*; do checkbashisms $i 2>&1 | grep -v "is already a bash script; skipping"; done
possible bashism in AudibleAlarm line 84 (echo -e):
echo -ne "\a" > /dev/console
possible bashism in CTDB line 333 (should be 'b = a'):
[ "$OCF_RESKEY_ctdb_logfile" == "syslog" ] && log_option="--syslog"
possible bashism in CTDB line 359 (brace expansion, should be $(seq a b)):
for i in {1..30}; do
possible bashism in Delay line 160 ([^] should be [!]):
echo $i | grep -v "[^0-9.]" | grep -q -v "[.].*[.]"
script IPv6addr does not appear to have a #! interpreter line;
you may get strange results
possible bashism in SAPDatabase line 540 (should be >word 2>&1):
  eval $VALUE &> /dev/null
possible bashism in SAPInstance line 383 (should be >word 2>&1):
  eval $VALUE &> /dev/null
possible bashism in anything line 129 (let ...):
let i++
possible bashism in eDir88 line 190 (let ...):
let CNT=$CNT+1
possible bashism in eDir88 line 322 (declare):
declare rc=$OCF_SUCCESS
possible bashism in oracle line 234 (should be 'b = a'):
if [ x == "x$ORACLE_HOME" ]; then
possible bashism in oracle line 238 (should be 'b = a'):
if [ x == "x$ORACLE_OWNER" ]; then
possible bashism in oracle line 386 (should be 'b = a'):
if [ x"$dumpdest" == x -o ! -d "$dumpdest" ]; then
possible bashism in oracle line 390 (local -opt):
local -i fcount=`ls -rt $dumpdest | wc -l`
possible bashism in oracle line 393 (local -opt):
local -i fcount2=`ls -rt $dumpdest | wc -l`
possible bashism in oralsnr line 161 (should be 'b = a'):
if [ x == "x$ORACLE_HOME" ]; then
possible bashism in oralsnr line 165 (should be 'b = a'):
if [ x == "x$ORACLE_OWNER" ]; then
script .ocf-binaries does not appear to have a #! interpreter line;
you may get strange results
script .ocf-directories does not appear to have a #! interpreter line;
you may get strange results
script .ocf-returncodes does not appear to have a #! interpreter line;
you may get strange results
script .ocf-shellfuncs does not appear to have a #! interpreter line;
you may get strange results
possible bashism in .ocf-shellfuncs line 68 ($RANDOM):
local rnd=$RANDOM

It seems that a bunch of agents (AudibleAlarm, CTDB, Delay, IPv6addr,
SAPDatabase, SAPInstance, anything, eDir88, oracle, oralsnr) contain
bashisms while declaring a #!/bin/sh interpreter line.
But what concerns me most is that .ocf-shellfuncs, which is sourced by
almost all agents (POSIX or not), contains a well-known bashism ($RANDOM).
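For reference, POSIX-sh replacements for the constructs reported above
would look roughly like this (a sketch; the $RANDOM substitute is only
one possibility):

i=$((i + 1))                       # instead of: let i++
CNT=$((CNT + 1))                   # instead of: let CNT=$CNT+1
for i in $(seq 1 30); do :; done   # instead of: for i in {1..30}
[ "$OCF_RESKEY_ctdb_logfile" = "syslog" ] && log_option="--syslog"   # = instead of ==
printf '\a' > /dev/console         # instead of: echo -ne "\a"
eval $VALUE > /dev/null 2>&1       # instead of: eval $VALUE &> /dev/null
rnd=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')   # instead of: rnd=$RANDOM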

On Tue, 18/01/2011 at 11.43 +0100, Florian Haas wrote:
 All the agents that declare their interpreter to be /bin/sh have had
 any bashisms eradicated for the Debian squeeze release. So yes, these
 will work on a system where /bin/sh links to dash.
 
 Florian
-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



[Linux-HA] Are the Resource Agents POSIX compliant?

2011-01-17 Thread Michele Codutti
Hello, I'm in the process of upgrading from Debian lenny to squeeze (so
from heartbeat 2.1.3 to pacemaker 1.0.9), but as of this release the
default shell (for scripts only) has changed from bash to dash.
The difference between bash and dash is that the latter is strictly
POSIX compliant and doesn't support bashisms:
https://wiki.ubuntu.com/DashAsBinSh
So my question is: are the resource agents (now cluster agents) POSIX
compliant?
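For what it's worth, a quick way to check an individual agent yourself,
assuming the agents live under the usual OCF path, is a dash syntax
check (which only catches parse-level problems) plus Debian's
checkbashisms from the devscripts package:

dash -n /usr/lib/ocf/resource.d/heartbeat/IPaddr2   # parse only, do not execute
checkbashisms /usr/lib/ocf/resource.d/heartbeat/IPaddr2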

-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



Re: [Linux-HA] Resource colocation with a clone.

2010-03-18 Thread Michele Codutti
No, the from/to attributes were correctly instantiated.
The error was in the id used to constrain the resource to the instances
of the clone: you must reference the id of the clone, not the id of the
primitive inside the clone.
Example:
If the definition of the clone is:
<clone id="clone_service">
...
  <primitive id="service" ...>
  ...
  </primitive>
</clone>

The constraint must be:
<rsc_colocation id="ip_runs_with_service" from="ip" to="clone_service"
score="INFINITY"/>
and not:
<rsc_colocation id="ip_runs_with_service" from="ip" to="service"
score="INFINITY"/>
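For reference, the working constraint can be loaded into a running CIB
with cibadmin along these lines (a sketch against heartbeat 2.1.3;
adjust the ids to your configuration):

cibadmin -C -o constraints -X '<rsc_colocation id="ip_runs_with_service" from="ip" to="clone_service" score="INFINITY"/>'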

PS: I run heartbeat 2.1.3 from Debian Lenny.

On 17 Mar 2010, at 22.08, Andrew Beekhof wrote:

 Probably swap the values of from and to.


Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



Re: [Linux-HA] New HOWTO about High Available Firewalls

2009-02-04 Thread Michele Codutti
On Wed, 04/02/2009 at 16.16 +0100, Michael Schwartzkopff wrote:
 On Wednesday, 4 February 2009 12:27:44, Igor Neves wrote:
  Hi,
 
  I did some work with conntrackd and heartbeat a while ago.
 
  Attached is a conntrackd OCF script I made, but when I finished it I
  realized that it was not working and would never work.
  As you say in your HOWTO, conntrackd works with 2 caches.
 
 I start conntrackd outside of heartbeat, from init, so the sync setup is
 already working before the cluster starts.
 Inside heartbeat I only dump the connection table from the cache into the
 kernel (firewall starts) or clear the cache (firewall stops).
 
I also have a 2-node active-standby firewall setup in production.
The problem with conntrackd is that it has only one sync connection to
the other node. To remove this SPOF I wrote two RAs:
- the first starts conntrackd and checks (in the monitor action) whether
the other node is alive; if not, it restarts conntrackd with another
configuration that uses a different communication medium.
- the second simply commits the conntrack tables from the other node
when it starts.
Obviously you must colocate the second resource with an IP resource (or,
in my case, another custom RA that bridges some interfaces).
The two RAs are still in works-for-me status, but they have proved
stable for a while. Maybe in the next few days I'll post them here to
gather some comments.
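The core of the second RA is little more than a commit of conntrackd's
external cache into the kernel; a minimal sketch of its start action
(function names here are illustrative, not the actual agent):

#!/bin/sh
# Sketch of the table-committing RA's start action. Assumes conntrackd
# is already running and synchronized on this node.
. ${OCF_ROOT:-/usr/lib/ocf}/resource.d/heartbeat/.ocf-shellfuncs

conntrack_commit_start() {
    # commit the external cache (the peer's connections) into the kernel
    conntrackd -c || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}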

 If you want to write an OCF resource for that task to be done inside
 heartbeat you need a stateful agent. Your agent below is not stateful,
 i.e. it does not understand promote and demote.
 
 Re-thinking: Perhaps you could also start a conntrackd clone...
In my implementation a clone (one instance for every node) of the
table-merging RA is enough.

-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



Re: [Linux-HA] Meta data syntax

2009-01-15 Thread Michele Codutti
Heartbeat 2.1.3 on Debian etch, taken from Debian backports.

On Thu, 15/01/2009 at 10.00 +0100, Dominik Klein wrote:
 Which version are you using?
 
 That's a known and fixed bug from a rather old version.
 
 Unfortunately, the bugzilla is not available at the moment. But
 searching for bugs with keyword meta once it is back should get you to
 the changeset.
 
 Regards
 Dominik
 
 Michele Codutti wrote:
  Hello, I'm working on an RA and I have some problems with the
  meta-data part of my RA.
  The script works correctly and I've tested it with ocf-tester.
  For a field test I've set up a two-node cluster and installed it there.
  The configuration is done through the GUI (hbclient).
  The problem shows up in the GUI when you create a new resource and
  select my RA: no parameters are shown by the GUI. If I fill in the
  parameter fields manually, the resource starts and works as expected
  (very well :) ).
  The error that I see in the log is:
  mgmtd: [6873]: ERROR: lrm_get_rsc_type_metadata(572): got a return code
  HA_FAIL from a reply message of rmetadata with function
  get_ret_from_msg.
  
  I've checked the XML almost 20 times, but I find it syntactically
  correct. I post it here hoping that someone can give me a hint about
  how to fix this situation:
  
  meta_data() {
  cat <<EOF
  <?xml version="1.0"?>
  <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
  <resource-agent name="LSBwrapper" version="0.1">
  <version>1.0</version>
  
  <longdesc lang="en">
  This is a wrapper Resource Agent built around a 3.1 LSB init script.
  You can also override the defined functions with an override script.
  </longdesc>
  <shortdesc lang="en">LSB wrapper resource agent</shortdesc>
  
  <parameters>
  
  <parameter name="InitScript" unique="1">
  <longdesc lang="en">
  Location of the init script.
  </longdesc>
  <shortdesc lang="en">Init script location</shortdesc>
  <content type="string" />
  </parameter>
  
  <parameter name="OverrideScript" unique="1">
  <longdesc lang="en">
  Location of the override script.
  </longdesc>
  <shortdesc lang="en">Override script location</shortdesc>
  <content type="string" />
  </parameter>
  
  </parameters>
  
  <actions>
  <action name="start"      timeout="90" />
  <action name="stop"       timeout="100" />
  <action name="monitor"    timeout="20" interval="10" depth="0"
  start-delay="0" />
  <action name="reload"     timeout="90" />
  <action name="meta-data"  timeout="5" />
  <action name="verify-all" timeout="30" />
  </actions>
  
  </resource-agent>
  EOF
  }
  The RA is a generic wrapper around a 3.1 LSB init script. The main
  function calls meta-data (if requested) and then exits with
  $OCF_SUCCESS.
  
  Any suggestion is really appreciated.
 
-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



[Linux-HA] Meta data syntax

2009-01-14 Thread Michele Codutti
Hello, I'm working on an RA and I have some problems with the meta-data
part of my RA.
The script works correctly and I've tested it with ocf-tester.
For a field test I've set up a two-node cluster and installed it there.
The configuration is done through the GUI (hbclient).
The problem shows up in the GUI when you create a new resource and
select my RA: no parameters are shown by the GUI. If I fill in the
parameter fields manually, the resource starts and works as expected
(very well :) ).
The error that I see in the log is:
mgmtd: [6873]: ERROR: lrm_get_rsc_type_metadata(572): got a return code
HA_FAIL from a reply message of rmetadata with function
get_ret_from_msg.

I've checked the XML almost 20 times, but I find it syntactically
correct. I post it here hoping that someone can give me a hint about how
to fix this situation:

meta_data() {
cat <<EOF
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="LSBwrapper" version="0.1">
<version>1.0</version>

<longdesc lang="en">
This is a wrapper Resource Agent built around a 3.1 LSB init script.
You can also override the defined functions with an override script.
</longdesc>
<shortdesc lang="en">LSB wrapper resource agent</shortdesc>

<parameters>

<parameter name="InitScript" unique="1">
<longdesc lang="en">
Location of the init script.
</longdesc>
<shortdesc lang="en">Init script location</shortdesc>
<content type="string" />
</parameter>

<parameter name="OverrideScript" unique="1">
<longdesc lang="en">
Location of the override script.
</longdesc>
<shortdesc lang="en">Override script location</shortdesc>
<content type="string" />
</parameter>

</parameters>

<actions>
<action name="start"      timeout="90" />
<action name="stop"       timeout="100" />
<action name="monitor"    timeout="20" interval="10" depth="0"
start-delay="0" />
<action name="reload"     timeout="90" />
<action name="meta-data"  timeout="5" />
<action name="verify-all" timeout="30" />
</actions>

</resource-agent>
EOF
}
The RA is a generic wrapper around a 3.1 LSB init script. The main
function calls meta-data (if requested) and then exits with
$OCF_SUCCESS.

Any suggestion is really appreciated.
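A quick way to double-check the generated XML by hand, assuming the
agent is installed under the usual OCF path, is to run the meta-data
action and feed the output to a validating parser:

/usr/lib/ocf/resource.d/heartbeat/LSBwrapper meta-data | xmllint --noout -
echo $?   # non-zero if xmllint found the XML malformed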

-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



Re: [Linux-HA] Two Apaches with two IPs in an active-active configuration

2008-12-03 Thread Michele Codutti
Hello, maybe I was not clear about why I wrote here:
I don't want to report a bug, I only want some suggestions to resolve my
configuration problems.
The version of heartbeat is only a reference so that readers know which
features are available.
Maybe someone uses the same version of heartbeat, and maybe someone has
solved the same problem. If that person is kind enough to write his
thoughts about my question, I will be grateful to him.

On Tue, 02/12/2008 at 16.50 +0100, Andrew Beekhof wrote:
 On Tue, Dec 2, 2008 at 15:24, Michele Codutti [EMAIL PROTECTED] wrote:
  Hello, I want to set up a webserver cluster with two nodes in an
  active-active configuration. I have a DNS name for the cluster:
  www.example.com. This name is resolved by DNS with the round-robin
  technique to two IPs, 10.0.0.1 and 10.0.0.2. I MUST use heartbeat
  version 2.0.7 (Debian 4.0 Etch).
 
 Then you're in the wrong place... you need a Debian support list.
 2.0.7 was released over two years ago, and our desire to re-visit bugs
 we've already fixed is minimal.
 
 It's not even clear to me how, after re-finding the problem, we can
 provide you with a fix if you can't/won't upgrade.
 If you insist on using only what Debian provides, then we've no way to
 help you.
 
 
  I want to configure HB to achieve this:
  1) In a normal situation (2 nodes running) each node must have one IP
  and one apache running.
  2) If the apache on one node fails, the IP on that node must migrate
  to the remaining node.
  3) When a node that had failures is repaired, the IP and the Apache
  must return to run on that node.
 
  My first setup was:
   * Resources
 - IP1:IPaddr2(OCF)
 - IP2:IPaddr2(OCF)
 - WebServer(clone max:2 node_max:1):apache(OCF)
   * Constraints:
 - IP1_where_WebServer
 - IP2_where_WebServer
  Initially the resources are equally balanced across the two nodes like this:
   * node1
 - IP1
 - WebServer_instance:0
   * node2
 - IP2
 - WebServer_instance:1
  When one webserver instance fails, the IP running on the same node
  doesn't migrate to the other node. This is not the behavior that I want.
  So I decided to try another setup:
   * Resources
 - Group1(ordered, collocated)
  IP1:IPaddr2(OCF)
  WebServer1:apache(OCF)
 - Group2(ordered, collocated)
  IP2:IPaddr2(OCF)
  WebServer2:apache(OCF)
  Initially the resources are equally balanced across the two nodes like this:
  * node1
 - Group1
  IP1
  WebServer1
  * node2
 - Group2
  IP2
  WebServer2
  When one webserver instance fails, the IP running on the same node
  migrates to the other node together with the apache resource. This is
  a good approximation of what I want (the illusion of two running
  WebServers isn't pretty, but it works). Now, to restore the migrated
  IP and WebServer, I reset the fail-counts of every resource, but they
  don't come back to their original node. This is not what I want. Only
  if I restart the service on the node where the resource failed does
  the entire group migrate back to the original node.
  Can anyone suggest a better way to obtain what I need?
  Thanks in advance
 
  --
  Michele Codutti
  Centro Servizi Informatici e Telematici (CSIT)
  Universita' degli Studi di Udine
  via Delle Scienze, 208 - 33100 UDINE
  tel +39 0432 558928
  fax +39 0432 558911
  e-mail: michele.codutti at uniud.it
 
 
-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



[Linux-HA] Two Apaches with two IPs in an active-active configuration

2008-12-02 Thread Michele Codutti
Hello, I want to set up a webserver cluster with two nodes in an
active-active configuration. I have a DNS name for the cluster:
www.example.com. This name is resolved by DNS with the round-robin
technique to two IPs, 10.0.0.1 and 10.0.0.2. I MUST use heartbeat
version 2.0.7 (Debian 4.0 Etch).
I want to configure HB to achieve this:
1) In a normal situation (2 nodes running) each node must have one IP
and one apache running.
2) If the apache on one node fails, the IP on that node must migrate
to the remaining node.
3) When a node that had failures is repaired, the IP and the Apache
must return to run on that node.

My first setup was:
 * Resources
- IP1:IPaddr2(OCF)
- IP2:IPaddr2(OCF)
- WebServer(clone max:2 node_max:1):apache(OCF)
 * Constraints:
- IP1_where_WebServer
- IP2_where_WebServer
Initially the resources are equally balanced across the two nodes like this:
 * node1
- IP1
- WebServer_instance:0
 * node2
- IP2
- WebServer_instance:1
When one webserver instance fails, the IP running on the same node
doesn't migrate to the other node. This is not the behavior that I want.
So I decided to try another setup:
 * Resources
- Group1(ordered, collocated)
 IP1:IPaddr2(OCF)
 WebServer1:apache(OCF)
- Group2(ordered, collocated)
 IP2:IPaddr2(OCF)
 WebServer2:apache(OCF)
Initially the resources are equally balanced across the two nodes like this:
* node1
- Group1
 IP1
 WebServer1
* node2
- Group2
 IP2
 WebServer2
When one webserver instance fails, the IP running on the same node
migrates to the other node together with the apache resource. This is a
good approximation of what I want (the illusion of two running
WebServers isn't pretty, but it works). Now, to restore the migrated IP
and WebServer, I reset the fail-counts of every resource, but they don't
come back to their original node. This is not what I want. Only if I
restart the service on the node where the resource failed does the
entire group migrate back to the original node.
Can anyone suggest a better way to obtain what I need?
Thanks in advance
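For completeness, this is roughly how the first setup's colocation
constraints were loaded (a sketch; the ids match the lists above, and
the exact attribute syntax may need adjusting for 2.0.7):

cibadmin -C -o constraints -X '<rsc_colocation id="IP1_where_WebServer" from="IP1" to="WebServer" score="INFINITY"/>'
cibadmin -C -o constraints -X '<rsc_colocation id="IP2_where_WebServer" from="IP2" to="WebServer" score="INFINITY"/>'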

-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



Re: [Linux-HA] Why isn't the fail count increased to a value > 1?

2008-11-28 Thread Michele Codutti
Is there any patch to apply to 2.1.3 that fixes this problem?

On Thu, 27/11/2008 at 23.00 +0100, Andrew Beekhof wrote:
 On Thu, Nov 27, 2008 at 17:47, Michele Codutti [EMAIL PROTECTED] wrote:
  On Thu, 27/11/2008 at 17.19 +0100, Francisco José Méndez Cirera wrote:
  It's a bug; you must install the latest version.
 
  Do you mean 2.1.4?
  Please don't tell me so! I don't want to install software from outside
  the distribution!
 
 Then I suggest you contact your distribution for support.
 
-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



[Linux-HA] Why isn't the fail count increased to a value > 1?

2008-11-27 Thread Michele Codutti
Hello, I'm testing heartbeat 2.1.3 (packaged by Debian) and I'm seeing
strange behaviour from the failcount.
According to http://www.linux-ha.org/ScoreCalculation the failcount is
increased by 1 every time my resource fails.
In my experience with heartbeat, the fail count is increased by 1 only
if the previous value was 0.
My test was conducted this way: I configured a resource that is an
instance of IPaddr2, and I set resource_stickiness=3 and
resource_failure_stickyness=-1 on the RA. The first time I started the
resource, the CRM chose a node (let's say node1) on which to put the IP
configured by IPaddr2. To test the failure behaviour I deleted (by hand)
the IP from the interface configured by my resource, and the monitor
operation detected the failure and restored the resource.
I checked that the score of IPaddr2 on the running node was 2 and the
failcount was 1.
Then, to test a second failure on the same node, I deleted the IP again.
This time too the resource was restored, but while I expected the score
to be 1, it was still 2, and the failcount was not incremented
(failcount=1)!
Is this the normal behaviour of the failcount? Is there any parameter in
the configuration file or in cib.xml that I must set to change this
binary behaviour of the failcount?
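(For reference, I injected the failure and read the values back with
commands along these lines; the address, interface, node and resource
names here are just from my test setup:

ip addr del 10.0.0.1/24 dev eth0    # remove the address behind heartbeat's back
crm_failcount -G -U node1 -r myIP   # query the failcount for the resource on node1)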

Thanks in advance

-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it



Re: [Linux-HA] Why isn't the fail count increased to a value > 1?

2008-11-27 Thread Michele Codutti
On Thu, 27/11/2008 at 17.19 +0100, Francisco José Méndez Cirera wrote:
 It's a bug; you must install the latest version.

Do you mean 2.1.4?
Please don't tell me so! I don't want to install software from outside
the distribution!

-- 
Michele Codutti
Centro Servizi Informatici e Telematici (CSIT)
Universita' degli Studi di Udine
via Delle Scienze, 208 - 33100 UDINE
tel +39 0432 558928
fax +39 0432 558911
e-mail: michele.codutti at uniud.it
