Re: [Linux-HA] Updating LDAP in Heartbeat/DRDB Cluster

2011-02-28 Thread Brett Delle Grazie
Putting this back on list since it may be relevant.

On 24 February 2011 15:16, Dimitri Maziuk  wrote:
> On 2/24/2011 1:55 AM, Brett Delle Grazie wrote:
>>
>> Hi,
>
>> Timeouts (connect, bind and retry count) can also be specified in
>> ldap.conf.
>> Can you use a caching daemon (nscd or preferably nslcd) to avoid your
>> ls -l problem?
>
> Yes and yes. However,
> a) you can't set timeouts below tcp timeouts (well, you can, but it doesn't
> make sense), and the latter are set system-wide. Reducing them will affect
> every tcp/ip application, which may or may not cause problems.
> b) The problem with cache is consistency. It works great when nothing ever
> changes, when e.g. someone changes their password, the caches have to be
> invalidated all over the place. Which is not how it works. You can reduce
> retention intervals, but then you're back to square 1.

I don't think you can get around this without caching of some sort -
NSCD / NSLCD both have Time To Live (TTL) values for positive and
negative results.

1) I cache only group and passwd information (i.e. user IDs, group IDs),
never shadow or hosts (see below).
2) I purposely disable permanent caching (in memory only; a restart
clears the cache).
3) I set the TTL for positive results to 5 minutes and for negative
results to 1 minute. YMMV.

With this setup you are at most 5 minutes away from being able to log in
with a newly created user, while at the same time reducing network
traffic and load on the LDAP directory.
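
As a rough sketch of the three points above, the snippet below prints an
nscd.conf fragment. The directive names are the usual glibc nscd ones; the
helper name nscd_conf_fragment is made up for illustration, and the values
(300 s positive, 60 s negative) are simply the TTLs mentioned above, not a
recommendation for every site.

```shell
# Hypothetical helper: print an nscd.conf fragment that caches only
# passwd and group, keeps the cache in memory only, and disables the
# hosts cache. Review against nscd.conf(5) before using it anywhere.
nscd_conf_fragment() {
    cat <<'EOF'
enable-cache            passwd  yes
positive-time-to-live   passwd  300
negative-time-to-live   passwd  60
persistent              passwd  no

enable-cache            group   yes
positive-time-to-live   group   300
negative-time-to-live   group   60
persistent              group   no

enable-cache            hosts   no
EOF
}

nscd_conf_fragment
```

The output is meant to be reviewed and merged into /etc/nscd.conf by hand
rather than written there blindly.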

For DNS, I'd consider a local caching resolver that correctly honours TTLs.

>
> Irix and solaris come with nscd enabled by default (on irix it's called
> something else). On both systems I had to manually restart nscd after a dns
> change (record updated or deleted), a password change, locking user account,
> etc., every time.

As I indicated, I disable the 'hosts' cache for this very reason. DNS is
one of those things you don't want cached locally unless TTL values are
properly honoured, and NSCD is notorious for bugs / issues / problems
with its hosts and shadow caching capabilities. Admittedly, things should
have improved with more recent versions, but I haven't been tempted to
check.

>
> We had problems ("shell freezes") with one-server openldap setup after
> initial ldap migration. 2nd server didn't help, but switching to
> active/passive "R1" cluster did. For roughly the same amount of setup as 2
> servers with syncrepl and fine-tuning tcp parameters.

Most likely just caching passwd / group with low fixed TTL values
(5 minutes positive, 1 minute negative) would resolve this.

Good luck.

>
> Dima
>


-- 
Best Regards,

Brett Delle Grazie
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Updating LDAP in Heartbeat/DRDB Cluster

2011-02-23 Thread Brett Delle Grazie
Hi,

On 23 February 2011 22:57, Dimitri Maziuk  wrote:
> Serge Dubrouski wrote:
>
>> But you still can have just 1 IP associated with a node that has LDAP
>> up. Or you can have an IP with load balancer and health monitor. It's
>> all design issues.
>
> Yes. You can have a lot of things. however, the ldap failover + syncrepl
> README is just put "URI server-1 server-2" in /etc/ldap.conf. That is
> the setup that's not so great when things actually fail.

Timeouts (connect, bind and retry count) can also be specified in ldap.conf.
Can you use a caching daemon (nscd or preferably nslcd) to avoid your ls -l
problem?
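
To make the timeout suggestion a bit more concrete, here is a sketch that
prints an /etc/ldap.conf fragment. It assumes the file is the one read by
nss_ldap / pam_ldap, and the option names are the ones I understand those
modules to accept; check the man page for your version. The helper name
and all the values are illustrative only, and the two server names are
just the placeholders from the quoted README line.

```shell
# Hypothetical helper: print an nss_ldap-style ldap.conf fragment with
# short connect/search timeouts and a soft bind policy.
ldap_conf_fragment() {
    cat <<'EOF'
uri ldap://server-1/ ldap://server-2/
# seconds to wait for a connect/bind before moving on to the next server
bind_timelimit 5
# seconds to wait for a search to complete
timelimit 10
# fail fast instead of retrying indefinitely when no server answers
bind_policy soft
# number of reconnect attempts
nss_reconnect_tries 2
EOF
}

ldap_conf_fragment
```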

>
> Dima
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>



-- 
Best Regards,

Brett Delle Grazie


Re: [Linux-HA] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data

2011-02-16 Thread Brett Delle Grazie
Hi,

Apologies for reposting, but I forgot to 'reply-all'. Details below.

On 14 February 2011 09:04, Andrew Beekhof  wrote:
> On Fri, Feb 4, 2011 at 11:23 AM, Brett Delle Grazie
>  wrote:
>> Hi,
>>
>> Apologies for cross-posting but I'm not sure where this problem resides.
>>
>> I'm running:
>> corosync-1.2.7-1.1.el5.x86_64
>> corosynclib-1.2.7-1.1.el5.x86_64
>> cluster-glue-1.0.6-1.6.el5.x86_64
>> cluster-glue-libs-1.0.6-1.6.el5.x86_64
>> pacemaker-1.0.10-1.4.el5.x86_64
>> pacemaker-libs-1.0.10-1.4.el5.x86_64
>> resource-agents-1.0.3-2.6.el5.x86_64
>>
>> on RHEL5.
>>
>> In one of my resource agents (tomcat) I'm directly outputting the result of:
>> $((OCF_RESKEY_CRM_meta_timeout/1000))
>> to an external file.
>> and its coming up with a value of '100'
>>
>> Whereas the resource definition in pacemaker specifies timeout of '30'
>> specifically:
>>
>> primitive tomcat_tc1 ocf:intact:tomcat \
>>        params tomcat_user="tomcat" catalina_home="/opt/tomcat6" \
>>                catalina_pid="/home/tomcat/tc1/temp/tomcat.pid" \
>>                catalina_rotate_log="NO" script_log="/home/tomcat/tc1/logs/tc1.log" \
>>                statusurl="http://127.0.0.1/version/" java_home="/usr/lib/jvm/java" \
>>        op start interval="0" timeout="70" \
>>        op stop interval="0" timeout="20" \
>>        op monitor interval="60" timeout="30" start-delay="70"
>>
>> Is this a known bug?
>
> No.  Could you file a bug please?

Bug filed: http://developerbugs.linux-foundation.org/show_bug.cgi?id=2560

>
>> Does it affect all operation timeouts?
>
> Unknown
>
Okay thanks.

-- 
Best Regards,

Brett Delle Grazie


Re: [Linux-HA] [Linux-ha-dev] resource agents 1.0.4-rc announcement

2011-02-14 Thread Brett Delle Grazie
Hi,

On 14 February 2011 19:09, Florian Haas  wrote:
> On 02/14/2011 07:54 PM, Raoul Bhatia [IPAX] wrote:
>> hi,
>>
>> On 14.02.2011 17:56, Dejan Muhamedagic wrote:
>>> Hello,
>>>
>>> The current repository of Resource Agents has been tagged to
>>> agents-1.0.4-rc on Friday evening.
>>>
>>> Some major additions and improvements:
>>>
>>> - conntrackd, exportfs, nginx, fio: new agents
>>> - mysql: master-slave functionality and replication monitoring
>>
>> i have some serious issues with mysql master-slave and rapid fail over.
>> imho, and i might be wrong, this functionality "is not quite there yet".
>>
>> the basic problem: master fail over from node1 to node2 and back again
>> makes node2 try to parse the binlog from the very start.
>>
>> "CHANGE MASTER TO" does not honor the slave's last position for a given
>> master upon fail over and/or the binlogs on the master are never reset
>> thus leading to duplicate parsing of the very same binlog.
>>
>> i'll see to a more detailed report but am kind of swamped in work right
>> now.
>
> Okay, thanks for the feedback. We'll see what we can do about this. If
> you can spare any more time looking into this issue, it would be much
> appreciated.
>
> Cheers,
> Florian
>
>

There's also one possible issue in the tomcat stop operation, due to an
apparent bug in pacemaker where the resource agent is passed the wrong
OCF_RESKEY_CRM_meta_timeout. I'm going to lodge a bug report tomorrow.


-- 
Best Regards,

Brett Delle Grazie


[Linux-HA] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data

2011-02-04 Thread Brett Delle Grazie
Hi,

Apologies for cross-posting but I'm not sure where this problem resides.

I'm running:
corosync-1.2.7-1.1.el5.x86_64
corosynclib-1.2.7-1.1.el5.x86_64
cluster-glue-1.0.6-1.6.el5.x86_64
cluster-glue-libs-1.0.6-1.6.el5.x86_64
pacemaker-1.0.10-1.4.el5.x86_64
pacemaker-libs-1.0.10-1.4.el5.x86_64
resource-agents-1.0.3-2.6.el5.x86_64

on RHEL5.

In one of my resource agents (tomcat) I'm directly outputting the result of:
$((OCF_RESKEY_CRM_meta_timeout/1000))
to an external file, and it's coming up with a value of '100'.

The resource definition in pacemaker, however, specifies a timeout of '30'.
Specifically:

primitive tomcat_tc1 ocf:intact:tomcat \
        params tomcat_user="tomcat" catalina_home="/opt/tomcat6" \
                catalina_pid="/home/tomcat/tc1/temp/tomcat.pid" \
                catalina_rotate_log="NO" script_log="/home/tomcat/tc1/logs/tc1.log" \
                statusurl="http://127.0.0.1/version/" java_home="/usr/lib/jvm/java" \
        op start interval="0" timeout="70" \
        op stop interval="0" timeout="20" \
        op monitor interval="60" timeout="30" start-delay="70"

Is this a known bug? Does it affect all operation timeouts?
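
For anyone wanting to reproduce the check outside the cluster, the
arithmetic can be sketched as below. The function name and the 20000 ms
fallback are my own assumptions for illustration; the only thing taken
from the agent is the division of OCF_RESKEY_CRM_meta_timeout by 1000.

```shell
# Convert the operation timeout passed to the agent (milliseconds) into
# whole seconds. The 20000 ms fallback is an assumption so the function
# also works when the variable is not set, e.g. in a manual test.
meta_timeout_seconds() {
    ms=${OCF_RESKEY_CRM_meta_timeout:-20000}
    echo $((ms / 1000))
}

# What one would expect for the monitor operation (timeout="30").
OCF_RESKEY_CRM_meta_timeout=30000
echo "expected: $(meta_timeout_seconds)s"

# The value actually written to the file corresponds to 100000 ms.
OCF_RESKEY_CRM_meta_timeout=100000
echo "observed: $(meta_timeout_seconds)s"
```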

Thanks,

-- 
Best Regards,

Brett Delle Grazie


Re: [Linux-HA] [PATCH]es Tomcat resource agent - multiple instances

2011-01-25 Thread Brett Delle Grazie
Hi Dejan,

>>>>
>>>> OK. Could you please produce new patch which would keep the
>>>> obsolete parameters, but mark them as such in the meta-data.
>>>> That way we can keep the existing installations happy. I guess
>>>> that you could anyway use the meta timeout parameter, it would
>>>> anyway need to be set to something higher than
>>>> tomcat_stop_timeout.
>>
>> Will do.

The attached patch uses the operation's meta timeout and ignores
tomcat_stop_timeout. This is necessary because tomcat_stop_timeout should
always be less than the meta timeout; otherwise the resource agent would
never have worked properly. Note that the stop operation is now a blocking
call; before, it ran in the background.
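
A minimal sketch of the timeout calculation the patch relies on is shown
below. The lower bound of 1 second and the 20000 ms fallback are my own
additions for illustration and are not part of the patch; the function
name is hypothetical.

```shell
# Derive the value handed to "catalina.sh stop <n> -force": one second
# less than the stop operation timeout (given in milliseconds), so that
# Tomcat's own forced kill fires before the cluster times the operation out.
stop_timeout_seconds() {
    ms=${OCF_RESKEY_CRM_meta_timeout:-20000}   # fallback is an assumption
    secs=$((ms / 1000 - 1))
    # Guard (not in the patch): never pass zero or a negative number.
    [ "$secs" -ge 1 ] || secs=1
    echo "$secs"
}

# With op stop timeout="20" the agent would end up running:
OCF_RESKEY_CRM_meta_timeout=20000
echo "catalina.sh stop $(stop_timeout_seconds) -force"
```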

-- 
Best Regards,

Brett Delle Grazie
From e883f8443e96b82af1a54220aa25b7289d3be81f Mon Sep 17 00:00:00 2001
From: Brett Delle Grazie 
Date: Tue, 25 Jan 2011 20:38:15 +0000
Subject: [PATCH] Med: tomcat: Use Tomcat stop TIMEOUT -force to improve stop

The tomcat stop script can be told to forcefully terminate tomcat if it
doesn't shut down nicely within a specified period. Using this reduces
the stop case to almost a simple 'call tomcat stop script in blocking
mode'.  The timeout is set to one second shorter than the stop operation
timeout. The tomcat stop script checks for and uses the PID file.
---
 heartbeat/tomcat |   59 +++--
 1 files changed, 13 insertions(+), 46 deletions(-)

diff --git a/heartbeat/tomcat b/heartbeat/tomcat
index 1248a97..1eaac98 100755
--- a/heartbeat/tomcat
+++ b/heartbeat/tomcat
@@ -24,8 +24,8 @@
 # OCF parameters:
 #   OCF_RESKEY_tomcat_name - The name of the resource. Default is tomcat
 #   OCF_RESKEY_script_log  - A destination of the log of this script. Default /var/log/OCF_RESKEY_tomcat_name.log
-#   OCF_RESKEY_tomcat_stop_timeout  - Time-out at the time of the stop. Default is 5
-#   OCF_RESKEY_tomcat_suspend_trialcount  - The re-try number of times awaiting a stop. Default is 10
+#   OCF_RESKEY_tomcat_stop_timeout  - Time-out at the time of the stop. Default is 5. DEPRECATED
+#   OCF_RESKEY_tomcat_suspend_trialcount  - The re-try number of times awaiting a stop. Default is 10. DEPRECATED
 #   OCF_RESKEY_tomcat_user  - A user name to start a resource. Default is root
 #   OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
 #   OCF_RESKEY_java_home - Home directory of Java. Default is none
@@ -175,21 +175,23 @@ END_TOMCAT_START
 # Stop Tomcat
 stop_tomcat()
 {
+	STOP_TIMEOUT=$((OCF_RESKEY_CRM_meta_timeout/1000-1))
+
 	cd "$CATALINA_HOME/bin"
 
 	echo "`date "+%Y/%m/%d %T"`: stop  ###" >> "$TOMCAT_CONSOLE"
 
 	if [ "$RESOURCE_TOMCAT_USER" = RUNASIS ]; then
-		"$CATALINA_HOME/bin/catalina.sh" stop \
-			>> "$TOMCAT_CONSOLE" 2>&1 &
+		"$CATALINA_HOME/bin/catalina.sh" stop $STOP_TIMEOUT -force \
+			>> "$TOMCAT_CONSOLE" 2>&1
 	else
-		cat<<-END_TOMCAT_STOP | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
+		cat<<-END_TOMCAT_STOP | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1
 			export JAVA_HOME=${JAVA_HOME}
 			export JAVA_OPTS="${JAVA_OPTS}"
 			export CATALINA_HOME=${CATALINA_HOME}
 			export CATALINA_BASE=${CATALINA_BASE}
 			export CATALINA_PID=${CATALINA_PID}
-			$CATALINA_HOME/bin/catalina.sh stop
+			$CATALINA_HOME/bin/catalina.sh stop $STOP_TIMEOUT -force
 END_TOMCAT_STOP
 	fi
 
@@ -197,40 +199,7 @@ END_TOMCAT_STOP
 	while isalive_tomcat; do
 		sleep 1
 		lapse_sec=`expr $lapse_sec + 1`
-		ocf_log debug "stop_tomcat[$TOMCAT_NAME]: stop NORM $lapse_sec/$TOMCAT_STOP_TIMEOUT"
-		if [ $lapse_sec -ge $TOMCAT_STOP_TIMEOUT ]; then
-			break
-		fi
-	done
-
-	if isalive_tomcat; then
-		lapse_sec=0
-		while true; do
-			sleep 1
-			lapse_sec=`expr $lapse_sec + 1`
-			ocf_log debug "stop_tomcat[$TOMCAT_NAME]: suspend tomcat by SIGTERM ($lapse_sec/$TOMCAT_SUSPEND_TRIALCOUNT)"
-			pkill -TERM -f "${SEARCH_STR}"
-			if isalive_tomcat; then
-				ocf_log debug "stop_tomcat[$TOMCAT_NAME]: suspend tomcat by SIGQUIT ($lapse_sec/$TOMCAT_SUSPEND_TRIALCOUNT)"
-				pkill -QUIT -f "${SEARCH_STR}"
-				if isalive_tomcat; then
-					if [ $lapse_sec -ge $TOMCAT_SUSPEND_TRIALCOUNT ]; then
-						break
-					fi
-				else
-					break
-				fi
-			else
-				break
-			fi
-		done
-	fi
-
-	lapse_sec=0
-	while isalive_tomcat; do
-		sleep 1
-		lapse_sec=`expr $lapse_sec + 1`
-		ocf_log debug "stop_tomcat[$TOMCAT_NAME]: suspend tomcat by SIGKILL ($lapse_sec)"
+		ocf_log debug "stop_tomcat[$TOMCAT_NAME]: stop failed, killing with SIGKILL ($lapse_sec)"
 		pkill -KILL -f "${SEARCH_STR}"
 	

Re: [Linux-HA] [PATCH]es Tomcat resource agent - multiple instances

2011-01-20 Thread Brett Delle Grazie
Hi (3rd time luck),

On 20 January 2011 09:57, Brett Delle Grazie
 wrote:
> I missed the bit down the end ;)
>
> On 20 January 2011 09:55, Brett Delle Grazie
>  wrote:
>> Hi,
>>
>> On 19 January 2011 12:20, Dejan Muhamedagic  wrote:
>>> Hi,
>>>
>>> On Tue, Jan 18, 2011 at 07:28:55PM +, Brett Delle Grazie wrote:
>>>> Hi Dejan,
>>>>
>>>> My changes were in a completely separate SVN repository with other
>>>> client work.  They were rather ad-hoc as I edited things and
>>>> then fixed them after finding problems.  After a while I completely
>>>> forgot about posting back the changes so apologies there.
>>>> I had to rebase my changes using git against the mercurial tip, I'd
>>>> have learnt mercurial but I've only just got my head around git
>>>> and didn't want to make matters worse.
>>>>
>>>> Comments below :)
>>>>
>>>> On 18 January 2011 17:32, Dejan Muhamedagic  wrote:
>>>> > Hi Brett,
>>>> >
>>>> > On Tue, Jan 18, 2011 at 04:08:05PM +, Brett Delle Grazie wrote:
>>>> >> Hi,
>>>> >>
>>>> >> Its been a while but here are the patches for using multiple instances
>>>> >> of Tomcat.
>>>> >>
>>>> >> The last one (7) you may or may not wish to use...
>>>> >>
>>>> >> I apologies for having missed this for so long.
>>>> >
>>>> > NP. I appreciate your effort. Comments below.
>>>> >
>>>> >> Enjoy!
>>>> >>
>>>> >> --
>>>> >> Best Regards,
>>>> >>
>>>> >> Brett Delle Grazie
>>>> >

Re: [Linux-HA] [PATCH]es Tomcat resource agent - multiple instances

2011-01-20 Thread Brett Delle Grazie
I missed the bit down the end ;)

On 20 January 2011 09:55, Brett Delle Grazie
 wrote:
> Hi,
>
> On 19 January 2011 12:20, Dejan Muhamedagic  wrote:
>> Hi,
>>
>> On Tue, Jan 18, 2011 at 07:28:55PM +, Brett Delle Grazie wrote:
>>> Hi Dejan,
>>>
>>> My changes were in a completely separate SVN repository with other
>>> client work.  They were rather ad-hoc as I edited things and
>>> then fixed them after finding problems.  After a while I completely
>>> forgot about posting back the changes so apologies there.
>>> I had to rebase my changes using git against the mercurial tip, I'd
>>> have learnt mercurial but I've only just got my head around git
>>> and didn't want to make matters worse.
>>>
>>> Comments below :)
>>>
>>> On 18 January 2011 17:32, Dejan Muhamedagic  wrote:
>>> > Hi Brett,
>>> >
>>> > On Tue, Jan 18, 2011 at 04:08:05PM +, Brett Delle Grazie wrote:
>>> >> Hi,
>>> >>
>>> >> Its been a while but here are the patches for using multiple instances
>>> >> of Tomcat.
>>> >>
>>> >> The last one (7) you may or may not wish to use...
>>> >>
>>> >> I apologies for having missed this for so long.
>>> >
>>> > NP. I appreciate your effort. Comments below.
>>> >
>>> >> Enjoy!
>>> >>
>>> >> --
>>> >> Best Regards,
>>> >>
>>> >> Brett Delle Grazie
>>> >

Re: [Linux-HA] [PATCH]es Tomcat resource agent - multiple instances

2011-01-20 Thread Brett Delle Grazie
Hi,

On 19 January 2011 12:20, Dejan Muhamedagic  wrote:
> Hi,
>
> On Tue, Jan 18, 2011 at 07:28:55PM +0000, Brett Delle Grazie wrote:
>> Hi Dejan,
>>
>> My changes were in a completely separate SVN repository with other
>> client work.  They were rather ad-hoc as I edited things and
>> then fixed them after finding problems.  After a while I completely
>> forgot about posting back the changes so apologies there.
>> I had to rebase my changes using git against the mercurial tip, I'd
>> have learnt mercurial but I've only just got my head around git
>> and didn't want to make matters worse.
>>
>> Comments below :)
>>
>> On 18 January 2011 17:32, Dejan Muhamedagic  wrote:
>> > Hi Brett,
>> >
>> > On Tue, Jan 18, 2011 at 04:08:05PM +, Brett Delle Grazie wrote:
>> >> Hi,
>> >>
>> >> Its been a while but here are the patches for using multiple instances
>> >> of Tomcat.
>> >>
>> >> The last one (7) you may or may not wish to use...
>> >>
>> >> I apologies for having missed this for so long.
>> >
>> > NP. I appreciate your effort. Comments below.
>> >
>> >> Enjoy!
>> >>
>> >> --
>> >> Best Regards,
>> >>
>> >> Brett Delle Grazie
>> >

Re: [Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes

2011-01-20 Thread Brett Delle Grazie
On 19 January 2011 12:42, Dejan Muhamedagic  wrote:
> Hi,
>
> On Tue, Jan 18, 2011 at 12:54:50PM -0500, Vadym Chepkov wrote:
>>
>> On Jan 18, 2011, at 10:28 AM, Dejan Muhamedagic wrote:
>>
>> > On Tue, Jan 18, 2011 at 07:58:28AM -0500, Vadym Chepkov wrote:
>> >>
>> >> On Jan 17, 2011, at 5:43 PM, Brett Delle Grazie wrote:
>> >>
>> >>> Hi Dejan,
>> >>>
>> >>> On 17 January 2011 14:54, Dejan Muhamedagic  wrote:
>> >>>> Hi Brett,
>> >>>>
>> >>>> Long time.
>> >>>
>> >>> Indeed it is - thank you for the reminder!
>> >>>
>> >>> This one simply uses here documents for start/stop operations.
>> >>
>> >> Using 'su -'  always makes me uncomfortable, because this
>> >> would invoke so many things intended for login sessions only,
>> >> especially on systems with /etc/profile.d/. Just a thought.
>> >
>> > Well, I'd rather live without it, but sometimes it seems
>> > necessary. I don't know if tomcat is such a beast, but it could
>> > be.
>>
>> definitely not, it was done in the past to put enormous amount of 
>> environment variables into .profile of tomcat user
>> But catalina.sh has this code for quite awhile now:
>>
>> if [ -r "$CATALINA_BASE"/bin/setenv.sh ]; then
>>   . "$CATALINA_BASE"/bin/setenv.sh
>> elif [ -r "$CATALINA_HOME"/bin/setenv.sh ]; then
>>   . "$CATALINA_HOME"/bin/setenv.sh
>> fi
>>
>> so environment can be set in a setenv.sh
>
> OK. I'd rather leave this to you tomcat experts to figure out
> which is the right way, but let's keep it as it is in order not
> to disturb the existing installations.

setenv.sh usage requires the following to be set:
(a) CATALINA_HOME
(b) CATALINA_BASE (if different from CATALINA_HOME)

Inside setenv.sh you set:
JAVA_OPTS
CATALINA_OPTS
CATALINA_PID
...

However, a 'standard' setenv.sh will break the resource agent. Why?
Because of the way the resource agent checks whether the process is alive.
Instead of using the PID file and an associated PID test, it greps the
process table for the -D parameter added to CATALINA_OPTS.

That grep finds nothing, because CATALINA_OPTS is usually reset inside
setenv.sh. One can work around this by suggesting that users set
CATALINA_OPTS in setenv.sh as:
CATALINA_OPTS="${CATALINA_OPTS} new options go here"
This is in fact what I do, and it lets me use both the resource agent
and manual start/stop for testing.
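
A small sketch of that workaround follows. The -Dname=tc1 marker, the
memory options and the paths are placeholders I have made up to show the
effect; the point is only that setenv.sh appends to CATALINA_OPTS instead
of overwriting it.

```shell
# Simulate what is already in the environment before catalina.sh sources
# setenv.sh (marker value and base directory are hypothetical).
CATALINA_OPTS="-Dname=tc1"
CATALINA_BASE="${CATALINA_BASE:-/home/tomcat/tc1}"

# --- contents of a hypothetical $CATALINA_BASE/bin/setenv.sh ---
# Append to, rather than replace, whatever the caller passed in.
CATALINA_OPTS="${CATALINA_OPTS} -Xms256m -Xmx512m"
CATALINA_PID="${CATALINA_BASE}/temp/tomcat.pid"
export CATALINA_OPTS CATALINA_PID
# --- end of setenv.sh ---

# The marker survives, so a grep of the process table can still find it.
echo "CATALINA_OPTS=$CATALINA_OPTS"
```

Had setenv.sh used a plain CATALINA_OPTS="-Xms256m -Xmx512m" instead, the
marker would be gone and the agent's liveness check would come up empty.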

A completely redesigned, minimalist tomcat resource agent would
require the following variables:
CATALINA_HOME
CATALINA_BASE (defaulting to CATALINA_HOME)
status_url (optional)

and a setenv.sh which _must_ define (for resource agent usage):
CATALINA_PID
(anything else is optional)

In theory everything else could be inferred from the resource agent
config (think timeouts here) and/or by importing setenv.sh (for the
PID file location).
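
For comparison with the grep-based check, a PID-file liveness test could
look roughly like this. It is a sketch, not what the current agent does:
the function name is hypothetical, and it assumes the caller runs as root
or as the Tomcat user, since kill -0 also fails when the process belongs
to someone else.

```shell
# Return 0 if the PID recorded in the given file belongs to a live
# process, non-zero otherwise (missing file, garbage content, dead PID).
tomcat_is_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile") || return 1
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely numeric
    esac
    # Signal 0 delivers nothing; it only checks that the PID can be signalled.
    kill -0 "$pid" 2>/dev/null
}

if tomcat_is_alive "${CATALINA_PID:-/nonexistent}"; then
    echo "tomcat is running"
else
    echo "tomcat is stopped"
fi
```

A stale PID file left behind by a crash can still point at an unrelated
process that reused the number, so a real agent would probably want to
combine this with the status URL check.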

>
>> Vadym
>>
>> P.S. Is it necessary for ClusterMon ?
>
> How should I know? I'm just a poor maintainer who knows close to
> nothing about single resource agents.
>
> Thanks,
>
> Dejan
>
>

-- 
Best Regards,

Brett Delle Grazie


Re: [Linux-HA] [PATCH]es Tomcat resource agent - multiple instances

2011-01-18 Thread Brett Delle Grazie
Hi Dejan,

My changes were in a completely separate SVN repository with other
client work.  They were rather ad-hoc as I edited things and
then fixed them after finding problems.  After a while I completely
forgot about posting back the changes so apologies there.
I had to rebase my changes using git against the mercurial tip, I'd
have learnt mercurial but I've only just got my head around git
and didn't want to make matters worse.

Comments below :)

On 18 January 2011 17:32, Dejan Muhamedagic  wrote:
> Hi Brett,
>
> On Tue, Jan 18, 2011 at 04:08:05PM +, Brett Delle Grazie wrote:
>> Hi,
>>
>> Its been a while but here are the patches for using multiple instances
>> of Tomcat.
>>
>> The last one (7) you may or may not wish to use...
>>
>> I apologies for having missed this for so long.
>
> NP. I appreciate your effort. Comments below.
>
>> Enjoy!
>>
>> --
>> Best Regards,
>>
>> Brett Delle Grazie
>
>> From 1c0a2ef05bfbde930962befd99799d4f6a318231 Mon Sep 17 00:00:00 2001
>> From: Brett Delle Grazie 
>> Date: Mon, 17 Jan 2011 22:09:44 +
>> Subject: [PATCH 1/7] Low: tomcat: Use here-documents to simplify start/stop operations
>>
>> ---
>>  heartbeat/tomcat |   30 +++---
>>  1 files changed, 15 insertions(+), 15 deletions(-)
>>
>> diff --git a/heartbeat/tomcat b/heartbeat/tomcat
>> index 689edc7..671ba82 100755
>> --- a/heartbeat/tomcat
>> +++ b/heartbeat/tomcat
>> @@ -146,14 +146,14 @@ start_tomcat()
>>               "$CATALINA_HOME/bin/catalina.sh" start $TOMCAT_START_OPTS \
>>                       >> "$TOMCAT_CONSOLE" 2>&1 &
>>       else
>> -             su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
>> -                     -c "export JAVA_HOME=${OCF_RESKEY_java_home};\
>> -                            export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
>> -                            export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
>> -                            export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
>> -                            export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\
>> -                            $CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}" \
>> -                     >> "$TOMCAT_CONSOLE" 2>&1 &
>> +             cat<<-END_TOMCAT_START | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
>> +                     export JAVA_HOME=${OCF_RESKEY_java_home}
>> +                     export JAVA_OPTS=-Dname=${TOMCAT_NAME}
>> +                     export CATALINA_HOME=${OCF_RESKEY_catalina_home}
>> +                     export CATALINA_PID=${OCF_RESKEY_catalina_pid}
>> +                     export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\"
>> +                     $CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}
>> +END_TOMCAT_START
>>       fi
>>
>>       while true; do
>> @@ -181,13 +181,13 @@ stop_tomcat()
>>                       >> "$TOMCAT_CONSOLE" 2>&1 &
>>               eval $tomcat_stop_cmd >> "$TOMCAT_CONSOLE" 2>&1
>>       else
>> -             su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
>> -                     -c "export JAVA_HOME=${OCF_RESKEY_java_home};\
>> -                            export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
>> -                            export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
>> -                            export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
>> -                            $CATALINA_HOME/bin/catalina.sh stop" \
>> -                     >> "$TOMCAT_CONSOLE" 2>&1 &
>> +             cat<<-END_TOMCAT_STOP | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
>> +                     export JAVA_HOME=${OCF_RESKEY_java_home}
>> +                     export JAVA_OPTS=-Dname=${TOMCAT_NAME}
>> +                     export CATALINA_HOME=${OCF_RESKEY_catalina_home}
>> +                     export CATALINA_PID=${OCF_RESKEY_catalina_pid}
>> +                     $CATALINA_HOME/bin/catalina.sh stop
>> +END_TOMCAT_STOP
>>       fi
>>
>>       lapse_sec=0
>> --
>> 1.7.1
>
> This seems to be OK.
>
>> From 8a7e6c8fd4c5f130e19eadf669550a67473f2fa5 Mon Sep 17 00:00:00 2001
>>

Re: [Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes

2011-01-18 Thread Brett Delle Grazie
On 18 January 2011 15:30, Dejan Muhamedagic  wrote:
...snip...
>> More patches are coming ... in about an hour

Patches sent on completely separate thread.
...snip...
-- 
Best Regards,

Brett Delle Grazie
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] [PATCH]es Tomcat resource agent - multiple instances

2011-01-18 Thread Brett Delle Grazie
Hi,

It's been a while, but here are the patches for using multiple instances
of Tomcat.

The last one (7) you may or may not wish to use...

I apologise for having missed this for so long.

Enjoy!

-- 
Best Regards,

Brett Delle Grazie
From 1c0a2ef05bfbde930962befd99799d4f6a318231 Mon Sep 17 00:00:00 2001
From: Brett Delle Grazie 
Date: Mon, 17 Jan 2011 22:09:44 +
Subject: [PATCH 1/7] Low: tomcat: Use here-documents to simplify start/stop operations

---
 heartbeat/tomcat |   30 +++---
 1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/heartbeat/tomcat b/heartbeat/tomcat
index 689edc7..671ba82 100755
--- a/heartbeat/tomcat
+++ b/heartbeat/tomcat
@@ -146,14 +146,14 @@ start_tomcat()
 		"$CATALINA_HOME/bin/catalina.sh" start $TOMCAT_START_OPTS \
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	else
-		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
-export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\
-$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}" \
-			>> "$TOMCAT_CONSOLE" 2>&1 &
+		cat<<-END_TOMCAT_START | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
+			export JAVA_HOME=${OCF_RESKEY_java_home}
+			export JAVA_OPTS=-Dname=${TOMCAT_NAME}
+			export CATALINA_HOME=${OCF_RESKEY_catalina_home}
+			export CATALINA_PID=${OCF_RESKEY_catalina_pid}
+			export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\"
+			$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}
+END_TOMCAT_START
 	fi
 
 	while true; do
@@ -181,13 +181,13 @@ stop_tomcat()
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 		eval $tomcat_stop_cmd >> "$TOMCAT_CONSOLE" 2>&1
 	else
-		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
-$CATALINA_HOME/bin/catalina.sh stop" \
-			>> "$TOMCAT_CONSOLE" 2>&1 &
+		cat<<-END_TOMCAT_STOP | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
+			export JAVA_HOME=${OCF_RESKEY_java_home}
+			export JAVA_OPTS=-Dname=${TOMCAT_NAME}
+			export CATALINA_HOME=${OCF_RESKEY_catalina_home}
+			export CATALINA_PID=${OCF_RESKEY_catalina_pid}
+			$CATALINA_HOME/bin/catalina.sh stop
+END_TOMCAT_STOP
 	fi
 
 	lapse_sec=0
-- 
1.7.1

From 8a7e6c8fd4c5f130e19eadf669550a67473f2fa5 Mon Sep 17 00:00:00 2001
From: Brett Delle Grazie 
Date: Tue, 18 Jan 2011 10:42:16 +
Subject: [PATCH 2/7] Low: tomcat: Fix to ensure default OCF_RESKEY_xx values are observed

Use the internal name of the OCF_RESKEY_xx variables throughout
ensuring that any defaults set at the beginning are observed.
---
 heartbeat/tomcat |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/heartbeat/tomcat b/heartbeat/tomcat
index 671ba82..9fb948a 100755
--- a/heartbeat/tomcat
+++ b/heartbeat/tomcat
@@ -147,12 +147,12 @@ start_tomcat()
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	else
 		cat<<-END_TOMCAT_START | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
-			export JAVA_HOME=${OCF_RESKEY_java_home}
+			export JAVA_HOME=${JAVA_HOME}
 			export JAVA_OPTS=-Dname=${TOMCAT_NAME}
-			export CATALINA_HOME=${OCF_RESKEY_catalina_home}
-			export CATALINA_PID=${OCF_RESKEY_catalina_pid}
-			export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\"
-			$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}
+			export CATALINA_HOME=${CATALINA_HOME}
+			export CATALINA_PID=${CATALINA_PID}
+			export CATALINA_OPTS="${CATALINA_OPTS}"
+			$CATALINA_HOME/bin/catalina.sh start ${TOMCAT_START_OPTS}
 END_TOMCAT_START
 	fi
 
@@ -182,10 +182,10 @@ stop_tomcat()
 		eval $tomcat_stop_cmd >> "$TOMCAT_CONSOLE" 2>&1
 	else
 		cat<<-END_TOMCAT_STOP | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
-			export JAVA_HOME=${OCF_RESKEY_java_home}
+			export JAVA_HOME=${JAVA_HOME}
 			export JAVA_OPTS=-Dname=${TOMCAT_NAME}
-			export CATALINA_HOME=${OCF_RESKEY_catalina_home}
-			export CATALINA_PID=${OCF_RESKEY_catalina_pid}
+			export CATALINA_HOME=${CATALINA_HOME}
+			export CATALINA_P

Re: [Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes

2011-01-18 Thread Brett Delle Grazie
Hi,

On 18 January 2011 12:58, Vadym Chepkov  wrote:
>
> On Jan 17, 2011, at 5:43 PM, Brett Delle Grazie wrote:
>
>> Hi Dejan,
>>
>> On 17 January 2011 14:54, Dejan Muhamedagic  wrote:
>>> Hi Brett,
>>>
>>> Long time.
>>
>> Indeed it is - thank you for the reminder!
>>
>> This one simply uses here documents for start/stop operations.
>
> Using 'su -'  always makes me uncomfortable, because this would invoke so 
> many things intended for login sessions only,
> especially on systems with /etc/profile.d/. Just a thought.

It was there originally, so I left it in. Isn't the login shell there so
that process limits (set with ulimit, e.g. the number of open files) can be
applied? Or are they applied irrespective of whether the su is a login or not?

More patches are coming ... in about an hour

Vadym - sorry for the double post to your email.

>
> Vadym
>
>
> __
> This email has been scanned by the MessageLabs Email Security System.
> For more information please visit http://www.messagelabs.com/email
> __
>



-- 
Best Regards,

Brett Delle Grazie


Re: [Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes

2011-01-17 Thread Brett Delle Grazie
Hi Dejan,

On 17 January 2011 14:54, Dejan Muhamedagic  wrote:
> Hi Brett,
>
> Long time.

Indeed it is - thank you for the reminder!

This one simply uses here documents for start/stop operations.
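
For anyone unfamiliar with the idiom, here is a minimal sketch of the here-document approach (not the agent code itself). It assumes made-up values for CATALINA_HOME and TOMCAT_NAME, a hypothetical helper name render_start_script, and uses plain "sh" in place of "su - -s /bin/sh USER" so that it can be run without root. Because the delimiter is unquoted, ${...} is expanded by the calling shell before the child reads the text, while \$ defers expansion to the child.

```shell
#!/bin/sh
# Hypothetical values; in the agent these come from the resource parameters.
CATALINA_HOME=/opt/tomcat
TOMCAT_NAME=tomcat1

# Emit the commands the child shell should run.
# Unquoted delimiter: ${VAR} is expanded here, by the calling shell;
# \$VAR is passed through literally and expanded later by the child.
render_start_script() {
cat <<END_SKETCH
export CATALINA_HOME=${CATALINA_HOME}
export JAVA_OPTS=-Dname=${TOMCAT_NAME}
echo "child sees: \$CATALINA_HOME/bin/catalina.sh start (\$JAVA_OPTS)"
END_SKETCH
}

# "sh" stands in for: su - -s /bin/sh "$RESOURCE_TOMCAT_USER"
render_start_script | sh
```

The point of the sketch is only that the child shell receives fully resolved values on stdin, so no nested quoting or ";\" line continuations are needed.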

-- 
Best Regards,

Brett Delle Grazie
From 1c0a2ef05bfbde930962befd99799d4f6a318231 Mon Sep 17 00:00:00 2001
From: Brett Delle Grazie 
Date: Mon, 17 Jan 2011 22:09:44 +
Subject: [PATCH] Low: tomcat: Use here-documents to simplify start/stop operations

---
 heartbeat/tomcat |   30 +++---
 1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/heartbeat/tomcat b/heartbeat/tomcat
index 689edc7..671ba82 100755
--- a/heartbeat/tomcat
+++ b/heartbeat/tomcat
@@ -146,14 +146,14 @@ start_tomcat()
 		"$CATALINA_HOME/bin/catalina.sh" start $TOMCAT_START_OPTS \
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	else
-		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
-export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\
-$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}" \
-			>> "$TOMCAT_CONSOLE" 2>&1 &
+		cat<<-END_TOMCAT_START | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
+			export JAVA_HOME=${OCF_RESKEY_java_home}
+			export JAVA_OPTS=-Dname=${TOMCAT_NAME}
+			export CATALINA_HOME=${OCF_RESKEY_catalina_home}
+			export CATALINA_PID=${OCF_RESKEY_catalina_pid}
+			export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\"
+			$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}
+END_TOMCAT_START
 	fi
 
 	while true; do
@@ -181,13 +181,13 @@ stop_tomcat()
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 		eval $tomcat_stop_cmd >> "$TOMCAT_CONSOLE" 2>&1
 	else
-		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
-$CATALINA_HOME/bin/catalina.sh stop" \
-			>> "$TOMCAT_CONSOLE" 2>&1 &
+		cat<<-END_TOMCAT_STOP | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
+			export JAVA_HOME=${OCF_RESKEY_java_home}
+			export JAVA_OPTS=-Dname=${TOMCAT_NAME}
+			export CATALINA_HOME=${OCF_RESKEY_catalina_home}
+			export CATALINA_PID=${OCF_RESKEY_catalina_pid}
+			$CATALINA_HOME/bin/catalina.sh stop
+END_TOMCAT_STOP
 	fi
 
 	lapse_sec=0
-- 
1.7.1


Re: [Linux-HA] Round Robin DNS OCF?

2010-11-30 Thread Brett Delle Grazie
Hi,

Some commercial load balancers achieve this by manipulating DNS servers with
very low time-to-live (TTL) values (e.g. 60 seconds).

However, problems occur when clients don't obey or observe the TTL values
specified by the DNS servers. This frequently happens with browsers, Java and
other client-side software that caches DNS entries well beyond the TTL.
Certain (older) versions of Java cached positive DNS results _indefinitely_
by default, irrespective of TTL (yes, really).

Also remember that some DNS servers used by ISPs may deliberately ignore TTL
values below a minimum of somewhere between 1 and 5 minutes (it varies by ISP).

I'm not saying it won't work... just be aware of these issues, which have
bitten me in the past.
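
One way to check what a given resolver is actually handing out is to look at the TTL column of the answer. A minimal sketch, assuming dig is installed; answer_ttls is a made-up helper name and www.example.com is a placeholder. Asking the same caching resolver repeatedly should normally show the TTL counting down towards zero.

```shell
#!/bin/sh
# Print "name ttl" for every record in `dig +noall +answer` style output,
# whose columns are: name, TTL, class, type, data.
answer_ttls() {
	awk '$3 == "IN" { print $1, $2 }'
}

# Example use (needs network access, so left commented out):
#   dig +noall +answer www.example.com A | answer_ttls
```

This only shows what the resolver reports; it cannot tell you how long a browser or JVM on the client side will keep the address afterwards.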

To answer your question, I don't know of a resource agent that does this;
however, it would be possible to write one - it just depends upon your DNS
server. PowerDNS can probably be made to do this with its backend in LDAP.
I don't know how easy it would be to control BIND to do this, though.

Regards,

Brett


On 30 November 2010 11:21, Michael Kromer wrote:

> Hi,
>
> wanted to know if anyone has an idea to manipulate DNS-entries in case
> of failover, by example having 3 nodes communicating with ips from
> different locations (which kills IPAddr(2) for takeover) and having them
> delivering ressources by same DNS-entry. Idea would be to remove an IP
> as soon as ressources on that node are not available anymore, and adding
> itself back as soon as everything is running.
>
> Any ideas?
>
> - mike
>



-- 
Best Regards,

Brett Delle Grazie


Re: [Linux-HA] HA-Proxy with HeartBeat

2010-11-23 Thread Brett Delle Grazie
Hi Karl,

We use HAproxy and Corosync/Pacemaker at a client site, so perhaps I can
help. Note, however, that we terminate the SSL connection at the load
balancer (using stunnel).

Which version of HAproxy are you using?

Stupid question, but does this problem occur if you are not using HAproxy,
or if one of the load balancers is deliberately disabled (thus forcing all
traffic to one system)?

We had a similar issue when we started to restrict the SSL encryption
algorithms, as we require PCI DSS compliance. It turned out that one of
the algorithms was necessary for IE6 to work properly (an IE6-specific
problem).

The HAproxy mailing list might be helpful as well.


On Mon, 2010-11-22 at 12:31 +1100, Karl Kloppenborg wrote:
> Hi Linux-HA Users,
> 
> I have a bit of an issue and I don't know whether this is a good place to 
> start...
> 
> I have been trying to implement HaProxy alongside Heartbeat for the last 
> couple of weeks...
> 
> Combining it with Heartbeat was no issue however HAProxy's actual 
> configuration was the issue.
> 
> The load balancing setup is:
> LB Algorithm: source based
> Load balancing: 
> 1) port 443(sslHTTPs / tcp based)
> 2) port 80(HTTP / http based)
> 
> The website enters and exits https all throughout it and the load balancer is 
> using source load balancing to ensure it does not leave the app server with 
> its current site session on it.
> 
> 95% of users are working perfectly and it seems like the setup is working 
> fine, however 5% of clients seem to be returning a blank white page upon 
> trying to enter SSL...
> 
> Has anyone else seen this? I am at whits end! :)
> 
> Your thoughts please,
> 
> Karl Kloppenborg
> Head of Development
> Phone: 1300 884 839 (AU Only - Business Hours)
> Website: AU http://www.crucial.com.au| US http://www.crucialp.com
> 
> 
> 

-- 
Best Regards,

Brett Delle Grazie



[Linux-HA] Serial links and corosync

2010-09-15 Thread Brett Delle Grazie
Hi,

I was just wondering whether a serial link over a null-modem at 115k
with pppd running at both ends is sufficient as a secondary redundant
link for corosync.

I know it's not officially supported, but I'm in a situation where it's the
best option.

Will a pppd connection over a 115k serial link be sufficient to run in
rrp_mode passive (or even 'active')?

Thanks,

-- 
Best Regards,

Brett Delle Grazie



Re: [Linux-HA] Starting over with Pacemaker / hoping to make somedocs

2010-08-13 Thread Brett Delle Grazie
Hi Peter,

Yes, the timeouts are generalised for very large systems under heavy load;
you can safely shorten them as you see fit.

My MySQL configuration is:

primitive mysqld_mysql0 ocf:heartbeat:mysql \
	params binary="/usr/bin/mysqld_safe" config="/mnt/data/mysql/my.cnf" \
		datadir="/mnt/data/mysql" user="mysql" group="mysql" \
		log="/var/log/mysqld.log" pid="/var/run/mysql/mysql.pid" \
		socket="/var/lib/mysql/mysql.sock" test_user="heartbeat" \
	op start interval="0" timeout="120" \
	op stop interval="0" timeout="120" \
	op monitor interval="10" timeout="30" depth="0"

This is RHEL 5.5 based, for HA only. It is similar to yours but without the
STONITH requirement, as my system is on DRBD (also controlled by the cluster
but not shown).

I have deliberately moved the MySQL location (to /mnt/data/mysql) and
configuration (to /mnt/data/mysql/my.cnf) so that they are nowhere near the
usual /etc/my.cnf and /var/lib/mysql. This prevents administrators doing
silly things like starting mysql on the slave.

Depending upon how far you get, I would end up letting the cluster control
the mounting of the NFS directories as well. That way an NFS failure will
trigger an automatic switch to the other node - but that would need testing.

Best Regards,

Brett


-Original Message-
From: Peter Sylvester [mailto:peter.t.sylves...@gmail.com]
Sent: Fri 13/08/2010 16:23
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Starting over with Pacemaker / hoping to make somedocs
 
I have successfully gotten the ocf:heartbeat:mysql resource to work.  Though
please see the text below.  I am getting a couple of warnings, but I'm not
seeing any params that I can use to change these settings so that the time
outs are larger than what is advised.

Any tips?

crm(live)configure# primitive mysql ocf:heartbeat:mysql op monitor interval=3s
WARNING: mysql: default timeout 20s for start is smaller than the advised 120
WARNING: mysql: default timeout 20s for monitor_0 is smaller than the advised 30
WARNING: mysql: default timeout 20s for stop is smaller than the advised 120
crm(live)configure#
 - Peter

On Fri, Aug 13, 2010 at 10:10 AM, Peter Sylvester <
peter.t.sylves...@gmail.com> wrote:

> I have tried using ocf:heartbeat:mysql and have not had much success with
> it, though I might try this again this morning since MySQL has changed since
> the last time I attempted using it.  Any tips for getting this up and
> running would be greatly appreciated.
>
> In the production env the two systems will be accessing their data
> via NFS.  There is a single NFS file server that the systems will connect to
> and locally it will appear as /var/lib/mysql.  I know that this is not the
> most preferable situation per what I have read about this, but it's not
> something I have flexibility on and it's why STONITH is so crucial in this
> env.
>
>  - Peter
>
>   On Fri, Aug 13, 2010 at 3:47 AM, Brett Delle Grazie <
> brett.dellegra...@intact-is.com> wrote:
>
>> On Thu, 2010-08-12 at 17:30 -0400, Peter Sylvester wrote:
>> > I believe I figued out what the problem was.  When I unisntalled the
>> version
>> > of mysql that I got from yum and installed the enterprise verion (one of
>> the
>> > differences being that one uses /etc/init.d/mysqld and the other uses
>> > /etc/init.d/mysql) and it suddenly works, which leads me to believe that
>> the
>> > mysqld init script is not LSB compliant where the mysql one is.
>> >
>>
>> Try using the ocf:heartbeat:mysql resource agent instead and not relying
>> upon the LSB agent. That way a small query can be executed as part of
>> your monitor operation and you no longer have to worry about LSB
>> compliance.
>>
>> Secondly are you using MySQL replication or file system level copying
>> using DRBD or SAN?
>>
>> > The next thing to figure out is stonith, but I have to wait until I have
>> the
>> > resources available to get started with that.  Thank you all for your
>> > support thus far!
>> >
>> > On Thu, Aug 12, 2010 at 4:35 PM, Peter Sylvester <
>> > peter.t.sylves...@gmail.com> wrote:
>> >
>> > > Ok, so I've got pacemaker and heartbeat going, I have mysql
>> installerd,
>> > > pacemaker is set to have a VIP and mysql.  I can fail back and forth
>> between
>> > > the two servers with no problems.  The two resources are configured to
>> work
>> > > together to make sure they don't end up on seperate boxes.  In short,
>> a lot
>> > > of improvement h

Re: [Linux-HA] Starting over with Pacemaker / hoping to make some docs

2010-08-13 Thread Brett Delle Grazie
est I've gotten in
> >> > > several
> >> > > > days.  My next task is to get them to use a VIP, and to get them to
> >> > > monitor
> >> > > > eachother and MySQL.
> >> > > >
> >> > > > I'm currently reading through the pacemaker explained docs seen
> >> here...
> >> > > >
> >> > > > *
> >> > > >
> >> > >
> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/index.html
> >> > > > *<
> >> > >
> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-node-attributes.html
> >> > > >
> >> > > >
> >> > > > But does anyone know of any other docs that explain how to get these
> >> two
> >> > > > things configured?  I would assume that this would be fairly easy,
> >> but at
> >> > > > this point I REALLY don't wanna screw up now that I'm this far.
> >> > >
> >> > > Have you seen this?
> >> > >
> >> > >
> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/pdf/Clusters_from_Scratch/Clusters_from_Scratch.pdf
> >> > >
> >> > > >  - Peter
> >> > > >
> >> > > > On Thu, Aug 12, 2010 at 1:02 PM, Dejan Muhamedagic <
> >> deja...@fastmail.fm
> >> > > >wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > On Thu, Aug 12, 2010 at 10:38:40AM -0400, Peter Sylvester wrote:
> >> > > > > > Hey there guys.  Today I'm going to start over with a couple of
> >> fresh
> >> > > VMs
> >> > > > > > and try to get pacemaker going with heartbeat.  I haven until
> >> the end
> >> > > of
> >> > > > > the
> >> > > > > > day tomorrow to get this up and running before my client says
> >> we'll
> >> > > just
> >> > > > > go
> >> > > > > > back to using heartbeat.  As a note I should state that I am a
> >> MySQL
> >> > > DBA
> >> > > > > so
> >> > > > > > I do have some familiarity with Linux but I would not consider
> >> myself
> >> > > to
> >> > > > > be
> >> > > > > > an uber linux admin.  It's my hope that, if I am able to get ths
> >> up
> >> > > and
> >> > > > > > running, I can make some documetation regarding getting this
> >> setup
> >> > > and
> >> > > > > > perhaps it can be submitted to the linux-HA wiki.  As the
> >> biggest
> >> > > > > complaint
> >> > > > > > appears to be the lack of clear documentation for
> >> implementation, so
> >> > > > > let's
> >> > > > > > try and fix that problem.
> >> > > > > >
> >> > > > > > The VMs that I'll be creating today will be Cent OS 5, they will
> >> each
> >> > > be
> >> > > > > > running MySQL, they will act as an active/passive cluster, and
> >> will
> >> > > have
> >> > > > > one
> >> > > > > > VIP that will fail over between the two servers.
> >> > > > > >
> >> > > > > > Last time I tried to get this up and running it was suggested
> >> that I
> >> > > was
> >> > > > > > using versions of pacemaker and heartbeat that were incompatible
> >> with
> >> > > one
> >> > > > > > another.  So I checked this site
> >> > > > > >
> >> > > > > > http://www.clusterlabs.org/wiki/Install
> >> > > > > >
> >> > > > > > I am going to attempt to get heartbeat 3.0.3 installed on the
> >> systems
> >> > > and
> >> > > > > > then install pacemaker.  So here's the current plan.
> >> > > > > >
> >> > > > > > 1) Install OS (during installation no additional packages will
> >> be
> >> > > > > installed)
> >> > > > > > 2) Get ssh keys setup
> >>

Re: [Linux-HA] Am I even on the right track here with Heartbeat?

2010-08-11 Thread Brett Delle Grazie
Hi Igor,

I've been following this and your previous thread a little bit and have
some suggestions.

What version(s) of the following packages are installed on your system:
Heartbeat,
drbd,
drbdlinks,
cluster-glue,
cluster-agents,
drbd0.7-module-source or drbd8-source

(these package names are based on Ubuntu Lucid which you indicated you
were running).

How is DRBD configured on your system? (can you post your configs
please?)

I've run heartbeat and corosync both in production using DRBD and apart
from some very occasional odd behaviour they all work perfectly.

Are you running any kind of iptables firewalls on your systems?

The first thing to establish is that DRBD is working properly and you
can manually promote / demote your resource - because if that doesn't
work nothing will.

Secondly, the DRBD resource agents changed in version 8.x; the one supplied
with Heartbeat is _not_ supported according to Linbit. Instead you
should use Linbit's OCF resource agent (ocf:linbit:drbd).

DRBDLINKS is useful but it relies upon DRBD being started by the OS, not
by the cluster manager.  This is why many people use heartbeat/pacemaker
because drbdlinks can still be used in a controlled manner (after
permitting the cluster to start drbd first).  Start simple - start off
just getting DRBD to go primary, worry about drbdlinks later.

What about startup order on boot?
Is heartbeat started before or after DRBD?
In your case (if you're using drbdlinks) Heartbeat should probably be
started _after_ drbd (on RHEL systems it's typically the reverse).

After a reboot what does cat /proc/drbd say on each system?

That will at least confirm that DRBD is in the correct state.
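
To make the comparison between the two nodes easier, the interesting fields can be pulled out of that file. This is only a sketch: it assumes the DRBD 8.3-style layout, where each device line looks like " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate ..." (older 8.0 releases use "st:" instead of "ro:"), and drbd_summary is a made-up helper name.

```shell
#!/bin/sh
# Summarise DRBD device lines read from stdin as:
#   minor connection-state roles disk-states
# Fields that are not present on the line are printed as "?".
drbd_summary() {
	awk '/^ *[0-9]+: cs:/ {
		minor = $1; sub(/:$/, "", minor)
		cs = ro = ds = "?"
		for (i = 2; i <= NF; i++) {
			if ($i ~ /^cs:/) cs = substr($i, 4)
			if ($i ~ /^ro:/) ro = substr($i, 4)
			if ($i ~ /^ds:/) ds = substr($i, 4)
		}
		print minor, cs, ro, ds
	}'
}

# Example use on a node with DRBD loaded (left commented out):
#   drbd_summary < /proc/drbd
```

On a healthy pair you would hope to see Connected and UpToDate/UpToDate on both nodes, with the roles mirrored between them.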

Yes you are on the right track with heartbeat or corosync - but clusters
are not simple creatures and many things can cause intermittent or
downright silly problems (such as port span or auto negotiation on
switches). Don't give up.

Best Regards,

Brett

On Wed, 2010-08-11 at 16:23 -0500, Igor Chudov wrote:
> On Wed, Aug 11, 2010 at 3:24 PM, Dimitri Maziuk  wrote:
> > On Wednesday 11 August 2010 15:12, Igor Chudov wrote:
> > ...
> >> At this point, I am beginning to have my doubts about this whole
> >> heartbeat system and its ability to serve for years, in what looks to
> >> me like simple configuration.
> > ...
> >
> > Well, that's kinda why I stick to 2.1.4 (also b/c it's a stock rpm on 
> > centos)
> > and v1-style config. From back when things were simple stupid.
> 
> Simple stupid is exactly what I want.
> 
> > As I understand it, most heartbeat work since was done on v2 features: xml,
> > resource monitoring, corosync, pacemaker... which I'm either not missing 
> > (mon
> > works just fine for monitoring) or actively don't want (xml in particular).
> 
> I would not mind xml if either 1) it was documented or 2) the command
> line tool was documented beyond just mentioning every field or 3) the
> GUI was working instead of not working.
> 
> > When I need a 3-node cluster I'll think about those. Until then, 2.1.4 is 
> > not
> > perfect but it works well enough.
> 
> My heartbeat is 3.0.3.
> 
> Do you think that, say, 2.1.4 s sufficiently bug free that I could
> install it from source and just let it run forever?
> 
> I mean, I just want to get that simple two node cluster to run. I am
> not trying to back up Mars to Venus and Uranus by TCP over light rays.
> is 2.1.4 is easy and works, I will just install it. I assume that it
> can work with standard Ubuntu Lucid drbd.
> 
> 
> i
> 

-- 
Best Regards,

Brett Delle Grazie



Re: [Linux-HA] IPaddr2 in ClusterIP mode fail-back causes complete loss of connectivity to IP.

2010-08-05 Thread Brett Delle Grazie
Hi Andrew,

On Wed, 2010-08-04 at 10:27 +0200, Andrew Beekhof wrote:
> On Wed, Aug 4, 2010 at 12:56 AM, Brett Delle Grazie
>  wrote:
> > Hi,
> >
> > I have two nodes (RHEL 5.5) configured in a cluster
> > (Corosync/Pacemaker).
> >
> > I have an IPaddr2 and Apache resources configured as clones on those
> > systems (configuration is shown below).
> >
> > The IPaddr2 is configured for load balancing using ClusterIP with hash
> > sourceip.
> >
> > My problem is that when failing back the ClusterIP after an initial
> > failure test the cluster no longer receives traffic even though the
> > ClusterIP appears to fail back correctly.
> >
> > Test as follows:
> >
> > 1. Test initial failure by adding location constraint to one node - this
> > moves all ClusterIP elements to the opposing node:
> > e.g. location loc_test cl_ipclust_0 -inf: node2 (will cause ClusterIP
> > to move to node 1)
> >
> > 2. Observe in web browser traffic is still being sent to Apache and
> > responses returned correctly
> >
> > 3. Now delete the test constraint and the ClusterIP will return back to
> > node 2. (confirmed with ifconfig, cat /proc/net/ipt_CLUSTER/192.168.0.10
> > on both nodes - nodes respond with correct 'id', ping on both nodes to
> > cluster IP resolves correctly)
> >
> > 4. Attempt to connect with web-browser to ClusterIP and this fails
> >
> > 5. Restart Apache clone and retry - still doesn't work
> >
> > The only way I can restore traffic is to stop the ClusterIP clone, wait
> > a few seconds (5-10) and start it again.
> >
> > Anyone have any idea why this might be?
> 
> An ARP issue perhaps?

And indeed that's exactly what it was.
It turned out that the switches and the firewall needed static entries
in the ARP tables for the cluster IP -> multicast mac address.

Once they were put in place, clustering works perfectly.

Thanks!

> 
> >
> > Any advice appreciated.
> >
> > Thanks,
> >
> > Configuration as follows:
> >
> > node node1
> > node node2
> > primitive apache_0 ocf:heartbeat:apache \
> >     params configfile="/etc/httpd/conf/httpd.conf" \
> >         httpd="/usr/sbin/httpd.worker" \
> >         statusurl="http://localhost/server-status" \
> >         envfiles="/etc/sysconfig/httpd" \
> >     op start interval="0" timeout="40" \
> >     op stop interval="0" timeout="60" \
> >     op monitor interval="10" timeout="40"
> > primitive ipclust_0 ocf:intact:IPaddr2 \
> >     params ip="192.168.0.10" nic="eth0" iflabel="1" \
> >         clusterip_hash="sourceip" \
> >     op monitor interval="10" timeout="5"
> > clone cl_apache0 apache_0 \
> >     meta globally_unique="false" interleave="true"
> > clone cl_ipclust_0 ipclust_0 \
> >     meta globally-unique="true" interleave="true" clone-node-max="2" \
> >         clone-max="2" notify="true" \
> >     params resource-stickiness="0"
> >
> > RHEL 5.5, kernel 2.6.18-194.8.1.el5 x86_64
> > Corosync 1.2.7-1.1.el5
> > Pacemaker 1.0.9.1-1.15.el5
> > resource-agents 1.0.3-2.6.el5
> >
> >
> > --
> > Best Regards,
> >
> > Brett Delle Grazie
> >

-- 
Best Regards,

Brett Delle Grazie



[Linux-HA] IPaddr2 in ClusterIP mode fail-back causes complete loss of connectivity to IP.

2010-08-03 Thread Brett Delle Grazie
Hi,

I have two nodes (RHEL 5.5) configured in a cluster
(Corosync/Pacemaker).

I have an IPaddr2 and Apache resources configured as clones on those
systems (configuration is shown below).

The IPaddr2 is configured for load balancing using ClusterIP with hash
sourceip.

My problem is that when failing back the ClusterIP after an initial
failure test the cluster no longer receives traffic even though the
ClusterIP appears to fail back correctly.

Test as follows:

1. Test initial failure by adding location constraint to one node - this
moves all ClusterIP elements to the opposing node:
e.g. location loc_test cl_ipclust_0 -inf: node2  (will cause ClusterIP
to move to node 1)

2. Observe in web browser traffic is still being sent to Apache and
responses returned correctly

3. Now delete the test constraint and the ClusterIP will return back to
node 2. (confirmed with ifconfig, cat /proc/net/ipt_CLUSTER/192.168.0.10
on both nodes - nodes respond with correct 'id', ping on both nodes to
cluster IP resolves correctly)

4. Attempt to connect with web-browser to ClusterIP and this fails

5. Restart Apache clone and retry - still doesn't work

The only way I can restore traffic is to stop the ClusterIP clone, wait
a few seconds (5-10) and start it again.
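
For reference, the test steps and the workaround can be sketched as crm shell commands. This is an illustration under assumptions rather than a record of exactly what was typed: the function names are made up, the resource and constraint names are the ones from the configuration in this message, and setting CRM=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Set CRM=echo to print the crm commands instead of running them (dry run).
CRM=${CRM:-crm}

# Step 1: ban the ClusterIP clone from node2 so both instances land on node1.
clusterip_ban_node2() {
	$CRM configure location loc_test cl_ipclust_0 -inf: node2
}

# Step 3: remove the test constraint so one instance can move back to node2.
clusterip_unban() {
	$CRM configure delete loc_test
}

# Workaround: stop the clone, wait a few seconds, start it again.
# Optional first argument is the wait in seconds (default 10).
clusterip_bounce() {
	$CRM resource stop cl_ipclust_0
	sleep "${1:-10}"
	$CRM resource start cl_ipclust_0
}
```

Nothing is executed when the file is sourced; each step has to be called explicitly, which makes it harder to bounce the clone by accident.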

Anyone have any idea why this might be?

Any advice appreciated.

Thanks,

Configuration as follows:

node node1
node node2
primitive apache_0 ocf:heartbeat:apache \
	params configfile="/etc/httpd/conf/httpd.conf" \
		httpd="/usr/sbin/httpd.worker" \
		statusurl="http://localhost/server-status" \
		envfiles="/etc/sysconfig/httpd" \
	op start interval="0" timeout="40" \
	op stop interval="0" timeout="60" \
	op monitor interval="10" timeout="40"
primitive ipclust_0 ocf:intact:IPaddr2 \
	params ip="192.168.0.10" nic="eth0" iflabel="1" \
		clusterip_hash="sourceip" \
	op monitor interval="10" timeout="5"
clone cl_apache0 apache_0 \
	meta globally_unique="false" interleave="true"
clone cl_ipclust_0 ipclust_0 \
	meta globally-unique="true" interleave="true" clone-node-max="2" \
		clone-max="2" notify="true" \
	params resource-stickiness="0"

RHEL 5.5, kernel 2.6.18-194.8.1.el5 x86_64
Corosync 1.2.7-1.1.el5
Pacemaker 1.0.9.1-1.15.el5
resource-agents 1.0.3-2.6.el5


-- 
Best Regards,

Brett Delle Grazie

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
__
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes

2010-07-15 Thread Brett Delle Grazie

Hi,

-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Thu 15/07/2010 15:47
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes
 
Hi,

On Mon, Jul 12, 2010 at 01:03:05PM +0100, Brett Delle Grazie wrote:
> Hi,
> 
> Another patch for the Tomcat resource agent.
> 
> This patch simply:
> 
> 1. Removes the 'n' character added after the '\' on the export
> commands - otherwise this causes "'n' not found" messages to
> occur in the resource agent log during start and stop
> operations.

It'd be cleaner to feed everything on the stdin to the su command:

cat<<EOF | su - -s /bin/sh "$RESOURCE_TOMCAT_USER" >> "$TOMCAT_CONSOLE" 2>&1 &
export JAVA_HOME=${OCF_RESKEY_java_home}
...
$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}
EOF

If you feel like testing this too ...

BDG: What a good suggestion. Will test and resubmit.
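
A minimal sketch of the stdin approach suggested above, for anyone who wants to try the idea outside the agent. Plain `sh` stands in for the agent's `su` call so that it runs without root, and `run_with_heredoc` and `DEMO_HOME` are hypothetical names used only for this illustration.

```shell
# Hypothetical stand-in for the agent's su call: the commands are fed to the
# child shell on stdin via a here-document instead of one long -c string.
run_with_heredoc() {
    demo_home="$1"
    # ${demo_home} is expanded by the calling shell while the here-document
    # is built; \$DEMO_HOME is left for the child shell to expand.
    cat <<EOF | sh
export DEMO_HOME=${demo_home}
echo "DEMO_HOME is \$DEMO_HOME"
EOF
}

run_with_heredoc /opt/tomcat
```

Since each command sits on its own line, no `;\` continuation is needed at all, which removes the place where the stray 'n' crept in.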

> 2. Adds a missing background operator (&) to the stop
> operation. Otherwise the stop operation cannot be monitored by
> the resource agent

This is a different issue. I'll split it off.

BDG: Fine, no problem - it's a trivial fix.

Thanks,

Dejan

> This patch can be applied independently of the documentation
> patch supplied previously.
> 
> I hope this helps
>

Thanks,
 
Best Regards,
 
Brett


Re: [Linux-HA] Tomcat resource agent - PATCH3 - supports multiple tomcat instance configurations

2010-07-15 Thread Brett Delle Grazie
Hi,

I should document these things better...

Additional clarification interleaved in your email.

The environment variables:
CATALINA_HOME
CATALINA_BASE
JAVA_OPTS
CATALINA_OPTS
CATALINA_PID
are all documented fully in RUNNING.TXT included with Tomcat.
Normally only CATALINA_HOME, CATALINA_BASE and CATALINA_PID are set prior to
calling the start/stop scripts. The others are typically set in the
'setenv.sh' file, which is usually in the Tomcat instance's bin directory.

In case you're wondering, I'm editing this in Outlook Web Access, so the result
might not be pretty (apologies). Let me know if you require any further
clarification.

Thanks,

Best Regards,

Brett

-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Thu 15/07/2010 17:00
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat resource agent - PATCH3 - supports multiple 
tomcat instance configurations
 
Hi,

On Wed, Jul 14, 2010 at 05:54:28PM +0100, Brett Delle Grazie wrote:
> Hi,
> 
> This patch provides support for multiple instance tomcat
> configurations (i.e. where CATALINA_BASE is configured) as
> per Tomcat documentation.

There are changes in this patch which seem to be unrelated. I'm
not a tomcat expert, so I'm not sure.

- there's a fix for CATALINA_PID (the old version wouldn't use
  the default, though that default has never been advertised in
  the meta-data)
That's a direct side-effect of my changing the start/stop operations
to use the locally copied versions of the parameters rather than their
original OCF_RESKEY_ versions. Otherwise defaults are not observed.
In particular, catalina_base must default to catalina_home if not set.

- -Dname=... got moved to CATALINA_OPTS
Yes, in the Tomcat start/stop script catalina.sh, CATALINA_OPTS is only used
during the start operation whereas JAVA_OPTS is used during both operations.
It belongs in CATALINA_OPTS. Originally it was being set in the start
operation so this didn't matter, but I was trying to keep things consistent
so I set it in the default list, and the only place it can safely go is
CATALINA_OPTS. I really don't like this solution for a process check;
it has the feel of a hack. We have the PID, why not use that?

- there's a new parameter catalina_base: does this one enable
  multiple instances?
Only if it's different from catalina_home. To use this, one should read the
RUNNING.TXT that comes with Tomcat. Its default should always be catalina_home
to preserve existing behaviour.

- there's a new parameter java_opts: and this one too?
Used during both start and stop operations. Used to pass parameters to the JVM
running the start/stop operation - typically things like java.awt.headless=true.
It was being exported previously but there was no way to set it.
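
On the PID question raised above, a process check based on the PID file might look roughly like the sketch below. This is not what the agent does (it searches for the -Dname= string); `pid_is_alive` is a hypothetical helper, and `kill -0` can also fail for a live process that the caller has no permission to signal, which should not matter when running as root.

```shell
# Hypothetical helper, not part of the agent: succeed if the PID recorded
# in the given file belongs to a process that can be signalled.
pid_is_alive() {
    pid_file="$1"
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 1
    # Signal 0 performs the existence/permission check without sending anything
    kill -0 "$pid" 2>/dev/null
}

# Demonstration with the current shell's own PID, which is certainly alive
demo_pid_file=$(mktemp)
echo "$$" > "$demo_pid_file"
if pid_is_alive "$demo_pid_file"; then
    echo "alive"
else
    echo "not running"
fi
rm -f "$demo_pid_file"
```

In the agent the file would be the one named by CATALINA_PID; a stale PID file left by a crashed JVM whose PID has been reused is the obvious weakness of this approach.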

That seems like four changes to me. I can't apply the patch as
it is without further clarification. 

And many thanks for sharing the improvements.

Cheers,

Dejan

> Tested with current Tomcat (6.0.28).
> 
> I hope this helps.
> 
> Best Regards,
> 
> Brett
> 

> --- tomcat2010-07-14 11:22:57.0 +0100
> +++ tomcat.intact 2010-07-14 12:44:46.0 +0100
> @@ -29,10 +29,12 @@
>  #   OCF_RESKEY_tomcat_user  - A user name to start a resource. Default is 
> root
>  #   OCF_RESKEY_statusurl - URL for state confirmation. Default is 
> http://127.0.0.1:8080
>  #   OCF_RESKEY_java_home - Home directory of Java. Default is none
> +#   OCF_RESKEY_java_opts - Options to pass to Java JVM for start and stop. 
> Default is none
>  #   OCF_RESKEY_catalina_home - Home directory of Tomcat. Default is none
> +#   OCF_RESKEY_catalina_base - Base directory of Tomcat. Default is none
>  #   OCF_RESKEY_catalina_pid  - A PID file name of Tomcat. Default is 
> OCF_RESKEY_catalina_home/logs/catalina.pid
>  #   OCF_RESKEY_tomcat_start_opts - Start options of Tomcat. Default is none.
> -#   OCF_RESKEY_catalina_opts - CATALINA_OPTS environment variable. Default 
> is none.
> +#   OCF_RESKEY_catalina_opts - Options to pass to Java JVM for start 
> operation, always adds -Dname=${OCF_RESKEY_tomcat_name}. Default is none.
>  #   OCF_RESKEY_catalina_rotate_log - Control catalina.out logrotation flag. 
> Default is NO.
>  #   OCF_RESKEY_catalina_rotatetime - catalina.out logrotation time 
> span(seconds). Default is 86400.
>  
> ###
> @@ -147,12 +149,13 @@
>   >> "$TOMCAT_CONSOLE&

[Linux-HA] Tomcat resource agent - PATCH3 - supports multiple tomcat instance configurations

2010-07-14 Thread Brett Delle Grazie
Hi,

This patch provides support for multiple instance tomcat configurations (i.e. 
where CATALINA_BASE is configured) as
per Tomcat documentation.

Tested with current Tomcat (6.0.28).

I hope this helps.

Best Regards,

Brett

--- tomcat	2010-07-14 11:22:57.0 +0100
+++ tomcat.intact	2010-07-14 12:44:46.0 +0100
@@ -29,10 +29,12 @@
 #   OCF_RESKEY_tomcat_user  - A user name to start a resource. Default is root
 #   OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
 #   OCF_RESKEY_java_home - Home directory of Java. Default is none
+#   OCF_RESKEY_java_opts - Options to pass to Java JVM for start and stop. Default is none
 #   OCF_RESKEY_catalina_home - Home directory of Tomcat. Default is none
+#   OCF_RESKEY_catalina_base - Base directory of Tomcat. Default is none
 #   OCF_RESKEY_catalina_pid  - A PID file name of Tomcat. Default is OCF_RESKEY_catalina_home/logs/catalina.pid
 #   OCF_RESKEY_tomcat_start_opts - Start options of Tomcat. Default is none.
-#   OCF_RESKEY_catalina_opts - CATALINA_OPTS environment variable. Default is none.
+#   OCF_RESKEY_catalina_opts - Options to pass to Java JVM for start operation, always adds -Dname=${OCF_RESKEY_tomcat_name}. Default is none.
 #   OCF_RESKEY_catalina_rotate_log - Control catalina.out logrotation flag. Default is NO.
 #   OCF_RESKEY_catalina_rotatetime - catalina.out logrotation time span(seconds). Default is 86400.
 ###
@@ -147,12 +149,13 @@
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	else
 		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
-export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\
-$CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}" \
+			-c "export JAVA_HOME=${JAVA_HOME};\
+export JAVA_OPTS=\"${JAVA_OPTS}\";\
+export CATALINA_HOME=${CATALINA_HOME};\
+export CATALINA_BASE=${CATALINA_BASE};\
+export CATALINA_PID=${CATALINA_PID};\
+export CATALINA_OPTS=\"${CATALINA_OPTS}\";\
+$CATALINA_HOME/bin/catalina.sh start ${TOMCAT_START_OPTS}" \
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	fi
 
@@ -182,10 +185,11 @@
 		eval $tomcat_stop_cmd >> "$TOMCAT_CONSOLE" 2>&1
 	else
 		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
+			-c "export JAVA_HOME=${JAVA_HOME};\
+export JAVA_OPTS=\"${JAVA_OPTS}\";\
+export CATALINA_HOME=${CATALINA_HOME};\
+export CATALINA_BASE=${CATALINA_BASE};\
+export CATALINA_PID=${CATALINA_PID};\
 $CATALINA_HOME/bin/catalina.sh stop" \
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	fi
@@ -262,7 +266,7 @@
 
 
 
-The name of the resource, added as a Java parameter in JAVA_OPTS: -Dname= to Tomcat 
+The name of the resource, added as a Java parameter in CATALINA_OPTS: -Dname= to Tomcat 
 process on start.  Used to ensure process is still running and must be unique amongst all Tomcat 
 instances in this cluster.
 
@@ -318,6 +322,14 @@
 
 
 
+
+
+Java JVM options used on start and stop
+
+Java options parsed to JVM, used on start and stop
+
+
+
 
 
 Home directory of Tomcat
@@ -326,6 +338,14 @@
 
 
 
+
+
+Instance directory of Tomcat
+
+Instance directory of Tomcat, defaults to catalina_home
+
+
+
 
 
 A PID file name for Tomcat
@@ -344,9 +364,9 @@
 
 
 
-Catalina options, applied on start operation only
+Java JVM options used on start only
 
-Catalina options
+Java JVM options used on start
 
 
 
@@ -399,17 +419,18 @@
 RESOURCE_STATUSURL="${OCF_RESKEY_statusurl-http://127.0.0.1:8080}"
 
 JAVA_HOME="${OCF_RESKEY_java_home}"
-JAVA_OPTS="-Dname=$TOMCAT_NAME"
+JAVA_OPTS="${OCF_RESKEY_java_opts}"
 SEARCH_STR="\\""${JAVA_OPTS}"
 CATALINA_HOME="${OCF_RESKEY_catalina_home}"
+CATALINA_BASE="${OCF_RESKEY_catalina_base-${OCF_RESKEY_catalina_home}}"
 CATALINA_PID="${OCF_RESKEY_catalina_pid-$CATALINA_HOME/logs/catalina.pid}"
 
 TOMCAT_START_OPTS="${OCF_RESKEY_tomcat_start_op

Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart - [SOLVED]

2010-07-12 Thread Brett Delle Grazie
Hi,

This turned out to be caused by enabling log rotation in the agent (parameter
catalina_rotate_log: 'YES').
We switched to using Log4J logging in Tomcat rather than JULI as it suits our
environment better. We were able to use the DailyRollingFileAppender, so the
manual rotation of catalina.out by the agent was no longer necessary. After
turning this off the problem went away.

I hope this helps someone in the future.

Thanks for all the help.

Best Regards,

Brett


-Original Message-
From: Brett Delle Grazie
Sent: Fri 09/07/2010 16:01
To: General Linux-HA mailing list
Subject: RE: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
stop or restart
 
Hi,

Yes, I meant that the parent of the process is init.

The timeout for the start operation is 120 seconds - the process is still
around (so is Tomcat) after this.

heartbeat-libs is taken from the linbit repo packages, where we have a support
contract as we use DRBD in other stuff - I didn't realise they were renamed to
cluster-libs on clusterlabs.org.
Hmm. I can update, but can the packages on clusterlabs still use heartbeat or
will I need to switch to corosync?

The log from pacemaker / heartbeat is:
/var/log/ha-debug
Jul 09 15:39:47 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:184: start
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:47 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:48 fmp-dun-tapp1 attrd: [4266]: info: attrd_ha_callback: flush message from fmp-dun-tapp2
Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_start_0 (call=184, rc=0, cib-update=208, confirmed=true) ok
Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=24:225:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_monitor_1 )
Jul 09 15:39:51 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:185: monitor
Jul 09 15:39:51 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_monitor_1 (call=185, rc=0, cib-update=209, confirmed=false) ok
Jul 09 15:40:16 fmp-dun-tapp1 lrmd: [4264]: info: rsc:cmon_html0:0:174: monitor
Jul 09 15:42:12 fmp-dun-tapp1 cib: [4263]: info: cib_stats: Processed 198 operations (909.00us average, 0% utilization) in the last 10min
Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: cancel_op: operation monitor[185] on ocf::tomcat::tomcat_tc1:0 for client 4267, its parameters: CRM_meta_interval=[1] catalina_home=[/opt/tomcat] catalina_base=[/home/tomcat/tc-1] tomcat_user=[tomcat] catalina_pid=[/home/tomcat/tc-1/temp/tomcat.pid] catalina_rotate_log=[YES] CRM_meta_timeout=[3] CRM_meta_clone_max=[2] crm_feature_set=[3.0.1] java_home=[/usr/lib/jvm/java] CRM_meta_globally_unique=[false] CRM_meta_name=[monitor] script_log=[/home/tomcat/tc-1/logs/tc-1.log] statusurl=[http://127.0.0.1:10305/exam cancelled
Jul 09 15:47:43 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=24:226:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_stop_0 )
Jul 09 15:47:43 fmp-dun-tapp1 lrmd: [4264]: info: rsc:tomcat_tc1:0:186: stop
Jul 09 15:47:43 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_monitor_1 (call=185, status=1, cib-update=0, confirmed=true) Cancelled
Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-11.raw
Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: write_cib_contents: Wrote version 0.259.0 of the CIB to disk (digest: e028d9e440a93208328ecb4eada8fdf6)
Jul 09 15:47:43 fmp-dun-tapp1 cib: [31183]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.8JcqTH (digest: /var/lib/heartbeat/crm/cib.V73jBK)
Jul 09 15:47:49 fmp-dun-tapp1 crmd: [4267]: info: process_lrm_event: LRM operation tomcat_tc1:0_stop_0 (call=186, rc=0, cib-update=210, confirmed=true) ok
Jul 09 15:47:50 fmp-dun-tapp1 crmd: [4267]: info: do_lrm_rsc_op: Performing key=23:227:0:2c2c0209-48a9-40d0-b4f9-53ea1adcd584 op=tomcat_tc1:0_start_0 )
Jul 09 15:47:5

[Linux-HA] Tomcat resource agent - PATCH2 - minor script fixes

2010-07-12 Thread Brett Delle Grazie
Hi,

Another patch for the Tomcat resource agent.

This patch simply:

1. Removes the 'n' character added after the '\' on the export commands -
otherwise this causes "'n' not found" messages to occur in the resource agent
log during start and stop operations.

2. Adds a missing background operator (&) to the stop operation. Otherwise the
stop operation cannot be monitored by the resource agent.
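
The first item can be reproduced in isolation with something like the sketch below. `sh -c` stands in for the agent's `su ... -c` call and `DEMO_VAR` is a made-up variable; the exact wording of the error message depends on which shell /bin/sh is.

```shell
# Broken form: "\n" inside double quotes is kept as backslash + n, so the
# inner shell sees a word "n" after the semicolon and tries to run it.
broken_output=$(sh -c "DEMO_VAR=1;\n
echo \$DEMO_VAR" 2>&1)

# Fixed form: a backslash followed directly by the newline is a line
# continuation, so the inner shell receives "DEMO_VAR=1;echo $DEMO_VAR".
fixed_output=$(sh -c "DEMO_VAR=1;\
echo \$DEMO_VAR" 2>&1)

echo "broken: $broken_output"
echo "fixed:  $fixed_output"
```

The broken variant still runs the remaining commands, which is why the agent kept working and only the log showed the "not found" noise.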

This patch can be applied independently of the documentation patch supplied
previously.

I hope this helps

Best Regards,

Brett


tomcat-minor-script-fixes.patch.2
Description: tomcat-minor-script-fixes.patch.2

[Linux-HA] tomcat resource agent - PATCH1 - minor doc improvements

2010-07-12 Thread Brett Delle Grazie

Hi,

As promised, please find attached a patch to improve the meta-data
documentation of the Tomcat resource agent.
I hope this is useful.

Best Regards,

Brett



tomcat-doc-fixes.patch.1
Description: tomcat-doc-fixes.patch.1

Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart

2010-07-09 Thread Brett Delle Grazie
in/sh 
/usr/lib/ocf/resource.d//intact/tomcat start
root 31372 1  0 15:47 ?00:00:00   /bin/sh 
/usr/lib/ocf/resource.d//intact/tomcat start
tomcat   31408 1  0 15:47 ?00:00:03   /usr/lib/jvm/java/bin/java 
-Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties
... snip ..
org.apache.catalina.startup.Bootstrap start

There was a 'crm resource restart cl_tomcat_tc1' issued at 15:47.

In the above ha-debug log you can clearly see the lrmd starting tomcat at both 
points in time (15:39 and 15:47) and receiving a successful start ok response.



-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Fri 09/07/2010 14:46
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
stop or restart
 
Hi,

On Fri, Jul 09, 2010 at 01:04:09PM +0100, Brett Delle Grazie wrote:
> Hi,
> 
> Yes checked both logs:
> 
> Catalina.out specifies normal (successful) Tomcat startup.
> 
> tc-1.log (log from backgrounded start/stop operations):
> 
> Doesn't give anything unusual:
> 2010/07/09 09:42:13: start ===
> 2010/07/09 10:20:46: stop  ###
> 2010/07/09 10:27:35: start ===
> 2010/07/09 12:50:20: stop  ###
> 2010/07/09 12:50:26: start ===
> 
> Yes, I realise these are from later runs but the same thing is still 
> occurring.
> 
> Is it possible that the start operation doesn't close of one of
> the file descriptors and is left 'hanging' - even though
> it exits (at least from the perspective of pacemaker)?
> 
> Would this explain the ownership of 'init' by the 'tomcat
> start' process instead of by pacemaker?

No. lrmd kills the process if it doesn't exit within the timeout.
By "ownership" I guess you mean the parent process. The RA
process (/usr/lib/ocf/.../tomcat start) is a child of the lrmd.
init can become its parent only if lrmd exits.

What is the timeout for that start operation set to? Does the
process remain even after that timeout? What happens to lrmd?

> > > > heartbeat-libs-3.0.3-1

Where does that come from? Normally, you should have
cluster-libs. Perhaps you need to update.

Thanks,

Dejan

> Thanks,
> 
> Best Regards,
> 
> Brett
> 
> 
> -Original Message-
> From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> Sent: Fri 09/07/2010 12:54
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
> stop or restart
>  
> Hi,
> 
> On Fri, Jul 09, 2010 at 12:29:40PM +0100, Brett Delle Grazie wrote:
> > 
> > Hi,
> > 
> > Now we come to the fun part...
> > 
> > When I first started looking at this I thought the monitor code in the 
> > agent was wrong:
> > 
> > 
> > # Check tomcat process and service availability
> > monitor_tomcat()
> > {
> > isalive_tomcat ||
> > return $OCF_NOT_RUNNING
> > isrunning_tomcat ||
> > return $OCF_NOT_RUNNING
> > return $OCF_SUCCESS
> > }
> > 
> > Both pgrep and wget return 0 if successful, thus so do isalive_tomcat and 
> > isrunning_tomcat.
> > However this appears correct.
> > 
> > So I'm _really_ confused about why this is not exiting.
> > 
> > Any ideas?
> 
> The logs should say. Did you check the tomcat logs too?
> 
> Thanks,
> 
> Dejan
> 
> > Thanks,
> > 
> > Regards,
> > 
> > Brett
> > 
> > 
> > -Original Message-
> > From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> > Sent: Fri 09/07/2010 11:53
> > To: General Linux-HA mailing list
> > Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
> > stop or restart
> >  
> > Hi,
> > 
> > On Fri, Jul 09, 2010 at 11:41:37AM +0100, Brett Delle Grazie wrote:
> > > Hi Dejan,
> > > 
> > > Thanks for your response.
> > > 
> > > You are correct the backgrounded process used to start tomcat by the 
> > > resource agent isn't exiting the way it should - the question is why?
> > > 
> > > Ignore the incorrect date on the example - I killed the wrong leftover 
> > > process before setting up the example.
> > > 
> > > restarting tomcat is performed by:
> > > 
> > > crm resource restart cl_tomcat_tc1
> > > 
> > > To the best of my knowledge this performs a 'stop&#

Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart

2010-07-09 Thread Brett Delle Grazie
Hi,

Yes, I checked both logs:

Catalina.out shows a normal (successful) Tomcat startup.

tc-1.log (log from backgrounded start/stop operations):

Doesn't give anything unusual:
2010/07/09 09:42:13: start ===
2010/07/09 10:20:46: stop  ###
2010/07/09 10:27:35: start ===
2010/07/09 12:50:20: stop  ###
2010/07/09 12:50:26: start ===

Yes, I realise these are from later runs but the same thing is still occurring.

Is it possible that the start operation doesn't close one of the file
descriptors and is left 'hanging' - even though it exits (at least from the
perspective of pacemaker)?

Would this explain why the 'tomcat start' process is owned by 'init' instead
of by pacemaker?

Thanks,

Best Regards,

Brett


-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Fri 09/07/2010 12:54
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
stop or restart
 
Hi,

On Fri, Jul 09, 2010 at 12:29:40PM +0100, Brett Delle Grazie wrote:
> 
> Hi,
> 
> Now we come to the fun part...
> 
> When I first started looking at this I thought the monitor code in the agent 
> was wrong:
> 
> 
> # Check tomcat process and service availability
> monitor_tomcat()
> {
> isalive_tomcat ||
> return $OCF_NOT_RUNNING
> isrunning_tomcat ||
> return $OCF_NOT_RUNNING
> return $OCF_SUCCESS
> }
> 
> Both pgrep and wget return 0 if successful, thus so do isalive_tomcat and 
> isrunning_tomcat.
> However this appears correct.
> 
> So I'm _really_ confused about why this is not exiting.
> 
> Any ideas?

The logs should say. Did you check the tomcat logs too?

Thanks,

Dejan

> Thanks,
> 
> Regards,
> 
> Brett
> 
> 
> -Original Message-
> From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> Sent: Fri 09/07/2010 11:53
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
> stop or restart
>  
> Hi,
> 
> On Fri, Jul 09, 2010 at 11:41:37AM +0100, Brett Delle Grazie wrote:
> > Hi Dejan,
> > 
> > Thanks for your response.
> > 
> > You are correct the backgrounded process used to start tomcat by the 
> > resource agent isn't exiting the way it should - the question is why?
> > 
> > Ignore the incorrect date on the example - I killed the wrong leftover 
> > process before setting up the example.
> > 
> > restarting tomcat is performed by:
> > 
> > crm resource restart cl_tomcat_tc1
> > 
> > To the best of my knowledge this performs a 'stop' and then a 'start'.
> 
> Right. Note that "stop" won't run before the current action on
> the resource is done.
> 
> > Where cl_tomcat_tc1 is a clone tomcat resource.
> > 
> > Any ideas why the backgrounded process doesn't exit?
> 
> The start action is like this:
> 
>java ... start &
>while not monitor:
>   sleep
> 
> If the monitor never succeeds, then lrmd will kill the process
> once the timeout for the start operation expires. At any rate,
> lrmd always makes sure that there's only one operation on the
> resource at the time.
> 
> Thanks,
> 
> Dejan
> 
> > Thanks,
> > 
> > Best Regards,
> > 
> > Brett
> > 
> > 
> > 
> > -Original Message-
> > From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> > Sent: Fri 09/07/2010 11:32
> > To: General Linux-HA mailing list
> > Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
> > stop or restart
> >  
> > Hi,
> > 
> > On Thu, Jul 08, 2010 at 10:35:57AM +0100, Brett Delle Grazie wrote:
> > > 
> > > Hi,
> > > 
> > > I'm using RHEL5.5 in a Heartbeat/Pacemaker cluster managing Tomcat and 
> > > Apache HTTPD on two nodes using the ocf:heartbeat:tomcat resource agent 
> > > for Tomcat.
> > > 
> > > Specific versions:
> > > resource-agents 1.0.3-1
> > > heartbeat-libs-3.0.3-1
> > > heartbeat-3.0.3-1
> > > pacemaker-1.0.8-1.0hg20100317.8debc1902e13
> > > Tomcat 6.0.26 (downloaded from source).
> > > 
> > > I have modified the Tomcat resource agent to be capable of
> > > controlling multiple Tomcat instances by exporting
> > > CATALINA_BASE as well as CATALINA_HOME - these 

Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart

2010-07-09 Thread Brett Delle Grazie

Hi,

Now we come to the fun part...

When I first started looking at this I thought the monitor code in the agent
was wrong:


# Check tomcat process and service availability
monitor_tomcat()
{
	isalive_tomcat ||
		return $OCF_NOT_RUNNING
	isrunning_tomcat ||
		return $OCF_NOT_RUNNING
	return $OCF_SUCCESS
}

Both pgrep and wget return 0 if successful, and thus so do isalive_tomcat and
isrunning_tomcat, so the monitor logic appears to be correct after all.
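
To double-check the return-code flow, the function can be exercised on its own with stubbed checks, roughly as below. The `DEMO_ALIVE_RC` and `DEMO_RUNNING_RC` variables and the two stub functions are made up for this sketch; the real agent uses pgrep and wget there. The values 0 and 7 are the standard OCF codes for success and "not running".

```shell
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Stubs standing in for the agent's real checks (pgrep / wget based)
isalive_tomcat()   { return "$DEMO_ALIVE_RC"; }
isrunning_tomcat() { return "$DEMO_RUNNING_RC"; }

# Same logic as the agent's monitor function quoted above
monitor_tomcat()
{
    isalive_tomcat ||
        return $OCF_NOT_RUNNING
    isrunning_tomcat ||
        return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}

# Both checks pass: monitor reports success
DEMO_ALIVE_RC=0
DEMO_RUNNING_RC=0
rc_ok=0
monitor_tomcat || rc_ok=$?
echo "both checks pass: rc=$rc_ok"

# Process check fails: monitor reports "not running"
DEMO_ALIVE_RC=1
rc_down=0
monitor_tomcat || rc_down=$?
echo "process check fails: rc=$rc_down"
```

This prints rc=0 for the first case and rc=7 for the second, which matches the reading that the monitor logic itself is fine.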

So I'm _really_ confused about why this is not exiting.

Any ideas?

Thanks,

Regards,

Brett


-Original Message-
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Fri 09/07/2010 11:53
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
stop or restart
 
Hi,

On Fri, Jul 09, 2010 at 11:41:37AM +0100, Brett Delle Grazie wrote:
> Hi Dejan,
> 
> Thanks for your response.
> 
> You are correct the backgrounded process used to start tomcat by the resource 
> agent isn't exiting the way it should - the question is why?
> 
> Ignore the incorrect date on the example - I killed the wrong leftover 
> process before setting up the example.
> 
> restarting tomcat is performed by:
> 
> crm resource restart cl_tomcat_tc1
> 
> To the best of my knowledge this performs a 'stop' and then a 'start'.

Right. Note that "stop" won't run before the current action on
the resource is done.

> Where cl_tomcat_tc1 is a clone tomcat resource.
> 
> Any ideas why the backgrounded process doesn't exit?

The start action is like this:

   java ... start &
   while not monitor:
      sleep

If the monitor never succeeds, then lrmd will kill the process
once the timeout for the start operation expires. At any rate,
lrmd always makes sure that there's only one operation on the
resource at the time.
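
The start pattern described above can be sketched in a self-contained way as follows. Everything here is a stand-in: a subshell that creates a marker file after one second plays the part of the backgrounded Tomcat, and `demo_monitor` plays the part of the agent's monitor. Note that the loop has no timeout of its own, since in the cluster it is lrmd that enforces the start timeout.

```shell
demo_dir=$(mktemp -d)

# Stand-in for "java ... start &": becomes "ready" after about one second
( sleep 1; touch "$demo_dir/ready" ) &

# Stand-in for the agent's monitor: succeeds once the marker file exists
demo_monitor() {
    [ -e "$demo_dir/ready" ]
}

# Poll until the monitor succeeds, as the start action does
attempts=0
while ! demo_monitor; do
    attempts=$((attempts + 1))
    sleep 1
done

wait    # reap the background stand-in before cleaning up
echo "monitor succeeded after $attempts poll(s)"
rm -rf "$demo_dir"
```

If the monitor never succeeded, this loop would spin until something outside it killed the process, which is the behaviour attributed to lrmd above.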

Thanks,

Dejan

> Thanks,
> 
> Best Regards,
> 
> Brett
> 
> 
> 
> -Original Message-
> From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> Sent: Fri 09/07/2010 11:32
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
> stop or restart
>  
> Hi,
> 
> On Thu, Jul 08, 2010 at 10:35:57AM +0100, Brett Delle Grazie wrote:
> > 
> > Hi,
> > 
> > I'm using RHEL5.5 in a Heartbeat/Pacemaker cluster managing Tomcat and 
> > Apache HTTPD on two nodes using the ocf:heartbeat:tomcat resource agent for 
> > Tomcat.
> > 
> > Specific versions:
> > resource-agents 1.0.3-1
> > heartbeat-libs-3.0.3-1
> > heartbeat-3.0.3-1
> > pacemaker-1.0.8-1.0hg20100317.8debc1902e13
> > Tomcat 6.0.26 (downloaded from source).
> > 
> > I have modified the Tomcat resource agent to be capable of
> > controlling multiple Tomcat instances by exporting
> > CATALINA_BASE as well as CATALINA_HOME - these are the only
> 
> 
> > changes I've made to the resource agent (this is why the agent
> > path is 'intact' instead of 'heartbeat' in the process list
> > below) - diff attached.
> > 
> > When manually issuing a restart of the clone resource on tomcat
> > I'm left with a dead 'start' process:
> 
> Doesn't look dead to me, just that it didn't exit.
> 
> > (before restart):
> > [r...@fmp-dun-tapp1 ~]# ps -efH | grep [t]omcat
> > root 22754 21037  0 10:09 pts/000:00:00 grep tomcat
> > root  5058 1  0 Jul07 ?00:00:00   /bin/sh 
> > /usr/lib/ocf/resource.d//intact/tomcat start
> > tomcat5101 1  0 Jul07 ?00:00:19   
> > /usr/lib/jvm/java/bin/java 
> > -Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties 
> > -Dname=tomcat -Djava.awt.headless=true -Djava.library.path=/usr/lib64 
> > -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xmx1024M 
> > -Djava.endorsed.dirs=/opt/tomcat/endorsed -classpath 
> > /opt/tomcat/bin/bootstrap.jar -Dcatalina.base=/home/tomcat/tc-1 
> > -Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/home/tomcat/tc-1/temp 
> > org.apache.catalina.startup.Bootstrap start
> > 
> > (after restart):
> > [r...@fmp-dun-tapp1 ~]# ps -efH | grep [t]omcat
> > root  5058 1  0 Jul07 ?00:00:00   /bin/sh 
> > /usr/lib/ocf/resource.d//intact/tomcat start
> 
> This looks like an old process, judging by the date. Perhaps you
> killed (using -9) some processes so this one remained hanging?
> Otherwise, this is not possible, i.e. only one operation on a
> resource is run.
> 
> > root

Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart

2010-07-09 Thread Brett Delle Grazie

Hi Dejan,

Thanks for the patch comments - I am completely uncertain as to what format
the patches need to be in, so feel free to let me know (I've used diff -uNr).

1. If empty, CATALINA_BASE needs to be set to CATALINA_HOME (this is what
catalina.sh does and what is documented by Apache in RUNNING.txt) or not
exported.
It's easier to simply set it identical to CATALINA_HOME and export them both
(I'll correct this).

2. I'll document the meta-data properly - this will require improving the
comments on the existing meta-data, but I didn't want the patch to be too
confusing initially. The patch was only to demonstrate I hadn't modified the
original resource agent to any large degree.
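
The defaulting rule in item 1 could be sketched as below; `resolve_catalina_base` is a hypothetical helper, not a function in the agent. One detail worth noting is that the shell's `${var:-default}` form falls back when the variable is unset or empty, whereas `${var-default}` falls back only when it is unset.

```shell
# Hypothetical helper mirroring the rule in item 1:
#   $1 = catalina_home, $2 = catalina_base (may be empty)
resolve_catalina_base() {
    catalina_home="$1"
    catalina_base="$2"
    # ":-" uses catalina_home when catalina_base is unset OR empty
    echo "${catalina_base:-$catalina_home}"
}

resolve_catalina_base /opt/tomcat ""
resolve_catalina_base /opt/tomcat /home/tomcat/tc-1
```

The first call prints /opt/tomcat and the second prints /home/tomcat/tc-1, so a single-instance setup keeps its existing behaviour.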

Those two items aside.

There are other environment variables used by Tomcat; they are normally
specified by a 'setenv.sh' file located in CATALINA_BASE/bin/

In order to make it simpler for a person running Tomcat manually (e.g. during
testing) and a resource agent running Tomcat to get exactly the same results,
my personal preference would be to specify a file that contains all the
environment variables that need to be used (e.g. the setenv.sh file). Only the
CATALINA_HOME, CATALINA_BASE and CATALINA_PID variables would then need to be
exported in any subsequent calls (the catalina.sh script would read setenv.sh
and include all other variables accordingly).
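
As a rough illustration of the idea, the sketch below writes a small setenv.sh into a temporary instance directory and sources it the way a start script might. `load_instance_env` is a hypothetical helper that only imitates the behaviour described above, the directory is a throwaway created for the demo, and the two option values are borrowed from the process listing elsewhere in this thread.

```shell
# Hypothetical helper: source <instance dir>/bin/setenv.sh if it is readable
load_instance_env() {
    env_file="$1/bin/setenv.sh"
    if [ -r "$env_file" ]; then
        . "$env_file"
    fi
}

# Throwaway instance directory, for demonstration only
demo_base=$(mktemp -d)
mkdir -p "$demo_base/bin"
cat > "$demo_base/bin/setenv.sh" <<'EOF'
JAVA_OPTS="-Djava.awt.headless=true"
CATALINA_OPTS="-Xmx1024M"
EOF

load_instance_env "$demo_base"
echo "JAVA_OPTS=$JAVA_OPTS"
echo "CATALINA_OPTS=$CATALINA_OPTS"
rm -rf "$demo_base"
```

With this split, someone starting Tomcat by hand and the resource agent would both pick up the same JVM options from one file.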

Is this type of mechanism (using an external file for parameters) logically
correct and permissible by the OCF standard?

If so, I'll correct accordingly and repost.

Thanks for your help / guidance,

Best Regards,

Brett


-----Original Message-----
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Fri 09/07/2010 11:47
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
stop or restart
 
Hi again,

Forgot to comment on your patch.

On Thu, Jul 08, 2010 at 10:35:57AM +0100, Brett Delle Grazie wrote:
> 
> Hi,
> 
> [...]
> 
> I have modified the Tomcat resource agent to be capable of
> controlling multiple Tomcat instances by exporting
> CATALINA_BASE as well as CATALINA_HOME - these are the only
> changes I've made to the resource agent (this is why the agent
> path is 'intact' instead of 'heartbeat' in the process list
> below) - diff attached.
> 
> [...]
> 
> --- tomcat2010-07-08 09:34:13.0 +0100
> +++ tomcat.intact 2010-07-08 10:24:51.0 +0100
> @@ -29,7 +29,9 @@
>  #   OCF_RESKEY_tomcat_user  - A user name to start a resource. Default is 
> root
>  #   OCF_RESKEY_statusurl - URL for state confirmation. Default is 
> http://127.0.0.1:8080
>  #   OCF_RESKEY_java_home - Home directory of the Java. Default is None
> +#   OCF_RESKEY_java_opts - Options to pass to Java. Always adds 
> -Dname=OCF_RESKEY_tomcat_name
>  #   OCF_RESKEY_catalina_home - Home directory of Tomcat. Default is None
> +#   OCF_RESKEY_catalina_base - Base directory of Tomcat. Default is None
>  #   OCF_RESKEY_catalina_pid  - A PID file name of Tomcat. Default is 
> OCF_RESKEY_catalina_home/logs/catalina.pid
>  #   OCF_RESKEY_tomcat_start_opts - Start options of the tomcat. Default is 
> None.
>  #   OCF_RESKEY_catalina_opts - CATALINA_OPTS environment variable. Default 
> is None.
> @@ -147,11 +149,12 @@
>   >> "$TOMCAT_CONSOLE" 2>&1 &
>   else
>   su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
> - -c "export JAVA_HOME=${OCF_RESKEY_java_home};\n
> -export JAVA_OPTS=-Dname=${TOMCAT_NAME};\n
> -export 
> CATALINA_HOME=${OCF_RESKEY_catalina_home};\n
> -export CATALINA_PID=${OCF_RESKEY_catalina_pid};\n
> -export 
> CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\n
> + -c "export JAVA_HOME=${OCF_RESKEY_java_home};\
> +export JAVA_OPTS=\"-Dname=${TOMCAT_NAME} 
> ${OCF_RESKEY_java_opts}\";\
> +export 
> CATALINA_HOME=${OCF_RESKEY_catalina_home};\
> +export 
> CATALINA_BASE=${OCF_RESKEY_catalina_base};\

If the new parameter is not set, how would this export affect the
resource? Wouldn't some default otherwise be set?
Also, do you have any idea why it wasn't used in the resource
agents before?

> +export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
> +export 
> CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\
>  $CATALINA_HOME/bin/catalina.sh start 
> ${OCF_RESKEY_tomcat_start_opts}" \
>   >> "$T

Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart

2010-07-09 Thread Brett Delle Grazie
Hi Dejan,

Thanks for your response.

You are correct: the backgrounded process that the resource agent uses to 
start Tomcat isn't exiting the way it should. The question is why.

Ignore the incorrect date on the example - I killed the wrong leftover process 
before setting up the example.

Restarting Tomcat is performed by:

crm resource restart cl_tomcat_tc1

where cl_tomcat_tc1 is a cloned Tomcat resource. To the best of my knowledge 
this performs a 'stop' and then a 'start'.

Any ideas why the backgrounded process doesn't exit?

Thanks,

Best Regards,

Brett



-----Original Message-----
From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
Sent: Fri 09/07/2010 11:32
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Tomcat Resource Agent always leaves dead process on 
stop or restart
 
Hi,

On Thu, Jul 08, 2010 at 10:35:57AM +0100, Brett Delle Grazie wrote:
> 
> Hi,
> 
> I'm using RHEL5.5 in a Heartbeat/Pacemaker cluster managing Tomcat and Apache 
> HTTPD on two nodes using the ocf:heartbeat:tomcat resource agent for Tomcat.
> 
> Specific versions:
> resource-agents 1.0.3-1
> heartbeat-libs-3.0.3-1
> heartbeat-3.0.3-1
> pacemaker-1.0.8-1.0hg20100317.8debc1902e13
> Tomcat 6.0.26 (downloaded from source).
> 
> I have modified the Tomcat resource agent to be capable of
> controlling multiple Tomcat instances by exporting
> CATALINA_BASE as well as CATALINA_HOME - these are the only
> changes I've made to the resource agent (this is why the agent
> path is 'intact' instead of 'heartbeat' in the process list
> below) - diff attached.
> 
> When manually issuing a restart of the clone resource on tomcat
> I'm left with a dead 'start' process:

Doesn't look dead to me, just that it didn't exit.

> (before restart):
> [r...@fmp-dun-tapp1 ~]# ps -efH | grep [t]omcat
> root     22754 21037  0 10:09 pts/0    00:00:00 grep tomcat
> root      5058     1  0 Jul07 ?        00:00:00   /bin/sh 
> /usr/lib/ocf/resource.d//intact/tomcat start
> tomcat    5101     1  0 Jul07 ?        00:00:19   /usr/lib/jvm/java/bin/java 
> -Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties 
> -Dname=tomcat -Djava.awt.headless=true -Djava.library.path=/usr/lib64 
> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xmx1024M 
> -Djava.endorsed.dirs=/opt/tomcat/endorsed -classpath 
> /opt/tomcat/bin/bootstrap.jar -Dcatalina.base=/home/tomcat/tc-1 
> -Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/home/tomcat/tc-1/temp 
> org.apache.catalina.startup.Bootstrap start
> 
> (after restart):
> [r...@fmp-dun-tapp1 ~]# ps -efH | grep [t]omcat
> root      5058     1  0 Jul07 ?        00:00:00   /bin/sh 
> /usr/lib/ocf/resource.d//intact/tomcat start

This looks like an old process, judging by the date. Perhaps you
killed (using -9) some processes so this one remained hanging?
Otherwise, this is not possible, i.e. only one operation on a
resource is run.

> root      2271     1  0 10:26 ?        00:00:00   /bin/sh 
> /usr/lib/ocf/resource.d//intact/tomcat start
> tomcat    2307     1 21 10:26 ?        00:00:02   /usr/lib/jvm/java/bin/java 
> -Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties 
> -Dname=tomcat -Djava.awt.headless=true -Djava.library.path=/usr/lib64 
> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xmx1024M 
> -Djava.endorsed.dirs=/opt/tomcat/endorsed -classpath 
> /opt/tomcat/bin/bootstrap.jar -Dcatalina.base=/home/tomcat/tc-1 
> -Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/home/tomcat/tc-1/temp 
> org.apache.catalina.startup.Bootstrap start
> 
> Note the two 'tomcat start' processes above.
> Each restart produces successively more copies of the 'tomcat start' process.

What is a "restart"? How does it happen?

> Does anyone know why this would occur? I thought the call to
> 'catalina.sh start' which is backgrounded in tomcat_start
> function in resource should exit after starting Tomcat - but
> apparently it doesn't.

That's strange. The '&' at the end of the line certainly makes it
run in the background. After that, the start action goes into an
infinite loop waiting for the monitor of the resource to succeed.
If it never does, then lrmd will time out and kill the process.
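
The pattern described above can be sketched roughly as follows. This is an
illustration only, not the agent's actual code: start_and_wait and the two
command strings passed to it are hypothetical.

```shell
#!/bin/sh
# Sketch only: start_and_wait is a hypothetical stand-in for the agent's
# start action. $1 launches the service, $2 succeeds once the service is up.
start_and_wait() {
    # Launch in the background so the start action can carry on.
    sh -c "$1" >/dev/null 2>&1 &

    # Poll the monitor command until it succeeds. There is deliberately no
    # timeout here: in the cluster, the operation timeout enforced by lrmd
    # is what bounds this loop.
    until sh -c "$2"; do
        sleep 1
    done
    return 0
}
```

If the start process is still around long after the loop has succeeded, then
whatever was launched with '&' (or something holding on to it) has not
finished, which would be the thing to look at.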

Thanks,

Dejan


> Any help / trouble-shooting tips appreciated.
> 
> Thanks,
> 
> Best Regards,
> 
> Brett 
> 
> 
> 
> 
> __
> This email has been scanned by the MessageLabs Email Security System.
> For more information please visit http://www.messagelabs.com/email 
> __

> --- to

[Linux-HA] Tomcat Resource Agent always leaves dead process on stop or restart

2010-07-08 Thread Brett Delle Grazie

Hi,

I'm using RHEL5.5 in a Heartbeat/Pacemaker cluster managing Tomcat and Apache 
HTTPD on two nodes using the ocf:heartbeat:tomcat resource agent for Tomcat.

Specific versions:
resource-agents 1.0.3-1
heartbeat-libs-3.0.3-1
heartbeat-3.0.3-1
pacemaker-1.0.8-1.0hg20100317.8debc1902e13
Tomcat 6.0.26 (downloaded from source).

I have modified the Tomcat resource agent to be capable of controlling multiple 
Tomcat instances by exporting CATALINA_BASE as well as CATALINA_HOME - these 
are the only changes I've made to the resource agent (this is why the agent 
path is 'intact' instead of 'heartbeat' in the process list below) - diff 
attached.

When manually issuing a restart of the clone resource on tomcat I'm left with a 
dead 'start' process:

(before restart):
[r...@fmp-dun-tapp1 ~]# ps -efH | grep [t]omcat
root     22754 21037  0 10:09 pts/0    00:00:00 grep tomcat
root      5058     1  0 Jul07 ?        00:00:00   /bin/sh 
/usr/lib/ocf/resource.d//intact/tomcat start
tomcat    5101     1  0 Jul07 ?        00:00:19   /usr/lib/jvm/java/bin/java 
-Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties 
-Dname=tomcat -Djava.awt.headless=true -Djava.library.path=/usr/lib64 
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xmx1024M 
-Djava.endorsed.dirs=/opt/tomcat/endorsed -classpath 
/opt/tomcat/bin/bootstrap.jar -Dcatalina.base=/home/tomcat/tc-1 
-Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/home/tomcat/tc-1/temp 
org.apache.catalina.startup.Bootstrap start

(after restart):
[r...@fmp-dun-tapp1 ~]# ps -efH | grep [t]omcat
root      5058     1  0 Jul07 ?        00:00:00   /bin/sh 
/usr/lib/ocf/resource.d//intact/tomcat start
root      2271     1  0 10:26 ?        00:00:00   /bin/sh 
/usr/lib/ocf/resource.d//intact/tomcat start
tomcat    2307     1 21 10:26 ?        00:00:02   /usr/lib/jvm/java/bin/java 
-Djava.util.logging.config.file=/home/tomcat/tc-1/conf/logging.properties 
-Dname=tomcat -Djava.awt.headless=true -Djava.library.path=/usr/lib64 
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xmx1024M 
-Djava.endorsed.dirs=/opt/tomcat/endorsed -classpath 
/opt/tomcat/bin/bootstrap.jar -Dcatalina.base=/home/tomcat/tc-1 
-Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/home/tomcat/tc-1/temp 
org.apache.catalina.startup.Bootstrap start

Note the two 'tomcat start' processes above.
Each restart produces successively more copies of the 'tomcat start' process.
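
To track the build-up across restarts, something like the following could be 
used. It is only a sketch: count_agent_starts is a hypothetical helper, and 
the pattern assumes the agent lives somewhere under a resource.d directory, as 
in the listings above.

```shell
#!/bin/sh
# Sketch only: count_agent_starts is a hypothetical helper. It reads a process
# listing on stdin (one command line per line) and prints how many lines end
# in the resource agent's 'tomcat start' invocation.
count_agent_starts() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '/resource\.d/.*/tomcat start$' || true
}
```

Running ps -e -o args | count_agent_starts before and after each restart 
would show whether the count grows by one every time.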

Does anyone know why this would occur? I thought the call to 'catalina.sh 
start', which is backgrounded in the tomcat_start function of the resource 
agent, should exit after starting Tomcat - but apparently it doesn't.

Any help / trouble-shooting tips appreciated.

Thanks,

Best Regards,

Brett 




--- tomcat	2010-07-08 09:34:13.0 +0100
+++ tomcat.intact	2010-07-08 10:24:51.0 +0100
@@ -29,7 +29,9 @@
 #   OCF_RESKEY_tomcat_user  - A user name to start a resource. Default is root
 #   OCF_RESKEY_statusurl - URL for state confirmation. Default is http://127.0.0.1:8080
 #   OCF_RESKEY_java_home - Home directory of the Java. Default is None
+#   OCF_RESKEY_java_opts - Options to pass to Java. Always adds -Dname=OCF_RESKEY_tomcat_name
 #   OCF_RESKEY_catalina_home - Home directory of Tomcat. Default is None
+#   OCF_RESKEY_catalina_base - Base directory of Tomcat. Default is None
 #   OCF_RESKEY_catalina_pid  - A PID file name of Tomcat. Default is OCF_RESKEY_catalina_home/logs/catalina.pid
 #   OCF_RESKEY_tomcat_start_opts - Start options of the tomcat. Default is None.
 #   OCF_RESKEY_catalina_opts - CATALINA_OPTS environment variable. Default is None.
@@ -147,11 +149,12 @@
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	else
 		su - -s /bin/sh "$RESOURCE_TOMCAT_USER" \
-			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\n
-export JAVA_OPTS=-Dname=${TOMCAT_NAME};\n
-export CATALINA_HOME=${OCF_RESKEY_catalina_home};\n
-export CATALINA_PID=${OCF_RESKEY_catalina_pid};\n
-export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\n
+			-c "export JAVA_HOME=${OCF_RESKEY_java_home};\
+export JAVA_OPTS=\"-Dname=${TOMCAT_NAME} ${OCF_RESKEY_java_opts}\";\
+export CATALINA_HOME=${OCF_RESKEY_catalina_home};\
+export CATALINA_BASE=${OCF_RESKEY_catalina_base};\
+export CATALINA_PID=${OCF_RESKEY_catalina_pid};\
+export CATALINA_OPTS=\"${OCF_RESKEY_catalina_opts}\";\
 $CATALINA_HOME/bin/catalina.sh start ${OCF_RESKEY_tomcat_start_opts}" \
 			>> "$TOMCAT_CONSOLE" 2>&1 &
 	fi
@@ -