Hi all,
I have a scenario with two front ends that receive HTTP traffic and also
need to run some scripts from crontab. Some of the scripts should run on
only one server at a time (the active one), and others should run on both.
Regarding the HTTP side: since the traffic is load-shared, I think I
can't use Heartbeat for it, right? Is Heartbeat only for active-standby,
or can it also be used as a watchdog in an active-active setup? I have a
Cisco CSS load-sharing the HTTP traffic, and I can write a watchdog
script for Apache.
Regarding cron control, I was thinking of writing a script that replaces
the crontab file with whichever one is correct for that node. When
Heartbeat starts a resource, what parameter does it pass to the script?
"start" on the active node and nothing on the standby? Always "start"?
How should I configure haresources to do this? What is the best way?
I have another situation: building an NFS server on Solaris 10. I have
two servers with shared disks (a Sun array). Can I use Heartbeat for
this too? Is it possible to set it up so that if the NFS server fails
over, the clients don't need to reconnect?
It's a long post... sorry!
Thanks!!
--
I'll have a go at this one. :)
I've got some clusters implementing both of the features you're asking
about. In fact, I've got quite a few heavy, mission-critical clusters
with web services doing load-balancing using Pacemaker and LVS.
To load-balance using Pacemaker, although I believe there are other
ways, I've always used a combination of cloned resources on the cluster,
ldirectord, and a virtual IP address. The virtual IP and ldirectord are
standard primitive resources grouped together so they run together on
one node, like so:
Resource Group: Load-Balancing
    VIP        (ocf::heartbeat:IPaddr2):    Started NODE-01
    ldirectord (ocf::heartbeat:ldirectord): Started NODE-01
A configuration for this would be something like:
primitive VIP ocf:heartbeat:IPaddr2 \
    params lvs_support="true" ip="192.168.1.100" cidr_netmask="24" broadcast="192.168.1.255" \
    op monitor interval="1m" timeout="10s" \
    meta migration-threshold="10"
primitive ldirectord ocf:heartbeat:ldirectord \
    params configfile="/etc/ha.d/ldirectord.cf" \
    op monitor interval="2m" timeout="20s" \
    meta migration-threshold="10" target-role="Started"
group Load-Balancing VIP ldirectord
location Prefer-Node1 ldirectord \
    rule $id="prefer-node1-rule" 100: #uname eq NODE-01
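If you're using the crm shell, you can load a snippet like that straight
into the CIB (the filename here is just an example):

```shell
# Save the snippet above to a file, then merge it into the live
# configuration with the crm shell.
crm configure load update lb.crm

# Double-check what ended up in the CIB:
crm configure show
```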
And then just put your load-balancing rules in /etc/ha.d/ldirectord.cf:
checktimeout=5
checkinterval=7
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=no
emailalert=yourem...@you.com

virtual=192.168.1.100:80
        fallback=192.168.1.250:80
        real=192.168.1.10:80 gate 100
        real=192.168.1.20:80 gate 100
        service=http
        scheduler=wlc
        protocol=tcp
        checktype=negotiate
        request="/"
        receive="OK"
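Once ldirectord is up, you can check that it has populated the kernel's
LVS table (the addresses match the example config above):

```shell
# List the current virtual services and their real servers.
# -L lists the table, -n skips DNS lookups.
ipvsadm -L -n

# Watch per-real-server traffic counters:
ipvsadm -L -n --stats
```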
Pacemaker with ldirectord/LVS can make a fantastic load-balancer with HA
built using only two nodes. I'm surprised more people don't use it this
way, as while it makes the config slightly more complicated, you can use
your passive node to run the cloned resource and maximise your
performance.
Note that to do this you'll need to put the VIP as an extra IP on the
loopback interface of the real servers (I've still got to file a bug
about this), and set the ARP parameters in sysctl (look on the Pacemaker
and LVS wikis). You can also configure LVS to sync its connection table
so you won't lose any connections on failover.
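For reference, the usual LVS-DR real-server setup looks roughly like
this; the VIP matches the example above, and the interface name (eth0)
is an assumption, so adjust for your boxes:

```shell
# On each real server: add the VIP to loopback so the box accepts
# traffic for it without claiming it on the LAN.
ip addr add 192.168.1.100/32 dev lo

# Stop the real servers from answering ARP for the VIP.
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
sysctl -w net.ipv4.conf.eth0.arp_ignore=1
sysctl -w net.ipv4.conf.eth0.arp_announce=2

# On the directors: sync the LVS connection table so established
# connections survive a failover. Run "master" on the active node
# and "backup" on the passive one.
ipvsadm --start-daemon master --mcast-interface eth0
ipvsadm --start-daemon backup --mcast-interface eth0
```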
On to your second point. I've found that just writing a loop inside a
shell script with the appropriate controls and adding it to the cluster
as an LSB-style resource works fine. You can then add the resource to
the load-balancing group so it only runs on one node. It's a little bit
like rewriting cron, but cron isn't cluster-aware ;) My script looks a
bit like this:
#!/bin/sh
# description: Start or stop the task script
#
### BEGIN INIT INFO
# Provides: task
# Required-Start: $network $syslog
# Required-Stop: $network
# Default-Start: 3
# Default-Stop: 0
# Description: Start or stop the task script
### END INIT INFO
# Static variables here.
DIR=/opt/task/
BOTHER="y...@yourmail.com"
RUNFILE="/var/run/task"
RUNTIME="003000"
MAINLOOP() {
    LOG="/var/log/task.log"
    while true
    do
        TODAY=`date +%d-%m-%Y`
        EXTRACTFILE="${DIR}OP/$TODAY.zip"
        echo `date` >> $LOG
        # Check for permission to run.
        if [ ! -f "$RUNFILE" ]
        then
            echo "No $RUNFILE found" >> $LOG
            exit 0
        fi
        # Check if we've already run today
        if [ ! -f "$EXTRACTFILE" ]
        then
            echo "No $EXTRACTFILE found, continuing" >> $LOG
            # Or if we're still running
            NUMPROCS=`pgrep -f "Dname=task " | wc -l`
            if [ $NUMPROCS -lt 1 ]
            then
                THETIME=`date +%H%M%S`
                if [ $THETIME -gt $RUNTIME ]
                then
                    echo -e "\nApparently $THETIME is greater than $RUNTIME so it's time to do our thang" >> $LOG
                    echo -e "------------------------" >> $LOG
                    echo -e "\n*** Starting extract process ***\nThe time : $THETIME" >> $LOG
                    echo -e "\nNumber of existing task processes : $NUMPROCS" >> $LOG
                    RUN_TASK
                fi
            fi
        fi
        sleep 10m
    done
}
RUN_TASK() {
    # Do_stuff to create EXTRACTFILE...
    CHECKRET
    # Sync everything to the other node so a failover won't make any difference
    rsync -avz $DIR NODE-02:$DIR
    # Let's clean up
    kill $$
}
CHECKRET() {
    RET=$?
    if [ $RET -ne 0 ]
    then
        tail -n 200 $LOG | mail -s "Extract Failed" $BOTHER
    else
        tail -n 200 $LOG | mail -s "Extract Succeeded" $BOTHER
    fi
}
CHECKSTATUS() {
    if [ -f "$RUNFILE" ]
    then
        if [ `pgrep -f "task start" | wc -l` -gt 0 ]
        then
            RUNNING="yes"
        else
            unset RUNNING
        fi
    fi
}
case "$1" in
'start')
    CHECKSTATUS
    [ "$RUNNING" ] && echo "$0 is already running" && exit 0
    echo $"Starting $0"
    touch $RUNFILE
    MAINLOOP &
    ;;
'stop')
    [ -f "$RUNFILE" ] && rm $RUNFILE
    pkill -f "Dname=task "
    pkill -f "task start"
    echo "Stopping task"
    ;;
'restart')
    $0 stop
    sleep 5
    $0 start
    ;;
'status')
    CHECKSTATUS
    if [ "$RUNNING" ]
    then
        echo " running"
        exit 0
    else
        echo " stopped"
        exit 3
    fi
    ;;
*)
    echo
    echo $"Usage: $0 {start|stop|restart|status}"
    echo
    exit 1
    ;;
esac
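Wiring the script into the cluster is then just another primitive; the
resource name is an example, and adding it to the existing group makes
it follow the VIP and ldirectord onto the active node (this assumes the
script is installed as /etc/init.d/task):

primitive task lsb:task \
    op monitor interval="1m" timeout="20s"
group Load-Balancing VIP ldirectord task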
If you edit anything, just run /etc/init.d/task restart.
The fantastic thing about modern Linux-HA with Pacemaker is that you can
do pretty much anything.
Hope this helps.
Darren
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems