Re: [Linux-ha-dev] New stateful RA: conntrackd

2010-10-27 Thread Dominik Klein
Hi everybody

I updated my RA according to Florian's comments on Jonathan
Petersson's conntrackd RA. I also contacted Jonathan about merging our
RAs, but have had no reply yet. Once we have talked, one of us will
post an update.

Regards
Dominik
#!/bin/bash
#
#
#   An OCF RA for conntrackd
#   http://conntrack-tools.netfilter.org/
#
# Copyright (c) 2010 Dominik Klein
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#
###
# Initialization:

. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
export LANG=C LANGUAGE=C LC_ALL=C

meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="conntrackd">
<version>1.1</version>

<longdesc lang="en">
Master/Slave OCF Resource Agent for conntrackd
</longdesc>

<shortdesc lang="en">This resource agent manages conntrackd</shortdesc>

<parameters>
<parameter name="conntrackd">
<longdesc lang="en">Full path to conntrackd executable</longdesc>
<shortdesc lang="en">Full path to conntrackd executable</shortdesc>
<content type="string" default="/usr/sbin/conntrackd"/>
</parameter>

<parameter name="conntrackdconf">
<longdesc lang="en">Full path to the conntrackd.conf file.</longdesc>
<shortdesc lang="en">Path to conntrackd.conf</shortdesc>
<content type="string" default="/etc/conntrackd/conntrackd.conf"/>
</parameter>

<parameter name="statefile">
<longdesc lang="en">Full path to the state file you wish to use.</longdesc>
<shortdesc lang="en">Full path to the state file you wish to use.</shortdesc>
<content type="string" default="/var/run/conntrackd.master"/>
</parameter>
</parameters>

<actions>
<action name="start"   timeout="240" />
<action name="promote"   timeout="90" />
<action name="demote"   timeout="90" />
<action name="notify"   timeout="90" />
<action name="stop"    timeout="100" />
<action name="monitor" depth="0"  timeout="20" interval="20" role="Slave" />
<action name="monitor" depth="0"  timeout="20" interval="10" role="Master" />
<action name="meta-data"  timeout="5" />
<action name="validate-all"  timeout="30" />
</actions>
</resource-agent>
END
}

meta_expect()
{
local what=$1 whatvar=OCF_RESKEY_CRM_meta_${1//-/_} op=$2 expect=$3
local val=${!whatvar}
if [[ -n $val ]]; then
# [, not [[, or it won't work ;)
[ $val $op $expect ] && return
fi
ocf_log err "meta parameter misconfigured, expected $what $op $expect, but found ${val:-unset}."
exit $OCF_ERR_CONFIGURED
}

conntrackd_is_master() {
# You can't query conntrackd whether it is master or slave. It can be
# both at the same time.
# This RA creates a statefile during promote and enforces master-max=1
# and clone-node-max=1
if [ -e $STATEFILE ]; then
return $OCF_SUCCESS
else
return $OCF_ERR_GENERIC
fi
}

conntrackd_set_master_score() {
${HA_SBIN_DIR}/crm_master -Q -l reboot -v $1
}

conntrackd_monitor() {
rc=$OCF_NOT_RUNNING
# It does not write a PID file, so check with pgrep
pgrep -f $CONNTRACKD && rc=$OCF_SUCCESS
if [ $rc = $OCF_SUCCESS ]; then
# conntrackd is running
# now see if it accepts queries
if ! ($CONNTRACKD -C $CONNTRACKD_CONF -s > /dev/null 2>&1); then
rc=$OCF_ERR_GENERIC
ocf_log err "conntrackd is running but not responding to queries"
fi
if conntrackd_is_master; then
rc=$OCF_RUNNING_MASTER
# Restore master setting on probes
if [ $OCF_RESKEY_CRM_meta_interval -eq 0 ]; then
conntrackd_set_master_score $master_score
fi
else
# Restore master setting on probes
if [ $OCF_RESKEY_CRM_meta_interval -eq 0 ]; then
conntrackd_set_master_score $slave_score
fi
fi
fi
return $rc
}

conntrackd_start() {
rc=$OCF_ERR_GENERIC

# Keep 

[Linux-ha-dev] New stateful RA: conntrackd

2010-10-15 Thread Dominik Klein
Hi everybody,

I wrote a master/slave RA to manage conntrackd, the connection tracking
daemon from the netfilter project. Conntrackd is used to replicate
connection state between highly available stateful firewalls.

Conntrackd replicates data using multicast. Basically, it sends out state
information about the connections written to its kernel's connection
tracking table. Replication slaves write these updates to an external cache.

When a firewall takes over the master role, it commits the external
cache to the kernel. It then knows the connections that were previously
running through the old master, so clients can continue working without
having to open new connections.
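
The takeover step described above could be sketched roughly as follows.
The -c, -f and -R options are conntrackd's commit, flush and resync
commands, in the order used by the primary-backup.sh example shipped with
conntrack-tools. The helper names ctd and take_over and the default paths
are assumptions for illustration, not part of the attached RA.

```shell
#!/bin/bash
# Paths are assumptions; adjust them to the local installation.
CONNTRACKD=${CONNTRACKD:-/usr/sbin/conntrackd}
CONNTRACKD_CONF=${CONNTRACKD_CONF:-/etc/conntrackd/conntrackd.conf}

# Hypothetical helper: send one control command to the running daemon.
ctd() {
    "$CONNTRACKD" -C "$CONNTRACKD_CONF" "$@"
}

# Hypothetical takeover sequence for a node that becomes master.
take_over() {
    ctd -c || return 1   # commit the external cache to the kernel table
    ctd -f || return 1   # flush the internal and external caches
    ctd -R || return 1   # resync the internal cache with the kernel table
}
```

Each step aborts the sequence on failure, so a caller such as a promote
action can map a non-zero return to an error code of its choice.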

There already is an RA for conntrackd (at least I found something that
looked like one in a pastebin via Google), but it could not deal with
failback, which I need, and it has not been included in the repository.
I hope this one will be included.

The main challenge in this RA was the failback part. Say one system goes
down completely. Then it loses the kernel connection tracking table and
the external cache. Once it comes back, it will receive updates for new
connections that are initiated through the master, but it will neither
be sent the complete tracking table of the current master, nor can it
request this (that is how I understand conntrackd to work and what my
tests showed; please correct me if I'm wrong :)).

This may be acceptable for short-lived connections and for configurations
where there is no preferred master system, but it does become a problem
if you have long-lived connections or a preferred master.

So my approach is to send a so-called bulk update in two situations:

a) in the notify pre promote call, if the local machine is not the
   machine to be promoted.
   This part is responsible for sending the update to a preferred master
   that had previously failed (failback).
b) in the notify post start call, if the local machine is the master.
   This part is responsible for sending the update to a previously failed
   machine that re-joins the cluster but is not to be promoted right away.
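
The two situations can be sketched as a small decision function. This is
only an illustration of the rule, not the code from the attached RA: the
function name and its argument order are hypothetical, and it assumes a
single node name in the promote list (master-max=1).

```shell
#!/bin/bash
# Hypothetical helper, not taken from the posted RA.
# Arguments: notify type (pre|post), notify operation (start|promote|...),
# local node name, node about to be promoted (may be empty), and whether
# the local node currently holds the master role (yes|no).
# Returns 0 if a bulk update should be sent, 1 otherwise.
should_send_bulk_update() {
    local ntype=$1 nop=$2 local_node=$3 promote_node=$4 is_master=$5

    case "$ntype-$nop" in
        pre-promote)
            # Situation a): another node is about to be promoted (failback).
            [ -n "$promote_node" ] && [ "$promote_node" != "$local_node" ]
            ;;
        post-start)
            # Situation b): a node re-joined; only the current master sends.
            [ "$is_master" = yes ]
            ;;
        *)
            return 1
            ;;
    esac
}
```

In an agent's notify action the arguments would typically come from the
OCF_RESKEY_CRM_meta_notify_type, OCF_RESKEY_CRM_meta_notify_operation and
OCF_RESKEY_CRM_meta_notify_promote_uname variables that Pacemaker exports,
and a positive result would be followed by conntrackd's bulk update
command (-B).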

For now I have limited the RA to 2 clones and 1 master, since this is
the only testbed I have and I am not 100% sure what happens to the new
master in situation a) if there are multiple slaves.
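
A limit like this can be checked against the meta attributes that
Pacemaker passes to the agent. The following is a minimal sketch under
the assumption that exactly 2 clones and 1 master are required; the
function name is hypothetical and the attached RA may enforce the limit
differently.

```shell
#!/bin/bash
# Hypothetical validation helper, not taken from the posted RA.
# Pacemaker exports resource meta attributes to the agent as
# OCF_RESKEY_CRM_meta_<name>, with dashes replaced by underscores.
check_clone_limits() {
    local clone_max=${OCF_RESKEY_CRM_meta_clone_max:-unset}
    local master_max=${OCF_RESKEY_CRM_meta_master_max:-unset}

    if [ "$clone_max" != 2 ]; then
        echo "clone-max must be 2, but found $clone_max" >&2
        return 1
    fi
    if [ "$master_max" != 1 ]; then
        echo "master-max must be 1, but found $master_max" >&2
        return 1
    fi
    return 0
}
```

In a real agent the failure path would more likely log through ocf_log
and exit with $OCF_ERR_CONFIGURED, as the meta_expect function in the
attached script does.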

The configuration could look like this; notify=true is important:

primitive conntrackd ocf:intelegence:conntrackd \
        op monitor interval=10 timeout=10 \
        op monitor interval=11 role=Master timeout=10
primitive ip-extern ocf:heartbeat:IPaddr2 \
        params ip=10.2.50.237 cidr_netmask=24 \
        op monitor interval=10 timeout=10
primitive ip-intern ocf:heartbeat:IPaddr2 \
        params ip=10.2.52.3 cidr_netmask=24 \
        op monitor interval=10 timeout=10
ms ms-conntrackd conntrackd \
        meta target-role=Started globally-unique=false notify=true
colocation ip-intern-extern inf: ip-extern ip-intern
colocation ips-on-conntrackd-master inf: ip-intern ms-conntrackd:Master
order ips-after-conntrackd inf: ms-conntrackd:promote ip-intern:start

Please review and test the RA, post comments and questions. Maybe it can
be included in the repository.

Regards
Dominik

PS: Yes, some parts are from LINBIT's drbd RA and some parts may also be
from Andrew's Stateful RA. I hope that's okay.

-- 
IN-telegence GmbH
Oskar-Jäger-Str. 125
50825 Köln

Registergericht AG Köln - HRB 34038
USt-ID DE210882245
Geschäftsführende Gesellschafter: Christian Plätke und Holger Jansen
#!/bin/bash
#
#
#   An OCF RA for conntrackd
#   http://conntrack-tools.netfilter.org/
#
# Copyright (c) 2010 Dominik Klein
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#
###
# Initialization:

. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
export LANG=C LANGUAGE=C LC_ALL=C

meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="conntrackd">
<version>1.1</version>

<longdesc lang="en">
Master/Slave OCF Resource Agent for conntrackd
</longdesc>

shortdesc