Re: Odd CARP issue with 4.6

2009-11-30 Thread Michiel van Baak
On 17:17, Mon 30 Nov 09, Otto Moerbeek wrote:
> On Thu, Nov 26, 2009 at 03:56:37PM +0100, Henning Brauer wrote:
> 
> > * Derek Buttineau  [2009-11-26 15:07]:
> > > On 2009-11-25, at 6:23 PM, Henning Brauer wrote:
> > > 
> > > > check ifconfig -g carp on both
> > > 
> > > 
> > > Right now both are at:
> > > 
> > > carp: carp demote count 0
> > > 
> > > However, I did check that before I rebooted the backup unit and the
> > > master was set to
> > > 
> > > carp: carp demote count 1
> > > 
> > > At first I thought that maybe pfsync was keeping the master from reverting
> > > while it synced state, but even after 24 hours the master hadn't taken
> > > back over from the slave.
> > 
> > the one with the higher demote count always loses, regardless of
> > advskew. now finding out which subsystem set the demote count might be
> > nontrivial. pfsync is in the game, so is rc, and, depending on
> > configuration, various daemons like bgpd and ospfd.
> 
> What I have observed on a 4.6 firewall pair:
> 
> The demote count stays on 1 for a while because the first bulk state
> update request times out. Only the subsequent one succeeds. The timeout
> is 20s by default, but grows if you have a larger max state number. 
> 
> The analysis is that the pfsync code triggers a bulk request on
> the SIOCSETPFSYNC ioctl, but at that moment the interface is not yet
> up; the SIOCSIFFLAGS is done after that.
> 
> This happens if you have a line in hostname.pfsync0 like:
> 
>   up syncif itf0
> 
> This gets rewritten by /etc/netstart, moving the "up" to the end.
> 
> A workaround (until dlg@ or somebody else finds a real fix) is to have
> a newline after "up", so that two ifconfig commands are issued by
> netstart, one to bring up the interface and the next to set the syncif:
> 
>   up
>   syncif itf0

Thanks!
This is exactly what happens on our setup, and your workaround is
working great.

Cheers
-- 

Michiel van Baak
mich...@vanbaak.eu
http://michiel.vanbaak.eu
GnuPG key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x71C946BD

"Why is it drug addicts and computer aficionados are both called users?"



Re: Odd CARP issue with 4.6

2009-11-30 Thread Otto Moerbeek
On Thu, Nov 26, 2009 at 03:56:37PM +0100, Henning Brauer wrote:

> * Derek Buttineau  [2009-11-26 15:07]:
> > On 2009-11-25, at 6:23 PM, Henning Brauer wrote:
> > 
> > > check ifconfig -g carp on both
> > 
> > 
> > Right now both are at:
> > 
> > carp: carp demote count 0
> > 
> > However, I did check that before I rebooted the backup unit and the master
> > was set to
> > 
> > carp: carp demote count 1
> > 
> > At first I thought that maybe pfsync was keeping the master from reverting
> > while it synced state, but even after 24 hours the master hadn't taken back
> > over from the slave.
> 
> the one with the higher demote count always loses, regardless of
> advskew. now finding out which subsystem set the demote count might be
> nontrivial. pfsync is in the game, so is rc, and, depending on
> configuration, various daemons like bgpd and ospfd.

What I have observed on a 4.6 firewall pair:

The demote count stays on 1 for a while because the first bulk state
update request times out. Only the subsequent one succeeds. The timeout
is 20s by default, but grows if you have a larger max state number. 

The analysis is that the pfsync code triggers a bulk request on
the SIOCSETPFSYNC ioctl, but at that moment the interface is not yet
up; the SIOCSIFFLAGS is done after that.

This happens if you have a line in hostname.pfsync0 like:

up syncif itf0

This gets rewritten by /etc/netstart, moving the "up" to the end.

A workaround (until dlg@ or somebody else finds a real fix) is to have
a newline after "up", so that two ifconfig commands are issued by
netstart, one to bring up the interface and the next to set the syncif:

up
syncif itf0


-Otto
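
To illustrate the difference between the two hostname.pfsync0 layouts,
here is a minimal shell sketch. It assumes, as described above, that
netstart issues one ifconfig command per line of the file and moves a
leading "up" to the end of that line. The helper name print_ifconfig_cmds
is hypothetical and this is not the actual /etc/netstart code; itf0 stands
in for the real sync interface, as in the example above.

```shell
# Hypothetical helper: print the ifconfig command that each line of a
# hostname.if-style file (read from stdin) would roughly turn into.
# A simplified model of the behaviour described in this thread only.
print_ifconfig_cmds() {
    ifname=$1
    while IFS= read -r line; do
        case $line in
            ''|'#'*) continue ;;              # skip blank lines and comments
            'up '*) line="${line#up } up" ;;  # "up" plus options: move "up" last
        esac
        echo "ifconfig $ifname $line"
    done
}

# One-line form: a single command, so syncif is set before the interface is up.
printf 'up syncif itf0\n' | print_ifconfig_cmds pfsync0

# Two-line workaround: the interface is brought up first, then syncif is set.
printf 'up\nsyncif itf0\n' | print_ifconfig_cmds pfsync0
```

In this model the first call prints one command ending in "up", while the
second prints two commands with "up" issued before "syncif itf0".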



Re: Odd CARP issue with 4.6

2009-11-26 Thread Derek Buttineau
On 2009-11-26, at 10:40 AM, Marco Pfatschbacher wrote:

> It might help to set
> sysctl net.inet.carp.log=6
>
> carp does logging about who demoted it:
>
> CARP_LOG(LOG_INFO, NULL, ("%s demoted group %s to %d",
>     ifp->if_xname, ifgl->ifgl_group->ifg_group, *dm));


Thanks, have set that.  Will check next time it happens and see if I can tell
what's demoting it.

--
Regards,

Derek Buttineau
Internet Systems Developer
Compu-SOLVE Internet Services
Compu-SOLVE Technologies, Inc

Phone:  705-725-1212 x255
E-Mail:  de...@csolve.net



Re: Odd CARP issue with 4.6

2009-11-26 Thread Marco Pfatschbacher
On Thu, Nov 26, 2009 at 03:56:37PM +0100, Henning Brauer wrote:
> * Derek Buttineau  [2009-11-26 15:07]:
> > On 2009-11-25, at 6:23 PM, Henning Brauer wrote:
> > 
> > > check ifconfig -g carp on both
> > 
> > 
> > Right now both are at:
> > 
> > carp: carp demote count 0
> > 
> > However, I did check that before I rebooted the backup unit and the master
> > was set to
> > 
> > carp: carp demote count 1
> > 
> > At first I thought that maybe pfsync was keeping the master from reverting
> > while it synced state, but even after 24 hours the master hadn't taken back
> > over from the slave.
> 
> the one with the higher demote count always loses, regardless of
> advskew. now finding out which subsystem set the demote count might be
> nontrivial. pfsync is in the game, so is rc, and, depending on
> configuration, various daemons like bgpd and ospfd.

It might help to set
sysctl net.inet.carp.log=6

carp does logging about who demoted it:

CARP_LOG(LOG_INFO, NULL, ("%s demoted group %s to %d",
    ifp->if_xname, ifgl->ifgl_group->ifg_group, *dm));
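
A minimal sketch of how the logging above might be used. Setting
net.inet.carp.log with sysctl and keeping it in /etc/sysctl.conf is
standard; the helper name carp_demotions, the assumption that the messages
end up in /var/log/messages, and the sample log line are illustrative
guesses based on the format string, not captured output.

```shell
# On the firewall itself (as root):
#   sysctl net.inet.carp.log=6
#   echo 'net.inet.carp.log=6' >> /etc/sysctl.conf   # keep it across reboots

# Hypothetical helper: keep only the demotion messages from log text on stdin.
# "|| true" keeps the exit status zero when nothing matches.
carp_demotions() {
    grep 'demoted group' || true
}

# Sample input. The first line is made up to match the format string
# "%s demoted group %s to %d"; real log lines will carry a syslog prefix.
printf '%s\n' \
    'carp: pfsync0 demoted group carp to 1' \
    'some unrelated kernel message' | carp_demotions

# On a live system, assuming kernel messages go to /var/log/messages:
#   carp_demotions < /var/log/messages
```

The first field of each matching line names the interface whose subsystem
changed the demote count, which is the information being looked for here.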



Re: Odd CARP issue with 4.6

2009-11-26 Thread Henning Brauer
* Derek Buttineau  [2009-11-26 15:07]:
> On 2009-11-25, at 6:23 PM, Henning Brauer wrote:
> 
> > check ifconfig -g carp on both
> 
> 
> Right now both are at:
> 
> carp: carp demote count 0
> 
> However, I did check that before I rebooted the backup unit and the master was
> set to
> 
> carp: carp demote count 1
> 
> At first I thought that maybe pfsync was keeping the master from reverting
> while it synced state, but even after 24 hours the master hadn't taken back
> over from the slave.

the one with the higher demote count always loses, regardless of
advskew. now finding out which subsystem set the demote count might be
nontrivial. pfsync is in the game, so is rc, and, depending on
configuration, various daemons like bgpd and ospfd.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
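
The rule above can be sketched as a small comparison. This is a simplified
model for illustration only, not the kernel's election code: it ignores
advbase, and the function name carp_preferred is hypothetical. The numbers
in the first call are the ones from this thread (master: demote count 1,
advskew 0; backup: demote count 0, advskew 100).

```shell
# Simplified model: print "a" or "b" for the node that should be MASTER.
# Arguments: demote_a advskew_a demote_b advskew_b
carp_preferred() {
    if [ "$1" -lt "$3" ]; then
        echo a          # a has the lower demote count: a wins
    elif [ "$1" -gt "$3" ]; then
        echo b          # a has the higher demote count: a loses, whatever its advskew
    elif [ "$2" -le "$4" ]; then
        echo a          # equal demote counts: lower advskew wins
    else
        echo b
    fi
}

# Node a is the intended master (advskew 0) but is demoted: prints "b".
carp_preferred 1 0 0 100

# Once the demote count is back to 0, advskew decides again: prints "a".
carp_preferred 0 0 0 100
```

This matches the observed behaviour: while the master's demote count stays
at 1, the backup with advskew 100 keeps the MASTER role.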



Re: Odd CARP issue with 4.6

2009-11-26 Thread Derek Buttineau
On 2009-11-25, at 6:23 PM, Henning Brauer wrote:

> check ifconfig -g carp on both


Right now both are at:

carp: carp demote count 0

However, I did check that before I rebooted the backup unit and the master was
set to

carp: carp demote count 1

At first I thought that maybe pfsync was keeping the master from reverting
while it synced state, but even after 24 hours the master hadn't taken back
over from the slave.

--
Regards,

Derek Buttineau
Internet Systems Developer
Compu-SOLVE Internet Services
Compu-SOLVE Technologies, Inc

Phone:  705-725-1212 x255
E-Mail:  de...@csolve.net



Re: Odd CARP issue with 4.6

2009-11-26 Thread Derek Buttineau
On 2009-11-25, at 6:08 PM, Bryan Irvine wrote:

> did you by chance upgrade your sysctl.conf?  Make sure preempt is
> still turned on.
> 
> -B


I did upgrade sysctl.conf, but preempt is still turned on.  Odd.

--
Regards,

Derek Buttineau
Internet Systems Developer
Compu-SOLVE Internet Services
Compu-SOLVE Technologies, Inc

Phone:  705-725-1212 x255
E-Mail:  de...@csolve.net
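
For reference, a minimal sketch of the preempt check discussed above. The
sysctl name net.inet.carp.preempt is the real one; the helper name
carp_preempt_on is hypothetical, and the sketch feeds it a literal string
instead of calling sysctl so that it can be tried on any machine.

```shell
# Hypothetical helper: succeed if the given sysctl output line shows
# preemption enabled, i.e. "net.inet.carp.preempt=1".
carp_preempt_on() {
    [ "$1" = "net.inet.carp.preempt=1" ]
}

# On the firewall:  carp_preempt_on "$(sysctl net.inet.carp.preempt)"
if carp_preempt_on "net.inet.carp.preempt=1"; then
    echo "preempt is on"
else
    echo "preempt is off; add net.inet.carp.preempt=1 to /etc/sysctl.conf"
fi
```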



Re: Odd CARP issue with 4.6

2009-11-26 Thread Daniele Pilenga
On Thu, Nov 26, 2009 at 7:52 AM, Michiel van Baak wrote:
> On 17:21, Wed 25 Nov 09, Derek Buttineau wrote:
>> I'm having a really odd issue, and not sure quite how best to explain it.
>>
>> As far as I know my setup was working fine with 4.5, and the failover
>> itself still works without a hitch, it just doesn't seem to want to fail
>> back anymore.

Could that have anything to do with this?

   # ifconfig -g carp
   carp: carp demote count 0

Ciao,
D.



Re: Odd CARP issue with 4.6

2009-11-25 Thread Michiel van Baak
On 17:21, Wed 25 Nov 09, Derek Buttineau wrote:
> I'm having a really odd issue, and not sure quite how best to explain it.
> 
> As far as I know my setup was working fine with 4.5, and the failover itself
> still works without a hitch, it just  doesn't seem to want to fail back
> anymore.
> 
> If the master goes down (say for a reboot), CARP fails over to the secondary
> machine as normal, but when the master is back it doesn't fail back to it.
> 
> If I force the carp interface down on the backup machine, it fails back over,
> but then as soon as I bring those interfaces back up, the BACKUP becomes
> master again.  I find this strange since the BACKUP still has a much higher
> advskew.
> 
> I end up rebooting the backup, which seems to put everything in its place.
> Very odd issue.  Has anyone else encountered this?

I'm also seeing the same thing.
Since the failback does happen when the new master
dies/disconnects/whatever, I did not really look into it more.

I don't really care which box is MASTER; they are both the same, and as
long as failover from one to the other works every time I take a box down
I'm fine.

But it is odd to see that the box with the lowest advskew does not become
master anymore.
-- 

Michiel van Baak
mich...@vanbaak.eu
http://michiel.vanbaak.eu
GnuPG key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x71C946BD

"Why is it drug addicts and computer aficionados are both called users?"



Re: Odd CARP issue with 4.6

2009-11-25 Thread Brent Jones
On Wed, Nov 25, 2009 at 2:21 PM, Derek Buttineau  wrote:
> I'm having a really odd issue, and not sure quite how best to explain it.
>
> As far as I know my setup was working fine with 4.5, and the failover
> itself still works without a hitch, it just doesn't seem to want to fail
> back anymore.
>
> If the master goes down (say for a reboot), CARP fails over to the
> secondary machine as normal, but when the master is back it doesn't fail
> back to it.
>
> If I force the carp interface down on the backup machine, it fails back
> over, but then as soon as I bring those interfaces back up, the BACKUP
> becomes master again.  I find this strange since the BACKUP still has a
> much higher advskew.
>
> I end up rebooting the backup, which seems to put everything in its place.
> Very odd issue.  Has anyone else encountered this?
>
> Master interface ifconfig:
>
> carp1: flags=8843 mtu 1500
>         lladdr 00:00:5e:00:01:02
>         priority: 0
>         carp: MASTER carpdev bnx0 vhid 2 advbase 1 advskew 0
>         groups: carp
>
>
> Backup interface ifconfig:
>
> carp1: flags=8843 mtu 1500
>         lladdr 00:00:5e:00:01:02
>         priority: 0
>         carp: BACKUP carpdev em0 vhid 2 advbase 1 advskew 100
>         groups: carp
>
> --
> Regards,
>
> Derek Buttineau
> Internet Systems Developer
> Compu-SOLVE Internet Services
> Compu-SOLVE Technologies, Inc
>
> Phone:  705-725-1212 x255
> E-Mail:  de...@csolve.net
>
>

I see the same thing.
To get around it, I down the carp1 interface on the backup to make the
master active again, then just bring up the backup again.

Wasn't a big deal to me, but good to know I'm not the only one.

--
Brent Jones
br...@servuhome.net



Re: Odd CARP issue with 4.6

2009-11-25 Thread Bryan Irvine
did you by chance upgrade your sysctl.conf?  Make sure preempt is
still turned on.

-B

On Wed, Nov 25, 2009 at 2:21 PM, Derek Buttineau  wrote:
> I'm having a really odd issue, and not sure quite how best to explain it.
>
> As far as I know my setup was working fine with 4.5, and the failover
> itself still works without a hitch, it just doesn't seem to want to fail
> back anymore.
>
> If the master goes down (say for a reboot), CARP fails over to the
> secondary machine as normal, but when the master is back it doesn't fail
> back to it.
>
> If I force the carp interface down on the backup machine, it fails back
> over, but then as soon as I bring those interfaces back up, the BACKUP
> becomes master again.  I find this strange since the BACKUP still has a
> much higher advskew.
>
> I end up rebooting the backup, which seems to put everything in its place.
> Very odd issue.  Has anyone else encountered this?
>
> Master interface ifconfig:
>
> carp1: flags=8843 mtu 1500
>         lladdr 00:00:5e:00:01:02
>         priority: 0
>         carp: MASTER carpdev bnx0 vhid 2 advbase 1 advskew 0
>         groups: carp
>
>
> Backup interface ifconfig:
>
> carp1: flags=8843 mtu 1500
>         lladdr 00:00:5e:00:01:02
>         priority: 0
>         carp: BACKUP carpdev em0 vhid 2 advbase 1 advskew 100
>         groups: carp
>
> --
> Regards,
>
> Derek Buttineau
> Internet Systems Developer
> Compu-SOLVE Internet Services
> Compu-SOLVE Technologies, Inc
>
> Phone:  705-725-1212 x255
> E-Mail:  de...@csolve.net



Odd CARP issue with 4.6

2009-11-25 Thread Derek Buttineau
I'm having a really odd issue, and not sure quite how best to explain it.

As far as I know my setup was working fine with 4.5, and the failover itself
still works without a hitch, it just  doesn't seem to want to fail back
anymore.

If the master goes down (say for a reboot), CARP fails over to the secondary
machine as normal, but when the master is back it doesn't fail back to it.

If I force the carp interface down on the backup machine, it fails back over,
but then as soon as I bring those interfaces back up, the BACKUP becomes
master again.  I find this strange since the BACKUP still has a much higher
advskew.

I end up rebooting the backup, which seems to put everything in its place.
Very odd issue.  Has anyone else encountered this?

Master interface ifconfig:

carp1: flags=8843 mtu 1500
        lladdr 00:00:5e:00:01:02
        priority: 0
        carp: MASTER carpdev bnx0 vhid 2 advbase 1 advskew 0
        groups: carp


Backup interface ifconfig:

carp1: flags=8843 mtu 1500
        lladdr 00:00:5e:00:01:02
        priority: 0
        carp: BACKUP carpdev em0 vhid 2 advbase 1 advskew 100
        groups: carp

--
Regards,

Derek Buttineau
Internet Systems Developer
Compu-SOLVE Internet Services
Compu-SOLVE Technologies, Inc

Phone:  705-725-1212 x255
E-Mail:  de...@csolve.net