Re: lagg(4) and failover

2008-12-09 Thread Tom Samplonius

- Brian A. Seklecki [EMAIL PROTECTED] wrote:
 On Sun, 2008-12-07 at 08:03 +1100, Peter Jeremy wrote:
  On 2008-Dec-05 07:34:21 -0500, Brian A. Seklecki
  [EMAIL PROTECTED] wrote:
  Well ... name a price for the development; HA L1/L2 is a feature the
  community would gladly sponsor the development of.
  
  net/ifstated covers at least some of this.
 
 I was thinking something like a heartbeat protocol could work well w/o
 LACP hacking on mid-range switches.
 
 I think that's how Dell does it with the RHEL crap for Broadcom.
 
 Send multicast packets on both an active/standby link.  If either node
 stops seeing the other's packets, some admin-configurable logic
 promotes (Metric/Bias/Weighting).

  I don't know if that is such a great idea, as that would only test the switch 
that you are connected to.

  The Linux bonding driver supports probing the default gateway.  Now, it uses
ARP for this (probably because the ARP who-has code is also in the kernel and
easily accessible), which is also not so great, as an ARP who-has is a
broadcast.  So if you have lots of servers on the LAN using the bonding driver,
you get a lot of broadcast traffic.  ICMP echo-request would be a better
approach, but my take on this is that the echo-request/reply handling code
would have to be written, so this hasn't been done yet.  But ultimately,
gateway probing is the best, as it verifies not only the directly connected
switch, but also that you can get from that switch to the outside world.

  lagg is ultimately a problem as a high-availability solution since most
switches do not support multi-switch 802.3ad yet, and most probably never will.
So you are limited to a single switch.  So 802.3ad is good only for
aggregation, and not for high availability.

  So an active-standby system with probing is the way to go for 
high-availability.  It seems that FreeBSD has most of the components of this 
already.  ng_one2many was a possible base for this.  
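
A minimal sketch of the active-standby-with-probing idea described above, in
Python.  Everything here is an assumption for illustration: the class name,
the port names, and the `probe` callable (which in a real setup would send an
ICMP echo-request or ARP query out of one specific port) are hypothetical, and
this is not how lagg(4), ng_one2many or the Linux bonding driver are
implemented.

```python
from typing import Callable, List


class ProbeFailover:
    """Active-standby port selection driven by a reachability probe (sketch)."""

    def __init__(self, ports: List[str], probe: Callable[[str], bool],
                 max_misses: int = 3) -> None:
        if not ports:
            raise ValueError("need at least one port")
        self.ports = ports            # priority order; ports[0] starts active
        self.probe = probe            # hypothetical: True if the target answered via this port
        self.max_misses = max_misses  # consecutive misses tolerated before failing over
        self.active = ports[0]
        self.misses = 0

    def tick(self) -> str:
        """Run one probe round and return the port that should carry traffic."""
        if self.probe(self.active):
            self.misses = 0
            return self.active
        self.misses += 1
        if self.misses < self.max_misses:
            return self.active        # tolerate a few lost probes
        # Too many misses in a row: move to the first other port that can
        # still reach the target.  If none can, stay where we are.
        for candidate in self.ports:
            if candidate != self.active and self.probe(candidate):
                self.active = candidate
                self.misses = 0
                break
        return self.active
```

The miss counter is there so that a single dropped probe does not cause a
failover; how many misses to tolerate is a tuning choice, not something the
thread specifies.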

 ~BAS
 
 -- 
 Brian A. Seklecki [EMAIL PROTECTED]
 Collaborative Fusion, Inc.

Tom

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: lagg(4) and failover

2008-12-09 Thread Peter Jeremy
Please wrap your mail before 80 columns.
On 2008-Dec-08 23:58:00 -0800, Tom Samplonius [EMAIL PROTECTED] wrote:
  The Linux bonding driver supports probing the default gateway.

This is the same brokenness as Solaris IPMP.  I agree that probing
an external IP address (probably, but not necessarily, a gateway) is
the way to go but you need to be able to configure this.  Otherwise
you need to jump through hoops where the interfaces you are protecting
do not carry the default route (or there are multiple independent groups
of interfaces being protected).

  Now, it uses ARP for this (probably because the ARP who-has code is
also in the kernel and easily accessible), which is also not so great,

I don't see that it's necessary to have the interface failover code
in the kernel.  The kernel needs hooks to allow a daemon to bind to
the physical interfaces and control which one is active, but the
actual code that decides how to determine which interface is active
should be in userland.  (Note that routing works this way).
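
To make the "configurable target, multiple independent groups" point concrete,
here is a small Python sketch of the decision part of such a userland daemon.
The `Group` type, the interface names and the `reachable` callable are all
hypothetical; the kernel hooks for actually switching the active port are
assumed to exist and are not shown.

```python
from typing import Callable, Dict, List, NamedTuple, Optional


class Group(NamedTuple):
    name: str          # label for one protected set of interfaces
    ports: List[str]   # member ports in priority order
    target: str        # address to probe for this group (need not be a gateway)


def choose_active(groups: List[Group],
                  reachable: Callable[[str, str], bool]) -> Dict[str, Optional[str]]:
    """For each group, pick the first port that reaches that group's own target.

    `reachable(port, target)` is a hypothetical probe supplied by the caller.
    A group maps to None when none of its ports can reach its target.
    """
    decision: Dict[str, Optional[str]] = {}
    for group in groups:
        decision[group.name] = next(
            (port for port in group.ports if reachable(port, group.target)),
            None,
        )
    return decision
```

Because each group carries its own target, a group that does not carry the
default route can be checked against whatever address makes sense for it.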

switches do not support multi-switch 802.3ad yet, and most probably
never will.  So you are limited to a single switch.  So 802.3ad is
good only for aggregation, and not for high availability.

Keep in mind that higher-end switches as well as stacked lower-end
switches have a reasonable amount of internal redundancy so 802.3ad
across distinct components of one physical switch may be adequate
for many purposes.  Also keep in mind that you'll still need multiple
FreeBSD boxes to prevent them being a single point of failure.

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.


Re: lagg(4) and failover

2008-12-09 Thread Andrew Snow

  lagg is ultimately a problem as a high-availability solution since most
switches do not support multi-switch 802.3ad yet, and most probably never will.
So you are limited to a single switch.  So 802.3ad is good only for
aggregation, and not for high availability.


What about using STP or RSTP instead of lagg?  L2 managed switches
like 3Com and HP support these.  Then you could connect each of two NICs to a
different switch, as well as connect the switches to each other.


Re: lagg(4) and failover

2008-12-09 Thread Brian A. Seklecki
On Mon, 2008-12-08 at 23:58 -0800, Tom Samplonius wrote:
   I don't know if that is such a great idea, as that would only test
 the switch that you are connected to.

Heartbeats (STP, CARP) and active sanity checking (Nagios, ifwatchd(8))
are the two main options.

Results may vary in every system/network device permutation.

At least we're talking about it -- even if just for the sake of the
archives -- that wasn't happening before.


-- 
Brian A. Seklecki [EMAIL PROTECTED]
Collaborative Fusion, Inc.


Re: lagg(4) and failover

2008-12-08 Thread Brian A. Seklecki
On Sun, 2008-12-07 at 08:03 +1100, Peter Jeremy wrote:
 On 2008-Dec-05 07:34:21 -0500, Brian A. Seklecki
 [EMAIL PROTECTED] wrote:
 Well ... name a price for the development; HA L1/L2 is a feature the
 community would gladly sponsor the development of.
 
 net/ifstated covers at least some of this.

I was thinking something like a heartbeat protocol could work well w/o
LACP hacking on mid-range switches.

I think that's how Dell does it with the RHEL crap for Broadcom.

Send multicast packets on both an active/standby link.  If either node
stops seeing the other's packets, some admin-configurable logic
promotes (Metric/Bias/Weighting).
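
A rough Python sketch of the promotion logic suggested above, assuming each
link has an admin-assigned weight and a timestamp of the last heartbeat heard
on it.  The function name, the weight scheme and the timeout are assumptions
for illustration only; the thread does not define them.

```python
from typing import Dict, Optional, Tuple


def promote(links: Dict[str, Tuple[int, float]],
            now: float, timeout: float) -> Optional[str]:
    """Pick the link to promote.

    `links` maps a link name to (weight, time the last heartbeat was seen).
    A link counts as alive if a heartbeat arrived within `timeout` seconds.
    Among alive links the highest weight wins; ties go to the lowest name.
    """
    alive = {
        name: weight
        for name, (weight, last_seen) in links.items()
        if now - last_seen <= timeout
    }
    if not alive:
        return None  # no link is hearing heartbeats at all
    return min(alive, key=lambda name: (-alive[name], name))
```

With weights 100 for the preferred link and 50 for the standby, the standby is
promoted only once the preferred link has been silent for longer than the
timeout.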

~BAS

-- 
Brian A. Seklecki [EMAIL PROTECTED]
Collaborative Fusion, Inc.


Re: lagg(4) and failover

2008-12-06 Thread Peter Jeremy
On 2008-Dec-05 07:34:21 -0500, Brian A. Seklecki [EMAIL PROTECTED] wrote:
Well ... name a price for the development; HA L1/L2 is a feature the
community would gladly sponsor the development of.

net/ifstated covers at least some of this.

Also, Peter, you should put a page up on the FreeBSD wiki with some of
those multi-catalyst LACP IOS config examples.  

This appears to be aimed at Pete French - I'm using stacked Alcatel-Lucent
OS6850s which appear as single switches to LACP.

P.S., in my experience, system level redundancy/HA with a load balancer
is almost always less expensive than excessive component-level
redundancy/HA (RAID Disk, RAID RAM, Dual Power Supplies, Dual
Backplanes...)

That's a different topic, but yes, you should evaluate your
requirements at a system level, rather than just making every
component HA.

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.


Re: lagg(4) and failover

2008-12-05 Thread Brian A. Seklecki
On Tue, 2008-08-12 at 22:03 +1000, Peter Jeremy wrote:
 That's unfortunate...
 
 I tend to agree.
 
 bonding in Linux is capable of doing this and solaris too.
 

Well ... name a price for the development; HA L1/L2 is a feature the
community would gladly sponsor the development of.

Also, Peter, you should put a page up on the FreeBSD wiki with some of
those multi-catalyst LACP IOS config examples.  

Maybe write an article for BSDMag.

I always just counted that idea out (LACP against two switches) since
LACP doesn't have any inter-component transport protocol a la pfsync(4).

But if the backplanes of Cat 37xx's can be merged at a lower level, then
yeah, fuck.  Let's have it.

~BAS 

P.S., in my experience, system level redundancy/HA with a load balancer
is almost always less expensive than excessive component-level
redundancy/HA (RAID Disk, RAID RAM, Dual Power Supplies, Dual
Backplanes...)

 It shouldn't be too difficult to create something that behaves
 functionally similarly to Slowaris ipmpd (and with marginally more
 effort, you could create something that could be configured to behave
 sensibly).
-- 
Brian A. Seklecki [EMAIL PROTECTED]
Collaborative Fusion, Inc.


lagg(4) and failover

2008-08-12 Thread Marian Hettwer
Hi Folks,

I'm using lagg(4) on some of our servers and I'm just wondering how the
failover is implemented.
The manpage isn't quite clear:

 failover     Sends and receives traffic only through the master port.  If
              the master port becomes unavailable, the next active port is
              used.  The first interface added is the master port; any
              interfaces added after that are used as failover devices.

What is meant by "becomes unavailable"? Is it just the physical link which
needs to become unavailable to trigger a failover?

I do wonder, because there might be other faults where the link is still
active, but the port is unusable. Think of a wrong vlan on the switch
itself.

When using bonding under Linux (yeah, I know, the configuration sucks ;) ),
I can configure the device to check for ARP responses from its default
gateway. If ARP to the default gw becomes unavailable, bonding fails over
to the next interface and tries its luck over there.
With that kind of configuration, I could cover a misconfigured switch port
and still have failover.

Long Story short: How is failover in lagg(4) implemented?

Thanks for any hints :)

Or should I ask the OpenBSD boys, since lagg(4) seems to be a port of
trunk(4)?? :)

best regards,
Marian


Re: lagg(4) and failover

2008-08-12 Thread Eugene Grosbein
On Tue, Aug 12, 2008 at 12:37:15PM +0200, Marian Hettwer wrote:

 I'm using lagg(4) on some of our servers and I'm just wondering how the
 failover is implemented.
 The manpage isn't quite clear:
 
  failover     Sends and receives traffic only through the master port.  If
               the master port becomes unavailable, the next active port is
               used.  The first interface added is the master port; any
               interfaces added after that are used as failover devices.
 
 What is meant by becomes unavailable? Is it just the physical link which
 needs to become unavailable to trigger a failover?

Yes. It seems you need the lacp protocol described later in the manual.

Eugene Grosbein


Re: lagg(4) and failover

2008-08-12 Thread Marian Hettwer


On Tue, 12 Aug 2008 18:55:52 +0800, Eugene Grosbein [EMAIL PROTECTED]
wrote:
 On Tue, Aug 12, 2008 at 12:37:15PM +0200, Marian Hettwer wrote:
 
 I'm using lagg(4) on some of our servers and I'm just wondering how the
 failover is implemented.
 The manpage isn't quite clear:

  failover     Sends and receives traffic only through the master port.  If
               the master port becomes unavailable, the next active port is
               used.  The first interface added is the master port; any
               interfaces added after that are used as failover devices.

 What is meant by becomes unavailable? Is it just the physical link which
 needs to become unavailable to trigger a failover?
 
 Yes. It seems you need lacp protocol described later in the manual.
 
Thanks for your answer.
However, IMO lacp doesn't solve that problem. lacp is used for link
aggregation, not failover.
If I'm wrong there, I should have a read about lacp... should do that
anyway, I guess.

The manpage states "In the event of changes in physical connectivity".
Again, does that mean the link needs to be physically unavailable? If so,
it'll be the same behaviour as in failover mode and doesn't solve my
problem of a misconfigured switch...

Cheers,
Marian


Re: lagg(4) and failover

2008-08-12 Thread Peter Jeremy
On 2008-Aug-12 18:55:52 +0800, Eugene Grosbein [EMAIL PROTECTED] wrote:
On Tue, Aug 12, 2008 at 12:37:15PM +0200, Marian Hettwer wrote:

 I'm using lagg(4) on some of our servers and I'm just wondering how the
 failover is implemented.

As far as I can tell, not especially well :-(.  It doesn't seem to detect
much short of layer 1 failure.  In particular, shutting down the switch
port will not trigger a failover.

 The manpage isn't quite clear:
 
  failover     Sends and receives traffic only through the master port.  If
               the master port becomes unavailable, the next active port is
               used.  The first interface added is the master port; any
               interfaces added after that are used as failover devices.
 
 What is meant by becomes unavailable? Is it just the physical link which
 needs to become unavailable to trigger a failover?

It seems to be.

Yes. It seems you need lacp protocol described later in the manual.

Actually, lacp and failover are used differently: lacp is primarily
used to increase the bandwidth between the host and the switch whilst
failover is used for redundancy.

With lacp, all the physical interfaces must be connected to a single
switch.  With failover, the physical interfaces will normally be
connected to different switches (so a failure in one switch will not
cause the loss of all connectivity).

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.


Re: lagg(4) and failover

2008-08-12 Thread Pete French
 However, IMO lacp doesn't solve that problem. lacp is used for link
 aggregation, not failover.

It does both - if one of the links becomes unavailable then it will
stop using it. We use this for failover and it works fine, the only
caveat being that your LACP device at the far end needs to look like
a single physical device (the nicer Cisco switches do this quite happily).

 The manpage states In the event of changes in physical connectivity
 Again, does that mean, the link needs to be physically unavailable? If so,
 it'll be the same behaviour as in failover mode and doesn't solve my
 problem of a misconfigured switch...

lagg is to handle failover at the physical layer for when one of your
ether ports fails, or someone unplugs a cable. If I understand you
correctly you are looking for something at the next layer up, to handle
a problem where the ports work fine, but are not going to their expected
destinations. lagg won't do this.

-pete.


Re: lagg(4) and failover

2008-08-12 Thread Pete French
 As far as I can tell, not especially well :-(.  It doesn't seem to detect
 much short of layer 1 failure.  In particular, shutting down the switch
 port will not trigger a failover.

Are you using bce devices as your physical interfaces? Take a look at
the thread from last week about ifconfig - with the patch posted a port
shutdown now *does* trigger a failover quite happily. If you are using
bce devices then I suggest you try it.

 With lacp, all the physical interfaces must be connected to a single
 switch.  With failover, the physical interfaces will normally be
 connected to different switches (so a failure in one switch will not
 cause the loss of all connectivity).

This is true - with the caveat that certain pairs of switches can be made
to appear as a single physical device for the purposes of LACP, in which
case it works fine for failover.

We have two farms here - an old one using a pair of Cisco 3560s and
a new one using a pair of 3750-Es. The 3750s will act as a single
device and we use LACP on the machines connected to those, but the 3560s
appear as a pair of devices, so for those we use failover mode. LACP
failover always worked fine, and with the bce patch from last week
the normal failover now also works.

Note that you can enable LACP on the 3560s, and it does appear to negotiate
and work, but the switches keep changing their idea of which port to use every
few seconds. So the connection works, but with high rates of packet loss
as a few packets go missing every time the switch pair flip-flops.

-pete.


Re: lagg(4) and failover

2008-08-12 Thread Marian Hettwer
Hi Pete,

On Tue, 12 Aug 2008 12:30:12 +0100, Pete French
[EMAIL PROTECTED] wrote:
 However, IMO lacp doesn't solve that problem. lacp is used for link
 aggregation, not failover.
 
 It does both - if one of the links becomes unavailable then it will
 stop using it. We use this for failover and it works fine, the only
 caveat being that your LACP device at the far end needs to look like
 a single physical device (the nicer Cisco switches do this quite happily).
 
thanks for that info.

 The manpage states "In the event of changes in physical connectivity".
 Again, does that mean the link needs to be physically unavailable? If so,
 it'll be the same behaviour as in failover mode and doesn't solve my
 problem of a misconfigured switch...
 
 lagg is to handle failover at the physical layer for when one of your
 ether ports fails, or someone unplugs a cable. If I understand you
 correctly you are looking for something at the next layer up, to handle
 a problem where the ports work fine, but are not going to their expected
 destinations. lagg won't do this.

That's unfortunate...
Bonding in Linux is capable of doing this, and Solaris too.
Well then. At least everything's clear now. And in the end, clarifying things
was the reason for that mail thread :)

Cheers,
Marian


Re: lagg(4) and failover

2008-08-12 Thread Peter Jeremy
On 2008-Aug-12 13:43:29 +0200, Marian Hettwer [EMAIL PROTECTED] wrote:
 lagg is to handle failover at the physical layer for when one of your
 ether ports fails, or someone unplugs a cable. If I understand you

That's unfortunate...

I tend to agree.

bonding in Linux is capable of doing this and solaris too.

It shouldn't be too difficult to create something that behaves
functionally similarly to Slowaris ipmpd (and with marginally more
effort, you could create something that could be configured to behave
sensibly).

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.


Re: lagg(4) and failover

2008-08-12 Thread Marian Hettwer
Hi Max,

On Tue, 12 Aug 2008 14:00:18 +0200, Max Laier [EMAIL PROTECTED] wrote:
 That's unfortunate...
 Bonding in Linux is capable of doing this, and Solaris too.
 Well then. At least everything's clear now. And in the end, clarifying things
 was the reason for that mail thread :)
 
 You are looking for net/ifstated

At first glance at its pkg-descr, yeah, it seems like I'm looking for
ifstated.
Thanks for the heads up :-)

Cheers,
Marian


Re: lagg(4) and failover

2008-08-12 Thread Marian Hettwer
Hi Peter,

On Tue, 12 Aug 2008 22:03:07 +1000, Peter Jeremy
[EMAIL PROTECTED] wrote:
 On 2008-Aug-12 13:43:29 +0200, Marian Hettwer [EMAIL PROTECTED] wrote:
 lagg is to handle failover at the physical layer for when one of your
 ether ports fails, or someone unplugs a cable. If I understand you

That's unfortunate...
 
 I tend to agree.
 
bonding in Linux is capable of doing this and solaris too.
 
 It shouldn't be too difficult to create something that behaves
 functionally similarly to Slowaris ipmpd (and with marginally more
 effort, you could create something that could be configured to behave
 sensibly).
 
Har har. Yeah, you're right.
But as Max pointed out, there's net/ifstated. I'll have a look into that
tiny daemon :)

Cheers,
Marian


Re: lagg(4) and failover

2008-08-12 Thread Andrew Thompson
On Tue, Aug 12, 2008 at 12:37:15PM +0200, Marian Hettwer wrote:
 Hi Folks,
 
 I'm using lagg(4) on some of our servers and I'm just wondering how the
 failover is implemented.
 The manpage isn't quite clear:
 
  failover     Sends and receives traffic only through the master port.  If
               the master port becomes unavailable, the next active port is
               used.  The first interface added is the master port; any
               interfaces added after that are used as failover devices.
 
 What is meant by becomes unavailable? Is it just the physical link which
 needs to become unavailable to trigger a failover?
 
 I do wonder, because there might be other faults where the link is still
 active, but the port is unusable. Think of a wrong vlan on the switch
 itself.
 
 When using bonding under Linux (yeah, I know, the configuration sucks ;) ),
 I can configure the device to check for ARP responses from its default
 gateway. If ARP to the default gw becomes unavailable, bonding fails over
 to the next interface and tries its luck over there.
 With that kind of configuration, I could cover a misconfigured switch port
 and still have failover.
 
 Long Story short: How is failover in lagg(4) implemented?

It is simply performed on the physical link state, nothing more.
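
As a rough model of that behaviour (following the manpage wording, not the
kernel source), failover mode can be pictured as "first port with link up
wins".  The function and port names below are hypothetical, and the sketch is
in Python purely for illustration.

```python
from typing import Dict, List, Optional


def failover_port(ports: List[str], link_up: Dict[str, bool]) -> Optional[str]:
    """Return the port that failover mode would use, per the manpage text.

    `ports` is in the order the ports were added, so ports[0] is the master.
    Only the physical link state is consulted; nothing above layer 1.
    """
    for port in ports:
        if link_up.get(port, False):
            return port
    return None  # no port has link


# A master port on the wrong VLAN still reports link up, so it stays selected:
# this is exactly the case the original question is worried about.
wrong_vlan_master = failover_port(["bce0", "bce1"], {"bce0": True, "bce1": True})
```

The last line shows the blind spot: as long as bce0 has link, this model never
moves to bce1, no matter what the switch port is configured to do.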

Smarter methods of detecting link failure, such as what Linux does, would
be very welcome. You may also want to look at LACP mode, where heartbeat
frames are exchanged with the peer.


Andrew


Re: lagg(4) and failover

2008-08-12 Thread Andrew Thompson
On Tue, Aug 12, 2008 at 09:24:30PM +1000, Peter Jeremy wrote:
 On 2008-Aug-12 18:55:52 +0800, Eugene Grosbein [EMAIL PROTECTED] wrote:
 On Tue, Aug 12, 2008 at 12:37:15PM +0200, Marian Hettwer wrote:
 
  I'm using lagg(4) on some of our servers and I'm just wondering how the
  failover is implemented.
 
 As far as I can tell, not especially well :-(.  It doesn't seem to detect
 much short of layer 1 failure.  In particular, shutting down the switch
 port will not trigger a failover.
 
  The manpage isn't quite clear:
  
   failover     Sends and receives traffic only through the master port.  If
                the master port becomes unavailable, the next active port is
                used.  The first interface added is the master port; any
                interfaces added after that are used as failover devices.
  
  What is meant by becomes unavailable? Is it just the physical link which
  needs to become unavailable to trigger a failover?
 
 It seems to be,
 
 Yes. It seems you need lacp protocol described later in the manual.
 
 Actually, lacp and failover are used differently: lacp is primarily
 used to increase the bandwidth between the host and the switch whilst
 failover is used for redundancy.
 
 With lacp, all the physical interfaces must be connected to a single
 switch.  With failover, the physical interfaces will normally be
 connected to different switches (so a failure in one switch will not
 cause the loss of all connectivity).

Actually you can use lacp in failover mode by connecting interfaces to
different switches. It will only bundle an aggregation to one switch at
a time but if that becomes unavailable then it will automatically choose
the next switch.
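
A loose Python sketch of that "one switch at a time" behaviour, assuming each
port reports the system ID of the LACP partner it hears (or None when it hears
nothing).  The selection rule used here, most ports first and then lowest
partner ID, is an assumption for illustration and may not match what the
driver actually does; all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def select_aggregate(
    partners: Dict[str, Optional[str]],
) -> Tuple[Optional[str], List[str]]:
    """Group ports by LACP partner and pick a single group to carry traffic.

    `partners` maps port name -> partner system ID, or None if no partner
    is currently heard on that port.  Returns (partner ID, member ports).
    """
    by_partner: Dict[str, List[str]] = defaultdict(list)
    for port, partner in partners.items():
        if partner is not None:
            by_partner[partner].append(port)
    if not by_partner:
        return None, []
    # Assumed rule: the partner with the most ports wins, ties by partner ID.
    best = min(by_partner, key=lambda p: (-len(by_partner[p]), p))
    return best, sorted(by_partner[best])
```

Ports toward the other switch stay out of the bundle until the chosen partner
disappears, at which point the remaining group is selected on the next call.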


Andrew