Re: LAGG bug or misconfiguration???

2012-03-23 Thread Snoop
Hi guys ... just for the record.

I've fixed the issue simply by moving the cable of the backup interface to
another switch, as suggested by the network guys of the DC. That is even
preferable from a network redundancy perspective.
Now it works perfectly and the failover (NIC0 to NIC1 and NIC1 to NIC0) is
immediate.

Many thanks for your time.
Cheers.

On Fri, 2012-03-16 at 17:49 +0100, Damien Fleuriot wrote:
> I confirm you should see fast transition for your VLANs to forwarding state.
> 
> 
> Are your ports in access or trunk mode ?
> 
> If they're trunked, portfast alone won't do it, you need "spanning-tree
> portfast trunk".
> 
> Additionally, are you using link aggregation on the Cisco switch?
> (channel-group)

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Snoop
I actually don't know, Damien. I'll have to have a chat with the network
guy in the DC, as I don't manage the switch, nor do I have access to
it. Plus I'm not really a Cisco guy, so I'll forward those questions to
him.

Moreover, I'm getting a bit lost with this.
If the ports are in trunk mode, would this affect the FreeBSD lagg
functionality? If yes, how?
Do I need "spanning-tree portfast trunk" to make it work properly?

I really appreciate your useful input, Damien.



On Fri, 2012-03-16 at 17:49 +0100, Damien Fleuriot wrote:
> I confirm you should see fast transition for your VLANs to forwarding state.
> 
> 
> Are your ports in access or trunk mode ?
> 
> If they're trunked, portfast alone won't do it, you need "spanning-tree
> portfast trunk".
> 
> Additionally, are you using link aggregation on the Cisco switch?
> (channel-group)

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Damien Fleuriot
I confirm you should see fast transition for your VLANs to forwarding state.


Are your ports in access or trunk mode ?

If they're trunked, portfast alone won't do it, you need "spanning-tree
portfast trunk".

Additionally, are you using link aggregation on the Cisco switch?
(channel-group)
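
The access/trunk distinction above can be sketched as a small helper. This is only an illustration, assuming Cisco IOS syntax on the switch: the function name is made up, and the command strings are the ones quoted in this thread plus the standard "switchport mode" line.

```python
def edge_port_stp_lines(mode):
    """Return IOS-style lines for a server-facing port (illustrative helper).

    mode is "access" or "trunk"; anything else is rejected.
    """
    if mode == "access":
        portfast = "spanning-tree portfast"
    elif mode == "trunk":
        # On a trunked port, plain portfast is not enough.
        portfast = "spanning-tree portfast trunk"
    else:
        raise ValueError("mode must be 'access' or 'trunk'")
    return [
        "switchport mode " + mode,
        portfast,
        "spanning-tree bpduguard enable",
    ]


if __name__ == "__main__":
    for line in edge_port_stp_lines("trunk"):
        print(" " + line)
```

The output is meant to be compared by eye with the configuration the network team sends back, not pasted into a switch unreviewed.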


On 3/16/12 5:31 PM, Snoop wrote:
> That's the STP configuration on my two switch ports:
> 
>  spanning-tree portfast
>  spanning-tree bpduguard enable

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Snoop
That's the STP configuration on my two switch ports:

 spanning-tree portfast
 spanning-tree bpduguard enable



On Fri, 2012-03-16 at 12:10 +0100, Damien Fleuriot wrote:
> You're not looking for FEC or EtherChannel or 802.3ad at all.
> 
> What you're looking for, in the case of a *failover* configuration, is a
> "spanning-tree portfast" feature so that your port doesn't transition
> through the different spantree states before forwarding traffic.
> 
> Kindly obtain the configuration from whoever has it and let us know.

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Snoop
I've requested the configuration. I'll post that as soon as I have it.
Thank you very much for your time.

On Fri, 2012-03-16 at 12:10 +0100, Damien Fleuriot wrote:
> You're not looking for FEC or EtherChannel or 802.3ad at all.
> 
> What you're looking for, in the case of a *failover* configuration, is a
> "spanning-tree portfast" feature so that your port doesn't transition
> through the different spantree states before forwarding traffic.
> 
> Kindly obtain the configuration from whoever has it and let us know.

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Damien Fleuriot
You're not looking for FEC or EtherChannel or 802.3ad at all.

What you're looking for, in the case of a *failover* configuration, is a
"spanning-tree portfast" feature so that your port doesn't transition
through the different spantree states before forwarding traffic.

Kindly obtain the configuration from whoever has it and let us know.
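
As a rough illustration of why portfast matters for failover: assuming the classic 802.1D default forward delay of 15 seconds (the datacentre switch may well be tuned differently, or run a faster spanning-tree variant), a port without portfast spends about 30 seconds in the listening and learning states before it forwards traffic. The function name below is made up for this sketch.

```python
# Classic 802.1D default, assumed here; the real switch may differ.
FORWARD_DELAY_S = 15  # seconds spent in each of listening and learning


def seconds_until_forwarding(portfast, forward_delay_s=FORWARD_DELAY_S):
    """Rough delay between link-up and the port forwarding traffic."""
    if portfast:
        # The port skips listening/learning and forwards right away.
        return 0
    # Listening, then learning, each lasting one forward delay.
    return 2 * forward_delay_s


if __name__ == "__main__":
    print(seconds_until_forwarding(portfast=False))  # 30
    print(seconds_until_forwarding(portfast=True))   # 0
```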


On 3/16/12 11:18 AM, Snoop wrote:
> Hi Dweimer and Damien,
> thanks for replying.
> 
> The server is connected to one of the datacentre's switches. The configuration
> of this switch is unknown to me and I obviously have no access to it, but
> I truly believe that such an enterprise environment has management
> capabilities.
> Anyway, in what way would the configuration affect the lagg
> functionality? Might this issue be related to what is stated in the FreeBSD
> LAGG pages in the handbook?
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/network-aggregation.html
> 
> "Cisco® Fast EtherChannel®
> 
> Cisco Fast EtherChannel (FEC), is a static setup and does not negotiate
> aggregation with the peer or exchange frames to monitor the link. If the
> switch supports LACP then that should be used instead."

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Snoop
Hi Dweimer and Damien,
thanks for replying.

The server is connected to one of the datacentre's switches. The configuration
of this switch is unknown to me and I obviously have no access to it, but
I truly believe that such an enterprise environment has management
capabilities.
Anyway, in what way would the configuration affect the lagg
functionality? Might this issue be related to what is stated in the FreeBSD
LAGG pages in the handbook?
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/network-aggregation.html

"Cisco® Fast EtherChannel®

Cisco Fast EtherChannel (FEC), is a static setup and does not negotiate
aggregation with the peer or exchange frames to monitor the link. If the
switch supports LACP then that should be used instead."



On Fri, 2012-03-16 at 10:45 +0100, Damien Fleuriot wrote:
> Sorry for top-posting from my phone.
> 
> 
> Show your switch's port configurations.
> 
> We're using VLAN tagging over lagg failover interfaces at work and I have
> already tried the tests you described, with much better results.
> 
> We're also running 8.2, so the only thing that seems to differ between us is
> likely the switch config.
> 
> 
> 
> On 15 Mar 2012, at 20:06, Snoop  wrote:
> 
> > Hi there,
> > a while after setting up my new server (with 8 jails in it) I've decided
> > (after postponing several times) to properly check the functionality of
> > the lagg and the result was very disappointing.
> > 
> > The test I've done is very simple.
> > I've started copying a file from one site to another of my VPN network
> > (from the server I've been testing the net to another node somewhere
> > else) and in the meantime I've been physically disconnecting the main
> > network cable to check the responsiveness of the lagg configuration.
> > Then I've plugged the cable back to check if the traffic would switch
> > back to the main NIC as it should.
> > 
> > The result was basically this (lagg0 members: bge0 primary, bge1
> > secondary)
> > 
> > - when bge0 unplugged the traffic switched almost instantaneously to
> > bge1
> > - when bge0 plugged back in, the network stopped working completely with
> > the two NICs polling synchronously until I manually unplug bge1. Then
> > within 2-4 seconds traffic goes back on bge0 (I've been waiting for a
> > little more than a minute maximum to avoid all the active connections on
> > the server to timeout).
> > 
> > Now, I've repeated the same test about 10-15 times randomly waiting for
> > different times between the unplug-replug procedure. The result was
> > always the same.
> > 
> > So, below are the ifconfig outputs
> > - before starting the test
> > - when bge0 gets unplugged
> > - when bge0 gets plugged back in
> > 
> > I couldn't see anything odd.
> > ___
> > lagg0: flags=8843 metric 0 mtu 1500
> >     options=8009b
> >     ether 00:14:ee:00:8a:c0
> >     inet xxx.xx.xx.224 netmask 0xff00 broadcast xxx.xx.xx.255
> >     inet xxx.xx.xx.227 netmask 0x broadcast xxx.xx.xx.227
> >     inet xxx.xx.xx.225 netmask 0x broadcast xxx.xx.xx.225
> >     inet 172.16.3.2 netmask 0x broadcast 172.16.3.2
> >     inet 172.16.3.3 netmask 0x broadcast 172.16.3.3
> >     inet 172.16.3.4 netmask 0x broadcast 172.16.3.4
> >     inet 172.16.3.5 netmask 0x broadcast 172.16.3.5
> >     inet 172.16.3.6 netmask 0x broadcast 172.16.3.6
> >     inet xxx.xx.xx.226 netmask 0x broadcast xxx.xx.xx.226
> >     media: Ethernet autoselect
> >     status: active
> >     laggproto failover
> >     laggport: bge1 flags=0<>
> >     laggport: bge0 flags=5
> > ___
> > lagg0: flags=8843 metric 0 mtu 1500
> >     options=8009b
> >     ether 00:14:ee:00:8a:c0
> >     inet xxx.xx.xx.224 netmask 0xff00 broadcast xxx.xx.xx.255
> >     inet xxx.xx.xx.227 netmask 0x broadcast xxx.xx.xx.227
> >     inet xxx.xx.xx.225 netmask 0x broadcast xxx.xx.xx.225
> >     inet 172.16.3.2 netmask 0x broadcast 172.16.3.2
> >     inet 172.16.3.3 netmask 0x broadcast 172.16.3.3
> >     inet 172.16.3.4 netmask 0x broadcast 172.16.3.4
> >     inet 172.16.3.5 netmask 0x broadcast 172.16.3.5
> >     inet 172.16.3.6 netmask 0x broadcast 172.16.3.6
> >     inet xxx.xx.xx.226 netmask 0x broadcast xxx.xx.xx.226
> >     media: Ethernet autoselect
> >     status: active
> >     laggproto failover
> >     laggport: bge1 flags=4
> >     laggport: bge0 flags=1
> > ___
> > lagg0: flags=8843 metric 0 mtu 1500
> >     options=8009b
> >     ether 00:14:ee:00:8a:c0
> >     inet xxx.xx.xx.224 netmask 0xff00 broadcast xxx.xx.xx.255
> >     inet xxx.xx.xx.227 netmask 0xf

Re: LAGG bug or misconfiguration???

2012-03-16 Thread Damien Fleuriot
Sorry for top-posting, I'm on my phone.


Please show us your switch's port configuration.

We're using VLAN tagging over lagg failover interfaces at work and I have
already tried the tests you described, with much better results.

We're also running 8.2, so the only thing that seems to differ between our
setups is likely the switch config.



On 15 Mar 2012, at 20:06, Snoop wrote:

> Hi there,
> a while after setting up my new server (with 8 jails in it) I've decided
> (after postponing several times) to properly check the functionality of
> the lagg and the result was very disappointing.
> 
> The test I've done is very simple.
> I've started copying a file from one site to another of my VPN network
> (from the server I've been testing the net to another node somewhere
> else) and in the meantime I've been physically disconnecting the main
> network cable to check the responsiveness of the lagg configuration.
> Then I've plugged the cable back to check if the traffic would switch
> back to the main NIC as it should.
> 
> The result was basically this (lagg0 members: bge0 primary, bge1
> secondary)
> 
> - when bge0 unplugged the traffic switched almost instantaneously to
> bge1
> - when bge0 plugged back in, the network stopped working completely with
> the two NICs polling synchronously until I manually unplug bge1. Then
> within 2-4 seconds traffic goes back on bge0 (I've been waiting for a
> little more than a minute maximum to avoid all the active connections on
> the server to timeout).
> 
> Now, I've repeated the same test about 10-15 times randomly waiting for
> different times between the unplug-replug procedure. The result was
> always the same.
> 
> So, below are the ifconfig outputs:
> - before starting the test
> - when bge0 gets unplugged
> - when bge0 gets plugged back in
> 
> I couldn't see anything odd.
> ___
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
>         ether 00:14:ee:00:8a:c0
>         inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
>         inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
>         inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
>         inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
>         inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
>         inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
>         inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
>         inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
>         inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
>         media: Ethernet autoselect
>         status: active
>         laggproto failover
>         laggport: bge1 flags=0<>
>         laggport: bge0 flags=5<MASTER,ACTIVE>
> ___
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
>         ether 00:14:ee:00:8a:c0
>         inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
>         inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
>         inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
>         inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
>         inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
>         inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
>         inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
>         inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
>         inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
>         media: Ethernet autoselect
>         status: active
>         laggproto failover
>         laggport: bge1 flags=4<ACTIVE>
>         laggport: bge0 flags=1<MASTER>
> ___
> 
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
>         ether 00:14:ee:00:8a:c0
>         inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
>         inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
>         inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
>         inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
>         inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
>         inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
>         inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
>         inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
>         inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
>         media: Ethernet autoselect
>         status: active
>         laggproto failover
>         laggport: bge1 flags=0<>
>         laggport: bge0 flags=5<MASTER,ACTIVE>
> __
> Also nothing unusual on dmesg:
> 
> ...
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge1: link state changed to DOWN
> bge1: link state changed to UP
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge1: link sta

Re: LAGG bug or misconfiguration???

2012-03-15 Thread Dean E. Weimer

On 15.03.2012 14:06, Snoop wrote:

Hi there,
a while after setting up my new server (with 8 jails in it) I've decided
(after postponing several times) to properly check the functionality of
the lagg and the result was very disappointing.

The test I've done is very simple.
I've started copying a file from one site to another of my VPN network
(from the server I've been testing the net to another node somewhere
else) and in the meantime I've been physically disconnecting the main
network cable to check the responsiveness of the lagg configuration.
Then I've plugged the cable back to check if the traffic would switch
back to the main NIC as it should.

The result was basically this (lagg0 members: bge0 primary, bge1
secondary)

- when bge0 unplugged the traffic switched almost instantaneously to
bge1
- when bge0 plugged back in, the network stopped working completely with
the two NICs polling synchronously until I manually unplug bge1. Then
within 2-4 seconds traffic goes back on bge0 (I've been waiting for a
little more than a minute maximum to avoid all the active connections on
the server to timeout).

Now, I've repeated the same test about 10-15 times randomly waiting for
different times between the unplug-replug procedure. The result was
always the same.

So, below are the ifconfig outputs:
- before starting the test
- when bge0 gets unplugged
- when bge0 gets plugged back in

I couldn't see anything odd.

___
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:14:ee:00:8a:c0
        inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
        inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
        inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
        inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
        inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
        inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
        inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
        inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
        inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
        media: Ethernet autoselect
        status: active
        laggproto failover
        laggport: bge1 flags=0<>
        laggport: bge0 flags=5<MASTER,ACTIVE>

___
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:14:ee:00:8a:c0
        inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
        inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
        inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
        inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
        inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
        inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
        inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
        inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
        inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
        media: Ethernet autoselect
        status: active
        laggproto failover
        laggport: bge1 flags=4<ACTIVE>
        laggport: bge0 flags=1<MASTER>

___

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:14:ee:00:8a:c0
        inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
        inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
        inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
        inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
        inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
        inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
        inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
        inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
        inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
        media: Ethernet autoselect
        status: active
        laggproto failover
        laggport: bge1 flags=0<>
        laggport: bge0 flags=5<MASTER,ACTIVE>

__
Also nothing unusual on dmesg:

...
bge0: link state changed to DOWN
bge0: link state changed to UP
bge1: link state changed to DOWN
bge1: link state changed to UP
bge0: link state changed to DOWN
bge0: link state changed to UP
bge1: link state changed to DOWN
bge1: link state changed to UP
bge0: link state changed to DOWN
bge0: link state changed to UP
bge1: link state changed to DOWN
bge1: link state changed to UP
...

The following is the related configuration in rc.conf:

...
ifconfig_bge0="up"
ifconfig_bge1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport bge0 laggport bge1 xxx.xx.xx.224/24"
ifconfig_lagg0_alias_0="inet xxx.xx.xx.225/32"
ifconfig_lagg0_alias_1="

LAGG bug or misconfiguration???

2012-03-15 Thread Snoop
Hi there,
a while after setting up my new server (with 8 jails in it) I finally
decided (after postponing several times) to properly check the
functionality of the lagg, and the result was very disappointing.

The test I did is very simple.
I started copying a file from one site of my VPN network to another
(from the server under test to a node somewhere else) and, while the
copy was running, I physically disconnected the main network cable to
check the responsiveness of the lagg configuration. Then I plugged the
cable back in to check whether the traffic would switch back to the
main NIC as it should.

The result was basically this (lagg0 members: bge0 primary, bge1
secondary):

- when bge0 was unplugged, the traffic switched almost instantaneously
to bge1
- when bge0 was plugged back in, the network stopped working completely,
with the two NICs polling synchronously, until I manually unplugged
bge1. Then, within 2-4 seconds, traffic went back to bge0 (I waited a
little more than a minute at most, to keep the active connections on
the server from timing out).

I repeated the same test about 10-15 times, waiting a different, random
amount of time between unplugging and replugging. The result was always
the same.
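
For anyone repeating this kind of test, a small helper can make it easier to see which port the lagg considers active at each step. This is only a sketch: the function name active_ports is made up for illustration, and it assumes ifconfig prints the decoded flag names on each laggport line (for example flags=5<MASTER,ACTIVE>), as FreeBSD's ifconfig normally does.

```sh
# Print the name of every lagg member that ifconfig reports as ACTIVE.
# Expects the output of "ifconfig lagg0" on standard input.
# (active_ports is a hypothetical helper name, not a FreeBSD command.)
active_ports() {
    awk '$1 == "laggport:" && $3 ~ /ACTIVE/ { print $2 }'
}
```

Running something like "ifconfig lagg0 | active_ports" once per second while unplugging and replugging the cables would give a simple record of when the active port changes.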

So, below are the ifconfig outputs:
- before starting the test
- when bge0 gets unplugged
- when bge0 gets plugged back in

I couldn't see anything odd.
___
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:14:ee:00:8a:c0
        inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
        inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
        inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
        inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
        inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
        inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
        inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
        inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
        inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
        media: Ethernet autoselect
        status: active
        laggproto failover
        laggport: bge1 flags=0<>
        laggport: bge0 flags=5<MASTER,ACTIVE>
___
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:14:ee:00:8a:c0
        inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
        inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
        inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
        inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
        inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
        inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
        inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
        inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
        inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
        media: Ethernet autoselect
        status: active
        laggproto failover
        laggport: bge1 flags=4<ACTIVE>
        laggport: bge0 flags=1<MASTER>
___

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:14:ee:00:8a:c0
        inet xxx.xx.xx.224 netmask 0xffffff00 broadcast xxx.xx.xx.255
        inet xxx.xx.xx.227 netmask 0xffffffff broadcast xxx.xx.xx.227
        inet xxx.xx.xx.225 netmask 0xffffffff broadcast xxx.xx.xx.225
        inet 172.16.3.2 netmask 0xffffffff broadcast 172.16.3.2
        inet 172.16.3.3 netmask 0xffffffff broadcast 172.16.3.3
        inet 172.16.3.4 netmask 0xffffffff broadcast 172.16.3.4
        inet 172.16.3.5 netmask 0xffffffff broadcast 172.16.3.5
        inet 172.16.3.6 netmask 0xffffffff broadcast 172.16.3.6
        inet xxx.xx.xx.226 netmask 0xffffffff broadcast xxx.xx.xx.226
        media: Ethernet autoselect
        status: active
        laggproto failover
        laggport: bge1 flags=0<>
        laggport: bge0 flags=5<MASTER,ACTIVE>
__
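
A note on reading the laggport lines in these outputs: the flags value is a hexadecimal bitmask. In FreeBSD's if_lagg.h, 0x1 is MASTER, 0x2 is STACK and 0x4 is ACTIVE, so flags=5 is the primary port carrying traffic, flags=4 is a non-primary port carrying traffic, and flags=1 is the primary port while it is not in use. The helper below is only a sketch of that decoding; decode_laggport_flags is a hypothetical name, and it deliberately covers just these three bits.

```sh
# Decode a laggport flags value (hex, as printed by ifconfig) into names.
# Only the three low bits are handled; this is an illustrative subset.
# (decode_laggport_flags is a hypothetical helper, not part of FreeBSD.)
decode_laggport_flags() {
    value=$(( 0x$1 ))
    names=""
    if [ $(( value & 0x1 )) -ne 0 ]; then names="${names:+$names,}MASTER"; fi
    if [ $(( value & 0x2 )) -ne 0 ]; then names="${names:+$names,}STACK"; fi
    if [ $(( value & 0x4 )) -ne 0 ]; then names="${names:+$names,}ACTIVE"; fi
    printf '%s\n' "$names"
}
```

Decoded this way, the capture taken after bge0 was plugged back in is identical to the one taken before the test: the lagg itself had already moved back to bge0.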
Also nothing unusual on dmesg:

...
bge0: link state changed to DOWN
bge0: link state changed to UP
bge1: link state changed to DOWN
bge1: link state changed to UP
bge0: link state changed to DOWN
bge0: link state changed to UP
bge1: link state changed to DOWN
bge1: link state changed to UP
bge0: link state changed to DOWN
bge0: link state changed to UP
bge1: link state changed to DOWN
bge1: link state changed to UP
...
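
When the log is longer than this excerpt, counting the link events per interface is a quick way to check that every DOWN has a matching UP. The following is a minimal sketch, assuming the kernel messages have exactly the "link state changed to" form shown here; link_summary is a made-up name.

```sh
# Count "link state changed" events per interface and state.
# Expects dmesg-style lines on standard input.
# (link_summary is a hypothetical helper name.)
link_summary() {
    awk '/link state changed to/ {
        sub(":", "", $1)            # "bge0:" -> "bge0"
        count[$1 " " $NF]++         # key is "<interface> <state>"
    }
    END { for (k in count) print k, count[k] }' | sort
}
```

It could be fed with something like "dmesg | link_summary".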

The following is the related configuration in rc.conf:

...
ifconfig_bge0="up"
ifconfig_bge1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport bge0 laggport bge1 xxx.xx.xx.224/24"
ifconfig_lagg0_alias_0="inet xxx.xx.xx.225/32"
ifconfig_lagg0_alias_1="inet xxx.xx.xx.226/32"
ifconfig_lagg0_alias_2="inet xxx.xx.x