Re: LACP inquiry

2019-06-18 Thread Peter J. Philipp
On Tue, Jun 18, 2019 at 12:31:30PM -0700, Lyndon Nerenberg wrote:
> > The panic indicated that there was no memory left and
> > was in UFS region.  Since this is the only change I did in the last few
> > months
> > I'm guessing there is a memory leak in the LACP routines, somewhere.
> 
> Seems unlikely.  We run LACP trunks on all our firewalls and nginx
> load balancers.  Each of those machines pushes a steady 150 Mb/s of
> traffic through the trunk interfaces, 24 hours a day.
> 
> Are you doing any NFS mounts?  I've seen panics in the past due
> to stuck NFS servers causing clients to run out of mbufs.  But that
> was a long time ago, so it's just a hint based on the panic being
> near the filesystem code ...  Seeing the actual panic traceback
> would help.
> 
> --lyndon

Hmmm, you are probably right, here.  I'll look at sendbuggin' the
panic string and backtrace later today.

To answer your question, no NFS mounts.

Regards,
-peter



Re: LACP inquiry

2019-06-18 Thread Lyndon Nerenberg
> The panic indicated that there was no memory left and
> was in UFS region.  Since this is the only change I did in the last few months
> I'm guessing there is a memory leak in the LACP routines, somewhere.

Seems unlikely.  We run LACP trunks on all our firewalls and nginx
load balancers.  Each of those machines pushes a steady 150 Mb/s of
traffic through the trunk interfaces, 24 hours a day.

Are you doing any NFS mounts?  I've seen panics in the past due
to stuck NFS servers causing clients to run out of mbufs.  But that
was a long time ago, so it's just a hint based on the panic being
near the filesystem code ...  Seeing the actual panic traceback
would help.

--lyndon



LACP inquiry

2019-06-18 Thread Peter J. Philipp
Hi,

For the longest time I had a trunk0 on my router in failover mode.  Last
Friday I redid the config to use trunk LACP with the Netgear switch instead.

Here is my config:

{internet}---[octeon router]---[netgear switch]===[Lanner 6 port firewall]

I have drawn the === in there to indicate that there are two cat5e cables going
to the Lanner; let's call it uranus, which is its hostname.

Today I returned to my apartment after being away for 3 days to find that
uranus had panicked.  The panic indicated that there was no memory left and
was in the UFS region.  Since this is the only change I have made in the last
few months, I'm guessing there is a memory leak in the LACP routines somewhere.
Or I have misconfigured something.  Here is the ifconfig output of trunk0 on
uranus:

>
trunk0: flags=8947 mtu 1500
lladdr 00:90:0b:19:56:04
index 10 priority 0 llprio 3
trunk: trunkproto lacp
trunk id: [(8000,00:90:0b:19:56:04,4054,,),
 (0080,00:00:00:00:00:00,,,)]
trunkport em5 lacp_state actor activity,aggregation,sync,collecting,distributing,defaulted
trunkport em5 lacp_state partner aggregation,sync,collecting,distributing
trunkport em5 active,collecting,distributing
trunkport em0 lacp_state actor activity,aggregation,sync,collecting,distributing,defaulted
trunkport em0 lacp_state partner aggregation,sync,collecting,distributing
trunkport em0 active,collecting,distributing
groups: trunk egress
media: Ethernet autoselect
status: active
<-
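
One thing that may be worth checking in output like this is the "defaulted"
flag in the actor state.  As far as I understand IEEE 802.3ad, that flag means
the port is using default partner information rather than values received in an
LACPDU, which would fit the all-zero partner address in the trunk id.  The
following is only a minimal sketch for pulling that flag out of saved ifconfig
text.  The helper names (lacp_states, defaulted_ports) are made up for
illustration, and the sample is copied from the trunkport lines of the output
above with the wrapped lines joined.

```python
import re

# Sample copied from the ifconfig output in this message (wrapped lines joined).
SAMPLE = """\
trunkport em5 lacp_state actor activity,aggregation,sync,collecting,distributing,defaulted
trunkport em5 lacp_state partner aggregation,sync,collecting,distributing
trunkport em5 active,collecting,distributing
trunkport em0 lacp_state actor activity,aggregation,sync,collecting,distributing,defaulted
trunkport em0 lacp_state partner aggregation,sync,collecting,distributing
trunkport em0 active,collecting,distributing
"""

# Matches only the "lacp_state actor|partner" lines, not the port summary lines.
LACP_LINE = re.compile(
    r"^\s*trunkport\s+(\S+)\s+lacp_state\s+(actor|partner)\s+(\S+)\s*$"
)


def lacp_states(text):
    """Return {port: {"actor": set_of_flags, "partner": set_of_flags}}."""
    states = {}
    for line in text.splitlines():
        m = LACP_LINE.match(line)
        if m:
            port, role, flags = m.groups()
            states.setdefault(port, {})[role] = set(flags.split(","))
    return states


def defaulted_ports(text):
    """Return the sorted ports whose actor state carries 'defaulted'."""
    return sorted(
        port
        for port, roles in lacp_states(text).items()
        if "defaulted" in roles.get("actor", set())
    )


if __name__ == "__main__":
    print(defaulted_ports(SAMPLE))
```

If both ports show up in that list, it may be worth confirming on the switch
side that the two ports are really in an LACP group and not a static one.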

My config for trunk0 looks like this:

uranus$ more /etc/hostname.trunk0
trunkport em0
trunkport em5
trunkproto lacp
inet 192.168.177.40 255.255.255.0 192.168.177.255
inet6 2001:db8:0:30::142 64
up
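
As a quick sanity check on the address lines above, here is a small sketch
using Python's standard ipaddress module to confirm that the broadcast address
matches the netmask and to show which /64 the inet6 address falls in.  It only
checks the arithmetic of those two lines; it says nothing about the trunk or
LACP settings.

```python
import ipaddress

# Values copied from the inet and inet6 lines of /etc/hostname.trunk0.
v4 = ipaddress.ip_interface("192.168.177.40/255.255.255.0")
configured_broadcast = ipaddress.ip_address("192.168.177.255")
v6 = ipaddress.ip_interface("2001:db8:0:30::142/64")

# The broadcast derived from address + netmask should equal the configured one.
broadcast_ok = v4.network.broadcast_address == configured_broadcast

if __name__ == "__main__":
    print("IPv4 network:", v4.network)
    print("broadcast matches netmask:", broadcast_ok)
    print("IPv6 network:", v6.network)
```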

So my question for the short term is, is there anything I'm missing in this
config?  Do I need to set any priorities or anything?

It did work right away; I get a good ping from another host on the
Netgear switch:

beta$ ping uranus
PING uranus.internal.centroid.eu (192.168.177.40): 56 data bytes
64 bytes from 192.168.177.40: icmp_seq=0 ttl=255 time=0.490 ms
64 bytes from 192.168.177.40: icmp_seq=1 ttl=255 time=0.415 ms
64 bytes from 192.168.177.40: icmp_seq=2 ttl=255 time=0.526 ms
64 bytes from 192.168.177.40: icmp_seq=3 ttl=255 time=0.424 ms
^C
--- uranus.internal.centroid.eu ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.415/0.464/0.526/0.046 ms

I'm looking at Rich Seifert's Switch Book, section 9.5.8 (LACP), but it'll take
me a bit to make sense of it all.  Meanwhile I'm recompiling with LACP_DEBUG in
the hope that I'll see something in dmesg indicating that there are functions
exiting without perhaps freeing some memory.

If someone has a successful LACP setup and doesn't have my problems, can you
let me know what you're doing differently?

Best Regards,
-peter