Re: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread Nikos Vassiliadis

On 2/18/2011 7:49 PM, kevin wrote:

My current testing has shown little promise -- both firewalls will go up,
traffic will only go to the first firewall. If I reboot that first firewall,
no traffic will flow to the second bridging firewall. Note that all IPs on
my network (inside and out) are public IPs, there are no private ips on my
network.


Could you send your ifconfig bridge output from both firewalls?
If STP is turned off on the four switch ports that the firewalls are
patched into, one of the two firewalls must be the root of the spanning tree.

Be sure that STP is *really* turned off on the switch, use tcpdump on the
physical ports for this.

Be sure that FreeBSD's BPDUs are forwarded by the switch, so that one
bridging firewall can exchange BPDUs with the other.

HTH, Nikos
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


bge0 watchdog timeout -- resetting on 8.2-PREREL never recovers

2011-02-19 Thread Steven Hartland

Just updated a box to 8.2-PREREL as of Friday, and now when we push any
serious amount of network traffic we see:-
bge0: watchdog timeout -- resetting
bge0: link state changed to DOWN
bge0: link state changed to UP

The interface never recovers; we have to use the remote console to down it, wait
30 seconds, then up the interface to restore network access.

These are the details from dmesg:-
bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x002100> mem 0xfc9f-0xfc9f irq 26 at device 5.0 on pci3
bge0: CHIP ID 0x2100; ASIC REV 0x02; CHIP REV 0x21; PCI-X
miibus0: <MII bus> on bge0
brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:30:48:75:32:42
bge0: [ITHREAD]


pciconf -lbvc:-
bge0@pci0:3:5:0: class=0x02 card=0x164815d9 chip=0x164814e4 rev=0x10 hdr=0x00
   vendor = 'Broadcom Corporation'
   device = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
   class  = network
   subclass   = ethernet
   bar   [10] = type Memory, range 64, base 0xfc9f, size 65536, enabled
   cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
   cap 01[48] = powerspec 2  supports D0 D3  current D0
   cap 03[50] = VPD
   cap 05[58] = MSI supports 8 messages, 64 bit

Searching around, I thought that r216970 might fix this, but it looks
like that fix is already present here (if_bge.c,v 1.226.2.49).

   Regards
   Steve 




This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 


In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to postmas...@multiplay.co.uk.



Re: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread Nikos Vassiliadis

On 2/19/2011 4:13 PM, kevin wrote:



Could you send your ifconfig bridge output from both firewalls?
If STP is turned off on the four switch ports that the firewalls are
patched, one of the two firewalls must be root of the spanning tree.


I believe if you don't specify 'stp' in the rc.conf ifconfig statement,
freebsd by default sets the bridge as 'rstp' :


Yes, that's correct.



sdh-fw# ifconfig
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
 ether 06:c7:a9:50:41:17
 id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
 maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
 root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
 member: bge1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
 ifmaxaddr 0 port 3 priority 128 path cost 55
 member: bge0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
 ifmaxaddr 0 port 2 priority 128 path cost 55



There is no active STP there. The port should look like this:
LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP
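
The difference between the two flag words (143 without STP, 1c7 with it) can be
checked mechanically. What follows is only a minimal sketch: the bit values are
inferred from the two flag words and decoded names quoted in this thread,
assuming ifconfig prints the names in bit order, and the macro and function
names are hypothetical rather than the kernel's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values, inferred from the flag words 0x143 and 0x1c7 above. */
#define PORT_LEARNING  0x001u
#define PORT_DISCOVER  0x002u
#define PORT_STP       0x004u
#define PORT_AUTOEDGE  0x040u
#define PORT_PTP       0x080u
#define PORT_AUTOPTP   0x100u

/* Returns true when the member port has spanning tree enabled. */
static bool
port_runs_stp(uint32_t member_flags)
{
	return (member_flags & PORT_STP) != 0;
}
```

With these values, 0x143 decodes to LEARNING, DISCOVER, AUTOEDGE and AUTOPTP
only, so port_runs_stp() is false for kevin's bge0 and bge1, and true for the
0x1c7 ports in the lab output.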

You should also see the bridge's ID and not 00:00:00:00:00:00:

 id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15


You should also see the root bridge's ID of the STP domain:

 root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0


A bridge will look like this:
bridge2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether a2:ae:00:08:a7:ab
        inet 10.16.0.2 netmask 0xff00 broadcast 10.255.255.255
        id 00:17:d6:a9:31:e7 priority 16384 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:12:cf:69:e9:ea priority 16384 ifcost 14183 port 4
        member: epair14b flags=1c7<LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP>
                ifmaxaddr 0 port 9 priority 128 path cost 14183 proto rstp
                role designated state forwarding
        member: epair13b flags=1c7<LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP>
                ifmaxaddr 0 port 8 priority 128 path cost 14183 proto rstp
                role designated state forwarding
        member: epair10b flags=1c7<LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 14183 proto rstp
                role alternate state discarding
        ...


And the root bridge will look like this:
bridge4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether ae:6e:5a:9d:9b:5c
        inet 10.16.0.4 netmask 0xff00 broadcast 10.255.255.255
        id 00:12:cf:69:e9:ea priority 16384 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:12:cf:69:e9:ea priority 16384 ifcost 0 port 0
        member: epair18b flags=1c7<LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP>
                ifmaxaddr 0 port 9 priority 128 path cost 14183 proto rstp
                role designated state forwarding
        member: epair17b flags=1c7<LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP>
                ifmaxaddr 0 port 8 priority 128 path cost 14183 proto rstp
                role designated state forwarding
        member: epair11a flags=1c7<LEARNING,DISCOVER,STP,AUTOEDGE,PTP,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 14183 proto rstp
                role designated state forwarding
        ...









Be sure that STP is *really* turned off on the switch, use tcpdump on the
physical ports for this.


Should I just turn off STP for every port on the switch or just the ports
connected to the bridge?


Just the ports connected to the bridging firewalls. Your topology looks
like this, correct?

http://img811.imageshack.us/i/bridgingfw.png/

The switch must act as a plain ethernet switch, no stp, no BPDU 
filtering, no nothing.

The STP on the firewalls will handle the loop in the topology.

Be *sure* that STP is active on the firewalls and that the two firewalls are
in a single STP domain (can talk STP to each other), otherwise an L2 loop
will DoS your firewalls...

HTH, Nikos


Re: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread Nikos Vassiliadis

On 2/19/2011 4:52 PM, Nikos Vassiliadis wrote:

I believe if you don't specify 'stp' in the rc.conf ifconfig statement,
freebsd by default sets the bridge as 'rstp' :


Yes, that's correct.


<note to self>
It helps sometimes when you read the actual message before trying to answer:)
</note to self>

No, you have to specify stp there. The default STP mode is RSTP.
If you don't specify stp, you'll get a dumb ethernet bridge.

Nikos



Re: bge0 watchdog timeout -- resetting on 8.2-PREREL never recovers

2011-02-19 Thread Steven Hartland

This may be totally unrelated to bge, investigating a potential failing stick
of ram in the machine in question so until we've ruled this out as the cause
don't want to waste anyone's time.

I did, however, notice that the logic of the two fixes for DMA on 5704s on PCI-X
in svn differs, so I'm wondering which one is correct:-
http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=216085&r2=216970
http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=217225&r2=217226

r216970 results in:
1, 0, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,
whereas r217226 results in:
1, BGE_DMA_BNDRY, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,

   Regards
   Steve




RE: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread kevin
No, you have to specify stp there. The default STP mode is RSTP.
If you don't specify stp, you'll get a dumb ethernet bridge.

Thanks very much for clarification. This helps me immensely. My room for
testing is limited so this will help me take the right steps necessary.

One quick last question : would you recommend pfsync in this scenario,
between bridges? I've been hearing a lot of issues with pfsync but I'm not
sure what behavior to expect in a bridging scenario such as this one.




RE: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread kevin
One other thing :

 id 00:17:d6:a9:31:e7 priority 16384 hellotime 2 fwddelay 15

And :

 root id 00:12:cf:69:e9:ea priority 16384 ifcost 0 port 0


I was under the impression the priority for the root bridge should be a
lower number ? Would you be able to post your rc.conf bridge entries for
each bridge, perhaps?

Thanks!




Re: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread Nikos Vassiliadis

On 2/19/2011 6:11 PM, kevin wrote:

One other thing :


id 00:17:d6:a9:31:e7 priority 16384 hellotime 2 fwddelay 15


And :


root id 00:12:cf:69:e9:ea priority 16384 ifcost 0 port 0



I was under the impression the priority for the root bridge should be a
lower number ?


The priority is checked first when two bridges exchange BPDUs. The
bridge with the lower number wins. If the priority is the same then
the bridge's ID is used for the elections.

It would be best to manually set the priority in order to know who
is administratively the active firewall.
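
The election rule described above (priority first, then the bridge's ID as
tie-breaker, lower value winning in both cases) can be sketched as a comparison
function. This is an illustrative sketch only; the struct and function names
are hypothetical and not taken from the FreeBSD bridge code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical representation of what a bridge advertises in its BPDUs. */
struct bridge_id {
	uint16_t priority;  /* compared first, lower wins */
	uint8_t  addr[6];   /* tie-breaker, lower wins */
};

/*
 * Returns a negative value if a is the better root candidate,
 * a positive value if b is, and 0 if both are identical.
 */
static int
bridge_id_cmp(const struct bridge_id *a, const struct bridge_id *b)
{
	if (a->priority != b->priority)
		return (a->priority < b->priority) ? -1 : 1;
	return memcmp(a->addr, b->addr, sizeof(a->addr));
}
```

Applied to the lab output earlier in the thread, both bridges have priority
16384, so the IDs decide: 00:12:cf:69:e9:ea is lower than 00:17:d6:a9:31:e7,
which is why bridge4 is the root. Giving the intended active firewall a lower
priority removes the dependence on the IDs.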


Would you be able to post your rc.conf bridge entries for
each bridge, perhaps?


Sorry, this is a lab environment, I am not using rc
facilities. It is just a shell script...

Nikos


Re: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread Nikos Vassiliadis

On 2/19/2011 6:07 PM, kevin wrote:

One quick last question : would you recommend pfsync in this scenario,
between bridges? I've been hearing a lot of issues with pfsync but I'm not
sure what behavior to expect in a bridging scenario such as this one.


Can't really comment about pfsync as i have no experience of my own...



Re: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread Tom Judge
On 19/02/2011 11:07, kevin wrote:
 No, you have to specify stp there. The default STP mode is RSTP.
 If you don't specify stp, you'll get a dumb ethernet bridge.
 Thanks very much for clarification. This helps me immensely. My room for
 testing is limited so this will help me take the right steps necessary.

 One quick last question : would you recommend pfsync in this scenario,
 between bridges? I've been hearing a lot of issues with pfsync but I'm not
 sure what behavior to expect in a bridging scenario such as this one.


This setup with pfsync will work OK as long as you have STP set up
correctly.

As to the STP.

I can see an issue with this setup if you are using a single switch and
2 firewalls.

You will have the following links:

switch - port 1 - firewall 1 - port 1
switch - port 2 - firewall 1 - port 2
switch - port 3 - firewall 2 - port 1
switch - port 4 - firewall 2 - port 2

In this setup it does not matter where the root bridge is; each of the
firewalls will always have one port in discarding state, as both ports
lead back to the same peer bridge. With states such as:

fw 1 - 1: forwarding
fw 2 - 1: forwarding
fw 1 - 2: discarding - backup
fw 2 - 2: discarding - backup


If you disable STP on the ports for the firewalls you will have virtual
links:

firewall 1 - port 1 - firewall 2 - port 1
firewall 1 - port 2 - firewall 2 - port 2

This will create the following states (the same as above):

fw 1 - 1: forwarding
fw 2 - 1: forwarding
fw 1 - 2: discarding - backup
fw 2 - 2: discarding - backup

There is also a caveat: the switch will probably _not_ forward the
STP BPDUs from one port to another. A properly compliant bridge will not
forward these frames, because they are addressed to a link-local Ethernet
multicast address, which a bridge is not allowed to forward per the Ethernet
spec. If this is indeed the case, you will create an instant forwarding loop
in your network when you try to make it work.
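
To make the caveat concrete: STP BPDUs are sent to the destination address
01:80:C2:00:00:00, and IEEE 802.1D reserves the block 01:80:C2:00:00:00 through
01:80:C2:00:00:0F as addresses a compliant bridge must not forward. A minimal
sketch of that check follows; the function name is hypothetical and this is not
code from any particular switch or from the FreeBSD bridge.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if dst lies in the reserved block
 * 01:80:C2:00:00:00 - 01:80:C2:00:00:0F, which a compliant
 * bridge consumes or drops instead of forwarding.
 */
static bool
is_bridge_reserved_addr(const uint8_t dst[6])
{
	return dst[0] == 0x01 && dst[1] == 0x80 && dst[2] == 0xc2 &&
	    dst[3] == 0x00 && dst[4] == 0x00 && (dst[5] & 0xf0) == 0x00;
}
```

A switch that forwards BPDUs while STP is disabled on a port is therefore
deliberately stepping outside this rule, which is why it is usually a
configurable option rather than the standard behaviour.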

You will need to introduce a 4th STP-speaking device to the
configuration, with a topology such as this:


switch 1 
||   |
|fw1-fw 2
||   |
switch 2 

Here the link between switch 1 and 2 is a trunk carrying both VLANs.
This way you can make firewall 1 the root bridge, give firewall 2 the
second-highest priority, and give the switches equal third priorities. I
would also recommend that FW 1 and 2 have opposite VLAN assignments on
each switch; this way you can add a 3rd port to each firewall and link
them together, and you will be able to survive a switch failure as well.









Vote in favor of keeping ATM (was: NATM still scheduled for removal - please follow up to keep it in-tree)

2011-02-19 Thread Martin Birgmeier

Hi,

I would like to vote in favor of keeping NATM in the kernel. The reason 
is that I am currently working on writing a device driver for the 
SpeedTouch USB modem, to replace ports/net/pppoa which is not supported 
on FreeBSD8+ any more. The SpeedTouch USB in fact terminates as an ATM 
connection, on top of which PPPoA needs to be layered.


From my experiments with compiling a kernel with device atm and 
options NATM, I know that currently only the former works, this being 
due to unmaintained and broken code dealing with routing entries.


I am currently not much of a kernel code expert, but have already 
managed to write enough of the USB side of the device driver to load the 
modem's firmware. The next step would be to connect it to the ATM stack, 
using this route:


1. Terminate as ATM interface (ATM cells arriving);
2. The ATM stack implements AAL5 (I hope);
3. Capture the interface via ng_atm (which, as far as I understand,
   would more aptly be named ng_natm);
4. Extend the functionality of ng_atmllc (which basically does a
   small subset of RFC2684) to also do LLC/ISO (cf. RFC2364) (then better
   named ng_llc);
5. Couple the resultant PPP stream to ng_ppp;
6. Use something to configure the VPI/VCI (what?);
7. Run ports/net/mpd5 on that netgraph node.

5. and 7. could be replaced by ng_tty and ppp(8), but that would be the 
poorer choice as all traffic would have to go through userland again as 
it is doing with ports/net/pppoa.


For this I'd need a) a working ATM stack and b) the help of some kind 
souls in hooking everything up. Hans-Petter Selasky has already been 
very helpful with the USB part in private mail, and I actually wanted to 
solicit more help on the networking side of things privately in order 
not to trumpet out something which I'll probably finish only after 
considerable time, but reading the removal message I felt that I needed 
to make my needs public.


Regards,

Martin

p.s. A few :-) of the questions I have are

- why the original (as I understand HARP) ATM stack was removed (in the 
CVS logs the reason cited is the usual giant lock issue of that time),


- what the differences between the atm and natm stacks are (as I 
understand the latter only supports a subset of the functionality of the 
former - only AAL5?),


- why AF_NATM is different from AF_ATM (hinting that NATM is not a 
replacement of ATM),


- whether and how it is even possible to inject raw ATM cells,

- whether I even need options NATM (currently I can happily 
instantiate a (of course non-functional) ATM interface using just 
device atm),


- what do I need to do on the USB side to start receive and transmit 
machines (do I need to start separate kernel threads or just issue two 
usbd_transfer_setup() calls as for loading the firmware),


- etc. etc.

I do of course read the source, but with the scarce documentation 
available that's a steep learning curve.


p.p.s. Message re-sent from freebsd-atm because up till now I was not 
subscribed to freebsd-net.


--
Martin Birgmeier
Vienna
Austria


Locking in ng_tty.c

2011-02-19 Thread Martin Birgmeier

In ng_tty.c, function ngt_newhook(), there is the following code:

	if (sc->hook)
		return (EISCONN);

	NGTLOCK(sc);
	sc->hook = hook;
	NGTUNLOCK(sc);

I do not think this is proper - should not the test be within the lock?
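
For illustration, the check-then-set pattern with the test moved inside the
lock might look like the sketch below. It is a user-space analogue using a
pthread mutex in place of NGTLOCK/NGTUNLOCK, with hypothetical struct and
function names; it is not a patch for ng_tty.c, and whether the unlocked test
is an actual race there depends on how netgraph serializes calls into the node,
which this post does not establish.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the node's private data. */
struct softc {
	pthread_mutex_t lock;
	void *hook;
};

/* Test and assignment happen under the same lock, so they are atomic. */
static int
attach_hook(struct softc *sc, void *hook)
{
	int error = 0;

	pthread_mutex_lock(&sc->lock);
	if (sc->hook != NULL)
		error = EISCONN;
	else
		sc->hook = hook;
	pthread_mutex_unlock(&sc->lock);
	return (error);
}
```

With the test outside the lock, two concurrent callers could both see a NULL
hook and both proceed to the assignment; with the version above, the second
caller gets EISCONN.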

Regards,

Martin

--
Martin Birgmeier
Vienna
Austria


RE: Bridging + VLANS + RSTP / MSTP

2011-02-19 Thread kevin
There is a also the caveat:  The switch will probably _not_ forward the STP
BPDU's from one port to another. This is because if the switch is a properly
compliant bridge it will not forwards the frames as they are marked as link
local ethernet multicast frame which is not allowed to forwarded by a bridge
per the ethernet spec.  If this is indeed the case you will make an instant
forwarding loop in your network when you try to make it work.

From the user manual of my switch, I have the following options for
BPDU handling:

BPDU Handling - Determines how BPDU packets are managed when STP is disabled
on the port or device. BPDUs are used to transmit spanning tree information.
The possible field values are:

- Filtering - Filters BPDU packets when spanning tree is disabled on an
  interface.

- Flooding - Floods BPDU packets when spanning tree is disabled on an
  interface. This is the default value.

I believe the 'flooding' option will flood BPDU packets to all ports on the
switch. Do you think that would forward the STP BPDUs arriving on the
ports where STP is disabled?

Implementing another switch isn't really an option right now, so if I cannot
get this to work with my existing equipment I will have to redesign the
network without bridging, unfortunately (pf + carp + pfsense + multiple
gateways).


Thanks,

Kevin




Re: bwi vs. bwn

2011-02-19 Thread Paul B. Mahol
On Fri, Feb 18, 2011 at 11:26 PM, grarpamp grarp...@gmail.com wrote:
 Doesn't FreeBSD have some sort of ndiswrapper function for this?
 http://www.broadcom.com/support/802.11/linux_sta.php
 NDISulator, ndis(4).

 Hmm, maybe that only applies to the Windows driver bundles as
 distributed by the vendors (Dell, HP, Lenovo, etc). Or from Microsoft
 itself as part of the OS. And not to this Linux thing.

You need inf and sys file(s), Windows drivers.
Everything is explained in documentation.


Re: bge0 watchdog timeout -- resetting on 8.2-PREREL never recovers

2011-02-19 Thread Pyun YongHyeon
On Sat, Feb 19, 2011 at 03:59:57PM -, Steven Hartland wrote:
 This may be totally unrelated to bge, investigating a potential failing stick
 of ram in the machine in question so until we've ruled this out as the cause
 don't want to waste anyone's time.
 
 I did however notice the logic between the two fixes for DMA on 5704's on PCIX
 in svn differ so wondering which ones correct:-
 http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=216085&r2=216970
 http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=217225&r2=217226
 
 r216970 results in:
 1, 0, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,
 where as r217226 results in:
 1, BGE_DMA_BNDRY, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,
 

I think it would be the same for your case (BCM5704 PCI-X). However,
r217226 would be the better one to address the issue. Actually, I didn't
like the workaround, but there was not much time left to fix it for the
upcoming 8.2/7.4.


Re: [Panic] Dummynet/IPFW related recurring crash.

2011-02-19 Thread Pawel Tyll
Hi guys, lists,

It's me, the bi-weekly panic guy. Guess what, it crashed today. As an
act of desperation I disabled the pipe dumping script after previous
crash, which today turned out to be merely a coincidence and didn't
prevent panics. (I thought it to be a longshot anyway, but it was the
only change I could associate with beginning of this). The only fix I
could come up with, that's very wrong on so many levels, is...

30 1 * * 7 /sbin/shutdown -r +210m Scheduled weekly reboot.

:(

...but it solves it all: panics, inability to dump (which leads to
freeze and requires manual intervention to bring system back up). Oh
well.

Since nobody came up with any interest in having this properly
investigated, then I suppose I'm the only one that uses dummynet for
some larger-scale traffic shaping - maybe that's my mistake? What
others are using? Other tools, Linux, proprietary traffic shapers? I
really have trouble writing another sentence in a way that won't make
me look like an arrogant schmuck that feels entitled to free support,
so I'll stop here. Any more ideas/hints what to do next/pointers how
to debug this properly are of course still welcome.




Re: [Panic] Dummynet/IPFW related recurring crash.

2011-02-19 Thread Brandon Gooch
2011/2/19 Pawel Tyll pt...@nitronet.pl:
 Hi guys, lists,

 It's me, the bi-weekly panic guy. Guess what, it crashed today. As an
 act of desperation I disabled the pipe dumping script after previous
 crash, which today turned out to be merely a coincidence and didn't
 prevent panics. (I thought it to be a longshot anyway, but it was the
 only change I could associate with beginning of this). The only fix I
 could come up with, that's very wrong on so many levels, is...

 30 1 * * 7 /sbin/shutdown -r +210m Scheduled weekly reboot.

 :(

 ...but it solves it all: panics, inability to dump (which leads to
 freeze and requires manual intervention to bring system back up). Oh
 well.

 Since nobody came up with any interest in having this properly
 investigated, then I suppose I'm the only one that uses dummynet for
 some larger-scale traffic shaping - maybe that's my mistake? What
 others are using? Other tools, Linux, proprietary traffic shapers? I
 really have trouble writing another sentence in a way that won't make
 me look like an arrogant schmuck that feels entitled to free support,
 so I'll stop here. Any more ideas/hints what to do next/pointers how
 to debug this properly are of course still welcome.

Same backtrace as reported here?

http://www.freebsd.org/cgi/query-pr.cgi?pr=152360

What revision of the em(4) driver code are you using? I know Jack has
been working out certain quirky issues lately, and there have been
updates to the code since last you posted an update to your problem.

STABLE:
http://svn.freebsd.org/viewvc/base?view=revision&revision=217711

RELENG:
http://svn.freebsd.org/viewvc/base?view=revision&revision=217865

Maybe Jack could shed some light on whether the updates could
somehow work out to be a fix for your problem -- or whether or not
your issue is even related to the em(4) driver.

-Brandon


Re: [Panic] Dummynet/IPFW related recurring crash.

2011-02-19 Thread Jack Vogel
I've never seen a trace like this, and know absolutely nothing about dummynet,
sorry. If it is in some way em's fault, then making sure you have the latest
code would be a good idea. I have a test driver that is under selective test;
it does affect the code path that you seem to be in, so it might be worth a try.
If you want to try it early, just pipe up and I'll send it.

Jack


On Sat, Feb 19, 2011 at 6:16 PM, Brandon Gooch jamesbrandongo...@gmail.com wrote:

 2011/2/19 Pawel Tyll pt...@nitronet.pl:
  Hi guys, lists,
 
  It's me, the bi-weekly panic guy. Guess what, it crashed today. As an
  act of desperation I disabled the pipe dumping script after previous
  crash, which today turned out to be merely a coincidence and didn't
  prevent panics. (I thought it to be a longshot anyway, but it was the
  only change I could associate with beginning of this). The only fix I
  could come up with, that's very wrong on so many levels, is...
 
  30 1 * * 7 /sbin/shutdown -r +210m Scheduled weekly reboot.
 
  :(
 
  ...but it solves it all: panics, inability to dump (which leads to
  freeze and requires manual intervention to bring system back up). Oh
  well.
 
  Since nobody came up with any interest in having this properly
  investigated, then I suppose I'm the only one that uses dummynet for
  some larger-scale traffic shaping - maybe that's my mistake? What
  others are using? Other tools, Linux, proprietary traffic shapers? I
  really have trouble writing another sentence in a way that won't make
  me look like an arrogant schmuck that feels entitled to free support,
  so I'll stop here. Any more ideas/hints what to do next/pointers how
  to debug this properly are of course still welcome.

 Same backtrace as reported here?

 http://www.freebsd.org/cgi/query-pr.cgi?pr=152360

 What revision of the em(4) driver code are you using? I know Jack has
 been working out certain quirky issues lately, and there have been
 updates to the code since last you posted an update to your problem.

 STABLE:
 http://svn.freebsd.org/viewvc/base?view=revision&revision=217711

 RELENG:
 http://svn.freebsd.org/viewvc/base?view=revision&revision=217865

 Maybe Jack could shed some light on the whether the updates could
 somehow work out to be a fix for your problem -- or whether or not
 your issue is even related to the em(4) driver.

 -Brandon



ARP issue post DDoS

2011-02-19 Thread Mike M
Hi,

After receiving a DDoS recently (likely SYN related on ports with
legitimate services), I was unable to contact my primary interface
gateway (immediate switch it's connected to).

When I looked at the ARP table I saw an 'incomplete' entry for this
gateway.  I deleted it manually then watched the ARP traffic on the
interface and saw the who-has requests, but saw no replies.

NOC suggested that something looked messed up in the TCP/IP stack of the
OS and suggested I reboot the machine.

When I rebooted, everything came right again.

Any ideas what caused this or, more importantly, how to prevent it from
happening in the future?  I'm concerned it will happen again and obviously
don't want to have to keep rebooting the machine.

The box is running FreeBSD 8.1-RELEASE-p2
Intel Xeon 2.4GHz w/4GB RAM

2 x NetXtreme Gigabit Ethernet PCI Express (BCM5721)

No idea if the below helps or not.  Note the netstat statistics were not
captured at the time this happened, I just grabbed them now.

# pfctl -s memory
states        hard limit     1000
src-nodes     hard limit        1
frags         hard limit     5000
tables        hard limit     1000
table-entries hard limit       10

#  netstat -m
1027/11393/12420 mbufs in use (current/cache/total)
1025/4215/5240/65000 mbuf clusters in use (current/cache/total/max)
1024/3456 mbuf+clusters out of packet secondary zone in use (current/cache)
0/199/199/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
2306K/12074K/14381K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Any help would be much appreciated.

Regards,

- Mike




Re: [Panic] Dummynet/IPFW related recurring crash.

2011-02-19 Thread Pawel Tyll
 Same backtrace as reported here?
I'm unable to get the new backtrace, but judging from what I can see
on the console pre-reboot, it's exactly the same deal since
8.0-RELEASE - panic with dummynet as main star.

 What revision of the em(4) driver code are you using? I know Jack has
 been working out certain quirky issues lately, and there have been
 updates to the code since last you posted an update to your problem.
em0: <Intel(R) PRO/1000 Network Connection 7.1.8> port 0x2040-0x205f mem 0xb1a0-0xb1a1,0xb1a25000-0xb1a25fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:15:17:...
em1: <Intel(R) PRO/1000 Network Connection 7.1.8> port 0x1000-0x101f mem 0xb190-0xb191,0xb192-0xb1923fff irq 16 at device 0.0 on pci2
em1: Using MSIX interrupts with 3 vectors
em1: [ITHREAD]
em1: [ITHREAD]
em1: [ITHREAD]
em1: Ethernet address: 00:15:17:..

 Maybe Jack could shed some light on the whether the updates could
 somehow work out to be a fix for your problem -- or whether or not
 your issue is even related to the em(4) driver.
I'm not sure anymore that em is at fault here. Anyway, currently I'm
running FreeBSD 8.2-PRERELEASE #1: Fri Jan  7 17:19:28. If something
happened since then that could fix this, I'll gladly update.




Re: [Panic] Dummynet/IPFW related recurring crash.

2011-02-19 Thread Pawel Tyll
 I was actually going to suggest Pawel try a different network device if
 possible. I'm using dummynet on a network gateway equipped with
 on-board bge(4). I haven't had any crashes, but then again, I'm not
 seeing that many packets either.
It seems to be closely related to the number of packets processed.
We've recently connected some new customers, which obviously translated
to more traffic and more pipes:

 #              Uptime | System                  Boot up
-----------------------+--------------------------------------------------
 1   15 days, 05:42:23 | FreeBSD 8.2-PRERELEASE  Wed Dec 22 16:19:55 2010
 2   14 days, 22:31:42 | FreeBSD 8.2-PRERELEASE  Sun Jan  9 03:07:59 2011
 3   14 days, 05:25:46 | FreeBSD 8.2-PRERELEASE  Mon Jan 24 01:55:34 2011
 4   12 days, 15:42:45 | FreeBSD 8.2-PRERELEASE  Mon Feb  7 07:30:07 2011

In general, our traffic keeps increasing, which apparently translates
to crashing sooner. When all this started I remember seeing crashes
closer to three weeks apart, but this time it surprised me by crashing
two days early.

As for trying a different device, these are on-board interfaces on an
Intel mainboard. I don't have any experience with external PCI Express
adapters other than Intel's; could you recommend something stable?




Re: [Panic] Dummynet/IPFW related recurring crash.

2011-02-19 Thread Pawel Tyll
 I've never seen a trace like this, and know absolutely nothing about
 dummynet, sorry.
 If it is in some way em's fault, then making sure you have the latest
 code would be a good idea. I have a test driver that is under selective
 test, it does affect the code path that you seem to be in, so it might
 be worth a try. If you want to try it early just pipe up and I'll send
 it.
I'm less and less sure that it has anything to do with em. I'd like to
hear Luigi's take on all this. That being said, I'll gladly try the
new driver. If I'm right, I'll drop under the 7-day reboot threshold
later in the year anyway, so I really need a permanent solution of
some kind. Apparently the next crash always comes sooner than the
previous one, which coincides with growing traffic.




Re: igb driver RX (was TX) hangs when out of mbuf clusters

2011-02-19 Thread Arnaud Lacombe
Hi Jack,

It would seem I've just encountered this issue on an `em'
interface as well (chip ID 0x10d38086). The system has been up for a
bit more than a day. netstat(1) lists about 2500 denied cluster
allocations. The interface in question was unable to receive traffic,
but it continued to transmit ARP requests. Comparing successive
outputs of the sysctl statistics showed an increase in missed packets:

Over a 10s time frame:
-dev.em.5.mac_stats.missed_packets: 288412
+dev.em.5.mac_stats.missed_packets: 288423

The TX accounting and interrupt counts went up as I'd expect. Bringing
the interface down and back up with ifconfig restored connectivity.
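
For anyone wanting to compare such counter samples between boxes, the
arithmetic above can be sketched as a small C helper. This is only an
illustration: counter_rate() is a made-up name, not something from the
driver or from sysctl(8), and it assumes the counter only ever grows
between the two samples.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of any driver or tool): average growth
 * of a monotonically increasing counter between two samples, in events
 * per second. Returns -1.0 if the interval is not positive or if the
 * counter went backwards (for example after an interface reset).
 */
double counter_rate(uint64_t before, uint64_t after, double interval_s)
{
    if (interval_s <= 0.0 || after < before)
        return -1.0;
    return (double)(after - before) / interval_s;
}
```

With the two missed_packets samples quoted above (288412 and 288423,
taken 10 seconds apart) this works out to 11 missed packets, or about
1.1 per second.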

 - Arnaud

On Fri, Feb 11, 2011 at 2:53 PM, Karim Fodil-Lemelin
fodillemlinka...@gmail.com wrote:
 Hi,

 I see a commit was made in current (r218530 | jfv | 2011-02-10 20:00:26
 -0500 (Thu, 10 Feb 2011)). Is that commit done to address this issue?

 And if so, is there any MFC planned for 7.4?

 Thanks,

 Karim.

 2011/2/9 Michael Tuexen tue...@freebsd.org

 On Feb 9, 2011, at 6:35 PM, Jack Vogel wrote:

  OK, but the question is why does the ring get totally consumed this
  way, the ring has 1024 descriptors, it seems unintuitive that that
  whole quantity can be used without some being recharged. Do you see
  the system mbuf pool being depleted at the same time?
 That was the test case I created: I set up a server accepting connections
 but not reading anything. So the driver passes the mbufs to the transport
 stack and they are not consumed. Then the problem occurs. Then I kill the
 server. Now there are mbufs available again, but the driver doesn't know.

 I had the impression that these were the circumstances in which the problem
 showed up (mbuf allocations failing).
 
  Since you can reproduce it, do me a favor, in rxeof, change the
  processed value from 8 to 4 and then 1, effectively call refresh every
  descriptor, see if that eliminates the issue.
 I will do. Need to see if I can do it remotely, since I'm not in my lab
 right now. Can do it tomorrow for sure.

 But I do not think that this solves the problem, since I did the things
 very slowly and you call it at least when you are leaving rxeof.

 Best regards
 Michael
 
  Thanks for your help,
 
  Jack
 
 
  On Wed, Feb 9, 2011 at 2:36 AM, Michael Tuexen tue...@freebsd.org wrote:
  Hi Jack,
 
  I could recreate the problem. When the problem occurs, we see
 
  rx_nxt_check = n
  rx_nxt_refresh = n + 1
 
  (This was also reported in a mail from Karim)
 
  This means that the *whole* receive ring has no buffers anymore. This can
  occur if, for some amount of time, no clusters are available.
 
  Now outside of the driver, at some point of time, clusters are freed.
  I don't think that igb_refresh_mbufs() gets called, since it only gets
  called from igb_rxeof(), which gets called when a packet has been
  received, which can not happen since the receive ring is empty. So how
  can the driver know? I have no idea. Maybe we can periodically check
  for such an event and call igb_refresh_mbufs().
 
  Does this make sense to you?
 
  Best regards
  Michael
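
The stall described in the message above can be illustrated with a toy
model in C. This is only a sketch under stated assumptions, not the
igb(4) code: every name here (toy_ring, toy_rx, toy_refresh, toy_tick)
is made up, the ring has 8 slots instead of 1024, and the index
bookkeeping is simplified compared to the rx_nxt_check/rx_nxt_refresh
values reported in the thread. It only shows why a refresh that is
reachable solely from the receive path cannot recover an empty ring,
and how a periodic check of the kind Michael suggests would.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8 /* toy size; the thread mentions 1024 descriptors */

/* Toy model only: names and layout are hypothetical, not from if_igb.c. */
struct toy_ring {
    bool   has_buf[RING_SIZE]; /* slot currently holds a receive buffer */
    size_t next_to_check;      /* next slot the receive path examines */
    size_t next_to_refresh;    /* oldest slot waiting for a fresh buffer */
    int    pool;               /* buffers left in the simulated system pool */
};

/* Start with a fully populated ring and the given number of spare buffers. */
void toy_init(struct toy_ring *r, int pool)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        r->has_buf[i] = true;
    r->next_to_check = 0;
    r->next_to_refresh = 0;
    r->pool = pool;
}

/* Refill empty slots while the pool has buffers; returns the number refilled. */
int toy_refresh(struct toy_ring *r)
{
    int n = 0;
    while (!r->has_buf[r->next_to_refresh] && r->pool > 0) {
        r->has_buf[r->next_to_refresh] = true;
        r->pool--;
        r->next_to_refresh = (r->next_to_refresh + 1) % RING_SIZE;
        n++;
    }
    return n;
}

/* True when no slot in the ring holds a buffer. */
bool toy_stalled(const struct toy_ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        if (r->has_buf[i])
            return false;
    return true;
}

/* Receive one packet. The refresh is only reached after a successful receive. */
bool toy_rx(struct toy_ring *r)
{
    if (!r->has_buf[r->next_to_check])
        return false; /* no buffer for the packet, so the receive path never runs */
    r->has_buf[r->next_to_check] = false; /* buffer handed up the stack */
    r->next_to_check = (r->next_to_check + 1) % RING_SIZE;
    toy_refresh(r);
    return true;
}

/* Periodic check: refill a stalled ring from outside the receive path. */
int toy_tick(struct toy_ring *r)
{
    return toy_stalled(r) ? toy_refresh(r) : 0;
}
```

In this model, starting from toy_init() with an empty pool, RING_SIZE
calls to toy_rx() drain the ring. Putting buffers back into the pool
afterwards does not help, because toy_rx() keeps failing before it
reaches toy_refresh(). Only a call to toy_tick() refills the ring and
lets reception resume.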
 
 
  On Feb 9, 2011, at 8:32 AM, Jack Vogel wrote:
 
   Hmmm, well so much for that theory :)
  
   Jack
  
  
   On Tue, Feb 8, 2011 at 4:06 PM, Karim Fodil-Lemelin 
 fodillemlinka...@gmail.com wrote:
  
  
   2011/2/8 Jack Vogel jfvo...@gmail.com
  
  
   I have been following this, and thinking about it. I still am working
   from a theoretical standpoint, but based on a patch I got quite a long
   time back and never quite grokked, I believe now that I might have a
   solution.
  
   The original PR and patch was kern/150516 from Beezar Liu, I was never
   quite comfortable with the code changes, nor convinced that it was a
   real issue and not a misunderstanding. However I think now that this
   very report might be behind what we are seeing today. I have a
   slightly different approach to solving it, of course it remains to be
   seen if it handles it properly.
  
   Please try the patch I've attached, I'm open to further correction or
   polishing of the changes. And thanks to Beezar for his original report
   and changes, this is not for em, but if this eliminates the problem
   it's clearly needed in all drivers.
  
   Jack
  
  
   Hi Jack,
  
   Thanks for your help. I tried your patch and it didn't work, so I
   added a couple of printfs to see if the added code was getting hit:
  
   --- a/freebsd/sys/dev/e1000/if_igb.c
   +++ b/freebsd/sys/dev/e1000/if_igb.c
   @@ -612,7 +612,7 @@ igb_attach(device_t dev)
               device_get_nameunit(dev));
  
           INIT_DEBUGOUT("igb_attach: end");
   -
   +       printf("this driver has a patch from Jack Vogel\n");
           return (0);
  
    err_late:
   @@ -4131,6 +4131,7 @@ igb_rxeof(struct igb_queue *que, int count, int *done)
                   struct mbuf             *sendmp, *mh, *mp;
                   struct igb_rx_buf       *rxbuf;
                   u16                     hlen,