Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Hooman Fazaeli

On 10/31/2011 7:33 AM, Jason Wolfe wrote:



Thanks for looking into this.  I'd be happy to test any patch thrown my way, but keep in mind my issue is only tickled when MSI-X is enabled.  My interfaces aren't bouncing, though it might be 
possible some unique path in the MSI-X code is causing a throughput hang akin to connectivity loss?


Jack, is the delta you're speaking of the 7.2.4 code?  I did manage to get the code from Intel compiled with a couple minutes of work, but haven't loaded it up yet as I didn't see anything that caught 
my untrained eye in the diffs.  I'll wait until it's ported over and would be happy to test if needed.


Conveniently enough I just received another report from my test boxes with a pretty stock loader.conf.  I had forgotten to remove the advanced options from the interfaces after I cycled them to pick 
up the fc_setting=0.  Fixed that up just meow.


hw.em.fc_setting=0
cc_cubic_load=YES

I bounced em0 because dropped packets incremented 368756 to 369124 and the 
interface is not incrementing packets out.

5:35PM up 2 days, 17:45, 0 users, load averages: 0.34, 0.45, 0.48

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet6 X%em0 prefixlen 64 scopeid 0x1
        nd6 options=1<PERFORMNUD>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet6 X%em1 prefixlen 64 scopeid 0x2
        inet6 X prefixlen 64 autoconf
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        inet X.X.X.X netmask 0x
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
        ether X
        inet X.X.X.X netmask 0xff00 broadcast X.X.X.X
        inet6 X%lagg0 prefixlen 64 scopeid 0x5
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect
        status: active
        laggproto loadbalance
        laggport: em0 flags=4<ACTIVE>
        laggport: em1 flags=4<ACTIVE>

interrupt total rate
irq3: uart1 3456 0
cpu0: timer 473404250 2000
irq256: em0:rx 0 24614350 103
irq257: em0:tx 0 1220810972 5157
irq258: em0:link 1 0
irq259: em1:rx 0 1533295149 6477
irq260: em1:tx 0 1194032538 5044
irq261: em1:link 3272 0
irq262: mps0 189602667 801
cpu3: timer 473396089 2000
cpu1: timer 473396089 2000
cpu2: timer 473396081 2000
Total 6055954914 25585

32999/8476/41475 mbufs in use (current/cache/total)
4064/3398/7462/5872038 mbuf clusters in use (current/cache/total/max)
4064/800 mbuf+clusters out of packet secondary zone in use (current/cache)
24900/669/25569/2936019 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
115977K/11591K/127568K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
61 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name   Mtu Network       Address                Ipkts Ierrs Idrop       Opkts  Oerrs  Coll   Drop
em0   1500 Link#1        00:25:90:2a:a2:d7   24946787     0     0  5734180355      0     0 369844
em0   1500 fe80:1::225:9 fe80:1::225:90ff:          0     -     -           2      -     -      -
em1   1500 Link#2        00:25:90:2a:a2:d7 5220869518 15996     0  5429971995      0     0  37009
em1   1500 fe80:2::225:9 fe80:2::225:90ff:          0     -     -           1      -     -      -
em1   1500 2607:f4e8:310 2607:f4e8:310:12:          0     -     -           0      -     -      -
lagg0 1500 Link#5        00:25:90:2a:a2:d7 5245767782     0     0 11162877037 406853     0      0
lagg0 1500 69.164.38.0/2 69.164.38.69      4776881809     -     - 11164303625      -     -      -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff:          0     -     -           3      -     -      -

kern.msgbuf:

Oct 30 17:08:38 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:12:10 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:20:20 cds1019 last message repeated 3 times
Oct 30 17:32:13 cds1019 last message repeated 4 times
Oct 30 17:34:27 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:35:03 cds1019 kernel: Interface is RUNNING and INACTIVE
Oct 30 17:35:03 cds1019 kernel: em0: hw tdh = 818, hw tdt = 818
Oct 30 17:35:03 cds1019 kernel: em0: hw rdh = 99, hw rdt = 98
Oct 30 17:35:03 cds1019 kernel: em0: Tx 

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Emil Muratov




You may try these settings and see if they help:

- hw.em.fc_setting=0 (in /boot/loader.conf)
- hw.em.rxd=4096 (in /boot/loader.conf)
- hw.em.txd=4096 (in /boot/loader.conf)
- Fix speed and duplex at both link sides. After doing that, confirm on the
  FreeBSD box (with ifconfig) and on the other device (with whatever command
  it provides) that the same speed and duplex are used by both devices.

You also have high values for dev.em.0.rx/tx_[abs]_int_delay. If you
have set them manually, remove them or replace them with these in loader.conf:


hw.em.rx_int_delay=0
hw.em.tx_int_delay=66
hw.em.tx_abs_int_delay=66
hw.em.rx_abs_int_delay=66

These may be set via the corresponding sysctls too.
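
One way to confirm that the delay values above actually took effect is to compare a `sysctl dev.em.0` dump against them. The following is only a minimal sketch: the helper names (`parse_sysctl`, `mismatches`) and the `SUGGESTED` table are hypothetical, and the only input format assumed is the `name: value` lines that appear in the dumps in this thread.

```python
# Hypothetical helper: compare a `sysctl dev.em.0` dump with the interrupt
# delay values suggested above. Names here are illustrative, not part of
# any FreeBSD tool.

SUGGESTED = {
    "rx_int_delay": 0,
    "tx_int_delay": 66,
    "rx_abs_int_delay": 66,
    "tx_abs_int_delay": 66,
}


def parse_sysctl(text, prefix="dev.em.0."):
    """Return {name: int} for every integer-valued 'prefix + name: value' line."""
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.partition(":")
        if not sep or not key.startswith(prefix):
            continue  # not a sysctl line for this device
        try:
            values[key[len(prefix):]] = int(raw.strip())
        except ValueError:
            continue  # string-valued entries such as %desc are skipped
    return values


def mismatches(values, suggested=SUGGESTED):
    """Return {name: (actual, wanted)}; actual is None when the key is absent."""
    return {
        name: (values.get(name), wanted)
        for name, wanted in suggested.items()
        if values.get(name) != wanted
    }
```

Feeding it the output of `sysctl dev.em.0` and getting an empty dictionary back from `mismatches` would mean the four delays match the suggestion; anything else lists the actual and wanted value per tunable.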



Still no luck with the above settings; I've had a couple more lockups.
Here are the most recent details:



=
11.10.30-23:43:06 ... interface em0 is down...
we have Ierrs and no ingoing packets for 5 secs, interface em0 must be toggled


11:43PM  up 1 day,  3:01, 2 users, load averages: 0.76, 0.64, 0.70

 == vmstat -i ==
interrupt                          total       rate
irq18: ehci0                     1145540         11
irq22: nfe0                    473895599       4872
cpu0: timer                    195004026       2005
irq256: ahci0                   12832958        131
irq257: em0:rx 0                95571051        982
irq258: em0:tx 0                88777545        912
irq259: em0:link                     946          0
cpu3: timer                    195003397       2005
cpu1: timer                    195003398       2005
cpu2: timer                    195003399       2005
Total                         1452237859      14932

 == netstat -m ==
5424/1701/7125 mbufs in use (current/cache/total)
719/1185/1904/51200 mbuf clusters in use (current/cache/total/max)
719/582 mbuf+clusters out of packet secondary zone in use (current/cache)
329/583/912/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
4095/342/4437/12800 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
40978K/8205K/49183K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/6663503/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

 == netstat -ind ==
Name   Mtu Network       Address                Ipkts Ierrs Idrop       Opkts  Oerrs  Coll   Drop
usbus    0 Link#1                                   0     0     0           0      0     0      0
usbus    0 Link#2                                   0     0     0           0      0     0      0
nfe0  1500 Link#3        00:25:22:21:86:89  196018201     0     0   350650768      0     0    664
nfe0  1500 fe80::225:22f fe80::225:22ff:fe          0     -     -           0      -     -      -
nfe0  1500 10.16.128.0/1 10.16.189.71               6     -     -    29787707      -     -      -
em0   9000 Link#4        00:1b:21:ab:bf:4a  175676617   949     0   101627139      0     0      0
em0   9000 192.168.168.0 192.168.168.1        7628423     -     -    13654747      -     -      -
em0   9000 fe80::21b:21f fe80::21b:21ff:fe         45     -     -        5747      -     -      -
em0   9000 2002:d5xx:xxx 2002:d5xx::x:            153     -     -         159      -     -      -



Oct 30 23:43:06 ion kernel: Interface is RUNNING and INACTIVE
Oct 30 23:43:07 ion kernel: em0: hw tdh = 2656, hw tdt = 3271
Oct 30 23:43:07 ion kernel: em0: hw rdh = 2112, hw rdt = 2111
Oct 30 23:43:07 ion kernel: em0: Tx Queue Status = 1
Oct 30 23:43:07 ion kernel: em0: TX descriptors avail = 3481
Oct 30 23:43:07 ion kernel: em0: Tx Descriptors avail failure = 0
Oct 30 23:43:07 ion kernel: em0: RX discarded packets = 0
Oct 30 23:43:07 ion kernel: em0: RX Next to Check = 2112
Oct 30 23:43:07 ion kernel: em0: RX Next to Refresh = 2111
net.inet.ip.intr_queue_maxlen: 4096
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0xa01f class=0x02
dev.em.0.%parent: pci2
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 0
dev.em.0.eee_control: 0
dev.em.0.link_irq: 956
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 1
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 100827170
dev.em.0.fc_high_water: 11264
dev.em.0.fc_low_water: 9764
dev.em.0.queue0.txd_head: 2656
dev.em.0.queue0.txd_tail: 3274
dev.em.0.queue0.tx_irq: 88769608
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 2112
dev.em.0.queue0.rxd_tail: 2111
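
The head/tail numbers in the kernel messages above can be cross-checked with simple ring arithmetic. The sketch below assumes a ring size of 4096 descriptors (the hw.em.txd/hw.em.rxd value suggested earlier in the thread) and assumes that "TX descriptors avail" is the ring size minus the descriptors between head and tail; the `pending` helper is hypothetical, not a driver function.

```python
def pending(head, tail, ring_size):
    """Descriptors between head and tail, allowing for wrap-around."""
    return (tail - head) % ring_size


RING = 4096  # assumption: hw.em.txd=4096 and hw.em.rxd=4096 are in effect

# em0 on "ion": hw tdh = 2656, hw tdt = 3271
tx_pending = pending(2656, 3271, RING)   # 615 descriptors queued for TX
tx_avail = RING - tx_pending             # 3481, the value the driver printed

# em0 on "ion": hw rdh = 2112, hw rdt = 2111
rx_owned_by_hw = pending(2112, 2111, RING)  # 4095, tail is one slot behind head

# em0 on "cds1019": hw tdh = 818, hw tdt = 818
tx_pending_cds = pending(818, 818, RING)    # 0, nothing queued for TX
```

Under those assumptions the TX side had 615 descriptors outstanding, which agrees with the reported "TX descriptors avail = 3481". On the RX side the tail sits one slot behind the head, so the hardware would own 4095 buffers, which is the same number as the 4095 9k jumbo clusters shown as current in the netstat -m output. That would suggest the RX ring was not short of buffers at that instant, though this is only an inference from the numbers.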

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Hooman Fazaeli

On 10/31/2011 11:43 AM, Emil Muratov wrote:




You may try these settings and see if they help:

- hw.em.fc_setting=0 (in /boot/loader.conf)
- hw.em.rxd=4096 (in /boot/loader.conf)
- hw.em.txd=4096 (in /boot/loader.conf)
- Fix speed and duplex at both link sides. After doing that, confirm on the
  FreeBSD box (with ifconfig) and on the other device (with whatever command
  it provides) that the same speed and duplex are used by both devices.

You also have high values for dev.em.0.rx/tx_[abs]_int_delay. If you
have set them manually, remove them or replace them with these in loader.conf:

hw.em.rx_int_delay=0
hw.em.tx_int_delay=66
hw.em.tx_abs_int_delay=66
hw.em.rx_abs_int_delay=66

These may be set via the corresponding sysctls too.



Still no luck with the above settings; I've had a couple more lockups.
Here are the most recent details:


=
11.10.30-23:43:06 ... interface em0 is down...
we have Ierrs and no ingoing packets for 5 secs, interface em0 must be toggled

11:43PM  up 1 day,  3:01, 2 users, load averages: 0.76, 0.64, 0.70

 == vmstat -i ==
interrupt                          total       rate
irq18: ehci0                     1145540         11
irq22: nfe0                    473895599       4872
cpu0: timer                    195004026       2005
irq256: ahci0                   12832958        131
irq257: em0:rx 0                95571051        982
irq258: em0:tx 0                88777545        912
irq259: em0:link                     946          0
cpu3: timer                    195003397       2005
cpu1: timer                    195003398       2005
cpu2: timer                    195003399       2005
Total                         1452237859      14932

 == netstat -m ==
5424/1701/7125 mbufs in use (current/cache/total)
719/1185/1904/51200 mbuf clusters in use (current/cache/total/max)
719/582 mbuf+clusters out of packet secondary zone in use (current/cache)
329/583/912/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
4095/342/4437/12800 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
40978K/8205K/49183K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/6663503/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

 == netstat -ind ==
Name   Mtu Network       Address                Ipkts Ierrs Idrop       Opkts  Oerrs  Coll   Drop
usbus    0 Link#1                                   0     0     0           0      0     0      0
usbus    0 Link#2                                   0     0     0           0      0     0      0
nfe0  1500 Link#3        00:25:22:21:86:89  196018201     0     0   350650768      0     0    664
nfe0  1500 fe80::225:22f fe80::225:22ff:fe          0     -     -           0      -     -      -
nfe0  1500 10.16.128.0/1 10.16.189.71               6     -     -    29787707      -     -      -
em0   9000 Link#4        00:1b:21:ab:bf:4a  175676617   949     0   101627139      0     0      0
em0   9000 192.168.168.0 192.168.168.1        7628423     -     -    13654747      -     -      -
em0   9000 fe80::21b:21f fe80::21b:21ff:fe         45     -     -        5747      -     -      -
em0   9000 2002:d5xx:xxx 2002:d5xx::x:            153     -     -         159      -     -      -

Oct 30 23:43:06 ion kernel: Interface is RUNNING and INACTIVE
Oct 30 23:43:07 ion kernel: em0: hw tdh = 2656, hw tdt = 3271
Oct 30 23:43:07 ion kernel: em0: hw rdh = 2112, hw rdt = 2111
Oct 30 23:43:07 ion kernel: em0: Tx Queue Status = 1
Oct 30 23:43:07 ion kernel: em0: TX descriptors avail = 3481
Oct 30 23:43:07 ion kernel: em0: Tx Descriptors avail failure = 0
Oct 30 23:43:07 ion kernel: em0: RX discarded packets = 0
Oct 30 23:43:07 ion kernel: em0: RX Next to Check = 2112
Oct 30 23:43:07 ion kernel: em0: RX Next to Refresh = 2111
net.inet.ip.intr_queue_maxlen: 4096
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0xa01f class=0x02
dev.em.0.%parent: pci2
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 0
dev.em.0.eee_control: 0
dev.em.0.link_irq: 956
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 1
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 100827170
dev.em.0.fc_high_water: 11264
dev.em.0.fc_low_water: 9764
dev.em.0.queue0.txd_head: 2656
dev.em.0.queue0.txd_tail: 3274
dev.em.0.queue0.tx_irq: 88769608
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 2112

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Emil Muratov

On 31.10.2011 12:13, Hooman Fazaeli wrote:





Thanks for looking into this.  I'd be happy to test any patch thrown 
my way, but keep in mind my issue is only tickled when MSI-X is 
enabled.  My interfaces aren't bouncing, though it might be possible 
some unique path in the MSI-X code is causing a throughput hang akin 
to connectivity loss?


Jack, is the delta you're speaking of the 7.2.4 code?  I did manage to 
get the code from Intel compiled with a couple minutes of work, but 
haven't loaded it up yet as I didn't see anything that caught my 
untrained eye in the diffs.  I'll wait until it's ported over and 
would be happy to test if needed.


Conveniently enough I just received another report from my test boxes 
with a pretty stock loader.conf.  I had forgotten to remove the 
advanced options from the interfaces after I cycled them to pick up 
the fc_setting=0.  Fixed that up just meow.


hw.em.fc_setting=0
cc_cubic_load=YES




Jason

Attached is a patch for if_em.c. It flushes the interface queue when it is
full and the link is not active. Please note that when this happens, drops
increase on the interface, and this will trigger your scripts as before.
You need to change the scripts a little, as follows:

  check interface TX status
  if (interface TX seems hung) {
      sleep 5
      check interface TX status
      if (interface TX seems hung) {
          reset the interface
      }
  }
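
The check-wait-check-again logic in the pseudocode above could look roughly like this in a monitoring script. This is a sketch only: `confirm_hang`, `looks_hung`, and the callables passed in are hypothetical names, and the hang criterion (drops rising while packets out do not) is an assumption taken from the script message quoted earlier in the thread.

```python
import time


def looks_hung(before, after):
    """before/after are (opkts, drops) samples taken some time apart.

    TX is treated as hung-looking when drops rose while packets out did not.
    """
    return after[1] > before[1] and after[0] <= before[0]


def confirm_hang(tx_seems_hung, reset_interface, delay=5.0, sleep=time.sleep):
    """Reset only if TX looks hung on two checks `delay` seconds apart.

    tx_seems_hung and reset_interface are caller-supplied callables.
    Returns True if the interface was reset, False otherwise.
    """
    if not tx_seems_hung():
        return False
    sleep(delay)  # give the driver time to flush its queue on its own
    if not tx_seems_hung():
        return False  # transient: the first reading recovered by itself
    reset_interface()
    return True
```

The second check is the point of the change: with the patch applied, a full queue on an inactive link is flushed by the driver and still shows up as drops, so a single reading would trigger a reset that is no longer needed.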

For MULTIQUEUE, it just disables the check for link status (which is
not good), so please test in non-MULTIQUEUE mode.

The patch also contains some minor fixups to compile on 7 plus
a fix from r1.69 which addressed RX hang problem (the fix was
later removed in r1.70). I included it for Emil to give it a
try.

Pls. let me know if you have any problems with patch.





Hi! Thanks for the update, but I can't build it: there is an error in the
build process. Could you kindly take a look at it?



-emil@ion-/usr/src/sys/dev/e1000
--(0) sudo patch < /home/emil/patches/if_em/if_em.c.patch
Password:
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- if_em.c.orig   2011-10-31 11:43:35.0 +0330
|+++ if_em.c        2011-10-31 11:43:35.0 +0330
--
Patching file if_em.c using Plan A...
Hunk #1 succeeded at 85.
Hunk #2 succeeded at 101.
Hunk #3 succeeded at 382 (offset -29 lines).
Hunk #4 succeeded at 400 (offset -29 lines).
Hunk #5 succeeded at 857 (offset -29 lines).
Hunk #6 succeeded at 960 (offset -29 lines).
Hunk #7 succeeded at 1420 (offset -29 lines).
Hunk #8 succeeded at 1436 (offset -29 lines).
Hunk #9 succeeded at 1466 (offset -29 lines).
Hunk #10 succeeded at 2230 (offset -29 lines).
Hunk #11 succeeded at 2338 (offset -29 lines).
Hunk #12 succeeded at 2350 (offset -29 lines).
Hunk #13 succeeded at 3799 (offset -29 lines).
Hunk #14 succeeded at 5164 with fuzz 2 (offset -29 lines).
Hunk #15 succeeded at 5616 (offset -4 lines).
done

-emil@ion-/usr/src/sys/dev/e1000
--(0) sudo patch < /home/emil/patches/if_em/if_em.h.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- if_em.h.orig   2011-10-31 11:43:34.0 +0330
|+++ if_em.h        2011-10-31 11:43:35.0 +0330
--
Patching file if_em.h using Plan A...
Hunk #1 succeeded at 438.
done


#root@ion-/usr/src/sys/modules/em
#-(0) make
Warning: Object directory not changed from original /usr/src/sys/modules/em
awk -f @/tools/makeobjops.awk @/kern/device_if.m -h
awk -f @/tools/makeobjops.awk @/kern/bus_if.m -h
awk -f @/tools/makeobjops.awk @/dev/pci/pci_if.m -h
: opt_inet.h
cc -O2 -pipe -march=nocona -fno-strict-aliasing -Werror -D_KERNEL
-DKLD_MODULE -nostdinc -I/usr/src/sys/modules/em/../../dev/e1000 -I.
-I@ -I@/contrib/altq -finline-limit=8000
--param inline-unit-growth=100 --param large-function-growth=1000
-fno-common -fno-omit-frame-pointer
-mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-sse3
-mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables
-ffreestanding -fstack-protector
-std=iso9899:1999 -fstack-protector -Wall -Wredundant-decls
-Wnested-externs -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Winline
-Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -c
/usr/src/sys/modules/em/../../dev/e1000/if_em.c
/usr/src/sys/modules/em/../../dev/e1000/if_em.c:387: error: 'sysctl__hw_em_children' undeclared here (not in a function)

*** Error code 1

Stop in /usr/src/sys/modules/em.




___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to 

Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

2011-10-31 Thread Hooman Fazaeli

On 10/31/2011 12:51 PM, Emil Muratov wrote:

On 31.10.2011 12:13, Hooman Fazaeli wrote:





Thanks for looking into this.  I'd be happy to test any patch thrown my way, but keep in mind my issue is only tickled when MSI-X is enabled.  My interfaces aren't bouncing, though it might be 
possible some unique path in the MSI-X code is causing a throughput hang akin to connectivity loss?


Jack, is the delta you're speaking of the 7.2.4 code?  I did manage to get the code from Intel compiled with a couple minutes of work, but haven't loaded it up yet as I didn't see anything that 
caught my untrained eye in the diffs.  I'll wait until it's ported over and would be happy to test if needed.


Conveniently enough I just received another report from my test boxes with a pretty stock loader.conf.  I had forgotten to remove the advanced options from the interfaces after I cycled them to 
pick up the fc_setting=0.  Fixed that up just meow.


hw.em.fc_setting=0
cc_cubic_load=YES




Jason

Attached is a patch for if_em.c. It flushes the interface queue when it is
full and the link is not active. Please note that when this happens, drops
increase on the interface, and this will trigger your scripts as before.
You need to change the scripts a little, as follows:

  check interface TX status
  if (interface TX seems hung) {
      sleep 5
      check interface TX status
      if (interface TX seems hung) {
          reset the interface
      }
  }

For MULTIQUEUE, it just disables the check for link status (which is
not good), so please test in non-MULTIQUEUE mode.

The patch also contains some minor fixups to compile on 7 plus
a fix from r1.69 which addressed RX hang problem (the fix was
later removed in r1.70). I included it for Emil to give it a
try.

Pls. let me know if you have any problems with patch.





Hi! Thanks for the update, but I can't build it: there is an error in the
build process. Could you kindly take a look at it?


-emil@ion-/usr/src/sys/dev/e1000
--(0) sudo patch < /home/emil/patches/if_em/if_em.c.patch
Password:
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- if_em.c.orig   2011-10-31 11:43:35.0 +0330
|+++ if_em.c        2011-10-31 11:43:35.0 +0330
--
Patching file if_em.c using Plan A...
Hunk #1 succeeded at 85.
Hunk #2 succeeded at 101.
Hunk #3 succeeded at 382 (offset -29 lines).
Hunk #4 succeeded at 400 (offset -29 lines).
Hunk #5 succeeded at 857 (offset -29 lines).
Hunk #6 succeeded at 960 (offset -29 lines).
Hunk #7 succeeded at 1420 (offset -29 lines).
Hunk #8 succeeded at 1436 (offset -29 lines).
Hunk #9 succeeded at 1466 (offset -29 lines).
Hunk #10 succeeded at 2230 (offset -29 lines).
Hunk #11 succeeded at 2338 (offset -29 lines).
Hunk #12 succeeded at 2350 (offset -29 lines).
Hunk #13 succeeded at 3799 (offset -29 lines).
Hunk #14 succeeded at 5164 with fuzz 2 (offset -29 lines).
Hunk #15 succeeded at 5616 (offset -4 lines).
done

-emil@ion-/usr/src/sys/dev/e1000
--(0) sudo patch < /home/emil/patches/if_em/if_em.h.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- if_em.h.orig   2011-10-31 11:43:34.0 +0330
|+++ if_em.h        2011-10-31 11:43:35.0 +0330
--
Patching file if_em.h using Plan A...
Hunk #1 succeeded at 438.
done


#root@ion-/usr/src/sys/modules/em
#-(0) make
Warning: Object directory not changed from original /usr/src/sys/modules/em
awk -f @/tools/makeobjops.awk @/kern/device_if.m -h
awk -f @/tools/makeobjops.awk @/kern/bus_if.m -h
awk -f @/tools/makeobjops.awk @/dev/pci/pci_if.m -h
: opt_inet.h
cc -O2 -pipe -march=nocona -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc -I/usr/src/sys/modules/em/../../dev/e1000 -I. -I@ -I@/contrib/altq
-finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mfpmath=387
-mno-sse -mno-sse2 -mno-sse3 -mno-mmx -mno-3dnow -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -std=iso9899:1999 -fstack-protector -Wall
-Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -c
/usr/src/sys/modules/em/../../dev/e1000/if_em.c
/usr/src/sys/modules/em/../../dev/e1000/if_em.c:387: error: 'sysctl__hw_em_children' undeclared here (not in a function)
*** Error code 1

Stop in /usr/src/sys/modules/em.





Please sync your sys/dev/e1000 with HEAD and try again:

setenv CVSROOT :pserver:anon...@anoncvs.freebsd.org:/home/ncvs
cvs login
password: enter anonymous
cd /usr/src

Current problem reports assigned to freebsd-net@FreeBSD.org

2011-10-31 Thread FreeBSD bugmaster
Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description

o kern/162153  net        [em] intel em driver 7.2.4 don't compile
o kern/162110  net        [igb] [panic] RELENG_9 panics on boot in IGB driver -
o kern/162028  net        [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161908  net        [netgraph] [patch] ng_vlan update for QinQ support
o kern/161899  net        [route] ntpd(8): Repeating RTM_MISS packets causing hi
o kern/161381  net        [re] RTL8169SC - re0: PHY write failed
o kern/161277  net        [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net        [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net        Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net        [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160420  net        [msk] phy write timeout on HP 5310m
o kern/160293  net        [ieee80211] [panic] kernel panic during network setup
o kern/160206  net        [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net        [udp] write UDPv4: No buffer space available (code=55)
o kern/159795  net        [tcp] excessive duplicate ACKs and TCP session freezes
o kern/159629  net        [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net        [tcp] [panic] panic: soabort: so_count
o kern/159603  net        [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net        [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net        [em] em watchdog timeouts
o kern/159203  net        [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net        [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net        [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net        [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net        [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net        [em] TSO breaks BPF packet captures with em driver
f kern/157802  net        [dummynet] [panic] kernel panic in dummynet
o kern/157785  net        amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157429  net        [re] Realtek RTL8169 doesn't work with re(4)
o kern/157418  net        [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net        [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net        [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157209  net        [ip6] [patch] locking error in rip6_input() (sys/netin
o kern/157200  net        [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net        [lagg] lagg interface not working together with epair
o kern/156877  net        [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net        [em] em0 fails to init on CURRENT after March 17
o kern/156408  net        [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net        [icmp]: host can ping other subnet but no have IP from
o kern/156317  net        [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156283  net        [ip6] [patch] nd6_ns_input - rtalloc_mpath does not re
o kern/156279  net        [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net        [lagg]: failover does not announce the failover to swi
o kern/156030  net        [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155772  net        ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [request] Add driver for Realtek RTL8191SE/RTL8192SE W
o kern/155604  net        [flowtable] Flowtable excessively caches dest MAC addr
o kern/155597  net        [panic] Kernel panics with sbdrop message
o kern/155585  net        [tcp] [panic] tcp_output tcp_mtudisc loop until kernel
o kern/155420  net        [vlan] adding vlan break existent vlan
o bin/155365   net        [patch] routed(8): if.c in routed fails to compile if
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
o kern/155030  net        [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel
o kern/155004  net        [bce] [panic] kernel panic in bce0 driver
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [request]: Port brcm80211 driver from Linux to FreeBSD
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net        [em] Fatal trap 12: em1 taskq only at

Re[2]: PCI-E VT6130 NIC (if_vge) hang system with gigabit link

2011-10-31 Thread Andrey Smagin



October 31, 2011, 04:56, from YongHyeon PYUN pyu...@gmail.com:
> On Sat, Oct 29, 2011 at 09:57:30AM +0400, Andrey Smagin wrote:
> >
> > Ok. With autonegotiation ifconfig show speed 100MBit.
>
> And does vge(4) work without problems with the resolved speed/duplex
> of auto-negotiation?
>
> > with manual 1000baseT full-duplex settings in dmesg:
> > vge0: failed to start MII autopoll
> > vge0: MII read timed out
> > vge0: failed to start MII autopoll
> > vge0: link state changed to UP
> > vge0: MII read timed out
>
> [...]
>
> Did vge(4) ever work with 1000baseT on your box?
Hm... I have never seen 1000baseT with vge on this box, because
it runs only FreeBSD.
> And why do you have to manually configure the 1000baseT link?
With auto-negotiation the link stays at 100 Mbit full-duplex.
> Does the link partner (switch) also use auto-negotiation?
The box also has Intel if_em and if_nfe gigabit cards, which work with the
same cable and link partner (switch) at gigabit speed. I only move the
connector to the if_vge socket.

Re[2]: PCI-E VT6130 NIC (if_vge) hang system with gigabit link

2011-10-31 Thread Andrey Smagin



October 31, 2011, 04:57, from YongHyeon PYUN pyu...@gmail.com:
> On Sat, Oct 29, 2011 at 09:57:30AM +0400, Andrey Smagin wrote:
> >
> > Ok. With autonegotiation ifconfig show speed 100MBit.
>
> And does vge(4) work without problems with the resolved speed/duplex
> of auto-negotiation?

With auto-negotiation, vge connects at 100baseTX and works well.

> > with manual 1000baseT full-duplex settings in dmesg:
> > vge0: failed to start MII autopoll
> > vge0: MII read timed out
> > vge0: failed to start MII autopoll
> > vge0: link state changed to UP
> > vge0: MII read timed out

Re: kern/162201: [ip] [patch] multicast forwarding cache hash always allocated with size 0, resulting in buffer overrun

2011-10-31 Thread linimon
Old Synopsis: [patch] multicast forwarding cache hash always allocated with size 0, resulting in buffer overrun
New Synopsis: [ip] [patch] multicast forwarding cache hash always allocated with size 0, resulting in buffer overrun

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Oct 31 16:46:16 UTC 2011
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=162201


Re: misc/162201: [patch] multicast forwarding cache hash always allocated with size 0, resulting in buffer overrun

2011-10-31 Thread Gleb Smirnoff
The following reply was made to PR kern/162201; it has been noted by GNATS.

From: Gleb Smirnoff gleb...@freebsd.org
To: Stevan Markovic stevan_marko...@mcafee.com
Cc: freebsd-gnats-sub...@freebsd.org, z...@freebsd.org, b...@freebsd.org,
b...@freebsd.org
Subject: Re: misc/162201: [patch] multicast forwarding cache hash always
 allocated with size 0, resulting in buffer overrun
Date: Mon, 31 Oct 2011 21:10:22 +0300

   Hi,
 
 On Mon, Oct 31, 2011 at 04:22:00PM +, Stevan Markovic wrote:
 S Description:
 S Bug is observed as kernel panic shortly after stopping XORP (www.xorp.org) 
configured for PIM/SM routing. Debugging discovered that at the time of MALLOC 
for V_nexpire in ip_mroute.c::vnet_mroute_init() size variable mfchashsize 
always has value 0. 
 S 
 S Why: variable mfchashsize is initialized in module event handler which is 
executed with SI_ORDER_ANY  ordering tag which happens _after_ variable usage 
in MALLOC in VNET_SYSINIT with SI_ORDER_MIDDLE.
 S 
 S Fix simply moves variable initialization before its usage in 
vnet_mroute_init.
 S 
 S This bug is discovered and fixed in McAfee Inc development.
 S How-To-Repeat:
 S Hard to reproduce since system behavior after memory overwrite is 
unpredictable.  Multicast forwarding cashe hash overrun always happens after:
 S a) configuring xorp to use PIM/SM
 S b) starting xorp_rtrmgr
 S c) stopping xorp_rtrmgr.
 S 
 S Fix:
 S Fix simply moves mfchashsize variable initialization before its usage in 
vnet_mroute_init.
 S 
 S Patch attached with submission follows:
 S 
 S Index: ip_mroute.c
 S ===
 S RCS file: /projects/freebsd/src_cvsup/src/sys/netinet/ip_mroute.c,v
 S retrieving revision 1.161
 S diff -u -r1.161 ip_mroute.c
 S --- ip_mroute.c 22 Nov 2010 19:32:54 -  1.161
 S +++ ip_mroute.c 31 Oct 2011 15:54:53 -
 S @@ -2814,7 +2814,13 @@
 S  static void
 S  vnet_mroute_init(const void *unused __unused)
 S  {
 S -
 S +   mfchashsize = MFCHASHSIZE;
 S +   if (TUNABLE_ULONG_FETCH("net.inet.ip.mfchashsize", &mfchashsize) &&
 S +   !powerof2(mfchashsize)) {
 S +   printf("WARNING: %s not a power of 2; using default\n",
 S +   "net.inet.ip.mfchashsize");
 S +   mfchashsize = MFCHASHSIZE;
 S +   }
 S MALLOC(V_nexpire, u_char *, mfchashsize, M_MRTABLE, M_WAITOK|M_ZERO);
 S bzero(V_bw_meter_timers, sizeof(V_bw_meter_timers));
 S callout_init(&V_expire_upcalls_ch, CALLOUT_MPSAFE);
 S @@ -2855,13 +2861,6 @@
 S MFC_LOCK_INIT();
 S VIF_LOCK_INIT();
 S  
 S -   mfchashsize = MFCHASHSIZE;
 S -   if (TUNABLE_ULONG_FETCH("net.inet.ip.mfchashsize", &mfchashsize) &&
 S -   !powerof2(mfchashsize)) {
 S -   printf("WARNING: %s not a power of 2; using default\n",
 S -   "net.inet.ip.mfchashsize");
 S -   mfchashsize = MFCHASHSIZE;
 S -   }
 S  
 S pim_squelch_wholepkt = 0;
 S TUNABLE_ULONG_FETCH("net.inet.pim.squelch_wholepkt",
 
 Have you tried to remove these VNET_SYSINITs at all and do all the
 initialization in the ip_mroute_modevent() itself? From first glance
 I see no reason for separate malloc() + callout_init()s.
 
 I am putting guys, who made and reviewed the commit, into Cc.
 
 -- 
 Totus tuus, Glebius.
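 
 The ordering problem described in the PR can be modelled in userland C.
 This is only a simplified sketch, not kernel code: the enum values, the
 function names and the default of 256 are placeholders invented for the
 illustration, and real SYSINIT ordering also involves a subsystem key that
 is left out here. It shows that a per-instance initializer tagged to run
 before the module-wide one sizes its table from a variable that is still 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel ordering tags; smaller runs first. */
enum init_order { ORDER_MIDDLE = 1, ORDER_ANY = 2 };

#define DEFAULT_HASH_SIZE 256UL   /* placeholder for MFCHASHSIZE */

static unsigned long hash_size;   /* plays the role of mfchashsize */
static size_t requested_bytes;    /* size the instance asked the allocator for */

/* Module-wide setup: chooses the hash size (tunable lookup omitted). */
static void module_init(void) { hash_size = DEFAULT_HASH_SIZE; }

/* Per-instance setup: sizes its table from hash_size as it is right now. */
static void instance_init(void) { requested_bytes = hash_size; }

struct initializer {
    enum init_order order;
    void (*fn)(void);
};

static int by_order(const void *a, const void *b)
{
    const struct initializer *x = a, *y = b;
    return (int)x->order - (int)y->order;
}

/* Run both initializers sorted by tag and report the requested table size. */
static size_t run_boot(enum init_order module_order,
                       enum init_order instance_order)
{
    struct initializer inits[] = {
        { module_order, module_init },
        { instance_order, instance_init },
    };

    /* Equal tags would leave the relative order unspecified. */
    assert(module_order != instance_order);

    hash_size = 0;
    requested_bytes = 0;
    qsort(inits, 2, sizeof(inits[0]), by_order);
    for (size_t i = 0; i < 2; i++)
        inits[i].fn();
    return requested_bytes;
}
```

 With the tags as reported, run_boot(ORDER_ANY, ORDER_MIDDLE) yields a
 zero-byte request; swapping the tags yields the default size. Moving the
 size selection into the per-instance function, as the submitted patch does,
 removes the dependency on the ordering altogether.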


RE: misc/162201: [patch] multicast forwarding cache hash always allocated with size 0, resulting in buffer overrun

2011-10-31 Thread Stevan_Markovic
The following reply was made to PR kern/162201; it has been noted by GNATS.

From: stevan_marko...@mcafee.com
To: gleb...@freebsd.org
Cc: freebsd-gnats-sub...@freebsd.org, z...@freebsd.org, b...@freebsd.org,
b...@freebsd.org
Subject: RE: misc/162201: [patch] multicast forwarding cache hash always
 allocated with size 0, resulting in buffer overrun
Date: Mon, 31 Oct 2011 13:26:43 -0500

 Hi,
 
 Gleb, no, I have not tried to eliminate VNET_SYSINITS and I do not think it
 can be done. My understanding is that VNET_SYSINIT initializes virtual
 network stack instance specific data. Eliminating it would prevent using
 multicast in multiple virtual network stacks.
 
 Stevan
 
 -Original Message-
 From: Gleb Smirnoff [mailto:gleb...@freebsd.org]
 Sent: Monday, October 31, 2011 2:10 PM
 To: Markovic, Stevan
 Cc: freebsd-gnats-sub...@freebsd.org; z...@freebsd.org; b...@freebsd.org;
 bz@FreeBSD.org
 Subject: Re: misc/162201: [patch] multicast forwarding cache hash always
 allocated with size 0, resulting in buffer overrun
 
   Hi,
 
 On Mon, Oct 31, 2011 at 04:22:00PM +, Stevan Markovic wrote:
 S Description:
 S Bug is observed as kernel panic shortly after stopping XORP (www.xorp.org)
 configured for PIM/SM routing. Debugging discovered that at the time of
 MALLOC for V_nexpire in ip_mroute.c::vnet_mroute_init() size variable
 mfchashsize always has value 0.
 S
 S Why: variable mfchashsize is initialized in module event handler which is
 executed with SI_ORDER_ANY  ordering tag which happens _after_ variable
 usage in MALLOC in VNET_SYSINIT with SI_ORDER_MIDDLE.
 S
 S Fix simply moves variable initialization before its usage in
 vnet_mroute_init.
 S
 S This bug is discovered and fixed in McAfee Inc development.
 S How-To-Repeat:
 S Hard to reproduce since system behavior after memory overwrite is
 unpredictable.  Multicast forwarding cache hash overrun always happens after:
 S a) configuring xorp to use PIM/SM
 S b) starting xorp_rtrmgr
 S c) stopping xorp_rtrmgr.
 S
 S Fix:
 S Fix simply moves mfchashsize variable initialization before its usage in
 vnet_mroute_init.
 S
 S Patch attached with submission follows:
 S
 S Index: ip_mroute.c
 S ===================================================================
 S RCS file: /projects/freebsd/src_cvsup/src/sys/netinet/ip_mroute.c,v
 S retrieving revision 1.161
 S diff -u -r1.161 ip_mroute.c
 S --- ip_mroute.c 22 Nov 2010 19:32:54 -  1.161
 S +++ ip_mroute.c 31 Oct 2011 15:54:53 -
 S @@ -2814,7 +2814,13 @@
 S  static void
 S  vnet_mroute_init(const void *unused __unused)
 S  {
 S -
 S +   mfchashsize = MFCHASHSIZE;
 S +   if (TUNABLE_ULONG_FETCH("net.inet.ip.mfchashsize", &mfchashsize) &&
 S +   !powerof2(mfchashsize)) {
 S +   printf("WARNING: %s not a power of 2; using default\n",
 S +   "net.inet.ip.mfchashsize");
 S +   mfchashsize = MFCHASHSIZE;
 S +   }
 S MALLOC(V_nexpire, u_char *, mfchashsize, M_MRTABLE, M_WAITOK|M_ZERO);
 S bzero(V_bw_meter_timers, sizeof(V_bw_meter_timers));
 S callout_init(&V_expire_upcalls_ch, CALLOUT_MPSAFE);
 S @@ -2855,13 +2861,6 @@
 S MFC_LOCK_INIT();
 S VIF_LOCK_INIT();
 S
 S -   mfchashsize = MFCHASHSIZE;
 S -   if (TUNABLE_ULONG_FETCH("net.inet.ip.mfchashsize", &mfchashsize) &&
 S -   !powerof2(mfchashsize)) {
 S -   printf("WARNING: %s not a power of 2; using default\n",
 S -   "net.inet.ip.mfchashsize");
 S -   mfchashsize = MFCHASHSIZE;
 S -   }
 S
 S pim_squelch_wholepkt = 0;
 S TUNABLE_ULONG_FETCH("net.inet.pim.squelch_wholepkt",
 
 Have you tried to remove these VNET_SYSINITs at all and do all the
 initialization in the ip_mroute_modevent() itself? From first glance
 I see no reason for separate malloc() + callout_init()s.
 
 I am putting guys, who made and reviewed the commit, into Cc.
 
 -- 
 Totus tuus, Glebius.


Re: kern/162110: Releng_9 panics on boot in IGB driver - regression from 8.2

2011-10-31 Thread Gleb Smirnoff
The following reply was made to PR kern/162110; it has been noted by GNATS.

From: Gleb Smirnoff gleb...@freebsd.org
To: Frank Terhaar-Yonkers f...@cisco.com
Cc: freebsd-gnats-sub...@freebsd.org, j...@freebsd.org
Subject: Re: kern/162110: Releng_9 panics on boot in IGB driver - regression
 from 8.2
Date: Mon, 31 Oct 2011 22:37:28 +0300

 --LTeJQqWS0MN7I/qa
 Content-Type: text/plain; charset=koi8-r
 Content-Disposition: inline
 
 On Fri, Oct 28, 2011 at 07:43:28PM +, Frank Terhaar-Yonkers wrote:
 F 
 F Number: 162110
 F Category:   kern
 F Synopsis:   Releng_9 panics on boot in IGB driver - regression from 8.2
 F Confidential:   no
 F Severity:   critical
 F Priority:   high
 F Responsible:freebsd-bugs
 F State:  open
 F Quarter:
 F Keywords:   
 F Date-Required:
 F Class:  sw-bug
 F Submitter-Id:   current-users
 F Arrival-Date:   Fri Oct 28 19:50:08 UTC 2011
 F Closed-Date:
 F Last-Modified:
 F Originator: Frank Terhaar-Yonkers
 F Release:Releng_9 CVSUP 2011-October-28
 F Organization:
 F Cisco
 F Environment:
 F FreeBSD fty-zfs-01 9.0-RC1 FreeBSD 9.0-RC1 #1: Fri Oct 28 06:50:23 EDT 2011 
toot@fty-zfs-01:/usr/obj/usr/src/sys/GENERIC  amd64
 F Description:
 F if_igb driver panics during bootup.
 F 
 F The IGB driver probes the device at line 591 of if_igb.c and punts:
 F if (e1000_validate_nvm_checksum(&adapter->hw) < 0) {
 F device_printf(dev,
 F "The EEPROM Checksum Is Not Valid\n");
 F error = EIO;
 F goto err_late;
 F }
 F 
 F The kernel immediately panics with a page fault.  The trace-back shows it's 
in the if_igb driver as the console messages suggest.
 F 
 F Releng_8 did not panic, so this is a regression.  The IGB NIC most likely 
has some sort of problem which is properly diagnosed.
 F 
 F Email me if you want the screen shot of the panic, or have a fix to try out.
 
 To reproduce your problem, I've put a '|| 1)' conditional into the code quoted
 above. It appears that calling igb_detach() in case of igb_attach() failure
 is full of landmines. The attached patch fixes a lot of them, and at least the
 kernel doesn't panic in case of e1000_validate_nvm_checksum() failure; I am
 not sure about other cases.
 
 Unfortunately the patch will not fix your NIC, it only cures the panic.
 
 I've put into Cc Jack Vogel, who is the maintainer of the Intel NIC drivers
 in FreeBSD. Maybe he can help you.
 
 Jack, please consider including my patch in the next version of the driver.
 The issues fixed:
 
 - igb_detach() may be called with an uninitialized ifp
 - igb_stop() may be called with an uninitialized ifp
 - igb_detach() already frees the transmit/receive structures
 - igb_detach() already frees adapter->mta
 - igb_detach() already destroys the core lock
 
 There are probably other edge cases where the kernel panics due to some
 failure in igb_attach(); not all possible error exits were tested.
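 
 The pattern behind the list above can be sketched in plain C. This is a toy
 model, not the igb code: struct toy_adapter, toy_attach() and toy_detach()
 are hypothetical names, and the "frees" counter exists only to make double
 releases visible. The idea is that the detach routine checks every resource
 before touching it, and the attach error path calls it once and frees
 nothing a second time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of a driver softc; not the real igb structures. */
struct toy_adapter {
    int *rings;      /* stands in for the transmit/receive structures */
    int *ifp;        /* stands in for the interface; NULL until it is set up */
    int  lock_ready; /* 1 while the core lock exists */
    int  frees;      /* counts releases so a double free would be noticed */
};

/* Tear down whatever exists; safe on a partially attached adapter. */
static void toy_detach(struct toy_adapter *sc)
{
    assert(sc != NULL);
    if (sc->ifp != NULL) {          /* the interface may never have existed */
        free(sc->ifp);
        sc->ifp = NULL;
        sc->frees++;
    }
    if (sc->rings != NULL) {
        free(sc->rings);
        sc->rings = NULL;
        sc->frees++;
    }
    sc->lock_ready = 0;             /* the lock is destroyed here, only here */
}

/* Attach; fail_early simulates an error that occurs before ifp is created. */
static int toy_attach(struct toy_adapter *sc, int fail_early)
{
    sc->rings = NULL;
    sc->ifp = NULL;
    sc->lock_ready = 1;
    sc->frees = 0;

    sc->rings = calloc(4, sizeof(*sc->rings));
    if (sc->rings == NULL || fail_early)
        goto fail;
    sc->ifp = calloc(1, sizeof(*sc->ifp));
    if (sc->ifp == NULL)
        goto fail;
    return 0;

fail:
    toy_detach(sc);  /* single owner of cleanup; nothing is freed again below */
    return -1;
}
```

 Calling toy_attach(&sc, 1) leaves every pointer NULL with exactly one
 release recorded, and a second toy_detach() on the same structure is a
 no-op.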
 
 -- 
 Totus tuus, Glebius.
 
 --LTeJQqWS0MN7I/qa
 Content-Type: text/x-diff; charset=koi8-r
 Content-Disposition: attachment; filename=if_igb.c.diff
 
 Index: if_igb.c
 ===
 --- if_igb.c   (revision 226966)
 +++ if_igb.c   (working copy)
 @@ -670,11 +670,12 @@
  
  err_late:
    igb_detach(dev);
 -  igb_free_transmit_structures(adapter);
 -  igb_free_receive_structures(adapter);
    igb_release_hw_control(adapter);
    if (adapter->ifp != NULL)
        if_free(adapter->ifp);
 +  igb_free_pci_resources(adapter);
 +  return (error);
 +
  err_pci:
    igb_free_pci_resources(adapter);
    free(adapter->mta, M_DEVBUF);
 @@ -701,26 +702,37 @@
  
    INIT_DEBUGOUT("igb_detach: begin");
  
 -  /* Make sure VLANS are not using driver */
 -  if (adapter->ifp->if_vlantrunk != NULL) {
 -      device_printf(dev,"Vlan in use, detach first\n");
 -      return (EBUSY);
 -  }
 +  IGB_CORE_LOCK(adapter);
 +  adapter->in_detach = 1;
 +  igb_stop(adapter);
 +  IGB_CORE_UNLOCK(adapter);
  
 -  ether_ifdetach(adapter->ifp);
 +  /* Unregister VLAN events */
 +  if (adapter->vlan_attach != NULL)
 +      EVENTHANDLER_DEREGISTER(vlan_config, adapter->vlan_attach);
 +  if (adapter->vlan_detach != NULL)
 +      EVENTHANDLER_DEREGISTER(vlan_unconfig, adapter->vlan_detach);
  
 -  if (adapter->led_dev != NULL)
 -      led_destroy(adapter->led_dev);
 +  callout_drain(&adapter->timer);
  
 +  if (ifp != NULL) {
 +      /* Make sure VLANS are not using driver */
 +      if (ifp->if_vlantrunk != NULL) {
 +          device_printf(dev,"Vlan in use, detach first\n");
 +          return (EBUSY);
 +      }
 +
 +      ether_ifdetach(ifp);
 +
  #ifdef DEVICE_POLLING
 -  if (ifp->if_capenable & IFCAP_POLLING)
 -      ether_poll_deregister(ifp);
 +  

Re: misc/162201: [patch] multicast forwarding cache hash always allocated with size 0, resulting in buffer overrun

2011-10-31 Thread Marko Zec
The following reply was made to PR kern/162201; it has been noted by GNATS.

From: Marko Zec z...@freebsd.org
To: stevan_marko...@mcafee.com
Cc: gleb...@freebsd.org,
 freebsd-gnats-sub...@freebsd.org,
 b...@freebsd.org,
 b...@freebsd.org
Subject: Re: misc/162201: [patch] multicast forwarding cache hash always 
allocated with size 0, resulting in buffer overrun
Date: Mon, 31 Oct 2011 21:13:27 +0100

 On Monday 31 October 2011 19:26:43 stevan_marko...@mcafee.com wrote:
  Hi,
 
  Gleb, no, I have not tried to eliminate VNET_SYSINITS and I do not think it
  can be done. My understanding is that VNET_SYSINIT initializes virtual
  network stack instance specific data. Eliminating it would prevent using
  multicast in multiple virtual network stacks.
 
 vnet_mroute_init() should be triggered after ip_mroute_modevent() is done, not 
 before, I think that's the whole wisdom here.  I'll throw a look at this...
 
 Marko
 
 
 
  Stevan
 
  -Original Message-
  From: Gleb Smirnoff [mailto:gleb...@freebsd.org]
  Sent: Monday, October 31, 2011 2:10 PM
  To: Markovic, Stevan
  Cc: freebsd-gnats-sub...@freebsd.org; z...@freebsd.org; b...@freebsd.org;
  b...@freebsd.org Subject: Re: misc/162201: [patch] multicast forwarding cache
  hash always allocated with size 0, resulting in buffer overrun
 
Hi,
 
  On Mon, Oct 31, 2011 at 04:22:00PM +, Stevan Markovic wrote:
  S Description:
  S Bug is observed as kernel panic shortly after stopping XORP
  (www.xorp.org) configured for PIM/SM routing. Debugging discovered that at
  the time of MALLOC for V_nexpire in ip_mroute.c::vnet_mroute_init() size
  variable mfchashsize always has value 0. S
  S Why: variable mfchashsize is initialized in module event handler which
  is executed with SI_ORDER_ANY  ordering tag which happens _after_ variable
  usage in MALLOC in VNET_SYSINIT with SI_ORDER_MIDDLE. S
  S Fix simply moves variable initialization before its usage in
  vnet_mroute_init. S
  S This bug is discovered and fixed in McAfee Inc development.
  S How-To-Repeat:
  S Hard to reproduce since system behavior after memory overwrite is
  unpredictable.  Multicast forwarding cache hash overrun always happens
  after: S a) configuring xorp to use PIM/SM
  S b) starting xorp_rtrmgr
  S c) stopping xorp_rtrmgr.
  S
  S Fix:
  S Fix simply moves mfchashsize variable initialization before its usage in
  vnet_mroute_init. S
  S Patch attached with submission follows:
  S
  S Index: ip_mroute.c
  S ===
  S RCS file: /projects/freebsd/src_cvsup/src/sys/netinet/ip_mroute.c,v
  S retrieving revision 1.161
  S diff -u -r1.161 ip_mroute.c
  S --- ip_mroute.c   22 Nov 2010 19:32:54 -  1.161
  S +++ ip_mroute.c   31 Oct 2011 15:54:53 -
  S @@ -2814,7 +2814,13 @@
  S  static void
  S  vnet_mroute_init(const void *unused __unused)
  S  {
  S -
  S + mfchashsize = MFCHASHSIZE;
  S + if (TUNABLE_ULONG_FETCH("net.inet.ip.mfchashsize", &mfchashsize) &&
  S + !powerof2(mfchashsize)) {
  S + printf("WARNING: %s not a power of 2; using default\n",
  S + "net.inet.ip.mfchashsize");
  S + mfchashsize = MFCHASHSIZE;
  S + }
  S   MALLOC(V_nexpire, u_char *, mfchashsize, M_MRTABLE, M_WAITOK|M_ZERO);
  S   bzero(V_bw_meter_timers, sizeof(V_bw_meter_timers));
  S   callout_init(&V_expire_upcalls_ch, CALLOUT_MPSAFE);
  S @@ -2855,13 +2861,6 @@
  S   MFC_LOCK_INIT();
  S   VIF_LOCK_INIT();
  S
  S - mfchashsize = MFCHASHSIZE;
  S - if (TUNABLE_ULONG_FETCH("net.inet.ip.mfchashsize", &mfchashsize) &&
  S - !powerof2(mfchashsize)) {
  S - printf("WARNING: %s not a power of 2; using default\n",
  S - "net.inet.ip.mfchashsize");
  S - mfchashsize = MFCHASHSIZE;
  S - }
  S
  S   pim_squelch_wholepkt = 0;
  S   TUNABLE_ULONG_FETCH("net.inet.pim.squelch_wholepkt",
 
  Have you tried to remove these VNET_SYSINITs at all and do all the
  initialization in the ip_mroute_modevent() itself? From first glance
  I see no reason for separate malloc() + callout_init()s.
 
  I am putting guys, who made and reviewed the commit, into Cc.
 
 


Undocumented netgraph `cmd' flags ?

2011-10-31 Thread Arnaud Lacombe
Hi Julian,

For my information, is it documented anywhere that bits 28[0] and 29[1]
of a Netgraph message's `cmd' shall not be used (the bits, not the
macros)?

Thanks,
 - Arnaud

[0]: NGM_READONLY
[1]: NGM_HASREPLY
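
To make the question concrete, here is a small sketch of what "not using the
bits" would mean for a node-private command number. The EX_* masks and the
ex_* helpers are hypothetical names made up for this illustration; they are
built only from the bit positions given above, and real code should rely on
the macros from the netgraph headers rather than on local copies.

```c
#include <assert.h>
#include <stdint.h>

/* Local illustrative masks built from the bit positions named above. */
#define EX_READONLY_BIT  (UINT32_C(1) << 28)
#define EX_HASREPLY_BIT  (UINT32_C(1) << 29)
#define EX_RESERVED_BITS (EX_READONLY_BIT | EX_HASREPLY_BIT)

/* Returns 1 if a command number stays clear of bits 28 and 29. */
static int ex_cmd_is_plain(uint32_t cmd)
{
    return (cmd & EX_RESERVED_BITS) == 0;
}

/* Returns the command number with both flag bits masked off. */
static uint32_t ex_cmd_number(uint32_t cmd)
{
    return cmd & ~EX_RESERVED_BITS;
}

/* Sanity check that the shifts produce the expected hexadecimal masks. */
static void ex_self_check(void)
{
    assert(EX_READONLY_BIT == UINT32_C(0x10000000));
    assert(EX_HASREPLY_BIT == UINT32_C(0x20000000));
}
```

A command defined as, say, 5 passes ex_cmd_is_plain(), while the same value
with either flag OR-ed in does not; ex_cmd_number() recovers 5 in both cases.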


[High Interrupt Count] Networking Difficulties

2011-10-31 Thread Paul A. Procacci
Gents,

I'm having quite an awful problem that I need a bit of help with.

I have an HP DL360 G3 ( 
http://h18000.www1.hp.com/products/quickspecs/11504_na/11504_na.HTML ) which 
acts as a NAT (via PF) for several (600+) class C's amongst 24+ machines 
sitting behind it.
It's running pfSense (FreeBSD 8.1-RELEASE-p4).

The important guts are:

2 x 2.8 GHz Cpus
2 BGE interfaces on a PCI-X bus.

During peak times this machine is only able to handle between 500Mbps - 600Mbps 
before running out of cpu capacity.  (300Mbps(ish) on the LAN, 300Mbps(ish) on 
the WAN) It's due to the high number of interrupts.
I was speaking with a networking engineer here and he mentioned that I should 
look at Interrupt Coalescing to increase throughput.
The only information I found online regarding this was a post from 2 years ago 
here: http://lists.freebsd.org/pipermail/freebsd-net/2009-June/07.html

The tunables mentioned in the above post aren't present in my system, so I 
imagine this never made it into the bge driver.  Assuming this to be the case, 
I started looking at DEVICE_POLLING as a solution.
I did try implementing device polling, but the results were worse than I 
expected.  netisr was using 100% of a single cpu while the other cpu remained 
mostly idle.
Not knowing exactly what netisr is, I reverted the changes.

This leads me to this list.  Given the scenario above, I'm nearly certain I 
need to use device polling instead of the standard interrupt driven setup.
The two sysctl's that I've come across thus far that I think are what I need 
are:

net.isr.maxthreads
kern.hz

I would assume setting net.isr.maxthreads to 2 given my dual core machine is 
advisable, but I'm not 100% sure.
What are the caveats in setting this higher?  Given the output of `sysctl -d 
net.isr.maxthreads` I would expect anything higher than the number of cores to 
be detrimental.  Is this correct?

kern.hz I'm more unsure of.  I understand what the sysctl is, but I'm not sure 
how to come up with a reasonable number.
Generally speaking, and in your experience, would a setting of 2000 achieve 
close to the theoretical maximum of the cards?  Is there an upper limit that I 
should be worried about?
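
For a rough feel of the numbers behind that question, the arithmetic can be
written down as a couple of C helpers. This is a back-of-the-envelope sketch,
not a statement about what the bge driver or the polling code can sustain:
the function names are made up, the 20 bytes of per-frame wire overhead
(preamble, start-of-frame delimiter, inter-frame gap) are standard Ethernet
figures, and the result is for one direction of one gigabit link.

```c
#include <assert.h>

/* Per-frame wire overhead: 7 preamble + 1 SFD + 12 inter-frame gap bytes. */
#define WIRE_OVERHEAD_BYTES 20UL

/* Frames per second at a line rate, for a frame size that includes the FCS. */
static unsigned long frames_per_second(unsigned long bits_per_second,
                                       unsigned long frame_bytes)
{
    return bits_per_second / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8UL);
}

/* Frames that pile up between two polls when polling hz times a second. */
static unsigned long frames_per_tick(unsigned long fps, unsigned long hz)
{
    assert(hz > 0);
    return (fps + hz - 1) / hz;   /* rounded up */
}
```

At 1 Gbit/s, full-size 1518-byte frames arrive at about 81,274 per second and
minimum-size 64-byte frames at about 1,488,095 per second. Polling 2000 times
a second, that is roughly 41 and 745 frames per poll respectively, so the
answer depends heavily on the packet size mix and on how many packets each
poll is allowed to process.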

Random Questions:
- Is device polling really the answer?  Am I missing something in the bge 
driver that I've overlooked?
- What tunables directly affect processing high volumes of packets?

Network Interfaces:
##
bge0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:0b:cd:ca:1d:1a
        inet 209.18.70.211 netmask 0xff00 broadcast 209.18.70.255
        inet6 fe80::20b:cdff:feca:1d1a%bge0 prefixlen 64 scopeid 0x1
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
bge1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:0b:cd:ca:1a:74
        inet 172.16.0.3 netmask 0xfffc broadcast 172.19.255.255
        inet6 fe80::20b:cdff:feca:1a74%bge1 prefixlen 64 scopeid 0x2
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
##

I appreciate the help in advance.

Thanks,
Paul



This message may contain confidential or privileged information. If you are not 
the intended recipient, please advise us immediately and delete this message. 
See http://www.datapipe.com/about-us-legal-email-disclaimer.htm for further 
information on confidentiality and the risks of non-secure electronic 
communication. If you cannot access these links, please notify us by reply 
message and we will send the contents to you.


Re: bin/155365: [patch] routed(8): if.c in routed fails to compile if time_t and long are different sizes

2011-10-31 Thread kevlo
Synopsis: [patch] routed(8): if.c in routed fails to compile if time_t and long 
are different sizes

State-Changed-From-To: open-closed
State-Changed-By: kevlo
State-Changed-When: Tue Nov 1 03:25:20 UTC 2011
State-Changed-Why: 
Fixed in r204405

http://www.freebsd.org/cgi/query-pr.cgi?pr=155365


Re: [High Interrupt Count] Networking Difficulties

2011-10-31 Thread Paul A. Procacci
On Mon, Oct 31, 2011 at 08:57:46PM -0500, Paul A. Procacci wrote:
 Gents,

 I'm having quite an awful problem that I need a bit of help with.

 I have an HP DL360 G3 ( 
 http://h18000.www1.hp.com/products/quickspecs/11504_na/11504_na.HTML ) which 
 acts as a NAT (via PF) for several (600+) class C's amongst 24+ machines 
 sitting behind it.
 It's running pfSense (FreeBSD 8.1-RELEASE-p4).

 The important guts are:

 2 x 2.8 GHz Cpus
 2 BGE interfaces on a PCI-X bus.

 During peak times this machine is only able to handle between 500Mbps - 
 600Mbps before running out of cpu capacity.  (300Mbps(ish) on the LAN, 
 300Mbps(ish) on the WAN) It's due to the high number of interrupts.
 I was speaking with a networking engineer here and he mentioned that I should 
 look at Interrupt Coalescing to increase throughput.
 The only information I found online regarding this was a post from 2 years 
 ago here: http://lists.freebsd.org/pipermail/freebsd-net/2009-June/07.html

 The tunables mentioned in the above post aren't present in my system, so I 
 imagine this never made it into the bge driver.  Assuming this to be the 
 case, I started looking at DEVICE_POLLING as a solution.
 I did try implementing device polling, but the results were worse than I 
 expected.  netisr was using 100% of a single cpu while the other cpu remained 
 mostly idle.
 Not knowing exactly what netisr is, I reverted the changes.

 This leads me to this list.  Given the scenario above, I'm nearly certain I 
 need to use device polling instead of the standard interrupt driven setup.
 The two sysctl's that I've come across thus far that I think are what I need 
 are:

 net.isr.maxthreads
 kern.hz

 I would assume setting net.isr.maxthreads to 2 given my dual core machine is 
 advisable, but I'm not 100% sure.
 What are the caveats in setting this higher?  Given the output of `sysctl -d 
 net.isr.maxthreads` I would expect anything higher than the number of cores 
 to be detrimental.  Is this correct?

 kern.hz I'm more unsure of.  I understand what the sysctl is, but I'm not 
 sure how to come up with a reasonable number.
 Generally speaking, and in your experience, would a setting of 2000 achieve 
 close to the theoretical maximum of the cards?  Is there an upper limit that 
 I should be worried about?

 Random Questions:
 - Is device polling really the answer?  Am I missing something in the bge 
 driver that I've overlooked?
 - What tunables directly affect processing high volumes of packets?


snip

After some more coffee, and source code reading, I've now learned that having 
device polling enabled forces netisr to limit the number of threads it creates 
to 1.
This kinda defeats the purpose of enabling device polling. This makes me 
believe that device polling isn't going to be a great solution after all.

A snippet from dmesg:
snip
bge0: Compaq NC7781 Gigabit Server Adapter, ASIC rev. 0x001002 mem 
0xf7ef-0xf7ef irq 30 at device 2.0 on pci1
brgphy0: BCM5703 10/100/1000baseTX PHY PHY 1 on miibus0
bge1: Compaq NC7781 Gigabit Server Adapter, ASIC rev. 0x001002 mem 
0xf7ff-0xf7ff irq 29 at device 2.0 on pci4
brgphy1: BCM5703 10/100/1000baseTX PHY PHY 1 on miibus1
snip

Any help/advice is appreciated, and sorry for following up to myself with this 
information.

~Paul





Re: Undocumented netgraph `cmd' flags ?

2011-10-31 Thread Julian Elischer

On 10/31/11 4:16 PM, Arnaud Lacombe wrote:

Hi Julian,

For my information, is it documented anywhere that bits 28[0] and 29[1]
of a Netgraph message's `cmd' shall not be used (the bits, not the
macros)?

Thanks,
  - Arnaud

[0]: NGM_READONLY
[1]: NGM_HASREPLY


Not really sure what you are asking.

NGM_READONLY allows the base system to use a reader lock and not block other
traffic while the message is being processed.
NGM_HASREPLY is not used that I can see in the kernel. It may be a historical
artifact or maybe only used in the library as a hint.

It has been so long since I was involved with netgraph (over a decade) that I
really don't remember.

