from:"Pyun YongHyeon"

Re: bge0 watchdog timeout -- resetting on 8.2-PREREL never recovers

2011-02-19 Thread Pyun YongHyeon

On Sat, Feb 19, 2011 at 03:59:57PM -, Steven Hartland wrote:
> This may be totally unrelated to bge, investigating a potential failing
> stick of ram in the machine in question so until we've ruled this out as
> the cause don't want to waste anyone's time.
> 
> I did however notice the logic between the two fixes for DMA on 5704's on
> PCIX in svn differ so wondering which ones correct:-
> http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=216085&r2=216970
> http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=217225&r2=217226
> 
> r216970 results in:
> 1, 0, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,
> where as r217226 results in:
> 1, BGE_DMA_BNDRY, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL,
> 

I think both would behave the same in your case (BCM5704 PCI-X).
However, r217226 would be the better one to address the issue.
Actually, I didn't like the workaround, but there was not much time
left to fix it for the upcoming 8.2/7.4.
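
A minimal sketch of why the two argument lists should be equivalent
here, assuming BGE_DMA_BNDRY is a 4GB boundary and that a boundary of 0
means "no restriction". With the low address limit set to
BUS_SPACE_MAXADDR_32BIT, every buffer ends below 4GB, so a 4GB boundary
can never be crossed. The names SKETCH_*, crosses_boundary() and
fits_below() are made up for illustration; this is not bus_dma(9) code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the values discussed above (assumptions). */
#define SKETCH_MAXADDR_32BIT 0xFFFFFFFFULL   /* highest address the tag allows */
#define SKETCH_DMA_BNDRY     0x100000000ULL  /* assumed 4GB boundary */

/*
 * Return true if the buffer [addr, addr + len) crosses a multiple of
 * 'boundary'. A boundary of 0 means "no boundary restriction".
 * 'boundary' must be 0 or a power of two, and len must be non-zero.
 */
static bool
crosses_boundary(uint64_t addr, uint64_t len, uint64_t boundary)
{
	if (boundary == 0)
		return (false);
	uint64_t last = addr + len - 1;	/* last byte of the buffer */
	return ((addr & ~(boundary - 1)) != (last & ~(boundary - 1)));
}

/* Return true if the whole buffer lies at or below the address limit. */
static bool
fits_below(uint64_t addr, uint64_t len, uint64_t maxaddr)
{
	return (len != 0 && addr <= maxaddr && len - 1 <= maxaddr - addr);
}
```

Any buffer for which fits_below(..., SKETCH_MAXADDR_32BIT) holds cannot
make crosses_boundary(..., SKETCH_DMA_BNDRY) true, which is the sense in
which the boundary argument makes no difference for this tag.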
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: bge wedging 8.2-RC1

2011-02-09 Thread Pyun YongHyeon

On Wed, Feb 09, 2011 at 06:28:31PM -0600, Peter Lai wrote:
> >
> > Let me know attached patch makes any difference on your box.
> > The patch contains some other changes but that wouldn't affect your
> > BCM5761 controller. If you see "CLKREQ enabled" message after
> > applying the patch also let me know that too.
> >
> 
> Can I apply this to 8.2-RC1 or should I update it to -RC3?

I guess you can apply it to 8.2-RC1 without a problem.

Re: bge wedging 8.2-RC1

2011-02-09 Thread Pyun YongHyeon

On Mon, Feb 07, 2011 at 08:27:43PM -0600, Peter Lai wrote:
> On Feb 7, 2011 7:38 PM, "Pyun YongHyeon"  wrote:
> >
> > On Mon, Feb 07, 2011 at 06:09:16PM -0600, Peter Lai wrote:
> > > Hello
> > >
> > > I've got a new Dell Precision workstation here with a BCM5761 on intel
> > > mobo for westmere xeons that is wedging with interrupt storm and will
> > > lockup the system randomly. I have turned HTT and auto powermanagement
> > > off in bios (system cannot sleep), lowest cpu acpi state is C1.
> > >
> > > Here is dmesg:
> > > bge0:  > > 0x5761100> mem 0xf3be-0xf3be,0xf3bf-0xf3bf irq 17 at
> > > device 0.0 on pci6
> > > bge0: CHIP ID 0x05761100; ASIC REV 0x5761; CHIP REV 0x57611; PCI-E
> > > miibus0:  on bge0
> > > brgphy0:  PHY 1 on miibus0
> > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > >
> > > Here is pciconf -lv:
> > > bge0@pci0:6:0:0:  class=0x02 card=0x026d1028 chip=0x168114e4
> > > rev=0x10 hdr=0x00
> > > vendor = 'Broadcom Corporation'
> > > device = 'Broadcom 57XX Gigabit Integrated Controller
>  (BCM5761)'
> > > class  = network
> > > subclass   = ethernet
> > >
> > > here is the setup in rc.conf:
> > >
> > > ifconfig_bge0="polling -tso -vlanhwtso -vlanhwtag -vlanmtu inet
> > > 192.168.123.124 netmask 255.255.255.0"
> > >
> > > I have the card plugged into a dlink DSS8 100mbps switch with one
> > > other 100mbps device on it (rich man's crossover cable).
> > >
> > > Before turning off TSO4 and VLAN tagging (because I don't use them),
> > > the card would do several things:
> > > 1. 1 out of 3 reboots: Fail to bring interface up. ifconfig would hang
> > > and systat/vmstat showed 800+ interrupts per second on IRQ256
> >
> > This is strange. bge(4) does not use MSI if you build bge(4) with
> > DEVICE_POLLING so seeing IRQ256 interrupts looks odd to me.
> > Are you sure bge(4) is using IRQ256?
> 
> This is with GENERIC. I will rebuild with POLLING and try...
> 

Let me know whether the attached patch makes any difference on your
box. The patch contains some other changes, but they shouldn't affect
your BCM5761 controller. If you see a "CLKREQ enabled" message after
applying the patch, let me know that too.

> >
> > > 2. After a few hours lock up the system, requiring hard reboot
> > >
> > > After disabling TSO4 and VLAN stuff:
> > > bge0: flags=8802 metric 0 mtu 1500
> > >   options=80083
> > >   media: Ethernet autoselect (100baseTX
> > > )
> > >
> > > Everything seemed fine for about two weeks and then suddenly started
> > > acting up again, locked up, after hard reboot, soft reboot, link will
> > > not come up and I see interrupt storm again
> > >
> >
> > If you don't use DEVICE_POLLING, rebuild bge(4) with
> > DEVICE_POLLING. For most cases, you don't need to enable polling on
> > intelligent controllers like bge(4).
> >
> > I also have BCM5761 PCIe controller which shows no such issues. I
> > know there is an edge case(send BD corruption) for BCM5761/BCM5784/
> > BCM57780 which needs to be investigated. I'm not sure you're seeing
> > that edge case though.
> >
> > > I am close to buying an intel card to replace the bcm, but then I
> > > noticed that the main intel desktop PCI-E card is 82574L-based and
> > > people are having em driver wedging on that too. So now I have broken
> > > ethernet on this box; my primary link is atheros 5212 pci card and I
> > > may be out of pci slots (or else I might try a pci intel card).
Index: sys/dev/bge/if_bgereg.h
===
--- sys/dev/bge/if_bgereg.h	(revision 218409)
+++ sys/dev/bge/if_bgereg.h	(working copy)
@@ -2004,6 +2004,11 @@
 #define	BGE_EECTL_DATAOUT		0x0010
 #define	BGE_EECTL_DATAIN		0x0020
 
+/* PCIe Link control register */
+#define	BGE_PCIE_LNKCTL			0x7D54
+#define	BGE_PCIE_LNKCTL_L1_PLL_PD_ENB	0x0008
+#define	BGE_PCIE_LNKCTL_L1_PLL_PD_DIS	0x0080
+
 /* MDI (MII/GMII) access register */
 #define	BGE_MDI_DATA			0x0001
 #define	BGE_MDI_DIR			0x0002
@@ -2769,6 +2774,7 @@
 #define	BGE_FLAG_4G_BNDRY_BUG	0x0200
 #define	BGE_FLAG_RX_ALIGNBUG	0x0400
 #define	BGE_FLAG_SHORT_DMA_BUG	0x0800
+#define	BGE_FLAG_CLKREQ_BUG	0x10

Re: Fwd: igb driver tx hangs when out of mbuf clusters

2011-02-07 Thread Pyun YongHyeon

On Mon, Feb 07, 2011 at 09:21:45PM -0500, Karim Fodil-Lemelin wrote:
> 2011/2/7 Pyun YongHyeon 
> 
> > On Mon, Feb 07, 2011 at 05:33:47PM -0500, Karim Fodil-Lemelin wrote:
> > > Subject: Re: igb driver tx hangs when out of mbuf clusters
> > >
> > > > To: Lev Serebryakov 
> > > > Cc: freebsd-net@freebsd.org
> > > >
> > > >
> > > > 2011/2/7 Lev Serebryakov 
> > > >
> > > > Hello, Karim.
> > > > >> You wrote on 7 February 2011, 19:58:04:
> > > >>
> > > >>
> > > >> > The issue is with the igb driver from 7.4 RC3 r218406. If the driver
> > > >> runs
> > > >> > out of mbuf clusters it simply stops receiving even after the
> > clusters
> > > >> have
> > > >> > been freed.
> > > >>   It looks like my problems with em0 (see thread "em0 hangs without
> > > >>  any messages like "Watchdog timeout", only down/up reset it.")...
> > > >>  Codebase for em and igb is somewhat common...
> > > >>
> > > >> --
> > > >> // Black Lion AKA Lev Serebryakov 
> > > >>
> > > >> I agree.
> > > >
> > > > Do you get missed packets in mac_stats (sysctl dev.em | grep missed)?
> > > >
> > > > I might not have mentioned but I can also 'fix' the problem by doing
> > > > ifconfig igb0 down/up.
> > > >
> > > > I will try using POLLING to 'automatize' the reset as you mentioned in
> > your
> > > > thread.
> > > >
> > > > Karim.
> > > >
> > > >
> > > Follow up on tests with POLLING: The problem is still occurring although
> > it
> > > takes more time ... Outputs of sysctl dev.igb0 and netstat -m will
> > follow:
> > >
> > > 9219/99426/108645 mbufs in use (current/cache/total)
> > > 9217/90783/10/10 mbuf clusters in use (current/cache/total/max)
> >
> > Do you see network processes are stuck in keglim state? If you see
> > that I think that's not trivial to solve. You wouldn't even kill
> > that process if it is under keglim state unless some more mbuf
> > clusters are freed from other places.
> >
> 
> No keglim state, here is a snapshot of top -SH while the problem is
> happening:
> 
>    12 root  171 ki31   0K   8K CPU5   5  19:27 100.00% idle: cpu5
>    10 root  171 ki31   0K   8K CPU7   7  19:26 100.00% idle: cpu7
>    14 root  171 ki31   0K   8K CPU3   3  19:25 100.00% idle: cpu3
>    11 root  171 ki31   0K   8K CPU6   6  19:25 100.00% idle: cpu6
>    13 root  171 ki31   0K   8K CPU4   4  19:24 100.00% idle: cpu4
>    15 root  171 ki31   0K   8K CPU2   2  19:22 100.00% idle: cpu2
>    16 root  171 ki31   0K   8K CPU1   1  19:18 100.00% idle: cpu1
>    17 root  171 ki31   0K   8K RUN    0  19:12 100.00% idle: cpu0
>    18 root  -32    -   0K   8K WAIT   6   0:04   0.10% swi4: clock s
>    20 root  -44    -   0K   8K WAIT   4   0:08   0.00% swi1: net
>    29 root  -68    -   0K   8K -      0   0:02   0.00% igb0 que
>    35 root  -68    -   0K   8K -      2   0:02   0.00% em1 taskq
>    28 root  -68    -   0K   8K WAIT   5   0:01   0.00% irq256: igb0
> 
> keep in mind that num_queues has been forced to 1.
> 
> 
> >
> > I think both igb(4) and em(4) pass received frame to upper stack
> > before allocating new RX buffer. If driver fails to allocate new RX
> > buffer driver will try to refill RX buffers in next run. Under
> > extreme resource shortage case, this situation can produce no more
> > RX buffers in RX descriptor ring and this will take the box out of
> > network. Other drivers avoid that situation by allocating new RX
> > buffer before passing received frame to upper stack. If RX buffer
> > allocation fails driver will just reuse old RX buffer without
> > passing received frame to upper stack. That does not completely
> > solve the keglim issue though. I think you should have enough mbuf
> > clusters to avoid keglim.
> >
> > However the output above indicates you have enough free mbuf
> > clusters. So I guess igb(4) encountered zero available RX buffer
> > situation in past but failed to refill the RX buffer again. I guess
> > driver may be able to periodically check available RX buffers.
> > Jack may have better

Re: bge wedging 8.2-RC1

2011-02-07 Thread Pyun YongHyeon

On Mon, Feb 07, 2011 at 06:09:16PM -0600, Peter Lai wrote:
> Hello
> 
> I've got a new Dell Precision workstation here with a BCM5761 on intel
> mobo for westmere xeons that is wedging with interrupt storm and will
> lockup the system randomly. I have turned HTT and auto powermanagement
> off in bios (system cannot sleep), lowest cpu acpi state is C1.
> 
> Here is dmesg:
> bge0:  0x5761100> mem 0xf3be-0xf3be,0xf3bf-0xf3bf irq 17 at
> device 0.0 on pci6
> bge0: CHIP ID 0x05761100; ASIC REV 0x5761; CHIP REV 0x57611; PCI-E
> miibus0:  on bge0
> brgphy0:  PHY 1 on miibus0
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> 
> Here is pciconf -lv:
> bge0@pci0:6:0:0:  class=0x02 card=0x026d1028 chip=0x168114e4
> rev=0x10 hdr=0x00
> vendor = 'Broadcom Corporation'
> device = 'Broadcom 57XX Gigabit Integrated Controller  (BCM5761)'
> class  = network
> subclass   = ethernet
> 
> here is the setup in rc.conf:
> 
> ifconfig_bge0="polling -tso -vlanhwtso -vlanhwtag -vlanmtu inet
> 192.168.123.124 netmask 255.255.255.0"
> 
> I have the card plugged into a dlink DSS8 100mbps switch with one
> other 100mbps device on it (rich man's crossover cable).
> 
> Before turning off TSO4 and VLAN tagging (because I don't use them),
> the card would do several things:
> 1. 1 out of 3 reboots: Fail to bring interface up. ifconfig would hang
> and systat/vmstat showed 800+ interrupts per second on IRQ256

This is strange. bge(4) does not use MSI if you build bge(4) with
DEVICE_POLLING so seeing IRQ256 interrupts looks odd to me.
Are you sure bge(4) is using IRQ256?

> 2. After a few hours lock up the system, requiring hard reboot
> 
> After disabling TSO4 and VLAN stuff:
> bge0: flags=8802 metric 0 mtu 1500
>   options=80083
>   media: Ethernet autoselect (100baseTX
> )
> 
> Everything seemed fine for about two weeks and then suddenly started
> acting up again, locked up, after hard reboot, soft reboot, link will
> not come up and I see interrupt storm again
> 

If you don't use DEVICE_POLLING, rebuild bge(4) with
DEVICE_POLLING. For most cases, you don't need to enable polling on
intelligent controllers like bge(4).

I also have a BCM5761 PCIe controller, which shows no such issues. I
know there is an edge case (send BD corruption) for BCM5761/BCM5784/
BCM57780 which needs to be investigated. I'm not sure you're seeing
that edge case, though.

> I am close to buying an intel card to replace the bcm, but then I
> noticed that the main intel desktop PCI-E card is 82574L-based and
> people are having em driver wedging on that too. So now I have broken
> ethernet on this box; my primary link is atheros 5212 pci card and I
> may be out of pci slots (or else I might try a pci intel card).

Re: Fwd: igb driver tx hangs when out of mbuf clusters

2011-02-07 Thread Pyun YongHyeon

On Mon, Feb 07, 2011 at 05:33:47PM -0500, Karim Fodil-Lemelin wrote:
> Subject: Re: igb driver tx hangs when out of mbuf clusters
> 
> > To: Lev Serebryakov 
> > Cc: freebsd-net@freebsd.org
> >
> >
> > 2011/2/7 Lev Serebryakov 
> >
> > Hello, Karim.
> >> You wrote on 7 February 2011, 19:58:04:
> >>
> >>
> >> > The issue is with the igb driver from 7.4 RC3 r218406. If the driver
> >> runs
> >> > out of mbuf clusters it simply stops receiving even after the clusters
> >> have
> >> > been freed.
> >>   It looks like my problems with em0 (see thread "em0 hangs without
> >>  any messages like "Watchdog timeout", only down/up reset it.")...
> >>  Codebase for em and igb is somewhat common...
> >>
> >> --
> >> // Black Lion AKA Lev Serebryakov 
> >>
> >> I agree.
> >
> > Do you get missed packets in mac_stats (sysctl dev.em | grep missed)?
> >
> > I might not have mentioned but I can also 'fix' the problem by doing
> > ifconfig igb0 down/up.
> >
> > I will try using POLLING to 'automatize' the reset as you mentioned in your
> > thread.
> >
> > Karim.
> >
> >
> Follow up on tests with POLLING: The problem is still occurring although it
> takes more time ... Outputs of sysctl dev.igb0 and netstat -m will follow:
> 
> 9219/99426/108645 mbufs in use (current/cache/total)
> 9217/90783/10/10 mbuf clusters in use (current/cache/total/max)

Do you see network processes stuck in the keglim state? If so, I
think that's not trivial to solve. You wouldn't even be able to kill
a process in the keglim state unless some more mbuf clusters are
freed from other places.

I think both igb(4) and em(4) pass a received frame to the upper
stack before allocating a new RX buffer. If the driver fails to
allocate a new RX buffer, it will try to refill RX buffers in the
next run. Under extreme resource shortage, this can leave no RX
buffers in the RX descriptor ring, and that will take the box off the
network. Other drivers avoid that situation by allocating a new RX
buffer before passing the received frame to the upper stack. If RX
buffer allocation fails, the driver just reuses the old RX buffer
without passing the received frame to the upper stack. That does not
completely solve the keglim issue, though. I think you should have
enough mbuf clusters to avoid keglim.

However, the output above indicates you have enough free mbuf
clusters. So I guess igb(4) hit the zero-available-RX-buffer
situation in the past but failed to refill the RX buffers afterwards.
I guess the driver may be able to periodically check available RX
buffers. Jack may have a better idea whether this was the case. (CCed)
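
The difference between the two RX strategies can be sketched with a toy
ring, under the assumption that an empty slot is one the hardware can
no longer receive into. Everything here (struct rx_ring, the rx_*
functions, the failing allocator) is hypothetical and only models the
ordering of "allocate" versus "pass up"; it is not taken from igb(4),
em(4) or any other driver.

```c
#include <stddef.h>

#define RX_RING_SIZE 4

/* Toy model: a slot holds a buffer pointer; NULL means the slot is empty. */
struct rx_ring {
	void	*buf[RX_RING_SIZE];
	int	 delivered;	/* frames handed to the upper stack */
	int	 dropped;	/* frames dropped to keep the ring full */
};

/* Hypothetical allocator hook so a shortage can be simulated. */
typedef void *(*rx_alloc_fn)(void);

/*
 * Strategy A: hand the frame up first, then try to refill.
 * On allocation failure the slot stays empty.
 */
static void
rx_deliver_then_refill(struct rx_ring *r, int slot, rx_alloc_fn alloc)
{
	r->delivered++;			/* old buffer now belongs to the stack */
	r->buf[slot] = alloc();		/* may be NULL under shortage */
}

/*
 * Strategy B: allocate the replacement first. On failure, drop the frame
 * and recycle the old buffer so the slot is never left empty.
 */
static void
rx_refill_then_deliver(struct rx_ring *r, int slot, rx_alloc_fn alloc)
{
	void *newbuf = alloc();

	if (newbuf == NULL) {
		r->dropped++;		/* keep r->buf[slot] as-is */
		return;
	}
	r->delivered++;
	r->buf[slot] = newbuf;
}

/* Count slots that still have a buffer the hardware could fill. */
static int
rx_usable_slots(const struct rx_ring *r)
{
	int n = 0;

	for (int i = 0; i < RX_RING_SIZE; i++)
		if (r->buf[i] != NULL)
			n++;
	return (n);
}

/* Allocator that always fails, modelling mbuf cluster exhaustion. */
static void *
rx_alloc_fail(void)
{
	return (NULL);
}
```

Running every slot through strategy A with the failing allocator leaves
the ring with no usable slots, while strategy B drops the frames but
keeps the ring full, so reception can resume once memory is available.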

> 0/640 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/12800/12800/12800 4k (page size) jumbo clusters in use
> (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 20738K/257622K/278361K bytes allocated to network (current/cache/total)
> 0/291/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/5/6656 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
> 
> dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.0.7
> dev.igb.0.%driver: igb
> dev.igb.0.%location: slot=0 function=0
> dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10a7 subvendor=0x8086
> subdevice=0x class=0x02
> dev.igb.0.%parent: pci7
> dev.igb.0.nvm: -1
> dev.igb.0.flow_control: 3
> dev.igb.0.enable_aim: 1
> dev.igb.0.rx_processing_limit: 100
> dev.igb.0.link_irq: 4
> dev.igb.0.dropped: 0
> dev.igb.0.tx_dma_fail: 0
> dev.igb.0.rx_overruns: 464
> dev.igb.0.watchdog_timeouts: 0
> dev.igb.0.device_control: 1490027073
> dev.igb.0.rx_control: 67141658
> dev.igb.0.interrupt_mask: 0
> dev.igb.0.extended_int_mask: 0
> dev.igb.0.tx_buf_alloc: 14
> dev.igb.0.rx_buf_alloc: 34
> dev.igb.0.fc_high_water: 29488
> dev.igb.0.fc_low_water: 29480
> dev.igb.0.queue0.interrupt_rate: 11
> dev.igb.0.queue0.txd_head: 877
> dev.igb.0.queue0.txd_tail: 877
> dev.igb.0.queue0.no_desc_avail: 0
> dev.igb.0.queue0.tx_packets: 92013
> dev.igb.0.queue0.rxd_head: 570
> dev.igb.0.queue0.rxd_tail: 570
> dev.igb.0.queue0.rx_packets: 163386
> dev.igb.0.queue0.rx_bytes: 240260310
> dev.igb.0.queue0.lro_queued: 0
> dev.igb.0.queue0.lro_flushed: 0
> dev.igb.0.mac_stats.excess_coll: 0
> dev.igb.0.mac_stats.single_coll: 0
> dev.igb.0.mac_stats.multiple_coll: 0
> dev.igb.0.mac_stats.late_coll: 0
> dev.igb.0.mac_stats.collision_count: 0
> dev.igb.0.mac_stats.symbol_errors: 0
> dev.igb.0.mac_stats.sequence_errors: 0
> dev.igb.0.mac_stats.defer_count: 0
> dev.igb.0.mac_stats.missed_packets: 3104
> dev.igb.0.mac_stats.recv_no_buff: 4016
> dev.igb.0.mac_stats.recv_undersize: 0
> dev.igb.0.mac_stats.recv_fragmented: 0
> dev.igb.0.mac_stats.recv_oversize: 0
> dev.igb.0.mac_stats.recv_jabber: 0
> dev.igb.0.mac_stats.recv_errs: 0

Re: bogus 0 len IP packet, was: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.

2011-02-06 Thread Pyun YongHyeon

On Sun, Feb 06, 2011 at 11:54:49PM +0100, Ronald Klop wrote:
> On Sat, 22 Jan 2011 00:01:47 +0100, Ronald Klop wrote:
> 
> >On Tue, 18 Jan 2011 09:38:04 +0100,  wrote:
> >
> >>>> So, does anyone have an idea why the IP length field would be set to 0
> >>>> for these TCP/IP packets?
> >>>>
> >>>> Here's some info from Ronald w.r.t. his hardware. (All I can think of is
> >>>> that he could try disabling TSO, etc?)
> >>>>
> >>>> Thanks in advance for any help with this, rick
> >>>>
> >>>
> >>>It seems that issue came from TSO. Driver will set ip_len and
> >>>ip_sum field to 0 before passing the TCP segment to controller.
> >>>The failed length were 4446, 5858, 3034 and 4310 and the total
> >>>number of such frames are more than 35k within 90 seconds. Since
> >>>failed length 4310 is continuously repeated I guess there is edge
> >>>case where em(4) didn't free failed TCP segment for TSO.
> >>>I remember there was commit to HEAD(r217295) which could be related
> >>>with this issue.
> >>
> >>I'm seeing the same problem with Broadcom NetXtreme (bce) cards:
> >>
> >>bce0@pci0:3:0:0:class=0x02 card=0x03421014 chip=0x164c14e4  
> >>rev=0x12 hdr=0x00
> >>vendor = 'Broadcom Corporation'
> >>device = 'Broadcom NetXtreme II Gigabit Ethernet Adapter  
> >>(BCM5708)'
> >>class  = network
> >>subclass   = ethernet
> >>
> >>This is with 8.2-PRERELEASE. Turning off TSO (ifconfig bce0 -tso)
> >>removes the problem.
> >>
> >>Steinar Haug, Nethelp consulting, sth...@nethelp.no
> >
> >I tried -tso and -txcsum in various combinations, but it didn't solve  
> >the problem. I wil look for another brand of network card to try. But  
> >this has to wait till monday when I'm at the office again.
> 
> I also used another network card (rl0) and it has the same problem with  
> NFS. I'm going to change some network cables to see if that helps. I have  
> some hints that there might be something wrong with that.
> 

Hmm, given that rl(4) also shows the issue, it seems the issue could
be in the TCP/IP stack, not on the driver side. rl(4) is a dumb
device, so the network stack has to do segmentation and checksum
computation itself. I highly doubt the issue comes from a faulty
cable, since other users have also reported the same issue.
Unfortunately, I have no clue yet and I was not able to reproduce it
on my box. My vague guess is that some code in the kernel changed
ip_len to 0 in the middle of transmission. Rick's captured traffic
looks normal except for the 0 ip_len, given that the controller is
computing the checksum on the fly. If the mbuf chain were corrupted
(e.g. m_len == 0), the driver would have failed to send those frames.
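
For a device without checksum offload, the header checksum has to be
computed in software. The sketch below shows the standard Internet
checksum (RFC 1071) over an IPv4 header. The function name inet_cksum()
is made up here and this is not the kernel's in_cksum() implementation,
only an illustration of the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the 16-bit one's-complement Internet checksum (RFC 1071) over
 * 'len' bytes. For an IPv4 header, the checksum field must be zero in
 * the input; the return value is what goes into that field.
 */
static uint16_t
inet_cksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;

	/* Add up big-endian 16-bit words. */
	while (len > 1) {
		sum += ((uint32_t)data[0] << 8) | data[1];
		data += 2;
		len -= 2;
	}
	/* Pad a trailing odd byte with zero. */
	if (len == 1)
		sum += (uint32_t)data[0] << 8;
	/* Fold carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return ((uint16_t)~sum);
}
```

Running the same function over a header that already carries a correct
checksum returns 0, which is how a receiver validates it.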

> Ronald.

Re: Problem with re0

2011-01-31 Thread Pyun YongHyeon

On Mon, Jan 31, 2011 at 02:15:09PM +0200, Zeus V Panchenko wrote:
> Pyun YongHyeon (pyu...@gmail.com) [11.01.31 04:08] wrote:
> > > The RTL8168/8111D sample board I have does not show this kind of
> > > issue. This happens only when established link is 1000baseT, right?
> > > I slightly changed PHY's link detection code so would you try that
> > > patch at the following URL?
> > > http://people.freebsd.org/~yongari/re/rgephy.link.patch3
> > 
> > Previous one had a bug, please update one.
> > http://people.freebsd.org/~yongari/re/rgephy.link.patch4
> 
> no change :(
> interface continues to flap 
> 

Then I have no idea. Do other OSes work with your hardware without
issues? As a last resort, could you try the vendor's FreeBSD driver?
The vendor's driver applies a bunch of magic DSP fixups which re(4)
does not have. I don't know whether it makes a difference, but it
would be worth a try. Note that the vendor's driver treats your
controller like an old 8139, so it disables all offload features,
and it does not work on non-x86 architectures.

Re: Problem with re0

2011-01-30 Thread Pyun YongHyeon

On Sun, Jan 30, 2011 at 05:20:32PM -0800, Pyun YongHyeon wrote:
> On Sun, Jan 30, 2011 at 08:40:48AM +0200, Zeus V Panchenko wrote:
> > another detail for this nic
> > 
> > dmidecode
> > Base Board Information
> > Manufacturer: ASUSTeK Computer INC.
> > Product Name: AT5NM10-I
> > Version: Rev x.0x
> > Serial Number: MT7006K15200322
> > 
> > uname -a
> > FreeBSD 8.2-PRERELEASE amd64 
> > 
> > system was cvsup-ed 2011.01.20
> > 
> > if_re.c,v 1.160.2.17 2011/01/15 00:32:15 yongari
> > 
> > dmesg
> > rgephy0:  PHY 1 on miibus0
> > rgephy0:  10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 
> > 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
> > 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, 
> > auto-flow
> > re0: Ethernet address: 20:cf:30:89:5e:95
> > re0: [FILTER]
> > 
> > pciconf -lv
> > re0@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03 
> > hdr=0x00
> > vendor = 'Realtek Semiconductor'
> > device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> > class  = network
> > subclass   = ethernet
> > 
> > 
> > while connected directly NIC <-> NIC they flaps too
> > 
> > so, the issue with switch related causes can be excluded i believe
> > 
> 
> The RTL8168/8111D sample board I have does not show this kind of
> issue. This happens only when established link is 1000baseT, right?
> I slightly changed PHY's link detection code so would you try that
> patch at the following URL?
> http://people.freebsd.org/~yongari/re/rgephy.link.patch3

The previous one had a bug; please use the updated one.
http://people.freebsd.org/~yongari/re/rgephy.link.patch4

Re: Problem with re0

2011-01-30 Thread Pyun YongHyeon

On Sun, Jan 30, 2011 at 05:15:10PM -0800, Pyun YongHyeon wrote:
> On Sun, Jan 30, 2011 at 02:53:15PM +0100, Milan Obuch wrote:
> > On Sunday 30 January 2011 07:40:48 Zeus V Panchenko wrote:
> > > another detail for this nic
> > > 
> > > dmidecode
> > > Base Board Information
> > > Manufacturer: ASUSTeK Computer INC.
> > > Product Name: AT5NM10-I
> > > Version: Rev x.0x
> > > Serial Number: MT7006K15200322
> > > 
> > 
> > I did not followed this thread closely, but I checked my new board and it 
> > is 
> > the same.
> > 
> > > uname -a
> > > FreeBSD 8.2-PRERELEASE amd64
> > > 
> > > system was cvsup-ed 2011.01.20
> > > 
> > > if_re.c,v 1.160.2.17 2011/01/15 00:32:15 yongari
> > > 
> > > dmesg
> > > rgephy0:  PHY 1 on miibus0
> > > rgephy0:  10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 
> > > 100baseTX-FDX,
> > > 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX,
> > > 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto,
> > > auto-flow re0: Ethernet address: 20:cf:30:89:5e:95
> > > re0: [FILTER]
> > > 
> > > pciconf -lv
> > > re0@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03
> > > hdr=0x00 vendor = 'Realtek Semiconductor'
> > > device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> > > class  = network
> > > subclass   = ethernet
> > > 
> > 
> > All details are the same for my board (modulo serial number and MAC, of 
> > course), and in my case it works with no problem, but only in 100 Mb switch 
> > port. In 1 Gb port I have no link. I must verify my cables, port on switch 
> > itself works just fine with another 1 Gb (intel) card.
> > 
> 
> Would you try a patch at the following URL?
> http://people.freebsd.org/~yongari/re/rgephy.link.patch3
> 

The previous one had a bug; please use the updated one.
http://people.freebsd.org/~yongari/re/rgephy.link.patch4

> > > 
> > > while connected directly NIC <-> NIC they flaps too
> > > 
> > 
> > I will try this against another 1 Gb card too, just to see what happens... 
> > in 
> > 100 Mb mode it works just fine, as I mentioned already - running flood ping 
> > with 1472 bytes packets for more than an hour I see only four responses 
> > missing in more than 21 millions tries...
> > 
> > Regards,
> > Milan

Re: Problem with re0

2011-01-30 Thread Pyun YongHyeon

On Sun, Jan 30, 2011 at 08:40:48AM +0200, Zeus V Panchenko wrote:
> another detail for this nic
> 
> dmidecode
> Base Board Information
> Manufacturer: ASUSTeK Computer INC.
> Product Name: AT5NM10-I
> Version: Rev x.0x
> Serial Number: MT7006K15200322
> 
> uname -a
> FreeBSD 8.2-PRERELEASE amd64 
> 
> system was cvsup-ed 2011.01.20
> 
> if_re.c,v 1.160.2.17 2011/01/15 00:32:15 yongari
> 
> dmesg
> rgephy0:  PHY 1 on miibus0
> rgephy0:  10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 
> 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
> 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, 
> auto-flow
> re0: Ethernet address: 20:cf:30:89:5e:95
> re0: [FILTER]
> 
> pciconf -lv
> re0@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03 
> hdr=0x00
> vendor = 'Realtek Semiconductor'
> device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> class  = network
> subclass   = ethernet
> 
> 
> while connected directly NIC <-> NIC they flaps too
> 
> so, the issue with switch related causes can be excluded i believe
> 

The RTL8168/8111D sample board I have does not show this kind of
issue. This happens only when the established link is 1000baseT,
right? I slightly changed the PHY's link detection code, so would you
try the patch at the following URL?
http://people.freebsd.org/~yongari/re/rgephy.link.patch3

Re: Problem with re0

2011-01-30 Thread Pyun YongHyeon

On Sun, Jan 30, 2011 at 02:53:15PM +0100, Milan Obuch wrote:
> On Sunday 30 January 2011 07:40:48 Zeus V Panchenko wrote:
> > another detail for this nic
> > 
> > dmidecode
> > Base Board Information
> > Manufacturer: ASUSTeK Computer INC.
> > Product Name: AT5NM10-I
> > Version: Rev x.0x
> > Serial Number: MT7006K15200322
> > 
> 
> I did not followed this thread closely, but I checked my new board and it is 
> the same.
> 
> > uname -a
> > FreeBSD 8.2-PRERELEASE amd64
> > 
> > system was cvsup-ed 2011.01.20
> > 
> > if_re.c,v 1.160.2.17 2011/01/15 00:32:15 yongari
> > 
> > dmesg
> > rgephy0:  PHY 1 on miibus0
> > rgephy0:  10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX,
> > 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX,
> > 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto,
> > auto-flow re0: Ethernet address: 20:cf:30:89:5e:95
> > re0: [FILTER]
> > 
> > pciconf -lv
> > re0@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03
> > hdr=0x00 vendor = 'Realtek Semiconductor'
> > device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> > class  = network
> > subclass   = ethernet
> > 
> 
> All details are the same for my board (modulo serial number and MAC, of 
> course), and in my case it works with no problem, but only in 100 Mb switch 
> port. In 1 Gb port I have no link. I must verify my cables, port on switch 
> itself works just fine with another 1 Gb (intel) card.
> 

Would you try a patch at the following URL?
http://people.freebsd.org/~yongari/re/rgephy.link.patch3

> > 
> > while connected directly NIC <-> NIC they flaps too
> > 
> 
> I will try this against another 1 Gb card too, just to see what happens... in 
> 100 Mb mode it works just fine, as I mentioned already - running flood ping 
> with 1472 bytes packets for more than an hour I see only four responses 
> missing in more than 21 millions tries...
> 
> Regards,
> Milan

Re: nfe support for jumbo frame

2011-01-24 Thread Pyun YongHyeon

On Mon, Jan 24, 2011 at 09:05:48AM -0800, Huang, Yusheng wrote:
> Hi all,
> 
> We have ported nfe driver to our product and when we try to set mtu to 9000 
> on nfe interface, it does not work. No jumbo frame buffer were allocated. 
> Looking at the code, we found the following:
> 
> In nfe_ioctl:
> 
>   else {
>           NFE_LOCK(sc);
>           ifp->if_mtu = ifr->ifr_mtu;
>           /* ==> if IFF_DRV_RUNNING is set, call nfe_init_locked */
>           if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0)
>                   nfe_init_locked(sc);
>           NFE_UNLOCK(sc);
>   }
> However, in nfe_init_locked, it has the following test:
> 
> 
>   NFE_LOCK_ASSERT(sc);
> 
>   mii = device_get_softc(sc->nfe_miibus);
> 
>   /* ==> if IFF_DRV_RUNNING is set, return */
>   if (ifp->if_drv_flags & IFF_DRV_RUNNING)
>           return;
> 
>   nfe_stop(ifp);
> 
>   sc->nfe_framesize = ifp->if_mtu + NFE_RX_HEADERS;
> 
> So it ends up doing nothing.
> 
> Is there something we missed totally on changing the mtu?
> 

No, you're right. Fixed in HEAD (r217794).
Thanks for reporting!
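
The interaction described in the report can be modelled in a few lines.
This is a sketch with made-up names (struct sk_softc, the sk_*
functions, SK_RX_HEADERS), not the nfe(4) source. The "fixed" variant
shows one common way to force a full re-init, by clearing the running
flag first; whether r217794 does exactly this is an assumption.

```c
#define SK_DRV_RUNNING 0x1	/* stand-in for IFF_DRV_RUNNING */
#define SK_RX_HEADERS  18	/* hypothetical overhead, not NFE_RX_HEADERS */

struct sk_softc {
	int	drv_flags;
	int	mtu;
	int	framesize;	/* what the RX path is actually sized for */
};

/* Mirrors the quoted init logic: bail out early if already running. */
static void
sk_init_locked(struct sk_softc *sc)
{
	if (sc->drv_flags & SK_DRV_RUNNING)
		return;
	sc->framesize = sc->mtu + SK_RX_HEADERS;
	sc->drv_flags |= SK_DRV_RUNNING;
}

/* The quoted ioctl path: the early return makes the re-init a no-op. */
static void
sk_set_mtu_buggy(struct sk_softc *sc, int mtu)
{
	sc->mtu = mtu;
	if ((sc->drv_flags & SK_DRV_RUNNING) != 0)
		sk_init_locked(sc);
}

/* One possible fix: clear the running flag so init does a full restart. */
static void
sk_set_mtu_fixed(struct sk_softc *sc, int mtu)
{
	sc->mtu = mtu;
	if ((sc->drv_flags & SK_DRV_RUNNING) != 0) {
		sc->drv_flags &= ~SK_DRV_RUNNING;
		sk_init_locked(sc);
	}
}
```

With the buggy path, the frame size stays at the value computed for the
old MTU, which matches the observation that no jumbo buffers were
allocated.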

Re: bogus 0 len IP packet, was: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.

2011-01-19 Thread Pyun YongHyeon

On Tue, Jan 18, 2011 at 07:35:30PM +0100, sth...@nethelp.no wrote:
> > > I'm seeing the same problem with Broadcom NetXtreme (bce) cards:
> > > 
> > > bce0@pci0:3:0:0:class=0x02 card=0x03421014 chip=0x164c14e4 
> > > rev=0x12 hdr=0x00
> > > vendor = 'Broadcom Corporation'
> > > device = 'Broadcom NetXtreme II Gigabit Ethernet Adapter 
> > > (BCM5708)'
> > > class  = network
> > > subclass   = ethernet
> > > 
> > > This is with 8.2-PRERELEASE. Turning off TSO (ifconfig bce0 -tso)
> > > removes the problem.
> > > 
> > 
> > Is there a reliable way to trigger this on bce(4)? I don't have
> > BCM5708 but I have BCM5709 so I can verify that.
> 
> It showed up pretty much immediately when running a csup sessions
> against cvsup2.us.freebsd.org.
> 
> I have a pcap file from the session, if you're interested.
> 

My vague guess is that the upper stack might pass a segment smaller
than the MSS to a TSO-capable driver with CSUM_TSO set.
How about merging r212803 to stable/8?

> Steinar Haug, Nethelp consulting, sth...@nethelp.no

Re: bogus 0 len IP packet, was: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.

2011-01-18 Thread Pyun YongHyeon

On Tue, Jan 18, 2011 at 09:38:04AM +0100, sth...@nethelp.no wrote:
> > > So, does anyone have an idea why the IP length field would be set to 0
> > > for these TCP/IP packets?
> > > 
> > > Here's some info from Ronald w.r.t. his hardware. (All I can think of is
> > > that he could try disabling TSO, etc?)
> > > 
> > > Thanks in advance for any help with this, rick
> > > 
> > 
> > > It seems the issue comes from TSO. The driver sets the ip_len and
> > > ip_sum fields to 0 before passing the TCP segment to the controller.
> > > The failed lengths were 4446, 5858, 3034 and 4310, and the total
> > > number of such frames is more than 35k within 90 seconds. Since the
> > > failed length 4310 is repeated continuously, I guess there is an edge
> > > case where em(4) does not free a failed TCP segment for TSO.
> > > I remember there was a commit to HEAD (r217295) which could be
> > > related to this issue.
> 
> I'm seeing the same problem with Broadcom NetXtreme (bce) cards:
> 
> bce0@pci0:3:0:0:class=0x02 card=0x03421014 chip=0x164c14e4 
> rev=0x12 hdr=0x00
> vendor = 'Broadcom Corporation'
> device = 'Broadcom NetXtreme II Gigabit Ethernet Adapter (BCM5708)'
> class  = network
> subclass   = ethernet
> 
> This is with 8.2-PRERELEASE. Turning off TSO (ifconfig bce0 -tso)
> removes the problem.
> 

Is there a reliable way to trigger this on bce(4)? I don't have
BCM5708 but I have BCM5709 so I can verify that.

> Steinar Haug, Nethelp consulting, sth...@nethelp.no

Re: sis(4) broken on 8.2 [Re: Carp seems completely broken on 8.2-RC2 and 8.2-PRERELEASE]

2011-01-18 Thread Pyun YongHyeon

On Tue, Jan 18, 2011 at 03:37:48AM +0100, Paul Schenkeveld wrote:
> Hello,
> 
> On Mon, Jan 17, 2011 at 02:26:24PM -0800, Pyun YongHyeon wrote:
> > > Since you didn't post dmesg output I'm not sure what kind of
> > > controller you have but I guess it would be NS8381[56]. I
> > > overhauled sis(4) to make it work on all architectures so one of
> > > change, probably r212119, could be cause of the issue. Due to lack
> > > of SiS controllers I didn't touch multicast handling part so some
> > > part of code still relies on old wrong behavior of driver.
> > > Would you try attached patch and let me know whether it makes any
> > > difference?
> > > 
> > 
> > Hmm, unfortunately it seems the patch above may not work since NS
> > data sheet says that filter function should be disabled before
> > touching other bits in the register.
> > Try this one instead.
> 
> As far as I can tell, both patches work for me.  Your second patch is
> on my production firewalls now so if anything comes up over the
> coming days I'll keep you informed.
> 
> I've tested carp, both failover to backup and fallback (preemption)
> with IPv4 and with IPv6, all seems to work now.
> 

Thanks for testing. Committed to HEAD(r217548).

> Thanks again for your patches, hope you can get them into 8.2.
> 

I'm afraid it's too late. :-(

> Regards,
> 
> Paul Schenkeveld

Re: capturing packet from wlan0 with netgraph?

2011-01-17 Thread Pyun YongHyeon

On Tue, Jan 18, 2011 at 09:55:01AM +0800, Adrian Chadd wrote:
> On 18 January 2011 02:03, Monthadar Al Jaberi  wrote:
> > filed a PR http://www.freebsd.org/cgi/query-pr.cgi?pr=154091
> 
> Thanks.
> 
> Network-stack and MIPS guys - what's the best way to handle this kind
> of stuff? This isn't the first time I've come across weird alignment
> stuff in the network stack that just doesn't seem to get much
> attention. Is it perhaps worth adding some debugging macros that
> account/log unaligned-ness? So people playing at home on i386/amd64
> can play along?
> 

I guess some device driver or some part of the network stack is not
aligning the IP header on a strict-alignment architecture. I'm pretty
sure all wired drivers align the IP header on a 32-bit boundary on
strict-alignment architectures, whatever it costs. They do not align
it on non-strict-alignment architectures, since that costs too much
without a reasonable benefit.
I think you can add m_copyup() at the beginning of ip_input() so that
processing can proceed, and print backtraces to track down which one
generated the unaligned IP header.
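
The check-and-copy idea can be illustrated outside the kernel. This is
only a sketch under stated assumptions: it works on a plain byte buffer
instead of an mbuf chain, and is_aligned32() and aligned_ip_header()
are hypothetical helpers, not kernel functions. It shows why a 14-byte
Ethernet header leaves the IP header misaligned unless the frame starts
at a 2-byte offset (the customary ETHER_ALIGN padding), and how a copy
into an aligned scratch area lets processing continue while the
offender is flagged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_HDR_LEN 14	/* destination MAC + source MAC + ethertype */

/* True when p sits on a 32-bit boundary. */
static int
is_aligned32(const void *p)
{
	return (((uintptr_t)p & 3u) == 0);
}

/*
 * Return a 32-bit aligned view of the IP header that follows the
 * Ethernet header in frame. An aligned header is returned as is;
 * otherwise hdr_len bytes are copied into scratch and *copied is set
 * so the caller can log who produced the unaligned header.
 */
static const void *
aligned_ip_header(const uint8_t *frame, size_t hdr_len, uint32_t *scratch,
    size_t scratch_size, int *copied)
{
	const uint8_t *ip = frame + ETHER_HDR_LEN;

	assert(hdr_len <= scratch_size);
	*copied = 0;
	if (is_aligned32(ip))
		return (ip);
	memcpy(scratch, ip, hdr_len);
	*copied = 1;
	return (scratch);
}
```

In a kernel experiment the place where *copied is set would be the
natural spot to print the backtrace.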

> 
> 
> Adrian

Re: sis(4) broken on 8.2 [Re: Carp seems completely broken on 8.2-RC2 and 8.2-PRERELEASE]

2011-01-17 Thread Pyun YongHyeon

On Mon, Jan 17, 2011 at 01:29:47PM -0800, Pyun YongHyeon wrote:
> On Mon, Jan 17, 2011 at 08:56:15PM +0100, Paul Schenkeveld wrote:
> > On Sun, Jan 16, 2011 at 01:41:22PM +0100, Paul Schenkeveld wrote:
> > > Hi,
> > > 
> > > Trying to upgrade two Soekris firewalls to 8-STABLE or 8.2-PRERELEASE
> > > it appears that carp doesn't work at all.  I've set up carp like I've
> > > > done on many firewall pairs before and they all work correctly.  Neither
> > > > with Google nor in the mailing lists could I find anything about changes
> > > > in the way carp gets configured, but if I missed something I'd be happy
> > > to hear that it's my fault.
> > > 
> > > Here's the setup:
> > > 
> > > net5501
> > >  test3
> > >   10.4.0.4/24
> > >|
> > >   -+-
> > >|   |
> > >   net4801 net4801
> > >test1   test2
> > >  sis4: 10.4.0.2/24   sis4: 10.4.0.3/24
> > >  carp4:10.4.0.1/24   carp4:10.4.0.1/24
> > >|   |   |   |   |   |   |   |
> > >|   |   |   |   |   |   |   |
> > >  sis[0-3] connected to other networks, see
> > >  explanation below.
> > > 
> > > When I ping from test3 to 10.4.0.1, I see the following traffic using
> > > tcpdump:
> > > 
> > > test3 # tcpdump -e -n -i vr3 not vrrp
> > > tcpdump: verbose output suppressed, use -v or -vv for full protocol 
> > > decode
> > > listening on vr3, link-type EN10MB (Ethernet), capture size 96 bytes
> > > 12:09:35.121831 00:00:24:c9:30:ff > ff:ff:ff:ff:ff:ff,
> > >   ethertype ARP (0x0806), length 60:
> > >   Request who-has 10.4.0.1 tell 10.4.0.4, length 46
> > > 12:09:35.122144 00:00:24:c3:49:91 > 00:00:24:c9:30:ff,
> > >   ethertype ARP (0x0806), length 60:
> > >   Reply 10.4.0.1 is-at 00:00:5e:00:01:68, length 46
> > > 12:09:35.122173 00:00:24:c9:30:ff > 00:00:5e:00:01:68,
> > >   ethertype IPv4 (0x0800), length 98:
> > >   10.4.0.4 > 10.4.0.1: ICMP echo request,
> > >   id 40482, seq 0, length 64
> > > 
> > > test1 # tcpdump -e -n -i sis4 not vrrp
> > > tcpdump: verbose output suppressed, use -v or -vv for full protocol 
> > > decode
> > > listening on sis4, link-type EN10MB (Ethernet), capture size 96 bytes
> > > 12:09:34.977570 00:00:24:c9:30:ff > ff:ff:ff:ff:ff:ff,
> > >   ethertype ARP (0x0806), length 60:
> > >   Request who-has 10.4.0.1 tell 10.4.0.4, length 46
> > > 12:09:34.977705 00:00:24:c3:49:91 > 00:00:24:c9:30:ff,
> > >   ethertype ARP (0x0806), length 42:
> > >   Reply 10.4.0.1 is-at 00:00:5e:00:01:68, length 28
> > > 
> > > > test2 # tcpdump -e -n -i sis4 not vrrp
> > > tcpdump: verbose output suppressed, use -v or -vv for full protocol 
> > > decode
> > > listening on sis4, link-type EN10MB (Ethernet), capture size 96 bytes
> > > 12:09:35.090050 00:00:24:c9:30:ff > ff:ff:ff:ff:ff:ff,
> > >   ethertype ARP (0x0806), length 60:
> > >   Request who-has 10.4.0.1 tell 10.4.0.4, length 46
> > > 
> > > > There is an ARP request which is replied to by the carp master (test1).
> > > > The ping to the carp address does not even appear on the sis4 interface
> > > > of test1.
> > > 
> > > This is the kernel config for test1 and test2:
> > > 
> > > include GENERIC
> > > device  carp
> > > makeoptions MODULES_OVERRIDE=""
> > > 
> > > The relevant rc.conf bits:
> > > 
> > > on test1
> > > hostname="test1"
> > > cloned_interfaces="carp1 carp2 carp3 carp4"
> > > ifconfig_sis0="xxx.xxx.xxx.41/26"
> > > ifconfig_sis1="10.1.0.2/24"
> > > ifconfig_sis2="10.2.0.2/24"
> > > ifconfig_sis3="10.3.0.2/24"
> > > ifconfig_sis4="10.4.0.2/24"
> > > ifconfig_carp1="10.1.0.1/24 vhid 101 pass abcd1234 advskew   0"
> > > ifconfig_carp2="10.2.0.1/24 vhid 102 pass abcd1234 advskew   0"
> > > ifconfig_carp3="10.3.0.1/24 vhid 103 pas

Re: sis(4) broken on 8.2 [Re: Carp seems completely broken on 8.2-RC2 and 8.2-PRERELEASE]

2011-01-17 Thread Pyun YongHyeon

On Mon, Jan 17, 2011 at 08:56:15PM +0100, Paul Schenkeveld wrote:
> On Sun, Jan 16, 2011 at 01:41:22PM +0100, Paul Schenkeveld wrote:
> > Hi,
> > 
> > Trying to upgrade two Soekris firewalls to 8-STABLE or 8.2-PRERELEASE
> > it appears that carp doesn't work at all.  I've set up carp like I've
> > > done on many firewall pairs before and they all work correctly.  Neither
> > > with Google nor in the mailing lists could I find anything about changes
> > > in the way carp gets configured, but if I missed something I'd be happy
> > to hear that it's my fault.
> > 
> > Here's the setup:
> > 
> >   net5501
> >test3
> > 10.4.0.4/24
> >  |
> > -+-
> >  |   |
> >   net4801 net4801
> >test1   test2
> >  sis4: 10.4.0.2/24   sis4: 10.4.0.3/24
> >  carp4:10.4.0.1/24   carp4:10.4.0.1/24
> >|   |   |   |   |   |   |   |
> >|   |   |   |   |   |   |   |
> >  sis[0-3] connected to other networks, see
> >  explanation below.
> > 
> > When I ping from test3 to 10.4.0.1, I see the following traffic using
> > tcpdump:
> > 
> > test3 # tcpdump -e -n -i vr3 not vrrp
> > tcpdump: verbose output suppressed, use -v or -vv for full protocol 
> > decode
> > listening on vr3, link-type EN10MB (Ethernet), capture size 96 bytes
> > 12:09:35.121831 00:00:24:c9:30:ff > ff:ff:ff:ff:ff:ff,
> > ethertype ARP (0x0806), length 60:
> > Request who-has 10.4.0.1 tell 10.4.0.4, length 46
> > 12:09:35.122144 00:00:24:c3:49:91 > 00:00:24:c9:30:ff,
> > ethertype ARP (0x0806), length 60:
> > Reply 10.4.0.1 is-at 00:00:5e:00:01:68, length 46
> > 12:09:35.122173 00:00:24:c9:30:ff > 00:00:5e:00:01:68,
> > ethertype IPv4 (0x0800), length 98:
> > 10.4.0.4 > 10.4.0.1: ICMP echo request,
> > id 40482, seq 0, length 64
> > 
> > test1 # tcpdump -e -n -i sis4 not vrrp
> > tcpdump: verbose output suppressed, use -v or -vv for full protocol 
> > decode
> > listening on sis4, link-type EN10MB (Ethernet), capture size 96 bytes
> > 12:09:34.977570 00:00:24:c9:30:ff > ff:ff:ff:ff:ff:ff,
> > ethertype ARP (0x0806), length 60:
> > Request who-has 10.4.0.1 tell 10.4.0.4, length 46
> > 12:09:34.977705 00:00:24:c3:49:91 > 00:00:24:c9:30:ff,
> > ethertype ARP (0x0806), length 42:
> > Reply 10.4.0.1 is-at 00:00:5e:00:01:68, length 28
> > 
> > > test2 # tcpdump -e -n -i sis4 not vrrp
> > tcpdump: verbose output suppressed, use -v or -vv for full protocol 
> > decode
> > listening on sis4, link-type EN10MB (Ethernet), capture size 96 bytes
> > 12:09:35.090050 00:00:24:c9:30:ff > ff:ff:ff:ff:ff:ff,
> > ethertype ARP (0x0806), length 60:
> > Request who-has 10.4.0.1 tell 10.4.0.4, length 46
> > 
> > > There is an ARP request which is replied to by the carp master (test1).
> > > The ping to the carp address does not even appear on the sis4 interface
> > > of test1.
> > 
> > This is the kernel config for test1 and test2:
> > 
> > include GENERIC
> > device  carp
> > makeoptions MODULES_OVERRIDE=""
> > 
> > The relevant rc.conf bits:
> > 
> > on test1
> > hostname="test1"
> > cloned_interfaces="carp1 carp2 carp3 carp4"
> > ifconfig_sis0="xxx.xxx.xxx.41/26"
> > ifconfig_sis1="10.1.0.2/24"
> > ifconfig_sis2="10.2.0.2/24"
> > ifconfig_sis3="10.3.0.2/24"
> > ifconfig_sis4="10.4.0.2/24"
> > ifconfig_carp1="10.1.0.1/24 vhid 101 pass abcd1234 advskew   0"
> > ifconfig_carp2="10.2.0.1/24 vhid 102 pass abcd1234 advskew   0"
> > ifconfig_carp3="10.3.0.1/24 vhid 103 pass abcd1234 advskew   0"
> > ifconfig_carp4="10.4.0.1/24 vhid 104 pass abcd1234 advskew   0"
> > 
> > on test2
> > hostname="test2"
> > cloned_interfaces="carp1 carp2 carp3 carp4"
> > ifconfig_sis0="xxx.xxx.xxx.42/26"
> > ifconfig_sis1="10.1.0.3/24"
> > ifconfig_sis2="10.2.0.3/24"
> > ifconfig_sis3="10.3.0.3/24"
> > ifconfig_sis4="10.4.0.3/24"
> > ifconfig_carp1="10.1.0.1/24 vhid 101 pass abcd1234 advskew 100"
> > ifconfig_carp2="10.2.0.1/24 vhid 102 pass abcd1234 advskew 100"
> > ifconfig_carp3="10.3.0.1/24 vhid 103 pass abcd1234 advskew 100"
> > ifconfig_carp4="10.4.0.1/24 vhid 104 pass abcd1234 advskew 100"
> > 
> > In /etc/sysctl.conf:
> > net.inet.carp.preempt=1
> > 
> > Ifconfig output:
> > 
> > test1 # ifconfig sis4
> > sis4: flags=8943 metric 0 
> > mtu 1500
> > options=83808
> > ether 00:00:24:c3:49:91
> > inet 10.4.0.2 netmask 0xff00 broadcast 10.4.0.255
> > media: Ethernet autoselect (100baseTX )
> > status: active
> > test1 # ifconfig carp4
> > carp4: flags=49 metric 0 mtu 1500
> > inet 10.4.0.1 netmask 0xff00
> >

Re: bogus 0 len IP packet, was: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.

2011-01-16 Thread Pyun YongHyeon

On Sun, Jan 16, 2011 at 08:54:59AM -0500, Rick Macklem wrote:
> Ronald has reported having a problem with the FreeBSD NFS client using
> 8.2-prerelease. I've redirected it here, since it looks like there is
> a TCP/IP issue that is causing it.
> 
> > 
> > >>
> > >> These are the links to the dumps:
> > >> http://klop.ws/~ronald/nfs-problem/procstat.nolockd
> > >> http://klop.ws/~ronald/nfs-problem/ps.nolockd
> > >> http://klop.ws/~ronald/nfs-problem/linux5.nfs.nolockd.dump.gz
> > >> http://klop.ws/~ronald/nfs-problem/linux5.nfs.with_rpc_patch.dump.gz
> > >>
> 
> I looked at the last of these via wireshark and it seems that the FreeBSD
> client is sending bogus TCP/IP packets with a IP length == 0. (If you look
> at the above dump (...with_rpc_patch.dump.gz), the first one is at packet #46,
> then #3024, then persistently starting at #3234.) Basically the packet looks
> like:
>   frame len: 4446
>   MAC: dst 00:0d:56:70:b7:6c src b8:ac:6f:47:73:6e type: 08 00 (IP)
>   IP
> version: 4
> header length: 20
> differentiated services field: 0x00
> total length: 0
>   - followed by what looks like a legitimate TCP/IP packet
> 
> Here's the first bytes of the raw packet data:
> 00 0d 56 70 b7 6c b8 ac 6f 47 73 6e 08 00
> 45 00 00 00 ...
> 
> After this packet is sent to the Linux server, it replies with a TCP ack,
> which gets ACK'd from FreeBSD as well. For the persistent case, it just
> keeps doing this (bogus 0 length packet from FreeBSD -> Linux server,
> followed by the two TCP ack packets) over and over and over again, to
> the end of the dump.
> 
> So, does anyone have an idea why the IP length field would be set to 0
> for these TCP/IP packets?
> 
> Here's some info from Ronald w.r.t. his hardware. (All I can think of is
> that he could try disabling TSO, etc?)
> 
> Thanks in advance for any help with this, rick
> 

It seems the issue comes from TSO. The driver sets the ip_len and
ip_sum fields to 0 before passing the TCP segment to the controller.
The failed lengths were 4446, 5858, 3034 and 4310, and the total
number of such frames is more than 35k within 90 seconds. Since the
failed length 4310 is repeated continuously, I guess there is an edge
case where em(4) does not free a failed TCP segment for TSO.
I remember there was a commit to HEAD (r217295) which could be
related to this issue.
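
To make the symptom concrete, here is a small user-space sketch. It is
an illustration only: fake_ip_hdr, model_tso_prepare() and
frame_has_zero_ip_len() are hypothetical names, and the struct holds
just the two fields mentioned above rather than a full IP header. The
first helper models the header preparation a TSO driver does before
handing a large segment to the controller; the second checks a captured
Ethernet frame for the signature seen in the dump (ethertype 08 00
followed by an IP total length of 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14	/* destination MAC + source MAC + ethertype */

/* Only the two IP header fields discussed above. */
struct fake_ip_hdr {
	uint16_t ip_len;
	uint16_t ip_sum;
};

/* Models the TSO setup: the controller fills in both fields later. */
static void
model_tso_prepare(struct fake_ip_hdr *ip)
{
	ip->ip_len = 0;
	ip->ip_sum = 0;
}

/*
 * Returns 1 if frame is an IPv4 Ethernet frame whose IP total length
 * field is 0, i.e. it looks like a TSO-prepared header that reached
 * the wire without being segmented.
 */
static int
frame_has_zero_ip_len(const uint8_t *frame, size_t len)
{
	assert(frame != NULL);
	if (len < ETHER_HDR_LEN + 4)
		return (0);
	if (frame[12] != 0x08 || frame[13] != 0x00)
		return (0);
	/* Total length is bytes 2 and 3 of the IP header. */
	return (frame[ETHER_HDR_LEN + 2] == 0 &&
	    frame[ETHER_HDR_LEN + 3] == 0);
}
```

Fed with the first 18 bytes of the raw packet quoted in this thread,
the checker reports a match.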

> 
> > > I just looked at the last dump and there seems to be a network
> > > issue.
> > > (It first shows up at packet #46, then again at #3025, then
> > > persistently
> > > starting at #3234.)
> > >
> > > I'd like to post on freebsd-net@ to see if anyone more conversant
> > > with
> > > TCP/IP can look, but first I'd like to get a little more info on
> > > your
> > > hardware/software config.
> > >
> > > In particular, what network hardware does the FreeBSD client use?
> > >
> > > And I assume the server is some variant of Linux?
> > >
> > > Thanks for creating the tcpdumps, rick
> > > ps: If you look on wireshark, the problem seems to start with a
> > > badly formed IP datagram that then causes acks in both
> > > directions.
> > 
> > We are getting off list now. I don't know how good that is.
> > 
> I've redirected it to freebsd-net@ in the hopes that networking folks
> can help.
> 
> > But here is some info. I also noticed the bad packets. And my computer
> > is
> > resending the same info at quite a high rate (MB/s).
> > The server is an up-to-date Linux Debian 5 with a 2.6.26-1-686-bigmem
> > kernel. Colleagues using Linux clients don't have these problems.
> > 
> > dmesg | grep em0
> > em0:  port 0xece0-0xecff
> > mem
> > 0xf7fe-0xf7ff,0xf7fd9000-0xf7fd9fff irq 21 at device 25.0 on
> > pci0
> > em0: Using an MSI interrupt
> > em0: [FILTER]
> > em0: Ethernet address: b8:ac:6f:47:73:6e
> > 
> > pciconf -lv
> > em0@pci0:0:25:0: class=0x02 card=0x02761028 chip=0x10de8086
> > rev=0x02 hdr=0x00
> > vendor = 'Intel Corporation'
> > device = 'Intel Gigabit network connection (82567LM-3 )'
> > class = network
> > subclass = ethernet
> > 
> > 
> > [root@ronald ~]# ifconfig em0
> > em0: flags=8843 metric 0 mtu
> > 1500
> > options=219b
> > ether b8:ac:6f:47:73:6e
> > inet 10.1.20.49 netmask 0xff00 broadcast 10.1.20.255
> > media: Ethernet autoselect (100baseTX )
> > status: active
> > 
> > Thanks for looking into this. If you need more just let me know. I can
> > also reproduce the problem and send nfsstat or netstat output or 
> > 
> > Thanks,
> > Ronald.

Re: [patch] re(4) problems on networks with disabled autonegotiation "solver" (WAS: Juniper e3k with ports limitied to...) -- REQUEST FOR REVIEW

2011-01-13 Thread Pyun YongHyeon

On Fri, Jan 14, 2011 at 02:24:12AM +0100, Marius Strobl wrote:
> On Thu, Jan 13, 2011 at 01:27:13PM -0800, Pyun YongHyeon wrote:
> > On Thu, Jan 13, 2011 at 06:39:25PM +0100, Marius Strobl wrote:
> > > On Wed, Jan 12, 2011 at 11:59:07PM +0100, Marius Strobl wrote:
> > > > On Wed, Jan 12, 2011 at 01:32:08PM -0800, Pyun YongHyeon wrote:
> > > > > On Wed, Jan 12, 2011 at 07:20:09PM +0300, Lev Serebryakov wrote:
> > > > > > Hello, Freebsd-net.
> > > > > > 
> > > > > >   Thanks to Pyun YongHyeon, who point me at fact, that rgephy(4) used
> > > > > > with re(4) does autonegotiation always and all other, who helps me
> > > > > > diagnose problem!
> > > > > > 
> > > > > >   I've prepared patch, which adds tunable/sysctl for rgephy(4) which
> > > > > > allows not to use autonegotiation by this PHY (at user responsibility,
> > > > > > as here is PHYs which CAN NOT live without autonegotiation). It is OFF
> > > > > > by default, and in such case behavior of driver IS NOT CHANGED.
> > > > > > 
> > > > > >   But if it is set ON (non-zero value) before "media / mediopt"
> > > > > > changes via "ifconfig" autonegotiation IS NOT set with 10/100Mbit
> > > > > > settings.
> > > > > > 
> > > > > >   I've documented this new tunable in re(4) manpage, as here is no
> > > > > > rgephy(4) manpage.
> > > > > > 
> > > > > >   Tunable is per-device, not global one.
> > > > > > 
> > > > > >   Sysctl can be set after boot, but will affect only future ifconfig
> > > > > >  calls, it doesn't change anything in PHY settings by itself.
> > > > > > 
> > > > > >   It allows fully manual setup on non-buggy hardware, which allows to
> > > > > > use Hetzner dedicated servers with FreeBSD without additional NIC or
> > > > > > gigabit connection.
> > > > > > 
> > > > > >   I've tested this patch on FreeBSD 8-STABLE on Hetzner server and it
> > > > > > allows me to get full-duplex 100Mbit connection and I got 11MiB/s from
> > > > > > local NFS with it.
> > > > > > 
> > > > > >   Without this patch FreeBSD is unusable on Hetzner dedicated servers
> > > > > > in newer DCs (DC 13 and DC 14).
> > > > > > 
> > > > > >   Patch is attached. I think, it worths to include it to base system,
> > > > > > as it allows use FreeBSD at least on one very large and popular
> > > > > > hosting provider without additional costs :)
> > > > > > 
> > > > > 
> > > > > Thanks for your work. After reading commit log of rgephy(4) I now
> > > > > refreshed my memory. The issue came from the reverse usage case.
> > > > > Suppose link partner announces auto-negotiation but you want to use
> > > > > 100baseTX/full-duplex. As you know this results in duplex mismatch
> > > > > and sometimes it couldn't establish a link on some RealTek PHYs.
> > > > > (Now I'm not entirely sure it was caused by the specific switch or
> > > > > rgephy(4) or both)
> > > > > And frequently, link partner(switch) is out of control from your
> > > > > domain and most switches are configured to use auto-negotiation by
> > > > > default. Using auto-negotiation in manual media configuration
> > > > > seemed to address the issue at that time. 1000baseT link always
> > > > > requires auto-negotiation but too many switches were broken with
> > > > > auto-negotiation so some switches are forced to use manual media
> > > > > configuration even in 1000baseT mode. Using auto-negotiation on
> > > > > rgephy(4) will also solve that case.
> > > > > 
> > > > > So I have mixed feelings on how to handle both cases. Traditional
> > > > > way, which your patch does, used in manual configuration was to
> > > > > strictly honor specified manual media configuration even if it can
> > > > > break in so

Re: [patch] re(4) problems on networks with disabled autonegotiation "solver" (WAS: Juniper e3k with ports limitied to...) -- REQUEST FOR REVIEW

2011-01-13 Thread Pyun YongHyeon

On Thu, Jan 13, 2011 at 06:39:25PM +0100, Marius Strobl wrote:
> On Wed, Jan 12, 2011 at 11:59:07PM +0100, Marius Strobl wrote:
> > On Wed, Jan 12, 2011 at 01:32:08PM -0800, Pyun YongHyeon wrote:
> > > On Wed, Jan 12, 2011 at 07:20:09PM +0300, Lev Serebryakov wrote:
> > > > Hello, Freebsd-net.
> > > > 
> > > >   Thanks to Pyun YongHyeon, who point me at fact, that rgephy(4) used
> > > > with re(4) does autonegotiation always and all other, who helps me
> > > > diagnose problem!
> > > > 
> > > >   I've prepared patch, which adds tunable/sysctl for rgephy(4) which
> > > > > allows not to use autonegotiation by this PHY (at user responsibility,
> > > > as here is PHYs which CAN NOT live without autonegotiation). It is OFF
> > > > by default, and in such case behavior of driver IS NOT CHANGED.
> > > > 
> > > >   But if it is set ON (non-zero value) before "media / mediopt"
> > > > changes via "ifconfig" autonegotiation IS NOT set with 10/100Mbit
> > > > settings.
> > > > 
> > > >   I've documented this new tunable in re(4) manpage, as here is no
> > > > rgephy(4) manpage.
> > > > 
> > > >   Tunable is per-device, not global one.
> > > > 
> > > >   Sysctl can be set after boot, but will affect only future ifconfig
> > > >  calls, it doesn't change anything in PHY settings by itself.
> > > > 
> > > >   It allows fully manual setup on non-buggy hardware, which allows to
> > > > use Hetzner dedicated servers with FreeBSD without additional NIC or
> > > > gigabit connection.
> > > > 
> > > >   I've tested this patch on FreeBSD 8-STABLE on Hetzner server and it
> > > > allows me to get full-duplex 100Mbit connection and I got 11MiB/s from
> > > > local NFS with it.
> > > > 
> > > >   Without this patch FreeBSD is unusable on Hetzner dedicated servers
> > > > in newer DCs (DC 13 and DC 14).
> > > > 
> > > >   Patch is attached. I think, it worths to include it to base system,
> > > > as it allows use FreeBSD at least on one very large and popular
> > > > hosting provider without additional costs :)
> > > > 
> > > 
> > > Thanks for your work. After reading commit log of rgephy(4) I now
> > > refreshed my memory. The issue came from the reverse usage case.
> > > Suppose link partner announces auto-negotiation but you want to use
> > > 100baseTX/full-duplex. As you know this results in duplex mismatch
> > > and sometimes it couldn't establish a link on some RealTek PHYs.
> > > (Now I'm not entirely sure it was caused by the specific switch or
> > > rgephy(4) or both)
> > > And frequently, link partner(switch) is out of control from your
> > > domain and most switches are configured to use auto-negotiation by
> > > default. Using auto-negotiation in manual media configuration
> > > seemed to address the issue at that time. 1000baseT link always
> > > requires auto-negotiation but too many switches were broken with
> > > auto-negotiation so some switches are forced to use manual media
> > > configuration even in 1000baseT mode. Using auto-negotiation on
> > > rgephy(4) will also solve that case.
> > > 
> > > So I have mixed feelings on how to handle both cases. Traditional
> > > way, which your patch does, used in manual configuration was to
> > > strictly honor specified manual media configuration even if it can
> > > break in some edge cases. Programming PHYs with traditional way
> > > shall also trigger other problems to drivers which correctly keep
> > > track of valid link state changes. Normally speed/duplex/flow-control
> > > changes require MAC reprogramming such that monitoring PHY's state
> > > change is essential to modern ethernet controllers. Forcing manual
> > > media configuration can make PHY drivers fail to report link state
> > > changes which in turn shall make ethernet controller not to work
> > > due to speed/duplex mismatches between PHY and MAC of ethernet
> > > controller. re(4) does not require MAC reprogramming but many other
> > > > drivers that use rgephy(4) may not work. However rgephy(4)
> > > hardware I have still seem to correctly report link state change
> > > with manual link configuration. Not sure about old controllers
> > > though.
> > > 
> > > I'm under

Re: [patch] re(4) problems on networks with disabled autonegotiation "solver" (WAS: Juniper e3k with ports limitied to...) -- REQUEST FOR REVIEW

2011-01-12 Thread Pyun YongHyeon

On Wed, Jan 12, 2011 at 12:59:58PM -0800, Artem Belevich wrote:
> 2011/1/12 Lev Serebryakov :
> > Hello, Freebsd-net.
> >
> >   Thanks to Pyun YongHyeon, who point me at fact, that rgephy(4) used
> > with re(4) does autonegotiation always and all other, who helps me
> > diagnose problem!
> >
> >   I've prepared patch, which adds tunable/sysctl for rgephy(4) which
> > allows not to use autonegotiation by this PHY (at user responsibility,
> > as here is PHYs which CAN NOT live without autonegotiation). It is OFF
> > by default, and in such case behavior of driver IS NOT CHANGED.
> >
> >   But if it is set ON (non-zero value) before "media / mediopt"
> > changes via "ifconfig" autonegotiation IS NOT set with 10/100Mbit
> > settings.
> >
> >   I've documented this new tunable in re(4) manpage, as here is no
> > rgephy(4) manpage.
> 
> I wonder if we could make autonegotiation another media option.
> This may solve the problem at hand in a more generic way.
> 
> In case someone specifies speed/duplex settings but want
> autonegotiation on, we can advertise only that particular speed/duplex
> capability (as opposed to advertising everything we support). This
> would force remote end to either establish the link with the
> parameters we want or keep the link down which would be better than
> keeping the link up with mismatched duplex settings.
> 

Yeah, that would be a good option. However, it's not trivial to
implement this in all PHY drivers. Some PHY hardware has no way to
tell whether it successfully resolved speed/duplex with manual media
configuration.
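
The suggestion quoted above (keep auto-negotiation on but advertise
only the requested speed/duplex) comes down to what is written to the
PHY's auto-negotiation advertisement register. A minimal sketch,
assuming the standard register 4 bit layout from IEEE 802.3 clause 28;
enum manual_media and anar_for_media() are hypothetical names, and no
real PHY driver is reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Advertisement register (MII register 4) bits, IEEE 802.3 clause 28. */
#define ANAR_CSMA	0x0001	/* selector field: IEEE 802.3 */
#define ANAR_10		0x0020	/* 10baseT half-duplex */
#define ANAR_10_FD	0x0040	/* 10baseT full-duplex */
#define ANAR_TX		0x0080	/* 100baseTX half-duplex */
#define ANAR_TX_FD	0x0100	/* 100baseTX full-duplex */

/* Hypothetical media selection as requested by the administrator. */
enum manual_media {
	MEDIA_AUTO,
	MEDIA_10T_HDX,
	MEDIA_10T_FDX,
	MEDIA_100TX_HDX,
	MEDIA_100TX_FDX
};

/*
 * Value to advertise: every 10/100 ability for MEDIA_AUTO, exactly one
 * ability otherwise, so the link partner either agrees on the requested
 * mode or the link stays down.
 */
static uint16_t
anar_for_media(enum manual_media m)
{
	assert(m >= MEDIA_AUTO && m <= MEDIA_100TX_FDX);
	switch (m) {
	case MEDIA_10T_HDX:
		return (ANAR_CSMA | ANAR_10);
	case MEDIA_10T_FDX:
		return (ANAR_CSMA | ANAR_10_FD);
	case MEDIA_100TX_HDX:
		return (ANAR_CSMA | ANAR_TX);
	case MEDIA_100TX_FDX:
		return (ANAR_CSMA | ANAR_TX_FD);
	default:
		return (ANAR_CSMA | ANAR_10 | ANAR_10_FD | ANAR_TX |
		    ANAR_TX_FD);
	}
}
```

Note that this only helps when the link partner also negotiates. A
port forced to 100Mbps/full-duplex with negotiation disabled would
still fall back to parallel detection on the other side.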

Re: [patch] re(4) problems on networks with disabled autonegotiation "solver" (WAS: Juniper e3k with ports limitied to...) -- REQUEST FOR REVIEW

2011-01-12 Thread Pyun YongHyeon

On Wed, Jan 12, 2011 at 07:20:09PM +0300, Lev Serebryakov wrote:
> Hello, Freebsd-net.
> 
>   Thanks to Pyun YongHyeon, who point me at fact, that rgephy(4) used
> with re(4) does autonegotiation always and all other, who helps me
> diagnose problem!
> 
>   I've prepared patch, which adds tunable/sysctl for rgephy(4) which
> allows not to use autonegotiation by this PHY (at user responsibility,
> as here is PHYs which CAN NOT live without autonegotiation). It is OFF
> by default, and in such case behavior of driver IS NOT CHANGED.
> 
>   But if it is set ON (non-zero value) before "media / mediopt"
> changes via "ifconfig" autonegotiation IS NOT set with 10/100Mbit
> settings.
> 
>   I've documented this new tunable in re(4) manpage, as here is no
> rgephy(4) manpage.
> 
>   Tunable is per-device, not global one.
> 
>   Sysctl can be set after boot, but will affect only future ifconfig
>  calls, it doesn't change anything in PHY settings by itself.
> 
>   It allows fully manual setup on non-buggy hardware, which allows to
> use Hetzner dedicated servers with FreeBSD without additional NIC or
> gigabit connection.
> 
>   I've tested this patch on FreeBSD 8-STABLE on Hetzner server and it
> allows me to get full-duplex 100Mbit connection and I got 11MiB/s from
> local NFS with it.
> 
>   Without this patch FreeBSD is unusable on Hetzner dedicated servers
> in newer DCs (DC 13 and DC 14).
> 
>   Patch is attached. I think, it worths to include it to base system,
> as it allows use FreeBSD at least on one very large and popular
> hosting provider without additional costs :)
> 

Thanks for your work. Reading the commit log of rgephy(4) refreshed
my memory. The issue came from the reverse use case.
Suppose the link partner announces auto-negotiation but you want to
use 100baseTX/full-duplex. As you know, this results in a duplex
mismatch, and sometimes a link could not be established at all on some
RealTek PHYs. (I'm no longer entirely sure whether that was caused by
the specific switch, by rgephy(4), or by both.)
Frequently the link partner (switch) is outside your control, and
most switches are configured to use auto-negotiation by default.
Using auto-negotiation in manual media configuration seemed to
address the issue at that time. A 1000baseT link always requires
auto-negotiation, but too many switches were broken with
auto-negotiation, so some switches are forced to use manual media
configuration even in 1000baseT mode. Using auto-negotiation in
rgephy(4) also solves that case.

So I have mixed feelings about how to handle both cases. The
traditional approach to manual configuration, which your patch
takes, is to strictly honor the specified media even if that breaks
in some edge cases. Programming PHYs the traditional way can also
cause problems for drivers that correctly keep track of valid link
state changes. Normally speed/duplex/flow-control changes require
MAC reprogramming, so monitoring the PHY's state changes is
essential for modern ethernet controllers. Forcing manual media
configuration can make PHY drivers fail to report link state
changes, which in turn can stop the ethernet controller from
working because of a speed/duplex mismatch between the PHY and the
MAC. re(4) does not require MAC reprogramming, but many other
drivers that use rgephy(4) may not work. However, the rgephy(4)
hardware I have still seems to report link state changes correctly
with manual link configuration. I'm not sure about old controllers
though.
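
To make the MAC reprogramming point concrete, here is a minimal
userland sketch of what a driver does when the PHY reports a link
state change. All types and names (link_state, mac_config,
mac_link_changed) are hypothetical and are not taken from re(4),
rgephy(4) or mii(4); the sketch only shows why a PHY that stops
reporting changes leaves the MAC with stale settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not from any FreeBSD driver. */
enum duplex { DUPLEX_HALF, DUPLEX_FULL };

struct link_state {
	bool		up;
	int		speed_mbps;	/* 10, 100 or 1000 */
	enum duplex	duplex;
};

struct mac_config {
	int		speed_mbps;
	enum duplex	duplex;
	int		reprogram_count;	/* times the MAC was rewritten */
};

/*
 * Called whenever the PHY driver reports a link state change.
 * Returns true if the MAC settings had to be rewritten.
 */
bool
mac_link_changed(struct mac_config *mac, const struct link_state *phy)
{
	assert(mac != NULL && phy != NULL);

	if (!phy->up)
		return (false);	/* nothing to match while the link is down */
	if (mac->speed_mbps == phy->speed_mbps && mac->duplex == phy->duplex)
		return (false);	/* MAC already agrees with the PHY */

	/* Bring the MAC in line with what the PHY resolved. */
	mac->speed_mbps = phy->speed_mbps;
	mac->duplex = phy->duplex;
	mac->reprogram_count++;
	return (true);
}
```

If the PHY driver never calls such a hook after a forced media
change, the MAC keeps its old speed/duplex and the two halves of the
controller disagree.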

I'm under the impression that rgephy(4)'s behavior confuses users
a lot because it unconditionally uses auto-negotiation. So I think
it's better not to use auto-negotiation at all during manual media
configuration, and to provide a way to enable auto-negotiation in
manual media configuration if the administrator wants that.

Re: Juniper e3k with ports limitied to 100Mbit and re NICs on MSI MoBo: problems with duplex negotiation (Hetzner host provider discard FreeBSD support due this bug)

2011-01-12 Thread Pyun YongHyeon

On Wed, Jan 12, 2011 at 08:56:19PM +0300, Lev Serebryakov wrote:
> Hello, Pyun.
> You wrote on January 12, 2011, 20:32:42:
> 
> 
> >> > That had been supported for long time. Just remove full-duplex
> >> > media option in your manual configuration.
> >>   What do you mean by this? Without this media options it will be
> >>   100Mbit half-duplex when switch port want 100Mbit full-duplex. With
> > Because you have to rely on parallel detection the end result would
> > be half-duplex. So switch port could be configured to
> > 100Mbps/half-duplex and you can still use auto-negotiation. This
>What if I CAN NOT configure switch port? It is
> 100Mbps/full-duplex/no-autonegotiation and it is not under my control!
> 

Of course that workaround is not available if you have no access
to the switch. I thought the service provider could do that at
your request.

> -- 
> // Black Lion AKA Lev Serebryakov 
> 

Re: Juniper e3k with ports limitied to 100Mbit and re NICs on MSI MoBo: problems with duplex negotiation (Hetzner host provider discard FreeBSD support due this bug)

2011-01-12 Thread Pyun YongHyeon

On Wed, Jan 12, 2011 at 10:36:09AM +, Bjoern A. Zeeb wrote:
> On Tue, 11 Jan 2011, Pyun YongHyeon wrote:
> 
> >On Tue, Jan 11, 2011 at 12:47:29PM +0300, Lev Serebryakov wrote:
> >>
> >> media: Ethernet 100baseTX  (100baseTX )
> >
> >I can see what's going on here. Link partner used forced media
> >configuration, probably 100baseTX/full-duplex, and re(4)'s
> >resolved link is 100baseTX/half-duplex.
> 
> I can confirm that the switch port should be (manually) set to 100/FD.
> It's documented on their support wiki (in German).
> 
> 
> >rgephy(4) currently always use auto-negotiation to work-around link
> >establishment issues reported in past. I don't know how Linux
> >managed to address link establishment issues for
> >non-autonegotiation case though. Perhaps a lot of vendor supplied
> 
> As I read your reply, there had been a time when manually setting
> 100/FD was possible but it didn't quite work?
> 

Correct. In many cases (probably old controllers) it worked, but
some PHY revisions did not like manual configuration. I don't
remember the details.

> 
> >DSP fixups addressed that issue but I'm not sure.
> >For your case, the only way to address the issue at this moment is
> >to use auto-negotiation but that would establish 1000baseT link
> >which would add cost for you. Alternatively request half-duplex
> >configuration to the provider to get a agreed link duplex.
> 
> We should still try to fix it somehow.  Also it would be nice if re(4),
> or rephy(4) if we had that,  would document the issue properly in BUGS.
> 
> 
> >See
> >http://lists.freebsd.org/pipermail/freebsd-amd64/2011-January/013589.html
> >for details on parallel detection.
> 
> As someone from Hetzner has pointed out to me the original discussion
> seemd to have been here:
> 
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-November/059894.html
> 
> 
> While I can understand the problem, has anyone contacted RealTek for
> documentation to solve that matter, so that we could equally fix the
> things as other major OSes have done by now (either themselves or by
> a vendor update)?
> 

Recently I got a contact point at the vendor. The vendor is not
willing to provide a data sheet, but they were generous enough to
donate a bunch of engineering sample boards to me. I'll ask the
vendor some specific questions. Let's see how it goes.

Re: Juniper e3k with ports limitied to 100Mbit and re NICs on MSI MoBo: problems with duplex negotiation (Hetzner host provider discard FreeBSD support due this bug)

2011-01-12 Thread Pyun YongHyeon

On Wed, Jan 12, 2011 at 12:03:03PM +0300, Lev Serebryakov wrote:
> Hello, Pyun.
> You wrote on January 12, 2011, 1:45:26:
> 
> 
> > That had been supported for long time. Just remove full-duplex
> > media option in your manual configuration.
>   What do you mean by this? Without this media options it will be
>   100Mbit half-duplex when switch port want 100Mbit full-duplex. With

Because you have to rely on parallel detection, the end result
would be half-duplex. So the switch port could be configured to
100Mbps/half-duplex and you can still use auto-negotiation. This
way re(4) may be able to establish a 100Mbps half-duplex link.
Full-duplex would be better, but there seems to be no way to get
it at this moment.

>   this media option result is the same because rgephy(4) ignores user
>   requests, if I understand right.
> 
> -- 
> // Black Lion AKA Lev Serebryakov 
> 

Re: Juniper e3k with ports limitied to 100Mbit and re NICs on MSI MoBo: problems with duplex negotiation (Hetzner host provider discard FreeBSD support due this bug)

2011-01-11 Thread Pyun YongHyeon

On Wed, Jan 12, 2011 at 12:31:10AM +0300, Lev Serebryakov wrote:
> Hello, Pyun.
> You wrote on January 11, 2011, 23:00:07:
> 
> > rgephy(4) currently always use auto-negotiation to work-around link
> > establishment issues reported in past.
>   I think, it is the root of the problem. Autonegotiation is DISABLED on
> these ports. I think, some additional mediaopt (like
> force-half-duplex) for rgephy(4) will be solution.
> 
> > For your case, the only way to address the issue at this moment is
> > to use auto-negotiation but that would establish 1000baseT link
> > which would add cost for you. Alternatively request half-duplex
> > configuration to the provider to get a agreed link duplex.
>Maybe, adding new mediaopt is not very hard? Or is it?
> 

That has been supported for a long time. Just remove the
full-duplex media option from your manual configuration.

Re: Juniper e3k with ports limitied to 100Mbit and re NICs on MSI MoBo: problems with duplex negotiation (Hetzner host provider discard FreeBSD support due this bug)

2011-01-11 Thread Pyun YongHyeon

On Tue, Jan 11, 2011 at 12:47:29PM +0300, Lev Serebryakov wrote:
> Hello, Freebsd-net.
> 
>   Very large and famous (due to very attractive prices) hosting
>  provider Hetzner.de discards FreeBSD support on dedicated servers,
>  because these servers can not negotiate 100Mbit/DUPLEX when
>  switches' ports are limited to 100Mbit (1Gbit connection costs
>  additional money) only under FreeBSD. Linux works fine.
> 
>   Switches known to be Juniper e3k series.
> 
>   MoBos of servers are different assortment of MSI MoBos with Realtek
> (re driver) network-on-board.
> 
>   Symptoms are: NIC can not negotiate/set duplex when switch port is
>  limited to 100Mbit/Duplex. Duplex can not be set even manually via
>  "ifconfig":
> 
> 
>  media: Ethernet 100baseTX  (100baseTX )
> 
>   Is it a known problem? Maybe the -CURRENT driver has a fix for it?
> 
>   Unfortunately, I can not provide more information, as I don't have
> server at Hetzner (I'm planning to order one, but due to these
> problems, I'm not sure now, as I need FreeBSD), and all this
> information is collected in communication with people who HAVE servers
> with FreeBSD installed.
> 
>  Again, I know, that Realtek NICs are crap, but "everybody says" that
> Linux doesn't have THIS problem with THESE boards and switches.
> 

I can see what's going on here. The link partner uses forced media
configuration, probably 100baseTX/full-duplex, and re(4)'s
resolved link is 100baseTX/half-duplex.
rgephy(4) currently always uses auto-negotiation to work around
link establishment issues reported in the past. I don't know how
Linux managed to address link establishment issues for the
non-autonegotiation case though. Perhaps a lot of vendor-supplied
DSP fixups addressed that issue, but I'm not sure.
For your case, the only way to address the issue at this moment is
to use auto-negotiation, but that would establish a 1000baseT link,
which would add cost for you. Alternatively, ask the provider for a
half-duplex configuration so both ends agree on the link duplex.

See
http://lists.freebsd.org/pipermail/freebsd-amd64/2011-January/013589.html
for details on parallel detection.
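
The duplex outcome described above can be sketched as a small
model. This is only an illustration under stated assumptions: the
names (port_cfg, resolved_duplex, duplex_mismatch) are hypothetical,
the speed is taken to be 100baseTX on both ends, and a port that
auto-negotiates is assumed to advertise full-duplex. With parallel
detection the negotiating side can sense the partner's speed but not
its duplex, so it falls back to half-duplex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum duplex { DUPLEX_HALF, DUPLEX_FULL };

/* Hypothetical per-port configuration; speed is fixed at 100baseTX. */
struct port_cfg {
	bool		autoneg;	/* true: port uses auto-negotiation */
	enum duplex	duplex;		/* used only when autoneg is false */
};

/* Duplex that "self" ends up using, given how "partner" is configured. */
enum duplex
resolved_duplex(const struct port_cfg *self, const struct port_cfg *partner)
{
	assert(self != NULL && partner != NULL);

	if (!self->autoneg)
		return (self->duplex);	/* forced: use what was configured */
	if (partner->autoneg)
		return (DUPLEX_FULL);	/* both negotiate the best common mode */
	/*
	 * Parallel detection: the partner sends no negotiation pages, so
	 * only its speed can be detected and duplex defaults to half.
	 */
	return (DUPLEX_HALF);
}

/* True when the two ends of the link disagree on duplex. */
bool
duplex_mismatch(const struct port_cfg *a, const struct port_cfg *b)
{
	return (resolved_duplex(a, b) != resolved_duplex(b, a));
}
```

In this model a negotiating NIC against a switch port forced to
100/full reports a mismatch, while the same NIC against a port forced
to 100/half does not, which matches the half-duplex workaround
suggested above.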

Re: tx v2 error 0x6204 - is this a new feature?

2011-01-10 Thread Pyun YongHyeon

On Mon, Jan 10, 2011 at 11:10 PM,   wrote:
> Greetings,
>  I have been receiving these messages on a recent 8.1/AMD64 install.
> src/ports && world/kern about a week ago. Here is a block from the most
> recent output:
> nfe0: tx v2 error 0x6204
>
> It appears to only occur when transmitting largish amounts of data
> across an NFS mount. I'm not sure where the MIN-threshold lies. But
> appears to be >=1.5Mb.
> This fresh 8.1/AMD64 is part of a largish server farm comprised of
> 7+ - 8.0 i386 servers. This one is the only AMD64. It is also the
> only AMD64. I experience this when mounting an 8.0/i386 server from
> this 8.1/AMD64. The i386 also has mounts on this 8.1/AMD64.
> relevant info:
> ### 8.0/i386
> 8.0-STABLE FreeBSD 8.0-STABLE #0: /usr/obj/usr/src/sys/UDNS01  i386
> Tyan 2-CPU MB
> 2 NIC's: fxp0 (only one in use)
> ### 8.1/AMD64
> FreeBSD 8.1-RELEASE-p2 #0: /usr/obj/usr/src/sys/XII amd64
> MSI K9N4 Ultra
> CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ (3511.34-MHz K8-class
> CPU)
> 1 NIC nfe0
> ### common to both:
> rc.conf
> nfs_client_enable="YES"
> nfs_reserved_port_only="YES"
> nfs_server_enable="YES"
>
> NIC's on both boards are 10/100's @100mbps
>
> Can anyone provide any insight as to why I should be receiving these
> messages on a fresh 8.1/amd64 install. Is 8.1 INcompatible with
> earlier versions?
>

No, I guess you're seeing one of the unresolved nfe(4) issues.
By chance, are you using forced media configuration instead of
auto-negotiation?
Posting both dmesg and "ifconfig nfe0" output would be useful.

> Thank you for all your time and consideration.
>
> --Chris

Re: msk driver rxcsum issue

2010-12-31 Thread Pyun YongHyeon

On Thu, Dec 30, 2010 at 04:56:28PM -0800, Pyun YongHyeon wrote:
> On Wed, Dec 29, 2010 at 12:35:51AM -0800, Sreekanth M. wrote:
> > Hi, 
> > 
> >  
> > 
> >I am Sreekanth from Netlogic microsystems.
> > 
> >  
> > 
> > I am having an issue with msk driver.
> > 
> > It is related to rxcsum.
> > 
> > In freebsd 9, rxcsum is enabled in default for the device I am using on
> > XLS MIPS board.
> > 
> > From hardware id it is known that chip id is: CHIP_ID_YUKON_EC.
> > 
> >  
> > 
> > Though I am able to ping to this port, ssh (TCP) is not working.
> > 
> >  
> > 
> > If I disable the rxcsum through the command 
> > 
> > "ifconfig msk0 -rxcsum"
> > 
> > It works. i.e ssh command works fine.
> > 
> >  
> 
> Would you try attached patch and let me know how it goes on MIPS?

Thanks for testing. Fix committed to HEAD(r216860).

Re: msk driver rxcsum issue

2010-12-30 Thread Pyun YongHyeon

On Wed, Dec 29, 2010 at 12:35:51AM -0800, Sreekanth M. wrote:
> Hi, 
> 
>  
> 
>I am Sreekanth from Netlogic microsystems.
> 
>  
> 
> I am having an issue with msk driver.
> 
> It is related to rxcsum.
> 
> In freebsd 9, rxcsum is enabled in default for the device I am using on
> XLS MIPS board.
> 
> From hardware id it is known that chip id is: CHIP_ID_YUKON_EC.
> 
>  
> 
> Though I am able to ping to this port, ssh (TCP) is not working.
> 
>  
> 
> If I disable the rxcsum through the command 
> 
> "ifconfig msk0 -rxcsum"
> 
> It works. i.e ssh command works fine.
> 
>  

Would you try attached patch and let me know how it goes on MIPS?
Index: sys/dev/msk/if_msk.c
===
--- sys/dev/msk/if_msk.c	(revision 216827)
+++ sys/dev/msk/if_msk.c	(working copy)
@@ -3070,7 +3070,7 @@
 	default:
 		return;
 	}
-	csum = ntohs(sc_if->msk_csum & 0x);
+	csum = bswap16(sc_if->msk_csum & 0x);
 	/* Checksum fixup for IP options. */
 	len = hlen - sizeof(struct ip);
 	if (len > 0) {
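
A note on why swapping ntohs() for bswap16() can matter, offered as
a reading of the patch rather than a statement from its author:
ntohs() swaps bytes only on little-endian hosts and is a no-op on
big-endian ones, while bswap16() always swaps. If the MIPS board
runs big-endian, the swap that little-endian machines were getting
would have been silently missing there. The helpers below (swap16,
model_ntohs) are hypothetical and model that difference in plain
userland C.

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional 16-bit byte swap, like bswap16(). */
uint16_t
swap16(uint16_t v)
{
	return ((uint16_t)((v << 8) | (v >> 8)));
}

/*
 * Model of ntohs() with the host byte order made an explicit
 * parameter so that both cases can be compared on any machine.
 */
uint16_t
model_ntohs(uint16_t v, int host_is_big_endian)
{
	assert(host_is_big_endian == 0 || host_is_big_endian == 1);
	return (host_is_big_endian ? v : swap16(v));
}
```

On a little-endian host the two calls give the same result, so the
change would be invisible there; only the big-endian case differs.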

Re: nfe_defrag() routine in nvidia ethernet driver

2010-12-18 Thread Pyun YongHyeon

On Thu, Dec 16, 2010 at 07:53:16PM -0800, abcde abcde wrote:
> Hi, we ported the nvidia ethernet driver to our product. It's been OK until
> recently we ran into an error condition where packets would get dropped
> quietly.
> The root cause resides in the nfe_encap() routine, where we call nfe_defrag()
> to try to reduce the length of the mbuf chain to 32, if it's longer than 32.
> In the event the 32 mbufs need more than 32 segments, the subsequent call to
> bus_dmamap_load_mbuf_sg() would cause it to return an error then the packet
> is subsequently dropped.
> 
> 
> My questions are,
> 
> 1. there appears to be a generic m_defrag() routine available, which doesn't
> stop at 32 and is used by a couple of other drivers (Intel, Broadcom, to name
> a few). What was the need for a nvidia version of the defrag routine?
> 

As John said, m_defrag(9) is an expensive operation. Since all
nfe(4) controllers support multiple TX buffers, use m_collapse(9)
instead.

> 2. The NFE_MAX_SCATTER constant, which limits how many segments can be used,
> is defined to be 32, while the corresponding constants for other drivers are
> 100 or 64 (again Intel or Broadcom). How was the value 32 picked? Anybody
> knows the reasoning behind them?
> 

I think nfe(4) controllers have no limit on the number of segments
that can be used. However, most ethernet controllers targeted at
non-server systems are not good at supporting multiple outstanding
DMA read operations on the PCIe bus. Even if a controller supports
multiple DMA read operations, it takes more time to fetch a TX
frame that is split across a long mbuf chain than a short or single
contiguous TX frame. The CPU is much faster than the controller's
DMA engine.
The magic number 32 was chosen to balance performance against
resource usage. 32 should be large enough for TSO to send a full
64KB TCP segment. If the controller had no TSO capability I would
have used 16.
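
One way to see where a figure like 32 can come from, as a sketch
only: if each DMA segment is assumed to map one 2048-byte mbuf
cluster (an assumption, the cluster size is not stated above) and
headers are ignored, a 64KB TSO payload needs 65536 / 2048 = 32
segments. The helper name segs_needed is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of fixed-size segments needed to hold frame_len bytes,
 * i.e. frame_len divided by seg_size, rounded up.
 */
size_t
segs_needed(size_t frame_len, size_t seg_size)
{
	assert(seg_size > 0);
	return ((frame_len + seg_size - 1) / seg_size);
}
```

Under the same assumption, segs_needed(65536, 2048) fits the 32
segment limit exactly, and a chain that is fragmented more finely
than one cluster per segment would exceed it, which is when the
driver has to collapse the chain first.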

Re: Problem with igb(4) updated to version 2.0.7

2010-12-04 Thread Pyun YongHyeon

On Fri, Dec 03, 2010 at 02:00:08PM -0800, Jack Vogel wrote:
> There are pros and cons either way you do things. I was talking to some of
> our
> Linux crew, they recently changed things so it would shut down the phy, but
> that
> doesn't always make everyone happy either.
> 
> Just saying that my FreeBSD drivers have not done so forever :)
> 

I also think powering down the PHY when the interface is down is
not a good idea in most cases, but administrators are able to do
that by choosing the 'none' media for drivers which use mii(4). For
example,
# ifconfig foo0 media none

will power down the PHY. Of course igb(4) is not aware of mii(4) so
the command above does not work at this moment.

> Jack
> 
> 
> On Fri, Dec 3, 2010 at 11:41 AM, Eugene Grosbein  wrote:
> 
> > On 04.12.2010 01:37, Mike Tancsa wrote:
> >
> > >> Now I see, thanks.
> > >>
> > >> Is it technically possible to bring link down
> > >> for distinct port of dual-port em/igb-supported NICs using software?
> > >>
> > >> If yes, I'd like to patch my source tree.
> > >> For EtherChannel this kind of management should be possible.
> > >
> > >
> > > If your switch port's speed and duplex are manual, change the media
> > > options on the NIC to something like 10 half. The switch should see the
> > > port "down" then.
> >
> > Yes, but switches are not always at my control and one may be in auto mode.
> >
> > Eugene Grosbein

Re: Problem with re0

2010-12-02 Thread Pyun YongHyeon

On Thu, Dec 02, 2010 at 09:56:42PM +0100, Gabor Radnai wrote:
> Hi,
> 
> Could someone pls advise how to inject HEAD driver to stable release without
> full kernel rebuild (if possible)?
> 

If you have updated to stable/8, the driver code would be the same.
So there is no need to replace the driver with the HEAD version.

> I tried this way but found no assurance/evidence actually kernel using the
> new driver:
> 1. download full HEAD source with help of csup
> 2. in /usr/src/sys/modules/re did make install which in turn compiled re
> driver and installed into default /boot/kernel
> 3. reboot
> 
> So far so good but still re0 driver cannot properly handle rtl8111 chip
> seeing the very same symptoms as in case of
> the driver shipped with RELEASE.
> 

If my memory is correct, your controller is a somewhat old 8168
PCIe controller. I also have the same TP-Link TG-3468 PCIe network
card, which seems to be the only stand-alone PCIe 8168 controller
on the market. I don't see any problems using the controller, so
would you summarize your issue again? (Sorry if you already posted
it.)

> Thanks.

Re: Problem with re0

2010-11-15 Thread Pyun YongHyeon

On Sat, Nov 13, 2010 at 09:09:18AM +0200, Zeus V Panchenko wrote:
> Pyun YongHyeon (pyu...@gmail.com) [10.11.13 01:01] wrote:
> > 
> > Please be more specific for the issue. Your description is hard to
> > narrow down possible cause.
> > 
> > > i was sure it is the problem of the onboard rt nics ...
> > > 
> > 
> > pciconf output of all re(4) controllers are useless because the
> > vendor uses the same device id. Please show me the output of dmesg
> > which will contain necessary information to identify your
> > controller revision.
> > 
> 
> oh, sorry :(
> 
> here is what you say:
> 
> the integrated onboard nic:
> re0:  port 
> 0xe800-0xe8ff mem 0xfafff000-0xfaff,0xfaff8000-0xfaffbfff irq 17 at 
> device 0.0 on pci2
> re0: Using 1 MSI messages
> re0: Chip rev. 0x2800
> re0: MAC rev. 0x
> miibus0:  on re0
> rgephy0:  PHY 1 on miibus0
> rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-FDX, auto
> re0: Ethernet address: 48:5b:39:d2:1d:89
> re0: [FILTER]
> 
> 
> external PCI nic:
> re1:  port 0xd800-0xd8ff mem 
> 0xfbeffc00-0xfbeffcff irq 17 at device 0.0 on pci1
> re1: Chip rev. 0x1000
> re1: MAC rev. 0x
> miibus1:  on re1
> rgephy1:  PHY 1 on miibus1
> rgephy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-FDX, auto
> re1: Ethernet address: 00:21:91:f4:5f:e4
> re1: [FILTER]
> 
> 
> all that data i was posting here before (several months ago) and
> nothing changed since that time 
> 
> due to production state of the boxes i was forced finally to switch to
> external nics and to configure it with vlans and even to unplug the
> cable of the onboard nic
> 
> since nic with plugged cable but without assigned ip address did begin
> flap (may be it's specific of the swith it plugged in, it is TP-Link
> TL-SG5426 but no other nic behaves this way)
> 
> i have 7 boxes of this configuration and all 6 are running
> now on external nics
> 
> 
> if i can provide any debug/info/e.t.c. please let me know, i'd be
> happy it'd work at last :)
>  

Ok, please try the latest re(4) in HEAD. If that does not change
the behavior, give the attached patch a spin and let me know
whether it makes any difference. Note, the attached patch may
trigger watchdog timeouts under certain conditions, but if you do
not remove the UTP cable that shouldn't happen. I have to verify
whether it can really trigger watchdog timeouts, and that takes
more time on my side.
Index: sys/dev/re/if_re.c
===
--- sys/dev/re/if_re.c	(revision 215345)
+++ sys/dev/re/if_re.c	(working copy)
@@ -2151,9 +2151,10 @@
 	RL_LOCK_ASSERT(sc);
 
 	mii = device_get_softc(sc->rl_miibus);
-	mii_tick(mii);
-	if ((sc->rl_flags & RL_FLAG_LINK) == 0)
+	if ((sc->rl_flags & RL_FLAG_LINK) == 0) {
+		mii_tick(mii);
 		re_miibus_statchg(sc->rl_dev);
+	}
 	/*
 	 * Reclaim transmitted frames here. Technically it is not
 	 * necessary to do here but it ensures periodic reclamation

Re: if_msk not working after suspend/resume if only one adapter is present

2010-11-15 Thread Pyun YongHyeon

On Mon, Nov 15, 2010 at 11:11:42PM +, Bruce Cran wrote:
> I've been trying to get suspend/resume working on my Dell laptop.  I have two 
> if_msk adapters: one's the built-in 100Mb port and the other's a Sonnet Gb 
> ExpressCard NIC. I've noticed that if I boot with the Gb card installed both 
> ports work fine when the laptop has been resumed but if I don't have the Gb 
> card installed then the built-in port stops working.   Is there any way to 
> debug what might be going wrong?
> 

Could you show me the dmesg output?
I've attached a blind patch which could be related to
suspend/resume. I guess the power-down code was not synchronized
well for newer controllers.
Index: sys/dev/msk/if_msk.c
===
--- sys/dev/msk/if_msk.c	(revision 215345)
+++ sys/dev/msk/if_msk.c	(working copy)
@@ -2941,8 +2941,6 @@
 	CSR_WRITE_4(sc, B0_HWE_IMSK, 0);
 	CSR_READ_4(sc, B0_HWE_IMSK);
 
-	msk_phy_power(sc, MSK_PHY_POWERDOWN);
-
 	/* Put hardware reset. */
 	CSR_WRITE_2(sc, B0_CTST, CS_RST_SET);
 	sc->msk_pflags |= MSK_FLAG_SUSPEND;

Re: ML370 G4 with poor Network Performance and high CPU Load

2010-11-12 Thread Pyun YongHyeon

On Thu, Nov 11, 2010 at 10:53:38PM +, r...@reckschwardt.de wrote:
>  here is the pciconf for the onboard Nic
> 

You still didn't post dmesg output. Because there were a lot of
bge(4) changes since 8.1-RELEASE, I think it would be better to try
CURRENT or latest snapshot release and check whether you still see
the same issue.

> b...@pci0:7:3:0:class=0x02 card=0x00cb0e11 chip=0x16c714e4 
> rev=0x10 hdr=0x00
> vendor = 'Broadcom Corporation'
> device = 'BCM5703A3 NetXtreme Gigabit Ethernet'
> class  = network
> subclass   = ethernet
> bar   [10] = type Memory, range 64, base 0xfdef, size 65536, 
> enabled
> cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split 
> transaction
> cap 01[48] = powerspec 2  supports D0 D3  current D0
> cap 03[50] = VPD
> cap 05[58] = MSI supports 8 messages, 64 bit
> 
> regards r?
> 


Re: Problem with re0

2010-11-12 Thread Pyun YongHyeon

On Fri, Nov 12, 2010 at 09:07:59AM +0200, Zeus V Panchenko wrote:
> Hi,
> 
> Gabor Radnai (gabor.rad...@gmail.com) [10.11.11 23:22] wrote:
> > pciconf:
> > n...@pci0:0:20:0:class=0x068000 card=0x816a1043 chip=0x026910de rev=0xa3
> > hdr=0x00
> > vendor = 'NVIDIA Corporation'
> > device = 'MCP51 Network Bus Enumerator'
> > class  = bridge
> > r...@pci0:1:0:0:class=0x02 card=0x816810ec chip=0x816810ec rev=0x01
> > hdr=0x00
> > vendor = 'Realtek Semiconductor'
> > device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> > class  = network
> > subclass   = ethernet
> > 
> 
> i have the same problem (i was writing here before, but still no idea)
> with onboard (Realtek Gigabit Ethernet NIC(NDIS 6.0)
> (RTL8168/8111/8111c)) nic while the same driver but another vendor
> (D-Link DGE-528T Gigabit adaptor (dlg10086)) nic works fine ...
> 
> the flapping of the realtek interface is so much drastic, that i was
> forced to unplug the cable even on the ip less nic
> 

Please be more specific about the issue. Your description makes it
hard to narrow down the possible cause.

> i was sure it is the problem of the onboard rt nics ...
> 

pciconf output is useless for re(4) controllers because the vendor
uses the same device ID for all of them. Please show me the output
of dmesg, which contains the information needed to identify your
controller revision.

> uname:
> FreeBSD 8.1-STABLE #3 amd64
> 
> pciconf -lv:
> r...@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03 
> hdr=0x00
> vendor = 'Realtek Semiconductor'
> device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> class  = network
> subclass   = ethernet
> r...@pci0:1:0:0: class=0x02 card=0x43001186 chip=0x43001186 rev=0x10 
> hdr=0x00
> vendor = 'D-Link System Inc'
> device = 'Used on DGE-528T Gigabit adaptor (dlg10086)'
> class  = network
> subclass   = ethernet
> 
> dmidecode:
> Handle 0x0002, DMI type 2, 15 bytes
> Base Board Information
> Manufacturer: ASUSTeK Computer INC.
> Product Name: AT5NM10-I
> Version: Rev x.0x
> Serial Number: MT7006K15200628
> Asset Tag: To Be Filled By O.E.M.
> Features:
> Board is a hosting board
> Board is replaceable
> Location In Chassis: To Be Filled By O.E.M.
> Chassis Handle: 0x0003
> Type: Motherboard
> Contained Object Handles: 0
> 
> Handle 0x0012, DMI type 8, 9 bytes
> Port Connector Information
> Internal Reference Designator: LAN
> Internal Connector Type: None
> External Reference Designator: LAN
> External Connector Type: RJ-45
> Port Type: Network Port
> 
> -- 
> Zeus V. Panchenko
> IT Dpt., IBS ltd  GMT+2 (EET)

Re: Problem with re0

2010-11-12 Thread Pyun YongHyeon

On Fri, Nov 12, 2010 at 10:18:40PM +0100, Gabor Radnai wrote:
> Hi,
> 
> I hope you are interested in inet section it looks like this (will able to
> send the exact output only a bit later unfortunately as removed the card) :
> inet 0.0.0.0 netmask 255.255.255.255
> 

This might be caused by dhclient (i.e. dhclient(8) failed to
receive a DHCP ACK).
BTW, you still didn't show me the output of ifconfig re0 after
bringing the interface up (i.e. ifconfig re0 up).

> Thanks,
> Gabor
> 
> On Thu, Nov 11, 2010 at 10:26 PM, Pyun YongHyeon  wrote:
> 
> > On Thu, Nov 11, 2010 at 09:56:26PM +0100, Gabor Radnai wrote:
> > > Hi,
> > >
> > > I have an Asus M2NPV-VM motherboard with integrated Nvidia MCP51 Gigabit
> > > Ethernet NIC and
> > > TP-Link TG-3468 PCIe network card which is using Realtek 8111 chip.
> > >
> > > I have problem with the re driver: the Nvidia network interface is
> > working
> > > properly but the other
> > > though it seems recognized by OS I cannot use. Sporadically it remains
> > down
> > > and if it gets up then
> > > does not get ip address via DHCP nor help if I set static ip address. Can
> > > manipulate via ifconfig but
> > > unreachable via IP.
> > >
> > > I replaced cable, interchanged cable working with Nvidia, restarted
> > > switch/router but no luck so far.
> > > Also using this nic in a Windows machine - it works. Using my Asus mob
> > with
> > > Ubuntu Live CD - card works.
> > >
> > > Can it be a driver bug or this type of chip is not supported by re
> > driver?
> > >
> >
> > Eh, you already know the answer, recognized by re(4) but does not
> > work so it's a bug of re(4). Would you show me the output of
> > ifconfig re0 after UP the interface(i.e. ifconfig re0 up).
> >

Re: ML370 G4 with poor Network Performance and high CPU Load

2010-11-11 Thread Pyun YongHyeon

On Thu, Nov 11, 2010 at 10:44:31PM +, r...@reckschwardt.de wrote:
>  Hello YongHyeon,
> 
> yes, both test servers are idle, with no disk activity and no 
> significant network traffic.
> 
> the pciconf -lcbv for the Nics:
> 
> e...@pci0:7:1:0: class=0x02 card=0x00db0e11 chip=0x10108086 rev=0x01 
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Dual Port Gigabit Ethernet Controller (Copper) 
> (82546EB)'
> class  = network
> subclass   = ethernet
> bar   [10] = type Memory, range 64, base 0xfdfe, size 131072, 
> enabled
> bar   [18] = type Memory, range 64, base 0xfdf8, size 262144, 
> enabled
> bar   [20] = type I/O Port, range 32, base 0x6000, size 64, enabled
> cap 01[dc] = powerspec 2  supports D0 D3  current D0
> cap 07[e4] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split 
> transaction
> cap 05[f0] = MSI supports 1 message, 64 bit
> e...@pci0:7:1:1: class=0x02 card=0x00db0e11 chip=0x10108086 rev=0x01 
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Dual Port Gigabit Ethernet Controller (Copper) 
> (82546EB)'
> class  = network
> subclass   = ethernet
> bar   [10] = type Memory, range 64, base 0xfdf6, size 131072, 
> enabled
> bar   [20] = type I/O Port, range 32, base 0x6040, size 64, enabled
> cap 01[dc] = powerspec 2  supports D0 D3  current D0
> cap 07[e4] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split 
> transaction
> cap 05[f0] = MSI supports 1 message, 64 bit
> 
> if you need more Info please ask me ;-)
> 

Hmmm, I don't see any Broadcom controllers here. If you see issues
on em(4), Jack can help you. Note that the 82546EB is a really old
controller, and I also remember its performance was not great
compared to the PCIe versions.

> thanks for your responce r?
> 
> >On Thu, Nov 11, 2010 at 07:35:32PM +, r...@reckschwardt.de wrote:
> >>  Hello,
> >>
> >>i am new in this Maillist and i use an ML370G4 with FreeBSD 8.1 AMD64. I
> >>try with netio and TCP. The used Nics are onboard Broadcom
> >>(PCI-X133Mhz), an Broadcom PCI-X Nic and an intel PCI-X Nic. The CPU
> >>load is around 35% and the performance like this:
> >>
> >>Packet size  1k bytes:  99303 KByte/s Tx,  44576 KByte/s Rx.
> >>Packet size  2k bytes:  72043 KByte/s Tx,  75200 KByte/s Rx.
> >>Packet size  4k bytes:  23280 KByte/s Tx,  66072 KByte/s Rx.
> >>Packet size  8k bytes:  55234 KByte/s Tx,  64470 KByte/s Rx.
> >>Packet size 16k bytes:  82485 KByte/s Tx,  74099 KByte/s Rx.
> >>Packet size 32k bytes:  93133 KByte/s Tx,  74992 KByte/s Rx.
> >>
> >And you did perform the test on idle system?(No disk activity, no
> >other network IOs etc).
> >
> >Show me the dmesg output of verbose boot and output of "pciconf
> >-lcbv".
> >
> >>I try the following tuning:
> >>
> >>kern.ipc.maxsockbuf=16777216
> >>net.inet.tcp.sendbuf_max=16777216
> >>net.inet.tcp.recvbuf_max=16777216
> >>net.inet.tcp.sendbuf_inc=16384
> >>net.inet.tcp.recvbuf_inc=524288
> >>net.inet.tcp.inflight.enable=0
> >>net.inet.tcp.hostcache.expire=1
> >>
> >>but this is not helpfull, the Load goes to 60% and the Performance is
> >>also poor. How can i prevent this Problem?
> >>
> >>thanks for response r?
> >>
> >>
> >>P.S. the same Computer with Linux runs perfect with Performance and 1-2%
> >>Load,
> >>
> >
> 
> 


Re: ML370 G4 with poor Network Performance and high CPU Load

2010-11-11 Thread Pyun YongHyeon

On Thu, Nov 11, 2010 at 07:35:32PM +, r...@reckschwardt.de wrote:
>  Hello,
> 
> i am new in this Maillist and i use an ML370G4 with FreeBSD 8.1 AMD64. I 
> try with netio and TCP. The used Nics are onboard Broadcom 
> (PCI-X133Mhz), an Broadcom PCI-X Nic and an intel PCI-X Nic. The CPU 
> load is around 35% and the performance like this:
> 
> Packet size  1k bytes:  99303 KByte/s Tx,  44576 KByte/s Rx.
> Packet size  2k bytes:  72043 KByte/s Tx,  75200 KByte/s Rx.
> Packet size  4k bytes:  23280 KByte/s Tx,  66072 KByte/s Rx.
> Packet size  8k bytes:  55234 KByte/s Tx,  64470 KByte/s Rx.
> Packet size 16k bytes:  82485 KByte/s Tx,  74099 KByte/s Rx.
> Packet size 32k bytes:  93133 KByte/s Tx,  74992 KByte/s Rx.
> 

And did you perform the test on an idle system (no disk activity,
no other network I/O, etc.)?

Show me the dmesg output of a verbose boot and the output of
"pciconf -lcbv".
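
To compare the netio figures quoted above with the nominal 1000 Mbit/s
line rate, a small conversion helps. This is only a sketch: it assumes
that netio's "KByte" means 1024 bytes, and the function name
kbyte_per_s_to_mbit_per_s is made up for illustration.

```python
# Assumption: netio reports KByte/s with 1 KByte = 1024 bytes.
BYTES_PER_KBYTE = 1024


def kbyte_per_s_to_mbit_per_s(kbyte_per_s: float) -> float:
    """Convert a netio-style KByte/s figure to Mbit/s (10^6 bits per second)."""
    return kbyte_per_s * BYTES_PER_KBYTE * 8 / 1_000_000


# A few Tx figures copied from the report, keyed by netio packet size.
reported_tx = {"1k": 99303, "4k": 23280, "32k": 93133}

for size, rate in reported_tx.items():
    print(f"{size:>3}: {kbyte_per_s_to_mbit_per_s(rate):7.1f} Mbit/s")
```

Under that assumption the 1k Tx figure is roughly 813 Mbit/s, while the
4k Tx figure is only about 191 Mbit/s.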

> I try the following tuning:
> 
> kern.ipc.maxsockbuf=16777216
> net.inet.tcp.sendbuf_max=16777216
> net.inet.tcp.recvbuf_max=16777216
> net.inet.tcp.sendbuf_inc=16384
> net.inet.tcp.recvbuf_inc=524288
> net.inet.tcp.inflight.enable=0
> net.inet.tcp.hostcache.expire=1
> 
> but this is not helpfull, the Load goes to 60% and the Performance is 
> also poor. How can i prevent this Problem?
> 
> thanks for response r?
> 
> 
> P.S. the same Computer with Linux runs perfect with Performance and 1-2% 
> Load,
> 

Re: Problem with re0

2010-11-11 Thread Pyun YongHyeon

On Thu, Nov 11, 2010 at 09:56:26PM +0100, Gabor Radnai wrote:
> Hi,
> 
> I have an Asus M2NPV-VM motherboard with integrated Nvidia MCP51 Gigabit
> Ethernet NIC and
> TP-Link TG-3468 PCIe network card which is using Realtek 8111 chip.
> 
> I have problem with the re driver: the Nvidia network interface is working
> properly but the other
> though it seems recognized by OS I cannot use. Sporadically it remains down
> and if it gets up then
> does not get ip address via DHCP nor help if I set static ip address. Can
> manipulate via ifconfig but
> unreachable via IP.
> 
> I replaced cable, interchanged cable working with Nvidia, restarted
> switch/router but no luck so far.
> Also using this nic in a Windows machine - it works. Using my Asus mob with
> Ubuntu Live CD - card works.
> 
> Can it be a driver bug or this type of chip is not supported by re driver?
> 

Eh, you already know the answer: the card is recognized by re(4)
but does not work, so it's a bug in re(4). Would you show me the
output of "ifconfig re0" after bringing the interface up (i.e.
after "ifconfig re0 up")?

Re: icmp packets on em larger than 1472 [SEC=UNCLASSIFIED]

2010-11-11 Thread Pyun YongHyeon

On Thu, Nov 11, 2010 at 08:10:57AM -0800, Kevin Oberman wrote:
> > Date: Wed, 10 Nov 2010 23:49:56 -0800 (PST)
> > From: Kirill Yelizarov 
> > 
> > 
> > 
> > --- On Thu, 11/11/10, Kevin Oberman  wrote:
> > 
> > > From: Kevin Oberman 
> > > Subject: Re: icmp packets on em larger than 1472 [SEC=UNCLASSIFIED]
> > > To: "Wilkinson, Alex" 
> > > Cc: freebsd-sta...@freebsd.org
> > > Date: Thursday, November 11, 2010, 8:26 AM
> > > > Date: Thu, 11 Nov 2010 13:01:26 +0800
> > > > From: "Wilkinson, Alex" 
> > > > Sender: owner-freebsd-sta...@freebsd.org
> > > > 
> > > > 
> > > >     On Wed, Nov 10, 2010 at 04:21:12AM -0800, Kirill Yelizarov wrote: 
> > > > 
> > > >     >All my em cards running 8.1 stable don't reply to icmp echo
> > > >     >requests packets larger than 1472 bytes.
> > > >     >
> > > >     >On stable 7.2 the same hardware works as expected:
> > > >     ># ping -s 1500 192.168.64.99
> > > >     >PING 192.168.64.99 (192.168.64.99): 1500 data bytes
> > > >     >1508 bytes from 192.168.64.99: icmp_seq=0 ttl=63 time=1.249 ms
> > > >     >1508 bytes from 192.168.64.99: icmp_seq=1 ttl=63 time=1.158 ms
> > > >     >
> > > >     >Here is the dump on em interface
> > > >     >15:06:31.452043 IP 192.168.66.65 > *: ICMP echo request, id 28729, seq 5, length 1480
> > > >     >15:06:31.452047 IP 192.168.66.65 > : icmp
> > > >     >15:06:31.452069 IP  > 192.168.66.65: ICMP echo reply, id 28729, seq 5, length 1480
> > > >     >15:06:31.452071 IP *** > 192.168.66.65: icmp
> > > >     >
> > > >     >Same ping from same source (it's a 8.1 stable with fxp interface)
> > > >     >to em card running 8.1 stable
> > > >     >#pciconf -lv
> > > >     >e...@pci0:3:4:0: class=0x02 card=0x10798086 chip=0x10798086 rev=0x03 hdr=0x00
> > > >     >    vendor     = 'Intel Corporation'
> > > >     >    device     = 'Dual Port Gigabit Ethernet Controller (82546EB)'
> > > >     >    class      = network
> > > >     >    subclass   = ethernet
> > > >     >
> > > >     ># ping -s 1472 192.168.64.200
> > > >     >PING 192.168.64.200 (192.168.64.200): 1472 data bytes
> > > >     >1480 bytes from 192.168.64.200: icmp_seq=0 ttl=63 time=0.848 ms
> > > >     >^C
> > > >     >
> > > >     ># ping -s 1473 192.168.64.200
> > > >     >PING 192.168.64.200 (192.168.64.200): 1473 data bytes
> > > >     >^C
> > > >     >--- 192.168.64.200 ping statistics ---
> > > >     >4 packets transmitted, 0 packets received, 100.0% packet loss
> > > > 
> > > > works fine for me:
> > > > 
> > > > FreeBSD 8.1-STABLE #0 r213395
> > > > 
> > > > e...@pci0:0:25:0:class=0x02 card=0x3035103c chip=0x10de8086 rev=0x02 hdr=0x00
> > > >     vendor     = 'Intel Corporation'
> > > >     device     = 'Intel Gigabit network connection (82567LM-3 )'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > > 
> > > > #ping -s 1473 host
> > > > PING host(192.168.1.1): 1473 data bytes
> > > > 1481 bytes from 192.168.1.1: icmp_seq=0 ttl=253 time=31.506 ms
> > > > 1481 bytes from 192.168.1.1: icmp_seq=1 ttl=253 time=31.493 ms
> > > > 1481 bytes from 192.168.1.1: icmp_seq=2 ttl=253 time=31.550 ms
> > > > ^C
> > > 
> > > The reason the '-s 1500' worked was that the packets were
> > > fragmented. If
> > > I add the '-D' option, '-s 1473' fails on v7 and v8. Are
> > > the V8 systems
> > > where you see if failing without the '-D' on the same
> > > network segment?
> > > If not, it is likely that an intervening device is refusing
> > > to fragment
> > > the packet. (Some routers deliberately don't fragment ICMP
> > > Echos Request
> > > packets.) 
> > 
> > If i set -D -s 1473 sender side refuses to ping and that is
> > correct. All mentioned above machines are behind the same router and
> > switch. Same hardware running v7 is working while v8 is not. And i
> > never saw such problems before.  Also correct me if i'm wrong but the
> > dump shows that the packet arrived. I'll try driver from head and will
> > post here results.
> 
> I did a bit more looking at this today and I see that something bogus is
> going on and it MAY be the em driver.
> 
> I tried 1473 data byte pings without the DF flag. I then captured the
> packets on both ends (where the sending system has a bge (Broadcom GE)
> and the responding end has an em (Intel) card.
> 
> What I saw was the fragmented IP packets all being received by the
> system with the em interface and an ICMP Echo Reply being sent back,
> again fragmented. I saw the reply on both ends, so both interfaces were
> able to fragment an over-sized packet, transmit the two pieces, and
> receive the two pieces. The em device could re-assemble them properly,
> but the bge device does not seem to re-assemble them correctly or else
> has a problem with ICMP packets bigger then MTU size.
> 
> When I
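
The 1472/1473-byte boundary discussed in this thread follows from
header arithmetic on a 1500-byte MTU. The sketch below assumes a plain
IPv4 header without options (20 bytes) and an 8-byte ICMP echo header;
the function names are hypothetical and only illustrate the
calculation.

```python
IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo request/reply header


def max_unfragmented_ping_payload(mtu: int = 1500) -> int:
    """Largest 'ping -s' value that still fits in a single IP packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def needs_fragmentation(payload: int, mtu: int = 1500) -> bool:
    """True if an echo request with this payload must be fragmented."""
    return payload > max_unfragmented_ping_payload(mtu)


def fragment_payload_sizes(payload: int, mtu: int = 1500) -> list:
    """IP payload bytes carried by each fragment of one echo request."""
    remaining = payload + ICMP_HEADER
    room = mtu - IP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = room // 8 * 8
    sizes = []
    while remaining > room:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


for size in (1472, 1473, 1500):
    print(size, needs_fragmentation(size), fragment_payload_sizes(size))
```

With these assumptions "ping -s 1472" fits in one packet, while
"ping -s 1473" is split into a 1480-byte fragment plus a 1-byte
fragment, which matches the "length 1480" entries followed by a bare
"icmp" fragment in the tcpdump output quoted above.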

Re: [patch] WOL support for nfe(4)

2010-11-11 Thread Pyun YongHyeon

On Thu, Nov 11, 2010 at 08:08:25AM +0100, Yamagi Burmeister wrote:
> On Wed, 10 Nov 2010, Pyun YongHyeon wrote:
> 
> >On Tue, Nov 09, 2010 at 01:34:21PM -0800, Pyun YongHyeon wrote:
> >>On Tue, Nov 09, 2010 at 10:01:36PM +0100, Yamagi Burmeister wrote:
> >>>On Tue, 9 Nov 2010, Pyun YongHyeon wrote:
> >>>
> >>>>>No, the link stays at 1000Mbps so the driver must manually switch back
> >>>>>to 10/100Mbps.
> >>>>>
> >>>>
> >>>>Hmm, this is real problem for WOL. Establishing 1000Mbps link to
> >>>>accept WOL frames is really bad idea since it can draw more power
> >>>>than 375mA. Consuming more power than 375mA is violation of
> >>>>PCI specification and some system may completely shutdown the power
> >>>>to protect hardware against over-current damage which in turn means
> >>>>WOL wouldn't work anymore. Even if WOL work with 1000Mbps link for
> >>>>all nfe(4) controllers, it would dissipate much more power.
> >>>>
> >>>>Because nfe(4) controllers are notorious for using various PHYs,
> >>>>it's hard to write a code to reliably establish 10/100Mbps link in
> >>>>driver. In addition, nfe(4) is known to be buggy in link state
> >>>>handling such that forced media selection didn't work well. I'll
> >>>>see what could be done in this week if I find spare time.
> >>>
> >>>Hmm... Maybe just add a hint to the manpage that WOL is possible broken?
> >>
> >>I think this may not be enough. Because it can damage your hardware
> >>under certain conditions if protection circuit was not there.
> >>
> >
> >Ok, I updated patch which will change link speed to 10/100Mbps when
> >shutdown/suspend is initiated.  You can get the patch at the
> >following URL. Please give it a try and let me know whether it
> >really changes link speed to 10/100Mbps. If it does not work as
> >expected, show me the dmesg output of your system.
> >
> >http://people.freebsd.org/~yongari/nfe/nfe.wol.patch2
> 
> Okay, that does the trick. At shutdown the link speed is changed to
> 10/100Mbps, at boot - either via WOL magic packet or manuell startup -
> it's changed back to 1000Mbps.
> 

Thanks, patch committed (r215132); will MFC after a week.

> Thanks again,
> Yamagi
> 
> -- 
> Homepage: www.yamagi.org
> Jabber:   yam...@yamagi.org
> GnuPG/GPG:0xEFBCCBCB

Re: [patch] WOL support for nfe(4)

2010-11-10 Thread Pyun YongHyeon

On Tue, Nov 09, 2010 at 01:34:21PM -0800, Pyun YongHyeon wrote:
> On Tue, Nov 09, 2010 at 10:01:36PM +0100, Yamagi Burmeister wrote:
> > On Tue, 9 Nov 2010, Pyun YongHyeon wrote:
> > 
> > >>No, the link stays at 1000Mbps so the driver must manually switch back
> > >>to 10/100Mbps.
> > >>
> > >
> > >Hmm, this is real problem for WOL. Establishing 1000Mbps link to
> > >accept WOL frames is really bad idea since it can draw more power
> > >than 375mA. Consuming more power than 375mA is violation of
> > >PCI specification and some system may completely shutdown the power
> > >to protect hardware against over-current damage which in turn means
> > >WOL wouldn't work anymore. Even if WOL work with 1000Mbps link for
> > >all nfe(4) controllers, it would dissipate much more power.
> > >
> > >Because nfe(4) controllers are notorious for using various PHYs,
> > >it's hard to write a code to reliably establish 10/100Mbps link in
> > >driver. In addition, nfe(4) is known to be buggy in link state
> > >handling such that forced media selection didn't work well. I'll
> > >see what could be done in this week if I find spare time.
> > 
> > Hmm... Maybe just add a hint to the manpage that WOL is possible broken?
> 
> I think this may not be enough. Because it can damage your hardware
> under certain conditions if protection circuit was not there.
> 

Ok, I updated the patch; it now changes the link speed to 10/100Mbps
when shutdown/suspend is initiated. You can get the patch at the
following URL. Please give it a try and let me know whether it
really changes the link speed to 10/100Mbps. If it does not work as
expected, show me the dmesg output of your system.

http://people.freebsd.org/~yongari/nfe/nfe.wol.patch2

> > Nevertheless thanks for your work it's much appreciated :)
> > 
> > >>>o When you put your box into suspend mode, can you wake up your box
> > >>>with WOL magic packet?
> > >>
> > >>I'm sorry but I can't test that since none of those boxes supports
> > >>suspend:
> > >>
> > >>  % sysctl hw.acpi.suspend_state
> > >>hw.acpi.suspend_state: NONE
> > >>
> > >
> > >You can switch to suspend mode with "acpiconf -s1". If all goes
> > >well, driver would put the controller into suspend mode after
> > >reprogramming controller to accept WOL frames. After that, you can
> > >wakeup the box by sending a WOL magic packet.
> > 
> > Okay, It thought that S3 is required. Put the box into S1, waited some
> > minutes and send the magic packet. The video didn't resume but I was
> > able to login via SSH. So waking up by sending the WOL magic packet
> > works.
> > 
> 
> Thanks for testing. Probably you want to poke jkim@ to address
> video resume issue.

Re: [patch] WOL support for nfe(4)

2010-11-09 Thread Pyun YongHyeon

On Tue, Nov 09, 2010 at 10:01:36PM +0100, Yamagi Burmeister wrote:
> On Tue, 9 Nov 2010, Pyun YongHyeon wrote:
> 
> >>No, the link stays at 1000Mbps so the driver must manually switch back
> >>to 10/100Mbps.
> >>
> >
> >Hmm, this is real problem for WOL. Establishing 1000Mbps link to
> >accept WOL frames is really bad idea since it can draw more power
> >than 375mA. Consuming more power than 375mA is violation of
> >PCI specification and some system may completely shutdown the power
> >to protect hardware against over-current damage which in turn means
> >WOL wouldn't work anymore. Even if WOL work with 1000Mbps link for
> >all nfe(4) controllers, it would dissipate much more power.
> >
> >Because nfe(4) controllers are notorious for using various PHYs,
> >it's hard to write a code to reliably establish 10/100Mbps link in
> >driver. In addition, nfe(4) is known to be buggy in link state
> >handling such that forced media selection didn't work well. I'll
> >see what could be done in this week if I find spare time.
> 
> Hmm... Maybe just add a hint to the manpage that WOL is possible broken?

I think this may not be enough, because it can damage your hardware
under certain conditions if there is no protection circuit.

> Nevertheless thanks for your work it's much appreciated :)
> 
> >>>o When you put your box into suspend mode, can you wake up your box
> >>>with WOL magic packet?
> >>
> >>I'm sorry but I can't test that since none of those boxes supports
> >>suspend:
> >>
> >>  % sysctl hw.acpi.suspend_state
> >>hw.acpi.suspend_state: NONE
> >>
> >
> >You can switch to suspend mode with "acpiconf -s1". If all goes
> >well, driver would put the controller into suspend mode after
> >reprogramming controller to accept WOL frames. After that, you can
> >wakeup the box by sending a WOL magic packet.
> 
> Okay, It thought that S3 is required. Put the box into S1, waited some
> minutes and send the magic packet. The video didn't resume but I was
> able to login via SSH. So waking up by sending the WOL magic packet
> works.
> 

Thanks for testing. You probably want to poke jkim@ to address
the video resume issue.

Re: [patch] WOL support for nfe(4)

2010-11-09 Thread Pyun YongHyeon

On Tue, Nov 09, 2010 at 09:49:14AM +0100, Yamagi Burmeister wrote:
> 
> Thanks for your reply.
> 
> On Mon, 8 Nov 2010, Pyun YongHyeon wrote:
> 
> >Thanks for the patch. I attached slightly modified the code to
> >better match other WOL capable drivers in tree. Because data sheet
> >is not available I blindly made a patch based on your code. I have
> >a couple of questions which I can't verify it on real hardware(I
> >have no more access to the hardware).
> >
> >o If you established a gigabit link with link partner and shutdown
> > your box, does the established link automatically change to 10 or
> > 100Mbps? You can check it on your link partner. If your link
> > partner still reports it established 1000Mbps link, we have to
> > do other necessary work in driver(i.e. manually switching to
> > 10/100Mbps).
> 
> No, the link stays at 1000Mbps so the driver must manually switch back
> to 10/100Mbps.
> 

Hmm, this is a real problem for WOL. Establishing a 1000Mbps link to
accept WOL frames is a really bad idea since it can draw more than
375mA. Drawing more than 375mA violates the PCI specification, and
some systems may completely shut down the power to protect the
hardware against over-current damage, which in turn means WOL
wouldn't work anymore. Even if WOL works with a 1000Mbps link on
all nfe(4) controllers, it would dissipate much more power.

Because nfe(4) controllers are notorious for using various PHYs,
it's hard to write code that reliably establishes a 10/100Mbps link
in the driver. In addition, nfe(4) is known to be buggy in link
state handling, such that forced media selection didn't work well.
I'll see what can be done this week if I find spare time.

> >o When you put your box into suspend mode, can you wake up your box
> > with WOL magic packet?
> 
> I'm sorry but I can't test that since none of those boxes supports
> suspend:
> 
>   % sysctl hw.acpi.suspend_state
> hw.acpi.suspend_state: NONE
> 

You can switch to suspend mode with "acpiconf -s1". If all goes
well, the driver would put the controller into suspend mode after
reprogramming it to accept WOL frames. After that, you can
wake up the box by sending a WOL magic packet.

> >o When your system boots up with/without WOL magic packet, sending
> > WOL magic packets from other hosts can hang your box?
> 
> No they don't. No matter if the box was started by sending the WOL magic
> packet or by hand it survives all WOL packets I send to it.
> 

Ok, some controllers are known to hang the box if they receive WOL
frames before the controller is initialized.

> >o If you disabled WOL with ifconfig before system shutdown, can you
> > still wakeup your box with WOL magic packet?
> 
> No, I can't. WOL is disabled and the box must be started manually.
> 
> >o If you reprogram your station address with ifconfig(i.e. ifconfig
> > nfe0 ether xx:xx:xx:xx:xx:xx), can you still wakeup your box with
> > WOL magic packet?
> 
> Yes, with sending the WOL magic packet to the new station adress.
> Sending it to the original adress doesn't work.
> 
> >The patch I made didn't take into account management firmware so
> >if you use the patch with IMPI, IMPI wouldn't work. But I think
> >that's not an issue since all other parts of nfe(4) also ignores
> >management firmware at this moment.
> 
> I can't test that, because none of these machines has the IPMI option
> installed. Sorry.
> 

Ok, all other features except wakeup from suspend seem to work.
Thanks for testing!
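
For anyone repeating the tests above, a WOL magic packet is simply six
0xFF bytes followed by the target station address repeated 16 times.
The sketch below builds such a packet and sends it as a UDP broadcast.
The helper names, the example MAC address, and the choice of UDP port 9
are illustrative assumptions; they are not taken from the nfe(4) patch.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte magic packet for a MAC like 00:11:22:33:44:55."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 6-byte MAC address, got %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Synchronization stream (6 x 0xFF) followed by 16 copies of the MAC.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is only a common choice)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


if __name__ == "__main__":
    # Hypothetical address; use the station address currently programmed
    # into the interface, since a reprogrammed address replaces the
    # original one for wakeup purposes, as reported above.
    print(len(build_magic_packet("00:11:22:33:44:55")))
```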

Re: [patch] WOL support for nfe(4)

2010-11-08 Thread Pyun YongHyeon

On Fri, Nov 05, 2010 at 11:10:37AM +0100, Yamagi Burmeister wrote:
> Hi,
> 
> some time ago we migrated a lot of boxes from Linux to FreeBSD. Those
> machines have a "NVIDIA nForce4 CK804 MCP4" network adapter, supported
> by nfe(4). Even if nfe(4) at least tries to enable the WOL capability of
> the NIC it doesn't work and nfe(4) doesn't integrate with FreeBSDs (new)
> WOL framework. Since we are in need of WOL I spend some minutes to
> implement it the correct way.
> 
> Attached are two patches:
> - if_nfe_wol_8.1.diff against FreeBSD 8.1-RELEASE-p1, this one is used
>   on our servers.
> - if_nfe_wol_current.diff against -CURRENT r214831. This one is
>   _untested_! But it should work...
> 
> In case that the patches a stripped by mailman they can be found here:
> http://deponie.yamagi.org/freebsd/nfe/
> 
> This patch works reliable on our machines and nfe(4) runs without any
> problems with it. But nevertheless my skills in writting network drivers
> are somewhat limited therefor a review by somewhat with better knowledge
> of the WOL framework and maybe nfe(4) itself is highly anticipated.
> 

Thanks for the patch. I attached a slightly modified version of the
code to better match other WOL-capable drivers in the tree. Because
the data sheet is not available, I blindly made the patch based on
your code. I have a couple of questions which I can't verify on
real hardware (I no longer have access to the hardware).

o If you established a gigabit link with the link partner and shut
  down your box, does the established link automatically change to
  10 or 100Mbps? You can check it on your link partner. If your link
  partner still reports an established 1000Mbps link, we have to
  do other necessary work in the driver (i.e. manually switching to
  10/100Mbps).
o When you put your box into suspend mode, can you wake up your box
  with a WOL magic packet?
o When your system boots up with/without a WOL magic packet, can
  sending WOL magic packets from other hosts hang your box?
o If you disabled WOL with ifconfig before system shutdown, can you
  still wake up your box with a WOL magic packet?
o If you reprogram your station address with ifconfig (i.e. ifconfig
  nfe0 ether xx:xx:xx:xx:xx:xx), can you still wake up your box with
  a WOL magic packet?

The patch I made doesn't take management firmware into account, so
if you use the patch with IPMI, IPMI wouldn't work. But I think
that's not an issue since all other parts of nfe(4) also ignore
management firmware at the moment.
Index: sys/dev/nfe/if_nfe.c
===
--- sys/dev/nfe/if_nfe.c	(revision 214989)
+++ sys/dev/nfe/if_nfe.c	(working copy)
@@ -125,6 +125,7 @@
 static void nfe_sysctl_node(struct nfe_softc *);
 static void nfe_stats_clear(struct nfe_softc *);
 static void nfe_stats_update(struct nfe_softc *);
+static void nfe_set_wol(struct nfe_softc *);
 
 #ifdef NFE_DEBUG
 static int nfedebug = 0;
@@ -586,6 +587,9 @@
 		if ((ifp->if_capabilities & IFCAP_HWCSUM) != 0)
 			ifp->if_capabilities |= IFCAP_VLAN_HWCSUM;
 	}
+
+	if (pci_find_extcap(dev, PCIY_PMG, &reg) == 0)
+		ifp->if_capabilities |= IFCAP_WOL_MAGIC;
 	ifp->if_capenable = ifp->if_capabilities;
 
 	/*
@@ -752,6 +756,7 @@
 
 	NFE_LOCK(sc);
 	nfe_stop(sc->nfe_ifp);
+	nfe_set_wol(sc);
 	sc->nfe_suspended = 1;
 	NFE_UNLOCK(sc);
 
@@ -768,6 +773,7 @@
 	sc = device_get_softc(dev);
 
 	NFE_LOCK(sc);
+	nfe_power(sc);
 	ifp = sc->nfe_ifp;
 	if (ifp->if_flags & IFF_UP)
 		nfe_init_locked(sc);
@@ -1714,6 +1720,10 @@
 			}
 		}
 #endif /* DEVICE_POLLING */
+		if ((mask & IFCAP_WOL_MAGIC) != 0 &&
+		(ifp->if_capabilities & IFCAP_WOL_MAGIC) != 0)
+			ifp->if_capenable ^= IFCAP_WOL_MAGIC;
+
 		if ((sc->nfe_flags & NFE_HW_CSUM) != 0 &&
 		(mask & IFCAP_HWCSUM) != 0) {
 			ifp->if_capenable ^= IFCAP_HWCSUM;
@@ -2746,7 +2756,8 @@
 	NFE_WRITE(sc, NFE_STATUS, sc->mii_phyaddr << 24 | NFE_STATUS_MAGIC);
 
 	NFE_WRITE(sc, NFE_SETUP_R4, NFE_R4_MAGIC);
-	NFE_WRITE(sc, NFE_WOL_CTL, NFE_WOL_MAGIC);
+	/* Disable WOL. */
+	NFE_WRITE(sc, NFE_WOL_CTL, 0);
 
 	sc->rxtxctl &= ~NFE_RXTX_BIT2;
 	NFE_WRITE(sc, NFE_RXTX_CTL, sc->rxtxctl);
@@ -2917,18 +2928,8 @@
 static int
 nfe_shutdown(device_t dev)
 {
-	struct nfe_softc *sc;
-	struct ifnet *ifp;
 
-	sc = device_get_softc(dev);
-
-	NFE_LOCK(sc);
-	ifp = sc->nfe_ifp;
-	nfe_stop(ifp);
-	/* nfe_reset(sc); */
-	NFE_UNLOCK(sc);
-
-	return (0);
+	return (nfe_suspend(dev));
 }
 
 
@@ -3212,3 +3213,39 @@
 		stats->rx_broadcast += NFE_READ(sc, NFE_TX_BROADCAST);
 	}
 }
+
+static void
+nfe_set_wol(struct nfe_softc *sc)
+{
+	struct ifnet *ifp;
+	uint32_t wolctl;
+	int pmc;
+	uint16_t pmstat;
+
+	NFE_LOCK_ASSERT(sc);
+
+	if (pci_find_extcap(sc->nfe_dev, PCIY_PMG, &pmc) != 0)
+		return;
+	ifp = sc->nfe_ifp;
+	if ((ifp->if_capenable & IFCAP_WOL_MAGIC) != 0)
+		wolctl = NFE_WOL_MAGIC;
+	else
+		wolctl = 0;
+	NFE_WRITE(sc, NFE_WOL_CTL, wolctl);
+	if ((ifp->if_capenable & IFCAP_WOL_MAGIC) != 0) {
+		if ((sc->nfe_flags & NFE_PWR_MGMT) != 0)
+			NFE_WRITE(sc, NFE_PWR2_CTL,
+

Re: Strange problem with sk0

2010-11-06 Thread Pyun YongHyeon

On Fri, Oct 22, 2010 at 05:09:33PM -0400, Mikhail T. wrote:
>  Hello!
> 
> I have a rather bizarre problem with my on-board sk interface... It only 
> works, when tcpdump is running...
> 
> Seriously. It negotiates with the switch (1000baseT/full-duplex) just 
> fine, but, unless tcpdump has it open (and in "promiscuous" mode), no 
> traffic seems to go through. It would not respond to pings -- not even 
> from the switch itself, nothing.
> 
> But, as soon as I start tcpdump -- even if tcpdump never has anything to 
> output:
> 
>tcpdump -i sk0 -n src host 10.non.existent.IP
> 
> Traffic starts flowing just fine... Do I simply have flaky hardware? The 
> motherboard is old, and, for some reason, I need to "remind" sk0, what 
> its ethernet address upon reboote (it starts off with 00:00:00:00:00:00).
> 
> Any other explanations for what is happening? There are plenty of other 
> systems (computers, VoIP phone, two TVs) on this switch and all are 
> fine... I did try different ports on it -- same results. I also tried 
> forcing things down to 100/half-duplex -- no change...
> 
> Thanks! Yours,
> 

FYI: Fix committed to HEAD(r214898, r214899). Will MFC after 1
week.

Re: Hardlock with alc0 device

2010-10-29 Thread Pyun YongHyeon

On Fri, Oct 29, 2010 at 08:44:22PM -0400, Kris Moore wrote:
> On Fri, Oct 29, 2010 at 09:55:31AM -0700, Pyun YongHyeon wrote:
> > On Fri, Oct 29, 2010 at 06:15:16AM -0400, Kris Moore wrote:
> > > 
> > > I'm running into a rather interesting problem here on HEAD with a newer 
> > > Asus
> > > EEE PC and the "alc" network driver. The device works great when a
> > > cable is plugged in, no issues at all. However, if I unplug the ethernet
> > > and reboot then I get a hard-lock when it tries to bring up the device.
> > > 
> > > I disabled ifconfig_alc0="DHCP" in rc.conf, and now the system boots
> > > normally, but just for kicks I tried running "dhclient alc0" on it
> > > manually, and sure enough it resulted in another system lockup. (No kern 
> > > dump,
> > > doesn't even get that far)
> > > 
> > > Here's some information about the system / device, let me know if there
> > > is any other data / commands I should run and send over. 
> > > 
> > > 
> > > FreeBSD mininova 9.0-CURRENT FreeBSD 9.0-CURRENT #14: Sat Oct 23 13:11:00 
> > > PDT 2010
> > > 
> > > a...@pci0:1:0:0:class=0x02 card=0x838a1043 chip=0x10621969 
> > > rev=0xc0 hdr=0x00
> > > vendor = 'Attansic (Now owned by Atheros)'
> > > device = 'Atheros AR8132 PCI-E Fast Ethernet Controller (AR8132)'
> > > class  = network
> > > subclass   = ethernet
> > > 
> > > 
> > > alc0: flags=8802 metric 0 mtu 1500
> > > 
> > > options=c3198
> > > ether 20:cf:30:1e:b2:38
> > > media: Ethernet autoselect
> > > 
> > 
> > I was not able to reproduce it with my sample board, so I'm not
> > sure what register access could trigger the hang. Given that
> > there are some configuration changes in the BIOS for better power
> > saving (ASPM), it could be related to accessing the ALC_PM_CFG
> > register.
> > I also remember some users reported the controller couldn't
> > establish a link when the system booted without the UTP cable
> > plugged in. Not sure whether this is the same issue, as the sample
> > board does not show it.
> > 
> > Anyway, would you try the attached patch?
> 
> > Index: sys/dev/alc/if_alc.c
> > ===
> > --- sys/dev/alc/if_alc.c(revision 214514)
> > +++ sys/dev/alc/if_alc.c(working copy)
> > @@ -331,8 +331,8 @@
> > reg = CSR_READ_4(sc, ALC_MAC_CFG);
> > reg |= MAC_CFG_TX_ENB | MAC_CFG_RX_ENB;
> > CSR_WRITE_4(sc, ALC_MAC_CFG, reg);
> > +   alc_aspm(sc, IFM_SUBTYPE(mii->mii_media_active));
> > }
> > -   alc_aspm(sc, IFM_SUBTYPE(mii->mii_media_active));
> >  }
> >  
> >  static void
> 
> Well, so far with the attached patch, after a couple reboots, and trying to 
> use dhclient, it's still working fine.  Also after the initial dhclient times 
> out I get a link status of "status: no carrier" now, which didn't show up
> before, so that's good too.
> 

Thanks a lot for testing!

> Thanks for the quick fix, will you put this into HEAD soon?
> 

Patch committed(r214542).

Re: How to obtain place of low perfomance?

2010-10-29 Thread Pyun YongHyeon

On Fri, Oct 29, 2010 at 10:20:10AM +0300, ?? ?? wrote:
> Hi, Freebsd-net.
> 
> serv1# ifconfig nfe0
> nfe0: flags=8943 metric 0 mtu 1500
> options=10b
> ether 00:13:d4:ce:82:16
> inet 10.11.8.17 netmask 0xfc00 broadcast 10.11.11.255
> inet 10.11.8.15 netmask 0xfc00 broadcast 10.11.11.255
> media: Ethernet autoselect (1000baseTX )
> status: active
> serv1# ifconfig igb0
> igb0: flags=8843 metric 0 mtu 1500
> options=19b
> ether 00:1b:21:45:da:b8
> media: Ethernet autoselect (1000baseTX )
> status: active
> serv1# ifconfig vlan7
> vlan7: flags=8843 metric 0 mtu 1500
> options=3
> ether 00:1b:21:45:da:b8
> inet 10.11.15.15 netmask 0xff00 broadcast 10.11.15.255
> inet 10.11.7.1 netmask 0xff00 broadcast 10.11.7.255
> media: Ethernet autoselect (1000baseTX )
> status: active
> vlan: 7 parent interface: igb0
> 
> Doing a bandwidth test with iperf shows low performance on nfe0.
> 
> # iperf -c 10.11.8.17
> 
> Client connecting to 10.11.8.17, TCP port 5001
> TCP window size: 32.5 KByte (default)
> 
> [  3] local 10.11.8.16 port 63911 connected with 10.11.8.17 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.5 sec   124 MBytes  98.8 Mbits/sec
> # iperf -c 10.11.7.1
> 
> Client connecting to 10.11.7.1, TCP port 5001
> TCP window size: 32.5 KByte (default)
> 
> [  3] local 10.11.7.2 port 61422 connected with 10.11.7.1 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0-10.3 sec   800 MBytes   653 Mbits/sec
> 
> Even though it is integrated, I expected about 300-400 Mbit/s throughput.
> Is nfe0 really such a poor NIC?

nfe(4) controllers are not among the best controllers for server
environments, but generally they are not poor for desktop users. I
mean you should be able to saturate the link with bulk TCP/UDP
transfers.
Last time I tried iperf it was not reliable. Did you disable
threading in iperf? Also note that both the sender and the receiver
should be built with the same configuration options.

Re: Polling slows down bandwidth

2010-10-29 Thread Pyun YongHyeon

On Thu, Oct 28, 2010 at 11:21:11PM +0300, ?? ?? wrote:
> Hello,
> w/o polling:
> 
> 
> serv1# ifconfig nfe0
> nfe0: flags=8943 metric 0 mtu 1500
> options=10b
> ether 00:13:d4:ce:82:16
> inet 10.11.8.17 netmask 0xfc00 broadcast 10.11.11.255
> inet 10.11.15.15 netmask 0xff00 broadcast 10.11.15.255
> inet 10.11.8.15 netmask 0xfc00 broadcast 10.11.11.255
> media: Ethernet autoselect (1000baseTX )
> status: active
> 
> serv2# ifconfig re0
> re0: flags=8843 metric 0 mtu 1500
> 
> options=389b
> ether 00:1c:c0:c8:5a:4e
> inet 192.168.255.254 netmask 0x broadcast 192.168.255.254
> media: Ethernet autoselect (1000baseTX )
> status: active
> 
> 
> serv1# systat -v
> 2 usersLoad  0.38  0.49  0.38  Oct 28 23:10
> 
> Mem:KBREALVIRTUAL   VN PAGER   SWAP PAGER
> Tot   Share  TotShareFree   in   out in   out
> Act  324452   20196   49040025112  938576  count
> All  346596   21060 1074286k28104  pages
> Proc:Interrupts
>   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Fltcow   29953 total
>   1  47   71k   18 5022  27k 20291zfodatkbd0 1
>   ozfod 7 ata0 irq14
> 62.4%Sys  18.7%Intr  0.4%User  0.0%Nice 18.4%Idle%ozfod 27944 nfe0 irq23
> |||||||||||   daefr  2001 cpu0: time
> ===++ prcfr   igb0 256
>  1 dtbuf6 totfr 1 igb0 257
> Namei Name-cache   Dir-cache10 desvn  react   igb0 258
>Callshits   %hits   % 84106 numvn  pdwak
>5   5 100 24824 frevn  pdpgs
>   intrn
> Disks   ad0373808 wire
> KB/t  16.00292088 act
> tps   7417568 inact
> MB/s   0.10   200 cache
> %busy 0938376 free
> 
> iperf result:
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.3 sec   450 MBytes   368 Mbits/sec
> 
> 
> after enabling POLLING:
> serv1# ifconfig nfe0 polling
> 2 usersLoad  0.32  0.39  0.35  Oct 28 23:13
> 
> Mem:KBREALVIRTUAL   VN PAGER   SWAP PAGER
> Tot   Share  TotShareFree   in   out in   out
> Act  324464   20196   49065625112  938428  count
> All  346624   21060 1074286k28104  pages
> Proc:Interrupts
>   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Fltcow2006 total
>   1  47   28k   19 26916 20362zfodatkbd0 1
>   ozfod 4 ata0 irq14
> 24.7%Sys  18.6%Intr  0.7%User  0.0%Nice 55.9%Idle%ozfod   nfe0 irq23
> |||||||||||   daefr  2001 cpu0: time
> ++prcfr   igb0 256
> 10 dtbuf2 totfr 1 igb0 257
> Namei Name-cache   Dir-cache10 desvn  react   igb0 258
>Callshits   %hits   % 84106 numvn  pdwak
>   20  20 100 24824 frevn  pdpgs
>   intrn
> Disks   ad0373944 wire
> KB/t  16.00292104 act
> tps   4417564 inact
> MB/s   0.07   200 cache
> %busy 0938228 free
> 
> I get bad results (((
> [ ID] Interval       Transfer     Bandwidth
> [  4]  0.0-10.3 sec   180 MBytes   147 Mbits/sec
> 

nfe(4) controllers are among the rare gigabit controllers that lack
an efficient interrupt moderation mechanism, so it's normal to see
a high number of interrupts under load.
hz controls how frequently polling(4) checks the controller's RX/TX
queues, so you may have to increase hz to get reasonable performance
under high network load with polling(4). An important performance
factor for a NIC is how many frames have to be processed for a given
network load pattern: the more frames you have to process, the
higher hz needs to be with polling(4).
I don't know which nfe(4) controller you have, but it seems to be a
somewhat low-end model because MSI/MSI-X is not used at all.
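
To put rough numbers on the hz remark, here is a minimal
back-of-the-envelope sketch. It is not taken from nfe(4) or polling(4):
the function names are made up for illustration, and it assumes every
frame has the same size and that the queues are checked about hz times
per second.

```c
#include <stdint.h>

/*
 * Frames per second for a given throughput, assuming every frame is
 * frame_bytes long (an illustrative simplification: real traffic mixes
 * sizes and carries extra per-frame overhead on the wire).
 */
uint64_t
frames_per_second(uint64_t bits_per_second, uint32_t frame_bytes)
{
	return (bits_per_second / (8ULL * frame_bytes));
}

/*
 * Frames that have to be drained on each poll, rounded up, if the
 * queues are checked roughly hz times per second.
 */
uint64_t
frames_per_poll(uint64_t fps, uint32_t hz)
{
	return ((fps + hz - 1) / hz);
}
```

With these assumptions, 368 Mbit/s of 1500-byte frames is about 30666
frames per second, which is roughly 31 frames per poll at hz=1000 and
16 at hz=2000. Smaller frames at the same bit rate raise the count
accordingly.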

Re: Hardlock with alc0 device

2010-10-29 Thread Pyun YongHyeon

On Fri, Oct 29, 2010 at 06:15:16AM -0400, Kris Moore wrote:
> 
> I'm running into a rather interesting problem here on HEAD with a newer Asus
> EEE PC and the "alc" network driver. The device works great when a
> cable is plugged in, no issues at all. However, if I unplug the ethernet
> and reboot then I get a hard-lock when it tries to bring up the device.
> 
> I disabled ifconfig_alc0="DHCP" in rc.conf, and now the system boots
> normally, but just for kicks I tried running "dhclient alc0" on it
> manually, and sure enough it resulted in another system lockup. (No kern dump,
> doesn't even get that far)
> 
> Here's some information about the system / device, let me know if there
> is any other data / commands I should run and send over. 
> 
> 
> FreeBSD mininova 9.0-CURRENT FreeBSD 9.0-CURRENT #14: Sat Oct 23 13:11:00 PDT 
> 2010
> 
> a...@pci0:1:0:0:class=0x02 card=0x838a1043 chip=0x10621969 
> rev=0xc0 hdr=0x00
> vendor = 'Attansic (Now owned by Atheros)'
> device = 'Atheros AR8132 PCI-E Fast Ethernet Controller (AR8132)'
> class  = network
> subclass   = ethernet
> 
> 
> alc0: flags=8802 metric 0 mtu 1500
> 
> options=c3198
> ether 20:cf:30:1e:b2:38
> media: Ethernet autoselect
> 

I was not able to reproduce it with a sample board, so I'm not sure
which register access could trigger the hang. Given that there are
some BIOS configuration changes for better power saving (ASPM), it
could be related to accessing the ALC_PM_CFG register.
I also remember a user reporting that the controller couldn't
establish a link when the system booted without a UTP cable plugged
in. I'm not sure whether this is the same issue, as the sample board
does not show it.

Anyway, would you try the attached patch?
Index: sys/dev/alc/if_alc.c
===================================================================
--- sys/dev/alc/if_alc.c	(revision 214514)
+++ sys/dev/alc/if_alc.c	(working copy)
@@ -331,8 +331,8 @@
 		reg = CSR_READ_4(sc, ALC_MAC_CFG);
 		reg |= MAC_CFG_TX_ENB | MAC_CFG_RX_ENB;
 		CSR_WRITE_4(sc, ALC_MAC_CFG, reg);
+		alc_aspm(sc, IFM_SUBTYPE(mii->mii_media_active));
 	}
-	alc_aspm(sc, IFM_SUBTYPE(mii->mii_media_active));
 }
 
 static void

Re: driver for broadcom 57711E

2010-10-26 Thread Pyun YongHyeon

On Tue, Oct 26, 2010 at 11:21:21AM -0700, Pyun YongHyeon wrote:
> On Tue, Oct 26, 2010 at 07:30:19PM +0200, Christoph Weber-Fahr wrote:
> > Hello,
> > 
> > On 07.10.2010 09:07, Дмитрий Александров wrote:
> > >  Hi! I really need broadcom 57711E driver for my FreeBSD 8.0 i386.
> > > Does anybody have this driver already?
> > > 
> > > P.S. Also wanted David Christensen who had previously tried to write a 
> > > driver.
> > 
> > LOL. Welcome to the club :-)
> > 
> > ( Though I'm afraid David, or Broadcom, respectively, are incommunicado
> > on this particular issue. )
> > 
> > Let us know if you hear something different.
> > 
> 
> Actually David is very responsive, but I think he is somewhat busy
> with other work. For example, I have been working with David to
> support a next-generation Broadcom controller that should be widely
> available next year.
> Personally I haven't had a chance to try bnx(4), which was written
> by David to support BCM57710/57711/57711E controllers, so I don't
> know how well it works. As David said a long time ago, one of the
> issues was testing the wide variety of PHYs used with the controller,
> which made it hard to release the driver.
> David, would you clarify the current status of bnx(4)?

Actually the driver name was bxe(4).

Re: driver for broadcom 57711E

2010-10-26 Thread Pyun YongHyeon

On Tue, Oct 26, 2010 at 07:30:19PM +0200, Christoph Weber-Fahr wrote:
> Hello,
> 
> On 07.10.2010 09:07, Дмитрий Александров wrote:
> >  Hi! I really need broadcom 57711E driver for my FreeBSD 8.0 i386.
> > Does anybody have this driver already?
> > 
> > P.S. Also wanted David Christensen who had previously tried to write a 
> > driver.
> 
> LOL. Welcome to the club :-)
> 
> ( Though I'm afraid David, or Broadcom, respectively, are incommunicado
> on this particular issue. )
> 
> Let us know if you hear something different.
> 

Actually David is very responsive, but I think he is somewhat busy
with other work. For example, I have been working with David to
support a next-generation Broadcom controller that should be widely
available next year.
Personally I haven't had a chance to try bnx(4), which was written
by David to support BCM57710/57711/57711E controllers, so I don't
know how well it works. As David said a long time ago, one of the
issues was testing the wide variety of PHYs used with the controller,
which made it hard to release the driver.
David, would you clarify the current status of bnx(4)?

Re: Strange problem with sk0

2010-10-22 Thread Pyun YongHyeon

On Fri, Oct 22, 2010 at 05:09:33PM -0400, Mikhail T. wrote:
>  Hello!
> 
> I have a rather bizarre problem with my on-board sk interface... It only 
> works, when tcpdump is running...
> 
> Seriously. It negotiates with the switch (1000baseT/full-duplex) just 
> fine, but, unless tcpdump has it open (and in "promiscuous" mode), no 
> traffic seems to go through. It would not respond to pings -- not even 
> from the switch itself, nothing.
> 
> But, as soon as I start tcpdump -- even if tcpdump never has anything to 
> output:
> 
>tcpdump -i sk0 -n src host 10.non.existent.IP
> 
> Traffic starts flowing just fine... Do I simply have flaky hardware? The 
> motherboard is old, and, for some reason, I need to "remind" sk0 what 
> its ethernet address is upon reboot (it starts off with 00:00:00:00:00:00).
> 

The all-zero station address is a clear indication of the source of
the problem. Normally ethernet controllers drop frames not destined
for the station address unless promiscuous mode is activated.
tcpdump is one of the programs that activates promiscuous mode.
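
The filtering behaviour above can be sketched roughly as follows. This
is a simplified model, not code from sk(4): the function name and
MAC_LEN are hypothetical, and multicast filtering is left out.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN	6

/*
 * Returns 1 if a frame addressed to dst would be passed up, 0 if the
 * controller would drop it. Simplified model: broadcast and exact
 * unicast match only, no multicast filter.
 */
int
rx_filter_accept(const uint8_t dst[MAC_LEN], const uint8_t station[MAC_LEN],
    int promisc)
{
	static const uint8_t bcast[MAC_LEN] =
	    { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	if (promisc)
		return (1);	/* promiscuous mode bypasses the filter */
	if (memcmp(dst, bcast, MAC_LEN) == 0)
		return (1);
	return (memcmp(dst, station, MAC_LEN) == 0);
}
```

In this model, if the filter holds 00:00:00:00:00:00, a unicast frame
sent to the address the rest of the network uses for the box is
dropped unless promisc is set, which would match the "only works while
tcpdump is running" symptom.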

To narrow down the issue, show me the output of both dmesg and
pciconf -lvcb.

> Any other explanations for what is happening? There are plenty of other 
> systems (computers, VoIP phone, two TVs) on this switch and all are 
> fine... I did try different ports on it -- same results. I also tried 
> forcing things down to 100/half-duplex -- no change...
> 

It seems sk(4) failed to extract the station address from the
controller, so I need to find out why that happens on your box.

> Thanks! Yours,
> 
>-mi
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: NFE adapter 'hangs'

2010-10-15 Thread Pyun YongHyeon

On Fri, Oct 15, 2010 at 01:25:08PM +0100, Melissa Jenkins wrote:
> 
> On 4 Sep 2010, at 01:53, Pyun YongHyeon wrote:
> 
> > On Fri, Sep 03, 2010 at 07:59:26AM +0100, Melissa Jenkins wrote:
> >> 
> >> Thank you for your very quick response :)
> >> 
> > 
> > [...]
> > 
> >>> Also I'd like to know whether both RX and TX are dead or only one
> >>> RX/TX path is hung. Can you see incoming traffic with tcpdump when
> >>> you think the controller is stuck?
> >> 
> >> Yes, though not very much. The traffic to 4800 is every second so you can 
> >> see in the following trace when it stops
> >> 
> >> 07:10:42.287163 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
> >> 07:10:42.911995
> >> 07:10:43.112073 STP 802.1d, Config, Flags [Topology change], bridge-id 
> >> 8000.c4:7d:4f:a9:ac:30.8008, length 43
> >> 07:10:43.148659 IP 192.168.1.203.57026 > 192.168.1.255.4800: UDP, length 60
> >> 07:10:43.148684 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 
> >> 192.168.1.203.57026 > 192.168.1.129.4800: UDP, length 60
> >> 07:10:43.148689 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 
> >> 192.168.1.203.57026 > 192.168.1.1.4800: UDP, length 60
> >> 07:10:43.148918 IP 192.168.1.213.40677 > 192.168.1.255.4800: UDP, length 48
> > 
> > [...]
> > 
> >> a bit later on, still broken, a slight odd message:
> >> 07:11:43.079720 IP 172.31.1.129 > 172.31.1.213: GREv0, length 52: IP 
> >> 192.168.1.129.60446 > 192.168.1.213.179:  tcp 12 [bad hdr length 16 - too 
> >> short, < 20]
> >> 07:11:44.210794 IP 172.31.1.129 > 172.31.1.203: GREv0, length 84: IP 
> >> 192.168.1.129.64744 > 192.168.1.203.4800: UDP, length 52
> >> 07:11:44.210831 IP 172.31.1.129 > 172.31.1.213: GREv0, length 84: IP 
> >> 192.168.1.129.64744 > 192.168.1.213.4800: UDP, length 52
> >> 
> >> Now this really is odd, I don't recognise either of those MAC addresses, 
> >> though the SQL shown is used on this machine (
> >> 07:12:13.054393 45:43:54:20:41:63 > 00:00:03:53:45:4c, ethertype Unknown 
> >> (0x6374), length 60:
> >>0x0000:  556e 6971 7565 4964 2046 524f 4d20 7261  UniqueId.FROM.ra
> >>0x0010:  6461 6363 7420 2057 4845 5245 2043 616c  dacct..WHERE.Cal
> >>0x0020:  6c69 6e67 5374 6174 696f 6e49 6420   lingStationId.
> > 
> > Hmm, it seems you're using a really complex setup. It's very hard to
> > narrow down the culprit in such an environment. Could you set up a
> > simple network configuration that reproduces the issue? One
> > possible cause is that wrong (garbled) data is being passed up to
> > the upper stack. But I have no idea why you see GRE packets with a
> > truncated TCP header (172.31.1.129 > 172.31.1.213).
> > How about disabling TX/RX checksum offloading as well as TSO?
> > 
> > [...]
> > 
> >> 
> >> I then restarted the interface (nfe down/up, route restart)
> >> 
> >> From dmesg at the time (slight obfuscated)
> >> Sep  3 07:10:19 manch2 bgpd[89612]: neighbor XX: received notification: 
> >> HoldTimer expired, unknown subcode 0
> >> Sep  3 07:10:49 manch2 bgpd[89612]: neighbor XX connect: Host is down
> >> # at this point I took the interface down & up and reloaded the routing 
> >> tables
> >> Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
> >> Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
> >> Sep  3 07:12:07 manch2 kernel: nfe0: link state changed to DOWN
> >> Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
> >> Sep  3 07:12:11 manch2 kernel: nfe0: link state changed to UP   
> >> Sep  3 07:12:11 manch2 kernel: carp0: link state changed to DOWN
> >> Sep  3 07:12:14 manch2 kernel: carp0: link state changed to UP
> > 
> > Hmm, it does not look right, carp0 showed link DOWN message four
> > times in a row.
> > By the way, are you using IPMI on MCP55? nfe(4) is not ready to
> > handle MAC operation with IPMI.
> 
> 
> Turning off tx & rx checksum offloading seems to have resolved the problem:
> 
> ifconfig nfe0 -txcsum -rxcsum
> 
> Seems to have stopped both the corruption and the interface hanging.  I ran 
> it for about 16 hours on the FreeBSD 8 box.  It also appears to have fixed 
> the problem on my FreeBSD 7 machine as well.  
> 

Hmm, could you try the patch at the following URL?
http://people.freebsd.org/~yongari/nfe/nfe.mcp55.tx

Re: bge watchdog timeout errors FreeBSD 7.3

2010-10-06 Thread Pyun YongHyeon

On Wed, Oct 06, 2010 at 05:45:08PM +0100, a.sm...@ukgrid.net wrote:
> Hi,
> 
>   sorry not to have replied sooner. I've been trying to get the end
> user to confirm whether he has any issues with the server as it is. He
> still hasn't replied :(
> I think, though, it's likely I will leave the server as is for the time
> being, as I have no information that it isn't working correctly (other
> than continued watchdog messages in the logs), and wait for a more
> mature bge driver to be tested and released.
> 

Ok, there might be a couple of edge cases not handled in bge(4). I
believe things will improve over time, but that depends on user
feedback and testing.
Anyway, thanks for reporting.

Re: bge watchdog timeout errors FreeBSD 7.3

2010-09-28 Thread Pyun YongHyeon

On Tue, Sep 28, 2010 at 11:08:24PM +0100, a.sm...@ukgrid.net wrote:
> Quoting Pyun YongHyeon :
> 
> > However don't apply the patch to production box.
> >
> 
> Hi,
> 
>   actually the only server of this type is a production box, it was  
> originally running FreeBSD 7.2 without issue but was upgraded when 7.2  
> went EOL.
> What would you recommend? I guess I can wait for this to be tested by  

I don't think the patch would cause severe problems, but it has not
been heavily tested under various network loads, which is the main
reason I cautioned against applying it on a production box. If
you're prepared to go back to the old working kernel and a small
amount of downtime is tolerable, you may go that route.

> someone else as it doesn't seem to be causing any severe issues. The  

As I said, the patch requires another patch that handles shared
interrupts with tagged status. I also have an initial patch for
that, but it needs more polishing and cleanup. The lack of a
controller that has this issue also makes it hard to write the
patch. Your controller has no such issue, so I wanted to know
whether you're seeing a known hardware erratum or not.

> server in question is running wordpress-mu and I havent had any  
> complaints from end users about this,
> 
> thanks Andy.
> 
> 
> 
> 

Re: bge watchdog timeout errors FreeBSD 7.3

2010-09-28 Thread Pyun YongHyeon

On Tue, Sep 28, 2010 at 01:24:45PM +0100, a.sm...@ukgrid.net wrote:
> Quoting Pyun YongHyeon :
> 
> >Oops, sorry. I forgot one more chunk. You need to apply this one in
> >addition to two patches.
> >http://svn.freebsd.org/viewvc/base/stable/7/sys/dev/bge/if_bgereg.h?r1=202861&r2=208995&view=patch
> >
> 
> Hi,
> 
>   Ok I have installed the patches, and rebuilt the kernel.  
> Unfortunately the errors persist,
> 
> 
> Sep 28 12:27:58 vcomm kernel: bge0: watchdog timeout -- resetting
> Sep 28 12:27:58 vcomm kernel: bge0: link state changed to DOWN
> Sep 28 12:28:00 vcomm kernel: bge0: link state changed to UP
> 
> Although prior to the installation of the patch I tried to copy some  
> backup files off the server via scp. Copying a large file ~2GB caused  
> the network connection to drop and the copy to fail. Testing after  
> applying the patch shows that this is now improved, I have ran a few  
> copies without any problems...
> 
> Where does that leave things?
> 

Ok, thanks for testing. It seems you have another issue which is not
correctly handled in bge(4). I'm not sure you're actually seeing a
controller erratum, but could you try the patch at the following URL?
http://freefall.freebsd.org/~yongari/bge/bge.7.3R.post.diff

The patch includes all the patches I suggested, so please back out
the previous patches before applying it. The patch was written to
get better RX performance under high network load, and it also
includes a fix for a known hardware erratum. But it's highly
experimental and it's not for non-MSI bge(4) controllers, because
the patch may trigger other locking issues due to the greatly
increased rate of RX BD updates in firmware for controllers that
use a shared interrupt. It seems your controller uses MSI, so you
don't have to worry about that issue. However, don't apply the
patch to a production box.

Re: bge watchdog timeout errors FreeBSD 7.3

2010-09-27 Thread Pyun YongHyeon

On Mon, Sep 27, 2010 at 12:27:13PM +0100, a.sm...@ukgrid.net wrote:
> Quoting Pyun YongHyeon :
> 
> >
> >Order wouldn't be important but you have to apply both patches.
> >
> 
> Hi,
> 
>   After successfully applying the patches I get this error when doing a make:
> 

Oops, sorry. I forgot one more chunk. You need to apply this one in
addition to the two patches.
http://svn.freebsd.org/viewvc/base/stable/7/sys/dev/bge/if_bgereg.h?r1=202861&r2=208995&view=patch

> # make
> Warning: Object directory not changed from original /usr/src/sys/modules/bge
> @ -> /usr/src/sys
> machine -> /usr/src/sys/i386/include
> awk -f @/tools/makeobjops.awk @/dev/mii/miibus_if.m -h
> awk -f @/tools/miidevs2h.awk @/dev/mii/miidevs
> awk -f @/tools/makeobjops.awk @/kern/device_if.m -h
> awk -f @/tools/makeobjops.awk @/kern/bus_if.m -h
> awk -f @/tools/makeobjops.awk @/dev/pci/pci_if.m -h
> cc -O2 -fno-strict-aliasing -pipe  -D_KERNEL -DKLD_MODULE -std=c99  
> -nostdinc   -I. -I@ -I@/contrib/altq -finline-limit=8000 --param  
> inline-unit-growth=100 --param large-function-growth=1000 -fno-common   
> -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx  
> -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Wall  
> -Wredundant-decls -Wnested-externs -Wstrict-prototypes   
> -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef  
> -Wno-pointer-sign -fformat-extensions -c  
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c: In function 
> 'bge_newbuf_std':
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:954: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_std_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c: In function  
> 'bge_newbuf_jumbo':
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1012: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1013: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1014: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1015: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1029: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1034: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1039: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:1044: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c: In function  
> 'bge_rxreuse_std':
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:3271: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_std_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c: In function  
> 'bge_rxreuse_jumbo':
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:3283: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:3284: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:3285: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> /usr/src/sys/modules/bge/../../dev/bge/if_bge.c:3286: error: 'struct  
> bge_chain_data' has no member named 'bge_rx_jumbo_seglen'
> *** Error code 1
> 
> Stop in /usr/src/sys/modules/bge.
> 
> As mentioned, this is applying the two patches you provided to the  
> source on my system running 7.3-RELEASE-p1.
> Any ideas?
> 
> thanks Andy.
> 
> 
> 
> 

Re: bge watchdog timeout errors FreeBSD 7.3

2010-09-24 Thread Pyun YongHyeon

On Fri, Sep 24, 2010 at 12:39:38PM +0100, a.sm...@ukgrid.net wrote:
> Quoting Pyun YongHyeon :
> 
> >Please apply the patches at the following URLs and let me know how it goes.
> >
> >http://svn.freebsd.org/viewvc/base/stable/7/sys/dev/bge/if_bge.c?r1=207862&r2=208995&view=patch
> >http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=212302&r2=212755&view=patch
> >
> 
> Thanks for the reply!
> So I have to apply both of these patches in the order you posted them?
> 

Order wouldn't be important but you have to apply both patches.

> thanks Andy.
> 
> 
> 
> 

Re: bge watchdog timeout errors FreeBSD 7.3

2010-09-23 Thread Pyun YongHyeon

On Thu, Sep 23, 2010 at 03:40:54PM +0100, a.sm...@ukgrid.net wrote:
> Hi,
> 
>   we are seeing these errors repeatedly on a new Dell R300 server:
> 
> Sep 23 15:06:29 vcomm kernel: bge0: watchdog timeout -- resetting
> Sep 23 15:06:29 vcomm kernel: bge0: link state changed to DOWN
> Sep 23 15:06:31 vcomm kernel: bge0: link state changed to UP
> 
> Server OS is:
> 
> FreeBSD vcomm 7.3-RELEASE-p1 FreeBSD 7.3-RELEASE-p1 #0: Wed May 26  
> 04:29:05 UTC 2010  
> r...@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386
> 
> We have so far, changed the cable, changed the network port, and  
> lastly upgraded the server firmware (which included some update to the  
> NIC). But the issue continues.
> 
> Info on the hardware from dmesg:
> 
> bge0:  mem  
> 0xdfdf-0xdfdf irq 16 at device 0.0 on pci1
> miibus0:  on bge0
> 
> No shared interrupts:
> 
> # vmstat -i
> interrupt                          total       rate
> irq16: mpt0                     21776213         30
> irq21: uhci0 uhci2+                  290          0
> irq23: atapci0                        58          0
> cpu0: timer                   1438708756       2006
> irq256: bge0                    50544776         70
> cpu1: timer                   1438700714       2006
> Total                         2949730807       4113
> 
> Network switch is Cisco, I can get the exact model if required.
> Is this a bug, hardware issue? Should I be worried?
> 

Please apply the patches at the following URLs and let me know how it goes.

http://svn.freebsd.org/viewvc/base/stable/7/sys/dev/bge/if_bge.c?r1=207862&r2=208995&view=patch
http://svn.freebsd.org/viewvc/base/head/sys/dev/bge/if_bge.c?r1=212302&r2=212755&view=patch

Re: bce(4) un hiding adapter info

2010-09-23 Thread Pyun YongHyeon

On Thu, Sep 23, 2010 at 01:48:13PM -0700, David Christensen wrote:
> > > What I'd really like to do is revamp the debug code so that it
> > > can be enabled/disabled on the fly rather than requiring that
> > > the driver be compiled.  Adding some performance stuff would
> > 
> > Couldn't it be implemented with sysctl? Users may set a variable
> > something like dev.bce.0.diag_bitmap to activate debugging code.
> > But it seems that requires a lot of code changes.
> 
> I already have an extensive amount of conditionally compiled
> debug code in the driver which uses a bitmap to adjust the 
> debug spew.  For personal testing I've also added a sysctl
> to allow that value to be modified at runtime.  My concerns
> are from a security standpoint (some of the debug code allows
> direct register access), a code size perspective (adds a lot

Normal users can't change a sysctl variable, so that information
would only be available to root (here I assume the feature is off by
default).

> of code most people don't use), and the performance penalty

I think the code size wouldn't be large compared to the firmware
code. If we switch to firmware(9), I guess we can reduce the code
size a lot, since we wouldn't have to carry that image after
downloading the firmware.

> (lots of bitmap checking even when all debug spew is turned
> off).  
> 

That's correct. If that checking is not in the fast path, I guess
it wouldn't affect performance much.
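
For reference, a runtime-switchable debug bitmap could look roughly
like the sketch below. This is not bce(4) code: the macro, the mask
bits and the counter are hypothetical, and a real driver would expose
the mask through a sysctl instead of a plain global.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug classes, one bit each. */
#define DBG_RX	0x01u
#define DBG_TX	0x02u

uint32_t debug_mask = 0;	/* stand-in for a root-only sysctl value */
unsigned debug_emitted = 0;	/* counts messages, for illustration only */

/* Costs one mask test per call site when the class is disabled. */
#define DBG_PRINT(bit, ...) do {				\
	if (debug_mask & (bit)) {				\
		debug_emitted++;				\
		fprintf(stderr, __VA_ARGS__);			\
	}							\
} while (0)

/* Toy RX loop that logs every frame when DBG_RX is set. */
unsigned
rx_frames(unsigned n)
{
	unsigned i;

	for (i = 0; i < n; i++)
		DBG_PRINT(DBG_RX, "rx frame %u\n", i);
	return (n);
}
```

The per-frame cost with everything disabled is the single mask test,
which is the overhead being discussed for the fast path.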

> Probably worth the time to pull the most useful stuff into
> the production driver and leave the more obscure code in the
> debug only build.  If there are no complaints I'll add that
> to my task list.
> 
> Dave
> 

Re: bce(4) - com_no_buffers (Again)

2010-09-23 Thread Pyun YongHyeon

On Thu, Sep 23, 2010 at 10:05:33AM -0500, Tom Judge wrote:
> On 09/13/2010 03:53 PM, Pyun YongHyeon wrote:
> > On Mon, Sep 13, 2010 at 03:38:41PM -0500, Tom Judge wrote:
> >   
> >> On 09/13/2010 02:33 PM, Pyun YongHyeon wrote:
> >> 
> >>> On Mon, Sep 13, 2010 at 02:07:58PM -0500, Tom Judge wrote:
> >>>   
> >>>   
> >>>>>> Without BCE_JUMBO_HDRSPLIT we see no errors.  With it we see a
> >>>>>> number of errors, however the rate seems to be reduced compared
> >>>>>> to the previous version of the driver.
> >>>>>> 
> >>>>> It seems there are issues in header splitting and it was disabled
> >>>>> by default. Header splitting reduces packet processing overhead in
> >>>>> upper layer so it's normal to see better performance with header
> >>>>> splitting.
> >>>>>   
> >>>> The reason that we have had header splitting enabled in the past is that
> >>>> historically there have been issues with memory fragmentation when using
> >>>> 8k jumbo frames (resulting in 9k mbuf's).
> >>>> 
> >>> Yes, if you use jumbo frames, header splitting would help to reduce
> >>> memory fragmentation as header splitting wouldn't allocate jumbo
> >>> clusters.
> >>>
> >>>   
> >> Under testing I have yet to see a memory fragmentation issue with this
> >> driver.  I will follow up if/when I find a problem with this again.
> >>
> >> 
> So here we are again.  The system is locking up again because of 9k mbuf
> allocation failures.
> 
> t...@pidge '14:12:25' '~'
> > $ netstat -m
> 514/4781/5295 mbufs in use (current/cache/total)
> 0/2708/2708/25600 mbuf clusters in use (current/cache/total/max)
> 0/1750 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/2904/2904/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 513/3274/3787/6400 9k jumbo clusters in use (current/cache/total/max)

The number of 9k clusters didn't reach the limit.

> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 4745K/47693K/52438K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/2692655/0 requests for jumbo clusters denied (4k/9k/16k)

I see a large denied count for 9k jumbo clusters, but that could be
normal under high network load. It should not lock up the
controller, though. Note that under these conditions (cluster
allocation failure) the driver drops incoming frames, so received
frames are never passed to the upper stack. The end result can look
like a lockup because the upper stack receives no frames. I think
you can check the MAC statistics to see whether the driver is still
running or not.
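
To tell a steady stream of allocation failures from a one-time burst,
the denied counters can be sampled twice and compared. Below is a
minimal sketch in C, assuming the "netstat -m" line has exactly the
shape shown above. The names parse_jumbo_denied(), nine_k_still_failing()
and struct jumbo_denied are made up for illustration and are not part
of any FreeBSD tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Denied-allocation counters taken from one line of "netstat -m". */
struct jumbo_denied {
	unsigned long long d4k, d9k, d16k;
};

/*
 * Parse a line such as
 *   "0/2692655/0 requests for jumbo clusters denied (4k/9k/16k)".
 * Returns 1 on success, 0 if the line has a different shape.
 */
int
parse_jumbo_denied(const char *line, struct jumbo_denied *out)
{
	int end = 0;

	assert(line != NULL && out != NULL);
	/* %n is only stored if the literal text before it matched. */
	if (sscanf(line, "%llu/%llu/%llu requests for jumbo clusters denied%n",
	    &out->d4k, &out->d9k, &out->d16k, &end) != 3)
		return (0);
	return (end > 0);
}

/* True if 9k allocations were still being denied between two samples. */
int
nine_k_still_failing(const struct jumbo_denied *before,
    const struct jumbo_denied *after)
{
	return (after->d9k > before->d9k);
}
```

If the 9k counter keeps growing while the interface looks dead, the
symptom is more likely dropped frames than a wedged controller.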

> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
> 
> 
> >>>> I have a kernel with the following configuration in testing right now:
> >>>>
> >>>> * Flow control enabled.
> >>>> * Jumbo header splitting turned off.
> >>>>
> >>>>
> >>>> Is there any way that we can fix flow control with jumbo header
> >>>> splitting turned on?
> >>>>
> >>>> 
> >>>> 
> >>> Flow control has nothing to do with header splitting(i.e. flow
> >>> control is always enabled). 
> >>>
> >>>   
> >>>   
> >> Sorry let me rephrase that:
> >>
> >> Is there a way to fix the RX buffer shortage issues (when header
> >> splitting is turned on) so that they are guarded by flow control.  Maybe
> >> change the low watermark for flow control when its enabled?
> >>
> >> 
> > I'm not sure how much it would help but try changing RX low
> > watermark. Default value is 32 which seems to be reasonable value.
> > But it's only for 5709/5716 controllers and Linux seems to use
> > different default value.
> >   
> These are: NetXtreme II BCM5709 Gigabit Ethernet
> 
> So my next task is to turn the watermark related defines into sysctls
> and turn on header splitting so that I can try to tune them without
> having to reboot.
> 
> 
> 
> My next question is, is it possible to increase the size of the RX ring
> without switching to RSS?
> 

Yes, but I doubt it would help in this case, as you seem to be
suffering from 9K jumbo frame allocation failures.

Re: Changing link status in bge driver

2010-09-23 Thread Pyun YongHyeon

On Thu, Sep 23, 2010 at 11:05:08AM -0700, Sushanth Rai wrote:
> "ifconfig bge1 media none" does change the PHY status temporarily. I see the 
> following when I run this command:
> 
> bge1: flags=8843 metric 0 mtu 1500
>   options=bb
> ether 00:40:d0:b8:1e:0b
> media: Ethernet none
> status: no carrier
> 
> After a few seconds if I do "ifconfig bge1", it's automagically changed to:
> 
> bge1: flags=8843 metric 0 mtu 1500
>   options=bb
> ether 00:40:d0:b8:1e:0b
> media: Ethernet none (1000baseTX )
> status: active
> 
> 
> I tried playing around with mediaopt, but PHY status doesn't seem to change 
> permanently. 
> 

brgphy(4) does not correctly handle IFM_NONE at the moment. In
fact, brgphy(4)'s manual media configuration does not seem to work
well. See the MII_MEDIACHG handler in brgphy_service() and implement
the IFM_NONE media type to power down or isolate the PHY.
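
As a rough illustration of what such a handler would compute, here is
a user-space sketch in C. The BMCR bit values are the standard MII
control register bits; enum media_sel and bmcr_for_media() are
hypothetical stand-ins for the IFM_* subtype and the driver code, and
this is not what brgphy(4) currently does.

```c
#include <assert.h>
#include <stdint.h>

/* Standard MII control register (BMCR) bits. */
#define BMCR_ISO	0x0400	/* isolate the PHY from the MII */
#define BMCR_PDOWN	0x0800	/* power the PHY down */
#define BMCR_AUTOEN	0x1000	/* autonegotiation enable */

/* Hypothetical media selector standing in for the IFM_* subtype. */
enum media_sel { MEDIA_AUTO, MEDIA_NONE };

/*
 * Compute the BMCR value to write for the requested media.
 * For MEDIA_NONE the caller picks power-down or isolate.
 */
uint16_t
bmcr_for_media(uint16_t bmcr, enum media_sel sel, int power_down)
{
	assert(sel == MEDIA_AUTO || sel == MEDIA_NONE);
	switch (sel) {
	case MEDIA_NONE:
		/* Stop negotiating and take the link down. */
		bmcr = (uint16_t)(bmcr & ~BMCR_AUTOEN);
		bmcr = (uint16_t)(bmcr | (power_down ? BMCR_PDOWN : BMCR_ISO));
		break;
	case MEDIA_AUTO:
		/* Undo isolate/power-down and negotiate again. */
		bmcr = (uint16_t)(bmcr & ~(BMCR_ISO | BMCR_PDOWN));
		bmcr = (uint16_t)(bmcr | BMCR_AUTOEN);
		break;
	}
	return (bmcr);
}
```

A real handler would also have to make sure the periodic tick does not
restart autonegotiation behind the user's back, which is presumably
why the link comes back after a few seconds today.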

> 
> --- On Thu, 9/23/10, Luiz Otavio O Souza  wrote:
> 
> > From: Luiz Otavio O Souza 
> > Subject: Re: Changing link status in bge driver
> > To: "Sushanth Rai" 
> > Date: Thursday, September 23, 2010, 2:31 AM
> > On Sep 23, 2010, at 4:11 AM, Sushanth
> > Rai wrote:
> > 
> > > Hello,
> > > 
> > > I'm using BCM5715C based NIC card on a FreeBSD 7.2
> > system. I would like to simulate condition where the PHY
> > layer is powered-off i.e, the link status should show as "no
> > carrier". When I do "ifconfig down", it just turns-off the
> > driver and the link status is still active. Is there is
> > anything I can do in the bge driver or anywhere else in the
> > software stack to simulate this condition without physically
> > disconnecting the cable ?
> > > 
> > > Thanks,
> > > Sushanth
> > 
> > Hi,
> > 
> > ifconfig bgeX media none should do what you need.
> > 
> > Luiz

Re: bce(4) un hiding adapter info

2010-09-23 Thread Pyun YongHyeon

On Thu, Sep 23, 2010 at 10:43:52AM -0700, David Christensen wrote:
> > Would it be possible to unhide the output of bce_print_adapter_info()
> > from under boot verbose?
> > 
> > This information is useful for comparing firmware and card versions
> > between machines.
> > 
> > Alternatively what about adding a sysctl under dev.bce.X for this info?
> 
> I have no problem doing that, just not sure that everyone would
> appreciate the spew.  I like it for exactly the same reason,
> troubleshooting remote systems.  
> 

I also like to see that. It would just add one informational line
and it wouldn't hurt.

> What I'd really like to do is revamp the debug code so that it 
> can be enabled/disabled on the fly rather than requiring that 
> the driver be compiled.  Adding some performance stuff would 

Couldn't it be implemented with sysctl? Users could set a variable
such as dev.bce.0.diag_bitmap to activate the debugging code.
But it seems that would require a lot of code changes.
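
The idea can be sketched in user-space C as follows. Everything here
is hypothetical: the DIAG_* categories, struct diag_softc and the
helper names are invented for illustration, and a real driver would
expose the bitmap through the sysctl tree instead of a plain field.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debug categories, one bit each. */
#define DIAG_RX		0x01u
#define DIAG_TX		0x02u
#define DIAG_PHY	0x04u

/* Stand-in for the per-device state behind dev.bce.N.diag_bitmap. */
struct diag_softc {
	unsigned int diag_bitmap;
};

/* True if any of the requested categories is currently enabled. */
int
diag_enabled(const struct diag_softc *sc, unsigned int cat)
{
	assert(sc != NULL);
	return ((sc->diag_bitmap & cat) != 0);
}

/*
 * Print a debug message only when its category is enabled.
 * Returns the number of characters written, or 0 when suppressed.
 */
int
diag_printf(const struct diag_softc *sc, unsigned int cat,
    const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!diag_enabled(sc, cat))
		return (0);
	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return (n);
}
```

With this shape the cost of a disabled message is one mask test, so
the debug code could stay compiled in.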

> be awesome as well.
> 
> Dave 
> 
> 

Re: kern/123172: [bce] Watchdog timeout problems with if_bce

2010-09-22 Thread Pyun YongHyeon

On Wed, Apr 30, 2008 at 01:20:03PM +, Josh Endries wrote:
> The following reply was made to PR kern/123172; it has been noted by GNATS.
> 
> From: Josh Endries 
> To: bug-follo...@freebsd.org
> Cc:  
> Subject: Re: kern/123172: [bce] Watchdog timeout problems with if_bce
> Date: Wed, 30 Apr 2008 08:58:23 -0400
> 
>  It's been working well for a while, so I might have fixed whatever was 
>  causing the problem to manifest. I converted my lo0 jails to use real 
>  IPs, and removed my pf config that was routing things around, so I 
>  suspect that had something to do with it. I can't change it back right 
>  now, later this week/weekend I should be able to.
>  
>  I had jails on 127.0.0.x (a couple still are), namely the BIND jail, but 
>  my switch was load balancing it so the machine was BINATing the 10.0.0 
>  address the switch was talking to  with the local 127.0.0.3 address 
>  (iirc). The switch was NATing the 10 address to a real IP.
>  
>  The problem was that I also have some real IPs on this box, so the 
>  machine wanted to route packets out that interface, so I had to use 
>  route-to/reply-to for the BIND jail to get things to go back out the way 
>  they came in, instead of using the x.x.164/24 route or default route. It 
>  did work in the configuration, but had the watchdog errors. I think this 
>  might have something to do with the issue; I'll put things back and test 
>  it some more when I have some time.
>  
>  I did see that other PR, but since this is a newer version of FreeBSD, 
>  and I have no ACPI problems or problems booting (that I've noticed, at 
>  least), I decided to submit. They certainly could be related (we're both 
>  using amd64; unfortunately I can't change that to test i386). Here is 
>  some more info...
>  
>  jls:
>  
>  JID  IP Address  Hostname  Path
>    5  x.x.164.7   smtp      /jails/smtp/root
>    4  127.0.0.5   mx        /jails/mx/root
>    3  x.x.164.4   ns        /jails/ns/root
>    2  127.0.0.4   pkg       /jails/pkg/root
>    1  127.0.0.2   mysql     /jails/mysql/root
>  
>  /etc/pf.conf:
>  
>  nat on vlan2 from  to any -> x.x.164.123
>  binat on vlan8 from $jail_mysql_ip to any -> $jail_mysql_exip
>  
>  block log (user) all
>  pass in log (user) quick on vlan2 inet proto { tcp, udp } from any to 
>  $jail_ns_ip port domain keep state
>  pass out log (user) quick on vlan2 inet proto { tcp, udp } from 
>  $jail_ns_ip to any port domain keep state
>  pass quick log (user) on lo0 inet proto udp from  to 
>   port domain keep state
>  pass quick log (user) on lo0 inet proto tcp from  to 
>  $jail_mysql_ip port 3306 keep state
>  pass out log (user) quick on vlan8 inet proto tcp from $jail_mysql_exip 
>  to 10.0.1.2 port 3306 keep state
>  pass in log (user) quick on vlan11 inet proto tcp from  to 
>  vlan11 port 55185 keep state
>  pass log (user) quick inet proto icmp all icmp-type $icmp_types keep state
>  pass out log (user) quick on vlan2 inet proto tcp from x.x.164.123 to 
>  any keep state
>  
>  uname -a:
>  
>  FreeBSD hathor 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Tue Mar 24 13:36:33 
>  EDT 2009 r...@hathor:/jails/src/usr/obj/jails/src/usr/src/sys/ULEMAC 
>amd64
>  
>  kernel config:
>  
>  include GENERIC
>  ident   ULEMAC
>  nooptions   SCHED_4BSD
>  options SCHED_ULE
>  options MAC
>  
>  ifconfig:
>  
>  bce0: flags=8843 metric 0 mtu 1500
>   
>  options=1bb
>   ether 00:1f:29:06:d9:e2
>   media: Ethernet autoselect (100baseTX )
>   status: active
>   lagg: laggdev lagg0
>  bce1: flags=8843 metric 0 mtu 1500
>   
>  options=1bb
>   ether 00:1f:29:06:d9:e2
>   media: Ethernet autoselect (100baseTX )
>   status: active
>   lagg: laggdev lagg0
>  lo0: flags=8049 metric 0 mtu 16384
>   inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>   inet6 ::1 prefixlen 128
>   inet 127.0.0.1 netmask 0xff00
>   inet 127.0.0.2 netmask 0x
>   inet 127.0.0.4 netmask 0x
>   inet 127.0.0.5 netmask 0x
>  lagg0: flags=8843 metric 0 mtu 1500
>   
>  options=1bb
>   ether 00:1f:29:06:d9:e2
>   media: Ethernet autoselect
>   status: active
>   laggproto lacp
>   laggport: bce1 flags=1c
>   laggport: bce0 flags=1c
>  vlan2: flags=8843 metric 0 mtu 1500
>   options=3
>   ether 00:1f:29:06:d9:e2
>   inet x.x.164.123 netmask 0xff00 broadcast x.x.164.255
>   inet x.x.164.4 netmask 0x broadcast x.x.164.4
>   inet x.x.164.7 netmask 0x broadcast x.x.164.7
>   media: Ethernet autoselect
>   status: active
>   vlan: 2 parent interface: lagg0
>  vlan8: flags=8843 metric 0 mtu 1500
>   options=3
>

Re: kern/144689: [re] TCP transfer corruption using if_re

2010-09-22 Thread Pyun YongHyeon

On Sat, Mar 20, 2010 at 02:38:55PM -0700, Steven Noonan wrote:
> On Tue, Mar 16, 2010 at 1:46 PM, Pyun YongHyeon  wrote:
> > On Tue, Mar 16, 2010 at 12:31:22PM -0700, Steven Noonan wrote:
> >> On Tue, Mar 16, 2010 at 11:23 AM, Pyun YongHyeon  wrote:
> >
> > [...]
> >
> >> > The real issue looks like PHY read failure which can result in
> >> > unexpected behavior. I don't see rgephy(4) related message here,
> >> > would you show me the output of "devinfo -rv | grep phy"?
> >> > By chance are you using PCMCIA ethernet controller?
> >>
> >> I am. It's a Netgear GA511. I think I said in my original post that it
> >> was connected via cardbus.
> >>
> >> xerxes ~ # devinfo -rv | grep phy
> >>                     rgephy0 pnpinfo oui=0x732 model=0x11 rev=0x3 at phyno=1
> >>                 inphy0 pnpinfo oui=0xaa00 model=0x33 rev=0x0 at phyno=1
> >>
> >
> > Ok, thanks for the info. Did the controller ever work before?
> > Or did you start seeing the issue on 8.0-RELEASE?
> >
> 
> Uh, hm. This is weird, now I'm getting the problem not just using
> re(4), but also with fxp(4) (which is my on-board card). I don't think
> it's a driver bug here.
> 
> Could this be a TCP stack bug?
> 

How about using 8.1-RELEASE? Does it make any difference?

> - Steven

Re: bge hangs on recent 7.3-STABLE

2010-09-13 Thread Pyun YongHyeon

On Tue, Sep 14, 2010 at 01:08:08AM +0300, Vlad Galu wrote:
> On Mon, Sep 13, 2010 at 9:04 PM, Pyun YongHyeon  wrote:
> > On Mon, Sep 13, 2010 at 06:27:08PM +0400, Igor Sysoev wrote:
> >> On Thu, Sep 09, 2010 at 02:18:08PM -0700, Pyun YongHyeon wrote:
> >>
> >> > On Thu, Sep 09, 2010 at 01:10:50PM -0700, Pyun YongHyeon wrote:
> >> > > On Thu, Sep 09, 2010 at 02:28:26PM +0400, Igor Sysoev wrote:
> >> > > > Hi,
> >> > > >
> >> > > > I have several hosts running FreeBSD/amd64 7.2-STABLE updated on 
> >> > > > 11.01.2010
> >> > > > and 25.02.2010. Hosts process about 10K input and 10K output 
> >> > > > packets/s
> >> > > > without issues. One of them, however, is loaded more than others, so 
> >> > > > it
> >> > > > processes 20K/20K packets/s.
> >> > > >
> >> > > > Recently, I have upgraded one host to 7.3-STABLE, 24.08.2010.
> >> > > > Then bge on this host hung two times. I was able to restart it from
> >> > > > console using:
> >> > > >   /etc/rc.d/netif restart bge0
> >> > > >
> >> > > > Then I have upgraded the most loaded (20K/20K) host to 7.3-STABLE, 
> >> > > > 07.09.2010.
> >> > > > After reboot bge hung every several seconds. I was able to restart 
> >> > > > it,
> >> > > > but bge hung again after several seconds.
> >> > > >
> >> > > > Then I have downgraded this host to 7.3-STABLE, 14.08.2010, since 
> >> > > > there
> >> > > > were several if_bge.c commits on 15.08.2010. The same hangs.
> >> > > > Then I have downgraded this host to 7.3-STABLE, 17.03.2010, before
> >> > > > the first if_bge.c commit after 25.02.2010. Now it runs without 
> >> > > > hangs.
> >> > > >
> >> > > > The hosts are amd64 dual core SMP with 4G machines. bge information:
> >> > > >
> >> > > > b...@pci0:4:0:0:        class=0x02 card=0x165914e4 
> >> > > > chip=0x165914e4 rev=0x11 hdr=0x00
> >> > > >     vendor     = 'Broadcom Corporation'
> >> > > >     device     = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
> >> > > >
> >> > > > bge0:  >> > > > 0x004101> mem 0xfe5f-0xfe5f irq 19 at device 0.0 on pci4
> >> > > > miibus1:  on bge0
> >> > > > brgphy0:  PHY 1 on miibus1
> >> > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> >> > > > 1000baseT-FDX, auto
> >> > > > bge0: Ethernet address: 00:e0:81:5f:6e:8a
> >> > > >
> >> > >
> >> > > Could you show me verbose boot message(bge part only)?
> >> > > Also show me the output of "pciconf -lcbv".
> >> > >
> >> >
> >> > Forgot to send a patch. Let me know whether attached patch fixes
> >> > the issue or not.
> >>
> >> > Index: sys/dev/bge/if_bge.c
> >> > ===
> >> > --- sys/dev/bge/if_bge.c    (revision 212341)
> >> > +++ sys/dev/bge/if_bge.c    (working copy)
> >> > @@ -3386,9 +3386,11 @@
> >> >     sc->bge_rx_saved_considx = rx_cons;
> >> >     bge_writembx(sc, BGE_MBX_RX_CONS0_LO, sc->bge_rx_saved_considx);
> >> >     if (stdcnt)
> >> > -           bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, sc->bge_std);
> >> > +           bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, (sc->bge_std +
> >> > +               BGE_STD_RX_RING_CNT - 1) % BGE_STD_RX_RING_CNT);
> >> >     if (jumbocnt)
> >> > -           bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc->bge_jumbo);
> >> > +           bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, (sc->bge_jumbo +
> >> > +               BGE_JUMBO_RX_RING_CNT - 1) % BGE_JUMBO_RX_RING_CNT);
> >> >  #ifdef notyet
> >> >     /*
> >> >      * This register wraps very quickly under heavy packet drops.
> >>
> >> Thank you, it seems the patch has fixed the bug.
> >> BTW, I noticed the same hangs on FreeBSD 8.1, date=2010.09.06.23.59.59
> >> I will apply the patch on all my updated hosts.
> >>
> >
> > Thanks for testing. I'm afraid bge(4) in HEAD, stable/8 and
> > stable/7(including 8.1-RELEASE and 7.3-RELEASE) may suffer from
> > this issue. Let me know what other hosts work with the patch.
> 
> Hi Pyun,
> 
> Thanks for the patch. It seems to have fixed the symptom in my case,
> on a card identical to Igor's, but on board of an IBM eServer 306m.
> 

Thanks for reporting and testing! I really appreciate it.

Re: bce(4) - com_no_buffers (Again)

2010-09-13 Thread Pyun YongHyeon

On Mon, Sep 13, 2010 at 03:21:13PM -0700, David Christensen wrote:
> > I'm under the impression the header splitting in bce(4) is for
> > LRO(opposite of TSO), not for VM magic to enable page flipping
> > tricks.
> 
> Header splitting was implemented in the Linux version of bce(4)
> to prevent jumbo memory allocations.  Allocating 9KB frames was
> causing problems on systems used for virtualization.  (Harder to
> find a contiguous 9KB frame when a hypervisor is in use.)  Using 
> 4KB or smaller buffer sizes was considered more compatible with
> virtualization.  
> 
> LRO (Large Receive Offload, aka Transparent Packet Aggregation
> or TPA on the 10Gb controllers) is not supported on the 1Gb 
> bce(4) devices.
> 

I meant the tcp_lro implementation of FreeBSD. At the moment
tcp_lro_rx() runs a long list of sanity checks before combining TCP
segments into a single TCP segment, but if the TCP header is split
from its payload I guess we can optimize that path. This way we may
be able to support LRO over VLAN, I guess.

Re: bce(4) - com_no_buffers (Again)

2010-09-13 Thread Pyun YongHyeon

On Mon, Sep 13, 2010 at 03:38:41PM -0500, Tom Judge wrote:
> On 09/13/2010 02:33 PM, Pyun YongHyeon wrote:
> > On Mon, Sep 13, 2010 at 02:07:58PM -0500, Tom Judge wrote:
> >   
> >> On 09/13/2010 01:48 PM, Pyun YongHyeon wrote:
> >> 
> >>> On Mon, Sep 13, 2010 at 10:04:25AM -0500, Tom Judge wrote:
> >>>   
> >>>   
> >>>> 
> >> 
> >> 
> >>>> Does this mean that these cards are going to perform badly? This was
> >>>> what I gathered from the previous thread.
> >>>>
> >>>> 
> >>>> 
> >>> I mean there is still a lot of room for improvement in the driver
> >>> for better performance. bce(4) controllers are among the best
> >>> controllers for servers, and the driver doesn't take full advantage
> >>> of them.
> >>>
> >>>   
> >>>   
> >> So far our experiences with bce(4) on FreeBSD have been very
> >> disappointing.  Starting with when Dell switched to bce(4) based NIC's
> >> (around the time 6.2 was released and with the introduction of the Power
> >> Edge X9XX hardware) we have always had problems with the driver in every
> >> release we have used: 6.2, 7.0 and 7.1.  Luckily David has been helpful
> >> and helped us fix the issues.
> >>
> >> 
> >> 
> >>>   
> >>>   
> >>>> Without BCE_JUMBO_HDRSPLIT we see no errors.  With it we see a number
> >>>> of errors; however, the rate seems to be reduced compared to the
> >>>> previous version of the driver.
> >>>>
> >>>> 
> >>>> 
> >>> It seems there are issues in header splitting and it was disabled
> >>> by default. Header splitting reduces packet processing overhead in
> >>> upper layer so it's normal to see better performance with header
> >>> splitting.
> >>>   
> >>>   
> >> The reason that we have had header splitting enabled in the past is that
> >> historically there have been issues with memory fragmentation when using
> >> 8k jumbo frames (resulting in 9k mbuf's).
> >>
> >> 
> > Yes, if you use jumbo frames, header splitting would help to reduce
> > memory fragmentation as header splitting wouldn't allocate jumbo
> > clusters.
> >
> >   
> 
> Under testing I have yet to see a memory fragmentation issue with this
> driver.  I will follow up if/when I find a problem with this again.
> 
> >> I have a kernel with the following configuration in testing right now:
> >>
> >> * Flow control enabled.
> >> * Jumbo header splitting turned off.
> >>
> >>
> >> Is there any way that we can fix flow control with jumbo header
> >> splitting turned on?
> >>
> >> 
> > Flow control has nothing to do with header splitting(i.e. flow
> > control is always enabled). 
> >
> >   
> Sorry let me rephrase that:
> 
> Is there a way to fix the RX buffer shortage issues (when header
> splitting is turned on) so that they are guarded by flow control.  Maybe
> change the low watermark for flow control when its enabled?
> 

I'm not sure how much it would help, but try changing the RX low
watermark. The default value is 32, which seems to be a reasonable
value. But it applies only to 5709/5716 controllers, and Linux seems
to use a different default value.

Re: bce(4) - com_no_buffers (Again)

2010-09-13 Thread Pyun YongHyeon

On Mon, Sep 13, 2010 at 09:11:25PM +0200, Andre Oppermann wrote:
> On 13.09.2010 20:48, Pyun YongHyeon wrote:
> >On Mon, Sep 13, 2010 at 10:04:25AM -0500, Tom Judge wrote:
> >>Without BCE_JUMBO_HDRSPLIT we see no errors.  With it we see a number
> >>of errors; however, the rate seems to be reduced compared to the
> >>previous version of the driver.
> >>
> >
> >It seems there are issues in header splitting and it was disabled
> >by default. Header splitting reduces packet processing overhead in
> >upper layer so it's normal to see better performance with header
> >splitting.
> 
> I'm not sure that header splitting really helps much at least for TCP.
> The only place where it could make a difference is at socket buffer
> append time.  There the header get 'thrown away'.  With header splitting
> the first mbuf in the chain containing the header can be returned to the
> free pool.  Without header splitting it's just a offset change in the
> mbuf.
> 
> IIRC header splitting was introduced with the Tigon cards which were
> the first programmable network cards and the first to support putting
> the header in a different mbuf.  Header splitting, in theory, could
> make a difference with zero copy sockets where the data portion in a
> separate mbuf is flipped by VM magic into userspace.  The trouble is
> that no driver fully supports the semantics required for page flipping
> and the zero copy code, if compiled in, is much less optimized for
> the non-flipping case than the standard code path.  With the many dozen
> gigabyte per second memory copy bandwidth of current CPU's it remains
> questionable whether the page-flipping VM magic is actually faster than
> a plain kernel/userspace copy as in the standard code path.  I generally
> recommend not to use ZERO_COPY_SOCKETS.
> 
> I suspect in the case of the bce(4) driver the change in header splitting
> is probably not the cause of the performance difference.
> 

I'm under the impression the header splitting in bce(4) is for
LRO(opposite of TSO), not for VM magic to enable page flipping
tricks.

> -- 
> Andre

Re: bce(4) - com_no_buffers (Again)

2010-09-13 Thread Pyun YongHyeon

On Mon, Sep 13, 2010 at 02:07:58PM -0500, Tom Judge wrote:
> On 09/13/2010 01:48 PM, Pyun YongHyeon wrote:
> > On Mon, Sep 13, 2010 at 10:04:25AM -0500, Tom Judge wrote:
> >   
> >>
> 
> >> Does this mean that these cards are going to perform badly? This was
> >> what I gathered from the previous thread.
> >>
> >> 
> > I mean there is still a lot of room for improvement in the driver
> > for better performance. bce(4) controllers are among the best
> > controllers for servers, and the driver doesn't take full advantage
> > of them.
> >
> >   
> 
> So far our experiences with bce(4) on FreeBSD have been very
> disappointing.  Starting with when Dell switched to bce(4) based NIC's
> (around the time 6.2 was released and with the introduction of the Power
> Edge X9XX hardware) we have always had problems with the driver in every
> release we have used: 6.2, 7.0 and 7.1.  Luckily David has been helpful
> and helped us fix the issues.
> 
> 
> >   
> >> Without BCE_JUMBO_HDRSPLIT we see no errors.  With it we see a number
> >> of errors; however, the rate seems to be reduced compared to the
> >> previous version of the driver.
> >>
> >> 
> > It seems there are issues in header splitting and it was disabled
> > by default. Header splitting reduces packet processing overhead in
> > upper layer so it's normal to see better performance with header
> > splitting.
> >   
> 
> The reason that we have had header splitting enabled in the past is that
> historically there have been issues with memory fragmentation when using
> 8k jumbo frames (resulting in 9k mbuf's).
> 

Yes, if you use jumbo frames, header splitting would help to reduce
memory fragmentation as header splitting wouldn't allocate jumbo
clusters.

> I have a kernel with the following configuration in testing right now:
> 
> * Flow control enabled.
> * Jumbo header splitting turned off.
> 
> 
> Is there any way that we can fix flow control with jumbo header
> splitting turned on?
> 

Flow control has nothing to do with header splitting(i.e. flow
control is always enabled). 

> Thanks
> 
> Tom
> 
> PS. The following test was more than enough to trigger buffer shortages
> with header splitting on:
> 
> ( while true; do ldapsearch -h ldap-server1 -b "ou=Some,o=Base" dn; done
> ) &
> ( while true; do ldapsearch -h ldap-server1 -b "ou=Some,o=Base" dn; done
> ) &
> ( while true; do ldapsearch -h ldap-server1 -b "ou=Some,o=Base" dn; done
> ) &
> 
> The search in question returned about 1700 entries.
> 

I can trigger this kind of buffer shortage with benchmark tools.
Actually fixing header splitting is on my TODO list as well as
other things but I don't know how long it would take.

Re: bce(4) - com_no_buffers (Again)

2010-09-13 Thread Pyun YongHyeon

On Mon, Sep 13, 2010 at 10:04:25AM -0500, Tom Judge wrote:
> On 09/09/2010 07:24 PM, Pyun YongHyeon wrote:
> > On Thu, Sep 09, 2010 at 03:58:30PM -0500, Tom Judge wrote:
> >   
> >> Hi,
> >> I am just following up on the thread from March (I think) about this issue.
> >>
> >> We are seeing this issue on a number of systems running 7.1. 
> >>
> >> The systems in question are all Dell:
> >>
> >> * R710 R610 R410
> >> * PE2950
> >>
> >> The latter do not show the issue as much as the R series systems.
> >>
> >> The cards in one of the R610's that I am testing with are:
> >>
> >> b...@pci0:1:0:0:class=0x02 card=0x02361028 chip=0x163914e4
> >> rev=0x20 hdr=0x00
> >> vendor = 'Broadcom Corporation'
> >> device = 'NetXtreme II BCM5709 Gigabit Ethernet'
> >> class  = network
> >> subclass   = ethernet
> >>
> >> They are connected to Dell PowerConnect 5424 switches.
> >>
> >> uname -a:
> >> FreeBSD bandor.chi-dc.mintel.ad 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4
> >> #3: Wed Sep  8 08:19:03 UTC 2010
> >> t...@dev-tj-7-1-amd64.chicago.mintel.ad:/usr/obj/usr/src/sys/MINTELv10  
> >> amd64
> >>
> >> We are also using 8192 byte jumbo frames, if_lagg and if_vlan in the
> >> configuration (the nics are in promisc as we are currently capturing
> >> netflow data on another vlan for diagnostic purposes. ):
> >>
> >>
> >> 
> 
> >> I have updated the bce driver and the Broadcom MII driver to the
> >> version from stable/7 and am still seeing the issue.
> >>
> >> This morning I did a test with increasing the RX_PAGES to 8 but the
> >> system just hung starting the network.  The route command got stuck in a
> >> zone state (Sorry can't remember exactly which).
> >>
> >> The real question is, how do we go about increasing the number of RX
> >> BDs? I guess we have to bump more than just RX_PAGES...
> >>
> >>
> >> The cause for us, from what we can see, is the openldap server sending
> >> large group search results back to nss_ldap or pam_ldap.  When it does
> >> this it seems to send each of the 600 results in its own TCP segment
> >> creating a small packet storm (600*~100byte PDU's) at the destination
> >> host.  The kernel then retransmits 2 blocks of 100 results each after
> >> SACK kicks in for the data that was dropped by the NIC.
> >>
> >>
> >> Thanks in advance
> >>
> >> Tom
> >>
> >>
> >> 
> 
> > FW may drop incoming frames when it does not see available RX
> > buffers. Increasing the number of RX buffers slightly reduces the
> > possibility of dropping frames, but it wouldn't completely fix it.
> > Alternatively, the driver may announce available RX buffers in the
> > middle of RX ring processing instead of handing over updated buffers
> > at the end of RX processing. This way FW may see available RX buffers
> > while the driver/upper stack is busy processing received frames. But
> > this may introduce coherency issues because the RX ring is shared
> > between host and FW. If FreeBSD had a way to sync a partial region of
> > a DMA map, this could be implemented without fear of coherency
> > issues. Another way to improve RX performance would be switching to
> > multiple RX queues with RSS, but that would require a lot of work and
> > I had no time to implement it.
> >   
> 
> Does this mean that these cards are going to perform badly? This was
> what I gathered from the previous thread.
> 

I mean there is still a lot of room for improvement in the driver
for better performance. bce(4) controllers are among the best
controllers for servers, and the driver doesn't take full advantage
of them.

> > BTW, given that you've updated to bce(4)/mii(4) of stable/7, I
> > wonder why TX/RX flow controls were not kicked in.
> >   
> 
> The working copy I used for grabbing the upstream source is at r212371.
> 
> Last changes for the directories in my working copy:
> 
> sys/dev/bce @  211388
> sys/dev/mii @ 212020
> 
> 
> I discovered that flow control was disabled on the switches, so I set it
> to auto and added a pair of BCE_PRINTF's in the code where it enables
> and disables flow control and now it gets enabled.
> 

Ok.

> 
> Without BCE_JUMBO_HDRSPLIT we see no errors.  With it we see a number
> of errors; however, the rate seems to be reduced compared to the
> previous version of the driver.
> 

It seems there are issues in header splitting and it was disabled
by default. Header splitting reduces packet processing overhead in
upper layer so it's normal to see better performance with header
splitting.

Re: bge hangs on recent 7.3-STABLE

2010-09-13 Thread Pyun YongHyeon

On Mon, Sep 13, 2010 at 06:27:08PM +0400, Igor Sysoev wrote:
> On Thu, Sep 09, 2010 at 02:18:08PM -0700, Pyun YongHyeon wrote:
> 
> > On Thu, Sep 09, 2010 at 01:10:50PM -0700, Pyun YongHyeon wrote:
> > > On Thu, Sep 09, 2010 at 02:28:26PM +0400, Igor Sysoev wrote:
> > > > Hi,
> > > > 
> > > > I have several hosts running FreeBSD/amd64 7.2-STABLE updated on 
> > > > 11.01.2010
> > > > and 25.02.2010. Hosts process about 10K input and 10K output packets/s
> > > > without issues. One of them, however, is loaded more than others, so it
> > > > processes 20K/20K packets/s.
> > > > 
> > > > Recently, I have upgraded one host to 7.3-STABLE, 24.08.2010.
> > > > Then bge on this host hung two times. I was able to restart it from
> > > > console using:
> > > >   /etc/rc.d/netif restart bge0
> > > > 
> > > > Then I have upgraded the most loaded (20K/20K) host to 7.3-STABLE, 
> > > > 07.09.2010.
> > > > After reboot bge hung every several seconds. I was able to restart it,
> > > > but bge hung again after several seconds.
> > > > 
> > > > Then I have downgraded this host to 7.3-STABLE, 14.08.2010, since there
> > > > were several if_bge.c commits on 15.08.2010. The same hangs.
> > > > Then I have downgraded this host to 7.3-STABLE, 17.03.2010, before
> > > > the first if_bge.c commit after 25.02.2010. Now it runs without hangs.
> > > > 
> > > > The hosts are amd64 dual core SMP with 4G machines. bge information:
> > > > 
> > > > b...@pci0:4:0:0:class=0x02 card=0x165914e4 chip=0x165914e4 
> > > > rev=0x11 hdr=0x00
> > > > vendor = 'Broadcom Corporation'
> > > > device = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
> > > > 
> > > > bge0:  > > > 0x004101> mem 0xfe5f-0xfe5f irq 19 at device 0.0 on pci4
> > > > miibus1:  on bge0
> > > > brgphy0:  PHY 1 on miibus1
> > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > > 1000baseT-FDX, auto
> > > > bge0: Ethernet address: 00:e0:81:5f:6e:8a
> > > > 
> > > 
> > > Could you show me verbose boot message(bge part only)?
> > > Also show me the output of "pciconf -lcbv".
> > > 
> > 
> > Forgot to send a patch. Let me know whether attached patch fixes
> > the issue or not.
> 
> > Index: sys/dev/bge/if_bge.c
> > ===
> > --- sys/dev/bge/if_bge.c(revision 212341)
> > +++ sys/dev/bge/if_bge.c(working copy)
> > @@ -3386,9 +3386,11 @@
> > sc->bge_rx_saved_considx = rx_cons;
> > bge_writembx(sc, BGE_MBX_RX_CONS0_LO, sc->bge_rx_saved_considx);
> > if (stdcnt)
> > -   bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, sc->bge_std);
> > +   bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, (sc->bge_std +
> > +   BGE_STD_RX_RING_CNT - 1) % BGE_STD_RX_RING_CNT);
> > if (jumbocnt)
> > -   bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc->bge_jumbo);
> > +   bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, (sc->bge_jumbo +
> > +   BGE_JUMBO_RX_RING_CNT - 1) % BGE_JUMBO_RX_RING_CNT);
> >  #ifdef notyet
> > /*
> >  * This register wraps very quickly under heavy packet drops.
> 
> Thank you, it seems the patch has fixed the bug.
> BTW, I noticed the same hangs on FreeBSD 8.1, date=2010.09.06.23.59.59
> I will apply the patch on all my updated hosts.
> 

Thanks for testing. I'm afraid bge(4) in HEAD, stable/8 and
stable/7 (including 8.1-RELEASE and 7.3-RELEASE) may suffer from
this issue. Let me know whether the other hosts work with the patch.
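
The index arithmetic used in the patch can be looked at in isolation. The following is only a minimal sketch, not driver code: `ring_prev` and `RING_CNT` are hypothetical names, and the ring size of 512 is an assumption made for the example. It shows how adding `cnt - 1` before taking the modulus yields the slot just before a given index, wrapping from 0 back to the last slot without unsigned underflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size for illustration; a driver uses its own constant. */
#define RING_CNT 512u

/*
 * Return the index of the slot immediately before 'next' in a ring of
 * 'cnt' entries.  Adding cnt - 1 instead of subtracting 1 avoids
 * unsigned underflow when 'next' is 0.
 */
static inline uint32_t
ring_prev(uint32_t next, uint32_t cnt)
{
	assert(cnt > 0 && next < cnt);
	return ((next + cnt - 1) % cnt);
}
```

With these assumptions, `ring_prev(0, RING_CNT)` gives 511 and `ring_prev(1, RING_CNT)` gives 0, which has the same form as the `(sc->bge_std + BGE_STD_RX_RING_CNT - 1) % BGE_STD_RX_RING_CNT` expression in the diff.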

Re: bce(4) - com_no_buffers (Again)

2010-09-09 Thread Pyun YongHyeon

On Thu, Sep 09, 2010 at 03:58:30PM -0500, Tom Judge wrote:
> Hi,
> 
> I am just following up on the thread from March (I think) about this issue.
> 
> We are seeing this issue on a number of systems running 7.1. 
> 
> The systems in question are all Dell:
> 
> * R710 R610 R410
> * PE2950
> 
> The latter do not show the issue as much as the R series systems.
> 
> The cards in one of the R610's that I am testing with are:
> 
> b...@pci0:1:0:0:class=0x02 card=0x02361028 chip=0x163914e4
> rev=0x20 hdr=0x00
> vendor = 'Broadcom Corporation'
> device = 'NetXtreme II BCM5709 Gigabit Ethernet'
> class  = network
> subclass   = ethernet
> 
> They are connected to Dell PowerConnect 5424 switches.
> 
> uname -a:
> FreeBSD bandor.chi-dc.mintel.ad 7.1-RELEASE-p4 FreeBSD 7.1-RELEASE-p4
> #3: Wed Sep  8 08:19:03 UTC 2010
> t...@dev-tj-7-1-amd64.chicago.mintel.ad:/usr/obj/usr/src/sys/MINTELv10  amd64
> 
> We are also using 8192 byte jumbo frames, if_lagg and if_vlan in the
> configuration (the nics are in promisc as we are currently capturing
> netflow data on another vlan for diagnostic purposes. ):
> 
> t...@bandor '20:51:17' '~'
> > $ ifconfig bce0
> bce0: flags=8943 metric 0 mtu 8192
> options=400bb
> ether 00:21:9b:95:7a:b8
> media: Ethernet autoselect (1000baseTX )
> status: active
> lagg: laggdev lagg0
> t...@bandor '20:51:22' '~'
> > $ ifconfig bce1
> bce1: flags=8943 metric 0 mtu 8192
> options=400bb
> ether 00:21:9b:95:7a:b8
> media: Ethernet autoselect (1000baseTX )
> status: active
> lagg: laggdev lagg0
> t...@bandor '20:51:35' '~'
> > $ ifconfig lagg0
> lagg0: flags=8943 metric 0 mtu 8192
> options=400bb
> ether 00:21:9b:95:7a:b8
> media: Ethernet autoselect
> status: active
> laggproto failover
> laggport: bce1 flags=0<>
> laggport: bce0 flags=5
> t...@bandor '20:51:40' '~'
> > $ ifconfig vlan2
> vlan2: flags=8943 metric 0 mtu 8192
> options=3
> ether 00:21:9b:95:7a:b8
> inet 172.30.XX.XX netmask 0xfe00 broadcast 172.30.XX.XX
> media: Ethernet autoselect
> status: active
> vlan: 2 parent interface: lagg0
> 
> 
> I have updated the bce driver and the Broadcom MII driver to the
> version from stable/7 and am still seeing the issue.
> 
> This morning I did a test with increasing the RX_PAGES to 8 but the
> system just hung starting the network.  The route command got stuck in a
> zone state (Sorry can't remember exactly which).
> 
> The real question is, how do we go about increasing the number of RX
> BDs? I guess we have to bump more than just RX_PAGES...
> 
> 
> The cause for us, from what we can see, is the openldap server sending
> large group search results back to nss_ldap or pam_ldap.  When it does
> this it seems to send each of the 600 results in its own TCP segment
> creating a small packet storm (600*~100byte PDU's) at the destination
> host.  The kernel then retransmits 2 blocks of 100 results each after
> SACK kicks in for the data that was dropped by the NIC.
> 
> 
> Thanks in advance
> 
> Tom
> 
> t...@bandor '20:57:33' '~'
> > $ sysctl -a dev.bce.0
> dev.bce.0.%desc: Broadcom NetXtreme II BCM5709 1000Base-T (C0)
> dev.bce.0.%driver: bce
> dev.bce.0.%location: slot=0 function=0
> dev.bce.0.%pnpinfo: vendor=0x14e4 device=0x1639 subvendor=0x1028
> subdevice=0x0236 class=0x02
> dev.bce.0.%parent: pci1
> dev.bce.0.l2fhdr_error_count: 0
> dev.bce.0.mbuf_alloc_failed_count: 0
> dev.bce.0.mbuf_frag_count: 0
> dev.bce.0.dma_map_addr_rx_failed_count: 0
> dev.bce.0.dma_map_addr_tx_failed_count: 0
> dev.bce.0.unexpected_attention_count: 0
> dev.bce.0.stat_IfHcInOctets: 439779802
> dev.bce.0.stat_IfHCInBadOctets: 0
> dev.bce.0.stat_IfHCOutOctets: 108341440
> dev.bce.0.stat_IfHCOutBadOctets: 0
> dev.bce.0.stat_IfHCInUcastPkts: 2341369
> dev.bce.0.stat_IfHCInMulticastPkts: 26065
> dev.bce.0.stat_IfHCInBroadcastPkts: 9191
> dev.bce.0.stat_IfHCOutUcastPkts: 1230052
> dev.bce.0.stat_IfHCOutMulticastPkts: 2870
> dev.bce.0.stat_IfHCOutBroadcastPkts: 45
> dev.bce.0.stat_emac_tx_stat_dot3statsinternalmactransmiterrors: 0
> dev.bce.0.stat_Dot3StatsCarrierSenseErrors: 0
> dev.bce.0.stat_Dot3StatsFCSErrors: 0
> dev.bce.0.stat_Dot3StatsAlignmentErrors: 0
> dev.bce.0.stat_Dot3StatsSingleCollisionFrames: 0
> dev.bce.0.stat_Dot3StatsMultipleCollisionFrames: 0
> dev.bce.0.stat_Dot3StatsDeferredTransmissions: 0
> dev.bce.0.stat_Dot3StatsExcessiveCollisions: 0
> dev.bce.0.stat_Dot3StatsLateCollisions: 0
> dev.bce.0.stat_EtherStatsCollisions: 0
> dev.bce.0.stat_EtherStatsFragments: 0
> dev.bce.0.stat_EtherStatsJabbers: 0
> dev.bce.0.stat_EtherStatsUndersizePkts: 0
> dev.bce.0.stat_EtherStatsOversizePkts: 0
> dev.bce.0.stat_EtherStatsPktsRx64Octets: 3381
> dev.bce.0.stat_EtherStatsPktsRx65Octetsto127Octets: 98883
> dev.bce.0.stat_EtherStatsPktsRx128Octetsto255Octets: 2255959
> dev.

Re: bge hangs on recent 7.3-STABLE

2010-09-09 Thread Pyun YongHyeon

On Thu, Sep 09, 2010 at 01:10:50PM -0700, Pyun YongHyeon wrote:
> On Thu, Sep 09, 2010 at 02:28:26PM +0400, Igor Sysoev wrote:
> > Hi,
> > 
> > I have several hosts running FreeBSD/amd64 7.2-STABLE updated on 11.01.2010
> > and 25.02.2010. Hosts process about 10K input and 10K output packets/s
> > without issues. One of them, however, is loaded more than others, so it
> > processes 20K/20K packets/s.
> > 
> > Recently, I have upgraded one host to 7.3-STABLE, 24.08.2010.
> > Then bge on this host hung two times. I was able to restart it from
> > console using:
> >   /etc/rc.d/netif restart bge0
> > 
> > Then I have upgraded the most loaded (20K/20K) host to 7.3-STABLE, 
> > 07.09.2010.
> > After reboot bge hung every several seconds. I was able to restart it,
> > but bge hung again after several seconds.
> > 
> > Then I have downgraded this host to 7.3-STABLE, 14.08.2010, since there
> > were several if_bge.c commits on 15.08.2010. The same hangs.
> > Then I have downgraded this host to 7.3-STABLE, 17.03.2010, before
> > the first if_bge.c commit after 25.02.2010. Now it runs without hangs.
> > 
> > The hosts are amd64 dual core SMP with 4G machines. bge information:
> > 
> > b...@pci0:4:0:0:class=0x02 card=0x165914e4 chip=0x165914e4 
> > rev=0x11 hdr=0x00
> > vendor = 'Broadcom Corporation'
> > device = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
> > 
> > bge0:  
> > mem 0xfe5f-0xfe5f irq 19 at device 0.0 on pci4
> > miibus1:  on bge0
> > brgphy0:  PHY 1 on miibus1
> > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > 1000baseT-FDX, auto
> > bge0: Ethernet address: 00:e0:81:5f:6e:8a
> > 
> 
> Could you show me verbose boot message(bge part only)?
> Also show me the output of "pciconf -lcbv".
> 

Forgot to send a patch. Let me know whether attached patch fixes
the issue or not.
Index: sys/dev/bge/if_bge.c
===
--- sys/dev/bge/if_bge.c	(revision 212341)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -3386,9 +3386,11 @@
 	sc->bge_rx_saved_considx = rx_cons;
 	bge_writembx(sc, BGE_MBX_RX_CONS0_LO, sc->bge_rx_saved_considx);
 	if (stdcnt)
-		bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, sc->bge_std);
+		bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, (sc->bge_std +
+		BGE_STD_RX_RING_CNT - 1) % BGE_STD_RX_RING_CNT);
 	if (jumbocnt)
-		bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, sc->bge_jumbo);
+		bge_writembx(sc, BGE_MBX_RX_JUMBO_PROD_LO, (sc->bge_jumbo +
+		BGE_JUMBO_RX_RING_CNT - 1) % BGE_JUMBO_RX_RING_CNT);
 #ifdef notyet
 	/*
 	 * This register wraps very quickly under heavy packet drops.

Re: bge hangs on recent 7.3-STABLE

2010-09-09 Thread Pyun YongHyeon

On Thu, Sep 09, 2010 at 02:28:26PM +0400, Igor Sysoev wrote:
> Hi,
> 
> I have several hosts running FreeBSD/amd64 7.2-STABLE updated on 11.01.2010
> and 25.02.2010. Hosts process about 10K input and 10K output packets/s
> without issues. One of them, however, is loaded more than others, so it
> processes 20K/20K packets/s.
> 
> Recently, I have upgraded one host to 7.3-STABLE, 24.08.2010.
> Then bge on this host hung two times. I was able to restart it from
> console using:
>   /etc/rc.d/netif restart bge0
> 
> Then I have upgraded the most loaded (20K/20K) host to 7.3-STABLE, 07.09.2010.
> After reboot bge hung every several seconds. I was able to restart it,
> but bge hung again after several seconds.
> 
> Then I have downgraded this host to 7.3-STABLE, 14.08.2010, since there
> were several if_bge.c commits on 15.08.2010. The same hangs.
> Then I have downgraded this host to 7.3-STABLE, 17.03.2010, before
> the first if_bge.c commit after 25.02.2010. Now it runs without hangs.
> 
> The hosts are amd64 dual core SMP with 4G machines. bge information:
> 
> b...@pci0:4:0:0:class=0x02 card=0x165914e4 chip=0x165914e4 
> rev=0x11 hdr=0x00
> vendor = 'Broadcom Corporation'
> device = 'NetXtreme Gigabit Ethernet PCI Express (BCM5721)'
> 
> bge0:  
> mem 0xfe5f-0xfe5f irq 19 at device 0.0 on pci4
> miibus1:  on bge0
> brgphy0:  PHY 1 on miibus1
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-FDX, auto
> bge0: Ethernet address: 00:e0:81:5f:6e:8a
> 

Could you show me the verbose boot messages (bge part only)?
Also show me the output of "pciconf -lcbv".

> bge has 3 vlans:
> 
> bge0: flags=8943 metric 0 mtu 1500
> options=9b
> ether 00:e0:81:5f:6e:8a
> media: Ethernet autoselect (1000baseTX )
> status: active
> 
> vlan173: flags=8843 metric 0 mtu 1500
> options=3
> ether 00:e0:81:5f:6e:8a
> inet 192.168.173.101 netmask 0xff00 broadcast 192.168.173.255
> media: Ethernet autoselect (1000baseTX )
> status: active
> vlan: 173 parent interface: bge0
> 
> [ ... ]

Re: NFE adapter 'hangs'

2010-09-03 Thread Pyun YongHyeon

On Fri, Sep 03, 2010 at 07:59:26AM +0100, Melissa Jenkins wrote:
> 
> Thank you for your very quick response :)
> 

[...]

> >Also I'd like to know whether both RX and TX are dead or only one
> >RX/TX path is hung. Can you see incoming traffic with tcpdump when
> >you think the controller is in stuck?
> 
> Yes, though not very much. The traffic to 4800 is every second so you can see 
> in the following trace when it stops
> 
> 07:10:42.287163 IP 192.168.1.203 > 224.0.0.240:  pfsync 108
> 07:10:42.911995
> 07:10:43.112073 STP 802.1d, Config, Flags [Topology change], bridge-id 
> 8000.c4:7d:4f:a9:ac:30.8008, length 43
> 07:10:43.148659 IP 192.168.1.203.57026 > 192.168.1.255.4800: UDP, length 60
> 07:10:43.148684 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 
> 192.168.1.203.57026 > 192.168.1.129.4800: UDP, length 60
> 07:10:43.148689 IP 172.31.1.203 > 172.31.1.129: GREv0, length 92: IP 
> 192.168.1.203.57026 > 192.168.1.1.4800: UDP, length 60
> 07:10:43.148918 IP 192.168.1.213.40677 > 192.168.1.255.4800: UDP, length 48

[...]

> a bit later on, still broken, a slight odd message:
> 07:11:43.079720 IP 172.31.1.129 > 172.31.1.213: GREv0, length 52: IP 
> 192.168.1.129.60446 > 192.168.1.213.179:  tcp 12 [bad hdr length 16 - too 
> short, < 20]
> 07:11:44.210794 IP 172.31.1.129 > 172.31.1.203: GREv0, length 84: IP 
> 192.168.1.129.64744 > 192.168.1.203.4800: UDP, length 52
> 07:11:44.210831 IP 172.31.1.129 > 172.31.1.213: GREv0, length 84: IP 
> 192.168.1.129.64744 > 192.168.1.213.4800: UDP, length 52
> 
> Now this really is odd, I don't recognise either of those MAC addresses, 
> though the SQL shown is used on this machine (
> 07:12:13.054393 45:43:54:20:41:63 > 00:00:03:53:45:4c, ethertype Unknown 
> (0x6374), length 60:
> 0x:  556e 6971 7565 4964 2046 524f 4d20 7261  UniqueId.FROM.ra
> 0x0010:  6461 6363 7420 2057 4845 5245 2043 616c  dacct..WHERE.Cal
> 0x0020:  6c69 6e67 5374 6174 696f 6e49 6420   lingStationId.

Hmm, it seems you're using a really complex setup. It's very hard to
narrow down the culprit in such an environment. Could you set up a
simple network configuration that reproduces the issue? One
possible cause would be wrong (garbled) data being passed up to
the upper stack. But I have no idea why you see GRE packets with a
truncated TCP header (172.31.1.129 > 172.31.1.213).
How about disabling TX/RX checksum offloading as well as TSO?

[...]

> 
> I then restarted the interface (nfe down/up, route restart)
> 
> From dmesg at the time (slight obfuscated)
> Sep  3 07:10:19 manch2 bgpd[89612]: neighbor XX: received notification: 
> HoldTimer expired, unknown subcode 0
> Sep  3 07:10:49 manch2 bgpd[89612]: neighbor XX connect: Host is down
> # at this point I took the interface down & up and reloaded the routing tables
> Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
> Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
> Sep  3 07:12:07 manch2 kernel: nfe0: link state changed to DOWN
> Sep  3 07:12:07 manch2 kernel: carp0: link state changed to DOWN
> Sep  3 07:12:11 manch2 kernel: nfe0: link state changed to UP   
> Sep  3 07:12:11 manch2 kernel: carp0: link state changed to DOWN
> Sep  3 07:12:14 manch2 kernel: carp0: link state changed to UP

Hmm, it does not look right, carp0 showed link DOWN message four
times in a row.
By the way, are you using IPMI on MCP55? nfe(4) is not ready to
handle MAC operation with IPMI.

Re: NFE adapter 'hangs'

2010-09-02 Thread Pyun YongHyeon

On Thu, Sep 02, 2010 at 09:13:46AM +0100, Melissa Jenkins wrote:
> Hiya,
> 
> I've been having trouble with two different machines (FBSD 8.0p3 & FBSD 
> 7.0p5) using the NFE network adapter.  The machines are, respectively, Sun 
> X2200 (AMD64) and a Sun X2100M2 (AMD64) and both are running the amd64 
> kernel. 
> 
> Basically what appears to happen is that traffic stops flowing through the
> interface and 'No buffer space available' error messages are produced when
> trying to send ICMP packets. All established connections appear to hang.
> 
> The machines are running as packet routers, and nfe0 is acting as the 'lan' 
> side.  PF is being used for filtering, NAT, BINAT and RDR.  The same PF 
> configuration works correctly on two other servers using different network 
> adapters. One of them is configured with pfsync & CARP, but the other one 
> isn't.
> 
> The problem seems to happen under fairly light number of sessions ( < 100 
> active states in PF) though the more states the quicker it occurs.  It is 
> possible it's related to packet rates as putting on high bandwidth clients 
> seems to produce the problem very quickly (several minutes) This is 
> reinforced by the fact that the problem first manifested when we upgraded one 
> of the leased lines.
> 
> Executing ifconfig nfe0 down && ifconfig nfe0 up will restart traffic flow.  
> 
> Neither box is very highly loaded, generally around ~ 1.5 Mb/s.  This doesn't 
> appear to be related to the amount of traffic as I have tried re-routing 95% 
> of traffic around the server without any improvement in performance.  The 
> traffic profile is fairly random - a mix of TCP and UDP, mostly flowing OUT 
> of nfe0.  It is all L3 and there are  less than 5 hosts on the segment 
> attached to the nfe interface.
> 
> Both boxes are in different locations and are connected to different types of 
> Cisco switches.  Both appear to autonegotiate correctly and the switch ports 
> show no status changes.
> 
> It appears that PFSync, CARP & a GRE tunnel work correctly over the NFE
> interface for long periods of time (weeks +), and that it is something to do
> with adding other traffic to the mix that results in the interface 'hanging'.
> 
> If I move the traffic from NFE to the other BGE interface (the one shared 
> with the LOM) everything is stable and works correctly.  I have not been able 
> to reproduce this using test loads, and the interface worked correctly with 
> iperf testing prior to deployment.  I unfortunately (legal reasons) can't 
> provide a traffic trace up to the time it occurs though everything looks 
> normal to me.
> 
> The FreeBSD 7 X2100 lists the following from PCI conf:
> n...@pci0:0:8:0:class=0x068000 card=0x534c108e chip=0x037310de 
> rev=0xa3 hdr=0x00
>vendor = 'Nvidia Corp'
>device = 'MCP55 Ethernet'
>class  = bridge
> n...@pci0:0:9:0:class=0x068000 card=0x534c108e chip=0x037310de 
> rev=0xa3 hdr=0x00
>vendor = 'Nvidia Corp'
>device = 'MCP55 Ethernet'
>class  = bridge
> 
> The FreeBSD 8 X2200 lists the same thing:
> n...@pci0:0:8:0:class=0x068000 card=0x534b108e chip=0x037310de 
> rev=0xa3 hdr=0x00
>vendor = 'Nvidia Corp'
>device = 'MCP55 Ethernet'
>class  = bridge
> n...@pci0:0:9:0:class=0x068000 card=0x534b108e chip=0x037310de 
> rev=0xa3 hdr=0x00
>vendor = 'Nvidia Corp'
>device = 'MCP55 Ethernet'
>class  = bridge
> 
> 
> Here are the two obvious tests (both from the FreeBSD 7 box), but the icmp 
> response & the mbuf stats are very much the same on both boxes.
> 
> ping 172.31.3.129
> PING 172.31.3.129 (172.31.3.129): 56 data bytes
> ping: sendto: No buffer space available
> ping: sendto: No buffer space available
> ^C
> 
> -- 172.31.3.129 ping statistics ---
> 2 packets transmitted, 0 packets received, 100.0% packet loss
> 
> netstat -m
> 852/678/1530 mbufs in use (current/cache/total)
> 818/448/1266/25600 mbuf clusters in use (current/cache/total/max)
> 817/317 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/362/362/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 1879K/2513K/4392K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
> 
> From the other machine, after the problem has occurred & and ifconfig down/up 
> cycle has been done (ie when the interface is working)
> vmstat -z 
> mbuf_packet:  256,0, 1033, 1783, 330792410,   
>  0
> mbuf: 256,0,5, 1664, 395145472,

Re: HELP. FreeBSD 8.1 polling issue

2010-08-26 Thread Pyun YongHyeon

On Thu, Aug 26, 2010 at 01:34:45PM +0800, MAI JIN wrote:
> Hi,
> 
> I got a freeBSD 8.1 polling issue on my PC. It is a dual-core Intel
> Pentium x86 PC (2.8GHz each core). The Ethernet interface is Broadcom
> NetXtreme 57xx Gigabit Ethernet interface.
> I set the following options (enable polling and zero-buffer copy) and
> rebuilt the kernel:
> 
> Code:
> # To make an SMP kernel, the next two lines are needed
> options SMP # Symmetric MultiProcessor
> Kernel
> device  apic# I/O APIC
> 
> options DEVICE_POLLING # Open Polling
> options HZ=1000
> options ZERO_COPY_SOCKETS
> The following were appended to the /etc/sysctl.conf
> 
> Code:
> kern.polling.enable=1
> # increase BPF buffer to 10M
> net.bpf.bufsize=10485760
> net.bpf.maxbufsize=10485760
> kern.polling.idle_poll=1
> kern.polling.burst_max=1000
> After installing and rebooting the system, kern.polling.enable was not
> found in the MIB so I had to ignore this error. It looks like
> kern.polling.enable was removed from FreeBSD 8.1?
> Everything else looked good, so I built my application to receive data from
> another HP server. I wrote the application using libpcap-1.1.1 with BPF
> zero-copy turned on (I found the #define HAVE_ZEROCOPY_BPF 1 in
> config.h). Attached please find the source code of my application.
> 
> Before running the application, I set the following parameters:
> 
> Code:
> ifconfig bge0 polling # This will turn on the polling of the
> Broadcom driver.
> Code:
> sysctl -w net.bpf.bufsize=10485760 
> sysctl -w net.bpf.maxbufsize=10485760
> sysctl -w kern.polling.idle_poll=1
> sysctl -w kern.polling.burst_max=1000
> sysctl -w kern.polling.each_burst=128
> sysctl -w net.inet.ip.intr_queue_maxlen=256
> Then I ran the application to receive data from the HP server. I ran
> multiple iperf on the HP server to send around 133Mbits/s UDP load to
> the PC under test. The UDP payload size was 47 bytes. The entire IP
> packet size is 76 bytes.
> 
> At first, the receiving application worked well and received around
> 205K packets/second without packet loss (I checked the receive
> statistics using pcap_stats). However, after 2 minutes, the application
> could not receive data any more. The packet rate dropped to 0. I ran ping
> from the PC under test and it reported timeouts and destination
> unreachable (ping from the HP server to the PC also failed). It looked
> like the link between the HP server and the PC was broken, so the
> application could not receive data. No packets were dropped. Then I
> restarted the bge0 interface using: ifconfig bge0 down && ifconfig bge0 up
> 
> And then I re-ran the application and it continued receiving data. But
> after 1 or 2 minutes, the link broke again. I think it was my
> application that caused the bge0 interface down. I started the tcpdump
> and it worked well without breaking the link. 
> 
> I tried to increase the kern.polling.each_burst from 128 to 500 but the
> application would cause the bge0 down within 1 minute. No packet was
> dropped before the link was down.
> 
> I checked the CPU usage of the PC. The sys used is around 90% (might be
> caused by kern.polling.idle_poll=1), user land is 13%. 
> I don't understand why the application would break the bge0.
> 
> I tried changing the parameters:
> options HZ=2000
> 
> sysctl -w net.bpf.bufsize=20485760 
> sysctl -w net.bpf.maxbufsize=20485760
> sysctl -w kern.polling.idle_poll=1
> sysctl -w kern.polling.burst_max=1
> sysctl -w kern.polling.each_burst=5000
> 
> The performance was better: I got 307K packets/second (the HP server
> sent around 250Mbits/s, my PC got 200Mbits/s). But after 2 minutes,
> the bge0 was down again.
> 

I'm not a fan of polling(4) especially for intelligent controllers
like bge(4) but it seems bge(4) was dead under high network load.
Would you show me the output of both verbose dmesg and
"pciconf -lcbv"?

> Could anybody have a look at this issue? How can I optimize
> the performance of the polling?
> 
> Thanks,
> Jin 
> 
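
As a rough cross-check of the figures quoted in this message, the packet rate implied by a given bit rate and packet size can be estimated with simple integer arithmetic. This is only a sketch under stated assumptions: `packets_per_second` is a hypothetical helper, and which size to pass (UDP payload, IP packet, or full Ethernet frame) depends on where the sender measures its rate, which the thread does not say.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate packets per second from a bit rate and a per-packet size in
 * bytes.  The result is truncated toward zero; a size of 0 yields 0.
 */
static inline uint64_t
packets_per_second(uint64_t bits_per_second, uint32_t packet_bytes)
{
	if (packet_bytes == 0)
		return (0);
	return (bits_per_second / ((uint64_t)packet_bytes * 8));
}
```

Taking the quoted 133 Mbit/s and 76-byte IP packets, `packets_per_second(133000000, 76)` evaluates to 218750, which is the same order of magnitude as the roughly 205K packets/second the application reported.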

Re: 8.0-RELEASE-p3: 4k jumbo mbuf cluster exhaustion

2010-08-24 Thread Pyun YongHyeon

On Tue, Aug 24, 2010 at 08:37:52PM +0800, Adrian Chadd wrote:
> On 23 August 2010 18:18, Andre Oppermann  wrote:
> > It seems the 4k clusters do not get freed back to the pool after they've
> > been sent by the NIC and dropped from the socket buffer after the ACK has
> > arrived.  The leak must occur in one of these two places.  The socket
> > buffer is unlikely as it would affect not just you but everyone else too.
> > Thus the mbuf freeing after DMA/tx in the bce(4) driver is the prime
> > suspect.
> 
> They don't stay leaked though. Killing the offending process sees
> mbuf's eventually returned.
> It isn't immediate though. It may be related to timing out existing
> socket connections or something?
> 
> I haven't yet brought up the second box enough to start passing test
> traffic, so I can't provide any further details than this.
> 

Here is a patch that fixes TX/RX related issues. The patch was
generated against HEAD. I'm not sure you can apply this patch to
8.0-RELEASE, but you can see the point of the driver's TX issues.
I'm still waiting for David's opinion on this patch, but it seems
he's busy addressing other issues of Broadcom controllers which
might have been triggered by me.

> Adrian
Index: sys/dev/bce/if_bce.c
===
--- sys/dev/bce/if_bce.c	(revision 210298)
+++ sys/dev/bce/if_bce.c	(working copy)
@@ -4995,7 +4995,7 @@ bce_get_rx_buf(struct bce_softc *sc, struct mbuf *
 u16 *chain_prod, u32 *prod_bseq)
 {
 	bus_dmamap_t map;
-	bus_dma_segment_t segs[BCE_MAX_SEGMENTS];
+	bus_dma_segment_t segs[1];
 	struct mbuf *m_new = NULL;
 	struct rx_bd *rxbd;
 	int nsegs, error, rc = 0;
@@ -5067,9 +5067,10 @@ bce_get_rx_buf(struct bce_softc *sc, struct mbuf *
 
 	/* Handle any mapping errors. */
 	if (error) {
+#ifdef	BCE_DEBUG
 		BCE_PRINTF("%s(%d): Error mapping mbuf into RX "
 		"chain (%d)!\n", __FILE__, __LINE__, error);
-
+#endif
 		sc->dma_map_addr_rx_failed_count++;
 		m_freem(m_new);
 
@@ -5183,9 +5184,10 @@ bce_get_pg_buf(struct bce_softc *sc, struct mbuf *
 
 	/* Handle any mapping errors. */
 	if (error) {
+#ifdef	BCE_DEBUG
 		BCE_PRINTF("%s(%d): Error mapping mbuf into page chain!\n",
 		__FILE__, __LINE__);
-
+#endif
 		m_freem(m_new);
 		DBRUN(sc->debug_pg_mbuf_alloc--);
 
@@ -5323,6 +5325,9 @@ bce_init_tx_chain(struct bce_softc *sc)
 		txbd->tx_bd_haddr_hi = htole32(BCE_ADDR_HI(sc->tx_bd_chain_paddr[j]));
 		txbd->tx_bd_haddr_lo = htole32(BCE_ADDR_LO(sc->tx_bd_chain_paddr[j]));
 	}
+	for (i = 0; i < TX_PAGES; i++)
+		bus_dmamap_sync(sc->tx_bd_chain_tag, sc->tx_bd_chain_map[i],
+		BUS_DMASYNC_PREWRITE);
 
 	bce_init_tx_context(sc);
 
@@ -5360,8 +5365,11 @@ bce_free_tx_chain(struct bce_softc *sc)
 	}
 
 	/* Clear each TX chain page. */
-	for (i = 0; i < TX_PAGES; i++)
+	for (i = 0; i < TX_PAGES; i++) {
 		bzero((char *)sc->tx_bd_chain[i], BCE_TX_CHAIN_PAGE_SZ);
+		bus_dmamap_sync(sc->tx_bd_chain_tag, sc->tx_bd_chain_map[i],
+		BUS_DMASYNC_PREWRITE);
+	}
 
 	sc->used_tx_bd = 0;
 
@@ -5497,10 +5505,6 @@ bce_init_rx_chain(struct bce_softc *sc)
 
 	DBRUN(sc->rx_low_watermark = USABLE_RX_BD);
 	DBRUN(sc->rx_empty_count = 0);
-	for (i = 0; i < RX_PAGES; i++) {
-		bus_dmamap_sync(sc->rx_bd_chain_tag, sc->rx_bd_chain_map[i],
-		BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
-	}
 
 	bce_init_rx_context(sc);
 
@@ -5526,6 +5530,7 @@ bce_fill_rx_chain(struct bce_softc *sc)
 {
 	u16 prod, prod_idx;
 	u32 prod_bseq;
+	int i;
 
 	DBENTER(BCE_VERBOSE_RESET | BCE_EXTREME_RECV | BCE_VERBOSE_LOAD |
 	BCE_VERBOSE_CTX);
@@ -5544,6 +5549,11 @@ bce_fill_rx_chain(struct bce_softc *sc)
 		prod = NEXT_RX_BD(prod);
 	}
 
+	for (i = 0; i < RX_PAGES; i++) {
+		bus_dmamap_sync(sc->rx_bd_chain_tag, sc->rx_bd_chain_map[i],
+		BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
+	}
+
 	/* Save the RX chain producer indices. */
 	sc->rx_prod  = prod;
 	sc->rx_prod_bseq = prod_bseq;
@@ -5651,6 +5661,9 @@ bce_init_pg_chain(struct bce_softc *sc)
 		pgbd->rx_bd_haddr_lo = htole32(BCE_ADDR_LO(sc->pg_bd_chain_paddr[j]));
 	}
 
+	/* Fill up the page chain. */
+	bce_fill_pg_chain(sc);
+
 	/* Setup the MQ BIN mapping for host_pg_bidx. */
 	if ((BCE_CHIP_NUM(sc) == BCE_CHIP_NUM_5709)	||
 		(BCE_CHIP_NUM(sc) == BCE_CHIP_NUM_5716))
@@ -5672,14 +5685,6 @@ bce_init_pg_chain(struct bce_softc *sc)
 	val = BCE_ADDR_LO(sc->pg_bd_chain_paddr[0]);
 	CTX_WR(sc, GET_CID_ADDR(RX_CID), BCE_L2CTX_RX_NX_PG_BDHADDR_LO, val);
 
-	/* Fill up the page chain. */
-	bce_fill_pg_chain(sc);
-
-	for (i = 0; i < PG_PAGES; i++) {
-		bus_dmamap_sync(sc->pg_bd_chain_tag, sc->pg_bd_chain_map[i],
-		BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
-	}
-
 	DBRUNMSG(BCE_EXTREME_RECV, bce_dump_pg_chain(sc, 0, TOTAL_PG_BD));
 	DBEXIT(BCE_VERBOSE_RESET | BCE_VERBOSE_RECV | BCE_VERBOSE_LOAD |
 		BCE_VERBOSE_CTX);
@@ -5698,6 +5703,7 @@ static void
 bce_fill_pg_chain(struct bce_softc *sc)
 {
 	u16 prod, prod_idx;
+	int i;
 
 	DBENTER(BCE_VERBOSE_RESET | BCE_EXTREME_RECV | BCE_VERBOSE_LOAD |
 	B

Re: kern/79262: [dc] Adaptec ANA-6922 not fully supported

2010-08-23 Thread Pyun YongHyeon

On Mon, Aug 23, 2010 at 06:22:58PM +, an...@freebsd.org wrote:
> Synopsis: [dc] Adaptec ANA-6922 not fully supported
> 
> Responsible-Changed-From-To: freebsd-net->yongari
> Responsible-Changed-By: andre
> Responsible-Changed-When: Mon Aug 23 18:22:28 UTC 2010
> Responsible-Changed-Why: 
> Over to expert.
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=79262

Would you try the following patch and let me know how it goes on
your box? I don't have a dc(4) controller so it's hard to verify this
at the moment.
http://people.freebsd.org/~yongari/dc.eaddr.patch

Re: 8.0-RELEASE-p3: 4k jumbo mbuf cluster exhaustion

2010-08-23 Thread Pyun YongHyeon

On Mon, Aug 23, 2010 at 09:45:20PM +0200, Andre Oppermann wrote:
> On 23.08.2010 21:16, Pyun YongHyeon wrote:
> >On Mon, Aug 23, 2010 at 09:04:02PM +0200, Andre Oppermann wrote:
> >>On 23.08.2010 19:52, Pyun YongHyeon wrote:
> >>>On Mon, Aug 23, 2010 at 12:18:01PM +0200, Andre Oppermann wrote:
> >>>>The function that is called on a socket write is sosend_generic() which
> >>>>makes use of m_getm2().  This function allocates mbuf chains with the
> >>>>tightest packing it can achieve.  It will make use 4k (page size) mbufs
> >>>>as much as it can.  This is where they come from.
> >>>>
> >>>>It seems the 4k clusters do not get freed back to the pool after they've
> >>>>been sent by the NIC and dropped from the socket buffer after the ACK 
> >>>>has
> >>>>arrived.  The leak must occur in one of these two places.  The socket
> >>>>buffer is unlikely as it would affect not just you but everyone else 
> >>>>too.
> >>>>Thus the mbuf freeing after DMA/tx in the bce(4) driver is the prime
> >>>>suspect.
> >>>>
> >>>
> >>>I know bce(4) has a couple of bug in TX path(wrong dma tag, lack of
> >>>bus_dmamap_sync(9) etc) but this is the same code path with/without
> >>>TX checksum offloading. This is one of reason why I still do not
> >>>understand what's really happening here. TX checksum offloading may
> >>>introduce additional frame processing time to fill internal FIFO to
> >>>compute checksum before transmitting the frame to wire such that it
> >>>can change timing of TX path. This timing change might trigger the
> >>>TX path bug. It's just vague guessing though.
> >>
> >>Had a chat with clau...@openbsd and he said that the bce(4) DMA engine
> >>can only access the first 1GB of physical RAM and has to use bounce
> >>buffers all the time.  Maybe this is related.
> >>
> >
> >Really? I don't remember I saw such a DMA address space limitation
> >in data sheet. And I don't think Broadcom made such a horrible
> >thing for controllers targeted for servers. The only limitation I
> >know is BCM5708 is not able to handle DMA addresses greater than
> >40bits so bce(4) limits the DMA address space in DMA tag creation.
> 
> Oops... OpenBSD bce(4) != FreeBSD bce(4).  The former is for BCM440x
> chips the latter for BCM57xx.
> 

Ok, OpenBSD has bnx(4) for Broadcom NetXtreme II controllers.

Re: 8.0-RELEASE-p3: 4k jumbo mbuf cluster exhaustion

2010-08-23 Thread Pyun YongHyeon

On Mon, Aug 23, 2010 at 09:04:02PM +0200, Andre Oppermann wrote:
> On 23.08.2010 19:52, Pyun YongHyeon wrote:
> >On Mon, Aug 23, 2010 at 12:18:01PM +0200, Andre Oppermann wrote:
> >>On 23.08.2010 11:26, Adrian Chadd wrote:
> >>>On 23 August 2010 06:27, Pyun YongHyeon   wrote:
> >>>
> >>>>I recall there was SIOCSIFCAP ioctl handling bug in bce(4) on 8.0 so
> >>>>it might also disable IFCAP_TSO4/IFCAP_TXCSUM/IFCAP_RXCSUM when yo
> >>>>disabled RX checksum offloading. But I can't explain how checksum
> >>>>offloading could be related with the growth of 4k jumbo buffers.
> >>>
> >>>Neither can I!
> >>>
> >>>I'm trying to come up with a reproduction method that doesn't involve
> >>>"put box on the internet, push clients through it, wait."
> >>
> >>Network drivers use 2k sized mbuf clusters on receive.  So the problem
> >>doesn't seem to be RX related.
> >>
> >
> >bce(4) is special in this regards. The controller would allocate
> >jumbo cluster on RX if jumbo frame is used. If header splitting is
> >used, driver will use normal mbuf clusters.
> 
> Didn't know that.
> 
> >>The function that is called on a socket write is sosend_generic() which
> >>makes use of m_getm2().  This function allocates mbuf chains with the
> >>tightest packing it can achieve.  It will make use 4k (page size) mbufs
> >>as much as it can.  This is where they come from.
> >>
> >>It seems the 4k clusters do not get freed back to the pool after they've
> >>been sent by the NIC and dropped from the socket buffer after the ACK has
> >>arrived.  The leak must occur in one of these two places.  The socket
> >>buffer is unlikely as it would affect not just you but everyone else too.
> >>Thus the mbuf freeing after DMA/tx in the bce(4) driver is the prime
> >>suspect.
> >>
> >
> >I know bce(4) has a couple of bug in TX path(wrong dma tag, lack of
> >bus_dmamap_sync(9) etc) but this is the same code path with/without
> >TX checksum offloading. This is one of reason why I still do not
> >understand what's really happening here. TX checksum offloading may
> >introduce additional frame processing time to fill internal FIFO to
> >compute checksum before transmitting the frame to wire such that it
> >can change timing of TX path. This timing change might trigger the
> >TX path bug. It's just vague guessing though.
> 
> Had a chat with clau...@openbsd and he said that the bce(4) DMA engine
> can only access the first 1GB of physical RAM and has to use bounce
> buffers all the time.  Maybe this is related.
> 

Really? I don't remember seeing such a DMA address space limitation
in the data sheet, and I don't think Broadcom would do such a horrible
thing to controllers targeted at servers. The only limitation I
know of is that the BCM5708 cannot handle DMA addresses wider than
40 bits, so bce(4) limits the DMA address space in DMA tag creation.
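
To make the two limits discussed here concrete (a 40-bit limit versus
the "first 1GB", i.e. 30-bit, limit), here is a small user-space sketch.
It is not driver code: dma_max_addr() and dma_reachable() are
hypothetical helper names. It only shows how an address-width limit
turns into the highest usable address, which is the kind of value a
driver hands to bus_dma_tag_create(9) as its low-address bound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Highest address reachable with `bits` address bits (hypothetical helper). */
static uint64_t
dma_max_addr(unsigned bits)
{
	return ((bits >= 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1));
}

/* True if the whole buffer [addr, addr + len) lies at or below the limit. */
static bool
dma_reachable(uint64_t addr, uint64_t len, unsigned bits)
{
	uint64_t max = dma_max_addr(bits);

	if (len == 0 || addr > max)
		return (false);
	/* Compare without overflowing addr + len. */
	return (len - 1 <= max - addr);
}
```

A buffer that fails this check is the case where busdma would have to
fall back to a bounce buffer below the limit.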

Re: 8.0-RELEASE-p3: 4k jumbo mbuf cluster exhaustion

2010-08-23 Thread Pyun YongHyeon

On Mon, Aug 23, 2010 at 12:18:01PM +0200, Andre Oppermann wrote:
> On 23.08.2010 11:26, Adrian Chadd wrote:
> >On 23 August 2010 06:27, Pyun YongHyeon  wrote:
> >
> >>I recall there was SIOCSIFCAP ioctl handling bug in bce(4) on 8.0 so
> >>it might also disable IFCAP_TSO4/IFCAP_TXCSUM/IFCAP_RXCSUM when yo
> >>disabled RX checksum offloading. But I can't explain how checksum
> >>offloading could be related with the growth of 4k jumbo buffers.
> >
> >Neither can I!
> >
> >I'm trying to come up with a reproduction method that doesn't involve
> >"put box on the internet, push clients through it, wait."
> 
> Network drivers use 2k sized mbuf clusters on receive.  So the problem
> doesn't seem to be RX related.
> 

bce(4) is special in this regard. It allocates jumbo clusters on RX
if jumbo frames are used. If header splitting is used, the driver
uses normal mbuf clusters.

> The function that is called on a socket write is sosend_generic() which
> makes use of m_getm2().  This function allocates mbuf chains with the
> tightest packing it can achieve.  It will make use 4k (page size) mbufs
> as much as it can.  This is where they come from.
> 
> It seems the 4k clusters do not get freed back to the pool after they've
> been sent by the NIC and dropped from the socket buffer after the ACK has
> arrived.  The leak must occur in one of these two places.  The socket
> buffer is unlikely as it would affect not just you but everyone else too.
> Thus the mbuf freeing after DMA/tx in the bce(4) driver is the prime 
> suspect.
> 
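
The packing behaviour Andre describes for m_getm2() can be modelled in
a few lines of user-space C. This is a simplified sketch, not the
kernel routine: the SKETCH_* sizes and pack_chain() are assumptions
made for illustration (the real constants live in sys/param.h and
sys/mbuf.h, and the real code also distinguishes packet-header mbufs).
It shows why any socket write larger than one 2k cluster ends up
consuming 4k page-size clusters.

```c
#include <stddef.h>

/* Assumed sizes for the sketch; not taken from kernel headers. */
#define	SKETCH_MLEN		224	/* payload of a plain mbuf */
#define	SKETCH_MINCLSIZE	(SKETCH_MLEN + 1)
#define	SKETCH_MCLBYTES		2048	/* standard 2k cluster */
#define	SKETCH_MJUMPAGESIZE	4096	/* page-size (4k) jumbo cluster */

struct chain_stats {
	unsigned plain;		/* mbufs without an attached cluster */
	unsigned cl2k;		/* 2k clusters used */
	unsigned cl4k;		/* 4k clusters used */
};

/* Count the buffers a tightly packed chain of `len` bytes would need. */
static struct chain_stats
pack_chain(size_t len)
{
	struct chain_stats st = { 0, 0, 0 };

	while (len > 0) {
		size_t take;

		if (len > SKETCH_MCLBYTES) {
			take = SKETCH_MJUMPAGESIZE;
			st.cl4k++;
		} else if (len >= SKETCH_MINCLSIZE) {
			take = SKETCH_MCLBYTES;
			st.cl2k++;
		} else {
			take = SKETCH_MLEN;
			st.plain++;
		}
		len -= (len < take) ? len : take;
	}
	return (st);
}
```

Under these assumed sizes a 10000-byte write needs two 4k clusters plus
one 2k cluster, so a busy proxy doing large writes draws almost
entirely on the 4k zone.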

I know bce(4) has a couple of bugs in the TX path (wrong DMA tag,
missing bus_dmamap_sync(9), etc.), but this is the same code path with
or without TX checksum offloading. This is one reason why I still do
not understand what's really happening here. TX checksum offloading may
add frame processing time, because the internal FIFO has to be filled
to compute the checksum before the frame is transmitted on the wire, so
it can change the timing of the TX path. This timing change might
trigger the TX path bug. It's just a vague guess, though.

> -- 
> Andre

Re: 8.0-RELEASE-p3: 4k jumbo mbuf cluster exhaustion

2010-08-22 Thread Pyun YongHyeon

On Sun, Aug 22, 2010 at 05:40:30PM +0800, Adrian Chadd wrote:
> I disabled tso, tx chksum and rx chksum. This fixed the 4k jumbo
> allocation growth.
> 

I recall there was a SIOCSIFCAP ioctl handling bug in bce(4) on 8.0, so
it might also have disabled IFCAP_TSO4/IFCAP_TXCSUM/IFCAP_RXCSUM when
you disabled RX checksum offloading. But I can't explain how checksum
offloading could be related to the growth of 4k jumbo buffers.
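
For reference, the usual shape of a capability-toggling handler is to
compute the set of bits the caller wants changed and flip only those.
The sketch below is user-space C and is not the bce(4) code or its fix:
the CAP_* names and apply_reqcap() are hypothetical stand-ins for the
IFCAP_* flags and the driver's SIOCSIFCAP case. A handler that does not
follow this pattern can end up clearing neighbouring capabilities as a
side effect of one request.

```c
#include <stdint.h>

/* Hypothetical capability bits, standing in for IFCAP_* flags. */
#define	CAP_RXCSUM	0x001u
#define	CAP_TXCSUM	0x002u
#define	CAP_TSO4	0x100u

/*
 * Return the new enabled set: only bits that differ between the
 * current and requested sets, and that the hardware supports, change.
 */
static uint32_t
apply_reqcap(uint32_t enabled, uint32_t supported, uint32_t requested)
{
	uint32_t mask = (enabled ^ requested) & supported;

	if (mask & CAP_RXCSUM)
		enabled ^= CAP_RXCSUM;
	if (mask & CAP_TXCSUM)
		enabled ^= CAP_TXCSUM;
	if (mask & CAP_TSO4)
		enabled ^= CAP_TSO4;
	return (enabled);
}
```

With all three bits enabled, a request that drops only CAP_RXCSUM
leaves CAP_TXCSUM and CAP_TSO4 untouched.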
> 
> Turning on tso on a live proxy didn't affect jumbo allocations.
> Turning on txcsum caused jumbo allocations to begin growing again.
> DIsabling txcsum again caused jumbo allocations to stop increasing,
> but it doesn't seem to be decreasing back to the steady state (~ 8k.)
> Turning on rxcsum didn't affect jumbo allocations.
> 
> So it seems txcsum is the culprit here.
> 

There have been a lot of changes in bce(4) since 8.0-RELEASE. My vague
guess is that your issue could be related to the header split feature
of bce(4), which is now disabled. Are you using jumbo
frames/ZERO_COPY_SOCKETS with bce(4)?

> 
> 
> Adrian
> 
> On 22 August 2010 16:11, Adrian Chadd  wrote:
> > Hi,
> >
> > I've got a Squid/Lusca server on 8.0-RELEASE-p3 which is exhibiting
> > some very strange behaviour.
> >
> > After a few minutes uptime, the 4k mbuf cluster zone fills up and
> > Squid/Lusca spends almost all of it's time sleeping in "keglimit".
> >
> > I've bumped kern.ipc.nmbclusters to 262144 and kern.ipc.jumbop to
> > 32768 but the system will slowly crawl towards filling that zone.
> >
> > The box has a bce on-board NIC and is using ipfw to handle redirecting
> > traffic to/from the box for transparent TCP interception. It's
> > handling around ~30,000 concurrent connections at the moment.
> >
> > I have other very busy proxies on FreeBSD-7.x pushing a few hundred
> > megabits without any issues. This box falls over after ~ 20 mbit.
> >
> > If I bypass redirection and/or kill squid, the 4k cluster count drops
> > back down to < 500 and stays there.
> >
> > Does anyone have any ideas on where to begin debugging this?
> >
> > Thanks,
> >
> >
> > Adrian
> >

Re: packet loss on ixgbe using vlans and routing

2010-08-20 Thread Pyun YongHyeon

On Fri, Aug 20, 2010 at 07:54:49PM +0200, John Hay wrote:
> On Fri, Aug 20, 2010 at 10:43:46AM -0700, Jack Vogel wrote:
> > Why does the ixgbe loadable show as if_ixgbe,  you've altered it?
> 
> Only the module Makefile:
> # cvs -q diff -u
> Index: Makefile
> ===
> RCS file: /home/ncvs/src/sys/modules/ixgbe/Makefile,v
> retrieving revision 1.6.2.2
> diff -u -r1.6.2.2 Makefile
> --- Makefile	5 Apr 2010 21:43:22 -   1.6.2.2
> +++ Makefile	28 Apr 2010 18:09:24 -
> @@ -1,6 +1,6 @@
>  #$FreeBSD: src/sys/modules/ixgbe/Makefile,v 1.6.2.2 2010/04/05 21:43:22 jfv Exp $
>  .PATH:  ${.CURDIR}/../../dev/ixgbe
> -KMOD= ixgbe
> +KMOD= if_ixgbe
>  SRCS= device_if.h bus_if.h pci_if.h
>  SRCS+= ixgbe.c
>  # Shared source
> 

This looks more correct to me.

> John
> 
> > 
> > Jack
> > 
> > 
> > On Fri, Aug 20, 2010 at 7:04 AM, John Hay  wrote:
> > 
> > > Hi Jack,
> > >
> > > Have you had a chance to look at it yet? I would love to get these
> > > network cards working. :-)
> > >
> > > John
> > >
> > > On Fri, Jul 23, 2010 at 01:36:10AM -0700, Jack Vogel wrote:
> > > > Yes, I am here, I have been reading this, but I am also very busy with a
> > > > couple of things, please be patient, I will get on this asap.
> > > >
> > > > Cheers,
> > > >
> > > > Jack
> > > >
> > > >
> > > > On Fri, Jul 23, 2010 at 12:40 AM, John Hay  wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > (Jack any chance that you can look at this please?)
> > > > >
> > > > > It looks like there are 2 problems with the ixgbe driver on FreeBSD-8.
> > > > > I have a Dell T710 with 4 X 10G ethernet interfaces (2 X Dual port
> > > Intel
> > > > > 82599 cards). It is running FreeBSD RELENG_8.
> > > > >
> > > > > 1 - When routing (using vlans) there is heavy packet loss that go away
> > > > > when you do "ifconfig ix2 -rxcsum". The packet loss seems to be on the
> > > > > receive side because I do not see them on the receiving interface with
> > > > > tcpdump. This seems to impact both ipv4 and ipv6.
> > > > >
> > > > > My test setup is the Dell T710 with its ix2 connected to a 10G port of
> > > > > a Nortel 4526GTX. On that port I have 2 vlans configured with half of
> > > > > the 1G ports in the one vlan and the other half in the other vlan.
> > > > >
> > > > > If I test with iperf from one of the machines on a 1G port to the 
> > > > > T710,
> > > > > I get 920Mbit/s. If I do it simultaneously from a few machines
> > > connected
> > > > > to the 1G ports, all of them basically saturate their 1G links.
> > > > >
> > > > > If I now try to route from the one vlan to the other, ie. doing an
> > > iperf
> > > > > from a 1G connected machine, through the T710, to another 1G connected
> > > > > machine, I see packet loss, sometimes iperf is only able to do
> > > 100kbits/s.
> > > > > (Configuring a tcp relay, like socat, on the T710, and working through
> > > it,
> > > > > I again get 900Mbit/s and more.)
> > > > >
> > > > > So it seems that as long as the T710 with the 10G card is the start or
> > > > > end point of the connection, I get no packet loss, but as soon as it
> > > > > has to route, something go wrong.
> > > > >
> > > > > 2 - I see packet loss (0 - 40%) on IPv6 packets in vlans, when the
> > > > > machine is not the originator of the packets. This happen even with
> > > > > the "ifconfig ix2 -rxcsum".
> > > > >
> > > > > Let me try to describe a little more. If a neigbouring machine ping6
> > > it,
> > > > > there will be packet loss. If it act as a router for ipv6, there will
> > > be
> > > > > packet loss. This happen even when the network is pretty idle and with
> > > > > different switches (Nortel and Cisco equipment). The packet loss is
> > > > > very fluctuating. Pinging 1000 packets might loose 1% one time and the
> > > > > next time 30%. Looking with tcpdump, I can see the packets arriving 
> > > > > and
> > > > > going out, but the packet never arrive at the next machine. (My 
> > > > > feeling
> > > is
> > > > > that they get lost inside the card.) The error counters on the switch
> > > > > does not increment.
> > > > >
> > > > > I do not see packet loss if the machine originate the packets, for
> > > example
> > > > > ping6 from the machine. Also ipv4 packets do not have any packets 
> > > > > loss.
> > > If
> > > > > I do not use vlans, I don't see packet loss with ipv6 either.
> > > > >
> > > > > The machine also have bce 1G interfaces and I do not see the packet
> > > loss
> > > > > on them.
> > > > >
> > > > > Here is some info about the machine / setup. The numbers are pretty 
> > > > > low
> > > > > because I rebooted after compiling a kernel with IPFIREWALL,
> > > ROUTETABLES,
> > > > > MROUTING and FLOWTABLE removed. I'll add my kernel config file with
> > > empty
> > > > > and commented out lines removed.
> > > > >
> > > > > pciconf -lvc
> > > > > i...@pci0:129:0:0:   class=0x02 card=0x00038086 
> > > > > chip=0x10fb8086
> > > > > r

Re: re0 link UP/DOWN on 8.1-STABLE amd64

2010-08-11 Thread Pyun YongHyeon

On Wed, Aug 11, 2010 at 11:24:56PM +0300, Zeus V Panchenko wrote:
> Pyun YongHyeon (pyu...@gmail.com) [10.08.11 23:09] wrote:
> > On Wed, Aug 11, 2010 at 10:34:07PM +0300, Zeus V Panchenko wrote:
> > > oh, i forgoten :(
> > > 
> > > dmesg.boot contains:
> > > 
> > > re0:  > > Gigabit Ethernet> port 0xe800-0xe8ff mem 
> > > 0xfafff000-0xfaff,0xfaff8000-0xfaffbfff irq 17 at device 0.0 on pci2
> > > re0: Using 1 MSI messages
> > > re0: Chip rev. 0x2800
> > > re0: MAC rev. 0x
> > > miibus0:  on re0
> > > rgephy0:  PHY 1 on miibus0
> > > rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > 1000baseT-FDX, auto
> > > re0: Ethernet address: 48:5b:39:d2:1d:89
> > > re0: [FILTER]
> > > 
> > 
> > From the output above, I believe you're using slightly old stable.
> > Please use 8.1-RELEASE and let me know how it works on 8.1-RELEASE.
> 
> ooops  sorry, it was another box
> here the one i was begining from and where the problem persists too
> 
> > uname -a
> FreeBSD egw.ibs.dn.ua 8.1-STABLE FreeBSD 8.1-STABLE #0: Mon Aug  9 10:33:17 
> EEST 2010 r...@egw.ibs.dn.ua:/usr/obj/usr/src/sys/EGW  amd64
> 
> re0:  port 
> 0xe800-0xe8ff mem 0xfafff000-0xfaff,0xfaff8000-0xfaffbfff irq 17 at 
> device 0.0 on pci2
> re0: Using 1 MSI messages
> re0: Chip rev. 0x2800
> re0: MAC rev. 0x
> miibus0:  on re0
> rgephy0:  PHY 1 on miibus0
> rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-FDX, auto
> re0: Ethernet address: 20:cf:30:89:5e:95
> re0: [FILTER]
> 
> > devinfo -rv
> rgephy0 pnpinfo oui=0x732 model=0x11 rev=0x2 at phyno=1
> 
> i was cvsup-ing a couple of days ago ... now i have killed all tree and 
> cvs-ing again ...
> 

Ok, thanks for the info. Would you try the attached patch and let me
know whether it makes any difference?

Index: sys/dev/re/if_re.c
===
--- sys/dev/re/if_re.c	(revision 211176)
+++ sys/dev/re/if_re.c	(working copy)
@@ -1311,6 +1311,8 @@
 		 * RTL8111C/CP : supports up to 9KB jumbo frame.
 		 */
 		sc->rl_flags |= RL_FLAG_NOJUMBO;
+		if (hw_rev->rl_rev == RL_HWREV_8168D)
+			sc->rl_flags |= RL_FLAG_PHYWAKE_PM;
 		break;
 	case RL_HWREV_8168E:
 		sc->rl_flags |= RL_FLAG_PHYWAKE | RL_FLAG_PHYWAKE_PM |

Re: re0 link UP/DOWN on 8.1-STABLE amd64

2010-08-11 Thread Pyun YongHyeon

On Wed, Aug 11, 2010 at 03:27:48PM -0400, Tom Pusateri wrote:
> 
> On Aug 11, 2010, at 3:11 PM, Zeus V Panchenko wrote:
> 
> > Pyun YongHyeon (pyu...@gmail.com) [10.08.11 19:31] wrote:
> >> On Wed, Aug 11, 2010 at 03:50:14PM +0300, Zeus V Panchenko wrote:
> >>> Hi All,
> >>> 
> >>> can enybody help with the subj, please?
> >>> 
> >>> problem: onboard interface re0 link state UP/DOWN flapping
> >>> 
> >>> 
> >>> i have:
> >>> # uname -a 
> >>> FreeBSD 8.1-STABLE #0: Mon Aug  9 10:33:17 EEST 2010 amd64
> >>> 
> >>> # dmidecode
> >>> ...
> >>> Base Board Information
> >>> Manufacturer: ASUSTeK Computer INC.
> >>> Product Name: AT5NM10-I
> >>> ...
> >>> 
> >>> # pciconf -lcv
> >>> r...@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03 
> >>> hdr=0x00
> >>>vendor = 'Realtek Semiconductor'
> >>>device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> >>>class  = network
> >>>subclass   = ethernet
> >>>cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
> >>>cap 05[50] = MSI supports 1 message, 64 bit enabled with 1 message
> >>>cap 10[70] = PCI-Express 2 endpoint IRQ 2 max data 128(256) link x1(x1)
> >>>cap 11[ac] = MSI-X supports 4 messages in map 0x20
> >>>cap 03[cc] = VPD
> >>> 
> >>> 
> >>> # ifconfig re0
> >>> re0: flags=8843 metric 0 mtu 1500
> >>>
> >>> options=389b
> >>>ether 20:cf:30:89:5e:95
> >>>inet 10.10.0.111 netmask 0x broadcast 10.10.255.255
> >>>media: Ethernet 1000baseT 
> >>>status: active
> >>> 
> >>> 
> >>> 
> >>> 
> >>> sporadically interface begins to flap and dmesg shows:
> >>> ...
> >>> Aug 11 14:29:44 kernel: re0: link state changed to DOWN
> >>> Aug 11 14:29:47 kernel: re0: link state changed to UP
> >>> Aug 11 14:29:58 kernel: re0: link state changed to DOWN
> >>> Aug 11 14:30:01 kernel: re0: link state changed to UP
> >>> ...
> >>> 
> >>> 
> >>> systat doesn't show high interrupts on the card
> >>> # systat -v
> >>>1 usersLoad  0.06  0.02  0.00  Aug 11 15:45
> >>> 
> >>> Mem:KBREALVIRTUAL   VN PAGER   SWAP 
> >>> PAGER
> >>>Tot   Share  TotShareFree   in   out in   
> >>> out
> >>> Act 1069020  177580  2968312   209660  455852  count
> >>> All 1149408  184236 1076855k   251780  pages
> >>> Proc:
> >>> Interrupts
> >>>  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt278 cow8057 total
> >>>  1  74   885  842 1181   57  268  736278 zfod
> >>> atkbd0 1
> >>>  ozfod22 rl0 
> >>> irq17
> >>> 0.4%Sys   0.1%Intr  0.2%User  0.0%Nice 99.3%Idle%ozfod  2000 
> >>> cpu0: time
> >>> |||||||||||   daefr33 re0 
> >>> irq256
> >>>  379 prcfr 2 
> >>> ahci0 257
> >>>29 dtbuf  704 totfr  2000 
> >>> cpu1: time
> >>> Namei Name-cache   Dir-cache10 desvn  react  2000 
> >>> cpu3: time
> >>>   Callshits   %hits   % 87484 numvn  pdwak  2000 
> >>> cpu2: time
> >>> 921 921 100 24183 frevn  pdpgs
> >>>3 intrn
> >>> Disks  ada0  ada1 pass0 pass1  576392 wire
> >>> KB/t  21.40  0.00  0.00  0.00 1040084 act
> >>> tps   2 0 0 0 1948900 inact
> >>> MB/s   0.04  0.00  0.00  0.00 cache
> >>> %busy 0 0 0 0  455852 free
> >>>   427520 buf
> >>>

Re: re0 link UP/DOWN on 8.1-STABLE amd64

2010-08-11 Thread Pyun YongHyeon

On Wed, Aug 11, 2010 at 10:34:07PM +0300, Zeus V Panchenko wrote:
> oh, i forgoten :(
> 
> dmesg.boot contains:
> 
> re0:  Ethernet> port 0xe800-0xe8ff mem 0xfafff000-0xfaff,0xfaff8000-0xfaffbfff 
> irq 17 at device 0.0 on pci2
> re0: Using 1 MSI messages
> re0: Chip rev. 0x2800
> re0: MAC rev. 0x
> miibus0:  on re0
> rgephy0:  PHY 1 on miibus0
> rgephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-FDX, auto
> re0: Ethernet address: 48:5b:39:d2:1d:89
> re0: [FILTER]
> 

From the output above, I believe you're using a slightly old stable.
Please use 8.1-RELEASE and let me know how it works on 8.1-RELEASE.

Re: "RX ring hdr initialization error"

2010-08-11 Thread Pyun YongHyeon

On Wed, Aug 11, 2010 at 10:05:20AM -0700, Jack Vogel wrote:
> Why would you set the ring size so large? On a home system you should be
> fine with 1024 or even 512.
> 
> If you have a panic on boot reconfigure the kernel so em is not static, then
> load
> it as a module after boot and tune it that way, once you have it tweaked you
> can
> make it static again if you wish.
> 

He used the default TX/RX ring sizes.

> Jack
> 
> 
> On Wed, Aug 11, 2010 at 9:43 AM, Pyun YongHyeon  wrote:
> 
> > On Wed, Aug 11, 2010 at 10:19:11AM +0200, Victor Ophof wrote:
> > >
> > >
> > >
> > > From: pyu...@gmail.com
> > > Date: Tue, 10 Aug 2010 14:37:54 -0700
> > > To: mr4hu...@hotmail.com
> > > CC: j...@freebsd.org; freebsd-net@freebsd.org
> > > Subject: Re: "RX ring hdr initialization error"
> > >
> > > On Tue, Aug 10, 2010 at 12:52:56PM +0200, Victor Ophof wrote:
> > > >
> > > >
> > > >
> > > >
> > > > Hi
> > > >
> > > >  I've bought a asus M4a78-EM Motherboard.  to build a NAS on,
> > > > thinking the onboard Realtek would be sufficant speed
> > > > unfortunatly the onboard fives 16/31 mbs at best
> > > >
> > > > ps later It improved with enabeling "polling" in the kernel (duh)
> > > >
> > > > so I had a PCI intel GT nic around, what gave intermittent tcp/ip
> > connections in a other machine (ESXi)
> > > > unfortunatly this was the same with Freebsd (card issue?) The card is
> > still in the machine
> > > > even with the Intel supplied BSD driver
> > > >
> > > > now I bought a PCIe intel CT nic, put it in and the kernel panic with
> > > > "RX ring hdr initialization error"
> > > > so replaced the intel with the freebsd one by doing
> > > > intel overwrites the freebsd one /boot/kernel/if_em.ko
> > > > # cd /usr/src/sys/modules/em/ && make obj depend all install
> > > > (was already in the kernel)
> > > >
> > > >
> > > > still panic
> > > > anybody got some idea's howto fix ?
> > > >
> > >  --- reaction pyunyh ---
> > > I have been using the attached patch for em(4)/igb(4) controllers.
> > > These drivers explicitly calls panic(9) when memory allocation
> > > failure happens. I don't think it's good idea to panic the box
> > > under resource shortage condition as it's common to see this
> > > situation on heavily loaded servers.
> > >
> > > The patch does not solve the one issue yet. The panic caused by
> > > RX buffer allocation failure condition which in turn means you're
> > > allocating a lot of buffers. Reduce number of descriptors if you
> > > increased that too high and see whether the issue could be gone.
> > > ---/reaction pyunyh ---
> > > What buffers/descriptors do I need to reduce? I have 2 GB RAM and set the
> > > following in /boot/loader.conf:
> > > vm.kmem_size_max="1024m"
> >
> > The loader tunables are hw.em.txd and hw.em.rxd. I thought you
> > increased TX/RX descriptor size to large value(e.g. 4096).
> >
> > > vm.kmem_size="1024m"
> > > #vfs.zfs.prefetch_disable=1
> > > vm.kmem_size="2048M"
> > > vfs.zfs.arc_min="1024M"
> > > vfs.zfs.arc_max="1536M"
> > > vfs.zfs.vdev.min_pending=2
> > > vfs.zfs.vdev.max_pending=8
> > > vfs.zfs.txg.timeout=5
> > > aio_load="YES"
> > > ahci_load="YES"
> > >
> >
> > I guess zfs consumed a lot of memory such that em(4) was not able
> > to allocate RX buffers. It seems there is nothing can be done in
> > this case unless some memory is reclaimed from zfs. I'm not
> > familiar with zfs internals but others can comment on this.
> >
> > However the patch should fix the panic under these resource
> > shortage situation.

Re: "RX ring hdr initialization error"

2010-08-11 Thread Pyun YongHyeon

On Wed, Aug 11, 2010 at 10:19:11AM +0200, Victor Ophof wrote:
> 
> 
>  
> From: pyu...@gmail.com
> Date: Tue, 10 Aug 2010 14:37:54 -0700
> To: mr4hu...@hotmail.com
> CC: j...@freebsd.org; freebsd-net@freebsd.org
> Subject: Re: "RX ring hdr initialization error"
> 
> On Tue, Aug 10, 2010 at 12:52:56PM +0200, Victor Ophof wrote:
> > 
> > 
> > 
> > 
> > Hi 
> > 
> >  I've bought a asus M4a78-EM Motherboard.  to build a NAS on, 
> > thinking the onboard Realtek would be sufficant speed 
> > unfortunatly the onboard fives 16/31 mbs at best 
> > 
> > ps later It improved with enabeling "polling" in the kernel (duh) 
> > 
> > so I had a PCI intel GT nic around, what gave intermittent tcp/ip 
> > connections in a other machine (ESXi) 
> > unfortunatly this was the same with Freebsd (card issue?) The card is still 
> > in the machine
> > even with the Intel supplied BSD driver 
> > 
> > now I bought a PCIe intel CT nic, put it in and the kernel panic with 
> > "RX ring hdr initialization error"
> > so replaced the intel with the freebsd one by doing 
> > intel overwrites the freebsd one /boot/kernel/if_em.ko 
> > # cd /usr/src/sys/modules/em/ && make obj depend all install
> > (was already in the kernel) 
> >  
> > 
> > still panic 
> > anybody got some idea's howto fix ? 
> > 
>  --- reaction pyunyh ---
> I have been using the attached patch for em(4)/igb(4) controllers.
> These drivers explicitly calls panic(9) when memory allocation
> failure happens. I don't think it's good idea to panic the box
> under resource shortage condition as it's common to see this
> situation on heavily loaded servers.
>  
> The patch does not solve the one issue yet. The panic caused by
> RX buffer allocation failure condition which in turn means you're
> allocating a lot of buffers. Reduce number of descriptors if you 
> increased that too high and see whether the issue could be gone.
> ---/reaction pyunyh ---
> What buffers/descriptors do I need to reduce? I have 2 GB RAM and set the
> following in /boot/loader.conf:
> vm.kmem_size_max="1024m"

The loader tunables are hw.em.txd and hw.em.rxd. I thought you had
increased the TX/RX descriptor counts to a large value (e.g. 4096).
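
As a rough guide to why the descriptor count matters, the arithmetic
can be sketched as below. This assumes each RX descriptor is backed by
one receive buffer of a fixed size (for example a 2048-byte cluster at
the standard MTU); rx_ring_buffer_bytes() is a hypothetical helper, not
something in em(4).

```c
#include <stdint.h>

/*
 * Bytes of receive buffers needed to fill the RX rings, assuming one
 * buffer of `buf_size` bytes per descriptor (an assumption of this sketch).
 */
static uint64_t
rx_ring_buffer_bytes(uint32_t nrxd, uint32_t buf_size, uint32_t nqueues)
{
	return ((uint64_t)nrxd * buf_size * nqueues);
}
```

Under that assumption, hw.em.rxd=4096 with 2048-byte buffers needs
8 MB of clusters per ring, while a ring of 256 descriptors needs
512 KB, so a very large ring is the first thing to rule out when
buffer allocation fails at attach time.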

> vm.kmem_size="1024m"
> #vfs.zfs.prefetch_disable=1
> vm.kmem_size="2048M"
> vfs.zfs.arc_min="1024M"
> vfs.zfs.arc_max="1536M"
> vfs.zfs.vdev.min_pending=2
> vfs.zfs.vdev.max_pending=8
> vfs.zfs.txg.timeout=5
> aio_load="YES"
> ahci_load="YES"
> 

I guess zfs consumed so much memory that em(4) was not able
to allocate RX buffers. It seems nothing can be done in
this case unless some memory is reclaimed from zfs. I'm not
familiar with zfs internals, but others can comment on this.

However, the patch should fix the panic in these resource
shortage situations.

Re: re0 link UP/DOWN on 8.1-STABLE amd64

2010-08-11 Thread Pyun YongHyeon

On Wed, Aug 11, 2010 at 03:50:14PM +0300, Zeus V Panchenko wrote:
> Hi All,
> 
> can anybody help with the subject, please?
> 
> problem: onboard interface re0 link state UP/DOWN flapping
> 
> 
> i have:
> # uname -a 
> FreeBSD 8.1-STABLE #0: Mon Aug  9 10:33:17 EEST 2010 amd64
> 
> # dmidecode
> ...
> Base Board Information
>  Manufacturer: ASUSTeK Computer INC.
>  Product Name: AT5NM10-I
> ...
> 
> # pciconf -lcv
> r...@pci0:2:0:0: class=0x02 card=0x83a31043 chip=0x816810ec rev=0x03 
> hdr=0x00
> vendor = 'Realtek Semiconductor'
> device = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
> class  = network
> subclass   = ethernet
> cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
> cap 05[50] = MSI supports 1 message, 64 bit enabled with 1 message
> cap 10[70] = PCI-Express 2 endpoint IRQ 2 max data 128(256) link x1(x1)
> cap 11[ac] = MSI-X supports 4 messages in map 0x20
> cap 03[cc] = VPD
> 
> 
> # ifconfig re0
> re0: flags=8843 metric 0 mtu 1500
> 
> options=389b
> ether 20:cf:30:89:5e:95
> inet 10.10.0.111 netmask 0x broadcast 10.10.255.255
> media: Ethernet 1000baseT 
> status: active
> 
> 
> 
> 
> sporadically interface begins to flap and dmesg shows:
> ...
> Aug 11 14:29:44 kernel: re0: link state changed to DOWN
> Aug 11 14:29:47 kernel: re0: link state changed to UP
> Aug 11 14:29:58 kernel: re0: link state changed to DOWN
> Aug 11 14:30:01 kernel: re0: link state changed to UP
> ...
> 
> 
> systat doesn't show high interrupts on the card
> # systat -v
> 1 usersLoad  0.06  0.02  0.00  Aug 11 15:45
> 
> Mem:KBREALVIRTUAL   VN PAGER   SWAP PAGER
> Tot   Share  TotShareFree   in   out in   out
> Act 1069020  177580  2968312   209660  455852  count
> All 1149408  184236 1076855k   251780  pages
> Proc:Interrupts
>   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt278 cow8057 total
>   1  74   885  842 1181   57  268  736278 zfodatkbd0 1
>   ozfod22 rl0 
> irq17
>  0.4%Sys   0.1%Intr  0.2%User  0.0%Nice 99.3%Idle%ozfod  2000 cpu0: 
> time
> |||||||||||   daefr33 re0 
> irq256
>   379 prcfr 2 ahci0 
> 257
> 29 dtbuf  704 totfr  2000 cpu1: 
> time
> Namei Name-cache   Dir-cache10 desvn  react  2000 cpu3: 
> time
>Callshits   %hits   % 87484 numvn  pdwak  2000 cpu2: 
> time
>  921 921 100 24183 frevn  pdpgs
> 3 intrn
> Disks  ada0  ada1 pass0 pass1  576392 wire
> KB/t  21.40  0.00  0.00  0.00 1040084 act
> tps   2 0 0 0 1948900 inact
> MB/s   0.04  0.00  0.00  0.00 cache
> %busy 0 0 0 0  455852 free
>427520 buf
> 
> 
> i have changed motherboards ... the same effect. after some time the
> problem appears again
> 
> 
> is there any info i can provide?
> 

Show me the output of dmesg and "devinfo -rv | grep rgephy".

Re: "RX ring hdr initialization error"

2010-08-10 Thread Pyun YongHyeon

On Tue, Aug 10, 2010 at 12:52:56PM +0200, Victor Ophof wrote:
> 
> 
> 
> 
> Hi 
> 
>  I've bought an asus M4a78-EM motherboard to build a NAS on,
> thinking the onboard Realtek would be fast enough.
> Unfortunately the onboard NIC gives 16/31 mbs at best.
> 
> PS: it later improved after enabling "polling" in the kernel (duh)
> 
> So I had a PCI Intel GT NIC around, which gave intermittent TCP/IP
> connections in another machine (ESXi).
> Unfortunately this was the same with FreeBSD (card issue?). The card is
> still in the machine,
> even with the Intel-supplied BSD driver.
> 
> Now I bought a PCIe Intel CT NIC, put it in, and the kernel panics with
> "RX ring hdr initialization error"
> so replaced the intel with the freebsd one by doing 
> intel overwrites the freebsd one /boot/kernel/if_em.ko 
> # cd /usr/src/sys/modules/em/ && make obj depend all install
> (was already in the kernel) 
>  
> 
> It still panics.
> Anybody got some ideas how to fix it?
> 

I have been using the attached patch for em(4)/igb(4) controllers.
These drivers explicitly call panic(9) when a memory allocation
failure happens. I don't think it's a good idea to panic the box
under a resource shortage condition, as it's common to see this
situation on heavily loaded servers.
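
The general pattern, reporting the failure to the caller and unwinding
instead of calling panic(9), can be sketched in user-space C as below.
This is only an illustration: fake_ifnet, fake_adapter,
setup_interface() and attach() are hypothetical stand-ins, and the
simulate_failure argument exists only so the failure path can be
exercised.

```c
#include <stdio.h>
#include <stdlib.h>

struct fake_ifnet { int unit; };

struct fake_adapter {
	struct fake_ifnet *ifp;
	int queues_ready;	/* stands in for earlier setup steps */
};

/* Return 0 on success, -1 if the interface structure cannot be allocated. */
static int
setup_interface(struct fake_adapter *sc, int simulate_failure)
{
	sc->ifp = simulate_failure ? NULL : malloc(sizeof(*sc->ifp));
	if (sc->ifp == NULL) {
		fprintf(stderr, "can not allocate ifnet structure\n");
		return (-1);
	}
	sc->ifp->unit = 0;
	return (0);
}

/* Attach routine: on failure, undo earlier steps and return an error. */
static int
attach(struct fake_adapter *sc, int simulate_failure)
{
	sc->ifp = NULL;
	sc->queues_ready = 1;

	if (setup_interface(sc, simulate_failure) != 0)
		goto err_late;
	return (0);

err_late:
	sc->queues_ready = 0;
	if (sc->ifp != NULL)
		free(sc->ifp);
	sc->ifp = NULL;
	return (-1);
}
```

The point of the design is that a failed attach leaves the rest of the
system running; only the one device is unavailable.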

The patch does not solve one issue yet: the panic caused by an
RX buffer allocation failure, which in turn means you're
allocating a lot of buffers. Reduce the number of descriptors if you
increased it too high and see whether the issue goes away.
Index: sys/dev/e1000/if_igb.c
===
--- sys/dev/e1000/if_igb.c	(revision 211102)
+++ sys/dev/e1000/if_igb.c	(working copy)
@@ -178,7 +178,7 @@
 static void	igb_free_pci_resources(struct adapter *);
 static void	igb_local_timer(void *);
 static void	igb_reset(struct adapter *);
-static void	igb_setup_interface(device_t, struct adapter *);
+static int	igb_setup_interface(device_t, struct adapter *);
 static int	igb_allocate_queues(struct adapter *);
 static void	igb_configure_queues(struct adapter *);
 
@@ -559,7 +559,8 @@
 		goto err_late;
 
 	/* Setup OS specific network interface */
-	igb_setup_interface(dev, adapter);
+	if (igb_setup_interface(dev, adapter) != 0)
+		goto err_late;
 
 	/* Now get a good starting state */
 	igb_reset(adapter);
@@ -608,6 +609,8 @@
 	igb_free_transmit_structures(adapter);
 	igb_free_receive_structures(adapter);
 	igb_release_hw_control(adapter);
+	if (adapter->ifp != NULL)
+		if_free(adapter->ifp);
 err_pci:
 	igb_free_pci_resources(adapter);
 	IGB_CORE_LOCK_DESTROY(adapter);
@@ -2653,7 +2656,7 @@
  *  Setup networking device structure and register an interface.
  *
  **/
-static void
+static int
 igb_setup_interface(device_t dev, struct adapter *adapter)
 {
 	struct ifnet   *ifp;
@@ -2661,8 +2664,10 @@
 	INIT_DEBUGOUT("igb_setup_interface: begin");
 
 	ifp = adapter->ifp = if_alloc(IFT_ETHER);
-	if (ifp == NULL)
-		panic("%s: can not if_alloc()", device_get_nameunit(dev));
+	if (ifp == NULL) {
+		device_printf(dev, "can not allocate ifnet structure\n");
+		return (-1);
+	}
 	if_initname(ifp, device_get_name(dev), device_get_unit(dev));
 	ifp->if_mtu = ETHERMTU;
 	ifp->if_init =  igb_init;
@@ -2739,6 +2744,7 @@
 	}
 	ifmedia_add(&adapter->media, IFM_ETHER | IFM_AUTO, 0, NULL);
 	ifmedia_set(&adapter->media, IFM_ETHER | IFM_AUTO);
+	return (0);
 }
 
 
Index: sys/dev/e1000/if_lem.c
===
--- sys/dev/e1000/if_lem.c	(revision 211102)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -186,7 +186,7 @@
 static void	lem_free_pci_resources(struct adapter *);
 static void	lem_local_timer(void *);
 static int	lem_hardware_init(struct adapter *);
-static void	lem_setup_interface(device_t, struct adapter *);
+static int	lem_setup_interface(device_t, struct adapter *);
 static void	lem_setup_transmit_structures(struct adapter *);
 static void	lem_initialize_transmit_unit(struct adapter *);
 static int	lem_setup_receive_structures(struct adapter *);
@@ -620,7 +620,8 @@
 	lem_get_wakeup(dev);
 
 	/* Setup OS specific network interface */
-	lem_setup_interface(dev, adapter);
+	if (lem_setup_interface(dev, adapter) != 0)
+		goto err_rx_struct;
 
 	/* Initialize statistics */
 	lem_update_stats_counters(adapter);
@@ -672,6 +673,8 @@
 	lem_dma_free(adapter, &adapter->txdma);
 err_tx_desc:
 err_pci:
+	if (adapter->ifp != NULL)
+		if_free(adapter->ifp);
 	lem_free_pci_resources(adapter);
 	EM_TX_LOCK_DESTROY(adapter);
 	EM_RX_LOCK_DESTROY(adapter);
@@ -1939,6 +1942,19 @@
 
 	IOCTL_DEBUGOUT("lem_set_multi: begin");
 
+	/*
+	 * Allocate temporary memory to setup array.  If there is not
+	 * enough resource, give up setting multicast filter.
+	 */
+	mta = malloc(sizeof(u8) *
+	(ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES),
+	M_DEVBUF, M_NOWAIT | M_ZERO);
+	if (mta == NULL) {
+		device_printf(adapter->dev,
+		"can no

Re: Watchdog resets on 82575

2010-08-10 Thread Pyun YongHyeon

On Tue, Aug 10, 2010 at 03:57:22AM -0700, Jeremy Chadwick wrote:
> On Tue, Aug 10, 2010 at 11:23:26AM +0100, Steven Hartland wrote:
> > Thanks Jeremy, from that we get:-
> > 
> > i...@pci0:1:0:0: class=0x02 card=0x060015d9 chip=0x10c98086 rev=0x01 hdr=0x00
> >     vendor     = 'Intel Corporation'
> >     class      = network
> >     subclass   = ethernet
> >     cap 01[40] = powerspec 3  supports D0 D3  current D0
> >     cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> >     cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
> >     cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
> > i...@pci0:1:0:1: class=0x02 card=0x060015d9 chip=0x10c98086 rev=0x01 hdr=0x00
> >     vendor     = 'Intel Corporation'
> >     class      = network
> >     subclass   = ethernet
> >     cap 01[40] = powerspec 3  supports D0 D3  current D0
> >     cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> >     cap 11[70] = MSI-X supports 10 messages in map 0x1c enabled
> >     cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
> > 
> > I assume there is a way to convert from the hex values to the human value
> > but not sure what it is?
> 
> The "card" and "chip" identifiers are part of the PCI ID specification.
> You can see what the "human value" is by examining the source code for
> the driver.  Sometimes it's easy to figure out, other times there's a
> series of #define's which you have to reverse engineer.
> 
> In this case, there's two places with relevant information:
> 
> src/sys/dev/e1000/if_igb.c
> src/sys/dev/e1000/e1000_hw.h
> 
> You have to split the Chip ID into two separate 16-bit portions, so
> 0x10c9 and 0x8086.
> 
> 0x8086 is Intel's vendor code.  0x10c9 is the device ID of the
> individual NIC/model type.  So:
> 
> $ grep -i 0x10c9 *
> e1000_hw.h:#define E1000_DEV_ID_82576 0x10C9
> 
> For Jack: igb_vendor_info_array should really be extended to include
> actual ASCII strings for the individual chips/models/codenames.  I'm
> sure that's on your todo list somewhere.  I'd be willing to write this
> but would need a list of the models (or maybe the Linux driver has them
> in comments, etc. and I could go off of that).
> 

I guess em(4)/igb(4)/ixgb(4)/ixgbe(4) only show the vendor string and
driver version, which effectively hides controller name/model
details in the device attach phase. Personally I'd like to see more
detailed controller model information, which may help narrow down
the list of affected controllers when an issue is reported.
Currently we have to get this information by requesting the output
of pciconf(8), which in turn requires one more round trip of mail.
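
The split of the pciconf "chip" value described in the quoted text can be sketched as follows. This is a minimal illustration, assuming the 32-bit value carries the device ID in the upper 16 bits and the vendor ID in the lower 16 bits, as in the 0x10c98086 example above; the helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers: split a pciconf "chip" value into its two IDs. */

/* Lower 16 bits: vendor ID (0x8086 is Intel). */
static uint16_t
pci_chip_vendor(uint32_t chip)
{
	return ((uint16_t)(chip & 0xffff));
}

/* Upper 16 bits: device ID of the individual controller model. */
static uint16_t
pci_chip_device(uint32_t chip)
{
	return ((uint16_t)(chip >> 16));
}
```

For chip=0x10c98086 this yields vendor 0x8086 and device 0x10c9, which is the value to grep for in e1000_hw.h.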
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: does if_vr export true packet error statistics?

2010-07-27 Thread Pyun YongHyeon

On Tue, Jul 27, 2010 at 09:14:22AM +0400, Lev Serebryakov wrote:
> Hello, Freebsd-net.
> 
> I have huge losses ("netstat -s -p tcp" shows 4% of packets, but
> 35% of bytes are retransmitted) on my Internet connection, which is
> PPPoE over a 100Mbit Ethernet link. The provider claims that it is a
> physical-level problem and that I should fix the cables in my network.
> But `netstat -i' shows 0 input errors / 0 output errors / 0 collisions
> on the physical interface, which is if_vr.
> 
> Can I believe these stats? Does if_vr export proper error
> statistics?
> 

I think so. You can also check more detailed statistics of vr(4)
with sysctl(8).
# sysctl dev.vr.0.stats=1

But I guess you wouldn't see errors there. To me, your issue looks
like a wrongly advertised MSS, which could be triggered by an
incorrect MTU/MRU configuration of PPPoE. Check your PPPoE
configuration.
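
The relationship between the PPPoE MTU and the TCP MSS can be sketched as follows. This is a minimal illustration, assuming IPv4, TCP without options, and the usual 8 bytes of PPPoE/PPP encapsulation overhead on top of a 1500-byte Ethernet MTU; the function and constant names are hypothetical.

```c
/* Assumed header sizes; TCP/IP options or IPv6 would change them. */
enum {
	PPPOE_OVERHEAD = 8,	/* 6-byte PPPoE header + 2-byte PPP protocol ID */
	IPV4_HDR_LEN = 20,	/* IPv4 header without options */
	TCP_HDR_LEN = 20	/* TCP header without options */
};

/* MTU available to IP on a PPPoE session over the given Ethernet MTU. */
static int
pppoe_mtu(int ether_mtu)
{
	return (ether_mtu - PPPOE_OVERHEAD);
}

/* Largest TCP payload that fits in one packet of the given MTU. */
static int
tcp_mss(int mtu)
{
	return (mtu - IPV4_HDR_LEN - TCP_HDR_LEN);
}
```

Under these assumptions a PPPoE link over standard Ethernet has an MTU of 1492 and should advertise an MSS of at most 1452; an MSS of 1460 only fits plain Ethernet.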

Re: kern/148772: [alc] alc0 does not send/receive packets if not plugged in during boot

2010-07-21 Thread Pyun YongHyeon

On Wed, Jul 21, 2010 at 10:03:32PM +0200, Kurt Jaeger wrote:
> Hi!
> 
> > > http://opsec.eu/backup/alc-bug/dmesg.boot-verbose
> 
> > One odd thing is alc(4) failed to read station address from EEPROM.
> > So alc(4) assumed BIOS correctly programmed station address but the
> > station address looks wrong to me.
> 
> > How about cold booting? Does other OS also report the same station
> > address?
> 
> I have no other OS at hand right now 8-}
> 
> > > with the patch applied (and booted with a cable).
> > > 
> > > Before the patch:
> > > 
> > > http://opsec.eu/backup/alc-bug/dmesg.boot
> > > 
> > 
> > Would you try this one?
> > http://people.freebsd.org/~yongari/alc/alc.link.patch2
> 
> It works better, does not hang during boot.
> 
> Next: add break-to-debugger 8-(
> 
> > > Thanks! If you need remote access...
> > 
> > That does not work mainly because I can't unplug/plug UTP cable
> > through remote access.
> 
> must.work.on.telekinetic.power 8-)
> 
> Now, this is going somewhere, as follows:
> 
> 1)
> reboot with unplugged cable, then some ifconfig alc0 up/down, then:
> I ping'ed on the alc0 host and tcpdump on the other host sees some traffic
> (this failed in the past):
> 
> 21:19:59.983843 48:5b:39:73:03:4f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 192.168.5.11 tell 192.168.5.10
> 21:19:59.983855 00:e0:18:fc:7f:00 > 48:5b:39:73:03:4f, ethertype ARP (0x0806), length 42: arp reply 192.168.5.11 is-at 00:e0:18:fc:7f:00
> 
> But: apparently the alc0 does not receive the answer, and so it
> fails to register the arp.
> 
> Hmm.
> 
> 2) reboot with cable plugged in: ping etc works immediately.
> 
> 3) shutdown and reboot:
> 
> ifconfig alc0 says:
> 
> alc0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=c3198
>     ether 48:5b:39:73:03:4f
>     media: Ethernet autoselect
> 
> then:
> 
> ndog# ifconfig alc0 192.168.5.10
> ndog# ifconfig alc0
> alc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=c3198
>     ether 48:5b:39:73:03:4f
>     inet 192.168.5.10 netmask 0xffffff00 broadcast 192.168.5.255
>     media: Ethernet autoselect (none )
>     status: no carrier
> 
> then after a few seconds the netbook just hung 8-(
> 
> 4) shutdown and reboot from cold, unplugged cable:
> 
> - started tcpdump on alc0
> - plugged in cable
> - ifconfig alc0 192.168.5.10
> 
> whow, it works.
> 
> I then unplugged, replugged etc. Looks stable now. Did some ipv6 over
> it. Rebooted with this as the primary interface. Works fine.
> 
> Cool. Thank you very much.
> 

OK, the results look mixed. It's possible that a warm boot does not
clear some power-related configuration of the system, which could
have been incorrectly programmed by the stock alc(4).
So start testing from a cold boot, with and without the UTP cable,
and see whether alc(4) can establish a valid link with the link
partner. You should never see "" media. If that works as expected,
test warm booting with and without the UTP cable.
