Re: em driver regression

2010-04-14 Thread Mikolaj Golub
On Wed, 14 Apr 2010 09:28:33 -0700 Jack Vogel wrote:

> Oh, didn't realize you were running the lem code :)  Will make the changes
> shortly,

r206614 works for me. Thanks :-)

> thanks for your debugging efforts.
>
> Jack

-- 
Mikolaj Golub
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: em driver regression

2010-04-14 Thread Jack Vogel
Oh, didn't realize you were running the lem code :)  Will make the changes
shortly,
thanks for your debugging efforts.

Jack


On Wed, Apr 14, 2010 at 2:29 AM, Mikolaj Golub wrote:

>
> On Sun, 11 Apr 2010 23:40:03 +0300 Mikolaj Golub wrote:
>
>  MG> Hi,
>
>  MG> Today I have upgraded the kernel in my VirtualBox (3.1.51.r27187) to the
>  MG> latest current and have "em0: Watchdog timeout -- resetting" issue. My
>  MG> previous kernel was for Mar 12.
>
>  MG> Tracking the revision where the problem appeared I see that the issue is not
>  MG> observed for r203834 and starts to observe after r205869.
>
>  MG> Interestingly, if I enter ddb and then exit (sometimes I needed to do this
>  MG> twice) the errors stop and network starts working.
>
> Adding some prints I observed the following:
>
> Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 813, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 818, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 818, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 818, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 823, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 828, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 923, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 923, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1023, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 1023, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1023, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 1024, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 1028, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1128, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 1 (ticks: 1128, watchdog_time: 0)
> Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1128, watchdog_time: 0)
> ...
>
> So although adapter->watchdog_check was set to TRUE, adapter->watchdog_time was
> never set.
>
> I see that before r205869 watchdog_time was set in em_xmit but lem_xmit does
> not contain this. After adding back this line to lem_xmit (see the first patch
> below) the problem has gone on my box.
>
> Also seeing that in the current em_mq_start_locked() both watchdog_check and
> watchdog_time are set I tried another patch adding watchdog_time setting in
> lem_mq_start_locked() too (see the second patch below). This has also fixed
> the issue for me but I don't know if this is a correct fix and if this is the
> only place where watchdog_time should be set (there are other places in the
> function and in the code where watchdog_check is set to TRUE but watchdog_time
> is not set).
>
> --
> Mikolaj Golub
>
>


Re: em driver regression

2010-04-14 Thread Mikolaj Golub

On Sun, 11 Apr 2010 23:40:03 +0300 Mikolaj Golub wrote:

 MG> Hi,

 MG> Today I have upgraded the kernel in my VirtualBox (3.1.51.r27187) to the
 MG> latest current and have "em0: Watchdog timeout -- resetting" issue. My
 MG> previous kernel was for Mar 12.

 MG> Tracking the revision where the problem appeared I see that the issue is not
 MG> observed for r203834 and starts to observe after r205869.

 MG> Interestingly, if I enter ddb and then exit (sometimes I needed to do this
 MG> twice) the errors stop and network starts working.

Adding some prints I observed the following:

Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 813, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 818, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 823, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: setting watchdog_check to TRUE in lem_mq_start_locked 1 (ticks 828, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 923, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 3 (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1023, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked started (ticks 1024, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_init_locked returned at 3 (ticks 1028, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof started (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: lem_txeof returned at 1 (ticks: 1128, watchdog_time: 0)
Apr 14 07:14:08 hasta kernel: em0: Watchdog timeout -- resetting (ticks: 1128, watchdog_time: 0)
...

So although adapter->watchdog_check was set to TRUE, adapter->watchdog_time
was never set.

I see that before r205869 watchdog_time was set in em_xmit, but lem_xmit does
not do this. After adding this line back to lem_xmit (see the first patch
below) the problem went away on my box.

Also, seeing that the current em_mq_start_locked() sets both watchdog_check
and watchdog_time, I tried another patch that sets watchdog_time in
lem_mq_start_locked() too (see the second patch below). This also fixed the
issue for me, but I don't know whether it is a correct fix or whether this is
the only place where watchdog_time should be set (there are other places in
the function and elsewhere in the code where watchdog_check is set to TRUE but
watchdog_time is not set).

-- 
Mikolaj Golub

Index: sys/dev/e1000/if_lem.c
===
--- sys/dev/e1000/if_lem.c	(revision 206595)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -1880,6 +1880,7 @@ lem_xmit(struct adapter *adapter, struct mbuf **m_
 	 */
 	tx_buffer = &adapter->tx_buffer_area[first];
 	tx_buffer->next_eop = last;
+	adapter->watchdog_time = ticks;
 
 	/*
 	 * Advance the Transmit Descriptor Tail (TDT), this tells the E1000
Index: sys/dev/e1000/if_lem.c
===
--- sys/dev/e1000/if_lem.c	(revision 206595)
+++ sys/dev/e1000/if_lem.c	(working copy)
@@ -873,6 +873,7 @@ lem_mq_start_locked(struct ifnet *ifp, struct mbuf
 			*/
 			ETHER_BPF_MTAP(ifp, m);
 			adapter->watchdog_check = TRUE;
+			adapter->watchdog_time = ticks;
 		}
 	} else if ((error = drbr_enqueue(ifp, adapter->br, m)) != 0)
 		return (error);

Re: em driver regression

2010-04-11 Thread Mikolaj Golub
Hi,

On Thu, 8 Apr 2010 14:52:07 -0500 Brandon Gooch wrote:

> On Thu, Apr 8, 2010 at 2:17 PM, Jack Vogel  wrote:
>> Try the code I just checked in, it puts in the CRC stripping, but also
>> tweaks the
>> TX code, this may resolve the watchdogs. Let me know.
>>
>> Cheers,
>>
>> Jack
>>
>
> Yes, this is indeed the fix for both the dhclient and VirtualBox issue
> (at least with my setup). There appear to be no ill effects either.

Today I upgraded the kernel in my VirtualBox (3.1.51.r27187) to the
latest current and now see the "em0: Watchdog timeout -- resetting" issue. My
previous kernel was from Mar 12.

Tracking down the revision where the problem appeared, I see that the issue is
not observed with r203834 and appears after r205869.

Interestingly, if I enter ddb and then exit (sometimes I need to do this
twice) the errors stop and the network starts working.

-- 
Mikolaj Golub


Re: em driver regression

2010-04-09 Thread Jack Vogel
On Fri, Apr 9, 2010 at 9:41 AM, Pyun YongHyeon  wrote:

> On Fri, Apr 09, 2010 at 09:17:07AM -0400, Mike Tancsa wrote:
> > At 07:07 PM 4/8/2010, Pyun YongHyeon wrote:
> > >On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
> > >> Only one device support by em does multiqueue right now, and that is
> > >> Hartwell, 82574.
> > >>
> > >
> > >Thanks for the info.
> > >
> > >Mike, here is updated patch. Now UDP bulk TX transfer performance
> > >recovered a lot(about 890Mbps) but it still shows bad numbers
> > >compared to other controllers. For example, bce(4) shows about
> > >958Mbps for the same load.
> > >During the testing I found a strong indication of packet reordering
> > >issue of drbr interface. If I forcibly change to use single TX
> > >queue, em(4) got 950Mbps as it used to be.
> > >
> > >Jack, as we talked about possible drbr issue with igb(4), UDP
> > >transfer seems to suffer from packet reordering issue here. Can we
> > >make em(4)/igb(4) use single TX queue until we solve drbr interface
> > >issue? Given that only one em(4) controller supports multiqueue,
> > >dropping multiqueue support for em(4) does not look bad to me.
> >
> > No watchdog errors over night. I wonder if the issue was due to
> > 100Mb, or the patch from current fixed it.  I will try today with the
> > new patch below! I am guessing the rejection was due to the RX/TX fix ?
> >
>
> The patch was generated against latest HEAD. This includes Jack's
> latest fix too so it may not be applied cleanly on stable/8.
> I think you can use em(4) in HEAD.
>

Yes, you can.  And I think it's the code change, not the
speed, Mike.

Jack


Re: em driver regression

2010-04-09 Thread Pyun YongHyeon
On Fri, Apr 09, 2010 at 09:17:07AM -0400, Mike Tancsa wrote:
> At 07:07 PM 4/8/2010, Pyun YongHyeon wrote:
> >On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
> >> Only one device support by em does multiqueue right now, and that is
> >> Hartwell, 82574.
> >>
> >
> >Thanks for the info.
> >
> >Mike, here is updated patch. Now UDP bulk TX transfer performance
> >recovered a lot(about 890Mbps) but it still shows bad numbers
> >compared to other controllers. For example, bce(4) shows about
> >958Mbps for the same load.
> >During the testing I found a strong indication of packet reordering
> >issue of drbr interface. If I forcibly change to use single TX
> >queue, em(4) got 950Mbps as it used to be.
> >
> >Jack, as we talked about possible drbr issue with igb(4), UDP
> >transfer seems to suffer from packet reordering issue here. Can we
> >make em(4)/igb(4) use single TX queue until we solve drbr interface
> >issue? Given that only one em(4) controller supports multiqueue,
> >dropping multiqueue support for em(4) does not look bad to me.
> 
> No watchdog errors over night. I wonder if the issue was due to 
> 100Mb, or the patch from current fixed it.  I will try today with the 
> new patch below! I am guessing the rejection was due to the RX/TX fix ?
> 

The patch was generated against latest HEAD. This includes Jack's
latest fix too so it may not be applied cleanly on stable/8.
I think you can use em(4) in HEAD.


Re: em driver regression

2010-04-09 Thread Mike Tancsa

At 07:07 PM 4/8/2010, Pyun YongHyeon wrote:

>On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
>> Only one device support by em does multiqueue right now, and that is
>> Hartwell, 82574.
>>
>
>Thanks for the info.
>
>Mike, here is updated patch. Now UDP bulk TX transfer performance
>recovered a lot(about 890Mbps) but it still shows bad numbers
>compared to other controllers. For example, bce(4) shows about
>958Mbps for the same load.
>During the testing I found a strong indication of packet reordering
>issue of drbr interface. If I forcibly change to use single TX
>queue, em(4) got 950Mbps as it used to be.
>
>Jack, as we talked about possible drbr issue with igb(4), UDP
>transfer seems to suffer from packet reordering issue here. Can we
>make em(4)/igb(4) use single TX queue until we solve drbr interface
>issue? Given that only one em(4) controller supports multiqueue,
>dropping multiqueue support for em(4) does not look bad to me.


No watchdog errors overnight. I wonder whether the issue was due to running
at 100Mb, or whether the patch from current fixed it.  I will try today with
the new patch below! I am guessing the rejection was due to the RX/TX fix?


---Mike

Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|Index: sys/dev/e1000/if_em.c
|===
|--- sys/dev/e1000/if_em.c  (revision 206403)
|+++ sys/dev/e1000/if_em.c  (working copy)
--
Patching file if_em.c using Plan A...
Hunk #1 succeeded at 812 with fuzz 2.
Hunk #2 succeeded at 834 (offset -4 lines).
Hunk #3 succeeded at 869 (offset -4 lines).
Hunk #4 succeeded at 913 (offset -4 lines).
Hunk #5 succeeded at 941 (offset -4 lines).
Hunk #6 succeeded at 1439 (offset -4 lines).
Hunk #7 succeeded at 1452 (offset -4 lines).
Hunk #8 succeeded at 1472 (offset -4 lines).
Hunk #9 succeeded at 1532 (offset -4 lines).
Hunk #10 succeeded at 1549 (offset -4 lines).
Hunk #11 failed at 1909.
Hunk #12 succeeded at 3617 (offset 2 lines).
Hunk #13 succeeded at 4069 (offset -6 lines).
Hunk #14 succeeded at 4087 (offset 2 lines).
Hunk #15 succeeded at 4187 (offset -6 lines).
1 out of 15 hunks failed--saving rejects to if_em.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--
|Index: sys/dev/e1000/if_em.h
|===
|--- sys/dev/e1000/if_em.h  (revision 206403)
|+++ sys/dev/e1000/if_em.h  (working copy)
--
Patching file if_em.h using Plan A...
Hunk #1 succeeded at 223.
done
1(ich10)# less if_em.c.rej
***
*** 1908,1919 
bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map,
BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i);
-   txr->watchdog_time = ticks;

- /* Call cleanup if number of TX descriptors low */
-   if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
-   em_txeof(txr);
-
return (0);
  }

--- 1909,1915 
bus_dmamap_sync(txr->txdma.dma_tag, txr->txdma.dma_map,
BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i);

return (0);
  }

0(ich10)#



> Jack
>
>
> On Thu, Apr 8, 2010 at 2:05 PM, Mike Tancsa  wrote:
>
> > At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:
> >
> >> On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
> >> > At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
> >> >
> >> > >Try this patch. It should fix the issue. It seems Jack forgot to
> >> > >strip CRC bytes as old em(4) didn't strip it, probably to
> >> > >workaround silicon bug of old em(4) controllers.
> >> >
> >> > Thanks! The attached patch does indeed fix the dhclient issue.
> >> >
> >> >
> >> > >It seems there are also TX issues here. The system load is too high
> >> > >and sometimes system is not responsive while TX is in progress.
> >> > >Because I initiated TCP bulk transfers, TSO should reduce CPU load
> >> > >a lot but it didn't so I guess it could also be related watchdog
> >> > >timeouts you've seen. I'll see what can be done.
> >> >
> >> > Thanks for looking into that as well!!
> >> >
> >> > ---Mike
> >> >
> >>
> >> Mike,
> >>
> >> Here is patch I'm working on. This patch fixes high system load and
> >> system is very responsive as before. But it seems there is still
> >> some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
> >> and I have no idea what caused this at this moment.
> >>
> >> BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
> >> whether latest fix from Jack cured it. By chance does your
> >> controller support multi TX/RX queues? You can check whether em(4)
> >> uses multi-queues with "vmstat -i". If em(4) use multi-queue you
> >> may have multiple irq output for em0.
> >>
> >
> > Hi,
> >I will give it a try later tonight!  This 

Re: em driver regression

2010-04-08 Thread Jack Vogel
Ah, ok, let me play around with it a bit; perhaps I'll make it definable.
Of course, if there is no positive benefit from using it, it would seem silly
to leave it around :)

Will look at your patch changes and that issue tomorrow. Thanks
for your efforts!

Jack


On Thu, Apr 8, 2010 at 4:07 PM, Pyun YongHyeon  wrote:

> On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
> > Only one device support by em does multiqueue right now, and that is
> > Hartwell, 82574.
> >
>
> Thanks for the info.
>
> Mike, here is updated patch. Now UDP bulk TX transfer performance
> recovered a lot(about 890Mbps) but it still shows bad numbers
> compared to other controllers. For example, bce(4) shows about
> 958Mbps for the same load.
> During the testing I found a strong indication of packet reordering
> issue of drbr interface. If I forcibly change to use single TX
> queue, em(4) got 950Mbps as it used to be.
>
> Jack, as we talked about possible drbr issue with igb(4), UDP
> transfer seems to suffer from packet reordering issue here. Can we
> make em(4)/igb(4) use single TX queue until we solve drbr interface
> issue? Given that only one em(4) controller supports multiqueue,
> dropping multiqueue support for em(4) does not look bad to me.
>
> > Jack
> >
> >
> > On Thu, Apr 8, 2010 at 2:05 PM, Mike Tancsa  wrote:
> >
> > > At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:
> > >
> > >> On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
> > >> > At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
> > >> >
> > >> > >Try this patch. It should fix the issue. It seems Jack forgot to
> > >> > >strip CRC bytes as old em(4) didn't strip it, probably to
> > >> > >workaround silicon bug of old em(4) controllers.
> > >> >
> > >> > Thanks! The attached patch does indeed fix the dhclient issue.
> > >> >
> > >> >
> > >> > >It seems there are also TX issues here. The system load is too high
> > >> > >and sometimes system is not responsive while TX is in progress.
> > >> > >Because I initiated TCP bulk transfers, TSO should reduce CPU load
> > >> > >a lot but it didn't so I guess it could also be related watchdog
> > >> > >timeouts you've seen. I'll see what can be done.
> > >> >
> > >> > Thanks for looking into that as well!!
> > >> >
> > >> > ---Mike
> > >> >
> > >>
> > >> Mike,
> > >>
> > >> Here is patch I'm working on. This patch fixes high system load and
> > >> system is very responsive as before. But it seems there is still
> > >> some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
> > >> and I have no idea what caused this at this moment.
> > >>
> > >> BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
> > >> whether latest fix from Jack cured it. By chance does your
> > >> controller support multi TX/RX queues? You can check whether em(4)
> > >> uses multi-queues with "vmstat -i". If em(4) use multi-queue you
> > >> may have multiple irq output for em0.
> > >>
> > >
> > > Hi,
> > >I will give it a try later tonight!  This one does not seem to.
> > >
> > > 0(ich10)# vmstat -i
> > > interrupt                          total       rate
> > > irq16: uhci0+                         30          0
> > > irq18: ehci0 uhci5                158419         17
> > > irq19: fwohci0++                      86          0
> > > irq21: uhci1                          17          0
> > > irq23: uhci3 ehci1                     2          0
> > > cpu0: timer                     18570305       1994
> > > irq256: igb0                          80          0
> > > irq257: igb0                         255          0
> > > irq258: igb0                          66          0
> > > irq259: igb0                          32          0
> > > irq260: igb0                           2          0
> > > irq261: igb1                        2679          0
> > > irq262: igb1                         998          0
> > > irq263: igb1                        2468          0
> > > irq264: igb1                        6361          0
> > > irq265: igb1                           2          0
> > > irq266: em0                        33910          3
> > > irq267: ahci1                      15317          1
> > > cpu1: timer                     18557074       1993
> > > cpu3: timer                     18557168       1993
> > > cpu2: timer                     18557108       1993
> > > Total                           74462379       7998
> > > 0(ich10)#
> > >
>


Re: em driver regression

2010-04-08 Thread Pyun YongHyeon
On Thu, Apr 08, 2010 at 02:06:09PM -0700, Jack Vogel wrote:
> Only one device support by em does multiqueue right now, and that is
> Hartwell, 82574.
> 

Thanks for the info.

Mike, here is an updated patch. UDP bulk TX performance has now
recovered a lot (about 890Mbps), but it still shows bad numbers
compared to other controllers. For example, bce(4) shows about
958Mbps for the same load.
During testing I found a strong indication of a packet reordering
issue in the drbr interface. If I force the use of a single TX
queue, em(4) gets 950Mbps as it used to.

Jack, as we discussed regarding a possible drbr issue with igb(4), UDP
transfers seem to suffer from packet reordering here. Can we
make em(4)/igb(4) use a single TX queue until we solve the drbr interface
issue? Given that only one em(4) controller supports multiqueue,
dropping multiqueue support for em(4) does not look bad to me.

> Jack
> 
> 
> On Thu, Apr 8, 2010 at 2:05 PM, Mike Tancsa  wrote:
> 
> > At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:
> >
> >> On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
> >> > At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
> >> >
> >> > >Try this patch. It should fix the issue. It seems Jack forgot to
> >> > >strip CRC bytes as old em(4) didn't strip it, probably to
> >> > >workaround silicon bug of old em(4) controllers.
> >> >
> >> > Thanks! The attached patch does indeed fix the dhclient issue.
> >> >
> >> >
> >> > >It seems there are also TX issues here. The system load is too high
> >> > >and sometimes system is not responsive while TX is in progress.
> >> > >Because I initiated TCP bulk transfers, TSO should reduce CPU load
> >> > >a lot but it didn't so I guess it could also be related watchdog
> >> > >timeouts you've seen. I'll see what can be done.
> >> >
> >> > Thanks for looking into that as well!!
> >> >
> >> > ---Mike
> >> >
> >>
> >> Mike,
> >>
> >> Here is patch I'm working on. This patch fixes high system load and
> >> system is very responsive as before. But it seems there is still
> >> some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
> >> and I have no idea what caused this at this moment.
> >>
> >> BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
> >> whether latest fix from Jack cured it. By chance does your
> >> controller support multi TX/RX queues? You can check whether em(4)
> >> uses multi-queues with "vmstat -i". If em(4) use multi-queue you
> >> may have multiple irq output for em0.
> >>
> >
> > Hi,
> >I will give it a try later tonight!  This one does not seem to.
> >
> > 0(ich10)# vmstat -i
> > interrupt                          total       rate
> > irq16: uhci0+                         30          0
> > irq18: ehci0 uhci5                158419         17
> > irq19: fwohci0++                      86          0
> > irq21: uhci1                          17          0
> > irq23: uhci3 ehci1                     2          0
> > cpu0: timer                     18570305       1994
> > irq256: igb0                          80          0
> > irq257: igb0                         255          0
> > irq258: igb0                          66          0
> > irq259: igb0                          32          0
> > irq260: igb0                           2          0
> > irq261: igb1                        2679          0
> > irq262: igb1                         998          0
> > irq263: igb1                        2468          0
> > irq264: igb1                        6361          0
> > irq265: igb1                           2          0
> > irq266: em0                        33910          3
> > irq267: ahci1                      15317          1
> > cpu1: timer                     18557074       1993
> > cpu3: timer                     18557168       1993
> > cpu2: timer                     18557108       1993
> > Total                           74462379       7998
> > 0(ich10)#
> >
Index: sys/dev/e1000/if_em.c
===
--- sys/dev/e1000/if_em.c   (revision 206403)
+++ sys/dev/e1000/if_em.c   (working copy)
@@ -812,6 +812,10 @@
 		return (err);
 	}
 
+	/* Call cleanup if number of TX descriptors low */
+	if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+		em_txeof(txr);
+
 	enq = 0;
 	if (m == NULL) {
 		next = drbr_dequeue(ifp, txr->br);
@@ -834,11 +838,16 @@
 		ETHER_BPF_MTAP(ifp, next);
 		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
 			break;
+		if (txr->tx_avail < EM_MAX_SCATTER) {
+			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
+			break;
+		}
 		next = drbr_dequeue(ifp, txr->br);
 	}
 
 	if (enq > 0) {
 		/* Set the watchdog */
+		txr->watchdog_time = ticks;
 		txr->watchdog_check = TRUE;
 	}
 	return (err);
@@ -864,8 +873,7 

Re: em driver regression

2010-04-08 Thread Jack Vogel
Only one device supported by em does multiqueue right now, and that is
Hartwell, 82574.

Jack


On Thu, Apr 8, 2010 at 2:05 PM, Mike Tancsa  wrote:

> At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:
>
>> On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
>> > At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
>> >
>> > >Try this patch. It should fix the issue. It seems Jack forgot to
>> > >strip CRC bytes as old em(4) didn't strip it, probably to
>> > >workaround silicon bug of old em(4) controllers.
>> >
>> > Thanks! The attached patch does indeed fix the dhclient issue.
>> >
>> >
>> > >It seems there are also TX issues here. The system load is too high
>> > >and sometimes system is not responsive while TX is in progress.
>> > >Because I initiated TCP bulk transfers, TSO should reduce CPU load
>> > >a lot but it didn't so I guess it could also be related watchdog
>> > >timeouts you've seen. I'll see what can be done.
>> >
>> > Thanks for looking into that as well!!
>> >
>> > ---Mike
>> >
>>
>> Mike,
>>
>> Here is patch I'm working on. This patch fixes high system load and
>> system is very responsive as before. But it seems there is still
>> some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
>> and I have no idea what caused this at this moment.
>>
>> BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
>> whether latest fix from Jack cured it. By chance does your
>> controller support multi TX/RX queues? You can check whether em(4)
>> uses multi-queues with "vmstat -i". If em(4) use multi-queue you
>> may have multiple irq output for em0.
>>
>
> Hi,
>I will give it a try later tonight!  This one does not seem to.
>
> 0(ich10)# vmstat -i
> interrupt                          total       rate
> irq16: uhci0+                         30          0
> irq18: ehci0 uhci5                158419         17
> irq19: fwohci0++                      86          0
> irq21: uhci1                          17          0
> irq23: uhci3 ehci1                     2          0
> cpu0: timer                     18570305       1994
> irq256: igb0                          80          0
> irq257: igb0                         255          0
> irq258: igb0                          66          0
> irq259: igb0                          32          0
> irq260: igb0                           2          0
> irq261: igb1                        2679          0
> irq262: igb1                         998          0
> irq263: igb1                        2468          0
> irq264: igb1                        6361          0
> irq265: igb1                           2          0
> irq266: em0                        33910          3
> irq267: ahci1                      15317          1
> cpu1: timer                     18557074       1993
> cpu3: timer                     18557168       1993
> cpu2: timer                     18557108       1993
> Total                           74462379       7998
> 0(ich10)#
>
>
>
>
>
>
>
> 
> Mike Tancsa,  tel +1 519 651 3400
> Sentex Communications,m...@sentex.net
> Providing Internet since 1994www.sentex.net
> Cambridge, Ontario Canada www.sentex.net/mike
>
>


Re: em driver regression

2010-04-08 Thread Mike Tancsa

At 04:56 PM 4/8/2010, Pyun YongHyeon wrote:

>On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
>> At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
>>
>> >Try this patch. It should fix the issue. It seems Jack forgot to
>> >strip CRC bytes as old em(4) didn't strip it, probably to
>> >workaround silicon bug of old em(4) controllers.
>>
>> Thanks! The attached patch does indeed fix the dhclient issue.
>>
>>
>> >It seems there are also TX issues here. The system load is too high
>> >and sometimes system is not responsive while TX is in progress.
>> >Because I initiated TCP bulk transfers, TSO should reduce CPU load
>> >a lot but it didn't so I guess it could also be related watchdog
>> >timeouts you've seen. I'll see what can be done.
>>
>> Thanks for looking into that as well!!
>>
>> ---Mike
>>
>
>Mike,
>
>Here is patch I'm working on. This patch fixes high system load and
>system is very responsive as before. But it seems there is still
>some TX issue here. Bulk UDP performance is very poor(< 700Mbps)
>and I have no idea what caused this at this moment.
>
>BTW, I have trouble to reproduce watchdog timeouts. I'm not sure
>whether latest fix from Jack cured it. By chance does your
>controller support multi TX/RX queues? You can check whether em(4)
>uses multi-queues with "vmstat -i". If em(4) use multi-queue you
>may have multiple irq output for em0.


Hi,
I will give it a try later tonight!  This one does not seem to.

0(ich10)# vmstat -i
interrupt                          total       rate
irq16: uhci0+                         30          0
irq18: ehci0 uhci5                158419         17
irq19: fwohci0++                      86          0
irq21: uhci1                          17          0
irq23: uhci3 ehci1                     2          0
cpu0: timer                     18570305       1994
irq256: igb0                          80          0
irq257: igb0                         255          0
irq258: igb0                          66          0
irq259: igb0                          32          0
irq260: igb0                           2          0
irq261: igb1                        2679          0
irq262: igb1                         998          0
irq263: igb1                        2468          0
irq264: igb1                        6361          0
irq265: igb1                           2          0
irq266: em0                        33910          3
irq267: ahci1                      15317          1
cpu1: timer                     18557074       1993
cpu3: timer                     18557168       1993
cpu2: timer                     18557108       1993
Total                           74462379       7998
0(ich10)#







Mike Tancsa,                          tel +1 519 651 3400
Sentex Communications,                m...@sentex.net
Providing Internet since 1994         www.sentex.net
Cambridge, Ontario Canada             www.sentex.net/mike



Re: em driver regression

2010-04-08 Thread Pyun YongHyeon
On Thu, Apr 08, 2010 at 02:31:18PM -0400, Mike Tancsa wrote:
> At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
> 
> >Try this patch. It should fix the issue. It seems Jack forgot to
> >strip CRC bytes as old em(4) didn't strip it, probably to
> >workaround silicon bug of old em(4) controllers.
> 
> Thanks! The attached patch does indeed fix the dhclient issue.
> 
> 
> >It seems there are also TX issues here. The system load is too high
> >and sometimes system is not responsive while TX is in progress.
> >Because I initiated TCP bulk transfers, TSO should reduce CPU load
> >a lot but it didn't so I guess it could also be related watchdog
> >timeouts you've seen. I'll see what can be done.
> 
> Thanks for looking into that as well!!
> 
> ---Mike
> 

Mike,

Here is the patch I'm working on. It fixes the high system load, and the
system is as responsive as before. But there still seems to be
some TX issue: bulk UDP performance is very poor (< 700Mbps)
and I have no idea what causes this at the moment.

BTW, I have trouble reproducing the watchdog timeouts. I'm not sure
whether the latest fix from Jack cured them. By chance, does your
controller support multiple TX/RX queues? You can check whether em(4)
uses multiple queues with "vmstat -i". If em(4) uses multi-queue you
may see multiple irq lines for em0.

Index: if_em.c
===
--- if_em.c	(revision 206403)
+++ if_em.c	(working copy)
@@ -812,6 +812,10 @@
 		return (err);
 	}
 
+/* Call cleanup if number of TX descriptors low */
+	if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+		em_txeof(txr);
+
 	enq = 0;
 	if (m == NULL) {
 		next = drbr_dequeue(ifp, txr->br);
@@ -909,6 +913,10 @@
 	if (!adapter->link_active)
 		return;
 
+/* Call cleanup if number of TX descriptors low */
+	if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
+		em_txeof(txr);
+
 	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
 
 		IFQ_DRV_DEQUEUE(&ifp->if_snd, m_head);
@@ -1427,17 +1435,12 @@
 	struct ifnet	*ifp = adapter->ifp;
 	struct tx_ring	*txr = adapter->tx_rings;
 	struct rx_ring	*rxr = adapter->rx_rings;
-	u32		loop = EM_MAX_LOOP;
-	bool		more_rx, more_tx;
+	bool		more_rx;
 
-
 	if (ifp->if_drv_flags & IFF_DRV_RUNNING) {
+		more_rx = em_rxeof(rxr, adapter->rx_process_limit);
 		EM_TX_LOCK(txr);
-		do {
-			more_rx = em_rxeof(rxr, adapter->rx_process_limit);
-			more_tx = em_txeof(txr);
-		} while (loop-- && (more_rx || more_tx));
-
+		em_txeof(txr);
 #if __FreeBSD_version >= 800000
 		if (!drbr_empty(ifp, txr->br))
 			em_mq_start_locked(ifp, txr, NULL);
@@ -1445,10 +1448,9 @@
 		if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
 			em_start_locked(ifp, txr);
 #endif
-		if (more_rx || more_tx)
-			taskqueue_enqueue(adapter->tq, &adapter->que_task);
-
 		EM_TX_UNLOCK(txr);
+		if (more_rx)
+			taskqueue_enqueue(adapter->tq, &adapter->que_task);
 	}
 
 	em_enable_intr(adapter);
@@ -1466,18 +1468,13 @@
 {
 	struct tx_ring *txr = arg;
 	struct adapter *adapter = txr->adapter;
-	bool		more;
 
 	++txr->tx_irq;
 	EM_TX_LOCK(txr);
-	more = em_txeof(txr);
+	em_txeof(txr);
 	EM_TX_UNLOCK(txr);
-	if (more)
-		taskqueue_enqueue(txr->tq, &txr->tx_task);
-	else
-		/* Reenable this interrupt */
-		E1000_WRITE_REG(&adapter->hw, E1000_IMS, txr->ims);
-	return;
+	/* Reenable this interrupt */
+	E1000_WRITE_REG(&adapter->hw, E1000_IMS, txr->ims);
 }
 
 /*
@@ -1531,14 +1528,15 @@
 {
 	struct rx_ring	*rxr = context;
 	struct adapter	*adapter = rxr->adapter;
-	u32		loop = EM_MAX_LOOP;
 	bool		more;
 
-	do {
-		more = em_rxeof(rxr, adapter->rx_process_limit);
-	} while (loop-- && more);
-	/* Reenable this interrupt */
-	E1000_WRITE_REG(&adapter->hw, E1000_IMS, rxr->ims);
+	more = em_rxeof(rxr, adapter->rx_process_limit);
+	if (more)
+		taskqueue_enqueue(rxr->tq, &rxr->rx_task);
+	else {
+		/* Reenable this interrupt */
+		E1000_WRITE_REG(&adapter->hw, E1000_IMS, rxr->ims);
+	}
 }
 
 static void
@@ -1547,15 +1545,10 @@
 	struct tx_ring	*txr = context;
 	struct adapter	*adapter = txr->adapter;
 	struct ifnet	*ifp = adapter->ifp;
-	u32		loop = EM_MAX_LOOP;
-	bool		more;
 
 	if (!EM_TX_TRYLOCK(txr))
 		return;
-	do {
-		more = em_txeof(txr);
-	} while (loop-- && more);
-
+	em_txeof(txr);
 #if __FreeBSD_version >= 800000
 	if (!drbr_empty(ifp, txr->br))
 		em_mq_start_locked(ifp, txr, NULL);
@@ -1914,10 +1907,6 @@
 	E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), i);
 	txr->watchdog_time = ticks;
 
-	/* Call cleanup if number of TX descriptors low */
-	if (txr->tx_avail <= EM_TX_CLEANUP_THRESHOLD)
-		em_txeof(txr);
-
 	return (0);
 }
 
@@ -4078,7 +4067,7 @@
 em_rxeof(struct rx_ring *rxr, int count)
 {
 	struct adapter		*adapter = rxr->adapter;
-	struct ifnet		*ifp = adapter->ifp;;
+	struct ifnet		*ifp = adapter->ifp;
 	struct mbuf		*mp, *sendmp;
 	u8			status;
 	u16 			len;
@@ -4088,6 +4077,7 @@
 
 	EM_RX_LOCK(rxr);
 
+	statu

Re: em driver regression

2010-04-08 Thread Brandon Gooch
On Thu, Apr 8, 2010 at 2:17 PM, Jack Vogel  wrote:
> Try the code I just checked in, it puts in the CRC stripping, but also
> tweaks the
> TX code, this may resolve the watchdogs. Let me know.
>
> Cheers,
>
> Jack
>

Yes, this is indeed the fix for both the dhclient and VirtualBox issue
(at least with my setup). There appear to be no ill effects either.

Thank you Jack (and Pyun) for tracking down the problems! I'll keep my
eyes open for anything else.

-Brandon


Re: em driver regression

2010-04-08 Thread Jack Vogel
Try the code I just checked in. It puts in the CRC stripping and also
tweaks the TX code, which may resolve the watchdogs. Let me know.

Cheers,

Jack


On Thu, Apr 8, 2010 at 11:39 AM, Pyun YongHyeon  wrote:

> On Thu, Apr 08, 2010 at 11:27:10AM -0700, Jack Vogel wrote:
> > You know, I'm wondering if the so-called ALTQ fix, which makes the TX
> > start always queue is causing the problem on that side?
> >
>
> I'm not sure because I didn't configure ALTQ so it might be NOP for
> non-ALTQ case.
>
> > Jack
> >
> >
> > On Thu, Apr 8, 2010 at 11:22 AM, Jack Vogel  wrote:
> >
> > >
> > >
> > > On Thu, Apr 8, 2010 at 11:17 AM, Pyun YongHyeon 
> wrote:
> > >
> > >> On Thu, Apr 08, 2010 at 10:46:22AM -0400, Mike Tancsa wrote:
> > >> >
> > >> > OK, some more data... It seems dhclient is getting upset as well
> > >> > since the updated driver
> > >> >
> > >> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
> > >> > 255.255.255.255 port 67 interval 6
> > >> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with
> > >> > bytes received 332.
> > >> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data
> > >> > after udp payload.
> > >> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> > >> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
> > >> > 255.255.255.255 port 67
> > >> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with
> > >> > bytes received 332.
> > >>   ^^
> > >>
> > >> Try this patch. It should fix the issue. It seems Jack forgot to
> > >> strip CRC bytes as old em(4) didn't strip it, probably to
> > >> workaround silicon bug of old em(4) controllers.
> > >>
> > >>
> > > Actually it did strip it, but its buried in the code in an obscure way,
> > > that's
> > > what I just realized by looking at the old code. having the hardware
> strip
> > > will be easier I think.
> > >
> > >
> > >> It seems there are also TX issues here. The system load is too high
> > >> and sometimes system is not responsive while TX is in progress.
> > >> Because I initiated TCP bulk transfers, TSO should reduce CPU load
> > >> a lot but it didn't so I guess it could also be related watchdog
> > >> timeouts you've seen. I'll see what can be done.
> > >>
> > >
> > > Will look at that as well.
> > >
> > > Thanks!
> > >
> > > Jack
> > >
> > >
>


Re: em driver regression

2010-04-08 Thread Pyun YongHyeon
On Thu, Apr 08, 2010 at 11:27:10AM -0700, Jack Vogel wrote:
> You know, I'm wondering if the so-called ALTQ fix, which makes the TX
> start always queue is causing the problem on that side?
> 

I'm not sure because I didn't configure ALTQ so it might be NOP for
non-ALTQ case.

> Jack
> 
> 
> On Thu, Apr 8, 2010 at 11:22 AM, Jack Vogel  wrote:
> 
> >
> >
> > On Thu, Apr 8, 2010 at 11:17 AM, Pyun YongHyeon  wrote:
> >
> >> On Thu, Apr 08, 2010 at 10:46:22AM -0400, Mike Tancsa wrote:
> >> >
> >> > OK, some more data... It seems dhclient is getting upset as well
> >> > since the updated driver
> >> >
> >> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
> >> > 255.255.255.255 port 67 interval 6
> >> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with
> >> > bytes received 332.
> >> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data
> >> > after udp payload.
> >> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> >> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
> >> > 255.255.255.255 port 67
> >> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with
> >> > bytes received 332.
> >>   ^^
> >>
> >> Try this patch. It should fix the issue. It seems Jack forgot to
> >> strip CRC bytes as old em(4) didn't strip it, probably to
> >> workaround silicon bug of old em(4) controllers.
> >>
> >>
> > Actually it did strip it, but its buried in the code in an obscure way,
> > that's
> > what I just realized by looking at the old code. having the hardware strip
> > will be easier I think.
> >
> >
> >> It seems there are also TX issues here. The system load is too high
> >> and sometimes system is not responsive while TX is in progress.
> >> Because I initiated TCP bulk transfers, TSO should reduce CPU load
> >> a lot but it didn't so I guess it could also be related watchdog
> >> timeouts you've seen. I'll see what can be done.
> >>
> >
> > Will look at that as well.
> >
> > Thanks!
> >
> > Jack
> >
> >


Re: em driver regression

2010-04-08 Thread Jack Vogel
Bigger question is will it fix Brandon's VirtualBox issue??

Jack


On Thu, Apr 8, 2010 at 11:31 AM, Mike Tancsa  wrote:

> At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:
>
>  Try this patch. It should fix the issue. It seems Jack forgot to
>> strip CRC bytes as old em(4) didn't strip it, probably to
>> workaround silicon bug of old em(4) controllers.
>>
>
> Thanks! The attached patch does indeed fix the dhclient issue.
>
>
>
>  It seems there are also TX issues here. The system load is too high
>> and sometimes system is not responsive while TX is in progress.
>> Because I initiated TCP bulk transfers, TSO should reduce CPU load
>> a lot but it didn't so I guess it could also be related watchdog
>> timeouts you've seen. I'll see what can be done.
>>
>
> Thanks for looking into that as well!!
>
>
>---Mike
>
>
>
>
> 
> Mike Tancsa,  tel +1 519 651 3400
> Sentex Communications,m...@sentex.net
> Providing Internet since 1994www.sentex.net
> Cambridge, Ontario Canada www.sentex.net/mike
>
>


Re: em driver regression

2010-04-08 Thread Mike Tancsa

At 02:17 PM 4/8/2010, Pyun YongHyeon wrote:

> Try this patch. It should fix the issue. It seems Jack forgot to
> strip CRC bytes as old em(4) didn't strip it, probably to
> workaround silicon bug of old em(4) controllers.

Thanks! The attached patch does indeed fix the dhclient issue.

> It seems there are also TX issues here. The system load is too high
> and sometimes system is not responsive while TX is in progress.
> Because I initiated TCP bulk transfers, TSO should reduce CPU load
> a lot but it didn't so I guess it could also be related watchdog
> timeouts you've seen. I'll see what can be done.

Thanks for looking into that as well!!

---Mike





Mike Tancsa,                      tel +1 519 651 3400
Sentex Communications,            m...@sentex.net
Providing Internet since 1994     www.sentex.net
Cambridge, Ontario Canada         www.sentex.net/mike



Re: em driver regression

2010-04-08 Thread Jack Vogel
You know, I'm wondering if the so-called ALTQ fix, which makes the TX
start always queue is causing the problem on that side?

Jack


On Thu, Apr 8, 2010 at 11:22 AM, Jack Vogel  wrote:

>
>
> On Thu, Apr 8, 2010 at 11:17 AM, Pyun YongHyeon  wrote:
>
>> On Thu, Apr 08, 2010 at 10:46:22AM -0400, Mike Tancsa wrote:
>> >
>> > OK, some more data... It seems dhclient is getting upset as well
>> > since the updated driver
>> >
>> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
>> > 255.255.255.255 port 67 interval 6
>> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with
>> > bytes received 332.
>> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data
>> > after udp payload.
>> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
>> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
>> > 255.255.255.255 port 67
>> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with
>> > bytes received 332.
>>   ^^
>>
>> Try this patch. It should fix the issue. It seems Jack forgot to
>> strip CRC bytes as old em(4) didn't strip it, probably to
>> workaround silicon bug of old em(4) controllers.
>>
>>
> Actually it did strip it, but its buried in the code in an obscure way,
> that's
> what I just realized by looking at the old code. having the hardware strip
> will be easier I think.
>
>
>> It seems there are also TX issues here. The system load is too high
>> and sometimes system is not responsive while TX is in progress.
>> Because I initiated TCP bulk transfers, TSO should reduce CPU load
>> a lot but it didn't so I guess it could also be related watchdog
>> timeouts you've seen. I'll see what can be done.
>>
>
> Will look at that as well.
>
> Thanks!
>
> Jack
>
>


Re: em driver regression

2010-04-08 Thread Jack Vogel
On Thu, Apr 8, 2010 at 11:17 AM, Pyun YongHyeon  wrote:

> On Thu, Apr 08, 2010 at 10:46:22AM -0400, Mike Tancsa wrote:
> >
> > OK, some more data... It seems dhclient is getting upset as well
> > since the updated driver
> >
> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
> > 255.255.255.255 port 67 interval 6
> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with
> > bytes received 332.
> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data
> > after udp payload.
> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
> > 255.255.255.255 port 67
> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with
> > bytes received 332.
>   ^^
>
> Try this patch. It should fix the issue. It seems Jack forgot to
> strip CRC bytes as old em(4) didn't strip it, probably to
> workaround silicon bug of old em(4) controllers.
>
>
Actually it did strip it, but it's buried in the code in an obscure way;
that's what I just realized by looking at the old code. Having the
hardware strip it will be easier, I think.
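
To illustrate why stripping the CRC in software can end up looking obscure, here is a small sketch, assuming a received frame whose data may be spread over several buffers so that the 4-byte frame check sequence can straddle the last two. It is not the old em(4) code; toy_trim_fcs and TOY_FCS_LEN are hypothetical names used only for this example.

```c
#include <stddef.h>

#define TOY_FCS_LEN 4	/* Ethernet frame check sequence: a 32-bit CRC */

/*
 * Trim the FCS from the tail of a frame whose data is spread over nseg
 * buffers. Returns the number of FCS bytes that could not be trimmed
 * (0 on success).
 */
static size_t
toy_trim_fcs(size_t *seg_len, size_t nseg)
{
	size_t remaining = TOY_FCS_LEN;

	/* Walk backwards from the last buffer, taking what each one has. */
	for (size_t i = nseg; i > 0 && remaining > 0; i--) {
		size_t take = seg_len[i - 1] < remaining ?
		    seg_len[i - 1] : remaining;

		seg_len[i - 1] -= take;
		remaining -= take;
	}
	return (remaining);
}
```

With buffers of 330 and 2 bytes, the last buffer is emptied and the previous one shrinks to 328 bytes. Letting the hardware drop the CRC avoids this bookkeeping entirely.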


> It seems there are also TX issues here. The system load is too high
> and sometimes system is not responsive while TX is in progress.
> Because I initiated TCP bulk transfers, TSO should reduce CPU load
> a lot but it didn't so I guess it could also be related watchdog
> timeouts you've seen. I'll see what can be done.
>

Will look at that as well.

Thanks!

Jack


Re: em driver regression

2010-04-08 Thread Jack Vogel
LOL, what timing :)


On Thu, Apr 8, 2010 at 11:17 AM, Pyun YongHyeon  wrote:

> On Thu, Apr 08, 2010 at 10:46:22AM -0400, Mike Tancsa wrote:
> >
> > OK, some more data... It seems dhclient is getting upset as well
> > since the updated driver
> >
> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
> > 255.255.255.255 port 67 interval 6
> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with
> > bytes received 332.
> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data
> > after udp payload.
> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
> > 255.255.255.255 port 67
> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with
> > bytes received 332.
>   ^^
>
> Try this patch. It should fix the issue. It seems Jack forgot to
> strip CRC bytes as old em(4) didn't strip it, probably to
> workaround silicon bug of old em(4) controllers.
>
> It seems there are also TX issues here. The system load is too high
> and sometimes system is not responsive while TX is in progress.
> Because I initiated TCP bulk transfers, TSO should reduce CPU load
> a lot but it didn't so I guess it could also be related watchdog
> timeouts you've seen. I'll see what can be done.
>


Re: em driver regression

2010-04-08 Thread Jack Vogel
Both of you try something for me:

Assuming you are using the latest code in HEAD, please insert this at
line 4042:

/* Strip the CRC */
rctl |= E1000_RCTL_SECRC;

Then try things again. I think this will solve at least the DHCP thing, I
hope.
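
A stand-alone sketch of what that one-line change does to the receive control word, for anyone who wants to try the bit manipulation outside the kernel. The TOY_RCTL_* names and their values are assumptions made for this example; the authoritative definitions are the E1000_RCTL_* constants in the driver's e1000 headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; check the e1000 headers for the real ones. */
#define TOY_RCTL_EN	0x00000002u	/* receiver enable */
#define TOY_RCTL_VFE	0x00040000u	/* VLAN filter enable */
#define TOY_RCTL_SECRC	0x04000000u	/* strip Ethernet CRC */

/* Build the control word: enable RX, strip the CRC, VLAN filter off. */
static uint32_t
toy_rctl_setup(uint32_t rctl)
{
	rctl |= TOY_RCTL_EN;
	rctl |= TOY_RCTL_SECRC;
	rctl &= ~TOY_RCTL_VFE;
	return (rctl);
}

/* True when the control word asks the hardware to strip the CRC. */
static bool
toy_rctl_strips_crc(uint32_t rctl)
{
	return ((rctl & TOY_RCTL_SECRC) != 0);
}
```

With the strip bit set, the length the hardware reports for a frame no longer includes the trailing 4 CRC bytes, so the driver has nothing to trim.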

Jack


On Thu, Apr 8, 2010 at 10:46 AM, Mike Tancsa  wrote:

> At 12:52 PM 4/8/2010, Jack Vogel wrote:
>
>> Mike, I noticed this connection is only 100Mb, that isn't accidental? And,
>> is it possible for
>> you to check a connection at 1Gb and see if the watchdogs don't happen.
>>
>> My test engineer is running this code, and we are having trouble repro'ing
>> the issue, so any
>> clues might help. Is the kernel 64 or 32 bit?
>>
>
> It is a 32 bit kernel (see the attached dmesg from the first email) in a
> cisco 10/100 switch. I just tried and the dhclient issue happens at gig
> speeds as well.
>
> Apr  8 13:34:29 ich10 dhclient[1480]: DHCPREQUEST on em0 to 255.255.255.255
> port 67
> Apr  8 13:34:35 ich10 dhclient[1480]: DHCPREQUEST on em0 to 255.255.255.255
> port 67
> Apr  8 13:34:48 ich10 dhclient[1480]: DHCPDISCOVER on em0 to
> 255.255.255.255 port 67 interval 5
> Apr  8 13:34:48 ich10 dhclient[1480]: ip length 328 disagrees with bytes
> received 332.
> Apr  8 13:34:48 ich10 dhclient[1480]: accepting packet with data after udp
> payload.
>
> 0(ich10)# ifconfig em0
>
> em0: flags=8843 metric 0 mtu 1500
>
>  
> options=399b
>ether 00:1c:c0:95:0d:0d
>inet 192.168.xx.219 netmask 0xff00 broadcast 192.168.xx.255
>media: Ethernet autoselect (1000baseT )
>status: active
> 0(ich10)#
>
>
> ... As for the watchdog issue, it just seems to show up. I am not able to
> reproduce it on demand. However, the dhclient issue happens all the time. I
> will give it a whirl on a gigabit for a day and see.
>
> Its not that frequent
>
>
> Apr  7 02:19:05 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 03:46:51 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 08:04:03 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 10:39:40 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 11:12:34 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 13:25:26 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 14:01:36 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 17:19:53 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 21:16:45 ich10 kernel: em0: Watchdog timeout -- resetting
> Apr  7 22:09:10 ich10 kernel: em0: Watchdog timeout -- resetting
>
> But it should in theory show up at least once in 24hrs if its not a port
> speed issue.
>
> A potential 3rd issue I also noticed is that this morning I could not login
> to the box-- but I could ping it, but no SSH banner. ie no 3way handshake
> completing.  I was able to 'fix' the issue by logging onto the console,
> initiating some outbound tcp traffic (ie. ssh out from the box) and then I
> could login again. Perhaps a TSO issue ? I now have a firewire console
> hooked up so I can login out of band. If this issue comes up again, how can
> I best narrow down what/where this 3rd issue is ?
>
>---Mike
>
>
>
> 
> Mike Tancsa,  tel +1 519 651 3400
> Sentex Communications,m...@sentex.net
> Providing Internet since 1994www.sentex.net
> Cambridge, Ontario Canada www.sentex.net/mike
>
>


Re: em driver regression

2010-04-08 Thread Pyun YongHyeon
On Thu, Apr 08, 2010 at 10:46:22AM -0400, Mike Tancsa wrote:
> 
> OK, some more data... It seems dhclient is getting upset as well 
> since the updated driver
> 
> Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to 
> 255.255.255.255 port 67 interval 6
> Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with 
> bytes received 332.
> Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data 
> after udp payload.
> Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to 
> 255.255.255.255 port 67
> Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with 
> bytes received 332.
  ^^

Try this patch. It should fix the issue. It seems Jack forgot to
strip CRC bytes as old em(4) didn't strip it, probably to
workaround silicon bug of old em(4) controllers.

It seems there are also TX issues here. The system load is too high
and sometimes system is not responsive while TX is in progress.
Because I initiated TCP bulk transfers, TSO should reduce CPU load
a lot but it didn't so I guess it could also be related watchdog
timeouts you've seen. I'll see what can be done.
Index: sys/dev/e1000/if_em.c
===================================================================
--- sys/dev/e1000/if_em.c	(revision 206399)
+++ sys/dev/e1000/if_em.c	(working copy)
@@ -3706,6 +3706,8 @@
 		rxr->next_to_refresh = i;
 	}
 update:
+	bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
+	BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
 	if (cleaned != -1) /* Update tail index */
 		E1000_WRITE_REG(&adapter->hw,
 		E1000_RDT(rxr->me), cleaned);
@@ -4039,7 +4041,8 @@
 	rctl |= E1000_RCTL_EN | E1000_RCTL_BAM |
 	E1000_RCTL_LBM_NO | E1000_RCTL_RDMTS_HALF |
 	(hw->mac.mc_filter_type << E1000_RCTL_MO_SHIFT);
-
+	/* Strip CRC bytes. */
+	rctl |= E1000_RCTL_SECRC;
 	/* Make sure VLAN Filters are off */
 	rctl &= ~E1000_RCTL_VFE;
 	rctl &= ~E1000_RCTL_SBP;

Re: em driver regression

2010-04-08 Thread Mike Tancsa

At 12:52 PM 4/8/2010, Jack Vogel wrote:
> Mike, I noticed this connection is only 100Mb, that isn't accidental?
> And, is it possible for you to check a connection at 1Gb and see if
> the watchdogs don't happen.
>
> My test engineer is running this code, and we are having trouble
> repro'ing the issue, so any clues might help. Is the kernel 64 or
> 32 bit?

It is a 32 bit kernel (see the attached dmesg from the first email),
connected to a Cisco 10/100 switch. I just tried and the dhclient issue
happens at gig speeds as well.


Apr  8 13:34:29 ich10 dhclient[1480]: DHCPREQUEST on em0 to 255.255.255.255 port 67
Apr  8 13:34:35 ich10 dhclient[1480]: DHCPREQUEST on em0 to 255.255.255.255 port 67
Apr  8 13:34:48 ich10 dhclient[1480]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 5
Apr  8 13:34:48 ich10 dhclient[1480]: ip length 328 disagrees with bytes received 332.
Apr  8 13:34:48 ich10 dhclient[1480]: accepting packet with data after udp payload.


0(ich10)# ifconfig em0
em0: flags=8843 metric 0 mtu 1500

options=399b
ether 00:1c:c0:95:0d:0d
inet 192.168.xx.219 netmask 0xff00 broadcast 192.168.xx.255
media: Ethernet autoselect (1000baseT )
status: active
0(ich10)#


... As for the watchdog issue, it just seems to show up. I am not 
able to reproduce it on demand. However, the dhclient issue happens 
all the time. I will give it a whirl on a gigabit for a day and see.


It's not that frequent:

Apr  7 02:19:05 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 03:46:51 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 08:04:03 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 10:39:40 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 11:12:34 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 13:25:26 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 14:01:36 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 17:19:53 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 21:16:45 ich10 kernel: em0: Watchdog timeout -- resetting
Apr  7 22:09:10 ich10 kernel: em0: Watchdog timeout -- resetting

But it should in theory show up at least once in 24hrs if it's not a
port speed issue.
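
As background on what a "Watchdog timeout" means here, a minimal sketch of a tick-based TX watchdog. The field names watchdog_check and watchdog_time match the ones seen elsewhere in this thread, but the structure, the toy_* functions, and the timeout value are assumptions for illustration, not the driver's actual logic.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_WATCHDOG_TICKS 1000u	/* hypothetical timeout, in ticks */

struct toy_txq {
	bool		watchdog_check;	/* armed while packets are in flight */
	uint32_t	watchdog_time;	/* tick count of the last submission */
};

/* Called when a packet is handed to the hardware. */
static void
toy_tx_submitted(struct toy_txq *q, uint32_t now)
{
	q->watchdog_check = true;
	q->watchdog_time = now;
}

/* Called when cleanup finds every descriptor completed. */
static void
toy_tx_all_done(struct toy_txq *q)
{
	q->watchdog_check = false;
}

/*
 * Periodic check. Unsigned subtraction keeps the comparison correct
 * even when the tick counter wraps around.
 */
static bool
toy_watchdog_expired(const struct toy_txq *q, uint32_t now)
{
	return (q->watchdog_check &&
	    (uint32_t)(now - q->watchdog_time) > TOY_WATCHDOG_TICKS);
}
```

In a scheme like this, a timeout fires either when the hardware really stalls or when completed descriptors are not cleaned up in time, so a missed cleanup can produce resets on an otherwise healthy link.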


A potential 3rd issue I also noticed is that this morning I could not
log in to the box. I could ping it, but got no SSH banner, i.e. no 3-way
handshake completing. I was able to 'fix' the issue by logging onto
the console and initiating some outbound TCP traffic (i.e. ssh out from
the box), and then I could log in again. Perhaps a TSO issue? I now
have a firewire console hooked up so I can log in out of band. If this
issue comes up again, how can I best narrow down what/where this 3rd
issue is?


---Mike



Mike Tancsa,                      tel +1 519 651 3400
Sentex Communications,            m...@sentex.net
Providing Internet since 1994     www.sentex.net
Cambridge, Ontario Canada         www.sentex.net/mike



Re: em driver regression

2010-04-08 Thread Jack Vogel
On Thu, Apr 8, 2010 at 10:18 AM, Brandon Gooch
wrote:

> On Thu, Apr 8, 2010 at 12:06 PM, Jack Vogel  wrote:
> >
> >
> > On Thu, Apr 8, 2010 at 10:01 AM, Brandon Gooch <
> jamesbrandongo...@gmail.com>
> > wrote:
> >>
> >> On Thu, Apr 8, 2010 at 11:52 AM, Jack Vogel  wrote:
> >> > Mike, I noticed this connection is only 100Mb, that isn't accidental?
> >> > And,
> >> > is it possible for
> >> > you to check a connection at 1Gb and see if the watchdogs don't
> happen.
> >> >
> >> > My test engineer is running this code, and we are having trouble
> >> > repro'ing
> >> > the issue, so any
> >> > clues might help. Is the kernel 64 or 32 bit?
> >> >
> >> > Jack
> >> >
> >>
> >> Not to butt in or anything...
> >
> > Not butting in :)  OK, so this all looks fine or am I missing something?
> >
> > Jack
> >
>
> This is the dmesg from the system exhibiting the "ip length 328
> disagrees with bytes received 332" while attempting to obtain a lease
> on the two DHCP-enabled VLANs, and also manifests in the VirtualBox
> bridged networking guests.
>
> I can honestly say that other than the output from dhclient and the
> VirtualBox issue, I might not have noticed problems otherwise.
>
> For instance, I have a VLAN interface configured to connect to an
> "outside" LAN segment and I'm running sshd on that interfaces IP
> address (using the new multiple routing table feature as well). I was
> able to connect to the sshd instance as usual, and I can make
> connections out as in:
>
> # setfib 4 ping google.com
>
> ...things seemed OK. Until VirtualBox. Then I started paying attention
> to messages scrolling by as my machine booted and saw the dhclient "ip
> length" thing (just as Mike Tancsa had) and thought, "It must be the
> new em(4) driver".
>
> That's my story :)
>
> I don't know what chip my em(4) device is, how can I check that? Also,
> would some type of traffic capture help in this case?
>
> -Brandon
>
>
pciconf -l will show us. My tester is having trouble reproducing this,
but I don't think he is using vlans; that must be the missing ingredient.

The disagreement in size is 4 bytes, just the size of the CRC
coincidentally, but I don't have it set to strip, hmmm. I may have
some code for you to try shortly, stay tuned.
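
The arithmetic behind that observation, as a tiny sketch using the two numbers from the dhclient message. The helper names are hypothetical; the only fact relied on is that the Ethernet frame check sequence is a 32-bit (4-byte) CRC.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_ETHER_CRC_LEN 4	/* Ethernet FCS is a 32-bit CRC */

/* Bytes received beyond what the IP header's total length accounts for. */
static size_t
toy_trailing_bytes(size_t ip_total_len, size_t bytes_received)
{
	return (bytes_received > ip_total_len ?
	    bytes_received - ip_total_len : 0);
}

/* True when the surplus is exactly one unstripped CRC. */
static bool
toy_looks_like_unstripped_crc(size_t ip_total_len, size_t bytes_received)
{
	return (toy_trailing_bytes(ip_total_len, bytes_received) ==
	    TOY_ETHER_CRC_LEN);
}
```

For the logged values, 332 - 328 = 4, which is why the CRC is the first suspect.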

Jack


Re: em driver regression

2010-04-08 Thread Brandon Gooch
On Thu, Apr 8, 2010 at 12:06 PM, Jack Vogel  wrote:
>
>
> On Thu, Apr 8, 2010 at 10:01 AM, Brandon Gooch 
> wrote:
>>
>> On Thu, Apr 8, 2010 at 11:52 AM, Jack Vogel  wrote:
>> > Mike, I noticed this connection is only 100Mb, that isn't accidental?
>> > And,
>> > is it possible for
>> > you to check a connection at 1Gb and see if the watchdogs don't happen.
>> >
>> > My test engineer is running this code, and we are having trouble
>> > repro'ing
>> > the issue, so any
>> > clues might help. Is the kernel 64 or 32 bit?
>> >
>> > Jack
>> >
>>
>> Not to butt in or anything...
>
> Not butting in :)  OK, so this all looks fine or am I missing something?
>
> Jack
>

This is the dmesg from the system exhibiting the "ip length 328
disagrees with bytes received 332" while attempting to obtain a lease
on the two DHCP-enabled VLANs, and also manifests in the VirtualBox
bridged networking guests.

I can honestly say that other than the output from dhclient and the
VirtualBox issue, I might not have noticed problems otherwise.

For instance, I have a VLAN interface configured to connect to an
"outside" LAN segment and I'm running sshd on that interface's IP
address (using the new multiple routing table feature as well). I was
able to connect to the sshd instance as usual, and I can make
connections out as in:

# setfib 4 ping google.com

...things seemed OK. Until VirtualBox. Then I started paying attention
to messages scrolling by as my machine booted and saw the dhclient "ip
length" thing (just as Mike Tancsa had) and thought, "It must be the
new em(4) driver".

That's my story :)

I don't know what chip my em(4) device is, how can I check that? Also,
would some type of traffic capture help in this case?

-Brandon

>>
>> 64-bit FreeBSD Stable, 1Gb em(4) connected to Cisco 2960G trunking port.
>>
>> My dmesg:
>>
>> Copyright (c) 1992-2010 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>>        The Regents of the University of California. All rights reserved.
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 8.0-STABLE #2 r206210:206343MS: Wed Apr  7 16:18:14 CDT 2010
>>    r...@bgooch755.se.edu:/usr/obj/usr/src/sys/DELL755 amd64
>> Timecounter "i8254" frequency 1193182 Hz quality 0
>> CPU: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz (2394.00-MHz K8-class
>> CPU)
>>  Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
>>
>>  Features=0xbfebfbff
>>  Features2=0xe3bd
>>  AMD Features=0x20100800
>>  AMD Features2=0x1
>>  TSC: P-state invariant
>> real memory  = 8589934592 (8192 MB)
>> avail memory = 8103940096 (7728 MB)
>> ACPI APIC Table: 
>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> FreeBSD/SMP: 1 package(s) x 4 core(s)
>>  cpu0 (BSP): APIC ID:  0
>>  cpu1 (AP): APIC ID:  1
>>  cpu2 (AP): APIC ID:  2
>>  cpu3 (AP): APIC ID:  3
>> ioapic0: Changing APIC ID to 8
>> ioapic0  irqs 0-23 on motherboard
>> lapic0: Forcing LINT1 to edge trigger
>> kbd1 at kbdmux0
>> acpi0:  on motherboard
>> acpi0: [ITHREAD]
>> acpi0: Power Button (fixed)
>> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
>> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
>> cpu0:  on acpi0
>> cpu1:  on acpi0
>> cpu2:  on acpi0
>> cpu3:  on acpi0
>> acpi_hpet0:  iomem 0xfed0-0xfed003ff on
>> acpi0
>> Timecounter "HPET" frequency 14318180 Hz quality 900
>> acpi_button0:  on acpi0
>> pcib0:  port 0xcf8-0xcff on acpi0
>> pci0:  on pcib0
>> pcib1:  irq 16 at device 1.0 on pci0
>> pci1:  on pcib1
>> vgapci0:  port 0xdc80-0xdcff mem
>> 0xfd00-0xfdff,0xd000-0xdfff,0xfa00-0xfbff irq
>> 16 at device 0.0 on pci1
>> nvidia0:  on vgapci0
>> vgapci0: child nvidia0 requested pci_enable_busmaster
>> vgapci0: child nvidia0 requested pci_enable_io
>> vgapci0: child nvidia0 requested pci_enable_io
>> nvidia0: [ITHREAD]
>> pci0:  at device 3.0 (no driver attached)
>> atapci0:  port
>> 0xfe80-0xfe87,0xfe90-0xfe93,0xfea0-0xfea7,0xfeb0-0xfeb3,0xfef0-0xfeff
>> irq 18 at device 3.2 on pci0
>> atapci0: [ITHREAD]
>> ata2:  on atapci0
>> ata2: [ITHREAD]
>> ata3:  on atapci0
>> ata3: [ITHREAD]
>> pci0:  at device 3.3 (no driver attached)
>> em0:  port 0xecc0-0xecdf
>> mem 0xfebe-0xfebf,0xfebdb000-0xfebdbfff irq 21 at device 25.0
>> on pci0
>> em0: Using MSI interrupt
>> em0: [FILTER]
>> em0: Ethernet address: 00:1e:4f:d5:84:b7
>> uhci0:  port 0xff20-0xff3f irq 16
>> at device 26.0 on pci0
>> uhci0: [ITHREAD]
>> uhci0: LegSup = 0x2f00
>> usbus0:  on uhci0
>> uhci1:  port 0xff00-0xff1f irq 17
>> at device 26.1 on pci0
>> uhci1: [ITHREAD]
>> uhci1: LegSup = 0x2f00
>> usbus1:  on uhci1
>> ehci0:  mem
>> 0xfebd9c00-0xfebd9fff irq 22 at device 26.7 on pci0
>> ehci0: [ITHREAD]
>> usbus2: EHCI version 1.0
>> usbus2:  on ehci0
>> hdac0:  mem
>> 0xfebdc000-0xfebd irq 16 at device 27.0 on pci0
>> hdac0: HDA Driver Revision: 20100226_0142
>> hdac0: [ITHREAD]
>> pcib2:  irq 16 at device 28.0 on pci0
>> pci2:  on pcib2
>>

Re: em driver regression

2010-04-08 Thread Jack Vogel
On Thu, Apr 8, 2010 at 10:01 AM, Brandon Gooch
wrote:

> On Thu, Apr 8, 2010 at 11:52 AM, Jack Vogel  wrote:
> > Mike, I noticed this connection is only 100Mb, that isn't accidental? And,
> > is it possible for you to check a connection at 1Gb and see if the
> > watchdogs don't happen.
> >
> > My test engineer is running this code, and we are having trouble repro'ing
> > the issue, so any clues might help. Is the kernel 64 or 32 bit?
> >
> > Jack
> >
>
> Not to butt in or anything...
>

Not butting in :)  OK, so this all looks fine or am I missing something?

Jack


>
> 64-bit FreeBSD Stable, 1Gb em(4) connected to Cisco 2960G trunking port.
>
> My dmesg:
>
> Copyright (c) 1992-2010 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.0-STABLE #2 r206210:206343MS: Wed Apr  7 16:18:14 CDT 2010
>r...@bgooch755.se.edu:/usr/obj/usr/src/sys/DELL755 amd64
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Quad CPUQ6600  @ 2.40GHz (2394.00-MHz K8-class
> CPU)
>  Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
>
>  
> Features=0xbfebfbff
>  Features2=0xe3bd
>  AMD Features=0x20100800
>  AMD Features2=0x1
>  TSC: P-state invariant
> real memory  = 8589934592 (8192 MB)
> avail memory = 8103940096 (7728 MB)
> ACPI APIC Table: 
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
>  cpu2 (AP): APIC ID:  2
>  cpu3 (AP): APIC ID:  3
> ioapic0: Changing APIC ID to 8
> ioapic0  irqs 0-23 on motherboard
> lapic0: Forcing LINT1 to edge trigger
> kbd1 at kbdmux0
> acpi0:  on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0:  on acpi0
> cpu1:  on acpi0
> cpu2:  on acpi0
> cpu3:  on acpi0
> acpi_hpet0:  iomem 0xfed0-0xfed003ff on
> acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 900
> acpi_button0:  on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> pci0:  on pcib0
> pcib1:  irq 16 at device 1.0 on pci0
> pci1:  on pcib1
> vgapci0:  port 0xdc80-0xdcff mem
> 0xfd00-0xfdff,0xd000-0xdfff,0xfa00-0xfbff irq
> 16 at device 0.0 on pci1
> nvidia0:  on vgapci0
> vgapci0: child nvidia0 requested pci_enable_busmaster
> vgapci0: child nvidia0 requested pci_enable_io
> vgapci0: child nvidia0 requested pci_enable_io
> nvidia0: [ITHREAD]
> pci0:  at device 3.0 (no driver attached)
> atapci0:  port
> 0xfe80-0xfe87,0xfe90-0xfe93,0xfea0-0xfea7,0xfeb0-0xfeb3,0xfef0-0xfeff
> irq 18 at device 3.2 on pci0
> atapci0: [ITHREAD]
> ata2:  on atapci0
> ata2: [ITHREAD]
> ata3:  on atapci0
> ata3: [ITHREAD]
> pci0:  at device 3.3 (no driver attached)
> em0:  port 0xecc0-0xecdf
> mem 0xfebe-0xfebf,0xfebdb000-0xfebdbfff irq 21 at device 25.0
> on pci0
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:1e:4f:d5:84:b7
> uhci0:  port 0xff20-0xff3f irq 16
> at device 26.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x2f00
> usbus0:  on uhci0
> uhci1:  port 0xff00-0xff1f irq 17
> at device 26.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x2f00
> usbus1:  on uhci1
> ehci0:  mem
> 0xfebd9c00-0xfebd9fff irq 22 at device 26.7 on pci0
> ehci0: [ITHREAD]
> usbus2: EHCI version 1.0
> usbus2:  on ehci0
> hdac0:  mem
> 0xfebdc000-0xfebd irq 16 at device 27.0 on pci0
> hdac0: HDA Driver Revision: 20100226_0142
> hdac0: [ITHREAD]
> pcib2:  irq 16 at device 28.0 on pci0
> pci2:  on pcib2
> uhci2:  port 0xff80-0xff9f irq 23
> at device 29.0 on pci0
> uhci2: [ITHREAD]
> usbus3:  on uhci2
> uhci3:  port 0xff60-0xff7f irq 17
> at device 29.1 on pci0
> uhci3: [ITHREAD]
> usbus4:  on uhci3
> uhci4:  port 0xff40-0xff5f irq 18
> at device 29.2 on pci0
> uhci4: [ITHREAD]
> usbus5:  on uhci4
> ehci1:  mem
> 0xff980800-0xff980bff irq 23 at device 29.7 on pci0
> ehci1: [ITHREAD]
> usbus6: EHCI version 1.0
> usbus6:  on ehci1
> pcib3:  at device 30.0 on pci0
> pci3:  on pcib3
> atapci1:  port
> 0xc8e0-0xc8e7,0xc8d8-0xc8db,0xc8e8-0xc8ef,0xc8dc-0xc8df,0xc8f0-0xc8ff
> mem 0xf9dffc00-0xf9df irq 16 at device 0.0 on pci3
> atapci1: [ITHREAD]
> ata4:  on atapci1
> ata4: [ITHREAD]
> ata5:  on atapci1
> ata5: [ITHREAD]
> ata6:  on atapci1
> ata6: [ITHREAD]
> ata7:  on atapci1
> ata7: [ITHREAD]
> pci3:  at device 2.0 (no driver attached)
> isab0:  at device 31.0 on pci0
> isa0:  on isab0
> atapci2:  port
> 0xfe00-0xfe07,0xfe10-0xfe13,0xfe20-0xfe27,0xfe30-0xfe33,0xfec0-0xfedf
> mem 0xff97-0xff9707ff irq 18 at device 31.2 on pci0
> atapci2: [ITHREAD]
> atapci2: AHCI called from vendor specific driver
> atapci2: AHCI v1.20 controller with 6 3Gbps ports, PM supported
> ata8:  on atapci2
> ata8: [ITHREAD]
> ata9:  on atapci2
> at

Re: em driver regression

2010-04-08 Thread Brandon Gooch
On Thu, Apr 8, 2010 at 11:52 AM, Jack Vogel  wrote:
> Mike, I noticed this connection is only 100Mb, that isn't accidental? And,
> is it possible for you to check a connection at 1Gb and see if the
> watchdogs don't happen.
>
> My test engineer is running this code, and we are having trouble repro'ing
> the issue, so any clues might help. Is the kernel 64 or 32 bit?
>
> Jack
>

Not to butt in or anything...

64-bit FreeBSD Stable, 1Gb em(4) connected to Cisco 2960G trunking port.

My dmesg:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #2 r206210:206343MS: Wed Apr  7 16:18:14 CDT 2010
r...@bgooch755.se.edu:/usr/obj/usr/src/sys/DELL755 amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Quad CPUQ6600  @ 2.40GHz (2394.00-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
  
Features=0xbfebfbff
  Features2=0xe3bd
  AMD Features=0x20100800
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 8589934592 (8192 MB)
avail memory = 8103940096 (7728 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
ioapic0: Changing APIC ID to 8
ioapic0  irqs 0-23 on motherboard
lapic0: Forcing LINT1 to edge trigger
kbd1 at kbdmux0
acpi0:  on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0:  on acpi0
cpu1:  on acpi0
cpu2:  on acpi0
cpu3:  on acpi0
acpi_hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
acpi_button0:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  irq 16 at device 1.0 on pci0
pci1:  on pcib1
vgapci0:  port 0xdc80-0xdcff mem
0xfd00-0xfdff,0xd000-0xdfff,0xfa00-0xfbff irq
16 at device 0.0 on pci1
nvidia0:  on vgapci0
vgapci0: child nvidia0 requested pci_enable_busmaster
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: child nvidia0 requested pci_enable_io
nvidia0: [ITHREAD]
pci0:  at device 3.0 (no driver attached)
atapci0:  port
0xfe80-0xfe87,0xfe90-0xfe93,0xfea0-0xfea7,0xfeb0-0xfeb3,0xfef0-0xfeff
irq 18 at device 3.2 on pci0
atapci0: [ITHREAD]
ata2:  on atapci0
ata2: [ITHREAD]
ata3:  on atapci0
ata3: [ITHREAD]
pci0:  at device 3.3 (no driver attached)
em0:  port 0xecc0-0xecdf
mem 0xfebe-0xfebf,0xfebdb000-0xfebdbfff irq 21 at device 25.0
on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:1e:4f:d5:84:b7
uhci0:  port 0xff20-0xff3f irq 16
at device 26.0 on pci0
uhci0: [ITHREAD]
uhci0: LegSup = 0x2f00
usbus0:  on uhci0
uhci1:  port 0xff00-0xff1f irq 17
at device 26.1 on pci0
uhci1: [ITHREAD]
uhci1: LegSup = 0x2f00
usbus1:  on uhci1
ehci0:  mem
0xfebd9c00-0xfebd9fff irq 22 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2:  on ehci0
hdac0:  mem
0xfebdc000-0xfebd irq 16 at device 27.0 on pci0
hdac0: HDA Driver Revision: 20100226_0142
hdac0: [ITHREAD]
pcib2:  irq 16 at device 28.0 on pci0
pci2:  on pcib2
uhci2:  port 0xff80-0xff9f irq 23
at device 29.0 on pci0
uhci2: [ITHREAD]
usbus3:  on uhci2
uhci3:  port 0xff60-0xff7f irq 17
at device 29.1 on pci0
uhci3: [ITHREAD]
usbus4:  on uhci3
uhci4:  port 0xff40-0xff5f irq 18
at device 29.2 on pci0
uhci4: [ITHREAD]
usbus5:  on uhci4
ehci1:  mem
0xff980800-0xff980bff irq 23 at device 29.7 on pci0
ehci1: [ITHREAD]
usbus6: EHCI version 1.0
usbus6:  on ehci1
pcib3:  at device 30.0 on pci0
pci3:  on pcib3
atapci1:  port
0xc8e0-0xc8e7,0xc8d8-0xc8db,0xc8e8-0xc8ef,0xc8dc-0xc8df,0xc8f0-0xc8ff
mem 0xf9dffc00-0xf9df irq 16 at device 0.0 on pci3
atapci1: [ITHREAD]
ata4:  on atapci1
ata4: [ITHREAD]
ata5:  on atapci1
ata5: [ITHREAD]
ata6:  on atapci1
ata6: [ITHREAD]
ata7:  on atapci1
ata7: [ITHREAD]
pci3:  at device 2.0 (no driver attached)
isab0:  at device 31.0 on pci0
isa0:  on isab0
atapci2:  port
0xfe00-0xfe07,0xfe10-0xfe13,0xfe20-0xfe27,0xfe30-0xfe33,0xfec0-0xfedf
mem 0xff97-0xff9707ff irq 18 at device 31.2 on pci0
atapci2: [ITHREAD]
atapci2: AHCI called from vendor specific driver
atapci2: AHCI v1.20 controller with 6 3Gbps ports, PM supported
ata8:  on atapci2
ata8: [ITHREAD]
ata9:  on atapci2
ata9: [ITHREAD]
ata10:  on atapci2
ata10: [ITHREAD]
ata11:  on atapci2
ata11: [ITHREAD]
ata12:  on atapci2
ata12: [ITHREAD]
pci0:  at device 31.3 (no driver attached)
atrtc0:  port 0x70-0x7f irq 8 on acpi0
fdc0:  port 0x3f0-0x3f5,0x3f7 irq 6 drq
2 on acpi0
fdc0: [FILTER]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
orm0:  at iomem
0xc-0xce7ff,0xce800-0xd37ff,0xd3800-0xd57ff,0xd5800-0xd7fff o

Re: em driver regression

2010-04-08 Thread Jack Vogel
Mike, I noticed this connection is only 100Mb, that isn't accidental? And,
is it possible for you to check a connection at 1Gb and see if the
watchdogs don't happen.

My test engineer is running this code, and we are having trouble repro'ing
the issue, so any clues might help. Is the kernel 64 or 32 bit?

Jack


On Thu, Apr 8, 2010 at 6:20 AM, Mike Tancsa  wrote:

> At 09:12 AM 4/8/2010, Mike Tancsa wrote:
>
>> Hi Jack,
>> It looks like the latest MFC to RELENG_8 for the em driver has
>> caused a regression. The box is not doing much as it's a development server
>> in the lab. This is an Intel MB (DX58SO). dmesg and pciconf -lvc attached.
>>
>
>
>
> Here are the stats from the NIC as well.
>
> em0: Excessive collisions = 0
> em0: Sequence errors = 0
> em0: Defer count = 0
> em0: Missed Packets = 0
> em0: Receive No Buffers = 0
> em0: Receive Length Errors = 0
> em0: Receive errors = 0
> em0: Crc errors = 0
> em0: Alignment errors = 0
> em0: Collision/Carrier extension errors = 0
> em0: watchdog timeouts = 16
> em0: XON Rcvd = 0
> em0: XON Xmtd = 0
> em0: XOFF Rcvd = 0
> em0: XOFF Xmtd = 0
> em0: Good Packets Rcvd = 65839
> em0: Good Packets Xmtd = 13100
> em0: TSO Contexts Xmtd = 203
> em0: TSO Contexts Failed = 0
>
> It just grabs the IP via DHCP
>
> em0: flags=8843 metric 0 mtu 1500
>        options=399b
>        ether 00:1c:c0:95:0d:0d
>        inet 192.168.xx.yy netmask 0xff00 broadcast 192.168.xx.zz
>        media: Ethernet autoselect (100baseTX )
>        status: active
>
>
>
>
> 
> Mike Tancsa,                                      tel +1 519 651 3400
> Sentex Communications,                            m...@sentex.net
> Providing Internet since 1994                    www.sentex.net
> Cambridge, Ontario Canada                         www.sentex.net/mike
>
>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: em driver regression

2010-04-08 Thread Brandon Gooch
On Thu, Apr 8, 2010 at 11:04 AM, Jack Vogel  wrote:
> Brandon,
>
> Did the checkin of yesterday afternoon resolve the problem of the win7
> systems in VirtualBox? I will continue to look at this today.
>
> Jack
>

Sorry, I was a little unclear on that :(

No, the issue wasn't resolved even after the most recent commits.

I will be available for testing all day (and this evening if
required), let me know what you'd like from me, and I'll help any way
I can.

-Brandon

>
> On Thu, Apr 8, 2010 at 8:29 AM, Brandon Gooch 
> wrote:
>>
>> On Thu, Apr 8, 2010 at 9:46 AM, Mike Tancsa  wrote:
>> >
>> > OK, some more data... It seems dhclient is getting upset as well since
>> > the
>> > updated driver
>> >
>> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
>> > 255.255.255.255
>> > port 67 interval 6
>> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with bytes
>> > received 332.
>> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data after
>> > udp
>> > payload.
>> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
>> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
>> > 255.255.255.255
>> > port 67
>> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes
>> > received 332.
>> > Apr  8 10:28:40 ich10 dhclient[1383]: accepting packet with data after
>> > udp
>> > payload.
>> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPACK from 192.168.xx.1
>> >
>> > I also tried manually applying the patch below
>> >
>> >
>> > http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016189.html
>> >
>> > but still get the same error on dhclient
>> >
>> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes
>> > received 332.
>> >
>> > which was not there before the 7.0.0 driver update
>> >
>> > em0: flags=8843 metric 0 mtu 1500
>> >        options=399b
>> >        ether 00:1c:c0:95:0d:0d
>> >        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
>> >        media: Ethernet autoselect (100baseTX )
>> >        status: active
>> >
>> > Also, should not
>> >
>> > # ifconfig em0 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso
>> > 0(ich10)# ifconfig em0
>> > em0: flags=8843 metric 0 mtu 1500
>> >        options=388b
>> >        ether 00:1c:c0:95:0d:0d
>> >        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
>> >        media: Ethernet autoselect (100baseTX )
>> >        status: active
>> > 0(ich10)# killall dhclient
>> > 0(ich10)# dhclient em0
>> > DHCPREQUEST on em0 to 255.255.255.255 port 67
>> > ip length 328 disagrees with bytes received 332.
>> > accepting packet with data after udp payload.
>> > DHCPACK from 192.168.xx.1
>> > bound to 192.168.xx.219 -- renewal in 22777 seconds.
>> > 0(ich10)#
>> >
>> > disable all the vlan features on the nic ?
>> >
>> >        ---Mike
>> >
>> >
>> > 
>> > Mike Tancsa,                                      tel +1 519 651 3400
>> > Sentex Communications,                            m...@sentex.net
>> > Providing Internet since 1994                    www.sentex.net
>> > Cambridge, Ontario Canada                         www.sentex.net/mike
>> >
>> >
>>
>> I'm also seeing this.
>>
>> Jack, I've built the most recent revision from CURRENT and installed
>> it on the 8-STABLE machine. This is the computer I e-mailed about
>> yesterday (20100407) with which I've been having trouble with
>> VirtualBox 3.1.6 (FreeBSD Host) Windows Guests, bridged networking,
>> etc...
>>
>> Same situation with VirtualBox and still:
>>
>> ip length 328 disagrees with bytes received 332
>>
>> -Brandon
>
>


Re: em driver regression

2010-04-08 Thread Jack Vogel
Brandon,

Did the checkin of yesterday afternoon resolve the problem of the win7
systems in VirtualBox? I will continue to look at this today.

Jack


On Thu, Apr 8, 2010 at 8:29 AM, Brandon Gooch
wrote:

> On Thu, Apr 8, 2010 at 9:46 AM, Mike Tancsa  wrote:
> >
> > OK, some more data... It seems dhclient is getting upset as well since
> > the updated driver
> >
> > Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to
> > 255.255.255.255 port 67 interval 6
> > Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with bytes
> > received 332.
> > Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data after
> > udp payload.
> > Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to
> > 255.255.255.255 port 67
> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes
> > received 332.
> > Apr  8 10:28:40 ich10 dhclient[1383]: accepting packet with data after
> > udp payload.
> > Apr  8 10:28:40 ich10 dhclient[1383]: DHCPACK from 192.168.xx.1
> >
> > I also tried manually applying the patch below
> >
> > http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016189.html
> >
> > but still get the same error on dhclient
> >
> > Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes
> > received 332.
> >
> > which was not there before the 7.0.0 driver update
> >
> > em0: flags=8843 metric 0 mtu 1500
> >        options=399b
> >        ether 00:1c:c0:95:0d:0d
> >        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
> >        media: Ethernet autoselect (100baseTX )
> >        status: active
> >
> > Also, should not
> >
> > # ifconfig em0 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso
> > 0(ich10)# ifconfig em0
> > em0: flags=8843 metric 0 mtu 1500
> >        options=388b
> >        ether 00:1c:c0:95:0d:0d
> >        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
> >        media: Ethernet autoselect (100baseTX )
> >        status: active
> > 0(ich10)# killall dhclient
> > 0(ich10)# dhclient em0
> > DHCPREQUEST on em0 to 255.255.255.255 port 67
> > ip length 328 disagrees with bytes received 332.
> > accepting packet with data after udp payload.
> > DHCPACK from 192.168.xx.1
> > bound to 192.168.xx.219 -- renewal in 22777 seconds.
> > 0(ich10)#
> >
> > disable all the vlan features on the nic ?
> >
> >---Mike
> >
> >
> > 
> > Mike Tancsa,                                      tel +1 519 651 3400
> > Sentex Communications,                            m...@sentex.net
> > Providing Internet since 1994                    www.sentex.net
> > Cambridge, Ontario Canada                         www.sentex.net/mike
> >
> >
>
> I'm also seeing this.
>
> Jack, I've built the most recent revision from CURRENT and installed
> it on the 8-STABLE machine. This is the computer I e-mailed about
> yesterday (20100407) with which I've been having trouble with
> VirtualBox 3.1.6 (FreeBSD Host) Windows Guests, bridged networking,
> etc...
>
> Same situation with VirtualBox and still:
>
> ip length 328 disagrees with bytes received 332
>
> -Brandon
>


Re: em driver regression

2010-04-08 Thread Brandon Gooch
On Thu, Apr 8, 2010 at 9:46 AM, Mike Tancsa  wrote:
>
> OK, some more data... It seems dhclient is getting upset as well since the
> updated driver
>
> Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to 255.255.255.255
> port 67 interval 6
> Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with bytes
> received 332.
> Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data after udp
> payload.
> Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
> Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to 255.255.255.255
> port 67
> Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes
> received 332.
> Apr  8 10:28:40 ich10 dhclient[1383]: accepting packet with data after udp
> payload.
> Apr  8 10:28:40 ich10 dhclient[1383]: DHCPACK from 192.168.xx.1
>
> I also tried manually applying the patch below
>
> http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016189.html
>
> but still get the same error on dhclient
>
> Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes
> received 332.
>
> which was not there before the 7.0.0 driver update
>
> em0: flags=8843 metric 0 mtu 1500
>
>  options=399b
>        ether 00:1c:c0:95:0d:0d
>        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
>        media: Ethernet autoselect (100baseTX )
>        status: active
>
> Also, should not
>
> # ifconfig em0 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso
> 0(ich10)# ifconfig em0
> em0: flags=8843 metric 0 mtu 1500
>
>  options=388b
>        ether 00:1c:c0:95:0d:0d
>        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
>        media: Ethernet autoselect (100baseTX )
>        status: active
> 0(ich10)# killall dhclient
> 0(ich10)# dhclient em0
> DHCPREQUEST on em0 to 255.255.255.255 port 67
> ip length 328 disagrees with bytes received 332.
> accepting packet with data after udp payload.
> DHCPACK from 192.168.xx.1
> bound to 192.168.xx.219 -- renewal in 22777 seconds.
> 0(ich10)#
>
> disable all the vlan features on the nic ?
>
>        ---Mike
>
>
> 
> Mike Tancsa,                                      tel +1 519 651 3400
> Sentex Communications,                            m...@sentex.net
> Providing Internet since 1994                    www.sentex.net
> Cambridge, Ontario Canada                         www.sentex.net/mike
>
>

I'm also seeing this.

Jack, I've built the most recent revision from CURRENT and installed
it on the 8-STABLE machine. This is the computer I e-mailed about
yesterday (20100407), on which I've been having trouble with
VirtualBox 3.1.6 (FreeBSD host), Windows guests, bridged networking,
etc.

Same situation with VirtualBox and still:

ip length 328 disagrees with bytes received 332

-Brandon


Re: em driver regression

2010-04-08 Thread Mike Tancsa


OK, some more data... It seems dhclient has also been getting upset
since the driver update:


Apr  8 10:28:37 ich10 dhclient[1383]: DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 6
Apr  8 10:28:38 ich10 dhclient[1383]: ip length 328 disagrees with bytes received 332.
Apr  8 10:28:38 ich10 dhclient[1383]: accepting packet with data after udp payload.
Apr  8 10:28:38 ich10 dhclient[1383]: DHCPOFFER from 192.168.xx.1
Apr  8 10:28:40 ich10 dhclient[1383]: DHCPREQUEST on em0 to 255.255.255.255 port 67
Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes received 332.
Apr  8 10:28:40 ich10 dhclient[1383]: accepting packet with data after udp payload.
Apr  8 10:28:40 ich10 dhclient[1383]: DHCPACK from 192.168.xx.1

I also tried manually applying the patch below

http://lists.freebsd.org/pipermail/svn-src-head/2010-April/016189.html 



but still get the same error on dhclient

Apr  8 10:28:40 ich10 dhclient[1383]: ip length 328 disagrees with bytes received 332.


which was not there before the 7.0.0 driver update

em0: flags=8843 metric 0 mtu 1500
        options=399b
        ether 00:1c:c0:95:0d:0d
        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
        media: Ethernet autoselect (100baseTX )
        status: active

Also, should not

# ifconfig em0 -vlanmtu -vlanhwtag -vlanhwfilter -vlanhwtso
0(ich10)# ifconfig em0
em0: flags=8843 metric 0 mtu 1500
        options=388b
        ether 00:1c:c0:95:0d:0d
        inet 192.168.43.219 netmask 0xff00 broadcast 192.168.43.255
        media: Ethernet autoselect (100baseTX )
        status: active
0(ich10)# killall dhclient
0(ich10)# dhclient em0
DHCPREQUEST on em0 to 255.255.255.255 port 67
ip length 328 disagrees with bytes received 332.
accepting packet with data after udp payload.
DHCPACK from 192.168.xx.1
bound to 192.168.xx.219 -- renewal in 22777 seconds.
0(ich10)#

disable all the vlan features on the nic ?
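
For reference, the two options= values shown above (399b before the
ifconfig command, 388b after) can be compared bit by bit to see which
capability bits actually changed. This is only a minimal sketch: the
bit-to-name table is an assumption recalled from FreeBSD's net/if.h
(IFCAP_* values) and covers just a few bits, so it should be checked
against the header in the source tree actually in use.

```python
# Option masks copied from the two ifconfig outputs above.
before = 0x399b
after = 0x388b

# Bits set before but not after (cleared), and the reverse (newly set).
cleared = before & ~after
added = after & ~before

# ASSUMPTION: capability bit values as recalled from FreeBSD's net/if.h.
# Only a handful of bits are listed; verify against your own tree.
ASSUMED_IFCAP = {
    0x0008: "VLAN_MTU",
    0x0010: "VLAN_HWTAGGING",
    0x0080: "VLAN_HWCSUM",
    0x0100: "TSO4",
}


def names(mask):
    """Return the assumed capability names whose bit is set in mask."""
    return [name for bit, name in sorted(ASSUMED_IFCAP.items()) if mask & bit]


print("cleared:", hex(cleared), names(cleared))
print("added:  ", hex(added), names(added))
print("still set (from the assumed table):", names(after))
```

If the assumed table is right, only two bits were cleared by the command
and some VLAN-related bits remain set afterwards, which is the behaviour
being asked about here.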

---Mike



Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            m...@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike



Re: em driver regression

2010-04-08 Thread Mike Tancsa

At 09:12 AM 4/8/2010, Mike Tancsa wrote:

Hi Jack,
It looks like the latest MFC to RELENG_8 for the em driver
has caused a regression. The box is not doing much as it's a
development server in the lab. This is an Intel MB (DX58SO). dmesg
and pciconf -lvc attached.




Here are the stats from the NIC as well.

em0: Excessive collisions = 0
em0: Sequence errors = 0
em0: Defer count = 0
em0: Missed Packets = 0
em0: Receive No Buffers = 0
em0: Receive Length Errors = 0
em0: Receive errors = 0
em0: Crc errors = 0
em0: Alignment errors = 0
em0: Collision/Carrier extension errors = 0
em0: watchdog timeouts = 16
em0: XON Rcvd = 0
em0: XON Xmtd = 0
em0: XOFF Rcvd = 0
em0: XOFF Xmtd = 0
em0: Good Packets Rcvd = 65839
em0: Good Packets Xmtd = 13100
em0: TSO Contexts Xmtd = 203
em0: TSO Contexts Failed = 0

It just grabs the IP via DHCP

em0: flags=8843 metric 0 mtu 1500
        options=399b
        ether 00:1c:c0:95:0d:0d
        inet 192.168.xx.yy netmask 0xff00 broadcast 192.168.xx.zz
        media: Ethernet autoselect (100baseTX )
        status: active




Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            m...@sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
