Re: CARP and em0 timeout watchdog

2007-04-27 Thread Sven Willenberger
On Fri, 2007-04-20 at 14:44 -0400, Sven Willenberger wrote:
 On Fri, 2007-04-20 at 11:27 -0700, Jack Vogel wrote:
  On 4/20/07, Sven Willenberger [EMAIL PROTECTED] wrote:
   On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
 On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
  Having done more diagnostics I have found out it is not CARP related at
  all. It turns out that the same timeouts will happen when ftp'ing to the
  physical address IPs as well. There is also an odd situation here
  depending on which protocol I use. The two boxes are connected to a Dell
  Powerconnect 2616 gig switch with CAT6. If I scp files from the
  192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
  hiccup (I used dd to create various sized testfiles from 32M to 1G in
  size and just scp testfile* to the other box). On the other hand, if I
  connect to 192.168.0.19 using ftp (either active or passive) where ftp
  is being run through inetd, the interface resets (watchdog) within
  seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
  changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
  such behavioral differences between scp and ftp?

 You'll get a much higher throughput rate with FTP than you will with
 SSH, simply because encryption overhead is quite high (even with the
 Blowfish cipher).  With a very fast processor and on a gigE network
 you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
 That's the only difference I can think of.

 The watchdog resets I can't explain; Jack Vogel should be able to assist
 with that.  But it sounds like the resets only happen under very high
 throughput conditions (which is why you'd see it with FTP but not SSH).
   
What kind of hardware is this interface? Watchdogs mean TX cleanup
isn't happening in a reasonable time; without further data it's hard to
know what might be going on.
   
Jack
  
   from pciconf:
  
   [EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03
   hdr=0x00
   vendor   = 'Intel Corporation'
   device   = 'PRO/1000 PM'
   class    = network
   subclass = ethernet
   [EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00
   hdr=0x00
   vendor   = 'Intel Corporation'
   class    = network
   subclass = ethernet
  
   em0 is the interface in question.
  
   from dmesg:
  
   em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
   0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13
  
   em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
   0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14
  
  OH, this is an 82573. I've posted a firmware patcher a couple of
  different times; there is a bit in the MANC register that is incorrectly
  programmed in some vendors' systems. Can you search your email for
  that patcher? It needs to run from DOS. If you are unable to find
  it, let me know and I'll resend you a copy.
  
  Jack
 
 If you are referring to the dcgdis.ThisIsZip attachment, I found it in
 earlier threads, thanks. Will work on patching the nics and will keep
 the list updated.
 
 Thanks again.
 
 Sven
 
I am happy to report that the firmware patch seems to have fixed the
issue and I can transfer data across the gigE network without the
watchdog timeouts and lockups. Thanks again!!

Sven

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: CARP and em0 timeout watchdog

2007-04-20 Thread Sven Willenberger
On Wed, 2007-04-18 at 11:50 -0400, Sven Willenberger wrote:
 I currently have a FreeBSD 6.2-RELEASE-p3 SMP box with dual Intel
 PRO/1000 PM NICs configured as follows:
 
 em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
 options=b<RXCSUM,TXCSUM,VLAN_MTU>
 inet 192.168.0.18 netmask 0xff00 broadcast 192.168.0.255
 ether 00:30:48:8d:5c:0a
 media: Ethernet autoselect (1000baseTX full-duplex)
 status: active
 em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 4096
 options=b<RXCSUM,TXCSUM,VLAN_MTU>
 inet 10.10.0.18 netmask 0xfff8 broadcast 10.10.0.23
 ether 00:30:48:8d:5c:0b
 media: Ethernet autoselect (1000baseTX full-duplex)
 status: active
 
 the em0 interface connects to the LAN while the em1 interface is
 connected to an identical box via CAT6 crossover cable (for
 ggate/gmirror).
 
 Now, I have also configured a carp interface:
 
 carp0: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
 inet 192.168.0.20 netmask 0x
 carp: MASTER vhid 1 advbase 1 advskew 0
 
 There are twin boxes here and I am running Samba. The problem is that
 with transfers across the carp IP (192.168.0.20) I end up with em0
 resetting after a watchdog timeout error. This occurs whether I transfer
 files from a windows box using a share (samba) or via ftp. This problem
 does *not* occur if I ftp to the 192.168.0.19 interface (non-virtual). I
 suspected cabling at first so had all the cabling in question replaced
 with fresh CAT6 to no avail. Several gigs of data can be transferred to
 the real interface (em0) without any issue at all; a max of maybe 1 - 2
 Gig can be transferred connected to the carp'ed IP before the em0 reset.
 Any ideas here?
 
 Sven
 

Having done more diagnostics I have found out it is not CARP related at
all. It turns out that the same timeouts will happen when ftp'ing to the
physical address IPs as well. There is also an odd situation here
depending on which protocol I use. The two boxes are connected to a Dell
Powerconnect 2616 gig switch with CAT6. If I scp files from the
192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
hiccup (I used dd to create various sized testfiles from 32M to 1G in
size and just scp testfile* to the other box). On the other hand, if I
connect to 192.168.0.19 using ftp (either active or passive) where ftp
is being run through inetd, the interface resets (watchdog) within
seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
such behavioral differences between scp and ftp?

Sven
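
[Editor's sketch] The test described above (dd-generated files of
increasing size, copied to the other box with scp) could be reproduced
roughly as follows. The post does not say how dd was invoked, so the
/dev/zero source, the file names, and the helper name make_testfile are
all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: create a file of the given size in MiB, filled with
# zeros. bs is given in bytes so the same line works with BSD and GNU dd.
make_testfile() {
    # $1 = output path, $2 = size in MiB
    dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null
}

# Example (not run here): sizes from 32M to 1G, then copy them across.
#   for mb in 32 64 128 256 512 1024; do
#       make_testfile "testfile$mb" "$mb"
#   done
#   scp testfile* 192.168.0.19:
```

Running the same set of files over ftp instead of scp is what triggered
the watchdog resets in the report above.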



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Jeremy Chadwick
On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
 Having done more diagnostics I have found out it is not CARP related at
 all. It turns out that the same timeouts will happen when ftp'ing to the
 physical address IPs as well. There is also an odd situation here
 depending on which protocol I use. The two boxes are connected to a Dell
 Powerconnect 2616 gig switch with CAT6. If I scp files from the
 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
 hiccup (I used dd to create various sized testfiles from 32M to 1G in
 size and just scp testfile* to the other box). On the other hand, if I
 connect to 192.168.0.19 using ftp (either active or passive) where ftp
 is being run through inetd, the interface resets (watchdog) within
 seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
 changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
 such behavioral differences between scp and ftp?

You'll get a much higher throughput rate with FTP than you will with
SSH, simply because encryption overhead is quite high (even with the
Blowfish cipher).  With a very fast processor and on a gigE network
you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
That's the only difference I can think of.

The watchdog resets I can't explain; Jack Vogel should be able to assist
with that.  But it sounds like the resets only happen under very high
throughput conditions (which is why you'd see it with FTP but not SSH).

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Sven Willenberger
On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:
 On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
  Having done more diagnostics I have found out it is not CARP related at
  all. It turns out that the same timeouts will happen when ftp'ing to the
  physical address IPs as well. There is also an odd situation here
  depending on which protocol I use. The two boxes are connected to a Dell
  Powerconnect 2616 gig switch with CAT6. If I scp files from the
  192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
  hiccup (I used dd to create various sized testfiles from 32M to 1G in
  size and just scp testfile* to the other box). On the other hand, if I
  connect to 192.168.0.19 using ftp (either active or passive) where ftp
  is being run through inetd, the interface resets (watchdog) within
  seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
  changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
  such behavioral differences between scp and ftp?
 
 You'll get a much higher throughput rate with FTP than you will with
 SSH, simply because encryption overhead is quite high (even with the
 Blowfish cipher).  With a very fast processor and on a gigE network
 you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
 That's the only difference I can think of.
 
 The watchdog resets I can't explain; Jack Vogel should be able to assist
 with that.  But it sounds like the resets only happen under very high
 throughput conditions (which is why you'd see it with FTP but not SSH).
 

I guess it is possible that the traffic from ftp (or smb) is overloading
the interface; fwiw, if I increase the {recv,send}space to 131072 I can
achieve 32MB+/s using scp (and ftp shows similar values). The real
question is how to avoid these watchdog timeouts during heavy traffic;
the whole point here was to replace windows-based fileshare servers with
FreeBSD for the local network but at the moment it is proving
ineffectual as any samba file transfers stall (much like ftp). I see no
other error messages in the logfiles other than the watchdog timeouts
plus interface down/up messages.

Sven
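
[Editor's sketch] The socket buffer change mentioned above sets both
net.inet.tcp.recvspace and net.inet.tcp.sendspace to 131072. A minimal
way to generate the two settings, assuming they are meant to be kept
equal; the helper name tcp_space_conf is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print sysctl.conf-style lines that set both TCP
# socket buffer defaults to the same size in bytes.
tcp_space_conf() {
    for dir in recv send; do
        printf 'net.inet.tcp.%sspace=%s\n' "$dir" "$1"
    done
}

tcp_space_conf 131072
# To apply at runtime on FreeBSD (as root), each printed line can be
# passed to sysctl(8), e.g.:  sysctl net.inet.tcp.recvspace=131072
```

Note that, per the thread, changing these values affected throughput but
not the watchdog timeouts.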



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Clayton Milos


- Original Message - 
From: Sven Willenberger [EMAIL PROTECTED]

To: Jeremy Chadwick [EMAIL PROTECTED]
Cc: freebsd-stable@FreeBSD.org
Sent: Friday, April 20, 2007 6:25 PM
Subject: Re: CARP and em0 timeout watchdog



On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:

On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
 Having done more diagnostics I have found out it is not CARP related at
 all. It turns out that the same timeouts will happen when ftp'ing to the
 physical address IPs as well. There is also an odd situation here
 depending on which protocol I use. The two boxes are connected to a Dell
 Powerconnect 2616 gig switch with CAT6. If I scp files from the
 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
 hiccup (I used dd to create various sized testfiles from 32M to 1G in
 size and just scp testfile* to the other box). On the other hand, if I
 connect to 192.168.0.19 using ftp (either active or passive) where ftp
 is being run through inetd, the interface resets (watchdog) within
 seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
 changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
 such behavioral differences between scp and ftp?

You'll get a much higher throughput rate with FTP than you will with
SSH, simply because encryption overhead is quite high (even with the
Blowfish cipher).  With a very fast processor and on a gigE network
you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
That's the only difference I can think of.

The watchdog resets I can't explain; Jack Vogel should be able to assist
with that.  But it sounds like the resets only happen under very high
throughput conditions (which is why you'd see it with FTP but not SSH).



I guess it is possible that the traffic from ftp (or smb) is overloading
the interface; fwiw, if I increase the {recv,send}space to 131072 I can
achieve 32MB+/s using scp (and ftp shows similar values). The real
question is how to avoid these watchdog timeouts during heavy traffic;
the whole point here was to replace windows-based fileshare servers with
FreeBSD for the local network but at the moment it is proving
ineffectual as any samba file transfers stall (much like ftp). I see no
other error messages in the logfiles other than the watchdog timeouts
plus interface down/up messages.

Sven



Sorry for jumping on a thread here. I've had issues with em NICs as well,
especially under heavy load. What helped for me was turning on polling. I
recompiled the kernel with polling and turned it on in rc.conf and my
problems disappeared.


Are you running with polling on?

-Clay



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Jack Vogel

On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:

On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
 Having done more diagnostics I have found out it is not CARP related at
 all. It turns out that the same timeouts will happen when ftp'ing to the
 physical address IPs as well. There is also an odd situation here
 depending on which protocol I use. The two boxes are connected to a Dell
 Powerconnect 2616 gig switch with CAT6. If I scp files from the
 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
 hiccup (I used dd to create various sized testfiles from 32M to 1G in
 size and just scp testfile* to the other box). On the other hand, if I
 connect to 192.168.0.19 using ftp (either active or passive) where ftp
 is being run through inetd, the interface resets (watchdog) within
 seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
 changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
 such behavioral differences between scp and ftp?

You'll get a much higher throughput rate with FTP than you will with
SSH, simply because encryption overhead is quite high (even with the
Blowfish cipher).  With a very fast processor and on a gigE network
you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
That's the only difference I can think of.

The watchdog resets I can't explain; Jack Vogel should be able to assist
with that.  But it sounds like the resets only happen under very high
throughput conditions (which is why you'd see it with FTP but not SSH).


What kind of hardware is this interface? Watchdogs mean TX cleanup
isn't happening in a reasonable time; without further data it's hard to
know what might be going on.

Jack


Re: CARP and em0 timeout watchdog

2007-04-20 Thread Sven Willenberger
On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
 On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
  On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
   Having done more diagnostics I have found out it is not CARP related at
   all. It turns out that the same timeouts will happen when ftp'ing to the
   physical address IPs as well. There is also an odd situation here
   depending on which protocol I use. The two boxes are connected to a Dell
   Powerconnect 2616 gig switch with CAT6. If I scp files from the
   192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
   hiccup (I used dd to create various sized testfiles from 32M to 1G in
   size and just scp testfile* to the other box). On the other hand, if I
   connect to 192.168.0.19 using ftp (either active or passive) where ftp
   is being run through inetd, the interface resets (watchdog) within
   seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
   changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
   such behavioral differences between scp and ftp?
 
  You'll get a much higher throughput rate with FTP than you will with
  SSH, simply because encryption overhead is quite high (even with the
  Blowfish cipher).  With a very fast processor and on a gigE network
  you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
  That's the only difference I can think of.
 
  The watchdog resets I can't explain; Jack Vogel should be able to assist
  with that.  But it sounds like the resets only happen under very high
  throughput conditions (which is why you'd see it with FTP but not SSH).
 
 What kind of hardware is this interface? Watchdogs mean TX cleanup
 isn't happening in a reasonable time; without further data it's hard to
 know what might be going on.
 
 Jack

from pciconf:

[EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03
hdr=0x00
vendor   = 'Intel Corporation'
device   = 'PRO/1000 PM'
class    = network
subclass = ethernet
[EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00
hdr=0x00
vendor   = 'Intel Corporation'
class    = network
subclass = ethernet

em0 is the interface in question.

from dmesg:

em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13

em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14

Sven



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Sven Willenberger
On Fri, 2007-04-20 at 18:46 +0200, Clayton Milos wrote:
 - Original Message - 
 From: Sven Willenberger [EMAIL PROTECTED]
 To: Jeremy Chadwick [EMAIL PROTECTED]
 Cc: freebsd-stable@FreeBSD.org
 Sent: Friday, April 20, 2007 6:25 PM
 Subject: Re: CARP and em0 timeout watchdog
 
 
  On Fri, 2007-04-20 at 09:04 -0700, Jeremy Chadwick wrote:
  On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
   Having done more diagnostics I have found out it is not CARP related at
   all. It turns out that the same timeouts will happen when ftp'ing to the
   physical address IPs as well. There is also an odd situation here
   depending on which protocol I use. The two boxes are connected to a Dell
   Powerconnect 2616 gig switch with CAT6. If I scp files from the
   192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
   hiccup (I used dd to create various sized testfiles from 32M to 1G in
   size and just scp testfile* to the other box). On the other hand, if I
   connect to 192.168.0.19 using ftp (either active or passive) where ftp
   is being run through inetd, the interface resets (watchdog) within
   seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
   changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
   such behavioral differences between scp and ftp?
 
  You'll get a much higher throughput rate with FTP than you will with
  SSH, simply because encryption overhead is quite high (even with the
  Blowfish cipher).  With a very fast processor and on a gigE network
  you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
  That's the only difference I can think of.
 
  The watchdog resets I can't explain; Jack Vogel should be able to assist
  with that.  But it sounds like the resets only happen under very high
  throughput conditions (which is why you'd see it with FTP but not SSH).
 
 
  I guess it is possible that the traffic from ftp (or smb) is overloading
  the interface; fwiw, if I increase the {recv,send}space to 131072 I can
  achieve 32MB+/s using scp (and ftp shows similar values). The real
  question is how to avoid these watchdog timeouts during heavy traffic;
  the whole point here was to replace windows-based fileshare servers with
  FreeBSD for the local network but at the moment it is proving
  ineffectual as any samba file transfers stall (much like ftp). I see no
  other error messages in the logfiles other than the watchdog timeouts
  plus interface down/up messages.
 
  Sven
 
 
 Sorry for jumping on a thread here. I've had issues with em NICs as well,
 especially under heavy load. What helped for me was turning on polling. I
 recompiled the kernel with polling and turned it on in rc.conf and my
 problems disappeared.
 
 Are you running with polling on?
 

At first I did not have polling compiled in, so no. Then I compiled in
polling (and used options HZ=2000) but it didn't change anything.
Whether I have polling enabled or disabled on the interface, the outcome
is the same.

Sven



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Jack Vogel

On 4/20/07, Sven Willenberger [EMAIL PROTECTED] wrote:

On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
 On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
  On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
   Having done more diagnostics I have found out it is not CARP related at
   all. It turns out that the same timeouts will happen when ftp'ing to the
   physical address IPs as well. There is also an odd situation here
   depending on which protocol I use. The two boxes are connected to a Dell
   Powerconnect 2616 gig switch with CAT6. If I scp files from the
   192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
   hiccup (I used dd to create various sized testfiles from 32M to 1G in
   size and just scp testfile* to the other box). On the other hand, if I
   connect to 192.168.0.19 using ftp (either active or passive) where ftp
   is being run through inetd, the interface resets (watchdog) within
   seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
   changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
   such behavioral differences between scp and ftp?
 
  You'll get a much higher throughput rate with FTP than you will with
  SSH, simply because encryption overhead is quite high (even with the
  Blowfish cipher).  With a very fast processor and on a gigE network
  you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
  That's the only difference I can think of.
 
  The watchdog resets I can't explain; Jack Vogel should be able to assist
  with that.  But it sounds like the resets only happen under very high
  throughput conditions (which is why you'd see it with FTP but not SSH).

 What kind of hardware is this interface? Watchdogs mean TX cleanup
 isn't happening in a reasonable time; without further data it's hard to
 know what might be going on.

 Jack

from pciconf:

[EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03
hdr=0x00
vendor   = 'Intel Corporation'
device   = 'PRO/1000 PM'
class    = network
subclass = ethernet
[EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00
hdr=0x00
vendor   = 'Intel Corporation'
class    = network
subclass = ethernet

em0 is the interface in question.

from dmesg:

em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13

em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14


OH, this is an 82573. I've posted a firmware patcher a couple of
different times; there is a bit in the MANC register that is incorrectly
programmed in some vendors' systems. Can you search your email for
that patcher? It needs to run from DOS. If you are unable to find
it, let me know and I'll resend you a copy.

Jack
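
[Editor's sketch] The controller can be identified from the pciconf
output quoted above because the chip= field packs the PCI device ID in
the upper 16 bits and the vendor ID in the lower 16 bits (0x8086 is
Intel). A small illustration of splitting that field; the helper name
split_chip is hypothetical, and mapping a device ID to a specific part
such as the 82573 still needs the driver's or vendor's ID table.

```shell
#!/bin/sh
# Split a pciconf "chip=0xDDDDVVVV" value into device ID (upper 16 bits)
# and vendor ID (lower 16 bits). split_chip is a hypothetical helper name.
split_chip() {
    value=$(( $1 ))
    printf 'vendor=0x%04x device=0x%04x\n' \
        $(( value & 0xffff )) $(( (value >> 16) & 0xffff ))
}

split_chip 0x108c8086   # em0 from the pciconf output above
split_chip 0x109a8086   # em1
```

For em0 this prints vendor=0x8086 device=0x108c.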


Re: CARP and em0 timeout watchdog

2007-04-20 Thread Sven Willenberger
On Fri, 2007-04-20 at 11:27 -0700, Jack Vogel wrote:
 On 4/20/07, Sven Willenberger [EMAIL PROTECTED] wrote:
  On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
   On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
 Having done more diagnostics I have found out it is not CARP related at
 all. It turns out that the same timeouts will happen when ftp'ing to the
 physical address IPs as well. There is also an odd situation here
 depending on which protocol I use. The two boxes are connected to a Dell
 Powerconnect 2616 gig switch with CAT6. If I scp files from the
 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
 hiccup (I used dd to create various sized testfiles from 32M to 1G in
 size and just scp testfile* to the other box). On the other hand, if I
 connect to 192.168.0.19 using ftp (either active or passive) where ftp
 is being run through inetd, the interface resets (watchdog) within
 seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
 changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
 such behavioral differences between scp and ftp?
   
You'll get a much higher throughput rate with FTP than you will with
SSH, simply because encryption overhead is quite high (even with the
Blowfish cipher).  With a very fast processor and on a gigE network
you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
That's the only difference I can think of.
   
The watchdog resets I can't explain; Jack Vogel should be able to assist
with that.  But it sounds like the resets only happen under very high
throughput conditions (which is why you'd see it with FTP but not SSH).
  
   What kind of hardware is this interface? Watchdogs mean TX cleanup
   isn't happening in a reasonable time; without further data it's hard to
   know what might be going on.
  
   Jack
 
  from pciconf:
 
  [EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03
  hdr=0x00
  vendor   = 'Intel Corporation'
  device   = 'PRO/1000 PM'
  class    = network
  subclass = ethernet
  [EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00
  hdr=0x00
  vendor   = 'Intel Corporation'
  class    = network
  subclass = ethernet
 
  em0 is the interface in question.
 
  from dmesg:
 
  em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
  0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13
 
  em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
  0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14
 
 OH, this is an 82573. I've posted a firmware patcher a couple of
 different times; there is a bit in the MANC register that is incorrectly
 programmed in some vendors' systems. Can you search your email for
 that patcher? It needs to run from DOS. If you are unable to find
 it, let me know and I'll resend you a copy.
 
 Jack

If you are referring to the dcgdis.ThisIsZip attachment, I found it in
earlier threads, thanks. Will work on patching the nics and will keep
the list updated.

Thanks again.

Sven



Re: CARP and em0 timeout watchdog

2007-04-20 Thread Kevin Oberman
 Date: Fri, 20 Apr 2007 09:04:31 -0700
 From: Jeremy Chadwick [EMAIL PROTECTED]
 Sender: [EMAIL PROTECTED]
 
 On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
  Having done more diagnostics I have found out it is not CARP related at
  all. It turns out that the same timeouts will happen when ftp'ing to the
  physical address IPs as well. There is also an odd situation here
  depending on which protocol I use. The two boxes are connected to a Dell
  Powerconnect 2616 gig switch with CAT6. If I scp files from the
  192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
  hiccup (I used dd to create various sized testfiles from 32M to 1G in
  size and just scp testfile* to the other box). On the other hand, if I
  connect to 192.168.0.19 using ftp (either active or passive) where ftp
  is being run through inetd, the interface resets (watchdog) within
  seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
  changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
  such behavioral differences between scp and ftp?
 
 You'll get a much higher throughput rate with FTP than you will with
 SSH, simply because encryption overhead is quite high (even with the
 Blowfish cipher).  With a very fast processor and on a gigE network
 you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
 That's the only difference I can think of.

OK. Let's put the blame where it belongs. It's probably not the
encryption/decryption that slows down scp. It's the OpenSSH code. It is
only slightly related to CPU speed on reasonably modern CPUs. My Athlon
64 system goes to 23% CPU while transferring a large (150MB) file using
AES128-CBC. My Ethernet runs at over 11 MBytes/sec on a FastEthernet
about 5 nanoseconds long.

If you have a system slower than about 600 MHz, then it may be the
encryption.

At least 3 years ago the folks at the Pittsburgh Supercomputer Center
(PSC) were seeing slow scp performance and investigated. The systems
they were running on were pretty fast (it is a Supercomputer Center) and
should have been able to run at nearly 1 Gbps without problems, but
could not. FTP (which is a VERY inefficient protocol) was much faster.

They examined the OpenSSH source code and found the problem. They
published patches to OpenSSH and have continued to maintain them, but
the OpenBSD people have yet to incorporate them, so ssh is still slow on
long paths. This only applies to transfers over longer distances.
Transfers over the LAN should not be impacted by this.

More information and the patch are available at:
http://www.psc.edu/networking/projects/hpn-ssh/
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751
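
[Editor's sketch] The throughput figures quoted in this message and the
one it replies to are easier to compare as a share of raw line rate. The
arithmetic below assumes 1 MByte = 10^6 bytes and ignores Ethernet, IP
and TCP framing overhead, so it slightly understates how close a
transfer is to the practical ceiling. The helper name line_rate_pct is
hypothetical.

```shell
#!/bin/sh
# Percent of raw line rate used by a transfer.
#   $1 = observed rate in MByte/s (taken as 10^6 bytes/s, an assumption)
#   $2 = link speed in Mbit/s
# rate * 8 * 100 / link, i.e. rate * 800 / link, in integer arithmetic.
line_rate_pct() {
    echo $(( $1 * 800 / $2 ))
}

line_rate_pct 11 100    # 11 MByte/s on FastEthernet
line_rate_pct 9 1000    # upper SSH estimate on gigE
line_rate_pct 70 1000   # upper FTP estimate on gigE
```

These print 88, 7 and 56 (integer division truncates): 11 MByte/s is
already close to what FastEthernet can carry, while the gigE figures
leave most of the link idle.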




Re: CARP and em0 timeout watchdog

2007-04-20 Thread Jack Vogel

On 4/20/07, Brian McCann [EMAIL PROTECTED] wrote:

On 4/20/07, Jack Vogel [EMAIL PROTECTED] wrote:
 On 4/20/07, Sven Willenberger [EMAIL PROTECTED] wrote:
  On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
   On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
 Having done more diagnostics I have found out it is not CARP related at
 all. It turns out that the same timeouts will happen when ftp'ing to the
 physical address IPs as well. There is also an odd situation here
 depending on which protocol I use. The two boxes are connected to a Dell
 Powerconnect 2616 gig switch with CAT6. If I scp files from the
 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
 hiccup (I used dd to create various sized testfiles from 32M to 1G in
 size and just scp testfile* to the other box). On the other hand, if I
 connect to 192.168.0.19 using ftp (either active or passive) where ftp
 is being run through inetd, the interface resets (watchdog) within
 seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
 changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
 such behavioral differences between scp and ftp?
   
You'll get a much higher throughput rate with FTP than you will with
SSH, simply because encryption overhead is quite high (even with the
Blowfish cipher).  With a very fast processor and on a gigE network
you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
That's the only difference I can think of.
   
The watchdog resets I can't explain; Jack Vogel should be able to assist
with that.  But it sounds like the resets only happen under very high
throughput conditions (which is why you'd see it with FTP but not SSH).
  
   What kind of hardware is this interface? Watchdogs mean TX cleanup
   isn't happening in a reasonable time; without further data it's hard to
   know what might be going on.
  
   Jack
 
  from pciconf:
 
  [EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
  vendor   = 'Intel Corporation'
  device   = 'PRO/1000 PM'
  class    = network
  subclass = ethernet
  [EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
  vendor   = 'Intel Corporation'
  class    = network
  subclass = ethernet
 
  em0 is the interface in question.
 
  from dmesg:
 
  em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
  0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13
 
  em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
  0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14

 OH, this is an 82573, and I've posted a firmware patcher a couple of
 different times. There is a bit in the MANC register that is incorrectly
 programmed in some vendors' systems. Can you search your email for
 that patcher? It needs to run from DOS. If you are unable to find
 it, let me know and I'll resend you a copy.

 Jack
 ___
 freebsd-stable@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to [EMAIL PROTECTED]


FWIW, I've got 82546B cards and it's happening to me as well, but I'm
on 6.1.  I'm upgrading to 6.2 and trying polling as we speak.

--Brian


This is not the same problem. Until you are running 6.2-RELEASE it's
a whole other ballpark; there were locking issues between the driver
and the net layer that were fixed.

Jack


Re: CARP and em0 timeout watchdog

2007-04-20 Thread Brian McCann

On 4/20/07, Jack Vogel [EMAIL PROTECTED] wrote:

On 4/20/07, Sven Willenberger [EMAIL PROTECTED] wrote:
 On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
  On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
   On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
Having done more diagnostics I have found out it is not CARP related at
all. It turns out that the same timeouts will happen when ftp'ing to the
physical address IPs as well. There is also an odd situation here
depending on which protocol I use. The two boxes are connected to a Dell
Powerconnect 2616 gig switch with CAT6. If I scp files from the
192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
hiccup (I used dd to create various sized testfiles from 32M to 1G in
size and just scp testfile* to the other box). On the other hand, if I
connect to 192.168.0.19 using ftp (either active or passive) where ftp
is being run through inetd, the interface resets (watchdog) within
seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
such behavioral differences between scp and ftp?
  
   You'll get a much higher throughput rate with FTP than you will with
   SSH, simply because encryption overhead is quite high (even with the
   Blowfish cipher).  With a very fast processor and on a gigE network
   you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
   That's the only difference I can think of.
  
   The watchdog resets I can't explain; Jack Vogel should be able to assist
   with that.  But it sounds like the resets only happen under very high
   throughput conditions (which is why you'd see it with FTP but not SSH).
 
  What kind of hardware is this interface? Watchdogs mean TX cleanup
  isn't happening in a reasonable time; without further data it's hard to
  know what might be going on.
 
  Jack

 from pciconf:

 [EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
 vendor   = 'Intel Corporation'
 device   = 'PRO/1000 PM'
 class    = network
 subclass = ethernet
 [EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
 vendor   = 'Intel Corporation'
 class    = network
 subclass = ethernet

 em0 is the interface in question.

 from dmesg:

 em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
 0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13

 em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
 0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14

OH, this is an 82573, and I've posted a firmware patcher a couple of
different times. There is a bit in the MANC register that is incorrectly
programmed in some vendors' systems. Can you search your email for
that patcher? It needs to run from DOS. If you are unable to find
it, let me know and I'll resend you a copy.

Jack
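
For readers following the hardware identification: the chip= field in the pciconf lines quoted above packs the PCI device ID in the upper 16 bits and the vendor ID in the lower 16 bits. A small sketch of splitting it apart follows; the helper name is made up for illustration, and mapping a device ID to a marketing name such as "82573" still needs a PCI ID database.

```python
def split_chip_id(chip: int) -> tuple:
    """Split a pciconf 'chip' value into (vendor_id, device_id)."""
    vendor_id = chip & 0xFFFF          # low 16 bits: PCI vendor ID
    device_id = (chip >> 16) & 0xFFFF  # high 16 bits: PCI device ID
    return vendor_id, device_id


# chip= values copied from the pciconf output quoted in this thread
for chip in (0x108C8086, 0x109A8086):
    vendor_id, device_id = split_chip_id(chip)
    print(f"vendor={vendor_id:#06x} device={device_id:#06x}")
```

Vendor 0x8086 is Intel; the device IDs are what a driver or ID table would use to identify the specific controller.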



FWIW, I've got 82546B cards and it's happening to me as well, but I'm
on 6.1.  I'm upgrading to 6.2 and trying polling as we speak.

--Brian


--
_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_
Brian McCann
Systems & Network Administrator, K12USA

I don't have to take this abuse from you -- I've got hundreds of
people waiting to abuse me.
   -- Bill Murray, Ghostbusters


CARP and em0 timeout watchdog

2007-04-18 Thread Sven Willenberger
I currently have a FreeBSD 6.2-RELEASE-p3 SMP box with dual Intel
PRO/1000 PM NICs configured as follows:

em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        options=b<RXCSUM,TXCSUM,VLAN_MTU>
        inet 192.168.0.18 netmask 0xffffff00 broadcast 192.168.0.255
        ether 00:30:48:8d:5c:0a
        media: Ethernet autoselect (1000baseTX full-duplex)
        status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 4096
        options=b<RXCSUM,TXCSUM,VLAN_MTU>
        inet 10.10.0.18 netmask 0xfffffff8 broadcast 10.10.0.23
        ether 00:30:48:8d:5c:0b
        media: Ethernet autoselect (1000baseTX full-duplex)
        status: active
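
As a cross-check on the addressing in the listing above, the netmask implied by an interface address and its broadcast address can be derived mechanically. This is only an illustrative sketch using Python's standard ipaddress module; the function names are made up for the example.

```python
import ipaddress


def implied_network(addr: str, broadcast: str) -> ipaddress.IPv4Network:
    """Return the smallest network containing addr whose broadcast
    address equals the given broadcast address."""
    target = ipaddress.ip_address(broadcast)
    for prefix in range(32, -1, -1):
        net = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
        if net.broadcast_address == target:
            return net
    raise ValueError(f"no network gives broadcast {broadcast} for {addr}")


def netmask_hex(net: ipaddress.IPv4Network) -> str:
    """Format the netmask the way ifconfig prints it (0x + 8 hex digits)."""
    return f"{int(net.netmask):#010x}"


# Address/broadcast pairs taken from the em0 and em1 entries above
for addr, bcast in (("192.168.0.18", "192.168.0.255"),
                    ("10.10.0.18", "10.10.0.23")):
    net = implied_network(addr, bcast)
    print(addr, "->", net, netmask_hex(net))
```

For em0 this works out to a /24 and for em1 to a /29.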

The em0 interface connects to the LAN, while the em1 interface is
connected to an identical box via a CAT6 crossover cable (for
ggate/gmirror).

Now, I have also configured a carp interface:

carp0: flags=49<UP,LOOPBACK,RUNNING> mtu 1500
inet 192.168.0.20 netmask 0x
carp: MASTER vhid 1 advbase 1 advskew 0

There are twin boxes here and I am running Samba. The problem is that
with transfers across the CARP IP (192.168.0.20) I end up with em0
resetting after a watchdog timeout error. This occurs whether I transfer
files from a Windows box using a share (Samba) or via ftp. This problem
does *not* occur if I ftp to the 192.168.0.19 interface (non-virtual). I
suspected cabling at first, so I had all the cabling in question replaced
with fresh CAT6, to no avail. Several gigs of data can be transferred to
the real interface (em0) without any issue at all; a max of maybe 1 - 2
gigs can be transferred to the carp'ed IP before em0 resets.
Any ideas here?

Sven
