Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-06-07 Thread Stephen Hemminger
On Wed, 7 Jun 2006 12:33:21 -0700
"Guenther Thomsen" <[EMAIL PROTECTED]> wrote:

> I was perhaps a bit quick to declare victory. While the results below stand 
> and the machine survived the last few days (idle), it occurred to me only 
> today to have a look at the kernel's message buffer, where I found the following:
> --8<--
> sky2 eth0: enabling interface
> sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control none
> sky2 eth1: enabling interface
> sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control none
> audit(1149379670.514:3): audit_pid=1915 old=0 by auid=4294967295
> : hw csum failure.
> sky2 eth1: rx error, status 0x7ffc0001 length 444
> 
> Call Trace: {__skb_checksum_complete+76}
>    {__tcp_checksum_complete_user+33}
>    {tcp_rcv_established+817} {tcp_v4_do_rcv+43}
>    {sk_wait_data+203} {tcp_prequeue_process+121}
>    {tcp_recvmsg+1104} {sock_common_recvmsg+48}
>    {do_sock_read+209} {sock_aio_read+83}
>    {dev_queue_xmit+0} {do_sync_read+199}
>    {remove_wait_queue+18} {autoremove_wake_function+0}
>    {vfs_read+228} {sys_read+69}
>    {tracesys+209}
> : hw csum failure.
> sky2 eth1: rx error, status 0x7ffc0001 length 444

Different problem; I have seen it before.  Basically, if the receiver gets
overloaded, the packet FIFO fills up. The driver needs some kind of recovery
logic for this, probably just shutting down the receiver and restarting it.
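
For reference, the status word in the quoted log can be taken apart as sketched below. This is not driver code: the bit layout (frame length in bits 31..16, receive-OK in bit 8, Rx FIFO overflow in bit 0) is an assumption based on the GMR_FS_* definitions in sky2.h and should be checked against the tree in use, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the GMAC receive status word (cf. GMR_FS_* in sky2.h). */
#define RX_STATUS_LEN_SHIFT 16         /* bits 31..16: frame length field */
#define RX_STATUS_RX_OK     (1u << 8)  /* frame received without error */
#define RX_STATUS_FIFO_OV   (1u << 0)  /* receive FIFO overflowed */

/* Hypothetical helper: extract the length field from the status word. */
unsigned rx_status_len(uint32_t status)
{
	return status >> RX_STATUS_LEN_SHIFT;
}

/* Hypothetical helper: 1 if the FIFO overflow bit is set, else 0. */
int rx_status_fifo_overflow(uint32_t status)
{
	return (status & RX_STATUS_FIFO_OV) != 0;
}

/* Hypothetical helper: 1 if the receive-OK bit is set, else 0. */
int rx_status_ok(uint32_t status)
{
	return (status & RX_STATUS_RX_OK) != 0;
}

/* Self-check against the value seen in the log. */
void check_logged_status(void)
{
	const uint32_t status = 0x7ffc0001u;

	assert(rx_status_len(status) == 0x7ffc);  /* not the 444 that was logged */
	assert(rx_status_fifo_overflow(status));
	assert(!rx_status_ok(status));
}
```

Under that assumed layout, 0x7ffc0001 has the overflow bit set and the receive-OK bit clear, which would be consistent with a full receive FIFO.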
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-06-07 Thread Guenther Thomsen
I was perhaps a bit quick to declare victory. While the results below stand and 
the machine survived the last few days (idle), it occurred to me only today to 
have a look at the kernel's message buffer, where I found the following:
--8<--
sky2 eth0: enabling interface
sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control none
sky2 eth1: enabling interface
sky2 eth1: Link is up at 1000 Mbps, full duplex, flow control none
audit(1149379670.514:3): audit_pid=1915 old=0 by auid=4294967295
: hw csum failure.
sky2 eth1: rx error, status 0x7ffc0001 length 444

Call Trace: {__skb_checksum_complete+76}
   {__tcp_checksum_complete_user+33}
   {tcp_rcv_established+817} {tcp_v4_do_rcv+43}
   {sk_wait_data+203} {tcp_prequeue_process+121}
   {tcp_recvmsg+1104} {sock_common_recvmsg+48}
   {do_sock_read+209} {sock_aio_read+83}
   {dev_queue_xmit+0} {do_sync_read+199}
   {remove_wait_queue+18} {autoremove_wake_function+0}
   {vfs_read+228} {sys_read+69}
   {tracesys+209}
: hw csum failure.
sky2 eth1: rx error, status 0x7ffc0001 length 444

Call Trace: {__skb_checksum_complete+76}
   {__tcp_checksum_complete_user+33}
   {tcp_rcv_established+817} {tcp_v4_do_rcv+43}
   {sk_wait_data+203} {tcp_prequeue_process+121}
   {tcp_recvmsg+1104} {sock_common_recvmsg+48}
   {alloc_sock_iocb+20} {do_sock_read+209}
   {sock_aio_read+83} {do_sync_read+199}
   {remove_wait_queue+18} {autoremove_wake_function+0}
   {vfs_read+228} {sys_read+69}
   {tracesys+209}
: hw csum failure.
sky2 eth1: rx error, status 0x7ffc0001 length 444

Call Trace: {__skb_checksum_complete+76}
   {__tcp_checksum_complete_user+33}
   {tcp_rcv_established+817} {tcp_v4_do_rcv+43}
   {sk_wait_data+203} {tcp_prequeue_process+121}
   {tcp_recvmsg+1104} {sock_common_recvmsg+48}
   {do_sock_read+209} {sock_aio_read+83}
   {do_sync_read+199} {remove_wait_queue+18}
   {autoremove_wake_function+0} {current_kernel_time+13}
   {vfs_read+228} {sys_read+69}
   {tracesys+209}

sky2 eth0: rx error, status 0x7ffc0001 length 444
sky2 eth0: rx error, status 0x7ffc0001 length 444
sky2 eth1: rx error, status 0x7ffc0001 length 444
sky2 eth1: rx error, status 0x7ffc0001 length 444
sky2 eth1: rx error, status 0x7ffc0001 length 444
-->8--
Looks like we're almost, but not quite, there yet.

cheers
Guenther



RE: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-06-03 Thread Guenther Thomsen
I received the hardware back and took the opportunity to test with 
2.6.17-rc5-git11. So far I have run only a few small tests (ttcp on both 
interfaces: in, out, or mixed, with some 10e6 packets), but it looks good. No 
errors (well, 16 overruns in 76574513 packets) and line rate (about 111MB/s) 
on both channels simultaneously. Hurray!
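
As a rough cross-check of these figures, the best-case TCP goodput on gigabit Ethernet and the overrun fraction can be computed as sketched below. The assumptions are a 1500-byte MTU, TCP timestamps enabled (1448 payload bytes per segment, as in the tcpdump output elsewhere in this thread) and standard Ethernet framing overhead; the function names are illustrative only.

```c
#include <assert.h>

/* Bytes on the wire per full-size frame: preamble+SFD 8, Ethernet header 14,
 * payload 1500 (MTU), FCS 4, inter-frame gap 12. */
enum { WIRE_BYTES_PER_FRAME = 8 + 14 + 1500 + 4 + 12 };   /* 1538 */

/* TCP payload per frame: MTU minus IP header 20, TCP header 20,
 * timestamp option 12. */
enum { TCP_PAYLOAD_PER_FRAME = 1500 - 20 - 20 - 12 };     /* 1448 */

/* Best-case TCP goodput in bytes per second for a link rate in bit/s. */
double tcp_goodput_bytes_per_sec(double link_bits_per_sec)
{
	assert(link_bits_per_sec > 0.0);
	return link_bits_per_sec / 8.0
		* TCP_PAYLOAD_PER_FRAME / WIRE_BYTES_PER_FRAME;
}

/* Fraction of packets affected by an error, e.g. overruns per packet. */
double error_fraction(unsigned long errors, unsigned long packets)
{
	assert(packets > 0);
	return (double)errors / (double)packets;
}
```

For a 1 Gbit/s link this gives roughly 117.7e6 bytes/s, or about 112 when divided by 2^20, so a measured 111MB/s is plausibly line rate if ttcp reports in units of 2^20 bytes (an assumption). error_fraction(16, 76574513) comes out at about 2e-7.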

Thanks a lot for your continued efforts.
Guenther



RE: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-05-16 Thread Guenther Thomsen
Thanks for your continued work on it. I will test the patch as soon as I get 
access to the hardware again (probably next week).

best regards
Guenther 



Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-05-16 Thread Stephen Hemminger
Could you try the 2.6.17-rc4 version with this patch? It turns out the board
seems to give out-of-order status responses.

Ignore the vendor sk98lin driver; when I try the stock version it spends its
life resetting itself because it sets up the PCI bus wrong. If I fix that, it spends
its time getting confused because it can't handle intermixed status reports
properly (checksum et al. is per port, not per board).


 drivers/net/sky2.c |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)

792547bc5e8e4f7d5a1070a168056f429635c254
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index ffd267f..11e7914 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1020,8 +1020,27 @@ static int sky2_up(struct net_device *de
 	struct sky2_hw *hw = sky2->hw;
 	unsigned port = sky2->port;
 	u32 ramsize, rxspace, imask;
-	int err = -ENOMEM;
+	int cap, err;
+	struct net_device *otherdev = hw->dev[sky2->port^1];
 
+	/*
+	 * Reduce split transactions (and turn off) rx checksums to
+	 * prevent problems with dual ports.
+	 */
+	if (otherdev && netif_running(otherdev) &&
+	    (cap = pci_find_capability(hw->pdev, PCI_CAP_ID_PCIX))) {
+		struct sky2_port *osky2 = netdev_priv(otherdev);
+		u16 cmd;
+
+		cmd = sky2_pci_read16(hw, cap + PCI_X_CMD);
+		cmd &= ~PCI_X_CMD_MAX_SPLIT;
+		sky2_pci_write16(hw, cap + PCI_X_CMD, cmd);
+
+		sky2->rx_csum = 0;
+		osky2->rx_csum = 0;
+	}
+
+	err = -ENOMEM;
 	if (netif_msg_ifup(sky2))
 		printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
 
@@ -3067,12 +3086,7 @@ static __devinit struct net_device *sky2
 	sky2->duplex = -1;
 	sky2->speed = -1;
 	sky2->advertising = sky2_supported_modes(hw);
-
-	/* Receive checksum disabled for Yukon XL
-	 * because of observed problems with incorrect
-	 * values when multiple packets are received in one interrupt
-	 */
-	sky2->rx_csum = (hw->chip_id != CHIP_ID_YUKON_XL);
+	sky2->rx_csum = 1;
 
 	spin_lock_init(&sky2->phy_lock);
 	sky2->tx_pending = TX_DEF_PENDING;
-- 
1.2.4
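
The two small idioms the patch relies on, finding the sibling port with `port^1` and clearing one field of a register with `&= ~MASK`, can be tried in user space as sketched below. The mask value 0x0070 for PCI_X_CMD_MAX_SPLIT is an assumption taken from include/linux/pci_regs.h, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of PCI_X_CMD_MAX_SPLIT (cf. include/linux/pci_regs.h):
 * bits 6..4 of the PCI-X command register. */
#define PCIX_CMD_MAX_SPLIT 0x0070u

/* Index of the other port on a two-port board: 0 <-> 1. */
unsigned other_port(unsigned port)
{
	assert(port < 2);
	return port ^ 1u;
}

/* Clear the max-outstanding-split-transactions field and leave every
 * other bit of the command register untouched. */
uint16_t clear_max_split(uint16_t cmd)
{
	return (uint16_t)(cmd & ~PCIX_CMD_MAX_SPLIT);
}
```

For example, clear_max_split(0x0072) yields 0x0002: the field in bits 6..4 is zeroed (understood to select the smallest number of outstanding split transactions) while the low bits survive.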



Re: sky2 driver problems in 2.6.17-rc2-git6 (was: Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1)

2006-04-26 Thread Guenther Thomsen
On Wednesday 26 April 2006 09:44, Stephen Hemminger wrote:
> On Tue, 25 Apr 2006 17:06:25 -0700
>
> Guenther Thomsen <[EMAIL PROTECTED]> wrote:
[..]
> > Considering the recent NFS changes, I tried to get the system into
> > this state using just ttcp. With some determination, three more
> > hosts and a few million packets, I succeeded. This time eth0
> > truncated packets and traffic slowed to a crawl (~1 good packet
> > every 2s).
> >
> > Some progress has been made, but it's not quite solid yet.
>
> Are you saturating both ports on the card or only one?

On the system under test I started four ttcp sessions: two senders and 
two receivers (the second one on a non-standard port), one pair for 
each port (device). I'm not sure to what degree the device was 
saturated. It certainly should have been, since the remote hosts are 
capable of line rate, but I found the sending ttcp sessions on the 
system under test to be slow as long as traffic was incoming.

best regards
Guenther


Re: sky2 driver problems in 2.6.17-rc2-git6 (was: Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1)

2006-04-26 Thread Stephen Hemminger

sky2 driver problems in 2.6.17-rc2-git6 (was: Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1)

2006-04-25 Thread Guenther Thomsen
On Monday 17 April 2006 11:18, Stephen Hemminger wrote:
> I don't know what you are doing different, but my 2 port SysKonnect
> card is working fine.  Running SMP AMD64 and 2.6.17 latest.
>
> Showing full speed on both ports.
I missed that e-mail, sorry.

I just gave it another try, this time with 2.6.16.11. One port works 
fine (so far, I have done only very limited testing with ttcp). The second port 
does negotiate an IP address via DHCP, but the packets it receives 
seem to be garbled:

--8<--
   0x:   6175 6469 7428 3131 3435 3939 3430  ..audit(11459940
0x0010:  3031 2e39 3738 3a33 3829 3a20 7573 6572  01.978:38):.user
0x0020:  2070 6964 3d33 3230 3920 7569 643d   .pid=3209.uid=
12:56:23.725090 00:00:00:00:00:00 > 30:6e:6d:00:00:00 null I (s=32,r=55,P) len=42
12:56:24.603274 00:00:21:00:00:00 > 00:00:00:00:00:00 null disc/C len=43
12:56:26.619326 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
12:56:28.635346 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
12:56:29.734046 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
12:56:29.865239 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
12:56:30.651371 00:00:00:00:00:00 > a6:00:00:00:4d:04, ethertype Unknown (0xe20c), length 60:
0x:   6175 6469 7428 3131 3435 3939 3436  ..audit(11459946
0x0010:  3031 2e33 3639 3a34 3729 3a20 7573 6572  01.369:47):.user
0x0020:  2070 6964 3d33 3239 3820 7569 643d   .pid=3298.uid=
12:56:30.916718 00:00:f0:71:61:00 > 28:37:03:5b:3a:00 null I (s=16,r=0,C) len=42
12:56:30.923558 00:00:21:00:00:00 > 00:00:00:00:00:00 null rnr (r=55,C) len=42
12:56:32.667413 00:00:d0:2e:30:42 > 10:60:61:00:00:00, ethertype Unknown (0x572b), length 60:
0x:   d675 0d00   0200    ...u
0x0010:           
0x0020:       1300    ..
12:56:33.296384 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
12:56:33.303222 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
[..]
13:00:44.340062 00:00:00:00:00:00 > 5f:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:44.672350 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:44.868724 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:45.340123 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:46.340173 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:46.688433 IP truncated-ip - 1454 bytes missing! 192.168.65.66.40313 > 192.168.65.65.5001: . 1426488980:1426490428(1448) ack 1790562292 win 1460
13:00:48.704431 00:00:21:00:00:00 > 00:00:00:00:00:00 null I (s=17,r=18,C) len=42
13:00:48.886426 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:50.720463 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:52.736496 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:54.752522 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:54.927556 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
13:00:54.934394 00:00:00:00:00:00 > 00:00:00:00:00:00 null I (s=0,r=0,C) len=42
-->8--
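
To read the hex rows in the dump above without relying on the ASCII column, they can be decoded with a few lines of C. This is a sketch with a made-up helper name; like tcpdump's own ASCII column, it prints a '.' for every byte that is not a visible character (including spaces).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	return tolower(c) - 'a' + 10;
}

/* Decode a row of hex groups ("6175 6469 ...") into text.
 * Bytes without a visible glyph become '.'. Returns the byte count. */
size_t hexrow_to_ascii(const char *row, char *out, size_t outsz)
{
	size_t n = 0;

	assert(row && out && outsz > 0);
	while (*row && n + 1 < outsz) {
		if (isxdigit((unsigned char)row[0]) &&
		    isxdigit((unsigned char)row[1])) {
			int b = hexval((unsigned char)row[0]) * 16
				+ hexval((unsigned char)row[1]);
			out[n++] = isgraph(b) ? (char)b : '.';
			row += 2;
		} else {
			row++;	/* skip separators between groups */
		}
	}
	out[n] = '\0';
	return n;
}
```

Feeding it "6175 6469 7428 3131 3435 3939 3430" yields "audit(11459940", so the payload of these frames reads like kernel audit records rather than network data, which suggests (but does not prove) that the receive buffers hold unrelated host memory.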
On a different host connected to the same switch, traffic looks more like:
--8<--
12:01:49.388992 IP 192.168.64.1.ntp > 255.255.255.255.ntp: NTPv3, Broadcast, length 48
12:01:50.176550 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
12:01:51.235034 arp reply 192.168.64.32 is-at 00:0a:49:00:5e:8a
12:01:51.241857 arp reply 192.168.64.33 is-at 00:0a:49:00:5e:8b
12:01:51.891193 00:00:01:02:c8:58 > 45:c0:00:1c:00:20, ethertype Unknown (0xe000), length 60:
0x:  0001 1164 ee9b       ...d
0x0010:        2f6b 8c87  /k..
0x0020:           ..
12:01:52.192552 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
12:01:52.801392 arp reply 192.168.64.34 is-at 00:0a:49:00:5e:8c
12:01:52.808240 arp reply 192.168.64.35 is-at 00:0a:49:00:5e:8d
12:01:54.208495 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
12:01:56.224453 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
12:01:58.240464 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
12:02:00.029320 arp reply 192.168.64.39 is-at 00:0a:49:00:5e:ff
12:02:00.256420 802.1d config 8000.00:a0:d1:e1:b4:78.8026 root 8000.00:a0:d1:e1:b4:78 pathcost 0 age 0 max 20 hello 2 fdelay 15
-->8--

I noticed that the interrupt count is very low too (the interrupt count
as shown in /proc/interrupts is much higher):
--8<--
[EMAIL

Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-04-17 Thread Stephen Hemminger
I don't know what you are doing differently, but my 2 port SysKonnect card
is working fine.  Running SMP AMD64 and 2.6.17 latest.

Showing full speed on both ports.


Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-04-12 Thread Guenther Thomsen
On Wednesday 12 April 2006 14:48, Stephen Hemminger wrote:
> You need this patch, which Jeff hasn't applied yet.
> -
> Subject: sky2: crash when bringing up second port
>
> Sky2 driver will oops referencing bad memory if used on
> a dual port card.  The problem is accessing past end of
> MIB counter space.
>
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>
>
> --- test-2.6.orig/drivers/net/sky2.c
> +++ test-2.6/drivers/net/sky2.c
> @@ -579,8 +579,8 @@ static void sky2_mac_init(struct sky2_hw
>   reg = gma_read16(hw, port, GM_PHY_ADDR);
>   gma_write16(hw, port, GM_PHY_ADDR, reg | GM_PAR_MIB_CLR);
>
> - for (i = 0; i < GM_MIB_CNT_SIZE; i++)
> - gma_read16(hw, port, GM_MIB_CNT_BASE + 8 * i);
> + for (i = GM_MIB_CNT_BASE; i <= GM_MIB_CNT_END; i += 4)
> + gma_read16(hw, port, i);
>   gma_write16(hw, port, GM_PHY_ADDR, reg);
>
>   /* transmit control */
> --- test-2.6.orig/drivers/net/sky2.h
> +++ test-2.6/drivers/net/sky2.h
> @@ -1375,7 +1375,7 @@ enum {
>   GM_PHY_ADDR = 0x0088,   /* 16 bit r/w   GPHY Address Register */
>  /* MIB Counters */
>   GM_MIB_CNT_BASE = 0x0100,   /* Base Address of MIB Counters */
> - GM_MIB_CNT_SIZE = 256,
> + GM_MIB_CNT_END  = 0x025C,   /* Last MIB counter */
>  };

Thanks for the very quick response. The patch indeed prevents the panic 
when bringing up the second interface, but now the host doesn't receive 
any packets anymore. It still sends packets (ARP requests, naturally). 
If I inject the Ethernet address of a second host into the arp table of 
the test subject, ICMP Echo requests are sent, but then sendmsg's 
buffer space is exhausted (?):
--8<--
[EMAIL PROTECTED] ~]# arp -s 192.168.65.67 00:A0:D1:E1:F3:2C
[EMAIL PROTECTED] ~]# ping 192.168.65.67
PING 192.168.65.67 (192.168.65.67) 56(84) bytes of data.
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available

--- 192.168.65.67 ping statistics ---
19 packets transmitted, 0 received, 100% packet loss, time 37012ms
-->8--

There is no hint of a malfunction to be found in the kernel's message 
buffer.

best regards
Guenther


Re: kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-04-12 Thread Stephen Hemminger
You need this patch, which Jeff hasn't applied yet.
-
Subject: sky2: crash when bringing up second port

Sky2 driver will oops referencing bad memory if used on
a dual port card.  The problem is accessing past end of
MIB counter space.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>


--- test-2.6.orig/drivers/net/sky2.c
+++ test-2.6/drivers/net/sky2.c
@@ -579,8 +579,8 @@ static void sky2_mac_init(struct sky2_hw
 	reg = gma_read16(hw, port, GM_PHY_ADDR);
 	gma_write16(hw, port, GM_PHY_ADDR, reg | GM_PAR_MIB_CLR);
 
-	for (i = 0; i < GM_MIB_CNT_SIZE; i++)
-		gma_read16(hw, port, GM_MIB_CNT_BASE + 8 * i);
+	for (i = GM_MIB_CNT_BASE; i <= GM_MIB_CNT_END; i += 4)
+		gma_read16(hw, port, i);
 	gma_write16(hw, port, GM_PHY_ADDR, reg);
 
 	/* transmit control */
--- test-2.6.orig/drivers/net/sky2.h
+++ test-2.6/drivers/net/sky2.h
@@ -1375,7 +1375,7 @@ enum {
 	GM_PHY_ADDR = 0x0088,	/* 16 bit r/w	GPHY Address Register */
  /* MIB Counters */
 	GM_MIB_CNT_BASE = 0x0100,	/* Base Address of MIB Counters */
-	GM_MIB_CNT_SIZE = 256,
+	GM_MIB_CNT_END  = 0x025C,	/* Last MIB counter */
 };
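
The difference between the two loops is easy to see by computing which register offsets each one touches. The sketch below is a user-space illustration using only the constants from the patch; the function names are made up and nothing here touches hardware.

```c
#include <assert.h>

/* Constants as they appear in the patch. */
enum {
	GM_MIB_CNT_BASE = 0x0100,  /* base address of MIB counters */
	GM_MIB_CNT_SIZE = 256,     /* removed by the patch */
	GM_MIB_CNT_END  = 0x025C   /* last MIB counter, added by the patch */
};

/* Highest register offset read by the old loop. */
unsigned old_loop_last_offset(void)
{
	unsigned i, last = 0;

	for (i = 0; i < GM_MIB_CNT_SIZE; i++)
		last = GM_MIB_CNT_BASE + 8 * i;
	return last;
}

/* Number of reads issued by the new loop; *last gets the final offset. */
unsigned new_loop_reads(unsigned *last)
{
	unsigned i, n = 0;

	assert(last);
	for (i = GM_MIB_CNT_BASE; i <= GM_MIB_CNT_END; i += 4) {
		*last = i;
		n++;
	}
	return n;
}
```

The old loop ends at offset 0x100 + 8 * 255 = 0x8F8, far past the last counter at 0x25C, while the new loop issues 88 reads and stops exactly at 0x25C.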
 
 



kernel panic (on DHCP discover?) in sky2 driver of 2.6.17-rc1

2006-04-12 Thread Guenther Thomsen
I'm happy to report that the version of the sky2 driver in 2.6.17-rc1 
yields line rate at low CPU utilization (as determined using ttcp).

Unfortunately, it's not quite bug-free yet ;-} 

When enabling the second interface (of the same network controller) the 
kernel panics (perhaps during DHCP discovery?):

--8<--
[EMAIL PROTECTED] ~]# ifup eth1

Determining IP information for eth1...Unable to handle kernel paging 
request at c2014000 RIP:
{sky2_mac_init+522}
PGD 13fc49067 PUD 13fc4a067 PMD 13fc4b067 PTE 0
Oops:  [1] SMP
CPU 3os linked in: autofs4 sr_mod cdrom dm_mod button usb_storage 
uhci_hcd 11BladeRunner_sk98lin #1
RIP: 0010:[] {sky2_up+334} {sprintf+144} 
{inet_ioctl{s_ioctl+44} 
{sys_ioctl+107}
   
CR2: c2014000
 <0>Kerne
-->8--

or (2nd try): 

--8<--
[EMAIL PROTECTED] ~]# Unable to handle kernel paging request at 
c2014000 RIP:
{sky2_mac_init+522}
PGD 13fc49067 PUD 13fc4a067 PMD 13fc4b067 PTE 0
Oops:  [1] SMP
CPU 2
Modules linked in: autofs4 sr_mod cdrom dm_mod button usb_storage 
uhci_hcd ehci_hcd e752x_edac edac_mc shpcR: 00:[] 
{sky2_mac_init+522}
RDX: 4008 RSI: c201 RDI: 
R1 01000
R13: 81013f0511a8 R14: 0001 R15: S CR0: 
8005003b
CR2: c2014000 CR3: 00013425b0001000 81013f051000 
81013f051500 
   Call Trace: {sky2_up+334} 
{dev_op844cf4>{sprintf+144} 
{inet_ioctl+74}
   {ys_ioctl+107}
   {tracesys+209}
+} RSP 
CR2: c2014000
 <0>Kernel panic - n
-->8--

The kernel is vanilla 2.6.17-rc1, the sky2 driver was compiled into the 
kernel. OS is RedHat Fedora Core 4. The kernel was compiled using 
gcc32.

The system is a blade of a Penguin Computing BladeRunner 4130; it 
contains two Xeon CPUs (HT enabled) and an on-board Marvell 8062 network 
controller (88E8062 is stamped on the chip).

The hardware seems to work fine using 2.6.15(.7) with the sk98lin driver 
version 8.31 of Syskonnect (skd.de).

Please let me know, if I can provide further information or assist in 
any other way.

best regards
Guenther