Re: b44: regression in 2.6.22 (resend)

2007-06-04 Thread Thomas Gleixner
On Mon, 2007-06-04 at 09:09 -0700, Stephen Hemminger wrote:
> > > I did the test with a 2.6.22-rc3-git4 kernel and the p54 driver built
> > > externally as a module.
> > 
> > Can you look at iperf to figure out, whether it does some weird timer
> > stuff (high frequency interval timer or such) ? Either check the code or
> > strace it.
>
> It is the receiver doing a tight loop doing gettimeofday/recv calls.
> 
> 
> sendto(-1227715616, 0xc, 3085438964, 0, {...}, 3067249832) = 0
> gettimeofday({1180973726, 981615}, NULL) = 0
> gettimeofday({1180973726, 981751}, NULL) = 0
> futex(0x8055c64, 0x5 /* FUTEX_??? */, 1) = 1
> futex(0x8055c90, FUTEX_WAKE, 1) = 0
> recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
> gettimeofday({1180973726, 982754}, NULL) = 0
> recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
> gettimeofday({1180973726, 983790}, NULL) = 0

Well, gettimeofday() is not affected by the highres code, but

> nanosleep({0, 0}, NULL) = 0
> nanosleep({0, 0}, NULL) = 0

is. The nanosleep call with a relative timeout of 0 returns immediately
with highres enabled, while it sleeps at least until the next tick
arrives when highres is off. Are there more of those stupid sleeps in
the code ?

tglx
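
To make the point about zero-length sleeps concrete, here is a minimal sketch (not taken from iperf) that times back-to-back nanosleep() calls with a relative timeout of 0. The helper names elapsed_ns and avg_zero_sleep_ns are made up for illustration. Going by the explanation above, one would expect the per-call cost to be much smaller with highres enabled than with highres=off, but the sketch makes no claim about actual numbers on any given kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC samples. */
static int64_t elapsed_ns(const struct timespec *start,
                          const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical helper: call nanosleep() with a zero relative timeout
 * `iterations` times and return the average wall-clock cost per call
 * in nanoseconds.
 */
static int64_t avg_zero_sleep_ns(int iterations)
{
    const struct timespec zero = { 0, 0 };
    struct timespec start, end;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++)
        nanosleep(&zero, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return elapsed_ns(&start, &end) / iterations;
}
```

Calling avg_zero_sleep_ns(1000) once on a kernel booted with highres=off and once without the parameter would be one way to compare the two behaviours described above.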


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: b44: regression in 2.6.22 (resend)

2007-06-04 Thread Stephen Hemminger
On Mon, 04 Jun 2007 08:39:48 +0200
Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Sun, 2007-06-03 at 18:26 +0200, Maximilian Engelhardt wrote:
> > > Is there any other strange behavior of the high res enabled kernel than
> > > the b44 problem ?
> > 
> > I didn't notice anything in the past (as I wrote). But today I did some
> > tests for an updated version of the p54 mac80211 wlan driver and I noticed
> > exactly the same problem:
> > 
> > when booting with highres=off everything is fine.
> > But when I boot a highres enabled kernel and I do the iperf-test with the
> > p54 driver, my system becomes unresponsive during the test. It seems to be
> > exactly the same problem I have with the b44 driver.
> > So this might not be a bug in the b44 code but a bug somewhere in the linux
> > networking code.
> > 
> > I did the test with a 2.6.22-rc3-git4 kernel and the p54 driver built
> > externally as a module.
> 
> Can you look at iperf to figure out, whether it does some weird timer
> stuff (high frequency interval timer or such) ? Either check the code or
> strace it.
> 
>   tglx
> 
> 


It is the receiver doing a tight loop doing gettimeofday/recv calls.


sendto(-1227715616, 0xc, 3085438964, 0, {...}, 3067249832) = 0
gettimeofday({1180973726, 981615}, NULL) = 0
gettimeofday({1180973726, 981751}, NULL) = 0
futex(0x8055c64, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055c90, FUTEX_WAKE, 1) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 982754}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 983790}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 984355}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 984706}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 985111}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 985499}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 986088}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 986436}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 986916}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 987397}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 987872}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 988440}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 988823}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 989314}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 990029}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 990890}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 991803}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 992616}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 993105}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 993585}, NULL) = 0
recv(4, "\0\0\0\0\0\0\0\1\0\0\23\211\0\0\0\0\0\0\0\0\377\377\364"..., 8192, 0) = 8192
gettimeofday({1180973726, 994014}, NULL) = 0
...

recv(4, "45678901234567890123456789012345"..., 8192, 0) = 1448
gettimeofday({1180973757, 172437}, NULL) = 0
recv(4, "23456789012345678901234567890123"..., 8192, 0) = 1448
gettimeofday({1180973757, 172576}, NULL) = 0
recv(4, "01234567890123456789012345678901"..., 8192, 0) = 1632
gettimeofday({1180973757, 172752}, NULL) = 0
recv(4, "", 8192, 0) = 0
gettimeofday({1180973757, 172797}, NULL) = 0
gettimeofday({1180973757, 172817}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
nanosleep({0, 0}, NULL) = 0
close(4) = 0
futex(0x8055d04, 0x5 /* FUTEX_??? */, 1) = 1
futex(0x8055d30, FUTEX_WAKE, 1) = 0
_exit(0)
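
For readers who want to see the shape of that loop in code, the following is a rough model of what the trace suggests the receiver does: read up to 8192 bytes, take a timestamp, and repeat until the sender closes the connection. It is a sketch reconstructed from the strace output only, not iperf source; the function name drain_socket and its parameters are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/*
 * Hypothetical model of the receive loop seen in the trace.
 * Returns the total number of bytes received, or -1 if recv() fails.
 * If `calls` is non-NULL it receives the number of recv() calls made.
 */
static long drain_socket(int fd, int *calls)
{
    char buf[8192];
    struct timeval now;
    long total = 0;
    int n_calls = 0;

    assert(fd >= 0);

    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        n_calls++;
        if (n < 0)
            return -1;
        /* One timestamp per recv(), matching the pattern in the trace. */
        gettimeofday(&now, NULL);
        if (n == 0)
            break;          /* orderly shutdown by the sender */
        total += n;
    }

    if (calls)
        *calls = n_calls;
    return total;
}
```

Nothing in this loop sleeps or arms a timer, which fits the observation that the only timer-related calls in the trace are the two nanosleep({0, 0}) calls at the very end.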

-- 
Stephen Hemminger <[EMAIL PROTECTED]>

Re: b44: regression in 2.6.22 (resend)

2007-06-04 Thread Thomas Gleixner
On Sun, 2007-06-03 at 18:26 +0200, Maximilian Engelhardt wrote:
> > Is there any other strange behavior of the high res enabled kernel than
> > the b44 problem ?
> 
> I didn't notice anything in the past (as I wrote). But today I did some tests 
> for an updated version of the p54 mac80211 wlan driver and I noticed exactly 
> the same problem:
> 
> when booting with highres=off everything is fine.
> But when I boot a highres enabled kernel and I do the iperf-test with the
> p54 driver, my system becomes unresponsive during the test. It seems to be
> exactly the same problem I have with the b44 driver.
> So this might not be a bug in the b44 code but a bug somewhere in the linux
> networking code.
> 
> I did the test with a 2.6.22-rc3-git4 kernel and the p54 driver built
> externally as a module.

Can you look at iperf to figure out, whether it does some weird timer
stuff (high frequency interval timer or such) ? Either check the code or
strace it.

tglx




Re: b44: regression in 2.6.22 (resend)

2007-06-03 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?

I didn't notice anything in the past (as I wrote). But today I did some tests 
for an updated version of the p54 mac80211 wlan driver and I noticed exactly 
the same problem:

when booting with highres=off everything is fine.
But when I boot a highres enabled kernel and I do the iperf-test with the p54 
driver, my system becomes unresponsive during the test. It seems to be 
exactly the same problem I have with the b44 driver.
So this might not be a bug in the b44 code but a bug somewhere in the linux 
networking code.

I did the test with a 2.6.22-rc3-git4 kernel and the p54 driver built 
externally as a module.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-30 Thread Michael Buesch
On Tuesday 29 May 2007 23:36:51 Gary Zambrano wrote:
> On Tue, 2007-05-29 at 18:39 -0400, Jeff Garzik wrote:
> 
> > We check for 0xffffffff because that is often how a fault is indicated, 
> > when the memory location is read during or immediately after hotplug (or 
> > if the PCI bus is truly faulty).  So for most hardware, you see
> > 
> > 	tmp = read(irq status)
> > 	if (!tmp)
> > 		return irq-none /* no irq events raised */
> > 	if (tmp == 0xffffffff)
> > 		return irq-none /* hot unplug or h/w fault */
> > 
> > and the method that determines no interrupt handling is needed.
> > 
> 
> I guess you are right, but then shouldn't the driver be checking for
> faults in other parts of the code too? What if a fault/hotplug occurs
> immediately after an interrupt, but before a tx?
> Thanks,

Well, in general it's not desired (or even possible) to check every
single MMIO access. General practice is to check on entering the IRQ handler
and on values from registers or DMA which could crash the driver.
For example on DMA we get some cookie back from the device in the TX
status report that says which packet this was associated to.
That value is used to look up a table.
In bcm43xx I initialize that to 0 in the driver, which is not a valid value.
Later I check for that. So if the device is unplugged before the DMA
on that value was complete, it will recognize it and it won't crash.
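
As an illustration of the cookie check described above, here is a small user-space sketch. It is not bcm43xx code: the ring size, the cookie encoding (slot index plus one, so that 0 is never a valid cookie), and all names are assumptions chosen to show the idea of rejecting a value the device never filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_SLOTS       8    /* hypothetical ring size */
#define COOKIE_INVALID 0u   /* 0 is never handed out as a cookie */

struct tx_slot {
    int in_flight;          /* nonzero while a packet occupies the slot */
};

struct tx_ring {
    struct tx_slot slots[TX_SLOTS];
};

/* Assumed encoding: cookie = slot index + 1, so 0 stays invalid. */
static uint16_t slot_to_cookie(size_t idx)
{
    assert(idx < TX_SLOTS);
    return (uint16_t)(idx + 1);
}

/*
 * Look up the slot for a cookie read back from a TX status report.
 * Returns NULL for the invalid sentinel, for out-of-range values
 * (e.g. all-ones from a removed device), and for idle slots.
 */
static struct tx_slot *lookup_tx_slot(struct tx_ring *ring, uint16_t cookie)
{
    struct tx_slot *slot;

    if (cookie == COOKIE_INVALID || cookie > TX_SLOTS)
        return NULL;

    slot = &ring->slots[cookie - 1];
    return slot->in_flight ? slot : NULL;
}
```

The point of the sentinel is that a status buffer the device never wrote still holds 0, so the lookup fails cleanly instead of indexing the table with garbage.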

In general we can only do our very best to prevent bad side effects from
a pull-in-full-operation. We can't do much here anyway. Best we can do
is to _try_ to prevent a crash. It might not always be 100% possible
to do so, though.

-- 
Greetings Michael.

Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano
On Tue, 2007-05-29 at 18:39 -0400, Jeff Garzik wrote:

> We check for 0xffffffff because that is often how a fault is indicated, 
> when the memory location is read during or immediately after hotplug (or 
> if the PCI bus is truly faulty).  So for most hardware, you see
> 
> 	tmp = read(irq status)
> 	if (!tmp)
> 		return irq-none /* no irq events raised */
> 	if (tmp == 0xffffffff)
> 		return irq-none /* hot unplug or h/w fault */
> 
> and the method that determines no interrupt handling is needed.
> 

I guess you are right, but then shouldn't the driver be checking for
faults in other parts of the code too? What if a fault/hotplug occurs
immediately after an interrupt, but before a tx?
Thanks,
Gary



Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Jeff Garzik

Gary Zambrano wrote:
> The b44 interrupt status reg returns a value of 0 if no interrupts are
> pending. The b44 uses a mask to determine which bits (events) can
> generate device interrupts on the system. If the masked interrupt status
> register bits are not asserted, then the b44 will return to the system
> with handled = 0. 
> So, I think the way the b44 interrupt code is written should be ok and
> not a bug. 



This is normal.

We check for 0xffffffff because that is often how a fault is indicated, 
when the memory location is read during or immediately after hotplug (or 
if the PCI bus is truly faulty).  So for most hardware, you see


	tmp = read(irq status)
	if (!tmp)
		return irq-none /* no irq events raised */
	if (tmp == 0xffffffff)
		return irq-none /* hot unplug or h/w fault */

and the method that determines no interrupt handling is needed.
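
The pseudocode above can be written as a small, compilable C function. This is only a user-space sketch of the pattern for a 32-bit status register; the enum and function names are invented here and do not correspond to any kernel API (a real handler would return IRQ_NONE or IRQ_HANDLED).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_verdict { IRQ_NOT_OURS = 0, IRQ_OURS = 1 };

/* Decide from a 32-bit status read whether there is anything to handle. */
static enum irq_verdict check_irq_status(uint32_t status)
{
    if (status == 0)
        return IRQ_NOT_OURS;    /* no irq events raised */
    if (status == 0xffffffffu)
        return IRQ_NOT_OURS;    /* hot unplug or h/w fault */
    return IRQ_OURS;
}

/* Usage examples for the three cases discussed in the thread. */
static void check_irq_status_examples(void)
{
    assert(check_irq_status(0x00000000u) == IRQ_NOT_OURS);
    assert(check_irq_status(0xffffffffu) == IRQ_NOT_OURS);
    assert(check_irq_status(0x00010000u) == IRQ_OURS);
}
```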

Regards,

Jeff




Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano
On Tue, 2007-05-29 at 22:45 +0200, Michael Buesch wrote:
> On Tuesday 29 May 2007 16:14:35 Gary Zambrano wrote:
> > On Mon, 2007-05-28 at 16:55 +0200, Michael Buesch wrote:
> > > On Monday 28 May 2007 16:12:12 Maximilian Engelhardt wrote:
> > > > On Monday 28 May 2007, Michael Buesch wrote:
> > > > > Can you also test the following patch?
> > > > > I think there's a bug in b44 that it doesn't properly discard
> > > > > shared IRQs, so it might possibly generate a NAPI storm, dunno.
> > > > > Worth a try.
> > > > >
> > > > > Index: linux-2.6.22-rc3/drivers/net/b44.c
> > > > > ===
> > > > > --- linux-2.6.22-rc3.orig/drivers/net/b44.c	2007-05-27 23:01:44.0 +0200
> > > > > +++ linux-2.6.22-rc3/drivers/net/b44.c	2007-05-28 12:48:27.0 +0200
> > > > > @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
> > > > >  	spin_lock(&bp->lock);
> > > > >
> > > > >  	istat = br32(bp, B44_ISTAT);
> > > > > +	if (istat == 0xffffffff)
> > > > > +		goto out; /* Shared IRQ not for us */
> > > > >  	imask = br32(bp, B44_IMASK);
> > > > >
> > > > >  	/* The interrupt mask register controls which interrupt bits
> > > > > @@ -942,6 +944,7 @@ irq_ack:
> > > > >  		bw32(bp, B44_ISTAT, istat);
> > > > >  		br32(bp, B44_ISTAT);
> > > > >  	}
> > > > > +out:
> > > > >  	spin_unlock(&bp->lock);
> > > > >  	return IRQ_RETVAL(handled);
> > > > >  }
> > > > 
> > > > I did try this patch on an affected kernel, but I didn't notice any big
> > > > difference. Perhaps the kernel is a bit less slow during the test, but
> > > > it's hard to tell.
> > > 
> > > Ok, but anyway. I think this is a bug and needs to be fixed this way.
> > > Gary?
> > > 
> > 
> > Thanks Michael.
> > No, I don't think this is a bug and it does not need to be fixed.
> 
> Are you sure? I'm not so sure, because
> 1) On bcm43xx the reverse engineers told us that the card
>    returns 0xffffffff for no-irq-pending. Since b44 and bcm43xx
>    are very similar in IRQ and DMA I just thought it would
>    be the case there, too. Just guessing.
The b44 interrupt status reg returns a value of 0 if no interrupts are
pending. The b44 uses a mask to determine which bits (events) can
generate device interrupts on the system. If the masked interrupt status
register bits are not asserted, then the b44 will return to the system
with handled = 0. 
So, I think the way the b44 interrupt code is written should be ok and
not a bug. 
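
The masking behaviour described in this paragraph can be modelled in a few lines of C. This is a user-space sketch of the logic as described here, not a copy of b44.c; the function and parameter names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: `istat` is the raw interrupt status, `imask` the
 * set of events allowed to raise an interrupt. Stores the pending,
 * unmasked events in *events and returns 1 if there is anything to
 * handle, 0 otherwise (the "handled = 0" case).
 */
static int masked_irq_pending(uint32_t istat, uint32_t imask,
                              uint32_t *events)
{
    uint32_t pending = istat & imask;

    assert(events != NULL);
    *events = pending;
    return pending != 0;
}
```

Under this model a status of 0 is always reported as not handled. An all-ones read, on the other hand, passes the check whenever the mask is nonzero, which is the case the patches proposed elsewhere in this thread test for explicitly.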


> 2) PCMCIA cards usually return all-ones if you try to read a
>    register of a card that's been removed. So it's good
>    practice to check for this and bail out early in the IRQ
>    path. Do PCMCIA cards (PC-card, not necessarily a real
>    16bit PCMCIA card) for b44 exist?

I do not know of any pccard application of the b44. As far as I know
b44s live on motherboards and in the wireless soc.

Thanks,
Gary



Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Stephen Hemminger
I am busy bisecting the real cause. Unfortunately, oprofile doesn't work
on the laptop, and build time sucks...

This is how I think the IRQ should work:

--- a/drivers/net/b44.c 2007-05-29 09:47:53.0 -0700
+++ b/drivers/net/b44.c 2007-05-29 09:49:50.0 -0700
@@ -908,9 +908,11 @@ static irqreturn_t b44_interrupt(int irq
 	u32 istat, imask;
 	int handled = 0;
 
-	spin_lock(&bp->lock);
-
 	istat = br32(bp, B44_ISTAT);
+	if (istat == 0 || istat == ~0)
+		return IRQ_NONE;
+
+	spin_lock(&bp->lock);
 	imask = br32(bp, B44_IMASK);
 
 	/* The interrupt mask register controls which interrupt bits


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Michael Buesch
On Tuesday 29 May 2007 16:14:35 Gary Zambrano wrote:
> On Mon, 2007-05-28 at 16:55 +0200, Michael Buesch wrote:
> > On Monday 28 May 2007 16:12:12 Maximilian Engelhardt wrote:
> > > On Monday 28 May 2007, Michael Buesch wrote:
> > > > Can you also test the following patch?
> > > > I think there's a bug in b44 that it doesn't properly discard
> > > > shared IRQs, so it might possibly generate a NAPI storm, dunno.
> > > > Worth a try.
> > > >
> > > > Index: linux-2.6.22-rc3/drivers/net/b44.c
> > > > ===
> > > > --- linux-2.6.22-rc3.orig/drivers/net/b44.c	2007-05-27 23:01:44.0 +0200
> > > > +++ linux-2.6.22-rc3/drivers/net/b44.c	2007-05-28 12:48:27.0 +0200
> > > > @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
> > > >  	spin_lock(&bp->lock);
> > > >
> > > >  	istat = br32(bp, B44_ISTAT);
> > > > +	if (istat == 0xffffffff)
> > > > +		goto out; /* Shared IRQ not for us */
> > > >  	imask = br32(bp, B44_IMASK);
> > > >
> > > >  	/* The interrupt mask register controls which interrupt bits
> > > > @@ -942,6 +944,7 @@ irq_ack:
> > > >  		bw32(bp, B44_ISTAT, istat);
> > > >  		br32(bp, B44_ISTAT);
> > > >  	}
> > > > +out:
> > > >  	spin_unlock(&bp->lock);
> > > >  	return IRQ_RETVAL(handled);
> > > >  }
> > > 
> > > I did try this patch on an affected kernel, but I didn't notice any big
> > > difference. Perhaps the kernel is a bit less slow during the test, but
> > > it's hard to tell.
> > 
> > Ok, but anyway. I think this is a bug and needs to be fixed this way. Gary?
> > 
> 
> Thanks Michael.
> No, I don't think this is a bug and it does not need to be fixed.

Are you sure? I'm not so sure, because
1) On bcm43xx the reverse engineers told us that the card
   returns 0xffffffff for no-irq-pending. Since b44 and bcm43xx
   are very similar in IRQ and DMA I just thought it would
   be the case there, too. Just guessing.
2) PCMCIA cards usually return all-ones if you try to read a
   register of a card that's been removed. So it's good
   practice to check for this and bail out early in the IRQ
   path. Do PCMCIA cards (PC-card, not necessarily a real
   16bit PCMCIA card) for b44 exist?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > > Timer, but the high ping problem is still there.
> > >
> > > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > > "feature" in a different way than rc2-mm1 does.
> >
> > I think the bug in 2.6.21/22-rc3 is a different one than the one in
> > 2.6.22-rc2-mm1, but that's also only a wild guess :)
> >
> > I'll explain this a bit:
> > 2.6.21/22-rc3 contains the same b44 driver that has been in the stock
> > kernels for some time. With this driver and High Resolution Timer turned
> > on I get problems using iperf. The problem is that the system becomes
> > really slow and unresponsive.  Michael Buesch thought this could be an
> > IRQ storm which sounds logical to me. This bug never happened to me
> > before I started the iperf test.
>
> Can you please apply
>
> http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch
>
> on top of rc3 and check, whether it has any effect on your problem.
>
The patch didn't change anything.

> > The other issue happens only with 2.6.22-rc2-mm1 which includes the b44
> > ssb spilt. It's independed wether High Resolution Timer is turned on or
> > off I always get very varying and high ping times. The iperf-test doesn't
> > show the problems from 2.6.21/22-rc3.
>
> Neither with nor without highres ?

Yes, it doesn't matter if highres is turned on or off. iperf never showed the 
problem from 2.6.21/22-rc3.

Maxi


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Maximilian Engelhardt
On Tuesday 29 May 2007, Gary Zambrano wrote:
> On Mon, 2007-05-28 at 13:55 -0700, Maximilian Engelhardt wrote:
> > On Monday 28 May 2007, Thomas Gleixner wrote:
> > > On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try
> > > > > the following combinations on the kernel command line:
> > > > >
> > > > > 1) highres=off nohz=off (should be the same as your working config)
> > > > > 2) highres=off
> > > > > 3) nohz=off
> > > >
> > > > I tested this with my 2.6.22-rc3 kernel, here are the results:
> > > >
> > > > without any special boot parameters: problem does appear
> > > > highres=off nohz=off: problem does not appear
> > > > highres=off: problem does not appear
> > > > nohz=off: problem does appear
> > >
> > > Is there any other strange behavior of the high res enabled kernel than
> > > the b44 problem ?
> >
> > I didn't notice anything.
> >
> > > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > > Timer, but the high ping problem is still there.
> > >
> > > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > > "feature" in a different way than rc2-mm1 does.
> >
> > I think the bug in 2.6.21/22-rc3 is a different one that the one in
> > 2.6.22-rc2-mm1, but that's also only a wild guess :)
> >
> > I'll explain this a bit:
> > In 2.6.21/22-rc3 is the same b44 driver that has been in the stock
> > kernels for some time. With this driver and High Resolution Timer turned
> > on I get problems using iperf. The problems are that the systems becomes
> > really slow and unresponsive.  Michael Buesch thought this could be an
> > IRQ storm which sounds logical to me. This bug did never happen to me
> > before I startet the iperf test.
>
> Can you please check to see if you notice anything out of the ordinary
> using netperf in place of iperf in your high res timer on/off testbed?

OK, here are the results. I also had a look at the kernel CPU usage.
'good' means that the kernel responsiveness during the test was as I would 
expect and I didn't notice any problems.

highres enabled:

netperf: 80%sy 15%si (good)
iperf: not really measurable (bad, problem described above)

highres disabled:

netperf: 80%sy 15%si (good)
iperf:  5%sy 30%hi 15%si (good)


For the tests I ran the following commands:
netperf -l 60 192.168.1.1
iperf -c 192.168.1.1 -r -t 60

I also tried to run iperf without any additional arguments (iperf -c 
192.168.1.1) on the problematic kernel, but the result is the same as with 
the command above.

Maxi


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano
On Mon, 2007-05-28 at 16:55 +0200, Michael Buesch wrote:
> On Monday 28 May 2007 16:12:12 Maximilian Engelhardt wrote:
> > On Monday 28 May 2007, Michael Buesch wrote:
> > > Can you also test the following patch?
> > > I think there's a bug in b44 that it doesn't properly discard
> > > shared IRQs, so it might possibly generate a NAPI storm, dunno.
> > > Worth a try.
> > >
> > > Index: linux-2.6.22-rc3/drivers/net/b44.c
> > > ===
> > > --- linux-2.6.22-rc3.orig/drivers/net/b44.c   2007-05-27 23:01:44.0 +0200
> > > +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-28 12:48:27.0 +0200
> > > @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
> > >   spin_lock(&bp->lock);
> > >
> > >   istat = br32(bp, B44_ISTAT);
> > > + if (istat == 0xffffffff)
> > > + goto out; /* Shared IRQ not for us */
> > >   imask = br32(bp, B44_IMASK);
> > >
> > >   /* The interrupt mask register controls which interrupt bits
> > > @@ -942,6 +944,7 @@ irq_ack:
> > >   bw32(bp, B44_ISTAT, istat);
> > >   br32(bp, B44_ISTAT);
> > >   }
> > > +out:
> > >   spin_unlock(&bp->lock);
> > >   return IRQ_RETVAL(handled);
> > >  }
> > 
> > I did try this patch on an affected kernel, but I didn't notice any big 
> > difference. Perhaps the kernel is a bit less slow during the test, but It's 
> > hard to tell.
> 
> Ok, but anyway. I think this is a bug and needs to be fixed this way. Gary?
> 

Thanks Michael.
No, I don't think this is a bug and it does not need to be fixed.
Thanks,
Gary


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano
On Mon, 2007-05-28 at 13:55 -0700, Maximilian Engelhardt wrote:
> On Monday 28 May 2007, Thomas Gleixner wrote:
> > On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > > following combinations on the kernel command line:
> > > >
> > > > 1) highres=off nohz=off (should be the same as your working config)
> > > > 2) highres=off
> > > > 3) nohz=off
> > >
> > > I tested this with my 2.6.22-rc3 kernel, here are the results:
> > >
> > > without any special boot parameters: problem does appear
> > > highres=off nohz=off: problem does not appear
> > > highres=off: problem does not appear
> > > nohz=off: problem does appear
> >
> > Is there any other strange behavior of the high res enabled kernel than
> > the b44 problem ?
> 
> I didn't notice anything.
> 
> >
> > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > Timer, but the high ping problem is still there.
> >
> > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > "feature" in a different way than rc2-mm1 does.
> 
> I think the bug in 2.6.21/22-rc3 is a different one that the one in 
> 2.6.22-rc2-mm1, but that's also only a wild guess :)
> 
> I'll explain this a bit:
> In 2.6.21/22-rc3 is the same b44 driver that has been in the stock kernels 
> for some time. With this driver and High Resolution Timer turned on I get 
> problems using iperf. The problems are that the systems becomes really slow 
> and unresponsive.  Michael Buesch thought this could be an IRQ storm which 
> sounds logical to me. This bug did never happen to me before I startet the 
> iperf test.


Can you please check to see if you notice anything out of the ordinary
using netperf in place of iperf in your high res timer on/off testbed?

Thanks,
Gary


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Stephen Hemminger
I am busy bisecting the real cause. Unfortunately, oprofile doesn't work
on the laptop, and build time sucks...

This is how I think the IRQ should work:

--- a/drivers/net/b44.c 2007-05-29 09:47:53.0 -0700
+++ b/drivers/net/b44.c 2007-05-29 09:49:50.0 -0700
@@ -908,9 +908,11 @@ static irqreturn_t b44_interrupt(int irq
 	u32 istat, imask;
 	int handled = 0;
 
-	spin_lock(&bp->lock);
-
 	istat = br32(bp, B44_ISTAT);
+	if (istat == 0 || istat == ~0)
+		return IRQ_NONE;
+
+	spin_lock(&bp->lock);
 	imask = br32(bp, B44_IMASK);
 
 	/* The interrupt mask register controls which interrupt bits


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano
On Tue, 2007-05-29 at 22:45 +0200, Michael Buesch wrote:
> On Tuesday 29 May 2007 16:14:35 Gary Zambrano wrote:
> > On Mon, 2007-05-28 at 16:55 +0200, Michael Buesch wrote:
> > > On Monday 28 May 2007 16:12:12 Maximilian Engelhardt wrote:
> > > > On Monday 28 May 2007, Michael Buesch wrote:
> > > > > Can you also test the following patch?
> > > > > I think there's a bug in b44 that it doesn't properly discard
> > > > > shared IRQs, so it might possibly generate a NAPI storm, dunno.
> > > > > Worth a try.
> > > > >
> > > > > Index: linux-2.6.22-rc3/drivers/net/b44.c
> > > > > ===
> > > > > --- linux-2.6.22-rc3.orig/drivers/net/b44.c   2007-05-27 23:01:44.0 +0200
> > > > > +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-28 12:48:27.0 +0200
> > > > > @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
> > > > >   spin_lock(&bp->lock);
> > > > >
> > > > >   istat = br32(bp, B44_ISTAT);
> > > > > + if (istat == 0xffffffff)
> > > > > + goto out; /* Shared IRQ not for us */
> > > > >   imask = br32(bp, B44_IMASK);
> > > > >
> > > > >   /* The interrupt mask register controls which interrupt bits
> > > > > @@ -942,6 +944,7 @@ irq_ack:
> > > > >   bw32(bp, B44_ISTAT, istat);
> > > > >   br32(bp, B44_ISTAT);
> > > > >   }
> > > > > +out:
> > > > >   spin_unlock(&bp->lock);
> > > > >   return IRQ_RETVAL(handled);
> > > > >  }
> > > > >
> > > > I did try this patch on an affected kernel, but I didn't notice any big
> > > > difference. Perhaps the kernel is a bit less slow during the test, but
> > > > It's hard to tell.
> > > >
> > > Ok, but anyway. I think this is a bug and needs to be fixed this way.
> > > Gary?
> > >
> >
> > Thanks Michael.
> > No, I don't think this is a bug and it does not need to be fixed.
>
> Are you sure? I'm not so sure, because
> 1) On bcm43xx the reverse engineers told us that the card
>    returns 0xffffffff for no-irq-pending. Since b44 and bcm43xx
>    are very similar in IRQ and DMA I just thought it would
>    be the case there, too. Just guessing.

The b44 interrupt status reg returns a value of 0 if no interrupts are
pending. The b44 uses a mask to determine which bits (events) can
generate device interrupts on the system. If the masked interrupt status
register bits are not asserted, then the b44 will return to the system
with handled = 0. 
So, I think the way the b44 interrupt code is written should be ok and
not a bug. 
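
As an illustration of the masking described above, here is a minimal user-space sketch. It assumes two plain 32-bit values in place of the real register reads; struct fake_regs and irq_is_ours() are hypothetical names for this example and are not part of b44.c.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two registers. The real driver reads
 * them with br32(bp, B44_ISTAT) and br32(bp, B44_IMASK). */
struct fake_regs {
	uint32_t istat; /* raw interrupt status bits */
	uint32_t imask; /* bits that are allowed to raise an interrupt */
};

/* Returns 1 if an enabled event is pending, 0 otherwise.
 * This mirrors the "handled" flag described above. */
static int irq_is_ours(const struct fake_regs *regs)
{
	uint32_t pending = regs->istat & regs->imask;

	return pending != 0;
}
```

Note that an all-ones status read passes this check whenever the mask is non-zero, which is the case the proposed all-ones test is meant to catch.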


> 2) PCMCIA cards usually return all-ones if you try to read a
>    register of a card that's been removed. So it's good
>    practice to check for this and bail out early in the IRQ
>    path. Do PCMCIA cards (PC-card, not necessarily a real
>    16bit PCMCIA card) for b44 exist?

I do not know of any pccard application of the b44. As far as I know
b44s live on motherboards and in the wireless soc.

Thanks,
Gary


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Jeff Garzik

Gary Zambrano wrote:

> The b44 interrupt status reg returns a value of 0 if no interrupts are
> pending. The b44 uses a mask to determine which bits (events) can
> generate device interrupts on the system. If the masked interrupt status
> register bits are not asserted, then the b44 will return to the system
> with handled = 0.
> So, I think the way the b44 interrupt code is written should be ok and
> not a bug.



This is normal.

We check for 0xffffffff because that is often how a fault is indicated, 
when the memory location is read during or immediately after hotplug (or 
if the PCI bus is truly faulty).  So for most hardware, you see

	tmp = read(irq status)
	if (!tmp)
		return irq-none /* no irq events raised */
	if (tmp == 0xffffffff)
		return irq-none /* hot unplug or h/w fault */

and the method that determines no interrupt handling is needed.
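
The two checks above can be sketched as compilable C. This is only an illustration: the enum values are local stand-ins for the kernel's IRQ_NONE and IRQ_HANDLED, classify_status() is a hypothetical helper, and the all-ones value is assumed to be a 32-bit 0xffffffff.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's irqreturn_t values, so that the
 * sketch compiles in user space. */
enum irq_result { SKETCH_IRQ_NONE = 0, SKETCH_IRQ_HANDLED = 1 };

/* Decide from the raw status value alone whether there is work to do. */
static enum irq_result classify_status(uint32_t status)
{
	if (status == 0)
		return SKETCH_IRQ_NONE;    /* no irq events raised */
	if (status == 0xffffffffu)
		return SKETCH_IRQ_NONE;    /* hot unplug or h/w fault */
	return SKETCH_IRQ_HANDLED;         /* real events to process */
}
```

A real handler would do this before taking any lock, so that a shared interrupt line costs only one register read when the event belongs to another device.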

Regards,

Jeff


Re: b44: regression in 2.6.22 (resend)

2007-05-29 Thread Gary Zambrano
On Tue, 2007-05-29 at 18:39 -0400, Jeff Garzik wrote:

> We check for 0xffffffff because that is often how a fault is indicated,
> when the memory location is read during or immediately after hotplug (or
> if the PCI bus is truly faulty).  So for most hardware, you see
>
> 	tmp = read(irq status)
> 	if (!tmp)
> 		return irq-none /* no irq events raised */
> 	if (tmp == 0xffffffff)
> 		return irq-none /* hot unplug or h/w fault */
>
> and the method that determines no interrupt handling is needed.
>

I guess you are right, but then shouldn't the driver be checking for
faults in other parts of the code too? What if a fault/hotplug occurs
immediately after an interrupt, but before a tx?
Thanks,
Gary


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Uwe Bugla
On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > Timer, but the high ping problem is still there.
> >
> > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > "feature" in a different way than rc2-mm1 does.
>
> I think the bug in 2.6.21/22-rc3 is a different one that the one in
> 2.6.22-rc2-mm1, but that's also only a wild guess :)
>
> I'll explain this a bit:
> In 2.6.21/22-rc3 is the same b44 driver that has been in the stock kernels for
> some time. With this driver and High Resolution Timer turned on I get
> problems using iperf. The problems are that the systems becomes really slow
> and unresponsive. Michael Buesch thought this could be an IRQ storm which
> sounds logical to me. This bug did never happen to me before I startet the
> iperf test.

Can you please apply

http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch

on top of rc3 and check, whether it has any effect on your problem.

> The other issue happens only with 2.6.22-rc2-mm1 which includes the b44 ssb
> spilt. It's independed wether High Resolution Timer is turned on or off I
> always get very varying and high ping times. The iperf-test doesn't show the
> problems from 2.6.21/22-rc3.

Neither with nor without highres ?

tglx

Neither with nor without Gleixner ?

Neither with nor without Buesch ?

Neither with nor without Miller ?

Neither with nor without Kyle ?

Neither with nor without  ?

Neither with nor without would-like-to-spare time hackers ?

Neither with nor without profile neurotic would-like-to-copyright owners ?



Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Thomas Gleixner
On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > Timer, but the high ping problem is still there.
> >
> > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > "feature" in a different way than rc2-mm1 does.
> 
> I think the bug in 2.6.21/22-rc3 is a different one that the one in 
> 2.6.22-rc2-mm1, but that's also only a wild guess :)
> 
> I'll explain this a bit:
> In 2.6.21/22-rc3 is the same b44 driver that has been in the stock kernels 
> for some time. With this driver and High Resolution Timer turned on I get 
> problems using iperf. The problems are that the systems becomes really slow 
> and unresponsive.  Michael Buesch thought this could be an IRQ storm which 
> sounds logical to me. This bug did never happen to me before I startet the 
> iperf test.

Can you please apply

http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch

on top of rc3 and check, whether it has any effect on your problem.

> The other issue happens only with 2.6.22-rc2-mm1 which includes the b44 ssb 
> spilt. It's independed wether High Resolution Timer is turned on or off I 
> always get very varying and high ping times. The iperf-test doesn't show the 
> problems from 2.6.21/22-rc3.

Neither with nor without highres ?

tglx


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?

I didn't notice anything.

>
> > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > Timer, but the high ping problem is still there.
>
> Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> "feature" in a different way than rc2-mm1 does.

I think the bug in 2.6.21/22-rc3 is a different one than the one in 
2.6.22-rc2-mm1, but that's also only a wild guess :)

I'll explain this a bit:
2.6.21/22-rc3 contains the same b44 driver that has been in the stock kernels 
for some time. With this driver and High Resolution Timer turned on I get 
problems using iperf: the system becomes really slow and unresponsive. 
Michael Buesch thought this could be an IRQ storm, which sounds logical to 
me. This bug never happened to me before I started the iperf test.

The other issue happens only with 2.6.22-rc2-mm1, which includes the b44 ssb 
split. Regardless of whether High Resolution Timer is turned on or off, I 
always get highly variable and high ping times. The iperf test doesn't show 
the problems from 2.6.21/22-rc3.

Maxi


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Thomas Gleixner
On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > following combinations on the kernel command line:
> >
> > 1) highres=off nohz=off (should be the same as your working config)
> > 2) highres=off
> > 3) nohz=off
> 
> I tested this with my 2.6.22-rc3 kernel, here are the results:
> 
> without any special boot parameters: problem does appear
> highres=off nohz=off: problem does not appear
> highres=off: problem does not appear
> nohz=off: problem does appear

Is there any other strange behavior of the high res enabled kernel than
the b44 problem ?

> I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer, 
> but the high ping problem is still there.

Hmm, that's mysterious. Wild guess is that highres exposes the hidden
"feature" in a different way than rc2-mm1 does.

tglx


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
> > > The -oldconfig1 is the kernel that had no problems and the other shows
> > > the b44 problem. So if High Resolution Timer Support is disabled
> > > everything works fine and if I enable it the problems do appear again.
> > >
> > > I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling
> > > High Resolution Timer Support will also solve the problem there.
> > >
> > > The older kernels I tried also work perfectly fine and they didn't have
> > > the High Resolution Timer Support yet.
> >
> > So, that's interesting, indeed.
> > Any idea what's going on, someone? Thomas?
>
> Not off the top of my head.
>
> Maximilian, does the kernel work otherwise (I mean aside of the b44
> driver) ?
>
> Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> following combinations on the kernel command line:
>
> 1) highres=off nohz=off (should be the same as your working config)
> 2) highres=off
> 3) nohz=off

I tested this with my 2.6.22-rc3 kernel, here are the results:

without any special boot parameters: problem does appear
highres=off nohz=off: problem does not appear
highres=off: problem does not appear
nohz=off: problem does appear
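
The results above can be cross-checked mechanically. The sketch below encodes the four runs, assuming that a parameter left off the command line means the feature stayed enabled; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct boot_result {
	bool highres_on; /* high resolution timers active */
	bool nohz_on;    /* dynamic ticks active */
	bool problem;    /* b44/iperf slowdown observed */
};

/* The four runs reported above. */
static const struct boot_result runs[] = {
	{ true,  true,  true  }, /* no special boot parameters */
	{ false, false, false }, /* highres=off nohz=off */
	{ false, true,  false }, /* highres=off */
	{ true,  false, true  }, /* nohz=off */
};

/* True if the problem shows up exactly when highres is on. */
static bool highres_explains(const struct boot_result *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].problem != r[i].highres_on)
			return false;
	return true;
}

/* True if the problem shows up exactly when nohz is on. */
static bool nohz_explains(const struct boot_result *r, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].problem != r[i].nohz_on)
			return false;
	return true;
}
```

With this table, highres_explains() holds for all four runs while nohz_explains() does not, which matches the reading that the problem follows highres and not nohz.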

I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer, 
but the high ping problem is still there.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Michael Buesch
On Monday 28 May 2007 17:32:51 Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
> > > The -oldconfig1 is the kernel that had no problems and the other shows
> > > the b44 problem. So if High Resolution Timer Support is disabled
> > > everything works fine and if I enable it the problems do appear again.
> > > 
> > > I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling
> > > High Resolution Timer Support will also solve the problem there.
> > > 
> > > The older kernels I tried also work perfectly fine and they didn't have
> > > the High Resolution Timer Support yet.
> > 
> > So, that's interesting, indeed.
> > Any idea what's going on, someone? Thomas?
> 
> Not off the top of my head.
> 
> Maximilian, does the kernel work otherwise (I mean aside of the b44
> driver) ? 
> 
> Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> following combinations on the kernel command line:
> 
> 1) highres=off nohz=off (should be the same as your working config)
> 2) highres=off
> 3) nohz=off
> 
> Michael, is anything in the b44 driver timer driven ?

NAPI perhaps. I don't know. Steve may know.

The only timer in b44 is to update stats every second. I doubt
that this can cause such an issue. It's not involved in transmission.

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Thomas Gleixner
On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
> > The -oldconfig1 is the kernel that had no problems and the other shows
> > the b44 problem. So if High Resolution Timer Support is disabled
> > everything works fine and if I enable it the problems do appear again.
> > 
> > I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling High 
> > Resolution Timer Support will also solve the problem there.
> > 
> > The older kernels I tried also work perfectly fine and they didn't have the 
> > High Resolution Timer Support yet.
> 
> So, that's interesting, indeed.
> Any idea what's going on, someone? Thomas?

Not off the top of my head.

Maximilian, does the kernel work otherwise (I mean aside of the b44
driver) ? 

Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
following combinations on the kernel command line:

1) highres=off nohz=off (should be the same as your working config)
2) highres=off
3) nohz=off

Michael, is anything in the b44 driver timer driven ?

Thanks,

tglx




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Michael Buesch
On Monday 28 May 2007 16:09:46 Maximilian Engelhardt wrote:
> On Monday 28 May 2007, Michael Buesch wrote:
> > Can you give 2.6.16 a try? The diff is not that big and we might
> > be able to find out what broke if you find out 2.6.16 works.
> > You can also try later kernels like .17, .18, .19 to further
> > reduce the patch. (You could also git-bisect, if you have the time).
> >
> I did some testing and compiled some kernels and here are the results:
> 
> I was able to find out what causes the problems for me.  I did build two 
> 2.6.21.3 kernels, and one does work fine and the other doesn't.
> 
> This is a diff of the kernel configs I used:
> 
> --- /usr/src/linux-2.6.21.3-oldconfig1/.config  2007-05-28 13:41:15.0 +0200
> +++ /usr/src/linux-2.6.21.3/.config 2007-05-28 14:46:08.0 +0200
> @@ -1,7 +1,7 @@
>  #
>  # Automatically generated make config: don't edit
>  # Linux kernel version: 2.6.21.3
> -# Mon May 28 13:41:15 2007
> +# Mon May 28 14:46:09 2007
>  #
>  CONFIG_X86_32=y
>  CONFIG_GENERIC_TIME=y
> @@ -32,7 +32,7 @@
>  #
>  # General setup
>  #
> -CONFIG_LOCALVERSION="-oldconfig1"
> +CONFIG_LOCALVERSION=""
>  CONFIG_LOCALVERSION_AUTO=y
>  CONFIG_SWAP=y
>  CONFIG_SYSVIPC=y
> @@ -108,9 +108,9 @@
>  #
>  # Processor type and features
>  #
> -# CONFIG_TICK_ONESHOT is not set
> +CONFIG_TICK_ONESHOT=y
>  # CONFIG_NO_HZ is not set
> -# CONFIG_HIGH_RES_TIMERS is not set
> +CONFIG_HIGH_RES_TIMERS=y
>  # CONFIG_SMP is not set
>  CONFIG_X86_PC=y
>  # CONFIG_X86_ELAN is not set
> 
> The -oldconfig1 is the kernel that had no problems and the other shows
> the b44 problem. So if High Resolution Timer Support is disabled
> everything works fine and if I enable it the problems do appear again.
> 
> I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling High 
> Resolution Timer Support will also solve the problem there.
> 
> The older kernels I tried also work perfectly fine and they didn't have the 
> High Resolution Timer Support yet.

So, that's interesting, indeed.
Any idea what's going on, someone? Thomas?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Michael Buesch
On Monday 28 May 2007 16:12:12 Maximilian Engelhardt wrote:
> On Monday 28 May 2007, Michael Buesch wrote:
> > Can you also test the following patch?
> > I think there's a bug in b44: it doesn't properly discard
> > shared IRQs, so it might possibly generate a NAPI storm, dunno.
> > Worth a try.
> >
> > Index: linux-2.6.22-rc3/drivers/net/b44.c
> > ===
> > --- linux-2.6.22-rc3.orig/drivers/net/b44.c 2007-05-27 23:01:44.0 +0200
> > +++ linux-2.6.22-rc3/drivers/net/b44.c 2007-05-28 12:48:27.0 +0200
> > @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
> > spin_lock(&bp->lock);
> >
> > istat = br32(bp, B44_ISTAT);
> > +   if (istat == 0x)
> > +   goto out; /* Shared IRQ not for us */
> > imask = br32(bp, B44_IMASK);
> >
> > /* The interrupt mask register controls which interrupt bits
> > @@ -942,6 +944,7 @@ irq_ack:
> > bw32(bp, B44_ISTAT, istat);
> > br32(bp, B44_ISTAT);
> > }
> > +out:
> > spin_unlock(>lock);
> > return IRQ_RETVAL(handled);
> >  }
> 
> I did try this patch on an affected kernel, but I didn't notice any big 
> difference. Perhaps the kernel is a bit less slow during the test, but it's 
> hard to tell.

Ok, but anyway. I think this is a bug and needs to be fixed this way. Gary?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Can you also test the following patch?
> I think there's a bug in b44: it doesn't properly discard
> shared IRQs, so it might possibly generate a NAPI storm, dunno.
> Worth a try.
>
> Index: linux-2.6.22-rc3/drivers/net/b44.c
> ===
> --- linux-2.6.22-rc3.orig/drivers/net/b44.c   2007-05-27 23:01:44.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-28 12:48:27.0 +0200
> @@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
>   spin_lock(&bp->lock);
>
>   istat = br32(bp, B44_ISTAT);
> + if (istat == 0x)
> + goto out; /* Shared IRQ not for us */
>   imask = br32(bp, B44_IMASK);
>
>   /* The interrupt mask register controls which interrupt bits
> @@ -942,6 +944,7 @@ irq_ack:
>   bw32(bp, B44_ISTAT, istat);
>   br32(bp, B44_ISTAT);
>   }
> +out:
>   spin_unlock(&bp->lock);
>   return IRQ_RETVAL(handled);
>  }

I did try this patch on an affected kernel, but I didn't notice any big 
difference. Perhaps the kernel is a bit less slow during the test, but it's 
hard to tell.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Can you give 2.6.16 a try? The diff is not that big and we might
> be able to find out what broke if you find out 2.6.16 works.
> You can also try later kernels like .17, .18, .19 to further
> reduce the patch. (You could also git-bisect, if you have the time).
>
I did some testing and compiled some kernels and here are the results:

I was able to find out what causes the problems for me.  I did build two 
2.6.21.3 kernels, and one does work fine and the other doesn't.

This is a diff of the kernel configs I used:

--- /usr/src/linux-2.6.21.3-oldconfig1/.config  2007-05-28 13:41:15.0 +0200
+++ /usr/src/linux-2.6.21.3/.config 2007-05-28 14:46:08.0 +0200
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.21.3
-# Mon May 28 13:41:15 2007
+# Mon May 28 14:46:09 2007
 #
 CONFIG_X86_32=y
 CONFIG_GENERIC_TIME=y
@@ -32,7 +32,7 @@
 #
 # General setup
 #
-CONFIG_LOCALVERSION="-oldconfig1"
+CONFIG_LOCALVERSION=""
 CONFIG_LOCALVERSION_AUTO=y
 CONFIG_SWAP=y
 CONFIG_SYSVIPC=y
@@ -108,9 +108,9 @@
 #
 # Processor type and features
 #
-# CONFIG_TICK_ONESHOT is not set
+CONFIG_TICK_ONESHOT=y
 # CONFIG_NO_HZ is not set
-# CONFIG_HIGH_RES_TIMERS is not set
+CONFIG_HIGH_RES_TIMERS=y
 # CONFIG_SMP is not set
 CONFIG_X86_PC=y
 # CONFIG_X86_ELAN is not set

The -oldconfig1 is the kernel that had no problems and the other shows the b44 
problem. So if High Resolution Timer Support is disabled everything works 
fine and if I enable it the problems do appear again.

I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling High 
Resolution Timer Support will also solve the problem there.

The older kernels I tried also work perfectly fine and they didn't have the 
High Resolution Timer Support yet.
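
To double-check at run time whether a given kernel is really running
with high resolution timers active, one option is to ask for the
resolution of CLOCK_MONOTONIC. As far as I know, Linux reports 1 ns
there when high resolution mode is active and the tick period
otherwise (for example 4000000 ns at HZ=250). The following is only a
sketch under that assumption and was not part of the original report;
the helper names are made up.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
long long timespec_to_ns(const struct timespec *ts)
{
	assert(ts != NULL);
	return (long long)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Resolution of CLOCK_MONOTONIC in nanoseconds, or -1 on error. */
long long monotonic_resolution_ns(void)
{
	struct timespec res;

	if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
		return -1;
	return timespec_to_ns(&res);
}

/* Print the resolution and a guess at the timer mode.
 * The "1 ns means highres" rule is an assumption, see above. */
void print_timer_mode(void)
{
	long long ns = monotonic_resolution_ns();

	if (ns < 0) {
		perror("clock_getres");
		return;
	}
	printf("CLOCK_MONOTONIC resolution: %lld ns (%s)\n", ns,
	       ns == 1 ? "looks like high resolution mode"
		       : "looks tick based");
}
```

Running print_timer_mode() on both the -oldconfig1 kernel and the
other one should show whether the config change took effect.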

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Michael Buesch
Can you also test the following patch?
I think there's a bug in b44: it doesn't properly discard
shared IRQs, so it might possibly generate a NAPI storm, dunno.
Worth a try.

Index: linux-2.6.22-rc3/drivers/net/b44.c
===
--- linux-2.6.22-rc3.orig/drivers/net/b44.c 2007-05-27 23:01:44.0 +0200
+++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-28 12:48:27.0 +0200
@@ -911,6 +911,8 @@ static irqreturn_t b44_interrupt(int irq
spin_lock(&bp->lock);
 
istat = br32(bp, B44_ISTAT);
+   if (istat == 0x)
+   goto out; /* Shared IRQ not for us */
imask = br32(bp, B44_IMASK);
 
/* The interrupt mask register controls which interrupt bits
@@ -942,6 +944,7 @@ irq_ack:
bw32(bp, B44_ISTAT, istat);
br32(bp, B44_ISTAT);
}
+out:
spin_unlock(&bp->lock);
return IRQ_RETVAL(handled);
 }


-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Michael Buesch
Can you give 2.6.16 a try? The diff is not that big and we might
be able to find out what broke if you find out 2.6.16 works.
You can also try later kernels like .17, .18, .19 to further
reduce the patch. (You could also git-bisect, if you have the time).
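
The idea behind bisecting, whether by hand over releases or with
git-bisect over commits, is a plain binary search for the first bad
version. A minimal sketch, assuming the versions are ordered oldest to
newest, the oldest is known good and the newest known bad; first_bad
and bad_from are made-up names, and in practice the predicate would be
"build, boot and run the ping test".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Predicate: is the version at `index` bad? */
typedef bool (*is_bad_fn)(size_t index, void *ctx);

/* Versions 0..n-1 are ordered oldest to newest; version 0 is known
 * good and version n-1 is known bad. Returns the index of the first
 * bad version using about log2(n) tests. */
size_t first_bad(size_t n, is_bad_fn is_bad, void *ctx)
{
	size_t good = 0;
	size_t bad;

	assert(n >= 2);
	bad = n - 1;
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}

/* Example predicate for illustration: everything from the index
 * stored in *ctx onwards counts as bad. */
bool bad_from(size_t index, void *ctx)
{
	return index >= *(size_t *)ctx;
}
```

With seven versions (2.6.16 up to 2.6.22-rc3) this needs at most three
test boots instead of five.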

git-diff v2.6.16..v2.6.22-rc3 drivers/net/b44.c

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index c3267e4..879a2ff 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -2,6 +2,7 @@
  *
  * Copyright (C) 2002 David S. Miller ([EMAIL PROTECTED])
  * Fixed by Pekka Pietikainen ([EMAIL PROTECTED])
+ * Copyright (C) 2006 Broadcom Corporation.
  *
  * Distribute under GPL.
  */
@@ -28,8 +29,8 @@ #include "b44.h"
 
 #define DRV_MODULE_NAME "b44"
 #define PFX DRV_MODULE_NAME": "
-#define DRV_MODULE_VERSION "0.97"
-#define DRV_MODULE_RELDATE "Nov 30, 2005"
+#define DRV_MODULE_VERSION "1.01"
+#define DRV_MODULE_RELDATE "Jun 16, 2006"
 
 #define B44_DEF_MSG_ENABLE   \
(NETIF_MSG_DRV  | \
@@ -58,7 +59,6 @@ #define B44_TX_RING_SIZE  512
 #define B44_DEF_TX_RING_PENDING (B44_TX_RING_SIZE - 1)
 #define B44_TX_RING_BYTES  (sizeof(struct dma_desc) * \
 B44_TX_RING_SIZE)
-#define B44_DMA_MASK 0x3fff
 
 #define TX_RING_GAP(BP) \
(B44_TX_RING_SIZE - (BP)->tx_pending)
@@ -74,6 +74,15 @@ #define TX_PKT_BUF_SZ(B44_MAX_MTU + ET
 /* minimum number of free TX descriptors required to wake up TX process */
 #define B44_TX_WAKEUP_THRESH   (B44_TX_RING_SIZE / 4)
 
+/* b44 internal pattern match filter info */
+#define B44_PATTERN_BASE   0x400
+#define B44_PATTERN_SIZE   0x80
+#define B44_PMASK_BASE 0x600
+#define B44_PMASK_SIZE 0x10
+#define B44_MAX_PATTERNS   16
+#define B44_ETHIPV6UDP_HLEN 62
+#define B44_ETHIPV4UDP_HLEN 42
+
 static char version[] __devinitdata =
DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
 
@@ -100,7 +109,12 @@ MODULE_DEVICE_TABLE(pci, b44_pci_tbl);
 
 static void b44_halt(struct b44 *);
 static void b44_init_rings(struct b44 *);
-static void b44_init_hw(struct b44 *);
+
+#define B44_FULL_RESET 1
+#define B44_FULL_RESET_SKIP_PHY 2
+#define B44_PARTIAL_RESET  3
+
+static void b44_init_hw(struct b44 *, int);
 
 static int dma_desc_align_mask;
 static int dma_desc_sync_size;
@@ -136,7 +150,7 @@ static inline unsigned long br32(const s
return readl(bp->regs + reg);
 }
 
-static inline void bw32(const struct b44 *bp, 
+static inline void bw32(const struct b44 *bp,
unsigned long reg, unsigned long val)
 {
writel(val, bp->regs + reg);
@@ -286,13 +300,13 @@ static void __b44_cam_write(struct b44 *
val |= ((u32) data[4]) <<  8;
val |= ((u32) data[5]) <<  0;
bw32(bp, B44_CAM_DATA_LO, val);
-   val = (CAM_DATA_HI_VALID | 
+   val = (CAM_DATA_HI_VALID |
   (((u32) data[0]) << 8) |
   (((u32) data[1]) << 0));
bw32(bp, B44_CAM_DATA_HI, val);
bw32(bp, B44_CAM_CTRL, (CAM_CTRL_WRITE |
(index << CAM_CTRL_INDEX_SHIFT)));
-   b44_wait_bit(bp, B44_CAM_CTRL, CAM_CTRL_BUSY, 100, 1);  
+   b44_wait_bit(bp, B44_CAM_CTRL, CAM_CTRL_BUSY, 100, 1);
 }
 
 static inline void __b44_disable_ints(struct b44 *bp)
@@ -410,25 +424,18 @@ static void __b44_set_flow_ctrl(struct b
 
 static void b44_set_flow_ctrl(struct b44 *bp, u32 local, u32 remote)
 {
-   u32 pause_enab = bp->flags & (B44_FLAG_TX_PAUSE |
- B44_FLAG_RX_PAUSE);
+   u32 pause_enab = 0;
 
-   if (local & ADVERTISE_PAUSE_CAP) {
-   if (local & ADVERTISE_PAUSE_ASYM) {
-   if (remote & LPA_PAUSE_CAP)
-   pause_enab |= (B44_FLAG_TX_PAUSE |
-  B44_FLAG_RX_PAUSE);
-   else if (remote & LPA_PAUSE_ASYM)
-   pause_enab |= B44_FLAG_RX_PAUSE;
-   } else {
-   if (remote & LPA_PAUSE_CAP)
-   pause_enab |= (B44_FLAG_TX_PAUSE |
-  B44_FLAG_RX_PAUSE);
-   }
-   } else if (local & ADVERTISE_PAUSE_ASYM) {
-   if ((remote & LPA_PAUSE_CAP) &&
-   (remote & LPA_PAUSE_ASYM))
-   pause_enab |= B44_FLAG_TX_PAUSE;
+   /* The driver supports only rx pause by default because
+  the b44 mac tx pause mechanism generates excessive
+  pause frames.
+  Use ethtool to turn on b44 tx pause if necessary.
+*/
+   if ((local & ADVERTISE_PAUSE_CAP) &&
+   (local & ADVERTISE_PAUSE_ASYM)){
+   if ((remote & LPA_PAUSE_ASYM) &&
+   !(remote & LPA_PAUSE_CAP))
+   pause_enab |= B44_FLAG_RX_PAUSE;
}
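
The new pause logic in the hunk above is easier to follow as a
standalone function. The sketch below copies the condition from the
"+" lines; the ADVERTISE_* and LPA_* values are the ones from
<linux/mii.h>, while SKETCH_FLAG_RX_PAUSE and resolve_pause are
placeholders invented for this example (the real flag lives in b44.h).

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in <linux/mii.h>. */
#define ADVERTISE_PAUSE_CAP	0x0400
#define ADVERTISE_PAUSE_ASYM	0x0800
#define LPA_PAUSE_CAP		0x0400
#define LPA_PAUSE_ASYM		0x0800

/* Placeholder value, not the driver's real flag. */
#define SKETCH_FLAG_RX_PAUSE	0x1u

/* Rx pause is enabled only if we advertise both symmetric and
 * asymmetric pause and the link partner advertises asymmetric
 * pause without symmetric pause. Tx pause is never enabled here. */
uint32_t resolve_pause(uint32_t local, uint32_t remote)
{
	uint32_t pause_enab = 0;

	if ((local & ADVERTISE_PAUSE_CAP) &&
	    (local & ADVERTISE_PAUSE_ASYM)) {
		if ((remote & LPA_PAUSE_ASYM) &&
		    !(remote & LPA_PAUSE_CAP))
			pause_enab |= SKETCH_FLAG_RX_PAUSE;
	}
	return pause_enab;
}
```

Note that unlike the removed code, this starts from 0 rather than from
the flags already set, so a previously negotiated tx pause is dropped.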
 

Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
 On Mon, 2007-05-28 at 17:14 +0200, Michael Buesch wrote:
   The -oldconfig1 is the kernel that had no problems and the other shows
   the b44 problem. So if High Resolution Timer Support is disabled
   everything works fine and if I enable it the problems do appear again.
  
   I didn't test this on my 2.6.22-rc3 kernel yet, but I guess disabling
   High Resolution Timer Support will also solve the problem there.
  
   The older kernels I tried also work perfectly fine and they didn't have
   the High Resolution Timer Support yet.
 
  So, that's interesting, indeed.
  Any idea what's going on, someone? Thomas?

 Not off the top of my head.

 Maximilian, does the kernel work otherwise (I mean aside of the b44
 driver) ?

 Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
 following combinations on the kernel command line:

 1) highres=off nohz=off (should be the same as your working config)
 2) highres=off
 3) nohz=off

I tested this with my 2.6.22-rc3 kernel, here are the results:

without any special boot parameters: problem does appear
highres=off nohz=off: problem does not appear
highres=off: problem does not appear
nohz=off: problem does appear
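
Read as a truth table, these four results follow the highres setting and not the nohz setting. A minimal sketch in Python of that reading (the dictionary layout and the helper name tracks_problem are made up for illustration; only the four outcomes come from the tests above):

```python
# Outcomes reported above for 2.6.22-rc3; True means the problem appeared.
# Keys are (highres_enabled, nohz_enabled) as selected on the kernel command line.
results = {
    (True, True): True,     # no special boot parameters
    (False, False): False,  # highres=off nohz=off
    (False, True): False,   # highres=off
    (True, False): True,    # nohz=off
}

def tracks_problem(option_index):
    """True if the problem appears exactly when the given option is enabled."""
    return all(config[option_index] == problem
               for config, problem in results.items())

print("problem follows highres:", tracks_problem(0))
print("problem follows nohz:", tracks_problem(1))
```

This only shows which switch the symptom correlates with; it says nothing about where the actual bug lives.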

I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer, 
but the high ping problem is still there.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Thomas Gleixner
On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > following combinations on the kernel command line:
> >
> > 1) highres=off nohz=off (should be the same as your working config)
> > 2) highres=off
> > 3) nohz=off
>
> I tested this with my 2.6.22-rc3 kernel, here are the results:
>
> without any special boot parameters: problem does appear
> highres=off nohz=off: problem does not appear
> highres=off: problem does not appear
> nohz=off: problem does appear

Is there any other strange behavior of the high res enabled kernel than
the b44 problem ?

> I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution Timer,
> but the high ping problem is still there.

Hmm, that's mysterious. Wild guess is that highres exposes the hidden
feature in a different way than rc2-mm1 does.

tglx




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Maximilian Engelhardt
On Monday 28 May 2007, Thomas Gleixner wrote:
> On Mon, 2007-05-28 at 19:44 +0200, Maximilian Engelhardt wrote:
> > > Can you please keep CONFIG_HIGH_RES_TIMERS and CONFIG_NOHZ and try the
> > > following combinations on the kernel command line:
> > >
> > > 1) highres=off nohz=off (should be the same as your working config)
> > > 2) highres=off
> > > 3) nohz=off
> >
> > I tested this with my 2.6.22-rc3 kernel, here are the results:
> >
> > without any special boot parameters: problem does appear
> > highres=off nohz=off: problem does not appear
> > highres=off: problem does not appear
> > nohz=off: problem does appear
>
> Is there any other strange behavior of the high res enabled kernel than
> the b44 problem ?

I didn't notice anything.


> > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > Timer, but the high ping problem is still there.
>
> Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> feature in a different way than rc2-mm1 does.

I think the bug in 2.6.21/22-rc3 is a different one than the one in
2.6.22-rc2-mm1, but that's also only a wild guess :)

I'll explain this a bit:
2.6.21/22-rc3 has the same b44 driver that has been in the stock kernels for
some time. With this driver and High Resolution Timer turned on I get
problems using iperf. The problem is that the system becomes really slow
and unresponsive.  Michael Buesch thought this could be an IRQ storm, which
sounds logical to me. This bug never happened to me before I started the
iperf test.

The other issue happens only with 2.6.22-rc2-mm1, which includes the b44 ssb
split. Regardless of whether High Resolution Timer is turned on or off, I
always get highly variable and high ping times. The iperf test doesn't show the
problems from 2.6.21/22-rc3.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Thomas Gleixner
On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > Timer, but the high ping problem is still there.
> >
> > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > feature in a different way than rc2-mm1 does.
>
> I think the bug in 2.6.21/22-rc3 is a different one than the one in
> 2.6.22-rc2-mm1, but that's also only a wild guess :)
>
> I'll explain this a bit:
> 2.6.21/22-rc3 has the same b44 driver that has been in the stock kernels
> for some time. With this driver and High Resolution Timer turned on I get
> problems using iperf. The problem is that the system becomes really slow
> and unresponsive.  Michael Buesch thought this could be an IRQ storm, which
> sounds logical to me. This bug never happened to me before I started the
> iperf test.

Can you please apply

http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch

on top of rc3 and check, whether it has any effect on your problem.

> The other issue happens only with 2.6.22-rc2-mm1, which includes the b44 ssb
> split. Regardless of whether High Resolution Timer is turned on or off, I
> always get highly variable and high ping times. The iperf test doesn't show the
> problems from 2.6.21/22-rc3.

Neither with nor without highres ?

tglx




Re: b44: regression in 2.6.22 (resend)

2007-05-28 Thread Uwe Bugla
On Mon, 2007-05-28 at 22:55 +0200, Maximilian Engelhardt wrote:
> > > I additionally built my 2.6.22-rc2-mm1 kernel without High Resolution
> > > Timer, but the high ping problem is still there.
> >
> > Hmm, that's mysterious. Wild guess is that highres exposes the hidden
> > feature in a different way than rc2-mm1 does.
>
> I think the bug in 2.6.21/22-rc3 is a different one than the one in
> 2.6.22-rc2-mm1, but that's also only a wild guess :)
>
> I'll explain this a bit:
> 2.6.21/22-rc3 has the same b44 driver that has been in the stock kernels for
> some time. With this driver and High Resolution Timer turned on I get
> problems using iperf. The problem is that the system becomes really slow
> and unresponsive. Michael Buesch thought this could be an IRQ storm, which
> sounds logical to me. This bug never happened to me before I started the
> iperf test.

Can you please apply

http://www.tglx.de/projects/hrtimers/2.6.22-rc3/patch-2.6.22-rc3-hrt1.patch

on top of rc3 and check, whether it has any effect on your problem.

> The other issue happens only with 2.6.22-rc2-mm1, which includes the b44 ssb
> split. Regardless of whether High Resolution Timer is turned on or off, I
> always get highly variable and high ping times. The iperf test doesn't show the
> problems from 2.6.21/22-rc3.

Neither with nor without highres ?

tglx

Neither with nor without Gleixner ?

Neither with nor without Buesch ?

Neither with nor without Miller ?

Neither with nor without Kyle ?

Neither with nor without  ?

Neither with nor without would-like-to-spare time hackers ?

Neither with nor without profile neurotic would-like-to-copyright owners ?



Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Ok, another question: On which CPU architecture are you?

[EMAIL PROTECTED]:~$ uname -m
i686

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
Ok, another question: On which CPU architecture are you?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> >
> > 2.6.22-rc3:
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
>
> Can you re-check to make sure it's not just some test-jitter?
>
2.6.21.1:

[  5] local 192.168.1.2 port 54423 connected with 192.168.1.1 port 5001
[  5]  0.0-60.3 sec  3.06 MBytes   426 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 41053
[  4]  0.0-163.0 sec   130 MBytes  6.67 Mbits/sec


2.6.22-rc3:

[  5] local 192.168.1.2 port 46002 connected with 192.168.1.1 port 5001
[  5]  0.0-61.5 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 44379
[  4]  0.0-93.8 sec  30.6 MBytes  2.74 Mbits/sec

For TX the iperf server reports the same values as the client (all values above
are from the client), but for RX they are different:

2.6.21.1: (iperf server log):

[  5] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 54423
[  5]  0.0-60.5 sec  3.06 MBytes   425 Kbits/sec
[  5] local 192.168.1.1 port 41053 connected with 192.168.1.2 port 5001
[  5]  0.0-63.1 sec   130 MBytes  17.2 Mbits/sec


2.6.22-rc3 (iperf server log):

[  4] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 46002
[  4]  0.0-61.6 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.1 port 44379 connected with 192.168.1.2 port 5001
[  4]  0.0-63.3 sec  30.6 MBytes  4.06 Mbits/sec

I have no idea how iperf works internally and what can cause such different
results here.
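
One thing the logs themselves show: for the RX direction on 2.6.21.1 the client and the server report the same 130 MBytes, but over different intervals (163.0 sec on the client, 63.1 sec on the server), and that alone accounts for the different rates. A rough check in Python, assuming that iperf's MBytes means 2^20 bytes and Mbits means 10^6 bits (this is consistent with the other lines above; the helper name is made up):

```python
def iperf_rate_mbits(mbytes, seconds):
    """Bandwidth in Mbits/sec for a transfer given in MBytes (2**20 bytes)."""
    bits = mbytes * 1024 * 1024 * 8
    return bits / seconds / 1e6

# Same 130 MBytes, measured over the client's and the server's interval.
client = iperf_rate_mbits(130, 163.0)
server = iperf_rate_mbits(130, 63.1)
print(f"client view: {client:.2f} Mbits/sec")
print(f"server view: {server:.2f} Mbits/sec")
```

The results land close to the logged 6.67 and 17.2 Mbits/sec; the small difference comes from the transfer size being printed rounded. Why the client-side interval is so much longer is not answered by this arithmetic.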

>
> --- linux-2.6.21.1/drivers/net/b44.c  2007-05-27 22:58:01.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-27 23:01:44.0 +0200
> @@ -825,12 +825,11 @@
>  			if (copy_skb == NULL)
>  				goto drop_it_no_recycle;
>
> -			copy_skb->dev = bp->dev;
>  			skb_reserve(copy_skb, 2);
>  			skb_put(copy_skb, len);
>  			/* DMA sync done above, copy just the actual packet */
> -			memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
> -
> +			skb_copy_from_linear_data_offset(skb, bp->rx_offset,
> +							 copy_skb->data, len);
>  			skb = copy_skb;
>  		}
>  		skb->ip_summed = CHECKSUM_NONE;
> @@ -1007,7 +1006,8 @@
>  				goto err_out;
>  		}
>
> -		memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
> +		skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
> +					  skb->len);
>  		dev_kfree_skb_any(skb);
>  		skb = bounce_skb;
>  	}






Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> > On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > > 2.6.21.1:
> > > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> > >
> > > 2.6.22-rc3:
> > > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
> >
> > This is the diff between these two kernels.
> > I'm not sure why you see a much better TX throughput here.
> >
> > Can you re-check to make sure it's not just some test-jitter?
>
> Oh, eh, and what I forgot to ask:
> Do you know an old kernel that works perfectly well for you,
> so I can look at a diff between this one and anything >=2.6.21.1.

I don't know any. Most older kernels worked fine for me, but I never used
iperf there, so I guess if the bug is there too I simply didn't trigger it.
If you think it's useful I could go back and try different kernels, but that
would take some time.
Except for the iperf bug, 2.6.21.1 and 2.6.22-rc3 work fine.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> > When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in
> > normal use I didn't notice any problems. It did work fine as I would
> > expect it. I think the wget and ping tests here are as they should be.
> >
> > With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The
> > ping test does confirm this, because here response times are very high.
> > As far as I can remember the wget download rate was a bit slower than
> > 2.6.21.1 or 2.6.22-rc3 till it stalled.
> > I would expect it to be something like the other two kernels. The two
> > problems I see are the high ping times and the fact that the card stopped
> > working.
> >
> > I don't know why the iperf results are so different from my personal
> > experience. I guess the reason I get such bad results with 2.6.21.1 and
> > 2.6.22-rc3 is that iperf does something that causes the system to be
> > extremely slow, thus degrading performance. This could be a bug
> > somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has
> > unintentionally been fixed by the ssb switch, but that's only a rough guess.
>
> Ok. I guess (Yes I do :D) that there is an IRQ storm or something like
> that, because you say that your system is becoming very slow and
> unresponsive. It sounds like an IRQ is not ACKed correctly and so keeps
> triggering and stalling the system. I'll take a look at a few diffs...
> Do you see significant differences in the "hi" and/or "si" times in top?
> Do you see a significant difference in the /proc/interrupts count? For
> example, that the kernel that works worse generates 10 times the IRQ count
> for the same amount of data?

OK, here are the results:

Using 2.6.22-rc3 I get lots of hi during TX and lots of hi and si during RX.
Using 2.6.22-rc3-mm1, hi and si are significantly lower.
It's difficult to give absolute numbers, because top refreshes very slowly, but
with 2.6.22-rc3 hi is about 30% during TX and RX and si is 0% during TX and
50% during RX. With 2.6.22-rc3-mm1 hi is 0% during TX and 0.3% during
RX and si is 10% during TX and 0% during RX.

When I do the same test on both kernels I get about 10 times (yes, it's really
about ten times like in your example) more interrupts with 2.6.22-rc3 than
with 2.6.22-rc3-mm1.
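
A comparison like this can be made reproducible by taking a snapshot of /proc/interrupts before and after each run and subtracting. A minimal sketch in Python; the helper names and the two sample snapshots are made up for illustration and are not measurements from this thread:

```python
def parse_interrupts(text):
    """Map each IRQ label to its count summed over all CPU columns."""
    lines = text.strip().splitlines()
    num_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Rows such as ERR: may carry fewer columns than there are CPUs.
        counts = [int(f) for f in fields[:num_cpus] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals

def irq_delta(before, after):
    """Interrupts raised per IRQ line between two snapshots."""
    a, b = parse_interrupts(before), parse_interrupts(after)
    return {irq: b[irq] - a.get(irq, 0) for irq in b}

# Hypothetical snapshots, NOT data from the test machine.
SAMPLE_BEFORE = """\
           CPU0
  0:     100000    XT-PIC  timer
 11:       5000    XT-PIC  eth0
"""
SAMPLE_AFTER = """\
           CPU0
  0:     160000    XT-PIC  timer
 11:     605000    XT-PIC  eth0
"""

delta = irq_delta(SAMPLE_BEFORE, SAMPLE_AFTER)
print(delta)
```

On a real system the two strings would come from reading /proc/interrupts right before and right after the iperf run; dividing the delta for the b44 line by the number of bytes transferred gives a figure that can be compared between kernels.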

An additional thing I noticed is that it's not the BCM4401 card that stops
working but my e100 card. If I take the e100 card down and up again the
connection is working again, so the BCM4401 doesn't have a "stops working"
bug for me.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> 
> > 2.6.22-rc3:
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
> 
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
> 
> Can you re-check to make sure it's not just some test-jitter?

Oh, eh, and what I forgot to ask:
Do you know an old kernel that works perfectly well for you,
so I can look at a diff between this one and anything >=2.6.21.1.

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> 2.6.21.1:
> [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec

> 2.6.22-rc3:
> [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

This is the diff between these two kernels.
I'm not sure why you see a much better TX throughput here.

Can you re-check to make sure it's not just some test-jitter?


--- linux-2.6.21.1/drivers/net/b44.c  2007-05-27 22:58:01.0 +0200
+++ linux-2.6.22-rc3/drivers/net/b44.c  2007-05-27 23:01:44.0 +0200
@@ -825,12 +825,11 @@
 			if (copy_skb == NULL)
 				goto drop_it_no_recycle;
 
-			copy_skb->dev = bp->dev;
 			skb_reserve(copy_skb, 2);
 			skb_put(copy_skb, len);
 			/* DMA sync done above, copy just the actual packet */
-			memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
-
+			skb_copy_from_linear_data_offset(skb, bp->rx_offset,
+							 copy_skb->data, len);
 			skb = copy_skb;
 		}
 		skb->ip_summed = CHECKSUM_NONE;
@@ -1007,7 +1006,8 @@
 				goto err_out;
 		}
 
-		memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
+		skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
+					  skb->len);
 		dev_kfree_skb_any(skb);
 		skb = bounce_skb;
 	}

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in normal 
> use I didn't notice any problems. It did work fine as I would expect it.
> I think the wget and ping tests here are as they should be.
> 
> With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping 
> test does confirm this, because here response times are very high. As far as 
> I can remember the wget download rate was a bit slower than 2.6.21.1 or 
> 2.6.22-rc3 till it stalled.
> I would expect it to be something like the other two kernels. The two problems 
> I see are the high ping times and the fact that the card stopped working.
> 
> I don't know why the iperf results are so different from my personal 
> experience. I guess the reason I get such bad results with 2.6.21.1 and 
> 2.6.22-rc3 is that iperf does something that causes the system to be 
> extremely slow, thus degrading performance. This could be a bug somewhere 
> in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been
> fixed by the ssb switch, but that's only a rough guess.

Ok. I guess (Yes I do :D) that there is an IRQ storm or something like that,
because you say that your system is becoming very slow and unresponsive.
It sounds like an IRQ is not ACKed correctly and so keeps triggering and
stalling the system. I'll take a look at a few diffs...
Do you see significant differences in the "hi" and/or "si" times in top?
Do you see a significant difference in the /proc/interrupts count? For
example, that the kernel that works worse generates 10 times the IRQ count
for the same amount of data?
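
The "hi" and "si" figures in top correspond to the irq and softirq counters on the aggregate cpu line of /proc/stat, so they can also be sampled directly when top refreshes too slowly to read. A minimal sketch in Python; the helper names are made up, only the first seven counters are used, and the two sample lines are invented numbers, not readings from any machine in this thread:

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")

def cpu_times(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = (int(v) for v in parts[1:1 + len(FIELDS)])
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate cpu line found")

def hi_si_percent(before, after):
    """Share of elapsed CPU time spent in hard (hi) and soft (si) interrupts."""
    a, b = cpu_times(before), cpu_times(after)
    delta = {k: b[k] - a[k] for k in FIELDS}
    total = sum(delta.values())
    return 100.0 * delta["irq"] / total, 100.0 * delta["softirq"] / total

# Hypothetical samples taken some seconds apart (made-up values).
BEFORE = "cpu  100 0 100 1000 0 10 10"
AFTER = "cpu  200 0 200 1000 0 310 510"

hi, si = hi_si_percent(BEFORE, AFTER)
print(f"hi = {hi:.1f}%  si = {si:.1f}%")
```

Any later columns on the cpu line (steal and so on) are ignored here, so the percentages are relative to the seven counters listed in FIELDS.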

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.22-rc3:
> >
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> Why do we have two different measurements here? Is one TX and one RX?
> Which one?

Yes, the first is TX (BCM4401 --> e100) and the second is RX. Both are TCP
connections. I think iperf displays the IP addresses wrongly in the second
connection, but that's another issue.

>
> > koala:~# ping -c10 192.168.1.1
> > PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> > 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> > 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> > 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> > 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> > 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> > 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> > 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> > 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
> >
> > --- 192.168.1.1 ping statistics ---
> > 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> > rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
> >
> > System responsiveness was the same as with 2.6.21.1.
> >
> > wget got 11.23M/s, again same as 2.6.21.1.
> >
> >
> > 2.6.22-rc2-mm1:
> >
> > [  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> > [  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec
>
> So with -mm (with ssb) you actually get better performance
> than with plain 2.6.22-rc3?
>
> Can you elaborate a bit more about what you get and what you expect
> on which kernel?

When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in normal 
use I didn't notice any problems. It did work fine as I would expect it.
I think the wget and ping tests here are as they should be.

With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping 
test does confirm this, because here response times are very high. As far as 
I can remember the wget download rate was a bit slower than 2.6.21.1 or 
2.6.22-rc3 till it stalled.
I would expect it to be something like the other two kernels. The two problems 
I see are the high ping times and the fact that the card stopped working.

I don't know why the iperf results are so different from my personal 
experience. I guess the reason I get such bad results with 2.6.21.1 and 
2.6.22-rc3 is that iperf does something that causes the system to be 
extremely slow, thus degrading performance. This could be a bug somewhere 
in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been
fixed by the ssb switch, but that's only a rough guess.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> 2.6.22-rc3:
> 
> [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

Why do we have two different measurements here? Is one TX and one RX?
Which one?

> koala:~# ping -c10 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
> 
> --- 192.168.1.1 ping statistics ---
> 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
> 
> System responsiveness was the same as with 2.6.21.1.
> 
> wget got 11.23M/s, again same as 2.6.21.1.
> 
> 
> 2.6.22-rc2-mm1:
> 
> [  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> [  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec

So with -mm (with ssb) you actually get better performance
than with plain 2.6.22-rc3?

Can you elaborate a bit more about what you get and what you expect
on which kernel?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
I am sending this again because my first mail accidentally had HTML code in it
and might have been filtered by some people.

On Saturday 26 May 2007, Michael Buesch wrote:
> On Saturday 26 May 2007 02:24:31 Stephen Hemminger wrote:
> > Something is broken with the b44 driver in 2.6.22-rc1 or later. Now
> > bisecting. The performance (with iperf) for receiving is normally 94Mbits
> > or more. But something happened that dropped performance to less than
> > 1Mbit, probably corrupted packets.
> >
> > There is nothing obvious in the commit log for drivers/net/b44.c, so it
> > probably is something more general.
> >
> >
> > Looking at the code in b44_rx(), I see a couple of unrelated bugs:
> > 1. In the small packet case it recycles the skb before copying data
> > out... Not good if new data arrives overwriting existing data.
> >
> > 2. Macros like RX_PKT_BUF_SZ that depend on local variables are evil!!
>
> Very interesting!
> 2.6.22 doesn't include ssb, does it?
>
> Adding CCs to make reporters of another bugreport aware of this.

I did some more tests with my BCM4401 and different kernels, here are the 
results:

2.6.21.1:

iperf:
[  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
[  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
[  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.241 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.231 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.237 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8998ms
rtt min/avg/max/mdev = 0.215/0.230/0.241/0.018 ms

The system was unusable while I ran the iperf test. When I moved the mouse it
only jumped around, and anything like starting programs or switching the
desktop only happened after iperf had finished its test.

I did an HTTP download with wget and got 11.23M/s.


2.6.22-rc3:

[  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
[  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
[  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8997ms
rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms

System responsiveness was the same as with 2.6.21.1.

wget got 11.23M/s, again same as 2.6.21.1.


2.6.22-rc2-mm1:

[  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
[  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
[  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=52.7 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=86.7 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=8.22 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=32.1 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=56.0 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=80.0 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.52 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=25.4 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=49.3 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.526/43.207/86.700/26.369 ms
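
The spread in these ten replies is easier to compare with the sub-millisecond results of the other two kernels as summary statistics. A small sketch in Python that recomputes them from the printed times; it assumes that mdev is the population standard deviation, sqrt(E[x^2] - E[x]^2), and the helper name is made up. The values come out slightly different from ping's own summary line because the per-reply times are printed rounded:

```python
import math

# Round-trip times in ms, copied from the 2.6.22-rc2-mm1 ping output above.
rtts = [39.8, 52.7, 86.7, 8.22, 32.1, 56.0, 80.0, 1.52, 25.4, 49.3]

def rtt_summary(samples):
    """Return (min, avg, max, mdev) in the order ping prints them."""
    avg = sum(samples) / len(samples)
    mean_square = sum(x * x for x in samples) / len(samples)
    # Assumed formula: mdev = sqrt(E[x^2] - E[x]^2).
    mdev = math.sqrt(mean_square - avg * avg)
    return min(samples), avg, max(samples), mdev

lo, avg, hi, mdev = rtt_summary(rtts)
print(f"rtt min/avg/max/mdev = {lo:.3f}/{avg:.3f}/{hi:.3f}/{mdev:.3f} ms")
```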

Here system responsiveness was OK while I ran iperf; I didn't notice anything
anomalous.

When I tried the wget HTTP download the transfer stalled, and from this point
on I couldn't send or receive anything on my BCM4401 anymore. Taken the 

Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
I am sending this again because my first mail accidentally had HTML code in it
and might have been filtered by some people.

On Saturday 26 May 2007, Michael Buesch wrote:
> On Saturday 26 May 2007 02:24:31 Stephen Hemminger wrote:
> > Something is broken with the b44 driver in 2.6.22-rc1 or later. Now
> > bisecting. The performance (with iperf) for receiving is normally 94Mbits
> > or more. But something happened that dropped performance to less than
> > 1Mbit, probably corrupted packets.
> >
> > There is nothing obvious in the commit log for drivers/net/b44.c, so it
> > probably is something more general.
> >
> >
> > Looking at the code in b44_rx(), I see a couple of unrelated bugs:
> > 1. In the small packet case it recycles the skb before copying data
> > out... Not good if new data arrives overwriting existing data.
> >
> > 2. Macros like RX_PKT_BUF_SZ that depend on local variables are evil!!
>
> Very interesting!
> 2.6.22 doesn't include ssb, does it?
>
> Adding CCs to make reporters of another bugreport aware of this.

I did some more tests with my BCM4401 and different kernels, here are the 
results:

2.6.21.1:

iperf:
[  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
[  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
[  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.241 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.231 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.229 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.237 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8998ms
rtt min/avg/max/mdev = 0.215/0.230/0.241/0.018 ms

The system was unusable while I ran the iperf test: when I moved the mouse it
only jumped around, and anything like starting programs or switching the
desktop only happened after iperf had finished its test.

I did an http download with wget and got 11.23M/s.


2.6.22-rc3:

[  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
[  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
[  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 8997ms
rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms

System responsiveness was the same as with 2.6.21.1.

wget got 11.23M/s, again same as 2.6.21.1.


2.6.22-rc2-mm1:

[  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
[  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
[  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec

koala:~# ping -c10 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=39.8 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=52.7 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=86.7 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=8.22 ms
64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=32.1 ms
64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=56.0 ms
64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=80.0 ms
64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=1.52 ms
64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=25.4 ms
64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=49.3 ms

--- 192.168.1.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9000ms
rtt min/avg/max/mdev = 1.526/43.207/86.700/26.369 ms

Here system responsiveness was ok while I ran iperf; I didn't notice anything
anomalous.

When I tried the wget http download the transfer stalled, and from this point
on I couldn't send or receive anything on my BCM4401 anymore. Taking the
interface down and up again didn't

Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> 2.6.22-rc3:
>
> [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

Why do we have two different measurements here? Is one TX and one RX?
Which one?

> koala:~# ping -c10 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
>
> --- 192.168.1.1 ping statistics ---
> 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
>
> System responsiveness was the same as with 2.6.21.1.
>
> wget got 11.23M/s, again same as 2.6.21.1.
>
>
> 2.6.22-rc2-mm1:
>
> [  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> [  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec

So with -mm (with ssb) you actually get better performance
than with plain 2.6.22-rc3?

Can you elaborate a bit more about what you get and what you expect
on which kernel?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in normal
> use I didn't notice any problems. It did work fine as I would expect it.
> I think the wget and ping tests here are as they should be.
>
> With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping
> test does confirm this, because here response times are very high. As far as
> I can remember the wget download rate was a bit slower than 2.6.21.1 or
> 2.6.22-rc3 till it stalled.
> I would expect it to be something like the other two kernels. The two problems
> I see are the high ping times and the fact that the card stopped working.
>
> I don't know why the iperf results are so different from my personal
> experience. I guess the reason I get such bad results with 2.6.21.1 and
> 2.6.22-rc3 is that iperf does something that causes the system to be
> extremely slow, thus degrading performance. This could be a bug somewhere
> in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been
> fixed by the ssb switch, but that's only a rough guess.

Ok. I guess (Yes I do :D) that there is an IRQ storm or something like that,
because you say that your system is becoming very slow and unresponsive.
It sounds like an IRQ is not ACKed correctly and so keeps triggering and
stalling the system. I'll take a look at a few diffs...
Do you see significant differences in the hi and/or si times in top?
Do you see a significant difference in the /proc/interrupts count? For
example, does the kernel that works worse generate 10 times the IRQ count
for the same amount of data?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.22-rc3:
> >
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> Why do we have two different measurements here? Is one TX and one RX?
> Which one?

Yes, the first is TX (BCM4401 --> e100) and the second is RX. Both are TCP
connections. I think iperf displays the IP addresses wrongly for the second
connection, but that's another issue.


> > koala:~# ping -c10 192.168.1.1
> > PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> > 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.243 ms
> > 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.234 ms
> > 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.238 ms
> > 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.235 ms
> > 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=0.230 ms
> > 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=0.317 ms
> > 64 bytes from 192.168.1.1: icmp_seq=7 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=8 ttl=64 time=0.232 ms
> > 64 bytes from 192.168.1.1: icmp_seq=9 ttl=64 time=0.228 ms
> > 64 bytes from 192.168.1.1: icmp_seq=10 ttl=64 time=0.238 ms
> >
> > --- 192.168.1.1 ping statistics ---
> > 10 packets transmitted, 10 received, 0% packet loss, time 8997ms
> > rtt min/avg/max/mdev = 0.228/0.242/0.317/0.031 ms
> >
> > System responsiveness was the same as with 2.6.21.1.
> >
> > wget got 11.23M/s, again same as 2.6.21.1.
> >
> >
> > 2.6.22-rc2-mm1:
> >
> > [  5] local 192.168.1.2 port 42198 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.1 sec   402 MBytes  56.1 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 48598
> > [  4]  0.0-63.0 sec   177 MBytes  23.6 Mbits/sec
>
> So with -mm (with ssb) you actually get better performance
> than with plain 2.6.22-rc3?
>
> Can you elaborate a bit more about what you get and what you expect
> on which kernel?

When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in normal 
use I didn't notice any problems. It did work fine as I would expect it.
I think the wget and ping tests here are as they should be.

With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The ping 
test does confirm this, because here response times are very high. As far as 
I can remember the wget download rate was a bit slower than 2.6.21.1 or 
2.6.22-rc3 till it stalled.
I would expect it to be something like the other two kernels. The two problems
I see are the high ping times and the fact that the card stopped working.

I don't know why the iperf results are so different from my personal
experience. I guess the reason I get such bad results with 2.6.21.1 and
2.6.22-rc3 is that iperf does something that causes the system to be
extremely slow, thus degrading performance. This could be a bug somewhere
in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has unintentionally been
fixed by the ssb switch, but that's only a rough guess.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> 2.6.21.1:
> [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
>
> 2.6.22-rc3:
> [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec

This is the diff between these two kernels.
I'm not sure why you see a much better TX throughput here.

Can you re-check to make sure it's not just some test-jitter?


--- linux-2.6.21.1/drivers/net/b44.c	2007-05-27 22:58:01.0 +0200
+++ linux-2.6.22-rc3/drivers/net/b44.c	2007-05-27 23:01:44.0 +0200
@@ -825,12 +825,11 @@
 			if (copy_skb == NULL)
 				goto drop_it_no_recycle;
 
-			copy_skb->dev = bp->dev;
 			skb_reserve(copy_skb, 2);
 			skb_put(copy_skb, len);
 			/* DMA sync done above, copy just the actual packet */
-			memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
-
+			skb_copy_from_linear_data_offset(skb, bp->rx_offset,
+							 copy_skb->data, len);
 			skb = copy_skb;
 		}
 		skb->ip_summed = CHECKSUM_NONE;
@@ -1007,7 +1006,8 @@
 			goto err_out;
 		}
 
-		memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
+		skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
+					  skb->len);
 		dev_kfree_skb_any(skb);
 		skb = bounce_skb;
 	}

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> >
> > 2.6.22-rc3:
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
>
> Can you re-check to make sure it's not just some test-jitter?

Oh, eh, and what I forgot to ask:
Do you know an old kernel that works perfectly well for you,
so I can look at a diff between this one and anything >=2.6.21.1?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 22:36:39 Maximilian Engelhardt wrote:
> > When I ran 2.6.21.1 or 2.6.22-rc3 without any debugging tools just in
> > normal use I didn't notice any problems. It did work fine as I would
> > expect it. I think the wget and ping tests here are as they should be.
> >
> > With 2.6.22-rc2-mm1 I noticed that connections seem to be slower. The
> > ping test does confirm this, because here response times are very high.
> > As far as I can remember the wget download rate was a bit slower than
> > 2.6.21.1 or 2.6.22-rc3 till it stalled.
> > I would expect it to be something like the other two kernels. The two
> > problems I see are the high ping times and the fact that the card stopped
> > working.
> >
> > I don't know why the iperf results are so different from my personal
> > experience. I guess the reason I get such bad results with 2.6.21.1 and
> > 2.6.22-rc3 is that iperf does something that causes the system to be
> > extremely slow, thus degrading performance. This could be a bug
> > somewhere in the b44 driver of 2.6.21.1 and 2.6.22-rc3 that has
> > unintentionally been fixed by the ssb switch, but that's only a rough guess.
>
> Ok. I guess (Yes I do :D) that there is an IRQ storm or something like
> that, because you say that your system is becoming very slow and
> unresponsive. It sounds like an IRQ is not ACKed correctly and so keeps
> triggering and stalling the system. I'll take a look at a few diffs...
> Do you see significant differences in the hi and/or si times in top?
> Do you see a significant difference in the /proc/interrupts count? For
> example, does the kernel that works worse generate 10 times the IRQ count
> for the same amount of data?

ok, here are the results:

Using 2.6.22-rc3 I get lots of hi during TX and lots of hi and si during RX.
Using 2.6.22-rc3-mm1 hi and si are significantly lower.
It's difficult to give absolute numbers, because top refreshes very slowly, but
with 2.6.22-rc3 hi is about 30% during TX and RX and si is 0% during TX and
50% during RX. With 2.6.22-rc3-mm1 hi is 0% during TX and 0.3% during
RX and si is 10% during TX and 0% during RX.
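
As a rough illustration of how hi and si percentages could be measured without
depending on top's refresh rate, here is a minimal Python sketch. It assumes
the aggregate "cpu" line of /proc/stat lists its counters in the order user,
nice, system, idle, iowait, irq, softirq, steal; the function names are
hypothetical and not part of any tool used in this thread.

```python
def cpu_times(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def hi_si_percent(before, after):
    """Share of elapsed CPU time spent in hard IRQs (hi) and softirqs (si).

    Both arguments are the full text of /proc/stat, taken at two points in time.
    """
    first, second = cpu_times(before), cpu_times(after)
    # Assumed field order: user nice system idle iowait irq softirq steal.
    # Only the first eight fields are summed so that later guest counters,
    # which are already included in user time, are not counted twice.
    delta = [b - a for a, b in zip(first[:8], second[:8])]
    total = sum(delta)
    if total <= 0:
        raise ValueError("snapshots are identical or out of order")
    return 100.0 * delta[5] / total, 100.0 * delta[6] / total
```

Reading /proc/stat once before and once after an iperf run and passing both
texts to hi_si_percent() would give one averaged figure for the whole run.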

When I do the same test on both kernels I get about 10 times (yes, it's really 
about ten times like in your example) more interrupts with 2.6.22-rc3 than 
with 2.6.22-rc3-mm1.
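
A comparison like this can be made repeatable by diffing two snapshots of
/proc/interrupts. The following is only a sketch under assumptions: it expects
the usual layout (a header line naming the CPUs, then one line per IRQ with
per-CPU counts followed by a description), and the helper names are made up
for illustration.

```python
def irq_counts(text):
    """Map each IRQ label in /proc/interrupts text to its count summed over CPUs."""
    counts = {}
    for line in text.splitlines()[1:]:  # the first line is the CPU header
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        total = 0
        for token in rest.split():
            if not token.isdigit():
                break  # reached the controller/device description
            total += int(token)
        counts[label.strip()] = total
    return counts


def irq_delta(before, after):
    """Interrupts raised per IRQ between two /proc/interrupts snapshots."""
    old, new = irq_counts(before), irq_counts(after)
    return {irq: new[irq] - old.get(irq, 0) for irq in new}
```

Taking one snapshot before and one after the same iperf run on each kernel,
then comparing the delta for the network card's IRQ line, would show the ratio
directly.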

An additional thing I noticed is that it's not the BCM4401 card that stops
working but my e100 card. If I take the e100 card down and up again the
connection works again, so the BCM4401 doesn't have a "stops working"
bug for me.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 23:13:32 Michael Buesch wrote:
> > On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > > 2.6.21.1:
> > > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> > >
> > > 2.6.22-rc3:
> > > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
> >
> > This is the diff between these two kernels.
> > I'm not sure why you see a much better TX throughput here.
> >
> > Can you re-check to make sure it's not just some test-jitter?
>
> Oh, eh, and what I forgot to ask:
> Do you know an old kernel that works perfectly well for you,
> so I can look at a diff between this one and anything >=2.6.21.1?

I don't know of any. Most older kernels worked fine for me, but I never used
iperf there, so I guess if the bug is there too I simply didn't trigger it.
If you think it's useful I could go back and try different kernels, but that
would take some time.
Except for the iperf bug, 2.6.21.1 and 2.6.22-rc3 work fine.

Maxi




Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Sunday 27 May 2007, Michael Buesch wrote:
> On Sunday 27 May 2007 21:25:17 Maximilian Engelhardt wrote:
> > 2.6.21.1:
> > [  5] local 192.168.1.2 port 58414 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.6 sec  1.13 MBytes   157 Kbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 57837
> > [  4]  0.0-63.1 sec  2.82 MBytes   375 Kbits/sec
> >
> > 2.6.22-rc3:
> > [  5] local 192.168.1.2 port 46557 connected with 192.168.1.1 port 5001
> > [  5]  0.0-60.4 sec  58.9 MBytes  8.18 Mbits/sec
> > [  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 51633
> > [  4]  0.0-63.1 sec  7.27 MBytes   967 Kbits/sec
>
> This is the diff between these two kernels.
> I'm not sure why you see a much better TX throughput here.
>
> Can you re-check to make sure it's not just some test-jitter?

2.6.21.1:

[  5] local 192.168.1.2 port 54423 connected with 192.168.1.1 port 5001
[  5]  0.0-60.3 sec  3.06 MBytes   426 Kbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 41053
[  4]  0.0-163.0 sec   130 MBytes  6.67 Mbits/sec


2.6.22-rc3:

[  5] local 192.168.1.2 port 46002 connected with 192.168.1.1 port 5001
[  5]  0.0-61.5 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.2 port 5001 connected with 192.168.1.1 port 44379
[  4]  0.0-93.8 sec  30.6 MBytes  2.74 Mbits/sec

For TX the iperf server reports the same values as the client (all values
above are from the client), but for RX they are different:

2.6.21.1 (iperf server log):

[  5] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 54423
[  5]  0.0-60.5 sec  3.06 MBytes   425 Kbits/sec
[  5] local 192.168.1.1 port 41053 connected with 192.168.1.2 port 5001
[  5]  0.0-63.1 sec   130 MBytes  17.2 Mbits/sec


2.6.22-rc3 (iperf server log):

[  4] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 46002
[  4]  0.0-61.6 sec  84.0 MBytes  11.5 Mbits/sec
[  4] local 192.168.1.1 port 44379 connected with 192.168.1.2 port 5001
[  4]  0.0-63.3 sec  30.6 MBytes  4.06 Mbits/sec

I have no idea how iperf internally works and what can cause such different 
results here.
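
One possible reading of the numbers above: in each RX test both sides report
the same byte count but a different interval (163.0 vs 63.1 seconds on
2.6.21.1, 93.8 vs 63.3 seconds on 2.6.22-rc3), so the differing rates may just
be the same transfer divided by different elapsed times. The small Python
sketch below redoes that arithmetic. It assumes iperf counts 1 MByte as 2**20
bytes and 1 Mbit as 10**6 bits, which is an assumption on my part; the helper
name is made up.

```python
def mbits_per_sec(mbytes, seconds):
    """Average rate in Mbits/sec for a transfer of `mbytes` over `seconds`.

    Assumes 1 MByte = 2**20 bytes and 1 Mbit = 10**6 bits.
    """
    return mbytes * 2**20 * 8 / seconds / 1e6


# 2.6.21.1 RX: 130 MBytes as seen by the client (163.0 s) and the server (63.1 s)
client_2621 = mbits_per_sec(130, 163.0)
server_2621 = mbits_per_sec(130, 63.1)

# 2.6.22-rc3 RX: 30.6 MBytes as seen by the client (93.8 s) and the server (63.3 s)
client_2622 = mbits_per_sec(30.6, 93.8)
server_2622 = mbits_per_sec(30.6, 63.3)

print(f"2.6.21.1:   client {client_2621:.2f}, server {server_2621:.2f} Mbits/sec")
print(f"2.6.22-rc3: client {client_2622:.2f}, server {server_2622:.2f} Mbits/sec")
```

The computed rates land close to the reported 6.67/17.2 and 2.74/4.06
Mbits/sec; the small differences are consistent with the MBytes figures being
rounded. That would suggest the client simply took much longer to finish the
RX test than the server did, not that the two sides saw different data.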


> --- linux-2.6.21.1/drivers/net/b44.c	2007-05-27 22:58:01.0 +0200
> +++ linux-2.6.22-rc3/drivers/net/b44.c	2007-05-27 23:01:44.0 +0200
> @@ -825,12 +825,11 @@
>  			if (copy_skb == NULL)
>  				goto drop_it_no_recycle;
>  
> -			copy_skb->dev = bp->dev;
>  			skb_reserve(copy_skb, 2);
>  			skb_put(copy_skb, len);
>  			/* DMA sync done above, copy just the actual packet */
> -			memcpy(copy_skb->data, skb->data+bp->rx_offset, len);
> -
> +			skb_copy_from_linear_data_offset(skb, bp->rx_offset,
> +							 copy_skb->data, len);
>  			skb = copy_skb;
>  		}
>  		skb->ip_summed = CHECKSUM_NONE;
> @@ -1007,7 +1006,8 @@
>  			goto err_out;
>  		}
>  
> -		memcpy(skb_put(bounce_skb, len), skb->data, skb->len);
> +		skb_copy_from_linear_data(skb, skb_put(bounce_skb, len),
> +					  skb->len);
>  		dev_kfree_skb_any(skb);
>  		skb = bounce_skb;
>  	}






Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Michael Buesch
Ok, another question: On which CPU architecture are you?

-- 
Greetings Michael.


Re: b44: regression in 2.6.22 (resend)

2007-05-27 Thread Maximilian Engelhardt
On Monday 28 May 2007, Michael Buesch wrote:
> Ok, another question: On which CPU architecture are you?

[EMAIL PROTECTED]:~$ uname -m
i686

Maxi

