Re: [NET]: Mark Paul Moore as maintainer of labelled networking.

2007-08-27 Thread Joe Perches
On Tue, 2007-08-28 at 00:01 +, Linux Kernel Mailing List wrote:
> +NETWORKING [LABELED] (NetLabel, CIPSO, Labeled IPsec, SECMARK)
> +P:   Paul Moore
> +M:   [EMAIL PROTECTED]
> +L:   netdev@vger.kernel.org
> +S:   Maintained
> +

Aren't there now 2 subsystems in MAINTAINERS for the same thing?

NETLABEL
P:  Paul Moore
M:  [EMAIL PROTECTED]
W:  http://netlabel.sf.net
L:  netdev@vger.kernel.org
S:  Supported



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] PS3: fix the bug that 'ifconfig down' would hang

2007-08-27 Thread Masakazu Mokuno
Fix the bug that 'ifconfig eth0 down' would hang up, reported by Stefan
Assmann <[EMAIL PROTECTED]>.
As we removed netif_poll_enable() from dev->open(), we should not use 
netif_poll_disable() in dev->stop(). 

Signed-off-by: Masakazu Mokuno <[EMAIL PROTECTED]>
CC: Geoff Levand <[EMAIL PROTECTED]>
---
 drivers/net/ps3_gelic_net.c |1 -
 1 file changed, 1 deletion(-)

--- a/drivers/net/ps3_gelic_net.c
+++ b/drivers/net/ps3_gelic_net.c
@@ -556,7 +556,6 @@ static int gelic_net_stop(struct net_dev
 {
struct gelic_net_card *card = netdev_priv(netdev);
 
-   netif_poll_disable(netdev);
netif_stop_queue(netdev);
 
/* turn off DMA, force end */


--
Masakazu MOKUNO



Re: [2.6.24 patch] the planned eepro100 removal

2007-08-27 Thread Adrian Bunk
On Mon, Aug 27, 2007 at 02:58:05PM -0700, Kok, Auke wrote:
> Adrian Bunk wrote:
>...
>> This patch has been sent on:
>> - 14 Aug 2007
>> - 29 Jul 2007
>
> currently we won't have e100 fixed up for ARM in 2.6.23, so removing this 
> for 2.6.24 sounds a bit premature. Maybe 2.6.25. Can you 
> reschedule/postpone this?

OK, I'll resend it after 2.6.24-rc1.

> Auke

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH 0/9 Rev3] Implement batching skb API and support in IPoIB

2007-08-27 Thread jamal
On Sun, 2007-26-08 at 19:04 -0700, David Miller wrote:

> The transfer is much better behaved if we ACK every two full sized
> frames we copy into the receiver, and therefore don't stretch ACK, but
> at the cost of cpu utilization.

The rx coalescing in theory should help by accumulating more ACKs on the
rx side of the sender. But it doesn't seem to do that, i.e. for the 9K MTU
you are better off turning off the coalescing if you want higher
numbers. Also some of the TOE vendors (chelsio?) claim to have fixed
this by reducing bursts on outgoing packets.
 
Bill:
who suggested (as per your email) the 75usec value and what was it based
on measurement-wise? 
BTW, thanks for finding the energy to run those tests and for a very
refreshing perspective. I don't mean to add more work, but I had some
queries:
On your earlier tests, i think that Reno showed some significant
differences on the lower MTU case over BIC. I wonder if this is
consistent? 
A side note: Although the experimentation reduces the variables (eg
tying all to CPU0), it would be more exciting to see multi-cpu and
multi-flow sender effect (which IMO is more real world). 
Last note: you need a newer netstat.

> These effects are particularly pronounced on systems where the
> bus bandwidth is also one of the limiting factors.

Can you elucidate this a little more Dave? Did you mean memory
bandwidth? 

cheers,
jamal



Re: r8169: slow samba performance

2007-08-27 Thread john


On Wed, 22 Aug 2007, Bruce Cole wrote:

> Shane wrote:
> > On Wed, Aug 22, 2007 at 09:39:47AM -0700, Bruce Cole wrote:
> > > Shane, join the crowd :)  Try the fix I just re-posted over here:
> >
> > Bruce, gigabit speeds thanks for the pointer.  This fix
> > works well for me though I just added the three or so lines
> > in the elseif statement as it rejected with the
> > r8169-20070818.  I suppose I could've merged the whole
> > thing and if you need that tested, let me know but this is
> > looking good.
>
> Glad it works for you.  I'm not the maintainer, and also don't have adequate
> specs from Realtek to definitively explain why the NPQ bit apparently needs
> to be re-enabled when some but not all of the TX FIFO is dequeued.  It is
> documented as if it isn't cleared until the FIFO is empty.  So I assume an
> official patch will have to wait until Francois is back.

I have had abysmal performance trying to remotely run X apps via ssh on a
computer with a RTL8111 NIC.  Saw this message and decided to give this
patch a try --- success!  Much, much better.

Thanks,

John


Re: [2.6.24 patch] the planned eepro100 removal

2007-08-27 Thread Kok, Auke

Adrian Bunk wrote:
> This patch contains the planned removal of the eepro100 driver.
>
> Signed-off-by: Adrian Bunk

you lost your e-mail address? :)

> ---
>
> This patch has been sent on:
> - 14 Aug 2007
> - 29 Jul 2007

currently we won't have e100 fixed up for ARM in 2.6.23, so removing this for
2.6.24 sounds a bit premature. Maybe 2.6.25. Can you reschedule/postpone this?


Auke


Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-27 Thread Daniel Phillips
Say Evgeniy, something I was curious about but forgot to ask you 
earlier...

On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote:
> ...All operations are not atomic, since we do not care about precise
> number of bios, but a fact, that we are close or close enough to the
> limit. 
> ... in bio->endio
> + q->bio_queued--;

In your proposed patch, what prevents the race:

        cpu1                            cpu2

        read q->bio_queued
                                        q->bio_queued--
        write q->bio_queued - 1
        Whoops! We leaked a throttle count.

Regards,

Daniel
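
A minimal sketch of one way to close the race described above, assuming
the cost of an atomic read-modify-write on the completion path is
acceptable. In the kernel the counter would be an atomic_t updated with
atomic_inc()/atomic_dec(); the version below uses C11 atomics so that it
builds in userspace. The struct throttle_queue type, its limit field and
the throttle_* helpers are hypothetical names for illustration and are
not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the request queue; only the throttle
 * counter from the discussion is modelled. */
struct throttle_queue {
	atomic_int bio_queued;	/* bios currently in flight */
	int limit;		/* maximum bios allowed in flight */
};

static void throttle_init(struct throttle_queue *q, int limit)
{
	assert(limit > 0);
	atomic_init(&q->bio_queued, 0);
	q->limit = limit;
}

/* Submission path: take a slot, or back out if the limit is reached. */
static bool throttle_get(struct throttle_queue *q)
{
	if (atomic_fetch_add(&q->bio_queued, 1) >= q->limit) {
		atomic_fetch_sub(&q->bio_queued, 1);
		return false;
	}
	return true;
}

/* Completion path (bio->endio in the patch): a single atomic
 * read-modify-write, so a concurrent update on another CPU
 * cannot be overwritten and lost. */
static void throttle_put(struct throttle_queue *q)
{
	atomic_fetch_sub(&q->bio_queued, 1);
}

static int throttle_count(struct throttle_queue *q)
{
	return atomic_load(&q->bio_queued);
}
```

The trade-off is the one Evgeniy alludes to: the plain counter is cheaper
but can drift, while the atomic one stays exact at the price of a locked
operation per bio.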


Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread David Miller
From: James Chapman <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 22:41:43 +0100

> I don't recall saying anything in previous posts about this. Are you 
> confusing my posts with Jan-Bernd's?

Yes, my bad.

> Jan-Bernd has been talking about using hrtimers to _reschedule_
> NAPI. My posts are suggesting an alternative mechanism that keeps
> NAPI active (with interrupts disabled) for a jiffy or two after it
> would otherwise have gone idle in order to avoid too many interrupts
> when the packet rate is such that NAPI thrashes between poll-on and
> poll-off.

So in this scheme what runs ->poll() to process incoming packets?
The hrtimer?


Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread James Chapman

David Miller wrote:
> From: James Chapman <[EMAIL PROTECTED]>
> Date: Mon, 27 Aug 2007 16:51:29 +0100
>
> > To implement this, there's no need for timers, hrtimers or generic NAPI
> > support that others have suggested. A driver's poll() would set an
> > internal flag and record the current jiffies value when finding
> > workdone=0 rather than doing an immediate napi_complete(). Early in
> > poll() it would test this flag and if set, do a low-cost test to see if
> > it had any work to do. If no work, it would check the saved jiffies
> > value and do the napi_complete() only if no work has been done for a
> > configurable number of jiffies. This keeps interrupts disabled longer at
> > the expense of many more calls to poll() where no work is done. So
> > critical to this scheme is modifying the driver's poll() to fastpath the
> > case of having no work to do while waiting for its local jiffy count to
> > expire.
> >
> > Here's an untested patch for tg3 that illustrates the idea.
>
> It's only going to work with hrtimers, these interfaces can
> process at least 100,000 per jiffies tick.

I don't understand where hrtimers or interface speed comes in. If the
CPU is fast enough to call poll() 100,000 times per jiffies tick, it
means 100,000 wasted poll() calls while the netdev migrates from active
to poll-off state. Hence the need to fastpath the "no work" case in the
netdev's poll(). These extra poll() calls are tolerable if it avoids
NAPI thrashing between poll-on and poll-off states for certain packet rates.

> And the hrtimer granularity is going to need to be significantly low,
> and furthermore you're adding a guaranteed extra interrupt (for the
> hrtimer firing) in these cases, where what we're trying to avoid is
> exactly more interrupts.
>
> If you can make it work, fine, but it's going to need to be at a
> minimum disabled when the hrtimer granularity is not sufficient.
>
> But there are huger fish to fry for you I think.  Talk to your
> platform maintainers and ask for an interface for obtaining
> a flat static distribution of interrupts to cpus in order to
> support multiqueue NAPI better.
>
> In your previous postings you made arguments saying that the
> automatic placement of interrupts to cpus made everything
> bunch up on a single cpu and you wanted to propagate the
> NAPI work to other cpu's software interrupts from there.

I don't recall saying anything in previous posts about this. Are you
confusing my posts with Jan-Bernd's? Jan-Bernd has been talking about
using hrtimers to _reschedule_ NAPI. My posts are suggesting an
alternative mechanism that keeps NAPI active (with interrupts disabled)
for a jiffy or two after it would otherwise have gone idle in order to
avoid too many interrupts when the packet rate is such that NAPI
thrashes between poll-on and poll-off.

> That logic is bogus, because it merely proves that the hardware
> interrupt distribution is broken.  If it's a bad cpu to run
> software interrupts on, it's also a bad cpu to run hardware
> interrupts on.

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
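
The deferred-completion scheme described in this message can be sketched
as a small decision function. This is a userspace model under stated
assumptions, not the tg3 patch: struct poll_state, its fields and
poll_should_complete() are hypothetical names, jiffies is passed in as a
plain counter, and a pass that did any work is assumed to keep the
device in polled mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state; none of these names come from tg3. */
struct poll_state {
	bool idle_pending;		/* a previous poll found no work */
	unsigned long idle_since;	/* jiffies recorded at that point */
	unsigned long idle_timeout;	/* jiffies to linger before completing */
};

/*
 * Decide what one poll() pass should do.
 *   work_done: packets processed in this pass
 *   now:       current jiffies value
 * Returns true when the driver should call napi_complete() and
 * re-enable interrupts, false when it should stay in polled mode.
 */
static bool poll_should_complete(struct poll_state *s, int work_done,
				 unsigned long now)
{
	assert(work_done >= 0);

	if (work_done > 0) {
		/* Traffic is flowing again: cancel any idle countdown. */
		s->idle_pending = false;
		return false;
	}
	if (!s->idle_pending) {
		/* First empty pass: start the countdown, keep polling. */
		s->idle_pending = true;
		s->idle_since = now;
		return false;
	}
	/* Unsigned subtraction stays correct across counter wraparound. */
	if (now - s->idle_since >= s->idle_timeout) {
		s->idle_pending = false;
		return true;
	}
	return false;
}
```

Every empty pass costs only a flag test and a subtraction, which is the
"fastpath the no-work case" requirement; the price is the extra poll()
calls made while the countdown runs.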



Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 14:26:46 -0700

> My original patch had the equivalent of
> 
>   char* print_mac(char* buf, const char* addr) {
>   sprintf(buf,"%02x:...", addr[0]...)
>   return buf;
>   }
> 
> and used:
> 
>   DECLARE_MAC_BUF(var); //same as char var[18];
>   printk(MAC_FMT, MAC_ARG(var, addr));
> 
> which didn't require splitting printk()s
> 
> I've still got the original patch.
> It's just substituting EUI48 for MAC and forward porting.
> 
> Want something like that?

That sounds OK.  Let's give Johannes a chance to give some
feedback first.


Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread Joe Perches
On Mon, 2007-08-27 at 13:41 -0700, David Miller wrote:
> there are better approaches to this,
> how about just calling:
> 
>   print_mac(dev->dev_addr);
> 
> Sure, we'll have to split up printk() calls, but in the end it's
> likely still smaller and better.  And I think it's much cleaner
> than this macro stuff.

My original patch had the equivalent of

char* print_mac(char* buf, const char* addr) {
sprintf(buf,"%02x:...", addr[0]...)
return buf;
}

and used:

DECLARE_MAC_BUF(var); //same as char var[18];
printk(MAC_FMT, MAC_ARG(var, addr));

which didn't require splitting printk()s

I've still got the original patch.
It's just substituting EUI48 for MAC and forward porting.

Want something like that?
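
A minimal userspace sketch of the caller-owned-buffer approach, assuming
a plain 6-byte address. It is not the original patch: MAC_BUF_SIZE is a
hypothetical name, snprintf() stands in for the kernel's sprintf(), and
the address is taken as unsigned char so that bytes of 0x80 and above
are not sign-extended by "%02x".

```c
#include <assert.h>
#include <stdio.h>

#define MAC_BUF_SIZE 18	/* "xx:xx:xx:xx:xx:xx" plus the terminating NUL */

/*
 * Format a 6-byte address into a buffer owned by the caller and return
 * that buffer, so the call can be used directly as a "%s" argument.
 * The buffer must hold at least MAC_BUF_SIZE bytes.
 */
static char *print_mac(char *buf, const unsigned char *addr)
{
	assert(buf != NULL && addr != NULL);
	snprintf(buf, MAC_BUF_SIZE, "%02x:%02x:%02x:%02x:%02x:%02x",
		 addr[0], addr[1], addr[2], addr[3], addr[4], addr[5]);
	return buf;
}
```

A caller declares char buf[MAC_BUF_SIZE] in its own scope and passes
print_mac(buf, addr) to printk() or printf(). Because the buffer lives
as long as the calling block, the pointer is still valid when the format
string is processed; printing two addresses in one call needs two
buffers.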



Re: [E1000-devel] [PATCH net-2.6.24] e100: fix driver init lockup on e100_up()

2007-08-27 Thread David Miller
From: James Chapman <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 22:03:15 +0100

> Kok, Auke wrote:
> > James Chapman wrote:
> >>  nic = netdev_priv(netdev);
> >> -netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
> >>  nic->netdev = netdev;
> >>  nic->pdev = pdev;
> >>  nic->msg_enable = (1 << debug) - 1;
> >>  pci_set_drvdata(pdev, netdev);
> >> +netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
> >> +napi_disable(&nic->napi);
> > 
> > Just wondering, could we even reverse this order? IOW disable NAPI 
> > first, then add it ?
> 
> I think the order shouldn't matter. DaveM?

It doesn't matter.

I'm beginning to think maybe we should do an implicit napi_disable()
in netif_napi_add(), then it's easier for drivers to play nice.

On open you do napi_enable(), in close you do napi_disable().
That's it.

And anywhere else in your driver that you have to napi_disable()
(suspend, recovering from hardware errors, etc.) you must be sure to
do the associated napi_enable() later on in order to keep things
balanced.
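
The balancing rule can be illustrated with a toy userspace model. This
is only a sketch of the invariant, not kernel code: struct napi_model
and the model_* functions are hypothetical, and they return an error on
an unbalanced call purely so the mismatch is visible, which the kernel
functions themselves do not do.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a NAPI context; it tracks only enabled/disabled. */
struct napi_model {
	bool enabled;
};

/* Registration leaves the context disabled, mirroring the suggested
 * implicit napi_disable() inside netif_napi_add(). */
static void model_napi_add(struct napi_model *n)
{
	assert(n);
	n->enabled = false;
}

/* Returns 0 on success, -1 if the context was already enabled. */
static int model_napi_enable(struct napi_model *n)
{
	if (n->enabled)
		return -1;
	n->enabled = true;
	return 0;
}

/* Returns 0 on success, -1 if the context was already disabled. */
static int model_napi_disable(struct napi_model *n)
{
	if (!n->enabled)
		return -1;
	n->enabled = false;
	return 0;
}
```

With registration leaving the context disabled, a driver's open() path
is the first enable and its close() path the matching disable; a
suspend or error-recovery path that disables must enable again before
returning to normal operation.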


Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 13:57:42 -0700

> On Mon, 2007-08-27 at 13:41 -0700, David Miller wrote:
> > From: Johannes Berg <[EMAIL PROTECTED]>
> > Date: Mon, 27 Aug 2007 12:54:09 +0200
> > > #define MAC_FMT "%s"
> > > #define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})
> 
> > I don't think this works.
> 
> $ cat test_fmt.c
> #include 
> #include 

You're just getting lucky in this test case.

The language does not allow what you are doing, so you're
playing with fire.


Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread Stephen Hemminger
On Mon, 27 Aug 2007 13:57:42 -0700
Joe Perches <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-08-27 at 13:41 -0700, David Miller wrote:
> > From: Johannes Berg <[EMAIL PROTECTED]>
> > Date: Mon, 27 Aug 2007 12:54:09 +0200
> > > #define MAC_FMT "%s"
> > > #define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})
> 
> > I don't think this works.
> 
> $ cat test_fmt.c
> #include 
> #include 
> 
> #define MAC_FMT "%s"
> #define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})
> 
> int print_mac(const char* p, char* b)
> {
>   return sprintf(b, "%02x:%02x:%02x:%02x:%02x:%02x",
>p[0], p[1], p[2], p[3], p[4], p[5]);
> }
> 
> int main(int argc, char** argv)
> {
>   char m1[6] = {1,2,3,4,5,6};
>   char m2[6] = {6,5,4,3,2,1};
> 
>   printf("m1: " MAC_FMT " m2: " MAC_FMT "\n", MAC_ARG(m1), MAC_ARG(m2));
>   return 0;
> }
> 
> $ gcc test_fmt.c
> $ ./a.out
> m1: 01:02:03:04:05:06 m2: 06:05:04:03:02:01

As Dave said, you are passing out a variable which is no longer valid outside
of its scope. GCC today may accidentally allow it or it might work, but it
is only because of a GCC bug. If I recall discussions about some of the
recent kernel space bloat, GCC doesn't reuse space for variables declared
in subblocks.

I.e:
int foo(int x) {
	if (x) {
		char block1[1024];
		...
	} else {
		char block2[128];
	}
}

Compiler should be able to use same stack space for block1/block2 and only grow
stack by 1K. But it probably isn't that smart.




-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [E1000-devel] [PATCH net-2.6.24] e100: fix driver init lockup on e100_up()

2007-08-27 Thread James Chapman

Kok, Auke wrote:

James Chapman wrote:

Recent NAPI changes require that napi_enable() is always matched with
a napi_disable(). This patch makes sure that this invariant holds for
e100. It also moves the netif_napi_add() call until after private
pointers have been initialized, though this might only be significant
for cases where netpoll is being used.

Signed-off-by: James Chapman <[EMAIL PROTECTED]>

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index e25f5ec..48996a4 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2575,11 +2575,12 @@ static int __devinit e100_probe(struct pci_dev 
*pdev,

 strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);
 
 nic = netdev_priv(netdev);

-netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
 nic->netdev = netdev;
 nic->pdev = pdev;
 nic->msg_enable = (1 << debug) - 1;
 pci_set_drvdata(pdev, netdev);
+netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
+napi_disable(&nic->napi);


Just wondering, could we even reverse this order? IOW disable NAPI 
first, then add it ?


I think the order shouldn't matter. DaveM?


Otherwise this sounds OK to me.


--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development



Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread David Miller
From: James Chapman <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 16:51:29 +0100

> To implement this, there's no need for timers, hrtimers or generic NAPI 
> support that others have suggested. A driver's poll() would set an 
> internal flag and record the current jiffies value when finding 
> workdone=0 rather than doing an immediate napi_complete(). Early in 
> poll() it would test this flag and if set, do a low-cost test to see if 
> it had any work to do. If no work, it would check the saved jiffies 
> value and do the napi_complete() only if no work has been done for a 
> configurable number of jiffies. This keeps interrupts disabled longer at 
> the expense of many more calls to poll() where no work is done. So 
> critical to this scheme is modifying the driver's poll() to fastpath the 
> case of having no work to do while waiting for its local jiffy count to 
> expire.
> 
> Here's an untested patch for tg3 that illustrates the idea.

It's only going to work with hrtimers, these interfaces can
process at least 100,000 per jiffies tick.

And the hrtimer granularity is going to need to be significantly low,
and furthermore you're adding a guaranteed extra interrupt (for the
hrtimer firing) in these cases, where what we're trying to avoid is
exactly more interrupts.

If you can make it work, fine, but it's going to need to be at a
minimum disabled when the hrtimer granularity is not sufficient.

But there are huger fish to fry for you I think.  Talk to your
platform maintainers and ask for an interface for obtaining
a flat static distribution of interrupts to cpus in order to
support multiqueue NAPI better.

In your previous postings you made arguments saying that the
automatic placement of interrupts to cpus made everything
bunch up on a single cpu and you wanted to propagate the
NAPI work to other cpu's software interrupts from there.

That logic is bogus, because it merely proves that the hardware
interrupt distribution is broken.  If it's a bad cpu to run
software interrupts on, it's also a bad cpu to run hardware
interrupts on.


Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread Joe Perches
On Mon, 2007-08-27 at 13:41 -0700, David Miller wrote:
> From: Johannes Berg <[EMAIL PROTECTED]>
> Date: Mon, 27 Aug 2007 12:54:09 +0200
> > #define MAC_FMT "%s"
> > #define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})

> I don't think this works.

$ cat test_fmt.c
#include 
#include 

#define MAC_FMT "%s"
#define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})

int print_mac(const char* p, char* b)
{
  return sprintf(b, "%02x:%02x:%02x:%02x:%02x:%02x",
 p[0], p[1], p[2], p[3], p[4], p[5]);
}

int main(int argc, char** argv)
{
  char m1[6] = {1,2,3,4,5,6};
  char m2[6] = {6,5,4,3,2,1};

  printf("m1: " MAC_FMT " m2: " MAC_FMT "\n", MAC_ARG(m1), MAC_ARG(m2));
  return 0;
}

$ gcc test_fmt.c
$ ./a.out
m1: 01:02:03:04:05:06 m2: 06:05:04:03:02:01




Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread David Miller
From: Joe Perches <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 08:44:17 -0700

> The compound statement to hide the automatic works well.

As I explained in another reply, it doesn't work well.

It's undefined to reference that in-expression local
variable after the expression is done being evaluated.
It's out of scope so the compiler can reclaim that
stack space and use it for other things.


Re: net-2.6.24 build broken (allyesconfig)

2007-08-27 Thread David Miller
From: Johannes Berg <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 14:52:59 +0200

> This fixes a typo in commit f98d4ca4986fec.

One of many.

At least 5 drivers didn't build because of this patch.  And this is
incredibly trivial stuff.  It's not like you're changing the core
NAPI code for 20+ drivers, you're making a minor edit to a bunch
of printk() statements!

And in another thread we're arguing the merits of this approach.

So why don't you collect the build fixes and the original change
together, and we can use it if that's the final way to address this
issue.

Because I'm going to revert it for now, it's caused too much trouble.


Re: net-2.6.24 build broken (allyesconfig)

2007-08-27 Thread David Miller
From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 15:32:26 +0300 (EEST)

> Hmm, I would guess that "[NET]: Introduce MAC_FMT/MAC_ARG" broke it,
> though didn't verify it.

This will be the 4th or so build regression I'll have to clean up from
that patch.

I'm just going to revert it.


Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread David Miller
From: Johannes Berg <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 12:54:09 +0200

> -- change macros to --
> #define MAC_FMT "%s"
> #define MAC_ARG(a) ({char __buf[18]; print_mac(a, buf); __buf})
> 
> I'm not sure we'd want that, but at the time you said it made the kernel
> significantly smaller and I doubt there's a performance problem with it
> (who prints mac addresses regularly?)

I don't think this works.

The scope of the __buf[18] array is inside of that MAC_ARG()
expression, which will be fully evaluated before constructing
the argument to printk().

Therefore printk() will be passed what is essentially a stale stack
pointer.

You'd need something like a "MAC_BUF buf;" all the callers need
to declare, and a new "buf" argument to MAC_ARG().

If this was the goal, there are better approaches to this, how
about just calling:

print_mac(dev->dev_addr);

Sure, we'll have to split up printk() calls, but in the end it's
likely still smaller and better.  And I think it's much cleaner
than this macro stuff.



Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread David Miller
From: Jan-Bernd Themann <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 11:47:01 +0200

> So the question is simply: Do we want drivers that need (benefit
> from) a timer based polling support to implement their own timers
> each, or should there be a generic support?

I'm trying to figure out how an hrtimer implementation would
even work.

Would you start the timer from the chip interrupt handler?  If so,
that's taking two steps backwards as you've already taken all of the
overhead of running the interrupt handler.


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-08-27 Thread David Acker

Kok, Auke wrote:

Milton Miller wrote:

On Jun 5, 2007, at 8:34 AM, David Acker wrote:


David, Milton,

This was the last communication on-topic for the proposed changes to fix 
e100 on ARM. We're holding our breath here waiting for more, and would 
love to hear that this issue and fixes hasn't died off.


Thanks,

Auke


I am sorry folks, this is my fault.  I got pulled onto a fire on one of 
our other products.  I have only recently come back to working on our 
product that uses the e100 on ARM.  Based on my current time available 
to finish cleaning up this patch, I should have a new version available 
by the end of this week.

-Ack







Milton Miller wrote:

On Jun 1, 2007, at 3:45 PM, David Acker wrote:
Ok, I took a stab at coding and testing these ideas.  Below is a 
patch against 2.6.22-rc3.

Let me know what you think.
I think you got most of the ideas.   As Auke noted, your coding 
style is showing again.   And your mailer again munged whitespace 
(fixed by s/^// s/^$//).
Sorry about the coding style.  I instinctively followed what was 
there instead of kernel coding convention.  I will look into how 
whitespace is getting screwed up.


I have to watch my coding style too (I like to indent the closing brace).

At least the white space damage seems to be reversible.  More than I 
can say for this mailer.



Find a buffer that is complete with rx->el not set and rx->s0 set.
It appears that hardware can read the rfd's el-bit, then 
software can clear the rfd el-bit and set the rfd size, and then 
hardware can come in and read the size.
Yes, since the size is after the EL flag in the descriptor, this can 
happen since the pci read is not atomic.
I am reading the status back, although I don't think that I have to 
in this instance.
Actually, you are reading it when the rfd still has EL set.  Since 
the cpu will never encounter that case, the if condition is never 
satisfied.
In my tests, every time I found a completed rfd with the el-bit set, 
the receiver was in the out of resources state.


Yes, if the EL was set, it would be a real hard race to find the 
completed packet with EL but not RNR.   I was trying to refer to where 
you find a completed packet and then check for EL in the RFD.  That is 
what I was claiming can not be observed by the cpu (unless the card 
writes the EL bit back, and not just the status u16).


If the unless ... above is true, then please put a comment that the 
device can write RFD->EL back to 1 if we raced.



How about creating a state unknown, for when we think we should
check the device if it's running.
If we are in this state and then encounter a received packet without
s0 set, we can set it back to running.   We set it when we rx a packet
with s0 set.
We then move both io_status reads to the caller.

I can look into that as I clean this up.


I am testing a version of this code patched against 2.6.18.4 on my 
PXA 255 based system.  I will let you all know how it goes.
The testing I did so far did well.  I will try to get some more going 
tonight, hopefully on a cleaned up patch.


Good to hear our expectations match reality.



I'm assuming this is why the cleanup of the receiver start to always 
start on rx_to_clean got dropped again. :-)

Yep.  I will get that in the next patch.


Ok.

Also, I would like a few sentences in the Driver Operation section 
IV Receive big comment.  Something like
In order to keep updates to the RFD link field from colliding with 
hardware writes to mark packets complete, we use the feature that 
hardware will not write to a size 0 descriptor and mark the previous 
packet as end-of-list (EL).   After updating the link, we remove EL 
and only then restore the size such that hardware may use the 
previous-to-end RFD.
at the end of the first paragraph, and insert software before "no 
locking is required" in the second.

Sounds good to me.

I will see if I can get into a cleaned up patch today and get it out 
by tomorrow.  Thanks for dealing with me...I have been around kernel 
code for awhile but posting official patches to linux is new to me.

-Ack


I've just learned by watching the lists over the last several years.  
Well, and actually writing the odd patch here and there.


It occurs to me that I have been focusing on the code and not the 
changelog.   I'll send a separate reply on that thread shortly.


One more thing I'll state here ... as per the perfect patch 
guidelines, it is preferred that the meta-discussion about the patch 
and its history go after the change log, separated from it by a line 
of "--- " so that the patch application scripts can just extract the 
email subject as the title and through the first line of --- as the 
commit log.  (This saves some manual editing).


[1] http://kernelnewbies.org/UpstreamMerge

milton


Re: Remove softirq scheduling from pktgen [PATCH]

2007-08-27 Thread Robert Olsson

Christoph Hellwig writes:

 > > Hello, It's not a job for pktgen.
 > 
 > Please also kill the do_softirq export while you're at it.


 Right seems like pktgen luckily was the only user.

 Cheers
--ro


Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>


diff --git a/kernel/softirq.c b/kernel/softirq.c
index 0f546dd..dbbdcd7 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -271,8 +271,6 @@ asmlinkage void do_softirq(void)
local_irq_restore(flags);
 }
 
-EXPORT_SYMBOL(do_softirq);
-
 #endif
 
 /*


Re: [PATCH 0/9 Rev3] Implement batching skb API and support in IPoIB

2007-08-27 Thread Rick Jones

David Miller wrote:
> From: John Heffner <[EMAIL PROTECTED]>
> Date: Sun, 26 Aug 2007 21:32:26 -0400
>
> > There are a few interesting things here.  For one, the bursts caused by
> > TSO seem to be causing the receiver to do stretch acks.  This may have a
> > negative impact on flow performance, but it's hard to say for sure how
> > much.  Interestingly, it will even further reduce the CPU load on the
> > sender, since it has to process fewer acks.
> >
> > As I suspected, in the non-TSO case the receiver gets lots of packets
> > directly queued to user.  This should result in somewhat lower CPU
> > utilization on the receiver.  I don't know if it can account for all the
> > difference you see.
>
> I had completely forgotten these stretch ACK and ucopy issues.

ISTR that LRO will induce stretch ACKs as well.  Not that I dislike fewer ACKs
mind you... :)


rick jones


Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

2007-08-27 Thread Kok, Auke

Milton Miller wrote:

On Jun 5, 2007, at 8:34 AM, David Acker wrote:


David, Milton,

This was the last communication on-topic for the proposed changes to fix e100 on 
ARM. We're holding our breath here waiting for more, and would love to hear that 
this issue and its fixes haven't died off.


Thanks,

Auke




Milton Miller wrote:

On Jun 1, 2007, at 3:45 PM, David Acker wrote:
Ok, I took a stab at coding and testing these ideas.  Below is a 
patch against 2.6.22-rc3.

Let me know what you think.
I think you got most of the ideas.   As Auke noted, your coding style 
is showing again.   And your mailer again munged whitespace (fixed by 
s/^// s/^$//).
Sorry about the coding style.  I instinctively followed what was there 
instead of kernel coding convention.  I will look into how whitespace 
is getting screwed up.


I have to watch my coding style too (I like to indent the closing 
brace).


At least the white space damage seems to be reversible.  More than I 
can say for this mailer.



Find a buffer that is complete with rx->el not set and rx->s0 set.
It appears that hardware can read the rfd's el-bit, then 
software can clear the rfd el-bit and set the rfd size, and then 
hardware can come in and read the size.
Yes, since the size is after the EL flag in the descriptor, this can 
happen since the pci read is not atomic.
I am reading the status back, although I don't think that I have to 
in this instance.
Actually, you are reading it when the rfd still has EL set.  Since 
the cpu will never encounter that case, the if condition is never 
satisfied.
In my tests, every time I found a completed rfd with the el-bit set, 
the receiver was in the out of resources state.


Yes, if the EL was set, it would be a real hard race to find the 
completed packet with EL but not RNR.   I was trying to refer to where 
you find a completed packet and then check for EL in the RFD.  That is 
what I was claiming can not be observed by the cpu (unless the card 
writes the EL bit back, and not just the status u16).


If the unless ... above is true, then please put a comment that the 
device can write RFD->EL back to 1 if we raced.



How about creating a state "unknown", for when we think we should check 
whether the device is running.
If we are in this state and then encounter a received packet without 
s0 set, we can set it back to running.   We set it when we rx a packet 
with s0 set.
We then move both io_status reads to the caller.

I can look into that as I clean this up.


I am testing a version of this code patched against 2.6.18.4 on my 
PXA 255 based system.  I will let you all know how it goes.
The testing I did so far did well.  I will try to get some more going 
tonight, hopefully on a cleaned up patch.


Good to hear our expectations match reality.



I'm assuming this is why the cleanup of the receiver start to always 
start on rx_to_clean got dropped again. :-)

Yep.  I will get that in the next patch.


Ok.

Also, I would like a few sentences in the Driver Operation section IV 
Receive big comment.  Something like
In order to keep updates to the RFD link field from colliding with 
hardware writes to mark packets complete, we use the feature that 
hardware will not write to a size 0 descriptor and mark the previous 
packet as end-of-list (EL).   After updating the link, we remove EL 
and only then restore the size such that hardware may use the 
previous-to-end RFD.
at the end of the first paragraph, and insert software before "no 
locking is required" in the second.

Sounds good to me.

I will see if I can get into a cleaned up patch today and get it out 
by tomorrow.  Thanks for dealing with me...I have been around kernel 
code for awhile but posting official patches to linux is new to me.

-Ack


I've just learned by watching the lists over the last several years.  
Well, and actually writing the odd patch here and there.


It occurs to me that I have been focusing on the code and not the 
changelog.   I'll send a separate reply on that thread shortly.


One more thing I'll state here ... as per the perfect patch guidelines, 
it is preferred that the meta-discussion about the patch and its 
history go after the change log, separated from it by a line of "--- " 
so that the patch application scripts can just extract the email 
subject as the title and everything up to the first line of --- as the 
commit log.  (This saves some manual editing).


[1] http://kernelnewbies.org/UpstreamMerge

milton



[patch 7/7] qeth: Drop ARP packages on HiperSockets interface with NOARP attribute.

2007-08-27 Thread Ursula Braun
From: Klaus D. Wacker <[EMAIL PROTECTED]>

A network interface can get ARP packets even when the interface has
NOARP specified. In a HiperSockets environment this disturbs receiving
systems when packets are sent on the multicast queue. (E.g. TCP/IP on
z/VM issues messages reporting invalid data on the HiperSockets
interface.)
Qeth will no longer send ARP packets on HiperSockets interface when
interface has the NOARP attribute.

Signed-off-by: Klaus D. Wacker <[EMAIL PROTECTED]>
Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---

 drivers/s390/net/qeth_main.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -2505,7 +2505,7 @@ qeth_rebuild_skb_fake_ll_tr(struct qeth_
struct iphdr *ip_hdr;
 
QETH_DBF_TEXT(trace,5,"skbfktr");
-   skb_set_mac_header(skb, -QETH_FAKE_LL_LEN_TR);
+   skb_set_mac_header(skb, (int)-QETH_FAKE_LL_LEN_TR);
/* this is a fake ethernet header */
fake_hdr = tr_hdr(skb);
 
@@ -4710,9 +4710,15 @@ qeth_send_packet(struct qeth_card *card,
if (card->info.type != QETH_CARD_TYPE_IQD)
rc = qeth_do_send_packet(card, queue, new_skb, hdr,
 elements_needed, ctx);
-   else
+   else {
+   if ((skb->protocol == htons(ETH_P_ARP)) &&
+   (card->dev->flags & IFF_NOARP)) {
+   __qeth_free_new_skb(skb, new_skb);
+   return -EPERM;
+   }
rc = qeth_do_send_packet_fast(card, queue, new_skb, hdr,
  elements_needed, ctx);
+   }
if (!rc) {
card->stats.tx_packets++;
card->stats.tx_bytes += tx_bytes;



[patch 4/7] qeth: Announce tx checksumming for qeth devices in TSO/EDDP mode

2007-08-27 Thread Ursula Braun
From: Frank Blaschka <[EMAIL PROTECTED]>

TSO requires tx checksumming. For non GSO frames in TSO/EDDP mode we
have to manually calculate the checksum. 

Signed-off-by: Frank Blaschka <[EMAIL PROTECTED]>
Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---
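
As a rough user-space illustration of what the IPv4 branch of qeth_tx_csum() in the diff below asks the kernel helpers to compute, here is a sketch of the ones' complement checksum over the pseudo-header plus the TCP or UDP segment. The function names sum_bytes() and l4_checksum_ipv4() are made up for this example and are not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running ones' complement sum of big-endian 16-bit words. */
static uint32_t sum_bytes(const uint8_t *p, size_t len, uint32_t sum)
{
	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len)
		sum += (uint32_t)p[0] << 8;	/* pad a trailing odd byte with zero */
	return sum;
}

/*
 * Checksum of a TCP or UDP segment over the IPv4 pseudo-header
 * (source, destination, protocol, length) followed by the segment.
 * The checksum field inside `seg` must be zero on entry.
 * The result is in host order; store it big-endian in the header.
 */
uint16_t l4_checksum_ipv4(const uint8_t saddr[4], const uint8_t daddr[4],
			  uint8_t proto, const uint8_t *seg, uint16_t len)
{
	uint32_t sum = 0;

	assert(seg != NULL || len == 0);
	sum = sum_bytes(saddr, 4, sum);
	sum = sum_bytes(daddr, 4, sum);
	sum += proto;
	sum += len;
	sum = sum_bytes(seg, len, sum);
	while (sum >> 16)			/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Recomputing over a segment that already carries a correct checksum yields 0, which is how a receiver would verify it.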


 drivers/s390/net/qeth_main.c |   82 +++
 1 file changed, 68 insertions(+), 14 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -4555,6 +4555,53 @@ qeth_get_elements_no(struct qeth_card *c
 return elements_needed;
 }
 
+static void qeth_tx_csum(struct sk_buff *skb)
+{
+   int tlen;
+
+   if (skb->protocol == htons(ETH_P_IP)) {
+   tlen = ntohs(ip_hdr(skb)->tot_len) - (ip_hdr(skb)->ihl << 2);
+   switch (ip_hdr(skb)->protocol) {
+   case IPPROTO_TCP:
+   tcp_hdr(skb)->check = 0;
+   tcp_hdr(skb)->check = csum_tcpudp_magic(
+   ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
+   tlen, ip_hdr(skb)->protocol,
+   skb_checksum(skb, skb_transport_offset(skb),
+   tlen, 0));
+   break;
+   case IPPROTO_UDP:
+   udp_hdr(skb)->check = 0;
+   udp_hdr(skb)->check = csum_tcpudp_magic(
+   ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
+   tlen, ip_hdr(skb)->protocol,
+   skb_checksum(skb, skb_transport_offset(skb),
+   tlen, 0));
+   break;
+   }
+   } else if (skb->protocol == htons(ETH_P_IPV6)) {
+   switch (ipv6_hdr(skb)->nexthdr) {
+   case IPPROTO_TCP:
+   tcp_hdr(skb)->check = 0;
+   tcp_hdr(skb)->check = csum_ipv6_magic(
+   &ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr,
+   ipv6_hdr(skb)->payload_len,
+   ipv6_hdr(skb)->nexthdr,
+   skb_checksum(skb, skb_transport_offset(skb),
+   ipv6_hdr(skb)->payload_len, 0));
+   break;
+   case IPPROTO_UDP:
+   udp_hdr(skb)->check = 0;
+   udp_hdr(skb)->check = csum_ipv6_magic(
+   &ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr,
+   ipv6_hdr(skb)->payload_len,
+   ipv6_hdr(skb)->nexthdr,
+   skb_checksum(skb, skb_transport_offset(skb),
+   ipv6_hdr(skb)->payload_len, 0));
+   break;
+   }
+   }
+}
 
 static int
 qeth_send_packet(struct qeth_card *card, struct sk_buff *skb)
@@ -4640,6 +4687,10 @@ qeth_send_packet(struct qeth_card *card,
elements_needed += elems;
}
 
+   if ((large_send == QETH_LARGE_SEND_NO) &&
+   (skb->ip_summed == CHECKSUM_PARTIAL))
+   qeth_tx_csum(new_skb);
+
if (card->info.type != QETH_CARD_TYPE_IQD)
rc = qeth_do_send_packet(card, queue, new_skb, hdr,
 elements_needed, ctx);
@@ -6387,20 +6438,18 @@ qeth_deregister_addr_entry(struct qeth_c
 static u32
 qeth_ethtool_get_tx_csum(struct net_device *dev)
 {
-   /* We may need to say that we support tx csum offload if
-* we do EDDP or TSO. There are discussions going on to
-* enforce rules in the stack and in ethtool that make
-* SG and TSO depend on HW_CSUM. At the moment there are
-* no such rules
-* If we say yes here, we have to checksum outbound packets
-* any time. */
-   return 0;
+   return (dev->features & NETIF_F_HW_CSUM) != 0;
 }
 
 static int
 qeth_ethtool_set_tx_csum(struct net_device *dev, u32 data)
 {
-   return -EINVAL;
+   if (data)
+   dev->features |= NETIF_F_HW_CSUM;
+   else
+   dev->features &= ~NETIF_F_HW_CSUM;
+
+   return 0;
 }
 
 static u32
@@ -7414,7 +7463,8 @@ qeth_start_ipa_tso(struct qeth_card *car
}
if (rc && (card->options.large_send == QETH_LARGE_SEND_TSO)){
card->options.large_send = QETH_LARGE_SEND_NO;
-   card->dev->f

[patch 0/7] qeth patches for 2.6.23-rc3

2007-08-27 Thread Ursula Braun
-- 
qeth patches for 2.6.23-rc3:
- do not allow interruption of "ungroup"
- scatter gather mode: enforce rate limit
- don't return void function return values
- add tx checksumming for TSO/EDDP mode
- invoke qeth_clear_output_buffer only for allocated qdio queues.
- add specific message for exclusively used OSA-adapters
- drop ARP packets on HiperSockets

Regards,   Ursula Braun


[patch 5/7] qeth: crash during reboot after failing online setting

2007-08-27 Thread Ursula Braun
From: Ursula Braun <[EMAIL PROTECTED]>

Online setting of a qeth device may fail for instance because of:
- out-of-memory condition when allocating qdio queues
- IDX ACTIVATE problem
- ...
Such a device is still returned in a driver_for_each_device loop
processed in qeth_reboot_event(), which calls
qeth_clear_qdio_buffers(). Make sure qeth_clear_output_buffer() is
called only, if the qdio queues have been successfully allocated
during initialization of a qeth device.

Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---
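
The idea behind the hunks below, resetting a pointer to NULL after freeing it so that later cleanup can tell whether the allocation ever succeeded, can be sketched in user space as follows. struct queues, queues_alloc() and queues_free() are hypothetical stand-ins, not the driver's data structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's queue bookkeeping. */
struct queues {
	int **out_qs;	/* array of per-queue allocations, or NULL */
	int n_out;
};

/*
 * Allocate all queues.  On failure, undo everything and leave
 * out_qs NULL.  `fail_at` simulates an out-of-memory condition
 * at that index (-1 for none).
 */
int queues_alloc(struct queues *q, int n, int fail_at)
{
	int i;

	assert(n > 0);
	q->n_out = n;
	q->out_qs = calloc((size_t)n, sizeof(*q->out_qs));
	if (!q->out_qs)
		return -1;
	for (i = 0; i < n; i++) {
		q->out_qs[i] = (i == fail_at) ? NULL : malloc(sizeof(int));
		if (!q->out_qs[i])
			goto out_free;
	}
	return 0;
out_free:
	while (i > 0)
		free(q->out_qs[--i]);
	free(q->out_qs);
	q->out_qs = NULL;	/* later cleanup can now detect the failure */
	return -1;
}

/* Safe after a failed or a successful allocation, and safe to call twice. */
void queues_free(struct queues *q)
{
	int i;

	if (!q->out_qs)
		return;
	for (i = 0; i < q->n_out; i++)
		free(q->out_qs[i]);
	free(q->out_qs);
	q->out_qs = NULL;
}
```

Without the NULL assignments, a second cleanup pass (such as the reboot notifier described above) would walk freed memory.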

 drivers/s390/net/qeth_main.c |   20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -3356,10 +3356,12 @@ out_freeoutq:
while (i > 0)
kfree(card->qdio.out_qs[--i]);
kfree(card->qdio.out_qs);
+   card->qdio.out_qs = NULL;
 out_freepool:
qeth_free_buffer_pool(card);
 out_freeinq:
kfree(card->qdio.in_q);
+   card->qdio.in_q = NULL;
 out_nomem:
atomic_set(&card->qdio.state, QETH_QDIO_UNINITIALIZED);
return -ENOMEM;
@@ -3375,16 +3377,20 @@ qeth_free_qdio_buffers(struct qeth_card 
QETH_QDIO_UNINITIALIZED)
return;
kfree(card->qdio.in_q);
+   card->qdio.in_q = NULL;
/* inbound buffer pool */
qeth_free_buffer_pool(card);
/* free outbound qdio_qs */
-   for (i = 0; i < card->qdio.no_out_queues; ++i){
-   for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j)
-   qeth_clear_output_buffer(card->qdio.out_qs[i],
-   &card->qdio.out_qs[i]->bufs[j]);
-   kfree(card->qdio.out_qs[i]);
+   if (card->qdio.out_qs) {
+   for (i = 0; i < card->qdio.no_out_queues; ++i) {
+   for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j)
+   qeth_clear_output_buffer(card->qdio.out_qs[i],
+   &card->qdio.out_qs[i]->bufs[j]);
+   kfree(card->qdio.out_qs[i]);
+   }
+   kfree(card->qdio.out_qs);
+   card->qdio.out_qs = NULL;
}
-   kfree(card->qdio.out_qs);
 }
 
 static void
@@ -3395,7 +3401,7 @@ qeth_clear_qdio_buffers(struct qeth_card
QETH_DBF_TEXT(trace, 2, "clearqdbf");
/* clear outbound buffers to free skbs */
for (i = 0; i < card->qdio.no_out_queues; ++i)
-   if (card->qdio.out_qs[i]){
+   if (card->qdio.out_qs && card->qdio.out_qs[i]) {
for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j)
qeth_clear_output_buffer(card->qdio.out_qs[i],
&card->qdio.out_qs[i]->bufs[j]);



[patch 6/7] qeth: provide specific message for OSA-adapters exclusively used

2007-08-27 Thread Ursula Braun
From: Ursula Braun <[EMAIL PROTECTED]>

Exclusive usage of OSA-cards has been introduced. Even though Linux
does not make use of it, qeth should be prepared to receive a bad RC
for some initialization steps. A meaningful message is now given,
if an OSA-device is set online, even though the OSA-adapter is already
exclusively used by another host.

Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---

 drivers/s390/net/qeth_main.c |   28 +++-
 drivers/s390/net/qeth_mpc.h  |1 +
 2 files changed, 20 insertions(+), 9 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -1541,16 +1541,21 @@ qeth_idx_write_cb(struct qeth_channel *c
card = CARD_FROM_CDEV(channel->ccwdev);
 
if (!(QETH_IS_IDX_ACT_POS_REPLY(iob->data))) {
-   PRINT_ERR("IDX_ACTIVATE on write channel device %s: negative "
- "reply\n", CARD_WDEV_ID(card));
+   if (QETH_IDX_ACT_CAUSE_CODE(iob->data) == 0x19)
+   PRINT_ERR("IDX_ACTIVATE on write channel device %s: "
+   "adapter exclusively used by another host\n",
+   CARD_WDEV_ID(card));
+   else
+   PRINT_ERR("IDX_ACTIVATE on write channel device %s: "
+   "negative reply\n", CARD_WDEV_ID(card));
goto out;
}
memcpy(&temp, QETH_IDX_ACT_FUNC_LEVEL(iob->data), 2);
if ((temp & ~0x0100) != qeth_peer_func_level(card->info.func_level)) {
PRINT_WARN("IDX_ACTIVATE on write channel device %s: "
-  "function level mismatch "
-  "(sent: 0x%x, received: 0x%x)\n",
-  CARD_WDEV_ID(card), card->info.func_level, temp);
+   "function level mismatch "
+   "(sent: 0x%x, received: 0x%x)\n",
+   CARD_WDEV_ID(card), card->info.func_level, temp);
goto out;
}
channel->state = CH_STATE_UP;
@@ -1596,8 +1601,13 @@ qeth_idx_read_cb(struct qeth_channel *ch
goto out;
}
if (!(QETH_IS_IDX_ACT_POS_REPLY(iob->data))) {
-   PRINT_ERR("IDX_ACTIVATE on read channel device %s: negative "
- "reply\n", CARD_RDEV_ID(card));
+   if (QETH_IDX_ACT_CAUSE_CODE(iob->data) == 0x19)
+   PRINT_ERR("IDX_ACTIVATE on read channel device %s: "
+   "adapter exclusively used by another host\n",
+   CARD_RDEV_ID(card));
+   else
+   PRINT_ERR("IDX_ACTIVATE on read channel device %s: "
+   "negative reply\n", CARD_RDEV_ID(card));
goto out;
}
 
@@ -1612,8 +1622,8 @@ qeth_idx_read_cb(struct qeth_channel *ch
memcpy(&temp, QETH_IDX_ACT_FUNC_LEVEL(iob->data), 2);
if (temp != qeth_peer_func_level(card->info.func_level)) {
PRINT_WARN("IDX_ACTIVATE on read channel device %s: function "
-  "level mismatch (sent: 0x%x, received: 0x%x)\n",
-  CARD_RDEV_ID(card), card->info.func_level, temp);
+   "level mismatch (sent: 0x%x, received: 0x%x)\n",
+   CARD_RDEV_ID(card), card->info.func_level, temp);
goto out;
}
memcpy(&card->token.issuer_rm_r,
Index: linux-2.6-uschi/drivers/s390/net/qeth_mpc.h
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_mpc.h
+++ linux-2.6-uschi/drivers/s390/net/qeth_mpc.h
@@ -565,6 +565,7 @@ extern unsigned char IDX_ACTIVATE_WRITE[
 #define QETH_IDX_ACT_QDIO_DEV_REALADDR(buffer) (buffer+0x20)
 #define QETH_IS_IDX_ACT_POS_REPLY(buffer) (((buffer)[0x08]&3)==2)
 #define QETH_IDX_REPLY_LEVEL(buffer) (buffer+0x12)
+#define QETH_IDX_ACT_CAUSE_CODE(buffer) (buffer)[0x09]
 
 #define PDU_ENCAPSULATION(buffer) \
(buffer + *(buffer + (*(buffer+0x0b)) + \



[patch 2/7] qeth: enforce a rate limit for inbound scatter gather messages

2007-08-27 Thread Ursula Braun
From: Frank Blaschka <[EMAIL PROTECTED]>

Under memory pressure, scatter gather mode switching messages must be
rate limited.

Signed-off-by: Frank Blaschka <[EMAIL PROTECTED]>
Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---

 drivers/s390/net/qeth_main.c |   13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -2803,13 +2803,16 @@ qeth_queue_input_buffer(struct qeth_card
if (newcount < count) {
/* we are in memory shortage so we switch back to
   traditional skb allocation and drop packages */
-   if (atomic_cmpxchg(&card->force_alloc_skb, 0, 1))
-   printk(KERN_WARNING
-   "qeth: switch to alloc skb\n");
+   if (!atomic_read(&card->force_alloc_skb) &&
+   net_ratelimit())
+   PRINT_WARN("Switch to alloc skb\n");
+   atomic_set(&card->force_alloc_skb, 3);
count = newcount;
} else {
-   if (atomic_cmpxchg(&card->force_alloc_skb, 1, 0))
-   printk(KERN_WARNING "qeth: switch to sg\n");
+   if ((atomic_read(&card->force_alloc_skb) == 1) &&
+   net_ratelimit())
+   PRINT_WARN("Switch to sg\n");
+   atomic_add_unless(&card->force_alloc_skb, -1, 0);
}
 
/*



[patch 1/7] qeth: ungrouping a device must not be interruptible

2007-08-27 Thread Ursula Braun
From: Ursula Braun <[EMAIL PROTECTED]>

Problem:
A recovery thread must not be active when device is removed.
In qeth_remove_device() an interruptible wait operation is used
to wait until a qeth recovery thread is finished. If a user really
interrupts the ungroup operation of a qeth device while a recovery
is running, cio and qeth are out of sync (device already removed
from cio, but kept in qeth). A following module unload of qeth
results in a kernel OOPS here.

Solution:
Do not allow interruption of ungroup operation to guarantee
finishing of a potentially running qeth recovery thread.

Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---

 drivers/s390/net/qeth_main.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -561,7 +561,7 @@ qeth_set_offline(struct ccwgroup_device 
 }
 
 static int
-qeth_wait_for_threads(struct qeth_card *card, unsigned long threads);
+qeth_threads_running(struct qeth_card *card, unsigned long threads);
 
 
 static void
@@ -576,8 +576,7 @@ qeth_remove_device(struct ccwgroup_devic
if (!card)
return;
 
-   if (qeth_wait_for_threads(card, 0x))
-   return;
+   wait_event(card->wait_q, qeth_threads_running(card, 0x) == 0);
 
if (cgdev->state == CCWGROUP_ONLINE){
card->use_hard_stop = 1;



[patch 3/7] qeth: dont return the return values of void functions.

2007-08-27 Thread Ursula Braun
From: Heiko Carstens <[EMAIL PROTECTED]>

Signed-off-by: Heiko Carstens <[EMAIL PROTECTED]>
Signed-off-by: Ursula Braun <[EMAIL PROTECTED]>
---

 drivers/s390/net/qeth.h |4 ++--
 drivers/s390/net/qeth_sys.c |8 
 2 files changed, 6 insertions(+), 6 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth.h
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth.h
+++ linux-2.6-uschi/drivers/s390/net/qeth.h
@@ -1178,9 +1178,9 @@ qeth_ipaddr_to_string(enum qeth_prot_ver
  char *buf)
 {
if (proto == QETH_PROT_IPV4)
-   return qeth_ipaddr4_to_string(addr, buf);
+   qeth_ipaddr4_to_string(addr, buf);
else if (proto == QETH_PROT_IPV6)
-   return qeth_ipaddr6_to_string(addr, buf);
+   qeth_ipaddr6_to_string(addr, buf);
 }
 
 static inline int
Index: linux-2.6-uschi/drivers/s390/net/qeth_sys.c
===
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_sys.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_sys.c
@@ -1760,10 +1760,10 @@ qeth_remove_device_attributes(struct dev
 {
struct qeth_card *card = dev->driver_data;
 
-   if (card->info.type == QETH_CARD_TYPE_OSN)
-   return sysfs_remove_group(&dev->kobj,
- &qeth_osn_device_attr_group);
-
+   if (card->info.type == QETH_CARD_TYPE_OSN) {
+   sysfs_remove_group(&dev->kobj, &qeth_osn_device_attr_group);
+   return;
+   }
sysfs_remove_group(&dev->kobj, &qeth_device_attr_group);
sysfs_remove_group(&dev->kobj, &qeth_device_ipato_group);
sysfs_remove_group(&dev->kobj, &qeth_device_vipa_group);



Re: Remove softirq scheduling from pktgen [PATCH]

2007-08-27 Thread Christoph Hellwig
On Mon, Aug 27, 2007 at 06:57:19PM +0200, Robert Olsson wrote:
> 
> 
> Hello, It's not a job for pktgen.

Please also kill the do_softirq export while you're at it.



Re: [E1000-devel] [PATCH net-2.6.24] e100: fix driver init lockup on e100_up()

2007-08-27 Thread Kok, Auke

James Chapman wrote:

Recent NAPI changes require that napi_enable() is always matched with
a napi_disable(). This patch makes sure that this invariant holds for
e100. It also moves the netif_napi_add() call until after private
pointers have been initialized, though this might only be significant
for cases where netpoll is being used.

Signed-off-by: James Chapman <[EMAIL PROTECTED]>

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index e25f5ec..48996a4 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2575,11 +2575,12 @@ static int __devinit e100_probe(struct pci_dev *pdev,
strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);
 
 	nic = netdev_priv(netdev);

-   netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
nic->netdev = netdev;
nic->pdev = pdev;
nic->msg_enable = (1 << debug) - 1;
pci_set_drvdata(pdev, netdev);
+   netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
+   napi_disable(&nic->napi);


Just wondering, could we even reverse this order? IOW disable NAPI first, then 
add it ?


Otherwise this sounds OK to me.

Auke


[PATCH net-2.6.24] e100: fix driver init lockup on e100_up()

2007-08-27 Thread James Chapman
Recent NAPI changes require that napi_enable() is always matched with
a napi_disable(). This patch makes sure that this invariant holds for
e100. It also moves the netif_napi_add() call until after private
pointers have been initialized, though this might only be significant
for cases where netpoll is being used.

Signed-off-by: James Chapman <[EMAIL PROTECTED]>

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index e25f5ec..48996a4 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2575,11 +2575,12 @@ static int __devinit e100_probe(struct pci_dev *pdev,
strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);
 
nic = netdev_priv(netdev);
-   netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
nic->netdev = netdev;
nic->pdev = pdev;
nic->msg_enable = (1 << debug) - 1;
pci_set_drvdata(pdev, netdev);
+   netif_napi_add(netdev, &nic->napi, e100_poll, E100_NAPI_WEIGHT);
+   napi_disable(&nic->napi);
 
if((err = pci_enable_device(pdev))) {
DPRINTK(PROBE, ERR, "Cannot enable PCI device, aborting.\n");


Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread James Chapman

Jan-Bernd Themann wrote:

On Monday 27 August 2007 17:51, James Chapman wrote:

In the second half of my previous reply (which seems to have been 
deleted), I suggest a way to avoid this problem without using hardware 
interrupt mitigation / coalescing. Original text is quoted below.


 >> I've seen the same and I'm suggesting that the NAPI driver keeps
 >> itself in polled mode for N polls or M jiffies after it sees
 >> workdone=0. This has always worked for me in packet forwarding
 >> scenarios to maximize packets/sec and minimize latency.

To implement this, there's no need for timers, hrtimers or generic NAPI 
support that others have suggested. A driver's poll() would set an 
internal flag and record the current jiffies value when finding 
workdone=0 rather than doing an immediate napi_complete(). Early in 
poll() it would test this flag and if set, do a low-cost test to see if 
it had any work to do. If no work, it would check the saved jiffies 
value and do the napi_complete() only if no work has been done for a 
configurable number of jiffies. This keeps interrupts disabled longer at 
the expense of many more calls to poll() where no work is done. So 
critical to this scheme is modifying the driver's poll() to fastpath the 
case of having no work to do while waiting for its local jiffy count to 
expire.




The problem I see with this approach is that the time that passes between
two jiffies might be too long for 10G ethernet adapters. 


Why would staying in polled mode for 2 jiffies be too long in the 10G 
case? I don't see why 10G makes any difference. Your poll() would be 
called as fast as your CPU allows during those 2 jiffies (it would 
actually be between 1 and 2 jiffies in practice). It is therefore 
critical that the driver's poll() implementation is as efficient as 
possible for the "no work" case to minimize the overhead of the extra 
poll() calls. Your poll might be called thousands of times in 1-2 
jiffies with nothing to do...



(I tried to implement
a timer based approach with usual timers and the result was a disaster).
HW interrupts / or HP timer avoid the jiffy problem as they activate softIRQs
as soon as you call netif_rx_schedule. 


My scheme doesn't use timers to do netif_rx_schedule() because the 
device stays in polled mode for 1-2 jiffies _after_ it detects it has no 
more work. So the device remains scheduled, processing packets as usual. 
The device deschedules itself and re-enables its interrupts only when it 
has a period of 1-2 jiffies of doing no work.


BTW, I chose 2 jiffies in the example patch just to keep the patch 
simple. It might be more for systems with large HZ or those that want to 
be even more aggressive at staying in polled mode. I envisage it being 
another parameter that can be tweaked using ethtool if people see a 
benefit of this scheme.


--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development



Remove softirq scheduling from pktgen [PATCH]

2007-08-27 Thread Robert Olsson


Hello, It's not a job for pktgen.

Cheers
--ro


Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>


diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 18601af..975e887 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -164,7 +164,7 @@
 #include  /* do_div */
 #include 
 
-#define VERSION  "pktgen v2.68: Packet Generator for packet performance testing.\n"
+#define VERSION  "pktgen v2.69: Packet Generator for packet performance testing.\n"
 
 /* The buckets are exponential in 'width' */
 #define LAT_BUCKETS_MAX 32
@@ -381,7 +381,6 @@ struct pktgen_thread {
struct list_head th_list;
struct task_struct *tsk;
char result[512];
-   u32 max_before_softirq; /* We'll call do_softirq to prevent starvation. */
 
/* Field for thread to receive "posted" events terminate, stop ifs etc. */
 
@@ -1752,9 +1751,6 @@ static int pktgen_thread_show(struct seq_file *seq, void *v)
 
BUG_ON(!t);
 
-   seq_printf(seq, "Name: %s  max_before_softirq: %d\n",
-  t->tsk->comm, t->max_before_softirq);
-
seq_printf(seq, "Running: ");
 
if_lock(t);
@@ -1787,7 +1783,6 @@ static ssize_t pktgen_thread_write(struct file *file,
int i = 0, max, len, ret;
char name[40];
char *pg_result;
-   unsigned long value = 0;
 
if (count < 1) {
//  sprintf(pg_result, "Wrong command format");
@@ -1861,12 +1856,8 @@ static ssize_t pktgen_thread_write(struct file *file,
}
 
if (!strcmp(name, "max_before_softirq")) {
-   len = num_arg(&user_buffer[i], 10, &value);
-   mutex_lock(&pktgen_thread_lock);
-   t->max_before_softirq = value;
-   mutex_unlock(&pktgen_thread_lock);
+   sprintf(pg_result, "OK: Note! max_before_softirq is obsoleted -- Do not use");
ret = count;
-   sprintf(pg_result, "OK: max_before_softirq=%lu", value);
goto out;
}
 
@@ -2145,7 +2136,6 @@ static void spin(struct pktgen_dev *pkt_dev, __u64 spin_until_us)
if (spin_until_us - now > jiffies_to_usecs(1) + 1)
schedule_timeout_interruptible(1);
else if (spin_until_us - now > 100) {
-   do_softirq();
if (!pkt_dev->running)
return;
if (need_resched())
@@ -3515,8 +3505,6 @@ static int pktgen_thread_worker(void *arg)
struct pktgen_thread *t = arg;
struct pktgen_dev *pkt_dev = NULL;
int cpu = t->cpu;
-   u32 max_before_softirq;
-   u32 tx_since_softirq = 0;
 
BUG_ON(smp_processor_id() != cpu);
 
@@ -3526,8 +3514,6 @@ static int pktgen_thread_worker(void *arg)
 
pr_debug("pktgen: starting pktgen/%d:  pid=%d\n", cpu, current->pid);
 
-   max_before_softirq = t->max_before_softirq;
-
set_current_state(TASK_INTERRUPTIBLE);
 
set_freezable();
@@ -3546,24 +3532,9 @@ static int pktgen_thread_worker(void *arg)
 
__set_current_state(TASK_RUNNING);
 
-   if (pkt_dev) {
-
+   if (pkt_dev) 
pktgen_xmit(pkt_dev);
 
-   /*
-* We like to stay RUNNING but must also give
-* others fair share.
-*/
-
-   tx_since_softirq += pkt_dev->last_ok;
-
-   if (tx_since_softirq > max_before_softirq) {
-   if (local_softirq_pending())
-   do_softirq();
-   tx_since_softirq = 0;
-   }
-   }
-
if (t->control & T_STOP) {
pktgen_stop(t);
t->control &= ~(T_STOP);


Re: net/ipv4/fib_trie.c - compile error (Re: 2.6.23-rc3-mm1)

2007-08-27 Thread Paul E. McKenney
On Mon, Aug 27, 2007 at 08:36:35AM +0200, Jarek Poplawski wrote:
> On 22-08-2007 19:03, Paul E. McKenney wrote:
> > On Wed, Aug 22, 2007 at 05:41:11PM +0200, Adrian Bunk wrote:
> >> On Wed, Aug 22, 2007 at 05:30:13PM +0200, Gabriel C wrote:
> >>> Got it with a randconfig ( 
> >>> http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-8 )
> >>>
> >>> ...
> >>>
> >>> net/ipv4/fib_trie.c: In function 'trie_rebalance':
> >>> net/ipv4/fib_trie.c:969: error: lvalue required as unary '&' operand
> >>> net/ipv4/fib_trie.c:971: error: lvalue required as unary '&' operand
> >>> net/ipv4/fib_trie.c:977: error: lvalue required as unary '&' operand
> >>> net/ipv4/fib_trie.c:980: error: lvalue required as unary '&' operand
> >>> ...
> >> Side effect of the git-net removal, temporarily removing 
> >> immunize-rcu_dereference-against-crazy-compiler-writers.patch should 
> >> work around it.
> > 
> > Alternatively, the following one-line patch to net/ipv4/fib_trie.c could
> > be used.
> > 
> > Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
> > ---
> > 
> >  fib_trie.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff -urpNa -X dontdiff linux-2.6.23-rc3-mm1/net/ipv4/fib_trie.c 
> > linux-2.6.23-rc3-mm1.compile/net/ipv4/fib_trie.c
> > --- linux-2.6.23-rc3-mm1/net/ipv4/fib_trie.c2007-08-22 
> > 09:20:33.0 -0700
> > +++ linux-2.6.23-rc3-mm1.compile/net/ipv4/fib_trie.c2007-08-22 
> > 09:47:33.0 -0700
> > @@ -94,7 +94,7 @@ typedef unsigned int t_key;
> >  #define T_LEAF  1
> >  #define NODE_TYPE_MASK 0x1UL
> >  #define NODE_PARENT(node) \
> > -   ((struct tnode *)rcu_dereference(((node)->parent & ~NODE_TYPE_MASK)))
> > +   ((struct tnode *)(rcu_dereference((node)->parent) & ~NODE_TYPE_MASK))
> ...
> 
> After first reading of this thread I've had an impression it's about
> compiler's behavior, but now it seems to me this patch is not an
> alternative, but a 'must be' and only proper way of calling
> rcu_dereference (with a variable instead of an expression)? Am I
> right?

Yes, rcu_dereference() does indeed need to be invoked on a lvalue.

Thanx, Paul
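A minimal userspace sketch of the lvalue point, for anyone following along. It assumes a stand-in macro, sketch_dereference(), which is hypothetical and not the kernel's rcu_dereference(); like the patched macro in -mm it takes the address of its argument, which is what produces "lvalue required as unary '&' operand" when it is handed an expression. The struct and helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_TYPE_MASK 0x1UL

/* Hypothetical stand-in, not the kernel macro: it takes the address of
 * its argument, so the argument has to be an lvalue. */
#define sketch_dereference(p) (*(volatile __typeof__(p) *)&(p))

struct node {
	unsigned long parent;	/* parent address with the node type in bit 0 */
};

static struct node *node_parent(struct node *n)
{
	assert(n != NULL);
	/* Load the stored word first, then mask the type bit off the value.
	 * sketch_dereference(n->parent & ~NODE_TYPE_MASK) would not compile,
	 * because the masked expression is not an lvalue. */
	return (struct node *)(sketch_dereference(n->parent) & ~NODE_TYPE_MASK);
}
```

The ordering matches the one-line fib_trie.c fix: dereference the variable, then apply the mask to the loaded value.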


Re: [PATCH] isdn capi driver broken on 64 bit.

2007-08-27 Thread Stephen Hemminger
On Mon, 27 Aug 2007 13:02:26 +0200
Karsten Keil <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 24, 2007 at 11:08:11AM -0700, Stephen Hemminger wrote:
> > The following driver API is broken on any architecture with 64 bit
> > addresses because of a cast that loses the high bits.
> > 
> > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
> > 
> > 
> > --- a/drivers/isdn/capi/capidrv.c   2007-06-25 09:03:12.0 -0700
> > +++ b/drivers/isdn/capi/capidrv.c   2007-08-24 11:06:46.0 -0700
> > @@ -1855,6 +1855,9 @@ static int if_sendbuf(int id, int channe
> > return 0;
> > }
> > datahandle = nccip->datahandle;
> > +
> > +   /* This won't work on 64 bit! */
> > +   BUILD_BUG_ON(sizeof(skb->data) > sizeof(u32));
> > capi_fill_DATA_B3_REQ(&sendcmsg, global.ap.applid, card->msgid++,
> >   nccip->ncci,  /* adr */
> >   (u32) skb->data,  /* Data */
> 
> 
> NACK.
> 
> It is not a BUG.
> 
> This is OK, since this field must have a value, and on 32 bit it has the
> correct one. On 64 bit this field is ignored (but it still needs a value;
> random data would be bad as well).

If you are using it as a transaction ID, then you should generate one.
There is no guarantee that two skb's won't have the same 32 bit data value
on 64 bit.
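To make the collision concrete, here is a small userspace sketch. The addresses are invented, and the counter-based generator is only one hypothetical way of producing a transaction ID; nothing here is taken from capidrv.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What a (u32) cast of a 64-bit address keeps: the low 32 bits only. */
static uint32_t truncated_handle(uint64_t addr)
{
	return (uint32_t)addr;
}

/* Hypothetical alternative: hand out IDs from a per-connection counter. */
struct handle_gen {
	uint32_t next;
};

static uint32_t next_handle(struct handle_gen *g)
{
	assert(g != NULL);
	return g->next++;	/* unique until the counter wraps after 2^32 IDs */
}
```

Two buffers at 0x100001000 and 0x200001000 both truncate to 0x00001000, while the counter gives distinct values for consecutive requests.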
-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread Jan-Bernd Themann
On Monday 27 August 2007 17:51, James Chapman wrote:

> In the second half of my previous reply (which seems to have been 
> deleted), I suggest a way to avoid this problem without using hardware 
> interrupt mitigation / coalescing. Original text is quoted below.
> 
>  >> I've seen the same and I'm suggesting that the NAPI driver keeps
>  >> itself in polled mode for N polls or M jiffies after it sees
>  >> workdone=0. This has always worked for me in packet forwarding
>  >> scenarios to maximize packets/sec and minimize latency.
> 
> To implement this, there's no need for timers, hrtimers or generic NAPI 
> support that others have suggested. A driver's poll() would set an 
> internal flag and record the current jiffies value when finding 
> workdone=0 rather than doing an immediate napi_complete(). Early in 
> poll() it would test this flag and if set, do a low-cost test to see if 
> it had any work to do. If no work, it would check the saved jiffies 
> value and do the napi_complete() only if no work has been done for a 
> configurable number of jiffies. This keeps interrupts disabled longer at 
> the expense of many more calls to poll() where no work is done. So 
> critical to this scheme is modifying the driver's poll() to fastpath the 
> case of having no work to do while waiting for its local jiffy count to 
> expire.
> 

The problem I see with this approach is that the time that passes between
two jiffies might be too long for 10G ethernet adapters. (I tried to implement
a timer based approach with usual timers and the result was a disaster).
HW interrupts / or HP timer avoid the jiffy problem as they activate softIRQs
as soon as you call netif_rx_schedule. 
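A rough back-of-the-envelope for the jiffy concern, as a sketch only. The frame sizes are assumptions: 1538 bytes on the wire for a 1500-byte MTU frame (header, FCS, preamble and inter-frame gap included) and 84 bytes for a minimum-size frame. HZ values of 1000 and 250 are just common configurations, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Frames that can arrive during one jiffy on a saturated link.
 * wire_bytes is the on-wire size per frame, including preamble and
 * inter-frame gap; hz is the kernel tick rate. */
static uint64_t frames_per_jiffy(uint64_t link_bps, uint64_t wire_bytes,
				 uint64_t hz)
{
	assert(wire_bytes > 0 && hz > 0);
	return link_bps / (wire_bytes * 8) / hz;
}
```

Under those assumptions a 10 Gbit/s link delivers on the order of 800 full-size frames per jiffy at HZ=1000, and well over ten thousand minimum-size frames, so anything that waits a whole jiffy has a lot of queue to absorb.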


Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread James Chapman

David Miller wrote:
> From: James Chapman <[EMAIL PROTECTED]>
> Date: Sun, 26 Aug 2007 20:36:20 +0100
>
>> David Miller wrote:
>>> From: James Chapman <[EMAIL PROTECTED]>
>>> Date: Fri, 24 Aug 2007 18:16:45 +0100
>>>
>>>> Does hardware interrupt mitigation really interact well with NAPI?
>>>
>>> It interacts quite excellently.
>>
>> If NAPI disables interrupts and keeps them disabled while there are more
>> packets arriving or more transmits being completed, why do hardware
>> interrupt mitigation / coalescing features of the network silicon help?
>
> Because if your packet rate is low enough such that the cpu can
> process the interrupt fast enough and thus only one packet gets
> processed per NAPI poll, the cost of going into and out of NAPI mode
> dominates the packet processing costs.

In the second half of my previous reply (which seems to have been 
deleted), I suggest a way to avoid this problem without using hardware 
interrupt mitigation / coalescing. Original text is quoted below.


>> I've seen the same and I'm suggesting that the NAPI driver keeps
>> itself in polled mode for N polls or M jiffies after it sees
>> workdone=0. This has always worked for me in packet forwarding
>> scenarios to maximize packets/sec and minimize latency.

To implement this, there's no need for timers, hrtimers or generic NAPI 
support that others have suggested. A driver's poll() would set an 
internal flag and record the current jiffies value when finding 
workdone=0 rather than doing an immediate napi_complete(). Early in 
poll() it would test this flag and if set, do a low-cost test to see if 
it had any work to do. If no work, it would check the saved jiffies 
value and do the napi_complete() only if no work has been done for a 
configurable number of jiffies. This keeps interrupts disabled longer at 
the expense of many more calls to poll() where no work is done. So 
critical to this scheme is modifying the driver's poll() to fastpath the 
case of having no work to do while waiting for its local jiffy count to 
expire.


Here's an untested patch for tg3 that illustrates the idea.

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 710dccc..59e151b 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -3473,6 +3473,24 @@ static int tg3_poll(struct napi_struct *napi,
struct tg3_hw_status *sblk = tp->hw_status;
int work_done = 0;

+   /* fastpath having no work while we're holding ourself in
+* polled mode
+*/
+   if ((tp->exit_poll_time) && (!tg3_has_work(tp))) {
+   if (time_after(jiffies, tp->exit_poll_time)) {
+   tp->exit_poll_time = 0;
+   /* tell net stack and NIC we're done */
+   netif_rx_complete(netdev, napi);
+   tg3_restart_ints(tp);
+   }
+   return 0;
+   }
+
+   /* if we get here, there might be work to do so disable the
+* poll hold fastpath above
+*/
+   tp->exit_poll_time = 0;
+
/* handle link change and other phy events */
if (!(tp->tg3_flags &
  (TG3_FLAG_USE_LINKCHG_REG |
@@ -3511,11 +3529,11 @@ static int tg3_poll(struct napi_struct *napi,
} else
sblk->status &= ~SD_STATUS_UPDATED;

-   /* if no more work, tell net stack and NIC we're done */
-   if (!tg3_has_work(tp)) {
-   netif_rx_complete(netdev, napi);
-   tg3_restart_ints(tp);
-   }
+   /* if no more work, set the time in jiffies when we should
+* exit polled mode
+*/
+   if (!tg3_has_work(tp))
+   tp->exit_poll_time = jiffies + 2;

return work_done;
 }
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index a6a23bb..a0d24d3 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2163,6 +2163,7 @@ struct tg3 {
u32 last_tag;

u32 msg_enable;
+   unsigned long   exit_poll_time;

/* begin "tx thread" cacheline section */
void(*write32_tx_mbox) (struct tg3 *, u32,

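For readers who want to step through the hold-off logic without hardware, here is a userspace model of the state machine in the patch above. It is a sketch under assumptions: the names (poll_state, sketch_poll, sketch_time_after) are hypothetical, time is passed in explicitly instead of reading jiffies, and each call is assumed to drain everything that is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HOLD_JIFFIES 2UL	/* same hold time as the "jiffies + 2" in the patch */

/* Wrap-safe "a is after b", in the style of the kernel's time_after(). */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

struct poll_state {
	unsigned long exit_poll_time;	/* 0 means "not holding" */
	bool polled_mode;		/* true while interrupts stay disabled */
};

/* One poll() call at time "now" with "pending" packets waiting.
 * Returns the number of packets processed. */
static int sketch_poll(struct poll_state *s, unsigned long now, int pending)
{
	assert(s != NULL && pending >= 0);

	/* Fastpath: holding in polled mode and nothing to do. */
	if (s->exit_poll_time && pending == 0) {
		if (sketch_time_after(now, s->exit_poll_time)) {
			s->exit_poll_time = 0;
			s->polled_mode = false;	/* complete NAPI, re-enable interrupts */
		}
		return 0;
	}

	/* There may be work: leave the hold state and process it all. */
	s->exit_poll_time = 0;

	/* The sketch always drains the queue, so no work remains here and
	 * the hold timer is armed instead of completing immediately. */
	s->exit_poll_time = now + HOLD_JIFFIES;
	return pending;
}
```

One thing the model makes visible: because 0 doubles as "not holding", a hold deadline that happens to compute to 0 when the counter wraps would be treated as no hold at all.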

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development



Re: Problem with semantics?

2007-08-27 Thread Michael Kerrisk
Hi Andi,

Andi Kleen wrote:
> Shay Goikhman <[EMAIL PROTECTED]> writes:
> 
>> Dear Linux maintainers,
>>
>>  I'm doing :
>>
>>   setsockopt(s, SO_RCVTIMEO, t1);  // set time-out t1 on socket while
>>                                    // blocked receiving on it
>>   select(,,, &fd_set_including(s), .., &errs, t2);  // block till receive
>>                                    // or time-out t2 jointly on a set of sockets
>>
>> Apparently, I could not find any reference to the coupled behavior of the
>> two above statements in the Linux documentation.
>> As I understand the blocking semantics, I would expect that if t1 < t2,
>> select should return after t1 with the descriptor 's' in 'errs' if 's' does
>> not become readable in the t1 interval.
>>
>> It is not so in life -- select ignores t1 altogether.
>>
>> Do you have some enlightening knowledge on the matter?
> 
> RCVTIMEO only applies to recvmsg et al., just as SNDTIMEO only
> applies to sendmsg etc. But select/poll only report events, they
> do not actually send or receive by themselves.
> 
> Michael, perhaps you can clarify that in the manpages

I added the following to socket.7:

  Timeouts have effect for socket I/O calls (read(2), recv(2),
  recvfrom(2), recvmsg(2), write(2), send(2), sendto(2),
  sendmsg(2)); timeouts have no effect for select(2), poll(2),
  epoll_wait(2), etc.

The change will be in man-pages-2.65.
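A small test program along these lines can confirm the behavior on a given system. This is a sketch, not part of the man page change: the helper names are hypothetical, it uses an idle AF_UNIX datagram socketpair as the socket, and the timeout values in the usage note are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

static long elapsed_ms(const struct timespec *a, const struct timespec *b)
{
	return (long)(b->tv_sec - a->tv_sec) * 1000L +
	       (b->tv_nsec - a->tv_nsec) / 1000000L;
}

/* Sets SO_RCVTIMEO to rcv_ms on an idle socket, then waits in select()
 * with a timeout of sel_ms. Returns how long select() blocked, in
 * milliseconds, or -1 on an unexpected result. */
static long select_wait_ms(long rcv_ms, long sel_ms)
{
	int sv[2];
	struct timeval rcv = { .tv_sec = rcv_ms / 1000,
			       .tv_usec = (rcv_ms % 1000) * 1000 };
	struct timeval sel = { .tv_sec = sel_ms / 1000,
			       .tv_usec = (sel_ms % 1000) * 1000 };
	struct timespec t0, t1;
	fd_set rfds;
	char c;
	long waited = -1;

	assert(rcv_ms >= 0 && sel_ms >= 0);
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &rcv, sizeof(rcv)) < 0)
		goto out;

	FD_ZERO(&rfds);
	FD_SET(sv[0], &rfds);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (select(sv[0] + 1, &rfds, NULL, NULL, &sel) != 0)
		goto out;	/* expected 0: timed out with nothing readable */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	/* recv() is where SO_RCVTIMEO applies: it gives up with EAGAIN
	 * or EWOULDBLOCK once the receive timeout expires. */
	if (recv(sv[0], &c, 1, 0) == -1 &&
	    (errno == EAGAIN || errno == EWOULDBLOCK))
		waited = elapsed_ms(&t0, &t1);
out:
	close(sv[0]);
	close(sv[1]);
	return waited;
}
```

Calling select_wait_ms(20, 200) should report a wait close to 200 ms rather than 20 ms, since select() follows only its own timeout argument.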

Thanks for your note.

Cheers,

Michael

-- 
Michael Kerrisk
maintainer of Linux man pages Sections 2, 3, 4, 5, and 7

Want to help with man page maintenance?  Grab the latest tarball at
http://www.kernel.org/pub/linux/docs/manpages/
read the HOWTOHELP file and grep the source files for 'FIXME'.


Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread Joe Perches
On Mon, 2007-08-27 at 12:54 +0200, Johannes Berg wrote:
> Thanks for this patch though, I'd have done it otherwise.

I had it, it was just a s/EUI48/MAC/ and copy/paste thing.

There are also the arch, drivers/[^net], and net directories
that have a few of these.

The patch also added the missing ")" in drivers/net/82596.c

> I was rereading your original conversion and noticed that it is now
> trivial to make the kernel smaller like you originally wanted by doing
> something like this
> 
> -- define this function somewhere --
> void print_mac(u8 *mac, char *buf)
> {
> 	sprintf(buf, "%02x:...", mac[0], mac[1], ...);
> }
> EXPORT_SYMBOL(print_mac);
> 
> -- change macros to --
> #define MAC_FMT "%s"
> #define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})
> 
> I'm not sure we'd want that, but at the time you said it made the kernel
> significantly smaller and I doubt there's a performance problem with it
> (who prints mac addresses regularly?)

The reduction is ~.1% in an allyesconfig.
The compound statement that hides the automatic buffer works well.
The function call is noise compared to the printk.
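For comparison, a userspace sketch of the out-of-line formatter with a caller-owned buffer. This is a variant, not the macro proposed above: as far as I can tell, a buffer declared inside a ({ ... }) block stops existing when the block ends, so handing its address on to printk depends on that stack slot not being reused. The MAC_BUF_SIZE name and the argument order are assumptions for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAC_BUF_SIZE 18	/* "xx:xx:xx:xx:xx:xx" plus the terminating NUL */

/* Formats a 6-byte MAC address into a caller-owned buffer and returns
 * the buffer, so the call can sit directly in a printf() argument list. */
static char *print_mac(char buf[MAC_BUF_SIZE], const unsigned char mac[6])
{
	assert(buf != NULL && mac != NULL);
	snprintf(buf, MAC_BUF_SIZE, "%02x:%02x:%02x:%02x:%02x:%02x",
		 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
	return buf;
}
```

A caller would declare char buf[MAC_BUF_SIZE] once and write printf("%s\n", print_mac(buf, addr)); the format string shrinks to a single "%s" the same way MAC_FMT would.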

cheers, Joe



[PATCH 1/2] ehea: propagate physical port state

2007-08-27 Thread Jan-Bernd Themann
Introduces a module parameter to decide whether the physical
port link state is propagated to the network stack or not.
It makes sense not to take the physical port state into account
on machines with several logical partitions that communicate
with each other: that communication is always possible, no matter
what the physical port state is, so eHEA can be considered a
switch there.

Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]>

---
 drivers/net/ehea/ehea.h  |5 -
 drivers/net/ehea/ehea_main.c |   14 +-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h
index d67f97b..8d58be5 100644
--- a/drivers/net/ehea/ehea.h
+++ b/drivers/net/ehea/ehea.h
@@ -39,7 +39,7 @@
 #include 
 
 #define DRV_NAME   "ehea"
-#define DRV_VERSION"EHEA_0073"
+#define DRV_VERSION"EHEA_0074"
 
 /* eHEA capability flags */
 #define DLPAR_PORT_ADD_REM 1
@@ -402,6 +402,8 @@ struct ehea_mc_list {
 
 #define EHEA_PORT_UP 1
 #define EHEA_PORT_DOWN 0
+#define EHEA_PHY_LINK_UP 1
+#define EHEA_PHY_LINK_DOWN 0
 #define EHEA_MAX_PORT_RES 16
 struct ehea_port {
struct ehea_adapter *adapter;/* adapter that owns this port */
@@ -427,6 +429,7 @@ struct ehea_port {
u32 msg_enable;
u32 sig_comp_iv;
u32 state;
+   u8 phy_link;
u8 full_duplex;
u8 autoneg;
u8 num_def_qps;
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index db57474..1e9fd6f 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -53,17 +53,21 @@ static int rq3_entries = EHEA_DEF_ENTRIES_RQ3;
 static int sq_entries = EHEA_DEF_ENTRIES_SQ;
 static int use_mcs = 0;
 static int num_tx_qps = EHEA_NUM_TX_QP;
+static int prop_carrier_state = 0;
 
 module_param(msg_level, int, 0);
 module_param(rq1_entries, int, 0);
 module_param(rq2_entries, int, 0);
 module_param(rq3_entries, int, 0);
 module_param(sq_entries, int, 0);
+module_param(prop_carrier_state, int, 0);
 module_param(use_mcs, int, 0);
 module_param(num_tx_qps, int, 0);
 
 MODULE_PARM_DESC(num_tx_qps, "Number of TX-QPS");
 MODULE_PARM_DESC(msg_level, "msg_level");
+MODULE_PARM_DESC(prop_carrier_state, "Propagate carrier state of physical "
+"port to stack. 1:yes, 0:no.  Default = 0 ");
 MODULE_PARM_DESC(rq3_entries, "Number of entries for Receive Queue 3 "
 "[2^x - 1], x = [6..14]. Default = "
 __MODULE_STRING(EHEA_DEF_ENTRIES_RQ3) ")");
@@ -814,7 +818,9 @@ int ehea_set_portspeed(struct ehea_port *port, u32 port_speed)
ehea_error("Failed setting port speed");
}
}
-   netif_carrier_on(port->netdev);
+   if (!prop_carrier_state || (port->phy_link == EHEA_PHY_LINK_UP))
+   netif_carrier_on(port->netdev);
+
kfree(cb4);
 out:
return ret;
@@ -869,13 +875,19 @@ static void ehea_parse_eqe(struct ehea_adapter *adapter, u64 eqe)
}
 
if (EHEA_BMASK_GET(NEQE_EXTSWITCH_PORT_UP, eqe)) {
+   port->phy_link = EHEA_PHY_LINK_UP;
if (netif_msg_link(port))
ehea_info("%s: Physical port up",
  port->netdev->name);
+   if (prop_carrier_state)
+   netif_carrier_on(port->netdev);
} else {
+   port->phy_link = EHEA_PHY_LINK_DOWN;
if (netif_msg_link(port))
ehea_info("%s: Physical port down",
  port->netdev->name);
+   if (prop_carrier_state)
+   netif_carrier_off(port->netdev);
}
 
if (EHEA_BMASK_GET(NEQE_EXTSWITCH_PRIMARY, eqe))
-- 
1.5.2
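The carrier decision the patch introduces boils down to a two-input truth table. A sketch of it as a pure function, assuming the port is otherwise logically up; carrier_reported_up is a hypothetical helper name, not something in the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define EHEA_PHY_LINK_UP   1
#define EHEA_PHY_LINK_DOWN 0

/* Carrier reported to the stack for a logically "up" port.
 * prop_carrier_state mirrors the module parameter from the patch. */
static bool carrier_reported_up(int prop_carrier_state, int phy_link)
{
	assert(phy_link == EHEA_PHY_LINK_UP || phy_link == EHEA_PHY_LINK_DOWN);
	if (!prop_carrier_state)
		return true;	/* default: ignore the physical port */
	return phy_link == EHEA_PHY_LINK_UP;
}
```

With the default prop_carrier_state=0 the stack sees carrier regardless of the external port, which is the partition-to-partition case described in the changelog.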



[PATCH 2/2] ehea: fix last_rx update

2007-08-27 Thread Jan-Bernd Themann
Update last_rx in registered device struct instead of
in the dummy device.

Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]>

---
 drivers/net/ehea/ehea_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 1e9fd6f..717b129 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -471,7 +471,7 @@ static struct ehea_cqe *ehea_proc_rwqes(struct net_device *dev,
else
netif_receive_skb(skb);
 
-   dev->last_rx = jiffies;
+   port->netdev->last_rx = jiffies;
} else {
pr->p_stats.poll_receive_errors++;
port_reset = ehea_treat_poll_error(pr, rq, cqe,
-- 
1.5.2



pktgen Multiqueue support [PATCH]

2007-08-27 Thread Robert Olsson


Hello,

Below is some pktgen support for sending into different TX queues.
The traffic can of course also be fed into input queues on other machines.

Cheers
--ro



Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>


diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index a0db4d1..18601af 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -186,6 +186,7 @@
 #define F_SVID_RND(1<<10)  /* Random SVLAN ID */
 #define F_FLOW_SEQ(1<<11)  /* Sequential flows */
 #define F_IPSEC_ON(1<<12)  /* ipsec on for flows */
+#define F_QUEUE_MAP_RND (1<<13)/* queue map Random */
 
 /* Thread control flag bits */
 #define T_TERMINATE   (1<<0)
@@ -328,6 +329,7 @@ struct pktgen_dev {
__be32 cur_daddr;
__u16 cur_udp_dst;
__u16 cur_udp_src;
+   __u16 cur_queue_map;
__u32 cur_pkt_size;
 
__u8 hh[14];
@@ -355,6 +357,10 @@ struct pktgen_dev {
unsigned lflow; /* Flow length  (config) */
unsigned nflows;/* accumulated flows (stats) */
unsigned curfl; /* current sequenced flow (state)*/
+
+   u16 queue_map_min;
+   u16 queue_map_max;
+
 #ifdef CONFIG_XFRM
__u8ipsmode;/* IPSEC mode (config) */
__u8ipsproto;   /* IPSEC type (config) */
@@ -611,6 +617,11 @@ static int pktgen_if_show(struct seq_file *seq, void *v)
seq_printf(seq, " flows: %u flowlen: %u\n", pkt_dev->cflows,
   pkt_dev->lflow);
 
+   seq_printf(seq,
+  " queue_map_min: %u  queue_map_max: %u\n",
+  pkt_dev->queue_map_min,
+  pkt_dev->queue_map_max);
+
if (pkt_dev->flags & F_IPV6) {
char b1[128], b2[128], b3[128];
fmt_ip6(b1, pkt_dev->in6_saddr.s6_addr);
@@ -707,6 +718,9 @@ static int pktgen_if_show(struct seq_file *seq, void *v)
if (pkt_dev->flags & F_MPLS_RND)
seq_printf(seq,  "MPLS_RND  ");
 
+   if (pkt_dev->flags & F_QUEUE_MAP_RND)
+   seq_printf(seq,  "QUEUE_MAP_RND  ");
+
if (pkt_dev->cflows) {
if (pkt_dev->flags & F_FLOW_SEQ)
seq_printf(seq,  "FLOW_SEQ  "); /*in sequence flows*/
@@ -762,6 +776,8 @@ static int pktgen_if_show(struct seq_file *seq, void *v)
seq_printf(seq, " cur_udp_dst: %d  cur_udp_src: %d\n",
   pkt_dev->cur_udp_dst, pkt_dev->cur_udp_src);
 
+   seq_printf(seq, " cur_queue_map: %u\n", pkt_dev->cur_queue_map);
+
seq_printf(seq, " flows: %u\n", pkt_dev->nflows);
 
if (pkt_dev->result[0])
@@ -1213,6 +1229,11 @@ static ssize_t pktgen_if_write(struct file *file,
else if (strcmp(f, "FLOW_SEQ") == 0)
pkt_dev->flags |= F_FLOW_SEQ;
 
+   else if (strcmp(f, "QUEUE_MAP_RND") == 0)
+   pkt_dev->flags |= F_QUEUE_MAP_RND;
+
+   else if (strcmp(f, "!QUEUE_MAP_RND") == 0)
+   pkt_dev->flags &= ~F_QUEUE_MAP_RND;
 #ifdef CONFIG_XFRM
else if (strcmp(f, "IPSEC") == 0)
pkt_dev->flags |= F_IPSEC_ON;
@@ -1517,6 +1538,28 @@ static ssize_t pktgen_if_write(struct file *file,
return count;
}
 
+   if (!strcmp(name, "queue_map_min")) {
+   len = num_arg(&user_buffer[i], 5, &value);
+   if (len < 0) {
+   return len;
+   }
+   i += len;
+   pkt_dev->queue_map_min = value;
+   sprintf(pg_result, "OK: queue_map_min=%u", pkt_dev->queue_map_min);
+   return count;
+   }
+
+   if (!strcmp(name, "queue_map_max")) {
+   len = num_arg(&user_buffer[i], 5, &value);
+   if (len < 0) {
+   return len;
+   }
+   i += len;
+   pkt_dev->queue_map_max = value;
+   sprintf(pg_result, "OK: queue_map_max=%u", pkt_dev->queue_map_max);
+   return count;
+   }
+
if (!strcmp(name, "mpls")) {
unsigned n, offset;
len = get_labels(&user_buffer[i], pkt_dev);
@@ -2378,6 +2421,20 @@ static void mod_cur_headers(struct pktgen_dev *pkt_dev)
pkt_dev->cur_pkt_size = t;
}
 
+   if (pkt_dev->queue_map_min < pkt_dev->queue_map_max) {
+   __u16 t;
+   if (pkt_dev->flags & F_QUEUE_MAP_RND) {
+   t = random32() %
+   (pkt_dev->queue_map_max - pkt_dev->queue_map_min + 1)
+   + pkt_dev->queue_map_min;
+   } else {
+   t = pkt_dev->cur_queue_map + 1;
+   if (t > pkt_dev->queue_map_max)
+   t = pkt_dev->queue_map_min;
+   }
+   pkt_dev->cur_queue_map = t;
+   }
+
pkt_dev->flows[flow].count++;
 }
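The queue selection in the mod_cur_headers() hunk can be tried out in userspace with a sketch like the one below. It is not the patch code: next_queue_map is a hypothetical helper, the random value is passed in rather than taken from random32(), and the extra lower-bound check in the sequential branch is an addition of this sketch (without it, a counter that starts at 0 walks up through values below queue_map_min first).

```c
#include <assert.h>
#include <stdint.h>

/* Next TX queue index given the current one and the configured range. */
static uint16_t next_queue_map(uint16_t cur, uint16_t min, uint16_t max,
			       int randomize, uint32_t rnd)
{
	uint32_t t;

	if (min >= max)
		return cur;	/* like the patch: nothing changes unless min < max */

	if (randomize) {
		/* Uniform pick from the closed range [min, max]. */
		t = rnd % (uint32_t)(max - min + 1) + min;
	} else {
		/* Round robin; the t < min check is an addition in this sketch. */
		t = (uint32_t)cur + 1;
		if (t > max || t < min)
			t = min;
	}
	return (uint16_t)t;
}
```

With min=2 and max=5 the sequential mode cycles 2, 3, 4, 5, 2, ... and the random mode maps any 32-bit value into the same closed range.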

Re: net-2.6.24 build broken (allyesconfig)

2007-08-27 Thread Johannes Berg
On Mon, 2007-08-27 at 15:32 +0300, Ilpo Järvinen wrote:

> drivers/net/82596.c:1618:1: error: unterminated argument list invoking 
> macro "DEB"

> Hmm, I would guess that "[NET]: Introduce MAC_FMT/MAC_ARG" broke it, 
> though I didn't verify it.
> 
> The fix is left as an exercise for the reader (i.e., the solution wasn't 
> too obvious for me :-) )...

Yup, my fault, sorry about that.

From: Johannes Berg <[EMAIL PROTECTED]>
Subject: fix MAC_FMT/MAC_ARG in 82596.c

This fixes a typo in commit f98d4ca4986fec.

Signed-off-by: Johannes Berg <[EMAIL PROTECTED]>

---
 drivers/net/82596.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- netdev-2.6.orig/drivers/net/82596.c 2007-08-27 14:48:19.674646075 +0200
+++ netdev-2.6/drivers/net/82596.c  2007-08-27 14:48:21.674646075 +0200
@@ -1562,7 +1562,7 @@ static void set_multicast_list(struct ne
memcpy(cp, dmi->dmi_addr, 6);
if (i596_debug > 1)
DEB(DEB_MULTI,printk(KERN_INFO "%s: Adding address " MAC_FMT "\n",
-   dev->name, MAC_ARG(cp));
+   dev->name, MAC_ARG(cp)));
}
i596_add_cmd(dev, &cmd->cmd);
}
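Why one missing ")" produces "unterminated argument list invoking macro": the printk call is itself the second argument of a function-like macro, so it needs its own closing parenthesis plus one more for the macro. A userspace sketch, with SKETCH_DEB as a hypothetical macro in the style of DEB() and an invented bit value for DEB_MULTI:

```c
#include <assert.h>
#include <stdio.h>

#define DEB_MULTI 0x0004	/* hypothetical bit value for this sketch */

static int sketch_debug = DEB_MULTI;
static int sketch_logged;

/* Hypothetical two-argument debug macro in the style of DEB(). */
#define SKETCH_DEB(mask, stmt) \
	do { if (sketch_debug & (mask)) { stmt; } } while (0)

static int log_line(const char *name, const char *mac)
{
	sketch_logged++;
	return printf("%s: Adding address %s\n", name, mac);
}

static void add_address(const char *name, const char *mac)
{
	/* Two closing parentheses: one ends log_line(), one ends SKETCH_DEB().
	 * With only one, the preprocessor never finds the end of the macro's
	 * argument list. */
	SKETCH_DEB(DEB_MULTI, log_line(name, mac));
}
```

Dropping the last ")" from the SKETCH_DEB line reproduces the same class of error, reported at the end of the file rather than at the line with the typo.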




Re: PROBLEM: 2.6.23-rc "NETDEV WATCHDOG: eth0: transmit timed out"

2007-08-27 Thread Jarek Poplawski
On 21-08-2007 12:56, Karl Meyer wrote:
> fyi:
> I do not know whether it is related to the problem, but since using
> the version you told me there are these entries is my log:
> frege Hangcheck: hangcheck value past margin!
...

BTW, I don't know whether it's related either, but I think you should
first try to get rid of these errors:

> Freeing unused kernel memory: 220k freed
> usb_id[1320]: segfault at  eip b7e25db2 esp bfd1d734 error 4
> usb_id[1329]: segfault at  eip b7e1bdb2 esp bf9c9224 error 4
> usb_id[1322]: segfault at  eip b7df3db2 esp bfcb66c4 error 4
> usb_id[1321]: segfault at  eip b7e11db2 esp bf8f4b04 error 4

Regards,
Jarek P.


[NET] 82596: Add missing parenthesis

2007-08-27 Thread Thomas Graf

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.24/drivers/net/82596.c
===
--- net-2.6.24.orig/drivers/net/82596.c 2007-08-27 14:43:16.0 +0200
+++ net-2.6.24/drivers/net/82596.c  2007-08-27 14:43:51.0 +0200
@@ -1562,7 +1562,7 @@ static void set_multicast_list(struct ne
memcpy(cp, dmi->dmi_addr, 6);
if (i596_debug > 1)
DEB(DEB_MULTI,printk(KERN_INFO "%s: Adding address " MAC_FMT "\n",
-   dev->name, MAC_ARG(cp));
+   dev->name, MAC_ARG(cp)));
}
i596_add_cmd(dev, &cmd->cmd);
}


net-2.6.24 build broken (allyesconfig)

2007-08-27 Thread Ilpo Järvinen
Hi,

$ git-reset --hard net-2.6.24-origin
HEAD is now at e2eb35e... [ATM]: Fix build errors after conversion to pr_debug()
$ make allyesconfig
...
$ make bzImage
...
  CC  drivers/net/82596.o
drivers/net/82596.c:1618:1: error: unterminated argument list invoking macro "DEB"
drivers/net/82596.c: In function 'set_multicast_list':
drivers/net/82596.c:1564: error: 'DEB' undeclared (first use in this function)
drivers/net/82596.c:1564: error: (Each undeclared identifier is reported only once
drivers/net/82596.c:1564: error: for each function it appears in.)
drivers/net/82596.c:1564: error: expected ';' at end of input
make[1]: *** [drivers/net/82596.o] Error 1
make: *** [drivers/net/82596.o] Error 2


Hmm, I would guess that "[NET]: Introduce MAC_FMT/MAC_ARG" broke it, 
though I didn't verify it.

The fix is left as an exercise for the reader (i.e., the solution wasn't 
too obvious for me :-) )...


--
 i.


Re: [PATCH] netdevice: kernel docbook addition

2007-08-27 Thread Jarek Poplawski
On 22-08-2007 21:33, Stephen Hemminger wrote:
> Add more kernel doc's for part of the network device API.
> This is only a start, and needs more work.
> 
> Applies against net-2.6.24
...
> +/**
> + *   napi_disable - prevent NAPI from scheduling
> + *   @n: napi context
> + *
> + * Resume NAPI from being scheduled on this context.
> + * Must be paired with napi_disable.
> + */
>  static inline void napi_enable(struct napi_struct *n)
>  {

It looks like a small fix is needed here: the comment sits on
napi_enable(), but its title line still says napi_disable (I hope it
will be faster without a patch from me, thanks).

Jarek P.


Re: [Devel] [PATCH 1/1] Dynamically allocate the loopback device

2007-08-27 Thread Eric W. Biederman
Stephen Hemminger <[EMAIL PROTECTED]> writes:

> On Fri, 24 Aug 2007 19:55:47 +0400
> "Denis V. Lunev" <[EMAIL PROTECTED]> wrote:
>
>> [EMAIL PROTECTED] wrote:
>> > From: Daniel Lezcano <[EMAIL PROTECTED]>
>> > 
>> > Doing this makes loopback.c a better example of how to do a
>> > simple network device, and it removes the special case
>> > single static allocation of a struct net_device, hopefully
>> > making maintenance easier.
>> > 
>> > Applies against net-2.6.24
>> > 
>> > Tested on i386, x86_64
>> > Compiled on ia64, sparc
>> 
>> I think that a small note, that initialization order is changed will be
>> good to record. After this, loopback MUST be allocated before any other
>> networking subsystem initialization. And this is an important change.
>> 
>> Regards,
>> Den
>
> Yes, this code would break when other drivers are directly linked
> in. 

No. Other drivers don't care at all about the loopback device,
and it isn't a requirement that the loopback device be initialized
before other devices.

The requirement is that the loopback device is allocated before
we start using it.  That means networking subsystems like
ipv4 and ipv6 care, not other network drivers.  In practice this means
before we get very far into the ipv4 subsystem initialization, as
ipv4 is always compiled in and is initialized early.

To get the initialization order correct I used fs_initcall instead of
module_init.
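A toy model of why the initcall level matters, for readers not familiar with the mechanism. This is a sketch under stated assumptions: the real kernel orders built-in initcalls through linker sections rather than a table, and the function names here are invented. What it takes from the kernel is only the level numbering, where fs_initcall is level 5 and module_init in built-in code is a device-level (6) initcall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { LEVEL_FS = 5, LEVEL_DEVICE = 6, LEVEL_MAX = 8 };

struct initcall {
	int level;
	void (*fn)(void);
};

#define MAX_LOG 8
static const char *init_log[MAX_LOG];
static size_t init_count;

static void record(const char *name)
{
	if (init_count < MAX_LOG)
		init_log[init_count++] = name;
}

static void nic_driver_init(void) { record("nic"); }
static void loopback_init(void)   { record("loopback"); }

/* Run every registered call, lowest level first; calls on the same
 * level keep their table (link) order. */
static void run_initcalls(const struct initcall *calls, size_t n)
{
	assert(calls != NULL);
	for (int level = 0; level < LEVEL_MAX; level++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == level)
				calls[i].fn();
}
```

Even when the driver is listed first, registering the loopback setup one level earlier makes it run first; calls on the same level still depend on link order.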

When I reflect on it, I'm not really comfortable with the fact
that we currently start using the loopback device before we
finish initializing and registering it.  It has worked
for over a decade though, so I guess early on we don't care about
much more than the address of the loopback device.

From what I can tell the initialization order dependency seems much
less subtle and much more robust than separate rules for allocating
the loopback device.  We have had several patches recently that
broke (including one merged upstream).  The only way I can see
to break an initialization order dependency is to go deliberately
messing around with initialization order.

Eric

p.s.  My apologies for the late reply; someone dropped me from the cc,
and I have been under the weather all week.


Re: [PATCH] isdn capi driver broken on 64 bit.

2007-08-27 Thread Karsten Keil
On Fri, Aug 24, 2007 at 11:08:11AM -0700, Stephen Hemminger wrote:
> The following driver API is broken on any architecture with 64 bit addresses
> because of a cast that loses the high bits.
> 
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
> 
> 
> --- a/drivers/isdn/capi/capidrv.c 2007-06-25 09:03:12.0 -0700
> +++ b/drivers/isdn/capi/capidrv.c 2007-08-24 11:06:46.0 -0700
> @@ -1855,6 +1855,9 @@ static int if_sendbuf(int id, int channe
>   return 0;
>   }
>   datahandle = nccip->datahandle;
> +
> + /* This won't work on 64 bit! */
> + BUILD_BUG_ON(sizeof(skb->data) > sizeof(u32));
>   capi_fill_DATA_B3_REQ(&sendcmsg, global.ap.applid, card->msgid++,
> nccip->ncci,  /* adr */
> (u32) skb->data,  /* Data */


NACK.

It is not a BUG.

This is OK, since this field must have a value, and on 32 bit it has the
correct one. On 64 bit this field is ignored (but it still needs a value;
random data would be bad as well).



Re: [PATCH net-2.6.24] introduce MAC_FMT/MAC_ARG

2007-08-27 Thread Johannes Berg
On Sat, 2007-08-25 at 17:09 -0700, Joe Perches wrote:
> 
> @@ -747,22 +741,15 @@ static int ariadne_rx(struct net_device *dev)
> skb_copy_to_linear_data(skb, (char *)priv->rx_buff[entry],
> pkt_len);
> skb->protocol=eth_type_trans(skb,dev);
>  #if 0
>  (int)skb->len);
> +{
> +   printk(KERN_DEBUG "RX pkt type 0x%04x"
> +  " from " MAC_FMT " to " MAC_FMT
> +  " data 0x%08x len %d\n",
> +  ((u_short *)skb->data)[6],
> +  MAC_ARG(((u_char *)skb->data)+6),
> +  MAC_ARG((u_char *)skb->data),
> +  (int)skb->data, (int)skb->len);
> +}
>  #endif 

You could drop the braces since there are no variables there any more.

Thanks for this patch though, I'd have done it otherwise.

I was rereading your original conversion and noticed that it is now
trivial to make the kernel smaller like you originally wanted by doing
something like this

-- define this function somewhere --
void print_mac(u8 *mac, char *buf)
{
	sprintf(buf, "%02x:...", mac[0], mac[1], ...);
}
EXPORT_SYMBOL(print_mac);

-- change macros to --
#define MAC_FMT "%s"
#define MAC_ARG(a) ({char __buf[18]; print_mac(a, __buf); __buf;})

I'm not sure we'd want that, but at the time you said it made the kernel
significantly smaller and I doubt there's a performance problem with it
(who prints mac addresses regularly?)
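
As a rough illustration of the idea, here is a userspace sketch of such a
formatting helper. It is not the proposed kernel function: format_mac is a
hypothetical name, it uses snprintf instead of sprintf, and it takes the
buffer from the caller rather than from a macro:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the print_mac() idea:
 * formats six bytes as "xx:xx:xx:xx:xx:xx" into a caller buffer.
 * Returns buf so the result can be handed straight to a "%s". */
char *format_mac(const unsigned char mac[6], char *buf, size_t len)
{
	/* 17 characters plus the terminating NUL need 18 bytes */
	snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
		 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
	return buf;
}
```

With this shape every call site shares one format string, which is where
the size saving would come from; the caller owns the 18-byte buffer.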

johannes




Re: RFC: issues concerning the next NAPI interface

2007-08-27 Thread Jan-Bernd Themann
On Monday 27 August 2007 03:58, David Miller wrote:
> From: James Chapman <[EMAIL PROTECTED]>
> Date: Sun, 26 Aug 2007 20:36:20 +0100
> 
> > David Miller wrote:
> > > From: James Chapman <[EMAIL PROTECTED]>
> > > Date: Fri, 24 Aug 2007 18:16:45 +0100
> > > 
> > >> Does hardware interrupt mitigation really interact well with NAPI?
> > > 
> > > It interacts quite excellently.
> > 
> > If NAPI disables interrupts and keeps them disabled while there are more 
> > packets arriving or more transmits being completed, why do hardware 
> > interrupt mitigation / coalescing features of the network silicon help?
> 
> Because if your packet rate is low enough such that the cpu can
> process the interrupt fast enough and thus only one packet gets
> processed per NAPI poll, the cost of going into and out of NAPI mode
> dominates the packet processing costs.

As far as I understand your argument, NAPI is supposed to work well only
for HW with coalescing features (as far as reducing the interrupt rate goes).
NAPI itself does not provide a reliable way to reduce the
number of interrupts, especially not on systems with only 1 NIC.
NAPI will only wait for some time when the budget is exceeded
and the softIRQs don't call net_rx_action again. This seems to be the case
after 10 rounds. That means NAPI really only waits after 300 x 10 packets
have been processed in a row (worst case).
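
The worst-case figure can be checked with a toy model. This is not kernel
code: poll_until_deferred is a made-up function, and the budget of 300 and
the limit of 10 rounds are simply the numbers quoted above:

```c
/* Toy model of back-to-back poll rounds, not kernel code.
 * Each round handles at most 'budget' packets from 'backlog';
 * polling stops when the backlog is empty or 'max_rounds' is hit.
 * Returns the packets handled and stores the rounds used. */
unsigned int poll_until_deferred(unsigned int backlog, unsigned int budget,
				 unsigned int max_rounds,
				 unsigned int *rounds_used)
{
	unsigned int handled = 0;
	unsigned int rounds = 0;

	while (rounds < max_rounds && backlog > 0) {
		unsigned int n = backlog < budget ? backlog : budget;

		backlog -= n;
		handled += n;
		rounds++;
	}

	*rounds_used = rounds;
	return handled;
}
```

With a backlog that never drains, a budget of 300 and 10 rounds, the model
handles 3000 packets before it backs off; a single pending packet ends the
loop after one round.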

As a matter of fact there is HW that does not have this feature. There seems
to be HW which does not work well with plain NAPI.
This HW performs well if the operating system supports HP timers
(and when they are used either in the device driver or in the polling engine).

So the question is simply: do we want each driver that needs (or benefits
from) timer-based polling to implement its own timers,
or should there be generic support?

Thanks,
Jan-Bernd



pktgen multiqueue oops

2007-08-27 Thread Robert Olsson

Hello,

Initially pkt_dev->skb can be NULL; this causes netif_subqueue_stopped to
oops. The patch below should cure it. But maybe the pktgen TX logic
should be reworked to better handle the new multiqueue support.
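
The guard in the patch relies on && evaluating left to right and stopping
early. A standalone sketch of that pattern, with toy types and a toy
"stopped" lookup that are assumptions and not the pktgen or netdevice
structures:

```c
#include <stddef.h>

/* Toy stand-ins, not the real kernel structures. */
struct toy_skb {
	int queue_mapping;
};

struct toy_pktgen_dev {
	struct toy_skb *skb;	/* may be NULL before the first packet */
};

/* Hypothetical per-queue lookup; in this sketch only queue 1 is stopped. */
int toy_subqueue_stopped(int queue)
{
	return queue == 1;
}

int toy_tx_blocked(const struct toy_pktgen_dev *pkt_dev)
{
	/* && short-circuits: skb is dereferenced only when it is non-NULL */
	return pkt_dev->skb &&
	       toy_subqueue_stopped(pkt_dev->skb->queue_mapping);
}
```

With skb still NULL the function returns 0 without touching the pointer,
which is the behaviour the fix is after.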

Cheers
--ro

Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>


diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 7bae576..a0db4d1 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3331,8 +3331,9 @@ static __inline__ void pktgen_xmit(struct pktgen_dev *pkt_dev)
}
 
if ((netif_queue_stopped(odev) ||
-netif_subqueue_stopped(odev, pkt_dev->skb->queue_mapping)) ||
-need_resched()) {
+(pkt_dev->skb && 
+ netif_subqueue_stopped(odev, pkt_dev->skb->queue_mapping))) ||
+   need_resched()) {
idle_start = getCurUs();
 
if (!netif_running(odev)) {


Re: [NET] atm: Fix build errors after conversion to pr_debug()

2007-08-27 Thread David Miller
From: Thomas Graf <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 10:05:37 +0200

> Fixes ancient ATM debug code to at least compile again.
> 
> Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Patch applied, thanks Thomas!


[NET] atm: Fix build errors after conversion to pr_debug()

2007-08-27 Thread Thomas Graf
Fixes ancient ATM debug code to at least compile again.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6.24/net/atm/signaling.c
===
--- net-2.6.24.orig/net/atm/signaling.c 2007-08-27 09:53:40.0 +0200
+++ net-2.6.24/net/atm/signaling.c  2007-08-27 09:55:16.0 +0200
@@ -89,9 +89,9 @@ static int sigd_send(struct atm_vcc *vcc
 
msg = (struct atmsvc_msg *) skb->data;
atomic_sub(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
-   pr_debug("sigd_send %d (0x%lx)\n",(int) msg->type,
- (unsigned long) msg->vcc);
vcc = *(struct atm_vcc **) &msg->vcc;
+   pr_debug("sigd_send %d (0x%lx)\n",(int) msg->type,
+ (unsigned long) vcc);
sk = sk_atm(vcc);
 
switch (msg->type) {
Index: net-2.6.24/net/atm/common.c
===
--- net-2.6.24.orig/net/atm/common.c2007-08-27 09:56:06.0 +0200
+++ net-2.6.24/net/atm/common.c 2007-08-27 09:56:16.0 +0200
@@ -497,7 +497,7 @@ int vcc_recvmsg(struct kiocb *iocb, stru
if (error)
return error;
sock_recv_timestamp(msg, sk, skb);
-   pr_debug("RcvM %d -= %d\n", atomic_read(&sk->rmem_alloc), skb->truesize);
+   pr_debug("RcvM %d -= %d\n", atomic_read(&sk->sk_rmem_alloc), skb->truesize);
atm_return(vcc, skb->truesize);
skb_free_datagram(sk, skb);
return copied;
Index: net-2.6.24/net/atm/raw.c
===
--- net-2.6.24.orig/net/atm/raw.c   2007-08-27 09:57:56.0 +0200
+++ net-2.6.24/net/atm/raw.c2007-08-27 09:58:09.0 +0200
@@ -32,8 +32,8 @@ static void atm_pop_raw(struct atm_vcc *
 {
struct sock *sk = sk_atm(vcc);
 
-   pr_debug("APopR (%d) %d -= %d\n", vcc->vci, sk->sk_wmem_alloc,
-   skb->truesize);
+   pr_debug("APopR (%d) %d -= %d\n", vcc->vci,
+   atomic_read(&sk->sk_wmem_alloc), skb->truesize);
atomic_sub(skb->truesize, &sk->sk_wmem_alloc);
dev_kfree_skb_any(skb);
sk->sk_write_space(sk);
Index: net-2.6.24/net/atm/pppoatm.c
===
--- net-2.6.24.orig/net/atm/pppoatm.c   2007-08-27 10:01:34.0 +0200
+++ net-2.6.24/net/atm/pppoatm.c2007-08-27 10:02:05.0 +0200
@@ -165,9 +165,8 @@ static void pppoatm_push(struct atm_vcc 
pvcc->chan.mtu += LLC_LEN;
break;
}
-   pr_debug("(unit %d): Couldn't autodetect yet "
+   pr_debug("Couldn't autodetect yet "
"(skb: %02X %02X %02X %02X %02X %02X)\n",
-   pvcc->chan.unit,
skb->data[0], skb->data[1], skb->data[2],
skb->data[3], skb->data[4], skb->data[5]);
goto error;
@@ -195,8 +194,7 @@ static int pppoatm_send(struct ppp_chann
 {
struct pppoatm_vcc *pvcc = chan_to_pvcc(chan);
ATM_SKB(skb)->vcc = pvcc->atmvcc;
-   pr_debug("(unit %d): pppoatm_send (skb=0x%p, vcc=0x%p)\n",
-   pvcc->chan.unit, skb, pvcc->atmvcc);
+   pr_debug("pppoatm_send (skb=0x%p, vcc=0x%p)\n", skb, pvcc->atmvcc);
if (skb->data[0] == '\0' && (pvcc->flags & SC_COMP_PROT))
(void) skb_pull(skb, 1);
switch (pvcc->encaps) { /* LLC encapsulation needed */
@@ -221,16 +219,14 @@ static int pppoatm_send(struct ppp_chann
goto nospace;
break;
case e_autodetect:
-   pr_debug("(unit %d): Trying to send without setting encaps!\n",
-   pvcc->chan.unit);
+   pr_debug("Trying to send without setting encaps!\n");
kfree_skb(skb);
return 1;
}
 
atomic_add(skb->truesize, &sk_atm(ATM_SKB(skb)->vcc)->sk_wmem_alloc);
ATM_SKB(skb)->atm_options = ATM_SKB(skb)->vcc->atm_options;
-   pr_debug("(unit %d): atm_skb(%p)->vcc(%p)->dev(%p)\n",
-   pvcc->chan.unit, skb, ATM_SKB(skb)->vcc,
+   pr_debug("atm_skb(%p)->vcc(%p)->dev(%p)\n", skb, ATM_SKB(skb)->vcc,
ATM_SKB(skb)->vcc->dev);
return ATM_SKB(skb)->vcc->send(ATM_SKB(skb)->vcc, skb)
? DROP_PACKET : 1;


Re: [PATCH 1/2] [DM9000] Added support for big-endian hosts

2007-08-27 Thread Laurent Pinchart
On Saturday 25 August 2007 06:31, Jeff Garzik wrote:
> Laurent Pinchart wrote:
> > This patch splits the receive status in 8bit wide fields and convert the
> > packet length from little endian to CPU byte order.
> >
> > Signed-off-by: Laurent Pinchart <[EMAIL PROTECTED]>
> > ---
> >  drivers/net/dm9000.c |   13 +++--
> >  1 files changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/dm9000.c b/drivers/net/dm9000.c
> > index c3de81b..a424810 100644
> > --- a/drivers/net/dm9000.c
> > +++ b/drivers/net/dm9000.c
> > @@ -894,7 +894,8 @@ dm9000_timer(unsigned long data)
> >  }
> >
> >  struct dm9000_rxhdr {
> > -   u16 RxStatus;
> > +   u8  RxPktReady;
> > +   u8  RxStatus;
> > u16 RxLen;
> >  } __attribute__((__packed__));
>
> why does this not need endian conversions as well?
>
>   Jeff

The rx header is a 4-byte structure laid out as above (packet ready, status
and length). The first two fields are 8 bits wide, so they don't need endian
conversion. The length field is a 16-bit little endian value which is
converted to CPU order in dm9000_rx().

Before this patch, the driver accessed the status and packet ready fields as a
16-bit value, which was obviously endianness-dependent.
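
A small userspace sketch of the same layout, decoded in a way that does not
depend on host byte order. It is not the driver code: the struct and
function names are made up, and the length is assumed to be stored little
endian, as the patch description says:

```c
#include <stdint.h>

/* Toy version of the 4-byte rx header: two single-byte fields
 * followed by a 16-bit length. */
struct toy_rxhdr {
	uint8_t pkt_ready;
	uint8_t status;
	uint16_t len;	/* in CPU byte order after parsing */
};

struct toy_rxhdr parse_rxhdr(const uint8_t raw[4])
{
	struct toy_rxhdr h;

	/* single bytes have no byte order, so no conversion is needed */
	h.pkt_ready = raw[0];
	h.status = raw[1];
	/* assemble the length from its low and high bytes explicitly */
	h.len = (uint16_t)(raw[2] | (raw[3] << 8));
	return h;
}
```

Reading the first two bytes as one 16-bit word instead would put the ready
flag in the low byte on one kind of host and in the high byte on the other.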

Best regards,

-- 
Laurent Pinchart
CSE Semaphore Belgium

Chaussée de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75


Re: [ANNOUNCE] iproute2-2.6.23-rc3

2007-08-27 Thread Jarek Poplawski
On Fri, Aug 24, 2007 at 12:26:28PM -0700, Stephen Hemminger wrote:
> On Fri, 24 Aug 2007 12:10:44 +0200
> Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> 
> > On 22-08-2007 20:08, Stephen Hemminger wrote:
> > > There have been a lot of changes for 2.6.23, so here is a test release
> > > of iproute2 that should capture all the submitted patches
> > > 
> > > 
> > > http://developer.osdl.org/shemminger/iproute2/download/iproute2-2.6.23-rc3.tar.gz
> > 
> > But... isn't it forged, btw?!
> 
> No, I just didn't sign a temporary testing version.  A final version
> will be out after 2.6.23

So, I'm calmer now... On the other hand, with kernel testing versions
there seems to be more to be afraid of?

Jarek P.