Re: [PATCH RFX]: napi_struct V3

2007-07-24 Thread Rusty Russell
On Mon, 2007-07-23 at 22:47 -0700, David Miller wrote:
 I don't think it's wise to implement this over and over again in each
 driver, since we already know at least a handfull of drivers will use
 this.

Yep.  Alternative is a napi_struct_with_restart, but I don't think it's
worth the few-byte savings per queue.

 Any objections?

On the contrary, this looks good.

Thanks!
Rusty.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-24 Thread Oliver Hartkopp
Varun Chandramohan wrote:
 Patrick McHardy wrote:
   
 Varun Chandramohan wrote:
   
 
  /**
 + * timeval_to_sec - Convert timeval to seconds
 + * @tv: pointer to the timeval variable to be converted
 + *
 + * Returns the seconds representation of timeval parameter.
 + */
 +static inline time_t timeval_to_sec(const struct timeval *tv)
 +{
 +   return (tv-tv_sec + (tv-tv_usec + 50)/100);
 +}
 
   
 I don't think you should round down timeout values.
   
 
 Can you elaborate on that? As per the RFC of MIB ,we need only seconds
 granularity. Taking that as the case i dont understand why round down
 should not be done?
   

When you like to create any timeout based on your calculated value, you
might run into the problem that your calculated value is set to _zero_
even if there was some time before the conversion. This might probably
not what you indented to get.

So what about rounding up with

return (tv-tv_sec + (tv-tv_usec + 99)/100);

???

Btw. isn't here already any solution based on ktime conversions?

Regards,
Oliver


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-24 Thread Varun Chandramohan
Oliver Hartkopp wrote:
 Varun Chandramohan wrote:
   
 Patrick McHardy wrote:
   
 
 Varun Chandramohan wrote:
   
 
   
  /**
 + * timeval_to_sec - Convert timeval to seconds
 + * @tv: pointer to the timeval variable to be converted
 + *
 + * Returns the seconds representation of timeval parameter.
 + */
 +static inline time_t timeval_to_sec(const struct timeval *tv)
 +{
 +  return (tv-tv_sec + (tv-tv_usec + 50)/100);
 +}
 
   
 
 I don't think you should round down timeout values.
   
 
   
 Can you elaborate on that? As per the RFC of MIB ,we need only seconds
 granularity. Taking that as the case i dont understand why round down
 should not be done?
   
 

 When you like to create any timeout based on your calculated value, you
 might run into the problem that your calculated value is set to _zero_
 even if there was some time before the conversion. This might probably
 not what you indented to get.

 So what about rounding up with

 return (tv-tv_sec + (tv-tv_usec + 99)/100);

 ???

   
This can done.  Is this what you were ref to me, Patrick?
 Btw. isn't here already any solution based on ktime conversions?

   
AFAIK there isint any conversion function to secs. Correct me if iam wrong.
But we can have a function or macro to do this conversion.
 Regards,
 Oliver


   
Regards,
Varun
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
   

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: specifying scopid's for link-local IPv6 addrs

2007-07-24 Thread Bill Fink
On Mon, 23 Jul 2007, Rick Jones wrote:

 Folks -
 
 People running netperf have reported that they have trouble with IPv6 under 
 Linux.  Specifically, wereas the use of link-local IPv6 addresses just 
 works 
 in netperf under a number of other OSes they do not under Linux.  I'm 
 ass-u-me-ing 2.6 here, but not sure exactly which ones - I've seen it on a 
 2.6.18-based RHEL5.
 
 Some poking about and conversation has suggested that one has to set a 
 sin6_scope_id in the sockaddr_in6.  This needs to be an index of one of the 
 interfaces in the system, which I presume means walking some additional 
 structures.
 
 Is this a requirement which might be expected to remain in the future, or is 
 it 
 something which might just go away?  That will have an effect on netperf 
 future 
 development.
 
 thanks,
 
 rick jones

Rick,

I don't see any way around this.  For example, on one of my test
systems, I have the following link local routes:

chance% netstat -A inet6 -rn | grep fe80::/64
fe80::/64   ::  
U 25600 eth0
fe80::/64   ::  
U 25600 eth2
fe80::/64   ::  
U 25600 eth3
fe80::/64   ::  
U 25600 eth4
fe80::/64   ::  
U 25600 eth5
fe80::/64   ::  
U 25600 eth6

So if I want to run a link local test to fe80::202:b3ff:fed4:cd1,
the system has no way to choose which is the correct interface to
use for the test, and will give an error if the interface isn't
specified.  Here's an example of this with nuttcp:

chance% nuttcp -P5100 fe80::202:b3ff:fed4:cd1
nuttcp-t: Info: attempting to switch to deprecated classic mode
nuttcp-t: Info: will use less reliable transmitter side statistics
nuttcp-t: v5.5.5: Error: connect: Invalid argument
errno=22

You must explicitly specify the desired interface.  For example,
on my test system, the correct interface is eth6 which is interface 8
(lo eth0 eth1 eth2 ... eth5 eth6).  Here is an example nuttcp test
specifying interface 8:

chance% nuttcp -P5100 fe80::202:b3ff:fed4:cd1%8
 1178.5809 MB /  10.02 sec =  986.2728 Mbps 12 %TX 15 %RX

nuttcp uses getaddrinfo() which parses the %ifindex field,
and then copies the sin6_scope_id from the res structure to the
server's sockaddr_in6 structure before initiating the connect().

-Bill
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.20-2.6.21 - networking dies after random time

2007-07-24 Thread Jarek Poplawski
On Mon, Jul 23, 2007 at 07:44:58AM +0200, Marcin Ślusarz wrote:
 Ok, I've bisected this problem and found that this patch broke my NIC:
 
 76d2160147f43f982dfe881404cfde9fd0a9da21 is first bad commit
 commit 76d2160147f43f982dfe881404cfde9fd0a9da21
 Author: Ingo Molnar [EMAIL PROTECTED]
 Date:   Fri Feb 16 01:28:24 2007 -0800
 
[PATCH] genirq: do not mask interrupts by default
 
Never mask interrupts immediately upon request.  Disabling interrupts in
high-performance codepaths is rare, and on the other hand this change 
could
recover lost edges (or even other types of lost interrupts) by
 conservatively
only masking interrupts after they happen.  (NOTE: with this change the
highlevel irq-disable code still soft-disables this IRQ line - and
 if such an
interrupt happens then the IRQ flow handler keeps the IRQ masked.)
 
Mark i8529A controllers as 'never loses an edge'.
 
Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
Cc: Thomas Gleixner [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Signed-off-by: Linus Torvalds [EMAIL PROTECTED]

So, it seems nobody (except the users) cares...

BTW, maybe there should be created something like Network Cards
Producers Made Rich on Unnecessary Changed Cards Linux Foundation?:

On Fri, Jun 29, 2007 at 10:50:20AM +0200, Jean-Baptiste Vignaud wrote:
...
 2) changed the 3com cards
 i replaced by two cards,
 01:06.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 42)
 01:07.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
 
 reinstalled and stressed the network (small download from a laptop) and :
 
 Jun 29 09:34:10 loki kernel: NETDEV WATCHDOG: eth0: transmit timed out
 Jun 29 09:34:51 loki last message repeated 14 times
 Jun 29 09:35:18 loki last message repeated 8 times

...Of course, no response of any serious developer for this as well.

BTW #2: I wonder how true is this (after above-mentioned patch):

From include/linux/irq.h:
 /**
  * struct irq_chip - hardware interrupt chip descriptor
...
  * @disable:  disable the interrupt (defaults to chip-mask if NULL)

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFX]: napi_struct V3

2007-07-24 Thread Francois Romieu
David Miller [EMAIL PROTECTED] :
[...]
 I also didn't play with turning off NAPI in kconfig where drivers
 allow that, can we just get rid of that crap already? :-/

Would it be accepted for 2.6.23 or must it be considered post-2.6.23 ?

-- 
Ueimor
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] Initialize and fill IPv6 route age

2007-07-24 Thread Stephen Hemminger
On Tue, 24 Jul 2007 09:50:57 +0530
Varun Chandramohan [EMAIL PROTECTED] wrote:

 Stephen Hemminger wrote:
  On Mon, 23 Jul 2007 10:13:18 +0530
  Varun Chandramohan [EMAIL PROTECTED] wrote:
 

  The age field of the ipv6 route structures are initilized with the current 
  timeval at the time of route   creation. When the route dump is called 
  the route age value stored in the structure is subtracted from the 
  present timeval and the difference is passed on as the route age.
 
  Signed-off-by: Varun Chandramohan [EMAIL PROTECTED]
  ---
   include/net/ip6_fib.h   |1 +
   include/net/ip6_route.h |3 +++
   net/ipv6/addrconf.c |5 +
   net/ipv6/route.c|   23 +++
   4 files changed, 28 insertions(+), 4 deletions(-)
 
  diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
  index c48ea87..e30a1cf 100644
  --- a/include/net/ip6_fib.h
  +++ b/include/net/ip6_fib.h
  @@ -98,6 +98,7 @@ struct rt6_info
 
 u32 rt6i_flags;
 u32 rt6i_metric;
  +  time_t  rt6i_age;
 atomic_trt6i_ref;
 struct fib6_table   *rt6i_table;
   
  diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
  index 5456fdd..fc9716c 100644
  --- a/include/net/ip6_route.h
  +++ b/include/net/ip6_route.h
  @@ -36,6 +36,9 @@ struct route_info {
   #define RT6_LOOKUP_F_REACHABLE0x2
   #define RT6_LOOKUP_F_HAS_SADDR0x4
   
  +#define RT6_SET_ROUTE_INFO 0x0
  +#define RT6_GET_ROUTE_INFO 0x1
  +
   extern struct rt6_infoip6_null_entry;
   
   #ifdef CONFIG_IPV6_MULTIPLE_TABLES
  diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
  index 5a5f8bd..715c766 100644
  --- a/net/ipv6/addrconf.c
  +++ b/net/ipv6/addrconf.c
  @@ -4187,6 +4187,7 @@ EXPORT_SYMBOL(unregister_inet6addr_notif
   
   int __init addrconf_init(void)
   {
  +  struct timeval tv;
 int err = 0;
   
 /* The addrconf netdev notifier requires that loopback_dev
  @@ -4214,10 +4215,14 @@ int __init addrconf_init(void)
 if (err)
 return err;
   
  +  do_gettimeofday(tv);
  
 
  Better to use ktime_t or timespec in new code.

 You are saying not to use timeval as its going to be removed sometime in
 future? If not, may i know why should we use timespec or ktime?
 I need only seconds granularity so i was wondering if that matters.
 

Timeval isn't going away, but ktime is easier for basic maths.
If all you need is seconds resolution,  just use jiffies. Gettimeofday
(and ktime_get) both require expensive hardware accesses on
some platforms.

For the specific case IPV6 route ageing, this isn't a big deal it is just
that people tend to copy code. 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: tg3 issues

2007-07-24 Thread patric

Michael Chan wrote:


Last update on this.

udelay( 100 + net_random % 300 ) seems to work much better and i have 
not had a single problem getting the link up within 10 seconds of a cold 
or warm-boot, and most often the link comes up directly without any sort 
of delay instead like before when it could hang for 30 seconds before 
getting a link, if you even got a link.





We'll have to do some testing to see if we can find a better solution.
Adding up to 400 usec of busy wait is not ideal.  Are you connecting two
5701 fiber cards directly to each other in your setup?


  


Hi,

Yea, it's and ugly hack, but it's a workaround that at least works for me.
Have done some additional tests and to me it seems that the driver just 
needs to wait a bit longer to detect the link + some random time to get 
around the issues it might have when chasing the other systems state.
Don't have the numbers in front of me, but i did some tests to get the 
'optimum' delay to wait and it seems like the larger the wait time (even 
up to around 40-50ms!) causes the autoneg to go much smoother and 
faster. And remember that the link can be reported up, but no traffic 
can be passed via the link before the autoneg is complete and both cards 
reports back that flowcontrol is enabled so you need to verify the link 
by trying to send some data and not just look at the link-status.


Hope my conclusions might help, or atleast point you in the right direction.

I have 2
01:08.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 
Gigabit Ethernet (rev 15)   ( 14e4:1645 )

connected via a single FC cable.
And the two systems are one single-core and one dual-core AMD system.

Regards,
Patric

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFX]: napi_struct V3

2007-07-24 Thread Jeff Garzik

Francois Romieu wrote:

David Miller [EMAIL PROTECTED] :
[...]

I also didn't play with turning off NAPI in kconfig where drivers
allow that, can we just get rid of that crap already? :-/


Would it be accepted for 2.6.23 or must it be considered post-2.6.23 ?


The merge window is closed, so not 2.6.23.

The cases where Kconfig allows selection of NAPI-or-not is generally 
either (a) the author considers one path new and relatively unproven or 
(b) the author has not bothered to figure out which is the best choice, 
and instead just offers both rather than doing the legwork to find out.


Either way, individual cases must be examined and it is not necessarily 
just a cleanup.


Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFX]: napi_struct V3

2007-07-24 Thread David Miller
From: Francois Romieu [EMAIL PROTECTED]
Date: Tue, 24 Jul 2007 09:12:01 +0200

 David Miller [EMAIL PROTECTED] :
 [...]
  I also didn't play with turning off NAPI in kconfig where drivers
  allow that, can we just get rid of that crap already? :-/
 
 Would it be accepted for 2.6.23 or must it be considered post-2.6.23 ?

The 2.6.23 merge window closed already, so this is of course
2.6.24 material.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.20-2.6.21 - networking dies after random time

2007-07-24 Thread Ingo Molnar

* Marcin Ślusarz [EMAIL PROTECTED] wrote:

 Ok, I've bisected this problem and found that this patch broke my NIC:
 
 76d2160147f43f982dfe881404cfde9fd0a9da21 is first bad commit
 commit 76d2160147f43f982dfe881404cfde9fd0a9da21
 Author: Ingo Molnar [EMAIL PROTECTED]
 Date:   Fri Feb 16 01:28:24 2007 -0800
 
[PATCH] genirq: do not mask interrupts by default

thanks for tracking it down! Could you try the patch below (ontop an 
otherwise unmodified kernel)? This tests the theory whether the problem 
is related to the disable_irq_nosync() call in the ne2k driver's xmit 
path. Does this solve the hangs too?

Ingo

Index: linux/kernel/irq/manage.c
===
--- linux.orig/kernel/irq/manage.c
+++ linux/kernel/irq/manage.c
@@ -102,7 +102,19 @@ void disable_irq_nosync(unsigned int irq
spin_lock_irqsave(desc-lock, flags);
if (!desc-depth++) {
desc-status |= IRQ_DISABLED;
-   desc-chip-disable(irq);
+   /*
+* the _nosync variant of irq-disable suggests that the
+* caller is not worried about concurrency but about the
+* ordering of the irq flow itself. (such as hardware
+* getting confused about certain, normally valid irq
+* handling sequences.) So if the default disable handler
+* is in place then try the more conservative masking
+* instead:
+*/
+   if (desc-chip-disable == default_disable  desc-chip-mask)
+   desc-chip-mask(irq);
+   else
+   desc-chip-disable(irq);
}
spin_unlock_irqrestore(desc-lock, flags);
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Jens Axboe
On Mon, Jul 23 2007, Andrew Morton wrote:
 I worked out that the crash I saw was in
 
 BUG_ON(!pte_none(*(kmap_pte-idx)));
 
 in the read of kmap_pte[idx].  Which would be weird as the caller is using 
 a literal KM_USER0.
 
 So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am
 unable to reproduce it now).
 
 If that BUG_ON _is_ triggering then it might indicate that someone is doing
 a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0.

Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the
address. I've seen both of those end up triggering that BUG_ON() in a
later kmap.

Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in
ocfs2 at least. But you are probably not using that, so I'll keep
looking...

---

[PATCH] ocfs2: bad kunmap_atomic()

kunmap_atomic() takes the virtual address, not the mapped page as
argument.

Signed-off-by: Jens Axboe [EMAIL PROTECTED]

diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 5727cd1..c4034f6 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2153,7 +2153,7 @@ static int ocfs2_splice_write_actor(struct 
pipe_inode_info *pipe,
src = buf-ops-map(pipe, buf, 1);
dst = kmap_atomic(page, KM_USER1);
memcpy(dst + offset, src + buf-offset, count);
-   kunmap_atomic(page, KM_USER1);
+   kunmap_atomic(dst, KM_USER1);
buf-ops-unmap(pipe, buf, src);
 
copied = ocfs2_write_end(file, file-f_mapping, sd-pos, count, count,

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] possible deadlock in tulip driver

2007-07-24 Thread Denis V. Lunev
Calling flush_scheduled_work() may deadlock if called under rtnl_lock
(from dev-stop) as linkwatch_event() may be on the workqueue and it will try
to get the rtnl_lock

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---

 tulip_core.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- ./drivers/net/tulip/tulip_core.c.tulip  2007-07-16 12:54:29.0 
+0400
+++ ./drivers/net/tulip/tulip_core.c2007-07-23 19:06:24.0 +0400
@@ -726,8 +726,6 @@ static void tulip_down (struct net_devic
void __iomem *ioaddr = tp-base_addr;
unsigned long flags;
 
-   flush_scheduled_work();
-
del_timer_sync (tp-timer);
 #ifdef CONFIG_TULIP_NAPI
del_timer_sync (tp-oom_timer);
@@ -1788,6 +1786,8 @@ static void __devexit tulip_remove_one (
if (!dev)
return;
 
+   flush_scheduled_work();
+
tp = netdev_priv(dev);
unregister_netdev(dev);
pci_free_consistent (pdev,
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Jens Axboe
On Tue, Jul 24 2007, Jens Axboe wrote:
 On Mon, Jul 23 2007, Andrew Morton wrote:
  I worked out that the crash I saw was in
  
  BUG_ON(!pte_none(*(kmap_pte-idx)));
  
  in the read of kmap_pte[idx].  Which would be weird as the caller is using 
  a literal KM_USER0.
  
  So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am
  unable to reproduce it now).
  
  If that BUG_ON _is_ triggering then it might indicate that someone is doing
  a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0.
 
 Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the
 address. I've seen both of those end up triggering that BUG_ON() in a
 later kmap.
 
 Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in
 ocfs2 at least. But you are probably not using that, so I'll keep
 looking...

What about the new async crypto stuff? I've been looking, but is it
guarenteed that async_memcpy() runs in process context with interrupts
enabled always? If not, there's a km type bug there.

In general, I think the highmem stuff could do with more safety checks:

- People ALWAYS get the atomic unmaps wrong, passing in the page instead
  of the address. I've seen tons of these. And since kunmap_atomic()
  takes a void pointer, nobody notices until it goes boom.
- People easily get the km type wrong - they use KM_USERx in interrupt
  context, or one of the irq variants without disabling interrupts.

If we could just catch these two types of bugs, we've got a lot of these
problems covered.


-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Andrew Morton
On Tue, 24 Jul 2007 10:22:07 +0200 Jens Axboe [EMAIL PROTECTED] wrote:

 On Tue, Jul 24 2007, Jens Axboe wrote:
  On Mon, Jul 23 2007, Andrew Morton wrote:
   I worked out that the crash I saw was in
   
   BUG_ON(!pte_none(*(kmap_pte-idx)));
   
   in the read of kmap_pte[idx].  Which would be weird as the caller is 
   using 
   a literal KM_USER0.
   
   So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I 
   am
   unable to reproduce it now).
   
   If that BUG_ON _is_ triggering then it might indicate that someone is 
   doing
   a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0.
  
  Or doing double kunmaps, or doing a kunmap_atomic() on the page, not the
  address. I've seen both of those end up triggering that BUG_ON() in a
  later kmap.
  
  Looking over the 2.6.22..2.6.23-rc1 diff, I found one such error in
  ocfs2 at least. But you are probably not using that, so I'll keep
  looking...
 
 What about the new async crypto stuff? I've been looking, but is it
 guarenteed that async_memcpy() runs in process context with interrupts
 enabled always? If not, there's a km type bug there.

I think Shannon maintains that now.

 In general, I think the highmem stuff could do with more safety checks:
 
 - People ALWAYS get the atomic unmaps wrong, passing in the page instead
   of the address. I've seen tons of these. And since kunmap_atomic()
   takes a void pointer, nobody notices until it goes boom.

yeah, it's a real trap.  For a while I had a patch which converted
kmap_atomic() to return a char*, and kunmap_atomic() to take a char*, so
misuse got compile warnings.  But it was a pig to maintain so I tossed it. 
It'd be somewhat easier to do now we've converted a lot of callers to
clear_user_highpage() and similar.

 - People easily get the km type wrong - they use KM_USERx in interrupt
   context, or one of the irq variants without disabling interrupts.
 
 If we could just catch these two types of bugs, we've got a lot of these
 problems covered.

Here's the -mm debug patch:


diff -puN arch/i386/mm/highmem.c~kmap_atomic-debugging arch/i386/mm/highmem.c
--- a/arch/i386/mm/highmem.c~kmap_atomic-debugging
+++ a/arch/i386/mm/highmem.c
@@ -30,7 +30,44 @@ void *kmap_atomic(struct page *page, enu
 {
enum fixed_addresses idx;
unsigned long vaddr;
+   static unsigned warn_count = 10;
 
+   if (unlikely(warn_count == 0))
+   goto skip;
+
+   if (unlikely(in_interrupt())) {
+   if (in_irq()) {
+   if (type != KM_IRQ0  type != KM_IRQ1 
+   type != KM_BIO_SRC_IRQ  type != KM_BIO_DST_IRQ 
+   type != KM_BOUNCE_READ) {
+   WARN_ON(1);
+   warn_count--;
+   }
+   } else if (!irqs_disabled()) {  /* softirq */
+   if (type != KM_IRQ0  type != KM_IRQ1 
+   type != KM_SOFTIRQ0  type != KM_SOFTIRQ1 
+   type != KM_SKB_SUNRPC_DATA 
+   type != KM_SKB_DATA_SOFTIRQ 
+   type != KM_BOUNCE_READ) {
+   WARN_ON(1);
+   warn_count--;
+   }
+   }
+   }
+
+   if (type == KM_IRQ0 || type == KM_IRQ1 || type == KM_BOUNCE_READ ||
+   type == KM_BIO_SRC_IRQ || type == KM_BIO_DST_IRQ) {
+   if (!irqs_disabled()) {
+   WARN_ON(1);
+   warn_count--;
+   }
+   } else if (type == KM_SOFTIRQ0 || type == KM_SOFTIRQ1) {
+   if (irq_count() == 0  !irqs_disabled()) {
+   WARN_ON(1);
+   warn_count--;
+   }
+   }
+skip:
/* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */
pagefault_disable();
 
_

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1 sky2 boot crash in sky2_mac_intr

2007-07-24 Thread Stephen Hemminger
On Tue, 24 Jul 2007 10:22:05 +0200
Florian Lohoff [EMAIL PROTECTED] wrote:

 
 Hi,
 i am seeing irregular crashes on boot in the sky2_mac_intr. This is an
 Fujitsu Siemens Lifebook E8110 with a Core Duo. Currently i suspect some
 strange BIOS issues as the issues i see with the sky2 aka parity errors
 etc i also see sometimes with the integrated ipw3945 which complains
 about firmware errors. It seems the BIOS randomly fails to initialize
 all the hardware. To reproduce this crash and catch it on the serial
 console it took me around 12 boots. The machine is stable once correctly
 booted and i work most of the day on it.
 
 [   46.479939] ACPI: PCI Interrupt :02:00.0[A] - GSI 16 (level, low) - 
 IRQ 19
 [   46.568569] sky2 :02:00.0: v1.16 addr 0xf000 irq 19 Yukon-EC Ultra 
 (0xb4) rev 2
 [   46.664555] sky2 eth0: addr 00:17:42:13:45:8c
 
 [   61.958741] sky2 eth1: enabling interface
 [   62.010834] sky2 eth1: phy write timeout

The problem is related to power management. The PHY has a number of PCI 
configuration
registers for power control, and the function of these changes based on the 
version and
revision of the chip. The driver does work on older versions of the EC-U, in
Fujitsu laptop's, it is just the new rev that is broken.

The driver should probably fail smarter (by not loading) if the PHY isn't 
powered
up correctly, but that doesn't help your problem.

The vendor has provided me with documentation on many versions
of the chip, but I don't have doc's on the lastest revision differences of the 
EC Ultra, 
so a proper solution is not easily available.  The best method for resolving 
this would
be to first try the vendor driver version of sk98lin and see if that fixes it. 
If so,
then it is easy to change sky2, to match the phy setup in the vendor driver.
Another possibility is to look for places in sky2 driver where there are places
that compare version/revision.

The most likely bits that need to change are in PCI registers: 0x80, 0x84 and 
0x88
You could also load the windows driver and dump PCI config space (with lspci 
from
cygwin), and see what the settings are there.

I am away from my office for a month, and therefore away from any sky2
hardware for testing. 

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Stalled connection (need help to debug)

2007-07-24 Thread Jarek Poplawski
On 18-07-2007 20:50, Oleg Verych wrote:
 Hallo.

 I have a very strange problem.
 []
 Any advise on how to debug this will be very appreciated. Thanks.
 
 Thanks...

You're quite welcome...

Cheers,
Jarek P.

PS: Oleg, no offence, but are you sure this was the right problem
and the right list?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] Initialize and fill IPv6 route age

2007-07-24 Thread Varun Chandramohan
Stephen Hemminger wrote:
 On Tue, 24 Jul 2007 09:50:57 +0530
 Varun Chandramohan [EMAIL PROTECTED] wrote:

   
 Stephen Hemminger wrote:
 
 On Mon, 23 Jul 2007 10:13:18 +0530
 Varun Chandramohan [EMAIL PROTECTED] wrote:

   
   
 The age field of the ipv6 route structures are initilized with the current 
 timeval at the time of route   creation. When the route dump is called 
 the route age value stored in the structure is subtracted from the 
 present timeval and the difference is passed on as the route age.

 Signed-off-by: Varun Chandramohan [EMAIL PROTECTED]
 ---
  include/net/ip6_fib.h   |1 +
  include/net/ip6_route.h |3 +++
  net/ipv6/addrconf.c |5 +
  net/ipv6/route.c|   23 +++
  4 files changed, 28 insertions(+), 4 deletions(-)

 diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
 index c48ea87..e30a1cf 100644
 --- a/include/net/ip6_fib.h
 +++ b/include/net/ip6_fib.h
 @@ -98,6 +98,7 @@ struct rt6_info

u32 rt6i_flags;
u32 rt6i_metric;
 +  time_t  rt6i_age;
atomic_trt6i_ref;
struct fib6_table   *rt6i_table;
  
 diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
 index 5456fdd..fc9716c 100644
 --- a/include/net/ip6_route.h
 +++ b/include/net/ip6_route.h
 @@ -36,6 +36,9 @@ struct route_info {
  #define RT6_LOOKUP_F_REACHABLE0x2
  #define RT6_LOOKUP_F_HAS_SADDR0x4
  
 +#define RT6_SET_ROUTE_INFO 0x0
 +#define RT6_GET_ROUTE_INFO 0x1
 +
  extern struct rt6_infoip6_null_entry;
  
  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
 index 5a5f8bd..715c766 100644
 --- a/net/ipv6/addrconf.c
 +++ b/net/ipv6/addrconf.c
 @@ -4187,6 +4187,7 @@ EXPORT_SYMBOL(unregister_inet6addr_notif
  
  int __init addrconf_init(void)
  {
 +  struct timeval tv;
int err = 0;
  
/* The addrconf netdev notifier requires that loopback_dev
 @@ -4214,10 +4215,14 @@ int __init addrconf_init(void)
if (err)
return err;
  
 +  do_gettimeofday(tv);
 
 
 Better to use ktime_t or timespec in new code.
   
   
 You are saying not to use timeval as its going to be removed sometime in
 future? If not, may i know why should we use timespec or ktime?
 I need only seconds granularity so i was wondering if that matters.

 

 Timeval isn't going away, but ktime is easier for basic maths.
 If all you need is seconds resolution,  just use jiffies. Gettimeofday
   
Thanks stephen for the review. As far as using jiffies (which is my
original implementation) i got a  comment saying that on 32bit platform
jiffies wrap every 49 days.  That's the reason for the do_gettimeofday()
implementation. However i understand that such calls require expensive
h/w accesses. Considering the fact that jiffies wrap around rather
quickly, can you suggest any other way of doing it?
 (and ktime_get) both require expensive hardware accesses on
 some platforms.

 For the specific case IPV6 route ageing, this isn't a big deal it is just
 that people tend to copy code. 
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
   

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Question: global lock (genl_mutex) possible refinement?

2007-07-24 Thread Thomas Graf
* Richard MUSIL [EMAIL PROTECTED] 2007-07-23 18:45
 I have been giving it a second thought and came up with something more
 complex. The idea is to have locking granularity at the level of
 individual families.

I agree in general, it would make up a better solution.

However, your initial patch allows operations and families to be
unregistered while message of the same family are being processed
which must not be allowed.

Please provide a new overall patch which is not based on your
initial patch so I can review your idea properly.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.20-2.6.21 - networking dies after random time

2007-07-24 Thread Ingo Molnar

* Ingo Molnar [EMAIL PROTECTED] wrote:

 thanks for tracking it down! Could you try the patch below (ontop an 
 otherwise unmodified kernel)? This tests the theory whether the 
 problem is related to the disable_irq_nosync() call in the ne2k 
 driver's xmit path. Does this solve the hangs too?

please try the patch below instead.

Ingo

Index: linux/kernel/irq/chip.c
===
--- linux.orig/kernel/irq/chip.c
+++ linux/kernel/irq/chip.c
@@ -231,7 +231,7 @@ static void default_enable(unsigned int 
 /*
  * default disable function
  */
-static void default_disable(unsigned int irq)
+void default_disable(unsigned int irq)
 {
 }
 
Index: linux/kernel/irq/internals.h
===
--- linux.orig/kernel/irq/internals.h
+++ linux/kernel/irq/internals.h
@@ -10,6 +10,8 @@ extern void irq_chip_set_defaults(struct
 /* Set default handler: */
 extern void compat_irq_chip_set_default_handler(struct irq_desc *desc);
 
+extern void default_disable(unsigned int irq);
+
 #ifdef CONFIG_PROC_FS
 extern void register_irq_proc(unsigned int irq);
 extern void register_handler_proc(unsigned int irq, struct irqaction *action);
Index: linux/kernel/irq/manage.c
===
--- linux.orig/kernel/irq/manage.c
+++ linux/kernel/irq/manage.c
@@ -102,7 +102,19 @@ void disable_irq_nosync(unsigned int irq
spin_lock_irqsave(desc-lock, flags);
if (!desc-depth++) {
desc-status |= IRQ_DISABLED;
-   desc-chip-disable(irq);
+   /*
+* the _nosync variant of irq-disable suggests that the
+* caller is not worried about concurrency but about the
+* ordering of the irq flow itself. (such as hardware
+* getting confused about certain, normally valid irq
+* handling sequences.) So if the default disable handler
+* is in place then try the more conservative masking
+* instead:
+*/
+   if (desc-chip-disable == default_disable  desc-chip-mask)
+   desc-chip-mask(irq);
+   else
+   desc-chip-disable(irq);
}
spin_unlock_irqrestore(desc-lock, flags);
 }
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GENETLINK]: Correctly report errors while registering a multicast group

2007-07-24 Thread Thomas Graf

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6/net/netlink/genetlink.c
===
--- net-2.6.orig/net/netlink/genetlink.c2007-07-23 21:54:35.0 
+0200
+++ net-2.6/net/netlink/genetlink.c 2007-07-23 21:54:54.0 +0200
@@ -196,7 +196,7 @@ int genl_register_mc_group(struct genl_f
genl_ctrl_event(CTRL_CMD_NEWMCAST_GRP, grp);
  out:
genl_unlock();
-   return 0;
+   return err;
 }
 EXPORT_SYMBOL(genl_register_mc_group);
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GENETLINK]: Fix adjustment of number of multicast groups

2007-07-24 Thread Thomas Graf
The current calculation of the maximum number of genetlink
multicast groups seems odd, fix it.

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6/net/netlink/genetlink.c
===
--- net-2.6.orig/net/netlink/genetlink.c2007-07-23 22:03:02.0 
+0200
+++ net-2.6/net/netlink/genetlink.c 2007-07-23 22:05:12.0 +0200
@@ -184,7 +184,7 @@ int genl_register_mc_group(struct genl_f
}
 
err = netlink_change_ngroups(genl_sock,
-sizeof(unsigned long) * NETLINK_GENERIC);
+mc_groups_longs * BITS_PER_LONG);
if (err)
goto out;
 
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GENETLINK]: Fix race in genl_unregister_mc_groups()

2007-07-24 Thread Thomas Graf
family-mcast_groups is protected by genl_lock so it must
be held while accessing the list in genl_unregister_mc_groups().
Requires adding a non-locking variant of genl_unregister_mc_group().

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6/net/netlink/genetlink.c
===
--- net-2.6.orig/net/netlink/genetlink.c2007-07-23 22:08:04.0 
+0200
+++ net-2.6/net/netlink/genetlink.c 2007-07-23 22:09:08.0 +0200
@@ -200,6 +200,18 @@ int genl_register_mc_group(struct genl_f
 }
 EXPORT_SYMBOL(genl_register_mc_group);
 
+static void __genl_unregister_mc_group(struct genl_family *family,
+  struct genl_multicast_group *grp)
+{
+   BUG_ON(grp-family != family);
+   netlink_clear_multicast_users(genl_sock, grp-id);
+   clear_bit(grp-id, mc_groups);
+   list_del(grp-list);
+   genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
+   grp-id = 0;
+   grp-family = NULL;
+}
+
 /**
  * genl_unregister_mc_group - unregister a multicast group
  *
@@ -217,14 +229,8 @@ EXPORT_SYMBOL(genl_register_mc_group);
 void genl_unregister_mc_group(struct genl_family *family,
  struct genl_multicast_group *grp)
 {
-   BUG_ON(grp-family != family);
genl_lock();
-   netlink_clear_multicast_users(genl_sock, grp-id);
-   clear_bit(grp-id, mc_groups);
-   list_del(grp-list);
-   genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
-   grp-id = 0;
-   grp-family = NULL;
+   genl_unregister_mc_group(family, grp);
genl_unlock();
 }
 
@@ -232,8 +238,10 @@ static void genl_unregister_mc_groups(st
 {
struct genl_multicast_group *grp, *tmp;
 
+   genl_lock();
list_for_each_entry_safe(grp, tmp, family-mcast_groups, list)
-   genl_unregister_mc_group(family, grp);
+   __genl_unregister_mc_group(family, grp);
+   genl_unlock();
 }
 
 /**
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Au1000 eth : fix ioctl handling

2007-07-24 Thread Florian Fainelli
Hi all,

This patch fixes the handling of unsupported ioctls with the au1000_eth 
driver.

Signed-off-by: Florian Fainelli [EMAIL PROTECTED]
-- 
diff --git a/drivers/net/au1000_eth.c b/drivers/net/au1000_eth.c
index c27cfce..99a1c61 100644
--- a/drivers/net/au1000_eth.c
+++ b/drivers/net/au1000_eth.c
@@ -1316,12 +1316,20 @@ static void set_rx_mode(struct net_device *dev)
 static int au1000_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
struct au1000_private *aup = (struct au1000_private *)dev-priv;
+   struct mii_ioctl_data *data = if_mii(rq);
+   int rc = -EOPNOTSUPP;
 
if (!netif_running(dev)) return -EINVAL;
 
if (!aup-phy_dev) return -EINVAL; // PHY not controllable
 
-   return phy_mii_ioctl(aup-phy_dev, if_mii(rq), cmd);
+   switch (cmd) {
+   default:
+   rc = phy_mii_ioctl(aup-phy_dev, data, cmd);
+   break;
+   }
+
+   return rc;
 }
 
 static struct net_device_stats *au1000_get_stats(struct net_device *dev)


signature.asc
Description: This is a digitally signed message part.


Re: 2.6.23-rc1 sky2 boot crash in sky2_mac_intr

2007-07-24 Thread Florian Lohoff
On Tue, Jul 24, 2007 at 09:50:08AM +0100, Stephen Hemminger wrote:
 The problem is related to power management. The PHY has a number of PCI 
 configuration
 registers for power control, and the function of these changes based on the 
 version and
 revision of the chip. The driver does work on older versions of the EC-U, in
 Fujitsu laptop's, it is just the new rev that is broken.
 
 The driver should probably fail smarter (by not loading) if the PHY isn't 
 powered
 up correctly, but that doesn't help your problem.
 
 The vendor has provided me with documentation on many versions
 of the chip, but I don't have doc's on the lastest revision differences of 
 the EC Ultra, 
 so a proper solution is not easily available.  The best method for resolving 
 this would
 be to first try the vendor driver version of sk98lin and see if that fixes 
 it. If so,
 then it is easy to change sky2, to match the phy setup in the vendor driver.
 Another possibility is to look for places in sky2 driver where there are 
 places
 that compare version/revision.
 
 The most likely bits that need to change are in PCI registers: 0x80, 0x84 and 
 0x88
 You could also load the windows driver and dump PCI config space (with lspci 
 from
 cygwin), and see what the settings are there.
 
 I am away from my office for a month, and therefore away from any sky2
 hardware for testing. 

I'll try the above and keep you posted. The crash itself seems to be a
2.6.23-rc1 regression though. I never experienced this with 2.6.22-rc5
which i was running before.

Flo
-- 
Florian Lohoff  [EMAIL PROTECTED] +49-171-2280134
Those who would give up a little freedom to get a little 
  security shall soon have neither - Benjamin Franklin


signature.asc
Description: Digital signature


Re: Races in net_rx_action vs netpoll?

2007-07-24 Thread Jarek Poplawski
On 22-07-2007 09:05, David Miller wrote:
 From: Stephen Hemminger [EMAIL PROTECTED]
 Date: Thu, 19 Jul 2007 17:27:47 +0100
 
 Please revisit the requirements that netconsole needs and redesign
 it from scratch.  The existing code is causing too much breakage.

 Can it be done without breaking the semantics of network devices, or
 should we rewrite the driver interface to take have a different
 interface like netdev_sync_send_skb() that is slow, synchronous and
 non-interrupt (ie polls for completion).  Of course, then people
 will complain that netconsole traffic slows the machine down.  for
 completion.
 
 I couldn't agree more.
 
 Since netpoll runs outside of all of the normal netdevice locking
 rules, only the people using netpoll hit all the bugs.  That means
 most of us do not test out these code path, which is bad.
 
 So, if anything, a more integrated implementation is essential.
 

But, IMHO, until this will be done some simpler measures could
do be done to make poll_napi less dangerous (as a matter of fact
I wonder why oopses observed  diagnosed by Olaf are so rare).

Current locking with netpoll_poll_lock is mainly misleading.
It seems somebody who planned napi didn't even think such helpers
as poll_napi are possible on other cpus and somebody doing netpoll
didn't want to show this all per cpu data needs full locking anyway.
But since it's like this there is no reason to invent a wheel again
and normal locking should be done: so global (not per device)
spin_lock (#ifdef CONFIG_NETPOLL only) held during all net_rx_action
and spin_trylock similarly in poll_napi (with STATE_RX_SCHED
re-checking) should be minimum needed here.

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Mike Galbraith
On Mon, 2007-07-23 at 13:24 -0700, Andrew Morton wrote:

 You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out.

My box bugged during boot the first time I booted 23-rc1, but nothing
made it to the console, and I didn't have a serial console running.  I
didn't have DEBUG_PAGEALLOC or friends set.

 I haven't worked out where that kmap_atomic() call is coming from yet. 
 Both traces point up into the page allocator, but I _think_ that's stack
 gunk.

I just enabled all debug options, and was just rewarded with the below.

[  119.079531] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
[  119.558867] [ cut here ]
[  119.572197] kernel BUG at arch/i386/mm/highmem.c:38!
[  119.585804] invalid opcode:  [#1]
[  119.598013] PREEMPT SMP DEBUG_PAGEALLOC
[  119.610103] Modules linked in: edd button battery ac ip6t_REJECT xt_tcpudp 
ipt_REJECT xt_state iptable_mangle iptable_nat nf_nat iptable_filter 
ip6table_mangle nf_conntrack_ipv4 nf_conntrack nfnetlink ip_tables 
ip6table_filter ip6_tables x_tables nls_iso8859_1 nls_cp437 nls_utf8 
snd_intel8x0 snd_ac97_codec ac97_bus snd_mpu401 snd_pcm prism54 snd_timer 
snd_mpu401_uart snd_rawmidi snd_seq_device snd intel_agp agpgart soundcore 
snd_page_alloc i2c_i801 fan thermal processor
[  119.698063] CPU:1
[  119.698065] EIP:0060:[c011cd2d]Not tainted VLI
[  119.698067] EFLAGS: 00010006   (2.6.23-rc1-smp #75)
[  119.736358] EIP is at kmap_atomic_prot+0xa7/0xab
[  119.749647] eax: 3d07f163   ebx: c166db80   ecx: c0750e60   edx: 0007
[  119.765417] esi: 0022   edi: 0163   ebp: c069dcd4   esp: c069dcc8
[  119.781273] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[  119.796378] Process udevd (pid: 4775, ti=c069d000 task=f31aea60 
task.ti=f477d000)
[  119.804068] Stack: c166db80  c166db80 c069dcdc c011cd3f c069dd40 
c015b6e0 0001 
[  119.822272]0044 0163  0001 c165f4e0 0001 
c165f4e0 0001 
[  119.840762] 00028020 c061e71c c166db80 0046 0080 
0001 c011e4de 
[  119.859389] Call Trace:
[  119.881302]  [c0105144] show_trace_log_lvl+0x1a/0x30
[  119.896319]  [c01051ff] show_stack_log_lvl+0xa5/0xca
[  119.911171]  [c0105420] show_registers+0x1fc/0x343
[  119.925756]  [c0105689] die+0x122/0x249
[  119.939241]  [c0105834] do_trap+0x84/0xad
[  119.952897]  [c0105b1c] do_invalid_op+0x88/0x92
[  119.967118]  [c04cf3c2] error_code+0x72/0x78
[  119.980948]  [c011cd3f] kmap_atomic+0xe/0x10
[  119.994642]  [c015b6e0] get_page_from_freelist+0x39e/0x45e
[  120.009485]  [c015b7fb] __alloc_pages+0x5b/0x2db
[  120.023342]  [c0172872] cache_alloc_refill+0x380/0x6f2
[  120.037623]  [c0172e7a] kmem_cache_alloc+0xa1/0xa5
[  120.051426]  [c03fb397] neigh_create+0x5f/0x506
[  120.064894]  [c046e25d] ndisc_dst_alloc+0x122/0x151
[  120.078769]  [c0471b0b] __ndisc_send+0x8d/0x4fa
[  120.092340]  [c0472915] ndisc_send_ns+0x5f/0x7d
[  120.105848]  [c0469ff5] addrconf_dad_timer+0xdb/0xe0
[  120.119758]  [c012f8a0] run_timer_softirq+0x130/0x191
[  120.133717]  [c012c06d] __do_softirq+0x76/0xe4
[  120.147475]  [c0106b48] do_softirq+0x63/0xac
[  120.147488]  [c012bff5] 

(gdb) list *neigh_create+0x5f
0xc03fb397 is in neigh_create (include/linux/slab.h:259).
254 /*
255  * Shortcuts
256  */
257 static inline void *kmem_cache_zalloc(struct kmem_cache *k, gfp_t flags)
258 {
259 return kmem_cache_alloc(k, flags | __GFP_ZERO);
260 }
261
262 /**
263  * kzalloc - allocate memory. The memory is set to zero.
(gdb) list *kmem_cache_alloc+0xa1
0xc0172e7a is in kmem_cache_alloc (mm/slab.c:3176).
3171STATS_INC_ALLOCHIT(cachep);
3172ac-touched = 1;
3173objp = ac-entry[--ac-avail];
3174} else {
3175STATS_INC_ALLOCMISS(cachep);
3176objp = cache_alloc_refill(cachep, flags);
3177}
3178return objp;
3179}
3180
(gdb) list *cache_alloc_refill+0x380
0xc0172872 is in cache_alloc_refill (include/linux/gfp.h:154).
149
150 /* Unknown node is current node */
151 if (nid  0)
152 nid = numa_node_id();
153
154 return __alloc_pages(gfp_mask, order,
155 NODE_DATA(nid)-node_zonelists + gfp_zone(gfp_mask));
156 }
157
158 #ifdef CONFIG_NUMA
(gdb) list *__alloc_pages+0x5b
0xc015b7fb is in __alloc_pages (mm/page_alloc.c:1248).
1243if (unlikely(*z == NULL)) {
1244/* Should this ever happen?? */
1245return NULL;
1246}
1247
1248page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,
1249zonelist, ALLOC_WMARK_LOW|ALLOC_CPUSET);
1250if (page)
1251goto got_pg;
1252
(gdb) list *get_page_from_freelist+0x39e
0xc015b6e0 is in get_page_from_freelist (include/linux/highmem.h:122).
117 

Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Mike Galbraith
On Tue, 2007-07-24 at 12:01 +0200, Mike Galbraith wrote:
 On Mon, 2007-07-23 at 13:24 -0700, Andrew Morton wrote:
 
  You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that 
  out.
 
 My box bugged during boot the first time I booted 23-rc1, but nothing
 made it to the console, and I didn't have a serial console running.  I
 didn't have DEBUG_PAGEALLOC or friends set.
 
  I haven't worked out where that kmap_atomic() call is coming from yet. 
  Both traces point up into the page allocator, but I _think_ that's stack
  gunk.
 
 I just enabled all debug options, and was just rewarded with the below.

Hm.  I just also experienced filesystem corruption when I tried to send
from that kernel, and it bugged in the process.  My mount table ended up
in /etc/resolv.conf along with some binary goop, making nscd rather
unhappy after reboot.  fsck time.

.Mike

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Question: global lock (genl_mutex) possible refinement?

2007-07-24 Thread Richard MUSIL
Thomas Graf wrote:
 Please provide a new overall patch which is not based on your
 initial patch so I can review your idea properly.

Here it goes (merging two previous patches). I have diffed
against v2.6.22, which I am using currently as my base:

 include/net/genetlink.h |1 +
 net/netlink/genetlink.c |  106 +++
 2 files changed, 80 insertions(+), 27 deletions(-)

diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index b6eaca1..681ad13 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -25,6 +25,7 @@ struct genl_family
struct nlattr **attrbuf;/* private */
struct list_headops_list;   /* private */
struct list_headfamily_list;/* private */
+   struct mutexlock;   /* private */
 };
 
 /**
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index b9ab62f..0104267 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -38,6 +38,32 @@ static void genl_unlock(void)
genl_sock-sk_data_ready(genl_sock, 0);
 }
 
+static DEFINE_MUTEX(genl_fam_mutex);   /* serialization for family list 
management */
+
+static inline void genl_fam_lock(struct genl_family *family)
+{
+   mutex_lock(genl_fam_mutex);
+   if (family)
+   mutex_lock(family-lock);
+}
+
+static inline void genl_fam_unlock(struct genl_family *family)
+{
+   if (family)
+   mutex_unlock(family-lock);
+   mutex_unlock(genl_fam_mutex);
+}
+
+static inline void genl_onefam_lock(struct genl_family *family)
+{
+   mutex_lock(family-lock);
+}
+
+static inline void genl_onefam_unlock(struct genl_family *family)
+{
+   mutex_unlock(family-lock);
+}
+
 #define GENL_FAM_TAB_SIZE  16
 #define GENL_FAM_TAB_MASK  (GENL_FAM_TAB_SIZE - 1)
 
@@ -150,9 +176,9 @@ int genl_register_ops(struct genl_family *family, struct 
genl_ops *ops)
if (ops-policy)
ops-flags |= GENL_CMD_CAP_HASPOL;
 
-   genl_lock();
+   genl_fam_lock(family);
list_add_tail(ops-ops_list, family-ops_list);
-   genl_unlock();
+   genl_fam_unlock(family);
 
genl_ctrl_event(CTRL_CMD_NEWOPS, ops);
err = 0;
@@ -180,16 +206,16 @@ int genl_unregister_ops(struct genl_family *family, 
struct genl_ops *ops)
 {
struct genl_ops *rc;
 
-   genl_lock();
+   genl_fam_lock(family);
list_for_each_entry(rc, family-ops_list, ops_list) {
if (rc == ops) {
list_del(ops-ops_list);
-   genl_unlock();
+   genl_fam_unlock(family);
genl_ctrl_event(CTRL_CMD_DELOPS, ops);
return 0;
}
}
-   genl_unlock();
+   genl_fam_unlock(family);
 
return -ENOENT;
 }
@@ -216,8 +242,9 @@ int genl_register_family(struct genl_family *family)
goto errout;
 
INIT_LIST_HEAD(family-ops_list);
+   mutex_init(family-lock);
 
-   genl_lock();
+   genl_fam_lock(family);
 
if (genl_family_find_byname(family-name)) {
err = -EEXIST;
@@ -251,14 +278,14 @@ int genl_register_family(struct genl_family *family)
family-attrbuf = NULL;
 
list_add_tail(family-family_list, genl_family_chain(family-id));
-   genl_unlock();
+   genl_fam_unlock(family);
 
genl_ctrl_event(CTRL_CMD_NEWFAMILY, family);
 
return 0;
 
 errout_locked:
-   genl_unlock();
+   genl_fam_unlock(family);
 errout:
return err;
 }
@@ -275,7 +302,7 @@ int genl_unregister_family(struct genl_family *family)
 {
struct genl_family *rc;
 
-   genl_lock();
+   genl_fam_lock(family);
 
list_for_each_entry(rc, genl_family_chain(family-id), family_list) {
if (family-id != rc-id || strcmp(rc-name, family-name))
@@ -283,14 +310,16 @@ int genl_unregister_family(struct genl_family *family)
 
list_del(rc-family_list);
INIT_LIST_HEAD(family-ops_list);
-   genl_unlock();
+
+   genl_fam_unlock(family);
+   mutex_destroy(family-lock);
 
kfree(family-attrbuf);
genl_ctrl_event(CTRL_CMD_DELFAMILY, family);
return 0;
}
 
-   genl_unlock();
+   genl_fam_unlock(family);
 
return -ENOENT;
 }
@@ -303,38 +332,57 @@ static int genl_rcv_msg(struct sk_buff *skb, struct 
nlmsghdr *nlh)
struct genlmsghdr *hdr = nlmsg_data(nlh);
int hdrlen, err;
 
+   genl_fam_lock(NULL);
family = genl_family_find_byid(nlh-nlmsg_type);
-   if (family == NULL)
+   if (family == NULL) {
+   genl_fam_unlock(NULL);
return -ENOENT;
+   }
+
+   /* get particular family lock, but release global family lock
+* so registering operations for other families are possible */
+ 

[PATCH] virtio_net.c gso feature support

2007-07-24 Thread Rusty Russell
Feedback welcome, as always!

(There's been talk of a virtualization git tree, in which case there'll
be a decent home for these patches soon).

Cheers,
Rusty.
==
Add feature and GSO support to virtio net driver.

If you don't do GSO, you can simply ignore the first sg element of
every outgoing packet, and tack a dummy one on every incoming.

Signed-off-by: Rusty Russell [EMAIL PROTECTED]
---
 drivers/net/virtio_net.c   |  130 
 include/linux/virtio_net.h |   25 +++-
 2 files changed, 143 insertions(+), 12 deletions(-)

===
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -16,12 +16,13 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
-//#define DEBUG
+#define DEBUG
 #include linux/netdevice.h
 #include linux/etherdevice.h
 #include linux/module.h
 #include linux/virtio.h
 #include linux/scatterlist.h
+#include linux/virtio_net.h
 
 /* FIXME: Make dynamic */
 #define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN)
@@ -40,6 +41,18 @@ struct virtnet_info
struct sk_buff_head send;
 };
 
+static inline struct virtio_net_hdr *skb_vnet_hdr(struct sk_buff *skb)
+{
+   return (struct virtio_net_hdr *)skb-cb;
+}
+
+static void vnet_hdr_to_sg(struct scatterlist *sg, struct sk_buff *skb)
+{
+   sg-page = virt_to_page(skb_vnet_hdr(skb));
+   sg-offset = offset_in_page(skb_vnet_hdr(skb));
+   sg-length = sizeof(struct virtio_net_hdr);
+}
+
 static bool skb_xmit_done(struct virtqueue *vq)
 {
struct virtnet_info *vi = vq-priv;
@@ -52,12 +65,14 @@ static void receive_skb(struct net_devic
 static void receive_skb(struct net_device *dev, struct sk_buff *skb,
unsigned len)
 {
-   if (unlikely(len  ETH_HLEN)) {
+   struct virtio_net_hdr *hdr = skb_vnet_hdr(skb);
+
+   if (unlikely(len  sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
pr_debug(%s: short packet %i\n, dev-name, len);
dev-stats.rx_length_errors++;
-   dev_kfree_skb(skb);
-   return;
-   }
+   goto drop;
+   }
+   len -= sizeof(struct virtio_net_hdr);
BUG_ON(len  MAX_PACKET_LEN);
 
skb_trim(skb, len);
@@ -66,13 +81,70 @@ static void receive_skb(struct net_devic
 ntohs(skb-protocol), skb-len, skb-pkt_type);
dev-stats.rx_bytes += skb-len;
dev-stats.rx_packets++;
+
+   if (hdr-flags  VIRTIO_NET_F_NEEDS_CSUM) {
+   pr_debug(Needs csum!\n);
+   skb-ip_summed = CHECKSUM_PARTIAL;
+   skb-csum_start = hdr-csum_start;
+   skb-csum_offset = hdr-csum_offset;
+   if (skb-csum_start  skb-len - 2
+   || skb-csum_offset  skb-len - 2) {
+   if (net_ratelimit())
+   printk(KERN_WARNING %s: csum=%u/%u len=%u\n,
+  dev-name, skb-csum_start,
+  skb-csum_offset, skb-len);
+   goto frame_err;
+   }
+   }
+
+   if (hdr-gso_type != VIRTIO_NET_GSO_NONE) {
+   pr_debug(GSO!\n);
+   switch (hdr-gso_type) {
+   case VIRTIO_NET_GSO_TCP:
+   skb_shinfo(skb)-gso_type = SKB_GSO_TCPV4;
+   break;
+   case VIRTIO_NET_GSO_TCP_ECN:
+   skb_shinfo(skb)-gso_type = SKB_GSO_TCP_ECN;
+   break;
+   case VIRTIO_NET_GSO_UDP:
+   skb_shinfo(skb)-gso_type = SKB_GSO_UDP;
+   break;
+   case VIRTIO_NET_GSO_TCPV6:
+   skb_shinfo(skb)-gso_type = SKB_GSO_TCPV6;
+   break;
+   default:
+   if (net_ratelimit())
+   printk(KERN_WARNING %s: bad gso type %u.\n,
+  dev-name, hdr-gso_type);
+   goto frame_err;
+   }
+
+   skb_shinfo(skb)-gso_size = hdr-gso_size;
+   if (skb_shinfo(skb)-gso_size == 0) {
+   if (net_ratelimit())
+   printk(KERN_WARNING %s: zero gso size.\n,
+  dev-name);
+   goto frame_err;
+   }
+
+   /* Header must be checked, and gso_segs computed. */
+   skb_shinfo(skb)-gso_type |= SKB_GSO_DODGY;
+   skb_shinfo(skb)-gso_segs = 0;
+   }
+
netif_receive_skb(skb);
+   return;
+
+frame_err:
+   dev-stats.rx_frame_errors++;
+drop:
+   dev_kfree_skb(skb);
 }
 
 static void try_fill_recv(struct virtnet_info *vi)
 {
struct sk_buff *skb;
-   struct scatterlist sg[MAX_SKB_FRAGS];
+   struct scatterlist 

Re: [PATCH] [2.6.22] Fix a potential NULL pointer dereference in write_bulk_callback() in drivers/net/usb/pegasus.c

2007-07-24 Thread Petko Manolov

ACK :-)


cheers,
Petko


On Mon, 23 Jul 2007, Micah Gruber wrote:

This patch fixes a potential null dereference bug where we dereference 
pegasus before a null check. This patch simply moves the dereferencing after 
the null check.


Signed-off-by: Micah Gruber [EMAIL PROTECTED]

---

--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -768,11 +768,13 @@
static void write_bulk_callback(struct urb *urb)
{
  pegasus_t *pegasus = urb-context;
-   struct net_device *net = pegasus-net;
+   struct net_device *net;

  if (!pegasus)
  return;

+   net = pegasus-net;
+
  if (!netif_device_present(net) || !netif_running(net))
  return;


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)

2007-07-24 Thread OBATA Noboru
From: Rick Jones [EMAIL PROTECTED]
Subject: Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
Date: Thu, 12 Jul 2007 15:27:30 -0700

  So the problem is that RTO can grows to be twice the failover detection
  time.  So back to the original mail, the scenario has a switch with failover
  detection of 20seconds.  Worst case TCP RTO could grow to 40 seconds.
  
  Going back in archive to original mail:
  
  
 Background
 ==
 
 When designing a TCP/IP based network system on failover-capable
 network devices, people want to set timeouts hierarchically in
 three layers, network device layer, TCP layer, and application
 layer (bottom-up order), such that:
 
 1. Network device layer detects a failure first and switch to a
backup device (say, in 20sec).
 
 2. TCP layer timeout  retransmission comes next, _hopefully_
before the application layer timeout.
 
 3. Application layer detects a network failure last (by, say,
30sec timeout) and may trigger a system-level failover.
  
  
  Sounds like the solution is to make the switch failover detection faster.
  If you get switch failover down to 5sec then TCP RTO shouldn't be bigger
  than 10sec, and application will survive.
 
 That may indeed be the best solution, we'll have to wait to hear if 
 there is any freedom there.  When this sort of thing has crossed my path 
 in other contexts, the general answer is that the device failover time 
 is fixed, and the application layer time is similarly constrained by 
 end-user expectation/requirement.  Often as not, layer 8 and 9 issues 
 tend to dominate and expect to trump (in this case layer 4 issues).

I agree that application will survive if a user makes the
application timeout twice the failover timeout.  But I'm afraid
there is no such freedom.

Basically, to minimize downtime, shorter timeouts are preferred
as long as the probability of mis-detection is kept low at a
certain level.

In practice, failover timeouts for bonding, switches, or routers
are determined by heuristics.  Users know what timeout values and
retry counts of probe packets are suitable for detecting failure
of a certain combination of network equipments. (e.g., 5sec
timeout, retries 4 times)  Shorter is better.

And application timeout (or system timeout) is given as an
end-user requirement.  There is little change of negotiation,
really.  And again shorter (than requirement, if possible) is
better.

Regards,

-- 
OBATA Noboru ([EMAIL PROTECTED])
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)

2007-07-24 Thread OBATA Noboru
From: Rick Jones [EMAIL PROTECTED]
Subject: Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
Date: Thu, 12 Jul 2007 13:51:44 -0700

  TCP's timeouts are perfectly fine, and the only thing you
  might be showing above is that the application timeouts
  are too short or that TCP needs notifications.
 
 The application timeouts are probably being driven by external desires 
 for a given recovery time.

Agreed.

 TCP notifications don't solve anything unless the links in question are 
 local to the machine on which the TCP endpoint resides.

Agreed.  Thank you for a good explanation.

My original discussion using Dom-0 and Dom-U might be
misleading, but I was trying to say:

* Network failure and recovery(failover) are not necessarily
  visible locally.

  ** Dom-0 vs. Dom-U discussion is just an example of the case
 where a network failure is not visible locally.

  ** For another example, network switches or routers sitting
 somewhere in the middle of route are often duplicated with
 active-standby setting today.

* Quick response (retransmission) of TCP upon a recovery of such
  invisible devices as well is desired.

* If the failure and recovery are not visible locally, TCP
  notifications do not help.

Regards,

-- 
OBATA Noboru ([EMAIL PROTECTED])
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-24 Thread Patrick McHardy
Varun Chandramohan wrote:
 Oliver Hartkopp wrote:
 
I don't think you should round down timeout values.
  

  

Can you elaborate on that? As per the RFC of MIB ,we need only seconds
granularity. Taking that as the case i dont understand why round down
should not be done?
  


When you like to create any timeout based on your calculated value, you
might run into the problem that your calculated value is set to _zero_
even if there was some time before the conversion. This might probably
not what you indented to get.

So what about rounding up with

return (tv-tv_sec + (tv-tv_usec + 99)/100);

???

  
 
 This can done.  Is this what you were ref to me, Patrick?


Yes, timeouts should usually be at least as long as specified.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-24 Thread Varun Chandramohan
Patrick McHardy wrote:
 Varun Chandramohan wrote:
   
 Oliver Hartkopp wrote:

 
 I don't think you should round down timeout values.
  

  
   
 Can you elaborate on that? As per the RFC of MIB ,we need only seconds
 granularity. Taking that as the case i dont understand why round down
 should not be done?
  

 
 When you like to create any timeout based on your calculated value, you
 might run into the problem that your calculated value is set to _zero_
 even if there was some time before the conversion. This might probably
 not what you indented to get.

 So what about rounding up with

 return (tv-tv_sec + (tv-tv_usec + 99)/100);

 ???

  
   
 This can done.  Is this what you were ref to me, Patrick?
 


 Yes, timeouts should usually be at least as long as specified.
   
Thanks Patrick and Oliver, ill round it up. :-)

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Dan Williams

On 7/24/07, Jens Axboe [EMAIL PROTECTED] wrote:

What about the new async crypto stuff? I've been looking, but is it
guarenteed that async_memcpy() runs in process context with interrupts
enabled always? If not, there's a km type bug there.



Currently the only user is the MD raid456 driver, and yes, it only
performs copies from the handle_stripe routine which is always run in
process context with interrupts enabled.  However this is not
documented.  Would it be advisable to add a WARN_ON for this
condition?


In general, I think the highmem stuff could do with more safety checks:

- People ALWAYS get the atomic unmaps wrong, passing in the page instead
  of the address. I've seen tons of these. And since kunmap_atomic()
  takes a void pointer, nobody notices until it goes boom.
- People easily get the km type wrong - they use KM_USERx in interrupt
  context, or one of the irq variants without disabling interrupts.

If we could just catch these two types of bugs, we've got a lot of these
problems covered.


--
Jens Axboe


--
Dan
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Dan Williams

On 7/24/07, Andrew Morton [EMAIL PROTECTED] wrote:
[...]

 What about the new async crypto stuff? I've been looking, but is it
 guarenteed that async_memcpy() runs in process context with interrupts
 enabled always? If not, there's a km type bug there.

I think Shannon maintains that now.



I am looking after the async_tx API, I will send a patch to update
MAINTAINERS shortly.

--
Dan
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-2.6.22-rc7] xfrm state selection update to use inner addresses

2007-07-24 Thread Patrick McHardy
Joakim Koskela wrote:
 This patch modifies the xfrm state selection logic to use the inner
 addresses where the outer have been (incorrectly) used. This is
 required for beet mode in general and interfamily setups in both
 tunnel and beet mode.


Looks good.

Acked-by: Patrick McHardy [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] e1000: repost work around 82571 completion timout on pseries HW

2007-07-24 Thread Andy Gospodarek

There seems to have been some discussion about a patch like this in the
past but I still haven't noticed any platforms fixes or noticed that
this got included, so I'd like to propose this modified version.

Thoughts?

Signed-off-by: Andy Gospodarek [EMAIL PROTECTED]
---

 e1000_hw.c |7 +++
 e1000_hw.h |1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/e1000/e1000_hw.c b/drivers/net/e1000/e1000_hw.c
index 9be4469..7c75b8b 100644
--- a/drivers/net/e1000/e1000_hw.c
+++ b/drivers/net/e1000/e1000_hw.c
@@ -1030,6 +1030,13 @@ e1000_init_hw(struct e1000_hw *hw)
 break;
 }
 
+#if defined(CONFIG_PPC_PSERIES)
+if (hw-mac_type == e1000_82571) {
+uint32_t gcr = E1000_READ_REG(hw, GCR);
+gcr |= E1000_GCR_DISABLE_TIMEOUT_MECHANISM;
+E1000_WRITE_REG(hw, GCR, gcr);
+}
+#endif 
 
 if (hw-mac_type == e1000_82573) {
 uint32_t gcr = E1000_READ_REG(hw, GCR);
diff --git a/drivers/net/e1000/e1000_hw.h b/drivers/net/e1000/e1000_hw.h
index bd000b8..71f4e2d 100644
--- a/drivers/net/e1000/e1000_hw.h
+++ b/drivers/net/e1000/e1000_hw.h
@@ -2211,6 +2211,7 @@ struct e1000_host_command_info {
 #define PCI_EX_82566_SNOOP_ALL PCI_EX_NO_SNOOP_ALL
 
 #define E1000_GCR_L1_ACT_WITHOUT_L0S_RX 0x0800
+#define E1000_GCR_DISABLE_TIMEOUT_MECHANISM 0x8000
 /* Function Active and Power State to MNG */
 #define E1000_FACTPS_FUNC0_POWER_STATE_MASK 0x0003
 #define E1000_FACTPS_LAN0_VALID 0x0004
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] e1000: repost work around 82571 completion timout on pseries HW

2007-07-24 Thread Kok, Auke

Andy Gospodarek wrote:

There seems to have been some discussion about a patch like this in the
past but I still haven't noticed any platforms fixes or noticed that
this got included, so I'd like to propose this modified version.

Thoughts?


I've written a completely new version for IBM based on the response that 
implements this as a pci quirk (to tag the chipset, not the device).


However IBM has spotted an issue due to firmware hiding the id for linux of the 
bridge involved (on pseries), and they're working on an alternative to fix that.


So, the patch will look different, and I'll post it as soon as we have a version 
for both pseries and xseries that works.


Auke




Signed-off-by: Andy Gospodarek [EMAIL PROTECTED]
---

 e1000_hw.c |7 +++
 e1000_hw.h |1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/e1000/e1000_hw.c b/drivers/net/e1000/e1000_hw.c
index 9be4469..7c75b8b 100644
--- a/drivers/net/e1000/e1000_hw.c
+++ b/drivers/net/e1000/e1000_hw.c
@@ -1030,6 +1030,13 @@ e1000_init_hw(struct e1000_hw *hw)
 break;
 }
 
+#if defined(CONFIG_PPC_PSERIES)

+if (hw-mac_type == e1000_82571) {
+uint32_t gcr = E1000_READ_REG(hw, GCR);
+gcr |= E1000_GCR_DISABLE_TIMEOUT_MECHANISM;
+E1000_WRITE_REG(hw, GCR, gcr);
+}
+#endif 
 
 if (hw-mac_type == e1000_82573) {

 uint32_t gcr = E1000_READ_REG(hw, GCR);
diff --git a/drivers/net/e1000/e1000_hw.h b/drivers/net/e1000/e1000_hw.h
index bd000b8..71f4e2d 100644
--- a/drivers/net/e1000/e1000_hw.h
+++ b/drivers/net/e1000/e1000_hw.h
@@ -2211,6 +2211,7 @@ struct e1000_host_command_info {
 #define PCI_EX_82566_SNOOP_ALL PCI_EX_NO_SNOOP_ALL
 
 #define E1000_GCR_L1_ACT_WITHOUT_L0S_RX 0x0800

+#define E1000_GCR_DISABLE_TIMEOUT_MECHANISM 0x8000
 /* Function Active and Power State to MNG */
 #define E1000_FACTPS_FUNC0_POWER_STATE_MASK 0x0003
 #define E1000_FACTPS_LAN0_VALID 0x0004

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][30/37] Clean up duplicate includes in net/netfilter/

2007-07-24 Thread Patrick McHardy
Jesper Juhl wrote:
 This patch cleans up duplicate includes in
   net/netfilter/


I've queued this one and the bridge-netfilter patch (27/37), thanks
Jesper.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Oops in xfrm_bundle_ok

2007-07-24 Thread Ken-ichirou MATSUZAWA
 Hello,

From: Ken-ichirou MATSUZAWA [EMAIL PROTECTED]
Subject: Oops in xfrm_bundle_ok
Date: Thu, 05 Jul 2007 02:47:10 +0900 (JST)

 I got Oops like below. I glanced xfrm_bundle_ok() in
 xfrm_policy.c and __xfrm4.bundle_create() in xfrm4_policy.c.

It seems this was fixed by below, thanks a lot.

 commit bd0bf0765ea1fba80d7085e1f0375ec045631dc1
 Author: Patrick McHardy [EMAIL PROTECTED]
 Date:   Wed Jul 18 01:55:52 2007 -0700

 [XFRM]: Fix crash introduced by struct dst_entry reordering


@@ -2141,7 +2141,7 @@ int xfrm_bundle_ok(struct xfrm_policy *pol, struct 
xfrm_dst *first,
if (last == first)
break;
 
-   last = last-u.next;
+   last = (struct xfrm_dst *)last-u.dst.next;
last-child_mtu_cached = mtu;
}
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH]: revised make xfrm_audit_log more generic patch

2007-07-24 Thread Joy Latten
On Tue, 2007-07-24 at 11:04 -0400, Steve Grubb wrote:

  +   audit_log_format(audit_buf, %s: auid=%u, buf, auid);
   
  if (sid != 0 
  security_secid_to_secctx(sid, secctx, secctx_len) == 0)
 
 The operation in buf will not be parsed by the user space tools. Let's 
 use op=%s  where you have %s:  above. Audit record fields are name=value 
 and fields separated by spaces. op is what we are using in other places to 
 mean operation. 
 
 I know its a change from the records above, but we previously had some detail 
 about what operation was being performed by the record type and this did not 
 matter so much. Now that we only have one event type, the meaning of the 
 event being recorded needs to be parsable and in a field. 
 
 It also wouldn't hurt to change the text being sent to this function to have 
 a 
 hyphen instead of a space, so SPD delete becomes SPD-delete. This keeps 
 the parser happy.
 
 This patch otherwise looks good.

Sounds good. I will make the changes and resend. 
Thanks!!

Joy
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Andrew Morton
On Tue, 24 Jul 2007 12:01:09 +0200 Mike Galbraith [EMAIL PROTECTED] wrote:

 On Mon, 2007-07-23 at 13:24 -0700, Andrew Morton wrote:
 
  You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that 
  out.
 
 My box bugged during boot the first time I booted 23-rc1, but nothing
 made it to the console, and I didn't have a serial console running.  I
 didn't have DEBUG_PAGEALLOC or friends set.
 
  I haven't worked out where that kmap_atomic() call is coming from yet. 
  Both traces point up into the page allocator, but I _think_ that's stack
  gunk.
 
 I just enabled all debug options, and was just rewarded with the below.

doh.  It's a slab bug.

 [  119.079531] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
 [  119.558867] [ cut here ]
 [  119.572197] kernel BUG at arch/i386/mm/highmem.c:38!
 [  119.585804] invalid opcode:  [#1]
 [  119.598013] PREEMPT SMP DEBUG_PAGEALLOC
 [  119.610103] Modules linked in: edd button battery ac ip6t_REJECT xt_tcpudp 
 ipt_REJECT xt_state iptable_mangle iptable_nat nf_nat iptable_filter 
 ip6table_mangle nf_conntrack_ipv4 nf_conntrack nfnetlink ip_tables 
 ip6table_filter ip6_tables x_tables nls_iso8859_1 nls_cp437 nls_utf8 
 snd_intel8x0 snd_ac97_codec ac97_bus snd_mpu401 snd_pcm prism54 snd_timer 
 snd_mpu401_uart snd_rawmidi snd_seq_device snd intel_agp agpgart soundcore 
 snd_page_alloc i2c_i801 fan thermal processor
 [  119.698063] CPU:1
 [  119.698065] EIP:0060:[c011cd2d]Not tainted VLI
 [  119.698067] EFLAGS: 00010006   (2.6.23-rc1-smp #75)
 [  119.736358] EIP is at kmap_atomic_prot+0xa7/0xab
 [  119.749647] eax: 3d07f163   ebx: c166db80   ecx: c0750e60   edx: 0007
 [  119.765417] esi: 0022   edi: 0163   ebp: c069dcd4   esp: c069dcc8
 [  119.781273] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
 [  119.796378] Process udevd (pid: 4775, ti=c069d000 task=f31aea60 
 task.ti=f477d000)
 [  119.804068] Stack: c166db80  c166db80 c069dcdc c011cd3f c069dd40 
 c015b6e0 0001 
 [  119.822272]0044 0163  0001 c165f4e0 0001 
 c165f4e0 0001 
 [  119.840762] 00028020 c061e71c c166db80 0046 0080 
 0001 c011e4de 
 [  119.859389] Call Trace:
 [  119.881302]  [c0105144] show_trace_log_lvl+0x1a/0x30
 [  119.896319]  [c01051ff] show_stack_log_lvl+0xa5/0xca
 [  119.911171]  [c0105420] show_registers+0x1fc/0x343
 [  119.925756]  [c0105689] die+0x122/0x249
 [  119.939241]  [c0105834] do_trap+0x84/0xad
 [  119.952897]  [c0105b1c] do_invalid_op+0x88/0x92
 [  119.967118]  [c04cf3c2] error_code+0x72/0x78
 [  119.980948]  [c011cd3f] kmap_atomic+0xe/0x10
 [  119.994642]  [c015b6e0] get_page_from_freelist+0x39e/0x45e
 [  120.009485]  [c015b7fb] __alloc_pages+0x5b/0x2db
 [  120.023342]  [c0172872] cache_alloc_refill+0x380/0x6f2
 [  120.037623]  [c0172e7a] kmem_cache_alloc+0xa1/0xa5
 [  120.051426]  [c03fb397] neigh_create+0x5f/0x506
 [  120.064894]  [c046e25d] ndisc_dst_alloc+0x122/0x151
 [  120.078769]  [c0471b0b] __ndisc_send+0x8d/0x4fa
 [  120.092340]  [c0472915] ndisc_send_ns+0x5f/0x7d
 [  120.105848]  [c0469ff5] addrconf_dad_timer+0xdb/0xe0
 [  120.119758]  [c012f8a0] run_timer_softirq+0x130/0x191
 [  120.133717]  [c012c06d] __do_softirq+0x76/0xe4
 [  120.147475]  [c0106b48] do_softirq+0x63/0xac
 [  120.147488]  [c012bff5] 


 (gdb) list *neigh_create+0x5f
 0xc03fb397 is in neigh_create (include/linux/slab.h:259).
 254 /*
 255  * Shortcuts
 256  */
 257 static inline void *kmem_cache_zalloc(struct kmem_cache *k, gfp_t 
 flags)
 258 {
 259 return kmem_cache_alloc(k, flags | __GFP_ZERO);
 260 }

See, networking's kmem_cache_alloc(..., __GFP_ZERO) ended up calling into
the page allocator with __GFP_ZERO.  This is the bug - slab isn't supposed
to do that: the __GFP_ZERO is supposed to be removed.

Now, it's not a highmem page, so prep_zero_page() won't actually establish
a kmap, but it will check that the kmap slot is presently unused on this
CPU.

But networking calls in here from softirq context (illegal for KM_USER0)
and sometimes that KM_USER0 slot *will* be in use, so kmap_atomic_prot()
will go BUG.

I must say it's really really scary that such a low-level function as
prep_zero_page() is using KM_USER0.  I don't think it has enough debugging
checks in there to prevent Bad Stuff from going undetected.

I guess this was the bug:

--- a/mm/slab.c~a
+++ a/mm/slab.c
@@ -2776,7 +2776,7 @@ static int cache_grow(struct kmem_cache 
 * 'nodeid'.
 */
if (!objp)
-   objp = kmem_getpages(cachep, flags, nodeid);
+   objp = kmem_getpages(cachep, local_flags, nodeid);
if (!objp)
goto failed;
 
_


I don't see why you later got fs corruption - afacit we won't actually
modify the KM_USER0 slot in this scenario.


 262 /**
 263  * kzalloc - allocate memory. The memory is set to zero.
 (gdb) list *kmem_cache_alloc+0xa1
 0xc0172e7a is in kmem_cache_alloc 

Re: [PATCH] e1000: repost work around 82571 completion timout on pseries HW

2007-07-24 Thread Andy Gospodarek
On Tue, Jul 24, 2007 at 08:31:08AM -0700, Kok, Auke wrote:
 Andy Gospodarek wrote:
 There seems to have been some discussion about a patch like this in the
 past but I still haven't noticed any platforms fixes or noticed that
 this got included, so I'd like to propose this modified version.
 
 Thoughts?
 
 I've written a completely new version for IBM based on the response that 
 implements this as a pci quirk (to tag the chipset, not the device).
 
 However IBM has spotted an issue due to firmware hiding the id for linux of 
 the bridge involved (on pseries), and they're working on an alternative to 
 fix that.
 
 So, the patch will look different, and I'll post it as soon as we have a 
 version for both pseries and xseries that works.
 
 Auke
 

Sounds great.  I just wanted to make sure this didn't slip through the
cracks.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Fix race in genl_unregister_mc_groups()

2007-07-24 Thread Brian Haley

Thomas Graf wrote:

@@ -217,14 +229,8 @@ EXPORT_SYMBOL(genl_register_mc_group);
 void genl_unregister_mc_group(struct genl_family *family,
  struct genl_multicast_group *grp)
 {
-   BUG_ON(grp-family != family);
genl_lock();
-   netlink_clear_multicast_users(genl_sock, grp-id);
-   clear_bit(grp-id, mc_groups);
-   list_del(grp-list);
-   genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
-   grp-id = 0;
-   grp-family = NULL;
+   genl_unregister_mc_group(family, grp);
genl_unlock();
 }


Shouldn't this be __genl_unregister_mc_group(family, grp) ?

-Brian
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH]: revised make xfrm_audit_log more generic patch

2007-07-24 Thread Joy Latten
On Tue, 2007-07-24 at 11:04 -0400, Steve Grubb wrote:

 It also wouldn't hurt to change the text being sent to this function to have 
 a 
 hyphen instead of a space, so SPD delete becomes SPD-delete. This keeps 
 the parser happy.
 
Steve, more for my education, should all entries have this sort of
syntax, that is, a hyphen in it? I imagine some entries might be a 
bit more wordy and so I was wondering ahead of time how to do it.

Thanks!!

Joy
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] netxen: bug fixes for multiport adapters

2007-07-24 Thread Dhananjay Phadke

Jeff,

Any chance of these patches getting committed soon?

Thanks,
-Dhananjay Phadke

On 7/21/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

These patches include fix for problem with 2nd port of multiport adapters
on IBM blades. Also improves interrupt handling for multiport adapters
avoiding interrupt flood after interrupt is down.

Generated against upstream-fixes.

 drivers/net/netxen/netxen_nic.h  |3 +-
 drivers/net/netxen/netxen_nic_main.c |   85 +++---
 2 files changed, 43 insertions(+), 45 deletions(-)

--
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-24 Thread Patrick McHardy
Sam Ravnborg wrote:
 On Tue, Jul 24, 2007 at 08:36:33AM +0300, Al Boldi wrote:
 
Replaces NF_CONNTRACK_ENABLED with NF_CONNTRACK and selects it for 
NF_CONNTRACK_IPV4 and NF_CONNTRACK_IPV6

This exposes IPv4/6 connection tracking options for easier Kconfig setup.

Signed-off-by: Al Boldi [EMAIL PROTECTED]
Cc: David Miller [EMAIL PROTECTED]
Cc: Sam Ravnborg [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
---
--- a/net/netfilter/Kconfig   2007-07-09 06:38:52.0 +0300
+++ b/net/netfilter/Kconfig   2007-07-24 08:28:06.0 +0300
@@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
and is also scheduled to replace the old syslog-based ipt_LOG
and ip6t_LOG modules.
 
-# Rename this to NF_CONNTRACK in a 2.6.25
-config NF_CONNTRACK_ENABLED
+config NF_CONNTRACK


We kept this mainly for an easier upgrade. As the comment states, it
should go in 2.6.25, at which time all people having reconfigured
their kernel at least once since ip_conntrack was removed will have
the NF_CONNTRACK option set to the same value as NF_CONNTRACK_ENABLED.

--- a/net/ipv4/netfilter/Kconfig  2007-07-09 06:38:50.0 +0300
+++ b/net/ipv4/netfilter/Kconfig  2007-07-24 08:27:39.0 +0300
@@ -7,7 +7,7 @@ menu IP: Netfilter Configuration
 
 config NF_CONNTRACK_IPV4
  tristate IPv4 connection tracking support (required for NAT)
- depends on NF_CONNTRACK
+ select NF_CONNTRACK
  ---help---
Connection tracking keeps a record of what packets have passed
through your machine, in order to figure out how they are related
--- a/net/ipv6/netfilter/Kconfig  2007-07-09 06:38:51.0 +0300
+++ b/net/ipv6/netfilter/Kconfig  2007-07-24 08:27:54.0 +0300
@@ -7,7 +7,8 @@ menu IPv6: Netfilter Configuration (EXP
 
 config NF_CONNTRACK_IPV6
  tristate IPv6 connection tracking support (EXPERIMENTAL)
- depends on INET  IPV6  EXPERIMENTAL  NF_CONNTRACK
+ depends on INET  IPV6  EXPERIMENTAL
+ select NF_CONNTRACK
  ---help---
Connection tracking keeps a record of what packets have passed
through your machine, in order to figure out how they are related

 
 This change looks wrong.
 Due to the reverse nature of select kconfig cannot fulfill the dependencies
 of selected symbols. So as a rule of thumb select should only select
 symbols with no menu and no dependencies to avoid some of the
 problems that have popped up during the last months.


In this case it looks OK since the dependencies of IPv4 connection
tracking are (besides NF_CONNTRACK) are superset of those of
nf_conntrack.

But I vaguely recall having tried this myself and it broke somewhere,
maybe it was because of the NF_CONNTRACK_ENABLED option, I can't
recall anymore. Al, if this also works without removal of
NF_CONNTRACK_ENABLED, please resend without that part.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][30/37] Clean up duplicate includes in net/netfilter/

2007-07-24 Thread Jesper Juhl

On 24/07/07, Patrick McHardy [EMAIL PROTECTED] wrote:

Jesper Juhl wrote:
 This patch cleans up duplicate includes in
   net/netfilter/


I've queued this one and the bridge-netfilter patch (27/37), thanks
Jesper.


Thanks for the feedback Patrick.

--
Jesper Juhl [EMAIL PROTECTED]
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH]: revised make xfrm_audit_log more generic patch

2007-07-24 Thread Steve Grubb
On Tuesday 24 July 2007 12:33:26 pm Joy Latten wrote:
  It also wouldn't hurt to change the text being sent to this function to
  have a hyphen instead of a space, so SPD delete becomes SPD-delete.
  This keeps the parser happy.

 Steve, more for my education, should all entries have this sort of
 syntax, that is, a hyphen in it?

Only if its something that is important to have associated in reports. More 
that 1 or 2 hyphens is probably not good.

 I imagine some entries might be a bit more wordy and so I was wondering
 ahead of time how to do it.

The audit logs should be short as possible but contain everything necessary. 
You can have language in the record that makes it more understandable for 
people reading the raw record, but it won't necessarily be picked up by 
report parsers for searching or presentation.

If you want me to help review the choices, let me know offline and we can work 
through it.

-Steve
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: specifying scopid's for link-local IPv6 addrs

2007-07-24 Thread Rick Jones

Rick,

I don't see any way around this.  For example, on one of my test
systems, I have the following link local routes:

chance% netstat -A inet6 -rn | grep fe80::/64
fe80::/64   ::  
U 25600 eth0
fe80::/64   ::  
U 25600 eth2
fe80::/64   ::  
U 25600 eth3
fe80::/64   ::  
U 25600 eth4
fe80::/64   ::  
U 25600 eth5
fe80::/64   ::  
U 25600 eth6

So if I want to run a link local test to fe80::202:b3ff:fed4:cd1,
the system has no way to choose which is the correct interface to
use for the test, and will give an error if the interface isn't
specified. 


Yeah, I was wondering about that.  I'm not sure if the attempts on those other 
OSes happened to involve multiple interfaces or not.  Even so, it feels 
unpleasant for an application to deal with and I wonder if there is a way for a 
stack to deal with it on the application's behalf.  I guess that might involve 
some sort of layer violation between neightbor discovery and routing (typing 
while I think about things I know little about...)


Is there RFC chapter and verse I might read about routing with multiple 
link-local's on a system?



You must explicitly specify the desired interface.  For example,
on my test system, the correct interface is eth6 which is interface 8
(lo eth0 eth1 eth2 ... eth5 eth6).  Here is an example nuttcp test
specifying interface 8:

chance% nuttcp -P5100 fe80::202:b3ff:fed4:cd1%8
 1178.5809 MB /  10.02 sec =  986.2728 Mbps 12 %TX 15 %RX

nuttcp uses getaddrinfo() which parses the %ifindex field,
and then copies the sin6_scope_id from the res structure to the
server's sockaddr_in6 structure before initiating the connect().


OK, I'll give that a quick try with netperf:

[EMAIL PROTECTED] ~]# netperf -H 192.168.2.107 -c -C -i 30,3 -- -s 1M -S 1M -m 64K 
-H fe80::207:43ff:fe05:9d%2
TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to fe80::207:43ff:fe05:9d%2 
(fe80::207:43ff:fe05:9d) port 0 AF_INET6 : +/-2.5% @ 99% conf.


Cool - it establishes the data connection just fine.


To further demonstrate my ignorance :)  is that %n suffix something one might 
expect in most/all getaddrinfo()'s or is that unique to the one in Linux?


rick jones
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NET_DMA: where do we ever call dma_skb_copy_datagram_iovec() with NULL pinned_list?

2007-07-24 Thread Andrew Grover

On 7/20/07, Al Viro [EMAIL PROTECTED] wrote:

AFAICS, all callers of dma_skb_copy_datagram_iovec()
are either
* recursive for fragments, pass pinned_list unchanged or
* called from tcp, with pinned_list coming from
tp-ucopy.pinned_list and only when tp-ucopy.dma_chan is non-NULL.

Now, all non-NULL assignments to -dma_chan have the same form:
if (!tp-ucopy.dma_chan  tp-ucopy.pinned_list)
tp-ucopy.dma_chan = get_softnet_dma();
IOW, if -ucopy.pinned_list stays NULL, -ucopy.dma_chan will do the same.

Moreover, any place that resets -ucopy.pinned_list will also reset
-ucopy.dma_chan.

IOW, we can't ever get non-NULL tp-ucopy.dma_chan while tp-ucopy.pinned_list
is NULL.  So how can we ever get to the dma_memcpy_to_kernel_iovec()?


Shannon what do you think? Is this a bug, or does it no longer support
non-userspace iovecs?

Thanks -- Andy
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: specifying scopid's for link-local IPv6 addrs

2007-07-24 Thread Sridhar Samudrala
On Tue, 2007-07-24 at 10:13 -0700, Rick Jones wrote:
  Rick,
  
  I don't see any way around this.  For example, on one of my test
  systems, I have the following link local routes:
  
  chance% netstat -A inet6 -rn | grep fe80::/64
  fe80::/64   ::  
  U 25600 eth0
  fe80::/64   ::  
  U 25600 eth2
  fe80::/64   ::  
  U 25600 eth3
  fe80::/64   ::  
  U 25600 eth4
  fe80::/64   ::  
  U 25600 eth5
  fe80::/64   ::  
  U 25600 eth6
  
  So if I want to run a link local test to fe80::202:b3ff:fed4:cd1,
  the system has no way to choose which is the correct interface to
  use for the test, and will give an error if the interface isn't
  specified. 
 
 Yeah, I was wondering about that.  I'm not sure if the attempts on those 
 other 
 OSes happened to involve multiple interfaces or not.  Even so, it feels 
 unpleasant for an application to deal with and I wonder if there is a way for 
 a 
 stack to deal with it on the application's behalf.  I guess that might 
 involve 
 some sort of layer violation between neightbor discovery and routing (typing 
 while I think about things I know little about...)
 
 Is there RFC chapter and verse I might read about routing with multiple 
 link-local's on a system?
 
  You must explicitly specify the desired interface.  For example,
  on my test system, the correct interface is eth6 which is interface 8
  (lo eth0 eth1 eth2 ... eth5 eth6).  Here is an example nuttcp test
  specifying interface 8:
  
  chance% nuttcp -P5100 fe80::202:b3ff:fed4:cd1%8
   1178.5809 MB /  10.02 sec =  986.2728 Mbps 12 %TX 15 %RX
  
  nuttcp uses getaddrinfo() which parses the %ifindex field,
  and then copies the sin6_scope_id from the res structure to the
  server's sockaddr_in6 structure before initiating the connect().
 
 OK, I'll give that a quick try with netperf:
 
 [EMAIL PROTECTED] ~]# netperf -H 192.168.2.107 -c -C -i 30,3 -- -s 1M -S 1M 
 -m 64K 
 -H fe80::207:43ff:fe05:9d%2

We can even specify the interface name instead of the interface index
   link-local%ethX

getaddrinfo() uses if_nametoindex() internally to get the index.

Thanks
Sridhar
 TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to fe80::207:43ff:fe05:9d%2 
 (fe80::207:43ff:fe05:9d) port 0 AF_INET6 : +/-2.5% @ 99% conf.
 
 Cool - it establishes the data connection just fine.
 
 
 To further demonstrate my ignorance :)  is that %n suffix something one might 
 expect in most/all getaddrinfo()'s or is that unique to the one in Linux?
 
 rick jones
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: specifying scopid's for link-local IPv6 addrs

2007-07-24 Thread Rick Jones



You must explicitly specify the desired interface.  For example,
on my test system, the correct interface is eth6 which is interface 8
(lo eth0 eth1 eth2 ... eth5 eth6).  Here is an example nuttcp test
specifying interface 8:

chance% nuttcp -P5100 fe80::202:b3ff:fed4:cd1%8
 1178.5809 MB /  10.02 sec =  986.2728 Mbps 12 %TX 15 %RX

nuttcp uses getaddrinfo() which parses the %ifindex field,
and then copies the sin6_scope_id from the res structure to the
server's sockaddr_in6 structure before initiating the connect().



OK, I'll give that a quick try with netperf:

[EMAIL PROTECTED] ~]# netperf -H 192.168.2.107 -c -C -i 30,3 -- -s 1M -S 1M 
-m 64K -H fe80::207:43ff:fe05:9d%2
TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 
fe80::207:43ff:fe05:9d%2 (fe80::207:43ff:fe05:9d) port 0 AF_INET6 : 
+/-2.5% @ 99% conf.


Cool - it establishes the data connection just fine.


Well, I spoke too soon - while it got me past my EINVAL, the connection 
establishement timed-out.  Either I picked the wrong value for n, or I may yet 
need to make some tweaks to netperf.


rick jones
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-24 Thread Al Boldi
Patrick McHardy wrote:
 Sam Ravnborg wrote:
  On Tue, Jul 24, 2007 at 08:36:33AM +0300, Al Boldi wrote:
 Replaces NF_CONNTRACK_ENABLED with NF_CONNTRACK and selects it for
 NF_CONNTRACK_IPV4 and NF_CONNTRACK_IPV6
 
 This exposes IPv4/6 connection tracking options for easier Kconfig
  setup.
 
 Signed-off-by: Al Boldi [EMAIL PROTECTED]
 Cc: David Miller [EMAIL PROTECTED]
 Cc: Sam Ravnborg [EMAIL PROTECTED]
 Cc: Andrew Morton [EMAIL PROTECTED]
 ---
 --- a/net/netfilter/Kconfig 2007-07-09 06:38:52.0 +0300
 +++ b/net/netfilter/Kconfig 2007-07-24 08:28:06.0 +0300
 @@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
   and is also scheduled to replace the old syslog-based ipt_LOG
   and ip6t_LOG modules.
 
 -# Rename this to NF_CONNTRACK in a 2.6.25
 -config NF_CONNTRACK_ENABLED
 +config NF_CONNTRACK

 We kept this mainly for an easier upgrade. As the comment states, it
 should go in 2.6.25, at which time all people having reconfigured
 their kernel at least once since ip_conntrack was removed will have
 the NF_CONNTRACK option set to the same value as NF_CONNTRACK_ENABLED.

 --- a/net/ipv4/netfilter/Kconfig2007-07-09 06:38:50.0 +0300
 +++ b/net/ipv4/netfilter/Kconfig2007-07-24 08:27:39.0 +0300
 @@ -7,7 +7,7 @@ menu IP: Netfilter Configuration
 
  config NF_CONNTRACK_IPV4
 tristate IPv4 connection tracking support (required for NAT)
 -   depends on NF_CONNTRACK
 +   select NF_CONNTRACK
 ---help---
   Connection tracking keeps a record of what packets have passed
   through your machine, in order to figure out how they are related
 --- a/net/ipv6/netfilter/Kconfig2007-07-09 06:38:51.0 +0300
 +++ b/net/ipv6/netfilter/Kconfig2007-07-24 08:27:54.0 +0300
 @@ -7,7 +7,8 @@ menu IPv6: Netfilter Configuration (EXP
 
  config NF_CONNTRACK_IPV6
 tristate IPv6 connection tracking support (EXPERIMENTAL)
 -   depends on INET  IPV6  EXPERIMENTAL  NF_CONNTRACK
 +   depends on INET  IPV6  EXPERIMENTAL
 +   select NF_CONNTRACK
 ---help---
   Connection tracking keeps a record of what packets have passed
   through your machine, in order to figure out how they are related
 
  This change looks wrong.
  Due to the reverse nature of select kconfig cannot fulfill the
  dependencies of selected symbols. So as a rule of thumb select should
  only select symbols with no menu and no dependencies to avoid some of
  the
  problems that have popped up during the last months.

 In this case it looks OK since the dependencies of IPv4 connection
 tracking are (besides NF_CONNTRACK) are superset of those of
 nf_conntrack.

 But I vaguely recall having tried this myself and it broke somewhere,
 maybe it was because of the NF_CONNTRACK_ENABLED option, I can't
 recall anymore. Al, if this also works without removal of
 NF_CONNTRACK_ENABLED, please resend without that part.

It doesn't.  But how about this, if you really can't live without 
NF_CONNTRACK_ENBLED:

==
--- Kconfig.old 2007-07-09 06:38:52.0 +0300
+++ Kconfig 2007-07-24 20:24:27.0 +0300
@@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
  and is also scheduled to replace the old syslog-based ipt_LOG
  and ip6t_LOG modules.
 
-# Rename this to NF_CONNTRACK in a 2.6.25
-config NF_CONNTRACK_ENABLED
+config NF_CONNTRACK
tristate Netfilter connection tracking support
help
  Connection tracking keeps a record of what packets have passed
@@ -40,9 +39,9 @@ config NF_CONNTRACK_ENABLED
 
  To compile it as a module, choose M here.  If unsure, say N.
 
-config NF_CONNTRACK
+config NF_CONNTRACK_ENABLED
tristate
-   default NF_CONNTRACK_ENABLED
+   default NF_CONNTRACK
 
 config NF_CT_ACCT
bool Connection tracking flow accounting
==


Thanks!

--
Al

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-24 Thread Patrick McHardy
Al Boldi wrote:
 Patrick McHardy wrote:
 
But I vaguely recall having tried this myself and it broke somewhere,
maybe it was because of the NF_CONNTRACK_ENABLED option, I can't
recall anymore. Al, if this also works without removal of
NF_CONNTRACK_ENABLED, please resend without that part.
 
 
 It doesn't.  But how about this, if you really can't live without 
 NF_CONNTRACK_ENBLED:
 
 ==
 --- Kconfig.old   2007-07-09 06:38:52.0 +0300
 +++ Kconfig   2007-07-24 20:24:27.0 +0300
 @@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
 and is also scheduled to replace the old syslog-based ipt_LOG
 and ip6t_LOG modules.
  
 -# Rename this to NF_CONNTRACK in a 2.6.25
 -config NF_CONNTRACK_ENABLED
 +config NF_CONNTRACK
   tristate Netfilter connection tracking support
   help
 Connection tracking keeps a record of what packets have passed
 @@ -40,9 +39,9 @@ config NF_CONNTRACK_ENABLED
  
 To compile it as a module, choose M here.  If unsure, say N.
  
 -config NF_CONNTRACK
 +config NF_CONNTRACK_ENABLED
   tristate
 - default NF_CONNTRACK_ENABLED
 + default NF_CONNTRACK
  
  config NF_CT_ACCT
   bool Connection tracking flow accounting


That defeats the only purpose why we kept it. How about we change this
once we remove it, in 2.6.25?
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Adrian Bunk
On Mon, Jul 23, 2007 at 02:28:11PM -0700, Linus Torvalds wrote:
 
 
 On Mon, 23 Jul 2007, Andrew Morton wrote:
  
  It'd be nice to get a clean trace.  Are you able to obtain the full
  trace with CONFIG_FRAME_POINTER=y?
 
 If you are talking about
 
   http://userweb.kernel.org/~akpm/dsc03659.jpg
 
 then I think that _is_ a full trace. It's certainly not very messy, and it 
 seems accurate. It's just that inlining makes it much harder to see the 
 call-graphs, but that's what inlining does..
 
 For example, missing from the call graph is
 
   get_page_from_freelist -
 buffered_rmqueue -   [ missing - inlined ]
   prep_new_page -[ missing - inlined ]
 prep_zero_page - [ missing - inlined ]
   clear_highpage -   [ missing - inlined ]
 kmap_atomic -[ missing - tailcall ]
   kmap_atomic_prot
 
 but I'm pretty sure the call trace is good (and I'm also pretty sure gcc 
 is overly aggressive at inlining, and that it causes us pain for 
 debugging, but whatever)
...

For prep_zero_page() and clear_highpage() we can't blame gcc since we 
force gcc to always inline them.

buffered_rmqueue() and prep_new_page() are static functions with only 
one caller each, and for the normal non-debug case it's a really nice 
optimization to have them inlined automatically. But it might make sense 
to add -fno-inline-functions-called-once to the CFLAGS depending on some 
debug option?

   Linus

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Linus Torvalds


On Tue, 24 Jul 2007, Adrian Bunk wrote:
 
 buffered_rmqueue() and prep_new_page() are static functions with only 
 one caller each, and for the normal non-debug case it's a really nice 
 optimization to have them inlined automatically.

I'm not at all sure I agree.

Inlining big functions doesn't actually tend to generally generate any 
better code, so if gcc's logic is single callsite - always inline, then 
that logic is likely not right.

Linus
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: NET_DMA: where do we ever call dma_skb_copy_datagram_iovec() with NULL pinned_list?

2007-07-24 Thread Nelson, Shannon
Al Viro:

   AFAICS, all callers of dma_skb_copy_datagram_iovec()
are either
   * recursive for fragments, pass pinned_list unchanged or
   * called from tcp, with pinned_list coming from
tp-ucopy.pinned_list and only when tp-ucopy.dma_chan is non-NULL.

Now, all non-NULL assignments to -dma_chan have the same form:
   if (!tp-ucopy.dma_chan  tp-ucopy.pinned_list)
   tp-ucopy.dma_chan = get_softnet_dma();
IOW, if -ucopy.pinned_list stays NULL, -ucopy.dma_chan will 
do the same.

Moreover, any place that resets -ucopy.pinned_list will also reset
-ucopy.dma_chan.

IOW, we can't ever get non-NULL tp-ucopy.dma_chan while 
tp-ucopy.pinned_list
is NULL.  So how can we ever get to the dma_memcpy_to_kernel_iovec()?

It looks like this is extra code.  The history seems to be that it was
thought to be useful for internal copying, perhaps for smbfs or iSCSI,
as the comment suggests.  However, since no one is using it, it can
probably come out.  If there is no argument, I'll post a patch to remove
it.

sln
--
==
Mr. Shannon Nelson LAN Access Division, Intel Corp.
[EMAIL PROTECTED]I don't speak for Intel
(503) 712-7659Parents can't afford to be squeamish.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Linus Torvalds


On Tue, 24 Jul 2007, Andrew Morton wrote:
 
 I guess this was the bug:

Looks very likely to me. Mike, Alexey, does this fix things for you?

Linus
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Andrew Morton
On Tue, 24 Jul 2007 11:14:21 -0700 (PDT)
Linus Torvalds [EMAIL PROTECTED] wrote:

 
 
 On Tue, 24 Jul 2007, Adrian Bunk wrote:
  
  buffered_rmqueue() and prep_new_page() are static functions with only 
  one caller each, and for the normal non-debug case it's a really nice 
  optimization to have them inlined automatically.
 
 I'm not at all sure I agree.
 
 Inlining big functions doesn't actually tend to generally generate any 
 better code, so if gcc's logic is single callsite - always inline, then 
 that logic is likely not right.
 

fwiw, -fno-inline-functions-called-once (who knew?) takes i386 allnoconfig
vmlinux .text from 928360 up to 955362 bytes (27k larger).

A surprisingly large increase - I wonder if it did something dumb.  It
appears to still correctly inline those things which we've manually marked
inline.  hm.

It would be nice to defeat the autoinlining for debug purposes though.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-24 Thread Al Boldi
Patrick McHardy wrote:
 Al Boldi wrote:
  Patrick McHardy wrote:
 But I vaguely recall having tried this myself and it broke somewhere,
 maybe it was because of the NF_CONNTRACK_ENABLED option, I can't
 recall anymore. Al, if this also works without removal of
 NF_CONNTRACK_ENABLED, please resend without that part.
 
  It doesn't.  But how about this, if you really can't live without
  NF_CONNTRACK_ENBLED:
 
  ==
  --- Kconfig.old 2007-07-09 06:38:52.0 +0300
  +++ Kconfig 2007-07-24 20:24:27.0 +0300
  @@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
and is also scheduled to replace the old syslog-based ipt_LOG
and ip6t_LOG modules.
 
  -# Rename this to NF_CONNTRACK in a 2.6.25
  -config NF_CONNTRACK_ENABLED
  +config NF_CONNTRACK
  tristate Netfilter connection tracking support
  help
Connection tracking keeps a record of what packets have passed
  @@ -40,9 +39,9 @@ config NF_CONNTRACK_ENABLED
 
To compile it as a module, choose M here.  If unsure, say N.
 
  -config NF_CONNTRACK
  +config NF_CONNTRACK_ENABLED
  tristate
  -   default NF_CONNTRACK_ENABLED
  +   default NF_CONNTRACK
 
   config NF_CT_ACCT
  bool Connection tracking flow accounting

 That defeats the only purpose why we kept it.

I'm not sure how this would defeat the only purpose.  Isn't the purpose of 
this to alias NF_CONNTRACK_ENABLED to NF_CONNTRACK?  And as such would yield 
the same result.

Also, we could leave this as is, and select NF_CONNTRACK_ENABLED instead of 
NF_CONNTRACK.


Thanks!

--
Al

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Linus Torvalds


On Tue, 24 Jul 2007, Andrew Morton wrote:
 
 fwiw, -fno-inline-functions-called-once (who knew?) takes i386 allnoconfig
 vmlinux .text from 928360 up to 955362 bytes (27k larger).
 
 A surprisingly large increase - I wonder if it did something dumb.  It
 appears to still correctly inline those things which we've manually marked
 inline.  hm.

I think inlining small enough functions is worth it, and the thing is, the 
kernel is actually pretty damn good at having lots of small functions. 
It's one of the few things I really care about from a coding style 
standpoint.

So I'm not surprised that -fno-inline-functions-called-once makes things 
larger, because I think it's generally a good idea to inline things that 
are just called once. But it does make things harder to debug, and the 
performance advantages become increasingly small for bigger functions.

And that's a balancing act. Do we care about performance? Yes. But do we 
care so much that it's worth inlining something like buffered_rmqueue()? 

So I would not be surprised if -fno-inline-functions-called-once will 
disable *all* the inlining heuristics, and say oh, it's not an inline 
function, and it's only called once, so we won't inline it at all.

So called once should probably make the inlining weight bigger (ie 
inline *larger* functions than you would otherwise), it just shouldn't 
make it infinite. It's not worth it.

Linus
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-24 Thread Patrick McHardy
Al Boldi wrote:
 Patrick McHardy wrote:
 
Al Boldi wrote:

Patrick McHardy wrote:

But I vaguely recall having tried this myself and it broke somewhere,
maybe it was because of the NF_CONNTRACK_ENABLED option, I can't
recall anymore. Al, if this also works without removal of
NF_CONNTRACK_ENABLED, please resend without that part.

It doesn't.  But how about this, if you really can't live without
NF_CONNTRACK_ENBLED:

==
--- Kconfig.old  2007-07-09 06:38:52.0 +0300
+++ Kconfig  2007-07-24 20:24:27.0 +0300
@@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
   and is also scheduled to replace the old syslog-based ipt_LOG
   and ip6t_LOG modules.

-# Rename this to NF_CONNTRACK in a 2.6.25
-config NF_CONNTRACK_ENABLED
+config NF_CONNTRACK
 tristate Netfilter connection tracking support
 help
   Connection tracking keeps a record of what packets have passed
@@ -40,9 +39,9 @@ config NF_CONNTRACK_ENABLED

   To compile it as a module, choose M here.  If unsure, say N.

-config NF_CONNTRACK
+config NF_CONNTRACK_ENABLED
 tristate
-default NF_CONNTRACK_ENABLED
+default NF_CONNTRACK

 config NF_CT_ACCT
 bool Connection tracking flow accounting

That defeats the only purpose why we kept it.
 
 
 I'm not sure how this would defeat the only purpose.  Isn't the purpose of 
 this to alias NF_CONNTRACK_ENABLED to NF_CONNTRACK?  And as such would yield 
 the same result.


The purpose is to avoid forcing people a second time to reconfigure
the conntrack options since we've completed nf_conntrack and removed
ip_conntrack. Previously NF_CONNTRACK was a bool (selecting the new
implementation) and NF_CONNTRACK_ENABLED specified whether to build
either nf_conntrack or ip_conntrack modular/static/not at all. So
old configs only have the information whether to build modular in
NF_CONNTRACK_ENABLED, but NF_CONNTRACK is what actually controls it.
With your change, old configs will still build nf_conntrack properly,
but they will always choose static linking.

 Also, we could leave this as is, and select NF_CONNTRACK_ENABLED instead of 
 NF_CONNTRACK.

I guess so, and that would have to select NF_CONNTRACK.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/10] Implement batching skb API

2007-07-24 Thread jamal
KK,

On Tue, 2007-24-07 at 09:14 +0530, Krishna Kumar2 wrote:

 
 J Hadi Salim [EMAIL PROTECTED] wrote on 07/23/2007 06:02:01 PM:


 Actually you have not sent netperf results with prep and without prep.

My results were based on pktgen (which i explained as testing the
driver). I think depending on netperf without further analysis is
simplistic. It was like me doing forwarding tests on these patches.

  So _which_ non-LLTX driver doesnt do that? ;-
 
 I have no idea since I haven't looked at all drivers. Can you tell which
 all non-LLTX drivers does that ? I stated this as the sole criterea.

The few i have peeked at all do it. I also think the e1000 should be
converted to be non-LLTX. The rest of netdev is screaming to kill LLTX. 

  tun driver doesnt use it either - but i doubt that makes it bloat
 
 Adding extra code that is currently not usable (esp from a submission
 point) is bloat.

So far i have converted 3 drivers, 1 of them doesnt use it. Two more
driver conversions are on the way, they will both use it. How is this
bloat again? 
A few emails back you said if only IPOIB can use batching then thats
good enough justification. 

  You waltz in, have the luxury of looking at my code, presentations, many
  discussions with me etc ...
 
 luxury ? 
 I had implemented the entire thing even before knowing that you
 are working on something similar! and I had sent the first proposal to
 netdev,

I saw your patch at the end of may (or at least 2 weeks after you said
it existed). That patch has very little resemblance to what you just
posted conceptwise or codewise. I could post it if you would give me
permission.

 *after* which you told that you have your own code and presentations (which
 I had never seen earlier - I joined netdev a few months back, earlier I was
 working on RDMA, Infiniband as you know).

I am gonna assume you didnt know of my work - which i have been making
public for about 3 years. Infact i talked about this topic when i
visited your office in 2006 on a day you were not present, so it is
plausible you didnt hear of it.

  And it didn't give me any great
 ideas either, remember I had posted results for E1000 at the time of
 sending the proposals. 

In mid-June you sent me a series of patches which included anything from
changing variable names to combining qdisc_restart and about everything
i referred to as being cosmetic differences in your posted patches. I
took two of those and incorporated them in. One was an XXX in my code
already to allocate the dev-blist 
(Commit: bb4464c5f67e2a69ffb233fcf07aede8657e4f63). 
The other one was a mechanical removal of the blist being passed
(Commit: 0e9959e5ee6f6d46747c97ca8edc91b3eefa0757). 
Some of the others i asked you to defer. For example, the reason i gave
you for not merging any qdisc_restart_combine changes is because i was
waiting for Dave to swallow the qdisc_restart changes i made; otherwise
maintainance becomes extremely painful for me. 
Sridhar actually provided a lot more valuable comments and fixes but has
not planted a flag on behalf of the queen of spain like you did. 

 However I do give credit in my proposal to you for what
 ideas that your provided (without actual code), and the same I did for other
 people who did the same, like Dave, Sridhar. BTW, you too had discussions 
 with me,
 and I sent some patches to improve your code too, 

I incorporated two of your patches and asked for deferal of others.
These patches have now shown up in what you claim as the difference. I
just call them cosmetic difference not to downplay the importance of
having an ethtool interface but because they do not make batching
perform any better. The real differences are those two items. I am
suprised you havent cannibalized those changes as well. I thought you
renamed them to something else; according to your posting:
This patch will work with drivers updated by Jamal, Matt  Michael Chan
with minor modifications - rename xmit_win to xmit_slots  rename batch
handler. Or maybe thats a future plan you have in mind?

 so it looks like a two
 way street to me (and that is how open source works and should).

Open source is a lot more transparent than that.

You posted a question, which was part of your research. I responded and
told you i have patches; you asked me for them and i promptly ported
them from pre-2.6.18 to the latest kernel at the time. 

The nature of this batching work is one of performance. So numbers are
important. If you had some strong disagreements on something in the
architecture, then it would be of great value to explain it in a
technical detail - and more importantly to provide some numbers to say
why it is a bad idea. You get numbers by running some tests. 
You did none of the above. Your effort has been to produce your patch
for whatever reasons. This would not have been problematic to me if it
actually was based within reasons of optimization because the end goal
would have been achieved.

I have deleted the rest of the email 

Re: 2.6.20-2.6.21 - networking dies after random time

2007-07-24 Thread Linus Torvalds


On Tue, 24 Jul 2007, Ingo Molnar wrote:
 
 please try the patch below instead.

I'm hoping this is just a let's see if the behavior changes patch, not 
something that you think should be applied if it fixes something?

This patch looks like it is trying to paper over (rather than fix) some 
possible bug in the -disable logic. Makes sense as a let's see if it's 
that kind of thing, but not as a let's fix it.

Or am I missing something?

Linus
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Andi Kleen
Linus Torvalds [EMAIL PROTECTED] writes:
 
 So called once should probably make the inlining weight bigger (ie 
 inline *larger* functions than you would otherwise), it just shouldn't 
 make it infinite. It's not worth it.

There's probably a --param where it can be tweaked exactly. The
problem is that --params tend to be very gcc version specific
and might do something completely different on a newer or 
older version. So it's better not to use them.

-Andi
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Adrian Bunk
On Tue, Jul 24, 2007 at 12:15:48PM -0700, Linus Torvalds wrote:
 
 
 On Tue, 24 Jul 2007, Andrew Morton wrote:
  
  fwiw, -fno-inline-functions-called-once (who knew?) takes i386 allnoconfig
  vmlinux .text from 928360 up to 955362 bytes (27k larger).
  
  A surprisingly large increase - I wonder if it did something dumb.  It
  appears to still correctly inline those things which we've manually marked
  inline.  hm.
 
 I think inlining small enough functions is worth it, and the thing is, the 
 kernel is actually pretty damn good at having lots of small functions. 
 It's one of the few things I really care about from a coding style 
 standpoint.
 
 So I'm not surprised that -fno-inline-functions-called-once makes things 
 larger, because I think it's generally a good idea to inline things that 
 are just called once. But it does make things harder to debug, and the 
 performance advantages become increasingly small for bigger functions.
 
 And that's a balancing act. Do we care about performance? Yes.

When using CONFIG_CC_OPTIMIZE_FOR_SIZE=y we even actively tell gcc that 
we only care about size and do not care about performance...

 But do we 
 care so much that it's worth inlining something like buffered_rmqueue()? 
...

Where is the problem with having buffered_rmqueue() inlined?

   Linus

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Linus Torvalds


On Tue, 24 Jul 2007, Andi Kleen wrote:
 
 There's probably a --param where it can be tweaked exactly. The
 problem is that --params tend to be very gcc version specific
 and might do something completely different on a newer or 
 older version. So it's better not to use them.

I agree wholeheartedly with that sentiment. We've tried at times (because 
some gcc snapshots made some *truly* insane choices for a while), and 
maybe we still have some around. Not worth the pain.

Linus
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Linus Torvalds


On Tue, 24 Jul 2007, Adrian Bunk wrote:
 
  But do we 
  care so much that it's worth inlining something like buffered_rmqueue()? 
 ...
 
 Where is the problem with having buffered_rmqueue() inlined?

In this case, it was a pain to just even try to find the call chain, or 
read the asm.

I would encourage lots of kernel hackers to read the assembler code gcc 
generates. I suspect people being aware of code generation issues (and 
writing their code with that in mind) is a *much* bigger performance 
impact than gcc inlining random functions.

So maybe I'm old-fashioned and crazy, but readability of the asm result 
actually is a worthwhile goal. Not because we care directly, but because 
I'd like to encourage people to do it, due to the *indirect* benefits.

Linus
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-24 Thread David Stevens
Oliver Hartkopp [EMAIL PROTECTED] wrote on 07/23/2007 11:22:39 PM:

 When you like to create any timeout based on your calculated value, you
 might run into the problem that your calculated value is set to _zero_
 even if there was some time before the conversion. This might probably
 not what you indented to get.

BTW, these are stats (for human consumption), not timers.

+-DLS

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] amso1100: QP init bug in amso driver

2007-07-24 Thread Tom Tucker
Roland:

The guys at UNH found this and fixed it. I'm surprised no
one has hit this before. I guess it only breaks when the 
refcount on the QP is non-zero.

Initialize the wait_queue_head_t in the c2_qp structure.

Signed-off-by: Ethan Burns [EMAIL PROTECTED]
Acked-by: Tom Tucker [EMAIL PROTECTED]

---
 drivers/infiniband/hw/amso1100/c2_qp.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/amso1100/c2_qp.c 
b/drivers/infiniband/hw/amso1100/c2_qp.c
index 420c138..01d0786 100644
--- a/drivers/infiniband/hw/amso1100/c2_qp.c
+++ b/drivers/infiniband/hw/amso1100/c2_qp.c
@@ -506,6 +506,7 @@ int c2_alloc_qp(struct c2_dev *c2dev,
qp-send_sgl_depth = qp_attrs-cap.max_send_sge;
qp-rdma_write_sgl_depth = qp_attrs-cap.max_send_sge;
qp-recv_sgl_depth = qp_attrs-cap.max_recv_sge;
+   init_waitqueue_head(qp-wait);
 
/* Initialize the SQ MQ */
q_size = be32_to_cpu(reply-sq_depth);

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-24 Thread Al Boldi
Patrick McHardy wrote:
 Al Boldi wrote:
  Also, we could leave this as is, and select NF_CONNTRACK_ENABLED instead
  of NF_CONNTRACK.

 I guess so, and that would have to select NF_CONNTRACK.

Should I resend, or can you take care of it?


Thanks!

--
Al

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.20-2.6.21 - networking dies after random time

2007-07-24 Thread Ingo Molnar

* Linus Torvalds [EMAIL PROTECTED] wrote:

 On Tue, 24 Jul 2007, Ingo Molnar wrote:
  
  please try the patch below instead.
 
 I'm hoping this is just a let's see if the behavior changes patch, 
 not something that you think should be applied if it fixes something?
 
 This patch looks like it is trying to paper over (rather than fix) 
 some possible bug in the -disable logic. Makes sense as a let's 
 see if it's that kind of thing, but not as a let's fix it.
 
 Or am I missing something?

yeah - it's a totaly bad and unacceptable hack (i realized how bad it 
was when i wrote up that comment section ...), i just wanted to see 
which portion of ne2k/lib8390.c is sensitive to the fact whether an irq 
line is masked or not. The patch has no SOB line either.

the current best fix forward is to undo my original change, unless we 
find a better fix for this problem. (Note that the other patches posted 
in this thread are broken too: they only mask the irq but dont reliably 
unmask it.)

here's the current method of handling irqs for Marcin's card:

17: 12   IO-APIC-fasteoi   eth1, eth0

and fasteoi is a really simple sequence: no masking/unmasking by the 
flow handler itself but a NOP at entry and an APIC-EOI at the end. The 
disable/enable irq thing should thus have minimal effect if done within 
an irq handler.

now ne2k does something uncommon: for xmit (which is normally done 
outside of irq handlers) it will disable_irq_nosync()/enable_irq() the 
interrupt. It does it to exclude the handler from _that_ CPU, but due to 
the _nosync it does not exclude it from any other CPUs. So that's a bit 
weird already.

just in case, i've just re-checked all the genirq bits that change 
IRQ_DISABLED (that bit accidentally clear would be the only way to truly 
allow an IRQ handler to interrupt the disable_irq_nosync() critical 
section on that CPU) - but i can see no way for that to happen: we 
unconditionally detect and report unbalanced and underflowing 
irq_desc-depth, and the only other place (besides enable_irq()) that 
clears IRQ_DISABLED is __set_irq_handler(), and on x86 that is only used 
during bootup.

Marcin, could you try the patch below too? [without having any other 
patch applied.] It basically turns the critical section into an irqs-off 
critical section and thus checks whether your problem is related to that 
particular area of code.

Ingo

Index: linux/drivers/net/lib8390.c
===
--- linux.orig/drivers/net/lib8390.c
+++ linux/drivers/net/lib8390.c
@@ -297,9 +297,7 @@ static int ei_start_xmit(struct sk_buff 
 *  Slow phase with lock held.
 */
 
-   disable_irq_nosync_lockdep_irqsave(dev-irq, flags);
-
-   spin_lock(ei_local-page_lock);
+   spin_lock_irqsave(ei_local-page_lock, flags);
 
ei_local-irqlock = 1;
 
@@ -376,8 +374,7 @@ static int ei_start_xmit(struct sk_buff 
ei_local-irqlock = 0;
ei_outb_p(ENISR_ALL, e8390_base + EN0_IMR);
 
-   spin_unlock(ei_local-page_lock);
-   enable_irq_lockdep_irqrestore(dev-irq, flags);
+   spin_unlock_irqrestore(ei_local-page_lock, flags);
 
dev_kfree_skb (skb);
ei_local-stat.tx_bytes += send_length;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-24 Thread Alexey Dobriyan
On Tue, Jul 24, 2007 at 11:25:22AM -0700, Linus Torvalds wrote:
 On Tue, 24 Jul 2007, Andrew Morton wrote:
  I guess this was the bug:

 Looks very likely to me. Mike, Alexey, does this fix things for you?

Yeah, box is running for more than hour, survived LTP, gdb testsuite,
portage sync and so on.

What new to me is that someone decided TSC is unstable but that's
probably OK on full debug kernel:

Clocksource tsc unstable (delta = 91724418 ns)
Time: pit clocksource has been installed.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [2.6.22] Fix a potential NULL pointer dereference in mace_interrupt() in drivers/net/pcmcia/nmclan_cs.c

2007-07-24 Thread Jeff Garzik

Micah Gruber wrote:
This patch fixes a potential null dereference bug where we dereference 
DEV before a null check. This patch simply moves the dereferencing after 
the null check.


Signed-off-by: Micah Gruber [EMAIL PROTECTED]

---

--- a/drivers/net/pcmcia/nmclan_cs.c+++ 
b/drivers/net/pcmcia/nmclan_cs.c@@ -996,7 +996,7 @@


PLEASE PLEASE PLEASE fix your mailer.

None of your patches are apply-able via script or patch(1), which is how 
all Linux maintainers apply patches.


If you want to contribute to the kernel, you -must- figure out how to 
send patches via email.


Linux kernel development is done almost exclusively through email, so it 
is very important to get the details right.


Jeff



-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] ucc_geth: add ethtool support

2007-07-24 Thread Jeff Garzik

Li Yang wrote:

The patch enables statistics in ucc_geth and adds ethtool support to
ucc_geth driver.

Signed-off-by: Li Yang [EMAIL PROTECTED]
---
 drivers/net/Makefile   |2 +-
 drivers/net/ucc_geth.c |   19 +-
 drivers/net/ucc_geth.h |6 +
 drivers/net/ucc_geth_ethtool.c |  388 
 drivers/net/ucc_geth_mii.c |6 +-
 5 files changed, 407 insertions(+), 14 deletions(-)
 create mode 100644 drivers/net/ucc_geth_ethtool.c


applied 1-2


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] netxen: Load firmware during probe, dma watchdog fix.

2007-07-24 Thread Jeff Garzik

[EMAIL PROTECTED] wrote:

The firmware should be loaded after resetting hardware during PCI probe,
besides module unload. This fixes issue with 2nd port of multiport adapter
on powerpc blades. This patch also fixes a bug that PCI resources are not
freed if dma watchdog shutdown failed. The dma watchdog poll messages
during module unload are also suppressed.

Signed-off-by: Dhananjay Phadke [EMAIL PROTECTED]
Signed-off-by: Milan Bag [EMAIL PROTECTED]
Signed-off-by: Wen Xiong [EMAIL PROTECTED]


applied


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/10] ps3: fix wrong calculation of rx descriptor address

2007-07-24 Thread Jeff Garzik

Masakazu Mokuno wrote:

Fixed the bug that calculation of the address of rx descriptor was
wrong.

Signed-off-by: Masakazu Mokuno [EMAIL PROTECTED]
---
 drivers/net/ps3_gelic_net.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


applied 1-10


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] atl1: change tpd_avail function name

2007-07-24 Thread Jeff Garzik

Jay Cliburn wrote:

Change tpd_avail() to atl1_tpd_avail().

Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
---
 drivers/net/atl1/atl1_main.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)


applied 1-5


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [2.6 patch] drivers/net/acenic.c: fix check-after-use

2007-07-24 Thread Jeff Garzik

Adrian Bunk wrote:
The Coverity checker noted that we've already dereferenced dev when we 
check whether it's NULL.


Since it's impossible that dev is NULL at this place this patch 
removes the NULL check.


Signed-off-by: Adrian Bunk [EMAIL PROTECTED]


applied


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] eHEA: net_poll support

2007-07-24 Thread Jeff Garzik

Jan-Bernd Themann wrote:

net_poll support for eHEA added

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
---


 drivers/net/ehea/ehea.h  |2 +-
 drivers/net/ehea/ehea_main.c |   22 +-
 2 files changed, 22 insertions(+), 2 deletions(-)


applied


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] defxx: Use __maybe_unused rather than a local hack

2007-07-24 Thread Jeff Garzik

Maciej W. Rozycki wrote:
 This is a change to remove a local hack in favour to __maybe_unused that 
has been recently added.


Signed-off-by: Maciej W. Rozycki [EMAIL PROTECTED]
---
Hi,

 It should be obvious.  The code builds, therefore it works.

 Please apply,

  Maciej



applied


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] forcedeth: new device ids in pci_ids.h

2007-07-24 Thread Jeff Garzik

Ayaz Abdulla wrote:

This patch contains new device ids for MCP73 chipset.

Signed-Off-By: Ayaz Abdulla [EMAIL PROTECTED]


applied 1-2


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 2/3] netdev: i82596 Ethernet needs asm/cacheflush.h

2007-07-24 Thread Jeff Garzik

Geert Uytterhoeven wrote:

netdev: i82596 Ethernet needs asm/cacheflush.h on m68k

drivers/net/82596.c: In function 'init_rx_bufs':
drivers/net/82596.c:552: error: implicit declaration of function 'cache_clear'
drivers/net/82596.c: In function 'i596_start_xmit':
drivers/net/82596.c:1104: error: implicit declaration of function 'cache_push'

The driver still compiles on ia32 (CONFIG_APRICOT=y)

Signed-off-by: Geert Uytterhoeven [EMAIL PROTECTED]


applied


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] NET_DMA: remove unused dma_memcpy_to_kernel_iovec

2007-07-24 Thread Shannon Nelson
Al Viro pointed out that dma_memcpy_to_kernel_iovec() really was
unreachable and thus unused.  The code originally was there to support
in-kernel dma needs, but since it remains unused, we'll pull it out.

Signed-off-by: Shannon Nelson [EMAIL PROTECTED]
---

 drivers/dma/iovlock.c |   27 ---
 1 files changed, 0 insertions(+), 27 deletions(-)

diff --git a/drivers/dma/iovlock.c b/drivers/dma/iovlock.c
index d637555..e763d72 100644
--- a/drivers/dma/iovlock.c
+++ b/drivers/dma/iovlock.c
@@ -143,29 +143,6 @@ void dma_unpin_iovec_pages(struct dma_pinned_list 
*pinned_list)
kfree(pinned_list);
 }
 
-static dma_cookie_t dma_memcpy_to_kernel_iovec(struct dma_chan *chan, struct
-   iovec *iov, unsigned char *kdata, size_t len)
-{
-   dma_cookie_t dma_cookie = 0;
-
-   while (len  0) {
-   if (iov-iov_len) {
-   int copy = min_t(unsigned int, iov-iov_len, len);
-   dma_cookie = dma_async_memcpy_buf_to_buf(
-   chan,
-   iov-iov_base,
-   kdata,
-   copy);
-   kdata += copy;
-   len -= copy;
-   iov-iov_len -= copy;
-   iov-iov_base += copy;
-   }
-   iov++;
-   }
-
-   return dma_cookie;
-}
 
 /*
  * We have already pinned down the pages we will be using in the iovecs.
@@ -187,10 +164,6 @@ dma_cookie_t dma_memcpy_to_iovec(struct dma_chan *chan, 
struct iovec *iov,
if (!chan)
return memcpy_toiovec(iov, kdata, len);
 
-   /* - kernel copies (e.g. smbfs) */
-   if (!pinned_list)
-   return dma_memcpy_to_kernel_iovec(chan, iov, kdata, len);
-
iovec_idx = 0;
while (iovec_idx  pinned_list-nr_iovecs) {
struct dma_page_list *page_list;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Fix race in genl_unregister_mc_groups()

2007-07-24 Thread Thomas Graf
* Brian Haley [EMAIL PROTECTED] 2007-07-24 12:14
 Thomas Graf wrote:
 @@ -217,14 +229,8 @@ EXPORT_SYMBOL(genl_register_mc_group);
  void genl_unregister_mc_group(struct genl_family *family,
struct genl_multicast_group *grp)
  {
 -BUG_ON(grp-family != family);
  genl_lock();
 -netlink_clear_multicast_users(genl_sock, grp-id);
 -clear_bit(grp-id, mc_groups);
 -list_del(grp-list);
 -genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
 -grp-id = 0;
 -grp-family = NULL;
 +genl_unregister_mc_group(family, grp);
  genl_unlock();
  }
 
 Shouldn't this be __genl_unregister_mc_group(family, grp) ?

Yes, thank for you noticing.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[REPOST][GENETLINK]: Fix race in genl_unregister_mc_groups()

2007-07-24 Thread Thomas Graf
family-mcast_groups is protected by genl_lock so it must
be held while accessing the list in genl_unregister_mc_groups().
Requires adding a non-locking variant of genl_unregister_mc_group().

Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Index: net-2.6/net/netlink/genetlink.c
===
--- net-2.6.orig/net/netlink/genetlink.c2007-07-23 22:08:04.0 
+0200
+++ net-2.6/net/netlink/genetlink.c 2007-07-24 23:51:11.0 +0200
@@ -200,6 +200,18 @@ int genl_register_mc_group(struct genl_f
 }
 EXPORT_SYMBOL(genl_register_mc_group);
 
+static void __genl_unregister_mc_group(struct genl_family *family,
+  struct genl_multicast_group *grp)
+{
+   BUG_ON(grp-family != family);
+   netlink_clear_multicast_users(genl_sock, grp-id);
+   clear_bit(grp-id, mc_groups);
+   list_del(grp-list);
+   genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
+   grp-id = 0;
+   grp-family = NULL;
+}
+
 /**
  * genl_unregister_mc_group - unregister a multicast group
  *
@@ -217,14 +229,8 @@ EXPORT_SYMBOL(genl_register_mc_group);
 void genl_unregister_mc_group(struct genl_family *family,
  struct genl_multicast_group *grp)
 {
-   BUG_ON(grp-family != family);
genl_lock();
-   netlink_clear_multicast_users(genl_sock, grp-id);
-   clear_bit(grp-id, mc_groups);
-   list_del(grp-list);
-   genl_ctrl_event(CTRL_CMD_DELMCAST_GRP, grp);
-   grp-id = 0;
-   grp-family = NULL;
+   __genl_unregister_mc_group(family, grp);
genl_unlock();
 }
 
@@ -232,8 +238,10 @@ static void genl_unregister_mc_groups(st
 {
struct genl_multicast_group *grp, *tmp;
 
+   genl_lock();
list_for_each_entry_safe(grp, tmp, family-mcast_groups, list)
-   genl_unregister_mc_group(family, grp);
+   __genl_unregister_mc_group(family, grp);
+   genl_unlock();
 }
 
 /**
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] TIPC: fix tipc_link_create error handling

2007-07-24 Thread Florian Westphal
if printbuf allocation or tipc_node_attach_link() fails, invalid
references to the link are left in the associated node and bearer
structures.
Fix by allocating printbuf early and moving timer initialization
and the addition of the new link to the b_ptr-links list after
tipc_node_attach_link() succeeded.

Signed-off-by: Florian Westphal [EMAIL PROTECTED]
---
 link.c |   28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

also move k_init_timer(), as suggested by Allan.

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 5adfdfd..1d674e0 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -423,6 +423,17 @@ struct link *tipc_link_create(struct bearer *b_ptr, const 
u32 peer,
return NULL;
}
 
+   if (LINK_LOG_BUF_SIZE) {
+   char *pb = kmalloc(LINK_LOG_BUF_SIZE, GFP_ATOMIC);
+
+   if (!pb) {
+   kfree(l_ptr);
+   warn(Link creation failed, no memory for print 
buffer\n);
+   return NULL;
+   }
+   tipc_printbuf_init(l_ptr-print_buf, pb, LINK_LOG_BUF_SIZE);
+   }
+
l_ptr-addr = peer;
if_name = strchr(b_ptr-publ.name, ':') + 1;
sprintf(l_ptr-name, %u.%u.%u:%s-%u.%u.%u:,
@@ -432,8 +443,6 @@ struct link *tipc_link_create(struct bearer *b_ptr, const 
u32 peer,
tipc_zone(peer), tipc_cluster(peer), tipc_node(peer));
/* note: peer i/f is appended to link name by reset/activate */
memcpy(l_ptr-media_addr, media_addr, sizeof(*media_addr));
-   k_init_timer(l_ptr-timer, (Handler)link_timeout, (unsigned 
long)l_ptr);
-   list_add_tail(l_ptr-link_list, b_ptr-links);
l_ptr-checkpoint = 1;
l_ptr-b_ptr = b_ptr;
link_set_supervision_props(l_ptr, b_ptr-media-tolerance);
@@ -459,21 +468,14 @@ struct link *tipc_link_create(struct bearer *b_ptr, const 
u32 peer,
 
l_ptr-owner = tipc_node_attach_link(l_ptr);
if (!l_ptr-owner) {
+   if (LINK_LOG_BUF_SIZE)
+   kfree(l_ptr-print_buf.buf);
kfree(l_ptr);
return NULL;
}
 
-   if (LINK_LOG_BUF_SIZE) {
-   char *pb = kmalloc(LINK_LOG_BUF_SIZE, GFP_ATOMIC);
-
-   if (!pb) {
-   kfree(l_ptr);
-   warn(Link creation failed, no memory for print 
buffer\n);
-   return NULL;
-   }
-   tipc_printbuf_init(l_ptr-print_buf, pb, LINK_LOG_BUF_SIZE);
-   }
-
+   k_init_timer(l_ptr-timer, (Handler)link_timeout, (unsigned 
long)l_ptr);
+   list_add_tail(l_ptr-link_list, b_ptr-links);
tipc_k_signal((Handler)tipc_link_start, (unsigned long)l_ptr);
 
dbg(tipc_link_create(): tolerance = %u,cont intv = %u, abort_limit = 
%u\n,
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REPOST][GENETLINK]: Fix race in genl_unregister_mc_groups()

2007-07-24 Thread David Miller
From: Thomas Graf [EMAIL PROTECTED]
Date: Tue, 24 Jul 2007 23:52:57 +0200

 family-mcast_groups is protected by genl_lock so it must
 be held while accessing the list in genl_unregister_mc_groups().
 Requires adding a non-locking variant of genl_unregister_mc_group().
 
 Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Applied, thanks Thomas.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] NET: Fix sch_api to properly set sch-parent on the root qdisc

2007-07-24 Thread PJ Waskiewicz
From: Patrick McHardy [EMAIL PROTECTED]

Fix sch_api to correctly set sch-parent for both ingress and egress
qdiscs in qdisc_create().

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]
Signed-off-by: Peter P Waskiewicz Jr [EMAIL PROTECTED]
---

 net/sched/sch_api.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 13c09bc..dee0d5f 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -380,6 +380,10 @@ void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned 
int n)
return;
while ((parentid = sch-parent)) {
sch = qdisc_lookup(sch-dev, TC_H_MAJ(parentid));
+   if (sch == NULL) {
+   WARN_ON(parentid != TC_H_ROOT);
+   return;
+   }
cops = sch-ops-cl_ops;
if (cops-qlen_notify) {
cl = cops-get(sch, parentid);
@@ -420,8 +424,6 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc 
*parent,
unsigned long cl = cops-get(parent, classid);
if (cl) {
err = cops-graft(parent, cl, new, old);
-   if (new)
-   new-parent = classid;
cops-put(parent, cl);
}
}
@@ -436,7 +438,8 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc 
*parent,
  */
 
 static struct Qdisc *
-qdisc_create(struct net_device *dev, u32 handle, struct rtattr **tca, int 
*errp)
+qdisc_create(struct net_device *dev, u32 parent, u32 handle,
+  struct rtattr **tca, int *errp)
 {
int err;
struct rtattr *kind = tca[TCA_KIND-1];
@@ -482,6 +485,8 @@ qdisc_create(struct net_device *dev, u32 handle, struct 
rtattr **tca, int *errp)
goto err_out2;
}
 
+   sch-parent = parent;
+
if (handle == TC_H_INGRESS) {
sch-flags |= TCQ_F_INGRESS;
sch-stats_lock = dev-ingress_lock;
@@ -758,9 +763,11 @@ create_n_graft:
if (!(n-nlmsg_flagsNLM_F_CREATE))
return -ENOENT;
if (clid == TC_H_INGRESS)
-   q = qdisc_create(dev, tcm-tcm_parent, tca, err);
+   q = qdisc_create(dev, tcm-tcm_parent, tcm-tcm_parent,
+tca, err);
else
-   q = qdisc_create(dev, tcm-tcm_handle, tca, err);
+   q = qdisc_create(dev, tcm-tcm_parent, tcm-tcm_handle,
+tca, err);
if (q == NULL) {
if (err == -EAGAIN)
goto replay;
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] NET: Fix sch_prio to properly detect the root qdisc on multiqueue

2007-07-24 Thread PJ Waskiewicz
Fix the check in prio_tune() to see if sch-parent is TC_H_ROOT instead of
sch-handle to load or reject the qdisc for multiqueue devices.

Signed-off-by: Peter P Waskiewicz Jr [EMAIL PROTECTED]
---

 net/sched/sch_prio.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 2d8c084..06441db 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -239,11 +239,13 @@ static int prio_tune(struct Qdisc *sch, struct rtattr 
*opt)
/* If we're multiqueue, make sure the number of incoming bands
 * matches the number of queues on the device we're associating with.
 * If the number of bands requested is zero, then set q-bands to
-* dev-egress_subqueue_count.
+* dev-egress_subqueue_count.  Also, the root qdisc must be the
+* only one that is enabled for multiqueue, since it's the only one
+* that interacts with the underlying device.
 */
q-mq = RTA_GET_FLAG(tb[TCA_PRIO_MQ - 1]);
if (q-mq) {
-   if (sch-handle != TC_H_ROOT)
+   if (sch-parent != TC_H_ROOT)
return -EINVAL;
if (netif_is_multiqueue(sch-dev)) {
if (q-bands == 0)
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Fix adjustment of number of multicast groups

2007-07-24 Thread David Miller
From: Thomas Graf [EMAIL PROTECTED]
Date: Tue, 24 Jul 2007 11:45:14 +0200

 The current calculation of the maximum number of genetlink
 multicast groups seems odd, fix it.
 
 Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Applied, thanks.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[FIX] NET: Fix sch_api and sch_prio to properly set and detect the root qdisc

2007-07-24 Thread PJ Waskiewicz
This is a patch from Patrick McHardy to fix the sch_api code, which I
went ahead and tested and made a slight fix to.  This also includes
the fix to sch_prio based on Patrick's patch.

The sch-parent handle should contain the parent qdisc ID.  When the
qdisc is the root qdisc (TC_H_ROOT), the parent handle should be the
value TC_H_ROOT.  This fixes sch_api to set this correctly on
qdisc_create() for both ingress and egress qdiscs.

Change this check in prio_tune() so that only the root qdisc can be
multiqueue-enabled; use sch-parent instead of sch-handle.

-- 
PJ Waskiewicz [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Correctly report errors while registering a multicast group

2007-07-24 Thread David Miller
From: Thomas Graf [EMAIL PROTECTED]
Date: Tue, 24 Jul 2007 11:44:27 +0200

 Signed-off-by: Thomas Graf [EMAIL PROTECTED]

Also applied, thanks Thomas.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2][RFC] Update network drivers to use devres

2007-07-24 Thread bphilips
These patches update the network driver core and the e100 driver to use devres.
Devres is a simple resource manager for device drivers, see
Documentation/driver-model/devres.txt for more information.

If the RFC goes well I will create patches to update all applicable network
drivers.

Applies against 2.6.22

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2][RFC] Update net core to use devres.

2007-07-24 Thread bphilips
* netdev_pci_remove_one() can replace simple pci device remove
  functions

* devm_alloc_netdev() is like alloc_netdev but allocates memory using devres.
  alloc_netdev() can be removed once all drivers use devres.

Applies against 2.6.22

Signed-off-by: Brandon Philips [EMAIL PROTECTED]

---
 include/linux/etherdevice.h |2 +
 include/linux/netdevice.h   |5 
 net/core/dev.c  |   48 
 net/ethernet/eth.c  |6 +
 4 files changed, 57 insertions(+), 4 deletions(-)

Index: linux-2.6/include/linux/netdevice.h
===
--- linux-2.6.orig/include/linux/netdevice.h
+++ linux-2.6/include/linux/netdevice.h
@@ -622,6 +622,7 @@ extern int  dev_queue_xmit(struct sk_buf
 extern int register_netdevice(struct net_device *dev);
 extern voidunregister_netdevice(struct net_device *dev);
 extern voidfree_netdev(struct net_device *dev);
+extern voidnetdev_pci_remove_one(struct pci_dev *pdev);
 extern voidsynchronize_net(void);
 extern int register_netdevice_notifier(struct notifier_block *nb);
 extern int unregister_netdevice_notifier(struct notifier_block 
*nb);
@@ -992,6 +993,10 @@ static inline void netif_tx_disable(stru
 extern voidether_setup(struct net_device *dev);
 
 /* Support for loadable net-drivers */
+
+extern struct net_device *devm_alloc_netdev(struct device *dev,
+   int sizeof_priv, const char *name,
+   void (*setup)(struct net_device *));
 extern struct net_device *alloc_netdev(int sizeof_priv, const char *name,
   void (*setup)(struct net_device *));
 extern int register_netdev(struct net_device *dev);
Index: linux-2.6/net/core/dev.c
===
--- linux-2.6.orig/net/core/dev.c
+++ linux-2.6/net/core/dev.c
@@ -89,6 +89,7 @@
 #include linux/interrupt.h
 #include linux/if_ether.h
 #include linux/netdevice.h
+#include linux/pci.h
 #include linux/etherdevice.h
 #include linux/notifier.h
 #include linux/skbuff.h
@@ -3343,7 +3344,8 @@ static struct net_device_stats *internal
 }
 
 /**
- * alloc_netdev - allocate network device
+ * __alloc_netdev - allocate network device
+ * @dev:   managed device responsible for mem; NULL if not managed
  * @sizeof_priv:   size of private data to allocate space for
  * @name:  device name format string
  * @setup: callback to initialize device
@@ -3351,8 +3353,8 @@ static struct net_device_stats *internal
  * Allocates a struct net_device with private data area for driver use
  * and performs basic initialization.
  */
-struct net_device *alloc_netdev(int sizeof_priv, const char *name,
-   void (*setup)(struct net_device *))
+struct net_device *__alloc_netdev(struct device *gendev, int sizeof_priv,
+   const char *name, void (*setup)(struct net_device *))
 {
void *p;
struct net_device *dev;
@@ -3364,7 +3366,11 @@ struct net_device *alloc_netdev(int size
alloc_size = (sizeof(*dev) + NETDEV_ALIGN_CONST)  ~NETDEV_ALIGN_CONST;
alloc_size += sizeof_priv + NETDEV_ALIGN_CONST;
 
-   p = kzalloc(alloc_size, GFP_KERNEL);
+   if (dev == NULL)
+   p = kzalloc(alloc_size, GFP_KERNEL);
+   else
+   p = devm_kzalloc(gendev, alloc_size, GFP_KERNEL);
+
if (!p) {
printk(KERN_ERR alloc_netdev: Unable to allocate device.\n);
return NULL;
@@ -3382,8 +3388,23 @@ struct net_device *alloc_netdev(int size
strcpy(dev-name, name);
return dev;
 }
+
+struct net_device *devm_alloc_netdev(struct device *dev, int sizeof_priv, const
+   char *name, void (*setup)(struct net_device *))
+{
+   return __alloc_netdev(dev, sizeof_priv, name, setup);
+}
+EXPORT_SYMBOL(devm_alloc_netdev);
+
+struct net_device *alloc_netdev(int sizeof_priv, const char *name,
+   void (*setup)(struct net_device *))
+{
+   return __alloc_netdev(NULL, sizeof_priv, name, setup);
+}
 EXPORT_SYMBOL(alloc_netdev);
 
+
+
 /**
  * free_netdev - free network device
  * @dev: device
@@ -3411,6 +3432,25 @@ void free_netdev(struct net_device *dev)
 #endif
 }
 
+/**
+ * free_netdev - free network device
+ * @dev: device
+ *
+ * This function does the last stage of destroying an allocated device
+ * interface. The reference to the device object is released.
+ * If this is the last reference then it will be freed.
+ */
+void netdev_pci_remove_one(struct pci_dev *pdev)
+{
+   struct net_device *netdev = pci_get_drvdata(pdev);
+   if (netdev) {
+   unregister_netdev(netdev);
+   pci_set_drvdata(pdev, NULL);
+  

[PATCH 2/2][RFC] Update e100 driver to use devres.

2007-07-24 Thread bphilips
devres manages device resources and is currently used by all libata low level
drivers.   It can greatly reduce the complexity of the error handling on probe
and the device removal functions.

For example the e100_free() function and all of the gotos in e100_probe have
been removed.  Also, e100_remove() has been deleted and replaced with a much
simpler netdev_pci_remove_one() function that should be able to handle any PCI
net device that doesn't require any teardown besides resource deallocation.

Applies against 2.6.22.

Signed-off-by: Brandon Philips [EMAIL PROTECTED]

---
 drivers/net/e100.c |   73 +
 1 file changed, 19 insertions(+), 54 deletions(-)

Index: linux-2.6/drivers/net/e100.c
===
--- linux-2.6.orig/drivers/net/e100.c
+++ linux-2.6/drivers/net/e100.c
@@ -2514,18 +2514,11 @@ static int e100_do_ioctl(struct net_devi
 
 static int e100_alloc(struct nic *nic)
 {
-   nic-mem = pci_alloc_consistent(nic-pdev, sizeof(struct mem),
-   nic-dma_addr);
-   return nic-mem ? 0 : -ENOMEM;
-}
+   struct device *dev = nic-pdev-dev;
 
-static void e100_free(struct nic *nic)
-{
-   if(nic-mem) {
-   pci_free_consistent(nic-pdev, sizeof(struct mem),
-   nic-mem, nic-dma_addr);
-   nic-mem = NULL;
-   }
+   nic-mem = dmam_alloc_coherent(dev, sizeof(struct mem),
+   nic-dma_addr, GFP_ATOMIC);
+   return nic-mem ? 0 : -ENOMEM;
 }
 
 static int e100_open(struct net_device *netdev)
@@ -2552,7 +2545,7 @@ static int __devinit e100_probe(struct p
struct nic *nic;
int err;
 
-   if(!(netdev = alloc_etherdev(sizeof(struct nic {
+   if (!(netdev = devm_alloc_etherdev(pdev-dev, sizeof(struct nic {
if(((1  debug) - 1)  NETIF_MSG_PROBE)
printk(KERN_ERR PFX Etherdev alloc failed, abort.\n);
return -ENOMEM;
@@ -2582,26 +2575,26 @@ static int __devinit e100_probe(struct p
nic-msg_enable = (1  debug) - 1;
pci_set_drvdata(pdev, netdev);
 
-   if((err = pci_enable_device(pdev))) {
+   if ((err = pcim_enable_device(pdev))) {
DPRINTK(PROBE, ERR, Cannot enable PCI device, aborting.\n);
-   goto err_out_free_dev;
+   return err;
}
 
if(!(pci_resource_flags(pdev, 0)  IORESOURCE_MEM)) {
DPRINTK(PROBE, ERR, Cannot find proper PCI device 
base address, aborting.\n);
err = -ENODEV;
-   goto err_out_disable_pdev;
+   return err;
}
 
if((err = pci_request_regions(pdev, DRV_NAME))) {
DPRINTK(PROBE, ERR, Cannot obtain PCI resources, aborting.\n);
-   goto err_out_disable_pdev;
+   return err;
}
 
if((err = pci_set_dma_mask(pdev, DMA_32BIT_MASK))) {
DPRINTK(PROBE, ERR, No usable DMA configuration, aborting.\n);
-   goto err_out_free_res;
+   return err;
}
 
SET_MODULE_OWNER(netdev);
@@ -2610,11 +2603,11 @@ static int __devinit e100_probe(struct p
if (use_io)
DPRINTK(PROBE, INFO, using i/o access mode\n);
 
-   nic-csr = pci_iomap(pdev, (use_io ? 1 : 0), sizeof(struct csr));
+   nic-csr = pcim_iomap(pdev, (use_io ? 1 : 0), sizeof(struct csr));
if(!nic-csr) {
DPRINTK(PROBE, ERR, Cannot map device registers, aborting.\n);
err = -ENOMEM;
-   goto err_out_free_res;
+   return err;
}
 
if(ent-driver_data)
@@ -2647,11 +2640,11 @@ static int __devinit e100_probe(struct p
 
if((err = e100_alloc(nic))) {
DPRINTK(PROBE, ERR, Cannot alloc driver memory, aborting.\n);
-   goto err_out_iounmap;
+   return err;
}
 
if((err = e100_eeprom_load(nic)))
-   goto err_out_free;
+   return err;
 
e100_phy_init(nic);
 
@@ -2661,8 +2654,8 @@ static int __devinit e100_probe(struct p
if (!eeprom_bad_csum_allow) {
DPRINTK(PROBE, ERR, Invalid MAC address from 
EEPROM, aborting.\n);
-   err = -EAGAIN;
-   goto err_out_free;
+
+   return -EAGAIN;
} else {
DPRINTK(PROBE, ERR, Invalid MAC address from EEPROM, 
you MUST configure one.\n);
@@ -2682,7 +2675,7 @@ static int __devinit e100_probe(struct p
strcpy(netdev-name, eth%d);
if((err = register_netdev(netdev))) {
DPRINTK(PROBE, ERR, Cannot register net device, aborting.\n);
-   goto err_out_free;
+   return err;
}
 
DPRINTK(PROBE, INFO, addr 0x%llx, irq %d, 
@@ -2692,36 

Re: [PATCH 1/2][RFC] Update net core to use devres.

2007-07-24 Thread Al Viro
On Tue, Jul 24, 2007 at 03:39:52PM -0700, [EMAIL PROTECTED] wrote:
 * netdev_pci_remove_one() can replace simple pci device remove
   functions
 
 * devm_alloc_netdev() is like alloc_netdev but allocates memory using devres.
   alloc_netdev() can be removed once all drivers use devres.
 
E...  What the hell for?  To make sure that we have struct device
for everything, whether we need it or not?  Have you actually read
through drivers/net?  I mean, _really_ read through it, looking for
ugly cases...

I've done just that several times and I'm sorry, but I would classify
that project as hopeless.  It's way, _way_ more diverse than SATA...
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >