Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-23 Thread Varun Chandramohan
Oliver Hartkopp wrote:
> Varun Chandramohan wrote:
>> Patrick McHardy wrote:
>>> Varun Chandramohan wrote:
>>>>  /**
>>>> + * timeval_to_sec - Convert timeval to seconds
>>>> + * @tv: pointer to the timeval variable to be converted
>>>> + *
>>>> + * Returns the seconds representation of timeval parameter.
>>>> + */
>>>> +static inline time_t timeval_to_sec(const struct timeval *tv)
>>>> +{
>>>> +  return (tv->tv_sec + (tv->tv_usec + 500000)/1000000);
>>>> +}
>>>
>>> I don't think you should round down timeout values.
>>
>> Can you elaborate on that? As per the RFC of MIB ,we need only seconds
>> granularity. Taking that as the case i dont understand why round down
>> should not be done?
>
> When you create a timeout based on your calculated value, you might
> run into the problem that the calculated value is set to _zero_ even
> if there was "some time" before the conversion. This is probably not
> what you intended.
>
> So what about rounding up with
>
> return (tv->tv_sec + (tv->tv_usec + 999999)/1000000);
>
> ???
>
This can be done.  Is this what you were referring to, Patrick?
> Btw. isn't there already a solution based on ktime conversions?
>
AFAIK there isn't any conversion function to secs. Correct me if I am wrong.
But we can have a function or macro to do this conversion.
> Regards,
> Oliver
>
>
>   
Regards,
Varun

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-23 Thread Oliver Hartkopp
Varun Chandramohan wrote:
> Patrick McHardy wrote:
>   
>> Varun Chandramohan wrote:
>>   
>> 
>>>  /**
>>> + * timeval_to_sec - Convert timeval to seconds
>>> + * @tv: pointer to the timeval variable to be converted
>>> + *
>>> + * Returns the seconds representation of timeval parameter.
>>> + */
>>> +static inline time_t timeval_to_sec(const struct timeval *tv)
>>> +{
>>> +   return (tv->tv_sec + (tv->tv_usec + 500000)/1000000);
>>> +}
>>> 
>>>   
>> I don't think you should round down timeout values.
>>   
>> 
> Can you elaborate on that? As per the RFC of MIB ,we need only seconds
> granularity. Taking that as the case i dont understand why round down
> should not be done?
>   

When you create a timeout based on your calculated value, you might
run into the problem that the calculated value is set to _zero_ even
if there was "some time" before the conversion. This is probably not
what you intended.

So what about rounding up with

return (tv->tv_sec + (tv->tv_usec + 999999)/1000000);

???
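
A minimal userspace sketch of the two rounding choices under discussion,
assuming tv_usec counts microseconds (so the divisor is 1000000). The
names timeval_to_sec_nearest(), timeval_to_sec_up() and USEC_PER_SEC are
made up here for illustration and are not part of the posted patch.

```c
#include <sys/time.h>
#include <time.h>

/* Illustrative constant: microseconds per second. */
#define USEC_PER_SEC 1000000L

/* Round to the nearest second: 0.3s becomes 0, 0.5s becomes 1. */
static inline time_t timeval_to_sec_nearest(const struct timeval *tv)
{
	return tv->tv_sec + (tv->tv_usec + USEC_PER_SEC / 2) / USEC_PER_SEC;
}

/* Round up: any nonzero fraction of a second counts as a full second. */
static inline time_t timeval_to_sec_up(const struct timeval *tv)
{
	return tv->tv_sec + (tv->tv_usec + USEC_PER_SEC - 1) / USEC_PER_SEC;
}
```

With a value of 0.3 seconds the first variant yields 0 and the second
yields 1, which is the case that matters when the result is used as a
timeout.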

Btw. isn't there already a solution based on ktime conversions?

Regards,
Oliver




Re: [PATCH RFX]: napi_struct V3

2007-07-23 Thread Rusty Russell
On Mon, 2007-07-23 at 22:47 -0700, David Miller wrote:
> I don't think it's wise to implement this over and over again in each
> driver, since we already know at least a handful of drivers will use
> this.

Yep.  Alternative is a napi_struct_with_restart, but I don't think it's
worth the few-byte savings per queue.

> Any objections?

On the contrary, this looks good.

Thanks!
Rusty.



Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-23 Thread Sam Ravnborg
On Tue, Jul 24, 2007 at 08:36:33AM +0300, Al Boldi wrote:
> 
> Replaces NF_CONNTRACK_ENABLED with NF_CONNTRACK and selects it for 
> NF_CONNTRACK_IPV4 and NF_CONNTRACK_IPV6
> 
> This exposes IPv4/6 connection tracking options for easier Kconfig setup.
> 
> Signed-off-by: Al Boldi <[EMAIL PROTECTED]>
> Cc: David Miller <[EMAIL PROTECTED]>
> Cc: Sam Ravnborg <[EMAIL PROTECTED]>
> Cc: Andrew Morton <[EMAIL PROTECTED]>
> ---
> --- a/net/netfilter/Kconfig   2007-07-09 06:38:52.0 +0300
> +++ b/net/netfilter/Kconfig   2007-07-24 08:28:06.0 +0300
> @@ -25,8 +25,7 @@ config NETFILTER_NETLINK_LOG
> and is also scheduled to replace the old syslog-based ipt_LOG
> and ip6t_LOG modules.
>  
> -# Rename this to NF_CONNTRACK in a 2.6.25
> -config NF_CONNTRACK_ENABLED
> +config NF_CONNTRACK
>   tristate "Netfilter connection tracking support"
>   help
> Connection tracking keeps a record of what packets have passed
> @@ -40,10 +39,6 @@ config NF_CONNTRACK_ENABLED
>  
> To compile it as a module, choose M here.  If unsure, say N.
>  
> -config NF_CONNTRACK
> - tristate
> - default NF_CONNTRACK_ENABLED
> -
>  config NF_CT_ACCT
>   bool "Connection tracking flow accounting"
>   depends on NF_CONNTRACK
> --- a/net/ipv4/netfilter/Kconfig  2007-07-09 06:38:50.0 +0300
> +++ b/net/ipv4/netfilter/Kconfig  2007-07-24 08:27:39.0 +0300
> @@ -7,7 +7,7 @@ menu "IP: Netfilter Configuration"
>  
>  config NF_CONNTRACK_IPV4
>   tristate "IPv4 connection tracking support (required for NAT)"
> - depends on NF_CONNTRACK
> + select NF_CONNTRACK
>   ---help---
> Connection tracking keeps a record of what packets have passed
> through your machine, in order to figure out how they are related
> --- a/net/ipv6/netfilter/Kconfig  2007-07-09 06:38:51.0 +0300
> +++ b/net/ipv6/netfilter/Kconfig  2007-07-24 08:27:54.0 +0300
> @@ -7,7 +7,8 @@ menu "IPv6: Netfilter Configuration (EXP
>  
>  config NF_CONNTRACK_IPV6
>   tristate "IPv6 connection tracking support (EXPERIMENTAL)"
> - depends on INET && IPV6 && EXPERIMENTAL && NF_CONNTRACK
> + depends on INET && IPV6 && EXPERIMENTAL
> + select NF_CONNTRACK
>   ---help---
> Connection tracking keeps a record of what packets have passed
> through your machine, in order to figure out how they are related
> 
This change looks wrong.
Due to the reverse nature of "select" kconfig cannot fulfill the dependencies
of selected symbols. So as a rule of thumb select should only select
symbols with no menu and no dependencies to avoid some of the
problems that have popped up during the last months.

Sam


Re: [PATCH] Netfilter Kconfig: Expose IPv4/6 connection tracking options by selecting NF_CONNTRACK

2007-07-23 Thread David Miller

Any reason you're not sending this to the netfilter developer list
mentioned in MAINTAINERS, or its chief maintainer Patrick McHardy?

NETFILTER/IPTABLES/IPCHAINS
P:  Rusty Russell
P:  Marc Boucher
P:  James Morris
P:  Harald Welte
P:  Jozsef Kadlecsik
P:  Patrick McHardy
M:  [EMAIL PROTECTED]
L:  [EMAIL PROTECTED]
L:  [EMAIL PROTECTED] (subscribers-only)
L:  [EMAIL PROTECTED]
W:  http://www.netfilter.org/
W:  http://www.iptables.org/
S:  Supported



Re: [PATCH RFX]: napi_struct V3

2007-07-23 Thread David Miller
From: Rusty Russell <[EMAIL PROTECTED]>
Date: Tue, 24 Jul 2007 14:47:19 +1000

> On Mon, 2007-07-23 at 21:07 -0700, David Miller wrote:
> > Another area of consternation are drivers that were using
> > netif_rx_reschedule(), as that interface was removed because it
> > doesn't fit well with the caller managing the dev->quota et al.  I
> > left race conditions in the drivers that were using that interface,
> > but they should still basically work nonetheless.
> 
> Hmm, virtio does this, if the implementation returns false from
> ->restart.  But it's basically a bandaid for things like lguest which
> don't check irq status on irq enable, hence is subject to the race.
> 
> But AFAICT netif_rx_reschedule() is implementable in a driver anyway.
> What am I missing?

That looks like it would work, yes.

This is the least understood area, semantically, of NAPI.  It is
commonly believed that level-triggered interrupts obviate the need to
handle this race condition, and nothing could be further from the
truth.

I don't think it's wise to implement this over and over again in each
driver, since we already know at least a handful of drivers will use
this.

A netif_napi_init() suggests itself already.  We could put the work
struct into the napi_struct and make the init routine something like:

static inline void netif_napi_init(struct napi_struct *napi,
				   int (*poll)(struct napi_struct *, int),
				   void (*resched)(struct work_struct *),
				   int weight)
{
	napi->poll = poll;
	napi->weight = weight;
	INIT_WORK(&napi->work, resched);
}

and then we have napi_resched():

static inline void napi_resched(struct napi_struct *napi)
{
	schedule_work(&napi->work);
}

If this can be handled in a better way inside the driver, as is the
case for tg3 et al., "resched" can be passed in as NULL.

Any objections?
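
A rough userspace mock of the proposal above, for illustration only. It
swaps the kernel work queue for a direct callback, and every name in it
(mock_napi, mock_napi_init, mock_napi_resched, the demo_* helpers) is
hypothetical. It only shows the shape of "one init routine stores poll,
weight and an optional resched hook" and one way a NULL hook could be
treated as "the driver handles this itself".

```c
#include <stddef.h>

/* Hypothetical stand-in for napi_struct; not the kernel definition. */
struct mock_napi {
	int (*poll)(struct mock_napi *, int);
	void (*resched)(struct mock_napi *);	/* may be NULL */
	int weight;
};

static inline void mock_napi_init(struct mock_napi *napi,
				  int (*poll)(struct mock_napi *, int),
				  void (*resched)(struct mock_napi *),
				  int weight)
{
	napi->poll = poll;
	napi->resched = resched;
	napi->weight = weight;
}

/* Returns 1 if the hook ran, 0 if the driver opted out with NULL. */
static inline int mock_napi_resched(struct mock_napi *napi)
{
	if (napi->resched == NULL)
		return 0;
	napi->resched(napi);
	return 1;
}

/* Demo callbacks so the mock can be exercised. */
static int demo_resched_calls;

static inline int demo_poll(struct mock_napi *napi, int budget)
{
	/* Pretend we processed min(budget, weight) packets. */
	return budget < napi->weight ? budget : napi->weight;
}

static inline void demo_resched(struct mock_napi *napi)
{
	(void)napi;
	demo_resched_calls++;
}
```

Keeping the NULL check inside the helper means drivers that never need a
reschedule pay nothing beyond one unused pointer per queue.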


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Alexey Dobriyan
On Mon, Jul 23, 2007 at 03:27:12PM -0700, Andrew Morton wrote:
> On Tue, 24 Jul 2007 02:04:46 +0400
> Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, Jul 23, 2007 at 02:11:37PM -0700, Andrew Morton wrote:
> > > On Tue, 24 Jul 2007 01:01:53 +0400
> > > Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote:
> > > > > > I had more complete info: 
> > > > > > http://article.gmane.org/gmane.linux.network/66966
> > > > > > 
> > > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule 
> > > > > > that out.
> > > > > > 
> > > > > > I haven't worked out where that kmap_atomic() call is coming from 
> > > > > > yet. 
> > > > > > Both traces point up into the page allocator, but I _think_ that's 
> > > > > > stack
> > > > > > gunk.
> > > > > 
> > > > > Ahh, you suspect networking.
> > > > > 
> > > > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus 
> > > > > network
> > > > > receiving ~20 junk packets per second, one gathering netconsole output
> > > > > and ssh to it, no conntracks and fancy stuff.
> > > > > 
> > > > > [reboots with cables physically unplugged]
> > > > 
> > > > OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of
> > > > small files) with both cables unplugged. It all went fine for ~5 minutes
> > > > after that it crashed exactly same way after 10 secs after plugging one
> > > > of them.
> > > 
> > > It'd be nice to get a clean trace.  Are you able to obtain the full
> > > trace with CONFIG_FRAME_POINTER=y?
> > 
> > Sorry, no camera shot, finding camera requires wakening up M. :)
> > 
> > It took longer than usual, but here it is
> > 
> > kmap_atomic
> > get_page_from_freelist
> > __alloc_pages
> > cache_alloc_refill
> > __alloc_pages
> > cache_alloc_refill
> > kmem_cache_alloc
> > dst_alloc
> > ip_route_input
> > ip_rcv
> > netif_receive_skb
> > rtl8139_poll
> > net_rx_action
> > __do_softirq
> > do_softirq
> > irq_exit
> > do_IRQ
> > common_interrupt
> > handle_mm_fault
> > do_page_fault
> > error_core
> > 
> > much more loaded x86_64 box near also running 2.6.23-rc1 with debugging
> > turned on, using atl1 driver doesn't experience any crashes.
> > 
> > And I found 2.6.22-b91cba52e9b7b3f1c0037908a192d93a869ca9e5-x entry on
> > top of grub config which means b91cba52e9b7b3f1c0037908a192d93a869ca9e5
> > _without_ any debugging was OK.
> 
> I worked out that the crash I saw was in
> 
> BUG_ON(!pte_none(*(kmap_pte-idx)));
> 
> in the read of kmap_pte[idx].  Which would be weird as the caller is using 
> a literal KM_USER0.
> 
> So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am
> unable to reproduce it now).
> 
> If that BUG_ON _is_ triggering then it might indicate that someone is doing
> a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0.
> 
> If they're holding an atomic kmap then they'll be running in_atomic so it
> is unlikely that they accidentally added __GFP_WAIT because lots of people
> would be getting lots of might_sleep() warnings.
> 
> Hence that first VM_BUG_ON in prep_zero_page() _should_ be triggering.
> 
> Do you have CONFIG_DEBUG_VM enabled?

Yes.

> Also, it might be useful to apply -mm's kmap_atomic-debugging.patch.  it
> will detect lots of abuse.

I hit it only once with this patch applied, but there were no additional
warnings.



Re: [PATCH RFX]: napi_struct V3

2007-07-23 Thread Rusty Russell
On Mon, 2007-07-23 at 21:07 -0700, David Miller wrote:
> Another area of consternation are drivers that were using
> netif_rx_reschedule(), as that interface was removed because it
> doesn't fit well with the caller managing the dev->quota et al.  I
> left race conditions in the drivers that were using that interface,
> but they should still basically work nonetheless.

Hmm, virtio does this, if the implementation returns false from
->restart.  But it's basically a bandaid for things like lguest which
don't check irq status on irq enable, hence is subject to the race.

But AFAICT netif_rx_reschedule() is implementable in a driver anyway.
What am I missing?

struct foo {
	...;
	struct napi_struct napi;
	struct work_struct rescheduler;
};

/* Runs from the work queue: put the device back on the poll list. */
static void resched_me(struct work_struct *work)
{
	struct foo *foo = container_of(work, struct foo, rescheduler);

	local_irq_disable();
	netif_rx_schedule(foo->dev);
	local_irq_enable();
}

int foo_poll(struct napi_struct *napi, int budget)
{
	struct foo *foo = container_of(napi, struct foo, napi);
	...
	enable_rx_interrupts();
	netif_rx_complete(napi);

	/* Broken interrupts, so we have race after enabling. */
	if (unlikely(ring_has_new_packet(foo->dev)))
		schedule_work(&foo->rescheduler);
	return used;
}
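
The sketch above leans on container_of() to get from the embedded work
item back to the driver's private structure. A small userspace
illustration of that pattern follows; the macro is written out with
offsetof(), and work_item, demo_foo and demo_foo_from_rescheduler() are
made-up stand-ins, not kernel types.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel macro of the same name. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct work_item {
	int pending;
};

/* Hypothetical driver-private structure with two embedded members. */
struct demo_foo {
	int id;
	struct work_item other_work;
	struct work_item rescheduler;
};

/*
 * The callback only receives a pointer to the embedded member; the
 * member name passed to container_of() must match the field that was
 * actually queued, here "rescheduler".
 */
static inline struct demo_foo *demo_foo_from_rescheduler(struct work_item *work)
{
	return container_of(work, struct demo_foo, rescheduler);
}
```

Passing the wrong member name compiles as long as the types match but
yields a pointer offset into the wrong place, so it is worth
double-checking in any driver that embeds more than one work item.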

Cheers,
Rusty.



[PATCH] kernel-doc fixes for net/

2007-07-23 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Fix kernel-doc omissions in net/:

Warning(linux-2.6.23-rc1//net/core/dev.c:2728): No description found for 
parameter 'addr'
Warning(linux-2.6.23-rc1//net/core/dev.c:2752): No description found for 
parameter 'addr'
Warning(linux-2.6.23-rc1//net/core/dev.c:3839): No description found for 
parameter 'net_dma'
Warning(linux-2.6.23-rc1//net/core/dev.c:3877): No description found for 
parameter 'state'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 net/core/dev.c |   16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

--- linux-2.6.23-rc1.orig/net/core/dev.c
+++ linux-2.6.23-rc1/net/core/dev.c
@@ -2718,9 +2718,11 @@ int __dev_addr_add(struct dev_addr_list 
 /**
  * dev_unicast_delete  - Release secondary unicast address.
  * @dev: device
+ * @addr: address to delete
+ * @alen: length of @addr
  *
  * Release reference to a secondary unicast address and remove it
- * from the device if the reference count drop to zero.
+ * from the device if the reference count drops to zero.
  *
  * The caller must hold the rtnl_mutex.
  */
@@ -2742,6 +2744,8 @@ EXPORT_SYMBOL(dev_unicast_delete);
 /**
  * dev_unicast_add - add a secondary unicast address
  * @dev: device
+ * @addr: address to add
+ * @alen: length of @addr
  *
  * Add a secondary unicast address to the device or increase
  * the reference count if it already exists.
@@ -3830,9 +3834,11 @@ static int dev_cpu_callback(struct notif
 
 #ifdef CONFIG_NET_DMA
 /**
- * net_dma_rebalance -
- * This is called when the number of channels allocated to the net_dma_client
- * changes.  The net_dma_client tries to have one DMA channel per CPU.
+ * net_dma_rebalance - try to maintain one DMA channel per CPU
+ * @net_dma: DMA client and associated data (lock, channels, channel_mask)
+ *
+ * This is called when the number of channels allocated to the net_dma client
+ * changes.  The net_dma client tries to have one DMA channel per CPU.
  */
 
 static void net_dma_rebalance(struct net_dma *net_dma)
@@ -3869,7 +3875,7 @@ static void net_dma_rebalance(struct net
  * netdev_dma_event - event callback for the net_dma_client
  * @client: should always be net_dma_client
  * @chan: DMA channel for the event
- * @event: event type
+ * @state: DMA state to be handled
  */
 static enum dma_state_client
 netdev_dma_event(struct dma_client *client, struct dma_chan *chan,


Re: [PATCH 4/4] Initialize and fill IPv6 route age

2007-07-23 Thread Varun Chandramohan
Stephen Hemminger wrote:
> On Mon, 23 Jul 2007 10:13:18 +0530
> Varun Chandramohan <[EMAIL PROTECTED]> wrote:
>
>   
>> The age field of the ipv6 route structures is initialized with the current 
>> timeval at the time of route creation. When the route dump is called 
>> the route age value stored in the structure is subtracted from the 
>> present timeval and the difference is passed on as the route age.
>>
>> Signed-off-by: Varun Chandramohan <[EMAIL PROTECTED]>
>> ---
>>  include/net/ip6_fib.h   |1 +
>>  include/net/ip6_route.h |3 +++
>>  net/ipv6/addrconf.c |5 +
>>  net/ipv6/route.c|   23 +++
>>  4 files changed, 28 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
>> index c48ea87..e30a1cf 100644
>> --- a/include/net/ip6_fib.h
>> +++ b/include/net/ip6_fib.h
>> @@ -98,6 +98,7 @@ struct rt6_info
>>  
>>  u32 rt6i_flags;
>>  u32 rt6i_metric;
>> +time_t  rt6i_age;
>>  atomic_t		rt6i_ref;
>>  struct fib6_table   *rt6i_table;
>>  
>> diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
>> index 5456fdd..fc9716c 100644
>> --- a/include/net/ip6_route.h
>> +++ b/include/net/ip6_route.h
>> @@ -36,6 +36,9 @@ struct route_info {
>>  #define RT6_LOOKUP_F_REACHABLE  0x2
>>  #define RT6_LOOKUP_F_HAS_SADDR  0x4
>>  
>> +#define RT6_SET_ROUTE_INFO 0x0
>> +#define RT6_GET_ROUTE_INFO 0x1
>> +
>>  extern struct rt6_info  ip6_null_entry;
>>  
>>  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index 5a5f8bd..715c766 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -4187,6 +4187,7 @@ EXPORT_SYMBOL(unregister_inet6addr_notif
>>  
>>  int __init addrconf_init(void)
>>  {
>> +struct timeval tv;
>>  int err = 0;
>>  
>>  /* The addrconf netdev notifier requires that loopback_dev
>> @@ -4214,10 +4215,14 @@ int __init addrconf_init(void)
>>  if (err)
>>  return err;
>>  
>> +do_gettimeofday(&tv);
>> 
>
> Better to use ktime_t or timespec in new code.
>   
Are you saying not to use timeval because it is going to be removed
sometime in the future? If not, may I know why we should use timespec or
ktime? I need only seconds granularity, so I was wondering if that matters.
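
For reference, a small userspace sketch of the age calculation with
struct timespec instead of struct timeval, assuming only whole elapsed
seconds are wanted. The helper name age_seconds() is invented for this
example and does not come from the patch.

```c
#include <time.h>

/*
 * Whole seconds elapsed between "created" and "now".
 * A partial second is dropped, so 1.2s of age reports 1.
 */
static inline time_t age_seconds(const struct timespec *created,
				 const struct timespec *now)
{
	time_t sec = now->tv_sec - created->tv_sec;

	/* Borrow one second if the nanosecond part has not caught up. */
	if (now->tv_nsec < created->tv_nsec)
		sec -= 1;
	return sec;
}
```

If only the seconds field is ever stored at route creation, the
subtraction collapses to a plain difference of two time_t values.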

Regards,
Varun


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-23 Thread Varun Chandramohan
Patrick McHardy wrote:
> Varun Chandramohan wrote:
>   
>>  /**
>> + * timeval_to_sec - Convert timeval to seconds
>> + * @tv: pointer to the timeval variable to be converted
>> + *
>> + * Returns the seconds representation of timeval parameter.
>> + */
>> +static inline time_t timeval_to_sec(const struct timeval *tv)
>> +{
>> +	return (tv->tv_sec + (tv->tv_usec + 500000)/1000000);
>> +}
>> 
>
>
> I don't think you should round down timeout values.
>   
Can you elaborate on that? As per the MIB RFC, we need only seconds
granularity. That being the case, I don't understand why rounding down
should not be done.

Regards,
Varun


Re: [PATCH 00/10] Implement batching skb API

2007-07-23 Thread Krishna Kumar2
Hi Jamal,

J Hadi Salim <[EMAIL PROTECTED]> wrote on 07/23/2007 06:02:01 PM:

> Yes, and these results were sent to you as well a while back.
> When i get the time when i get back i will look em up in my test machine
> and resend.

Actually you have not sent netperf results with prep and without prep.

> > No. I see value only in non-LLTX drivers which also get the same TX
> > lock in the RX path.
>
> So _which_ non-LLTX driver doesnt do that? ;->

I have no idea since I haven't looked at all drivers. Can you tell me which
non-LLTX drivers do that? I stated this as the sole criterion.

> tun driver doesnt use it either - but i doubt that makes it "bloat"

Adding extra code that is currently not usable (esp. from a submission
point) is bloat.

> You waltz in, have the luxury of looking at my code, presentations, many
> discussions with me etc ...

"luxury"? I had implemented the entire thing even before knowing that you
were working on something similar! And I had sent the first proposal to
netdev, *after* which you told me that you have your own code and
presentations (which I had never seen earlier - I joined netdev a few
months back, earlier I was working on RDMA, Infiniband as you know). And
it didn't give me any great ideas either, remember I had posted results
for E1000 at the time of sending the proposals. However I do give credit
in my proposal to you for the ideas that you provided (without actual
code), and I did the same for other people who did the same, like Dave,
Sridhar. BTW, you too had discussions with me, and I sent some patches to
improve your code too, so it looks like a two way street to me (and that
is how open source works and should).

> When i ask for differences to code you produced, they now seem to sum up
> to the two below. You dont think theres some honest issue with this
> picture?

Two changes? That's it? I gave a big list of changes between our
implementations but you twist my words to conclude there are just two (by
conveniently labelling everything else "cosmetic", or "potentially
useful"!)! Even my restart routine used a single API from the first day,
I would never imagine using multiple APIs. Our code probably doesn't
have even one line that looks remotely similar!

To clarify : I suggested that you could send patches for the two *missing*
items if you can show they add value (and not the rest, as I consider
those will not improve the code/logic/algo).

> > ("lacking in frankness, candor, or sincerity; falsely or hypocritically
> > ingenuous; insincere")  Sorry, no response to personal comments and
> > have a flame-war :)
>
> Give me a better description.

Sorry, no personal comments. In fact I will avoid responding to baits and
innuendoes from now on.

Thanks,

- KK



[PATCH] [2.6.22] Fix a potential NULL pointer dereference in mace_interrupt() in drivers/net/pcmcia/nmclan_cs.c

2007-07-23 Thread Micah Gruber
This patch fixes a potential null dereference bug where we dereference 
DEV before a null check. This patch simply moves the dereferencing after 
the null check.


Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>

---

--- a/drivers/net/pcmcia/nmclan_cs.c
+++ b/drivers/net/pcmcia/nmclan_cs.c
@@ -996,7 +996,7 @@

{
  struct net_device *dev = (struct net_device *) dev_id;
  mace_private *lp = netdev_priv(dev);
-  kio_addr_t ioaddr = dev->base_addr;
+  kio_addr_t ioaddr;
  int status;
  int IntrCnt = MACE_MAX_IR_ITERATIONS;

@@ -1006,6 +1006,8 @@
return IRQ_NONE;
  }

+  ioaddr = dev->base_addr;
+
  if (lp->tx_irq_disabled) {
printk(
  (lp->tx_irq_disabled?
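
The idea behind the patch, shown as a tiny userspace sketch: declare the
local first and only read through the pointer once the NULL check has
passed. The demo_dev type and demo_interrupt() function are hypothetical
and merely mirror the shape of the handler.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields of net_device used here. */
struct demo_dev {
	unsigned long base_addr;
};

/*
 * Returns -1 when dev is NULL (the handler would return IRQ_NONE),
 * otherwise 0 with the I/O base address stored in *ioaddr_out.
 */
static inline int demo_interrupt(const struct demo_dev *dev,
				 unsigned long *ioaddr_out)
{
	unsigned long ioaddr;	/* declared, but not yet read from dev */

	if (dev == NULL)
		return -1;

	ioaddr = dev->base_addr;	/* safe: dev was checked above */
	*ioaddr_out = ioaddr;
	return 0;
}
```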



Re: [DRIVER SUBMISSION] DRBD wants to go mainline

2007-07-23 Thread Kyle Moffett
For the guys on netdev, would you please look at the tcp_recvmsg-threading
and TCP_NAGLE_CORK issues below and give opinions on the best way to
proceed?


One thing to remember, you don't necessarily have to merge every  
feature right away.  As long as the new code is configured "off" by  
default with an "(EXPERIMENTAL)" warning, you can start getting the  
core parts and the cleanups upstream before you have to resolve all  
the issues with low-latency, dynamic-tracing-frameworks, etc.


On Jul 23, 2007, at 09:32:02, Lars Ellenberg wrote:

On Sun, Jul 22, 2007 at 09:32:02PM -0400, Kyle Moffett wrote:

+/* I don't remember why XCPU ...
+ * This is used to wake the asender,
+ * and to interrupt sending the sending task
+ * on disconnect.
+ */
+#define DRBD_SIG SIGXCPU


Don't use signals between kernel threads, use proper primitives  
like notifiers and waitqueues, which means you should also  
probably switch away from kernel_thread() to the kthread_*()  
APIs.  Also you should fix this FIXME or remove it if it no longer  
applies:-D.


right.
but how do I tell a network thread in tcp_recvmsg to stop early,
without using signals?


I'm not really a kernel-networking guy, so I can't answer this  
definitively, but I'm pretty sure the problem has been solved in many  
network filesystems and such, so I've added a netdev CC.  The way I'd  
do it in userspace is with nonblocking IO and epoll(), that way I  
don't actually have to "stop" or "signal" the thread, I can just add  
a socket to epoll fd when I want to pay attention to it, and remove  
it from my epoll fd when I'm done with it.  I'd assume there's some  
equivalent way in kernelspace based around the "struct kiocb *iocb"  
and "int nonblock" parameters to the tcp_recvmsg() kernel function.
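
The userspace approach described above could look roughly like the
following. This covers only the epoll side and says nothing about what
the in-kernel equivalent should be. The helper names (watch_fd,
unwatch_fd, ready_count, consume_byte) are invented for the example;
epoll_ctl(), epoll_wait() and read() are the standard Linux calls.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Start paying attention to fd; returns 0 on success. */
static inline int watch_fd(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Stop paying attention to fd; no signal or thread wakeup needed. */
static inline int unwatch_fd(int epfd, int fd)
{
	return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}

/* Number of ready descriptors (0 on timeout, -1 on error). */
static inline int ready_count(int epfd, int timeout_ms)
{
	struct epoll_event ev;

	return epoll_wait(epfd, &ev, 1, timeout_ms);
}

/* Read one byte from a descriptor that was reported readable. */
static inline int consume_byte(int fd)
{
	char c;

	return read(fd, &c, 1) == 1 ? (unsigned char)c : -1;
}
```

Because the waiting thread never blocks inside a receive call, dropping
a socket is just a matter of removing it from the epoll set.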



+/* see kernel/printk.c:printk_ratelimit
+ * macro, so it is easy to have independent rate limits at different locations
+ * "initializer element not constant ..." with kernel 2.4 :(
+ * so I initialize toks to something large
+ */
+#define DRBD_ratelimit(ratelimit_jiffies, ratelimit_burst) \

Any particular reason you can't just use printk_ratelimit for this?


I want to be able to do a rate-limit per specific message/code  
fragment, without affecting other messages or execution paths.


Ok, so could you change your patch to modify __printk_ratelimit() to  
also accept a "struct printk_rate" datastructure and make  
printk_ratelimit() call "__printk_ratelimit(&global_printk_rate);"??
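
A userspace sketch of what per-call-site state might look like, assuming
a simple token-bucket scheme keyed on a caller-supplied tick counter.
The struct layout and the rate_ok() helper are hypothetical; the thread
only names a "struct printk_rate" and does not define its fields.

```c
#include <stdbool.h>

/* Hypothetical per-call-site state; field names are assumptions. */
struct printk_rate {
	unsigned long interval;	/* ticks that one message costs */
	unsigned long burst;	/* messages that can be saved up */
	unsigned long toks;	/* accumulated ticks */
	unsigned long last;	/* tick of the previous call */
};

/* Returns true if a message may be emitted at tick "now". */
static inline bool rate_ok(struct printk_rate *r, unsigned long now)
{
	r->toks += now - r->last;
	r->last = now;

	/* Cap the savings at "burst" messages worth of ticks. */
	if (r->toks > r->burst * r->interval)
		r->toks = r->burst * r->interval;

	if (r->toks >= r->interval) {
		r->toks -= r->interval;
		return true;
	}
	return false;
}
```

Each call site that wants its own limit would own one such structure,
while a shared global instance would keep the existing behaviour for
everyone else.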


Typically if $KERNEL_FEATURE is insufficient for your needs you
should fix $KERNEL_FEATURE instead of duplicating a replacement in
your driver.  This applies to basically all of the things I'm talking
about: kernel-threads, workqueues (BTW: I believe you can make your
own custom workqueue thread(s) instead of using the default
"events/*" ones), debugging macros, fault-insertion, integer math,
lock-checking, dynamic tracing, etc.  If you find some reason that some
generic code won't work for you, please try to fix it first so we can
all benefit from it.


Umm, how about fixing this to actually use proper workqueues or  
something instead of this open-coded mess?


unlikely to happen "right now".  but it is on our todo list...


Unfortunately problems like these need to be fixed before a mainline  
merge.  Merging duplicated code is a big no-no, and historically  
there have been problems with people who merge code and never  
properly maintain it once it's in tree.  As a result the rule is your  
code has to be easily maintainable before anybody will even  
*consider* merging it.



+/* I want the packet to fit within one page
+ * THINK maybe use a special bitmap header,
+ * including offset and compression scheme and whatnot
+ * Do not use PAGE_SIZE here! Use a architecture agnostic constant!
+ */
+#define BM_PACKET_WORDS ((4096-sizeof(struct Drbd_Header))/sizeof(long))


Yuck.  Definitely use PAGE_SIZE here, so at least if it's broken  
on an arch with multiple page sizes, somebody can grep for  
PAGE_SIZE to fix it.  It also means that on archs/configs with 8k  
or 64k pages you won't waste a bunch of memory.


No. This is not to allocate anything, but defines the chunk size
with which we transmit the bitmap, when we have to.  We need to be
able to talk from one arch (say i586) to some other (say s390, or
sparc, or whatever).  The receiving side has a one-page buffer,
from which it may or may not do endian-conversion.  The hardcoded
4096 is the minimal common denominator here.


Ahhh.  Please replace the constant "4096" with:
/* This is the maximum amount of bitmap we will send per packet */
# define MAX_BITMAP_CHUNK_SIZE 4096
# define BM_PACKET_WORDS \
((MAX_BITMAP_CHUNK_SIZE - sizeof(struct Drbd_Header))/sizeof(long))

It's more text but dramatically improves the readability by  
eliminating more magic numbers.  This is a much milder case than I've  
seen in the past, so it's not that big of a deal.
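
To see how the named constant plays out, here is a self-contained
userspace sketch. The layout of struct Drbd_Header is not shown in this
thread, so demo_header below is a made-up 8-byte stand-in, and
bm_packet_bytes() is an invented helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct Drbd_Header (real layout unknown). */
struct demo_header {
	uint32_t magic;
	uint16_t command;
	uint16_t length;
};

/* This is the maximum amount of bitmap we will send per packet */
#define MAX_BITMAP_CHUNK_SIZE 4096
#define BM_PACKET_WORDS \
	((MAX_BITMAP_CHUNK_SIZE - sizeof(struct demo_header)) / sizeof(long))

/* Total bytes on the wire for one full bitmap packet. */
static inline size_t bm_packet_bytes(void)
{
	return sizeof(struct demo_header) + BM_PACKET_WORDS * sizeof(long);
}
```

Note that the word count still depends on sizeof(long), which differs
between 32-bit and 64-bit builds, so peers of different word size would
need to agree on bytes rather than on words.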




+/* Dynamic tracing f

Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Andrew Morton
On Tue, 24 Jul 2007 02:04:46 +0400
Alexey Dobriyan <[EMAIL PROTECTED]> wrote:

> On Mon, Jul 23, 2007 at 02:11:37PM -0700, Andrew Morton wrote:
> > On Tue, 24 Jul 2007 01:01:53 +0400
> > Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> > 
> > > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote:
> > > > > I had more complete info: 
> > > > > http://article.gmane.org/gmane.linux.network/66966
> > > > > 
> > > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule 
> > > > > that out.
> > > > > 
> > > > > I haven't worked out where that kmap_atomic() call is coming from 
> > > > > yet. 
> > > > > Both traces point up into the page allocator, but I _think_ that's 
> > > > > stack
> > > > > gunk.
> > > > 
> > > > Ahh, you suspect networking.
> > > > 
> > > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus 
> > > > network
> > > > receiving ~20 junk packets per second, one gathering netconsole output
> > > > and ssh to it, no conntracks and fancy stuff.
> > > > 
> > > > [reboots with cables physically unplugged]
> > > 
> > > OK, I run gdb recompile, cat(1) every file in /usr/portage (shitload of
> > > small files) with both cables unplugged. It all went fine for ~5 minutes
> > > after that it crashed exactly same way after 10 secs after plugging one
> > > of them.
> > 
> > It'd be nice to get a clean trace.  Are you able to obtain the full
> > trace with CONFIG_FRAME_POINTER=y?
> 
> Sorry, no camera shot, finding camera requires wakening up M. :)
> 
> It took longer than usual, but here it is
> 
>   kmap_atomic
>   get_page_from_freelist
>   __alloc_pages
>   cache_alloc_refill
>   __alloc_pages
>   cache_alloc_refill
>   kmem_cache_alloc
>   dst_alloc
>   ip_route_input
>   ip_rcv
>   netif_receive_skb
>   rtl8139_poll
>   net_rx_action
>   __do_softirq
>   do_softirq
>   irq_exit
>   do_IRQ
>   common_interrupt
>   handle_mm_fault
>   do_page_fault
>   error_core
> 
> much more loaded x86_64 box near also running 2.6.23-rc1 with debugging
> turned on, using atl1 driver doesn't experience any crashes.
> 
> And I found 2.6.22-b91cba52e9b7b3f1c0037908a192d93a869ca9e5-x entry on
> top of grub config which means b91cba52e9b7b3f1c0037908a192d93a869ca9e5
> _without_ any debugging was OK.

I worked out that the crash I saw was in

BUG_ON(!pte_none(*(kmap_pte-idx)));

in the read of kmap_pte[idx].  Which would be weird as the caller is using 
a literal KM_USER0.

So maybe I goofed, and that BUG_ON is triggering (it scrolled off, and I am
unable to reproduce it now).

If that BUG_ON _is_ triggering then it might indicate that someone is doing
a __GFP_HIGHMEM|__GFP_ZERO allocation while holding KM_USER0.
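To make the suspected failure mode concrete, here is a rough userspace model (not kernel code) of the per-CPU atomic-kmap slots. It assumes KM_TYPE_NR is 13 (inferred from the "imul $0xd" in the disassembly) and KM_USER0 is 3; NR_CPUS, slot_pte[] and the model_* helpers are made up for illustration. A second mapping of the same slot on the same CPU before the first one is released is what the BUG_ON would catch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KM_TYPE_NR 13  /* assumption: inferred from "imul $0xd" in the oops */
#define KM_USER0    3  /* assumption: slot number without highmem debugging */
#define NR_CPUS     2  /* hypothetical, just to size the table */

/* One entry per (cpu, type) slot; 0 plays the role of pte_none(). */
static unsigned long slot_pte[KM_TYPE_NR * NR_CPUS];

/* Same arithmetic as idx = type + KM_TYPE_NR * smp_processor_id(). */
static size_t slot_index(int type, int cpu)
{
    assert(type >= 0 && type < KM_TYPE_NR);
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (size_t)type + (size_t)KM_TYPE_NR * (size_t)cpu;
}

/* Returns false where the real code would hit the BUG_ON. */
static bool model_kmap_atomic(int type, int cpu, unsigned long pte)
{
    size_t idx = slot_index(type, cpu);

    if (slot_pte[idx] != 0)
        return false;   /* slot still in use: nested use of the same type */
    slot_pte[idx] = pte;
    return true;
}

/* Releases the slot so it can be mapped again. */
static void model_kunmap_atomic(int type, int cpu)
{
    slot_pte[slot_index(type, cpu)] = 0;
}
```

In this model, mapping KM_USER0 twice on CPU 0 without an unmap in between fails on the second call, which mirrors a zeroing highmem allocation made while the caller already holds KM_USER0.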

If they're holding an atomic kmap then they'll be running in_atomic so it
is unlikely that they accidentally added __GFP_WAIT because lots of people
would be getting lots of might_sleep() warnings.

Hence that first VM_BUG_ON in prep_zero_page() _should_ be triggering.

Do you have CONFIG_DEBUG_VM enabled?



Also, it might be useful to apply -mm's kmap_atomic-debugging.patch.  It
will detect lots of abuse.



Re: [PATCH]: revised make xfrm_audit_log more generic patch

2007-07-23 Thread James Morris
On Mon, 23 Jul 2007, Joy Latten wrote:

> Revised patch that modifies xfrm_audit_log() such that it
> can accommodate auditing other ipsec events
> besides add/delete of an SA or SPD entry.
> 
> This patch differs from original in that it does
> not remove existing ipsec audit defines so as
> to not break existing audit apps. 
> 
> This is a small change to accommodate updating
> ipsec protocol to RFCs 4301, 4302 and 4303 which
> require auditing some ipsec events if auditing
> is available. Please let me know if ok.
> 
> Regards,
> Joy
> 
> Signed-off-by: Joy Latten <[EMAIL PROTECTED]>

Acked-by: James Morris <[EMAIL PROTECTED]>




-- 
James Morris
<[EMAIL PROTECTED]>


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Alexey Dobriyan
On Mon, Jul 23, 2007 at 02:11:37PM -0700, Andrew Morton wrote:
> On Tue, 24 Jul 2007 01:01:53 +0400
> Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote:
> > > > I had more complete info: 
> > > > http://article.gmane.org/gmane.linux.network/66966
> > > > 
> > > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule 
> > > > that out.
> > > > 
> > > > I haven't worked out where that kmap_atomic() call is coming from yet. 
> > > > Both traces point up into the page allocator, but I _think_ that's stack
> > > > gunk.
> > > 
> > > Ahh, you suspect networking.
> > > 
> > > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network
> > > receiving ~20 junk packets per second, one gathering netconsole output
> > > and ssh to it, no conntracks and fancy stuff.
> > > 
> > > [reboots with cables physically unplugged]
> > 
> > OK, I ran a gdb recompile and cat(1) on every file in /usr/portage (shitload
> > of small files) with both cables unplugged. It all went fine for ~5 minutes;
> > after that it crashed exactly the same way 10 secs after plugging one of
> > them in.
> 
> It'd be nice to get a clean trace.  Are you able to obtain the full
> trace with CONFIG_FRAME_POINTER=y?

Sorry, no camera shot, finding camera requires wakening up M. :)

It took longer than usual, but here it is

kmap_atomic
get_page_from_freelist
__alloc_pages
cache_alloc_refill
__alloc_pages
cache_alloc_refill
kmem_cache_alloc
dst_alloc
ip_route_input
ip_rcv
netif_receive_skb
rtl8139_poll
net_rx_action
__do_softirq
do_softirq
irq_exit
do_IRQ
common_interrupt
handle_mm_fault
do_page_fault
error_core

A much more heavily loaded x86_64 box nearby, also running 2.6.23-rc1 with
debugging turned on and using the atl1 driver, doesn't experience any crashes.

And I found 2.6.22-b91cba52e9b7b3f1c0037908a192d93a869ca9e5-x entry on
top of grub config which means b91cba52e9b7b3f1c0037908a192d93a869ca9e5
_without_ any debugging was OK.



[PATCH]: revised make xfrm_audit_log more generic patch

2007-07-23 Thread Joy Latten
Revised patch that modifies xfrm_audit_log() such that it
can accommodate auditing other ipsec events
besides add/delete of an SA or SPD entry.

This patch differs from original in that it does
not remove existing ipsec audit defines so as
to not break existing audit apps. 

This is a small change to accommodate updating
ipsec protocol to RFCs 4301, 4302 and 4303 which
require auditing some ipsec events if auditing
is available. Please let me know if ok.

Regards,
Joy

Signed-off-by: Joy Latten <[EMAIL PROTECTED]>


diff -urpN linux-2.6.22/include/linux/audit.h linux-2.6.22.patch/include/linux/audit.h
--- linux-2.6.22/include/linux/audit.h  2007-07-23 14:35:28.0 -0500
+++ linux-2.6.22.patch/include/linux/audit.h  2007-07-23 14:38:51.0 -0500
@@ -112,6 +112,7 @@
 #define AUDIT_MAC_IPSEC_DELSA  1412/* Delete a XFRM state */
 #define AUDIT_MAC_IPSEC_ADDSPD 1413/* Add a XFRM policy */
 #define AUDIT_MAC_IPSEC_DELSPD 1414/* Delete a XFRM policy */
+#define AUDIT_MAC_IPSEC_EVENT  1415/* Audit IPSec events */
 
 #define AUDIT_FIRST_KERN_ANOM_MSG   1700
 #define AUDIT_LAST_KERN_ANOM_MSG1799
diff -urpN linux-2.6.22/include/net/xfrm.h linux-2.6.22.patch/include/net/xfrm.h
--- linux-2.6.22/include/net/xfrm.h 2007-07-23 14:35:28.0 -0500
+++ linux-2.6.22.patch/include/net/xfrm.h   2007-07-23 14:38:51.0 -0500
@@ -427,9 +427,11 @@ struct xfrm_audit
 
 #ifdef CONFIG_AUDITSYSCALL
 extern void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
-   struct xfrm_policy *xp, struct xfrm_state *x);
+  u16 family, xfrm_address_t saddr, 
+  xfrm_address_t daddr, __be32 spi, __be32 flowid, 
+  struct xfrm_sec_ctx *sctx, char *buf);
 #else
-#define xfrm_audit_log(a,s,t,r,p,x) do { ; } while (0)
+#define xfrm_audit_log(a,i,t,r,f,s,d,p,l,c,b) do { ; } while (0)
 #endif /* CONFIG_AUDITSYSCALL */
 
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
diff -urpN linux-2.6.22/net/key/af_key.c linux-2.6.22.patch/net/key/af_key.c
--- linux-2.6.22/net/key/af_key.c   2007-07-08 18:32:17.0 -0500
+++ linux-2.6.22.patch/net/key/af_key.c 2007-07-23 14:38:51.0 -0500
@@ -1459,7 +1459,9 @@ static int pfkey_add(struct sock *sk, st
err = xfrm_state_update(x);
 
xfrm_audit_log(audit_get_loginuid(current->audit_context), 0,
-  AUDIT_MAC_IPSEC_ADDSA, err ? 0 : 1, NULL, x);
+  AUDIT_MAC_IPSEC_EVENT, err ? 0 : 1, 
+  x->props.family, x->props.saddr, x->id.daddr, 
+  x->id.spi, 0, x->security, "SAD add");
 
if (err < 0) {
x->km.state = XFRM_STATE_DEAD;
@@ -1513,7 +1515,10 @@ static int pfkey_delete(struct sock *sk,
km_state_notify(x, &c);
 out:
xfrm_audit_log(audit_get_loginuid(current->audit_context), 0,
-  AUDIT_MAC_IPSEC_DELSA, err ? 0 : 1, NULL, x);
+  AUDIT_MAC_IPSEC_EVENT, err ? 0 : 1, x->props.family,
+  x->props.saddr, x->id.daddr, x->id.spi, 0,
+  x->security, "SAD delete");
+
xfrm_state_put(x);
 
return err;
@@ -2266,7 +2271,9 @@ static int pfkey_spdadd(struct sock *sk,
 hdr->sadb_msg_type != SADB_X_SPDUPDATE);
 
xfrm_audit_log(audit_get_loginuid(current->audit_context), 0,
-  AUDIT_MAC_IPSEC_ADDSPD, err ? 0 : 1, xp, NULL);
+  AUDIT_MAC_IPSEC_EVENT, err ? 0 : 1, 
+  xp->selector.family, xp->selector.saddr,
+  xp->selector.daddr, 0, 0, xp->security, "SPD add");
 
if (err)
goto out;
@@ -2350,7 +2357,9 @@ static int pfkey_spddelete(struct sock *
return -ENOENT;
 
xfrm_audit_log(audit_get_loginuid(current->audit_context), 0,
-  AUDIT_MAC_IPSEC_DELSPD, err ? 0 : 1, xp, NULL);
+  AUDIT_MAC_IPSEC_EVENT, err ? 0 : 1,
+  xp->selector.family, xp->selector.saddr,
+  xp->selector.daddr, 0, 0, xp->security, "SPD delete");
 
if (err)
goto out;
@@ -2611,7 +2620,10 @@ static int pfkey_spdget(struct sock *sk,
 
if (delete) {
xfrm_audit_log(audit_get_loginuid(current->audit_context), 0,
-  AUDIT_MAC_IPSEC_DELSPD, err ? 0 : 1, xp, NULL);
+  AUDIT_MAC_IPSEC_EVENT, err ? 0 : 1, 
+  xp->selector.family, xp->selector.saddr,
+  xp->selector.daddr, 0, 0, xp->security,
+  "SPD delete");
 
if (err)
goto out;
diff -urpN linux-2.6.22/net/xfrm/xfrm_policy.c linux-2.6.22.patch/net/xfrm/xfrm_policy.c
--- linux-2.6.22/net/xfrm/xfrm_policy.c 2007-07-23 14:35:29.0 -0500
+++ linux-2.6.22

Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Sam Ravnborg
> 
> For example, missing from the call graph is
> 
>   get_page_from_freelist ->
>     buffered_rmqueue ->       [ missing - inlined ]
>       prep_new_page ->        [ missing - inlined ]
>         prep_zero_page ->     [ missing - inlined ]
>           clear_highpage ->   [ missing - inlined ]
>             kmap_atomic ->    [ missing - tailcall ]
>               kmap_atomic_prot
> 
> (and I'm also pretty sure gcc 
> is overly aggressive at inlining, and that it causes us pain for 
> debugging, but whatever)

mm/page_alloc.c:static inline void prep_zero_page(struct page *page, int order, gfp_t gfp_flags)
include/linux/highmem.h:static inline void clear_highpage(struct page *page)

So at least two were explicitly marked inline.
Whether that made a difference, I dunno.

Sam


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Linus Torvalds


On Mon, 23 Jul 2007, Andrew Morton wrote:
> 
> It'd be nice to get a clean trace.  Are you able to obtain the full
> trace with CONFIG_FRAME_POINTER=y?

If you are talking about

http://userweb.kernel.org/~akpm/dsc03659.jpg

then I think that _is_ a full trace. It's certainly not very messy, and it 
seems accurate. It's just that inlining makes it much harder to see the 
call-graphs, but that's what inlining does..

For example, missing from the call graph is

get_page_from_freelist ->
  buffered_rmqueue ->       [ missing - inlined ]
    prep_new_page ->        [ missing - inlined ]
      prep_zero_page ->     [ missing - inlined ]
        clear_highpage ->   [ missing - inlined ]
          kmap_atomic ->    [ missing - tailcall ]
            kmap_atomic_prot

but I'm pretty sure the call trace is good (and I'm also pretty sure gcc 
is overly aggressive at inlining, and that it causes us pain for 
debugging, but whatever)

The earlier part of the trace looks fine too.

The only odd part I see is the existence of "dput()" there, so maybe it's 
not *quite* clean and enabling frame pointers might get rid of a few bogus 
entries, but it looks pretty close to clean.
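For readers less used to reading such traces, the effect of inlining on a backtrace can be sketched in plain C. This is only an illustration: the function names are hypothetical stand-ins, and KEEP_IN_TRACE is a made-up macro around GCC's noinline attribute. An inlined helper has no stack frame of its own, so only the out-of-line functions can show up as separate entries.

```c
#include <assert.h>

#if defined(__GNUC__)
#define KEEP_IN_TRACE __attribute__((noinline))
#else
#define KEEP_IN_TRACE
#endif

/*
 * Likely folded into its caller by an optimizing compiler, so it would
 * not appear as a separate entry in a backtrace.
 */
static inline int clear_step(int x)
{
    return x & ~1;
}

/* Forced out of line: gets its own frame and stays visible in a trace. */
static KEEP_IN_TRACE int prep_step(int x)
{
    assert(x >= 0);
    return clear_step(x) + 1;
}

/* Entry point; a trace taken inside clear_step() would show prep_step(). */
int alloc_step(int x)
{
    return prep_step(x) * 2;
}
```

Whether a given helper is actually inlined still depends on the compiler and optimization level, so this shows the mechanism rather than what gcc did to the page allocator in this particular build.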

Linus


Re: [PATCH 1/2] forcedeth: new device ids in pci_ids.h

2007-07-23 Thread Stephen Hemminger
On Sun, 22 Jul 2007 20:43:15 -0400
Ayaz Abdulla <[EMAIL PROTECTED]> wrote:

> This patch contains new device ids for MCP73 chipset.
> 
> Signed-Off-By: Ayaz Abdulla <[EMAIL PROTECTED]>
> 

No, current practice is to stop cluttering pci_ids.h with specific sub-device
IDs and just have the vendor values.



Re: [PATCH]: Resurrect napi_poll patch.

2007-07-23 Thread Stephen Hemminger
On Mon, 23 Jul 2007 19:06:42 +0200
Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Mon, Jul 23, 2007 at 10:58:22AM +0100, Stephen Hemminger wrote:
> > On 21 Jul 2007 15:26:00 +0200
> > Andi Kleen <[EMAIL PROTECTED]> wrote:
> > 
> > > David Miller <[EMAIL PROTECTED]> writes:
> > > > 
> > > > Good candidates for taking advantage of multi-napi are:
> > > > 
> > > > 1) e1000
> > > > 2) ucc_geth
> > > > 3) ehea
> > > > 4) sunvnet
> > > 
> > > s2io.c
> > 
> > sky2.c could use it because of issues with dual-port that share
> > one napi for status.
> 
> Sorry, I didn't parse the sentence. Did you mean "couldn't use it" ...? 
> Also can you elaborate why it shouldn't work?

Sky2 would make a good case for decoupling because it can have:

    device0   device1
         \     /
        napi poll

Right now both devices share the NAPI instance on device 0,
but it takes some work to make sure that the core code doesn't
upset device1 when device0 is down.

Actually, dual-port boards are rare; I have one, but it is a PCI-E x4
board, so it is a pain to find a system with a slot.


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Andrew Morton
On Tue, 24 Jul 2007 01:01:53 +0400
Alexey Dobriyan <[EMAIL PROTECTED]> wrote:

> On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote:
> > > I had more complete info: 
> > > http://article.gmane.org/gmane.linux.network/66966
> > > 
> > > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that 
> > > out.
> > > 
> > > I haven't worked out where that kmap_atomic() call is coming from yet. 
> > > Both traces point up into the page allocator, but I _think_ that's stack
> > > gunk.
> > 
> > Ahh, you suspect networking.
> > 
> > Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network
> > receiving ~20 junk packets per second, one gathering netconsole output
> > and ssh to it, no conntracks and fancy stuff.
> > 
> > [reboots with cables physically unplugged]
> 
> OK, I ran a gdb recompile and cat(1) on every file in /usr/portage (shitload
> of small files) with both cables unplugged. It all went fine for ~5 minutes;
> after that it crashed exactly the same way 10 secs after plugging one of
> them in.

It'd be nice to get a clean trace.  Are you able to obtain the full
trace with CONFIG_FRAME_POINTER=y?


Odd behaviour of proxy_arp

2007-07-23 Thread Lennart Sorensen
I have been seeing some occasional strange behavior when using
proxy_arp.  I have a router running with an ADSL PPPoE link to the
Internet, and an Ethernet link to a local network.  It has proxy_arp
enabled on the internal Ethernet port since I sometimes have ipsec
tunnels running where I use proxy_arp to proxy for the IP assigned to
the other end of the tunnel so that local machines can find and reach
it.  I run two independent subnets on the local network (one with fixed
IPs for my machines here, and another with DHCP addresses for guest
machines that visit occasionally just to give them Internet access).

I run 10.0.0.0/8 and 192.168.254.0/24 on the local network with the
router having an IP in each subnet.

The strangeness that occurs is that once in a while there is a 10
second period where the system will answer all arp requests for all IPs
on the local network with its own MAC address, which is clearly wrong
since it doesn't have any of those IP addresses.  It seems to happen
every couple of days or so on average, although not at any specific
time.  One day it happened at 11:32:30 to 11:32:39, and a few days later
it happened at 12:08:38 to 12:08:48.  If I disable proxy_arp, it never
happens at all, but then I lose the ability to do what I have proxy_arp
enabled for in the first place.

Related to that problem, there is also the annoyance that arp requests
for any IP that isn't part of either of the two subnets the router
belongs to are answered by the router all the time.  It should not be
answering those either, since it doesn't actually have a clue what those
IP addresses belong to and certainly has no idea where it should forward
to reach them.  I occasionally have other random subnets in use on the
network for running local test networks separate from everything else.
It would be great if the kernel would keep its nose out of those subnets
too.

So far I have seen this behavior with 2.6.8, 2.6.16, and 2.6.18 (being
the kernels I have run on this router).

So have I misunderstood something about what proxy_arp is supposed to
do, or is proxy_arp in the kernel simply broken, or is it perhaps
mis-designed?  Are there some tuning parameters that could perhaps make
it actually do what one would expect it to be doing?

--
Len Sorensen


Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Alexey Dobriyan
On Tue, Jul 24, 2007 at 12:40:45AM +0400, Alexey Dobriyan wrote:
> > I had more complete info: http://article.gmane.org/gmane.linux.network/66966
> > 
> > You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that 
> > out.
> > 
> > I haven't worked out where that kmap_atomic() call is coming from yet. 
> > Both traces point up into the page allocator, but I _think_ that's stack
> > gunk.
> 
> Ahh, you suspect networking.
> 
> Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network
> receiving ~20 junk packets per second, one gathering netconsole output
> and ssh to it, no conntracks and fancy stuff.
> 
> [reboots with cables physically unplugged]

OK, I ran a gdb recompile and cat(1) on every file in /usr/portage (shitload
of small files) with both cables unplugged. It all went fine for ~5 minutes;
after that it crashed exactly the same way 10 secs after plugging one of
them in.



[PATCH 2/2] forcedeth: mcp73 device addition

2007-07-23 Thread Ayaz Abdulla

This patch contains new device settings for MCP73 chipset.

Signed-Off-By: Ayaz Abdulla <[EMAIL PROTECTED]>

--- old/drivers/net/forcedeth.c 2007-07-22 19:02:41.0 -0400
+++ new/drivers/net/forcedeth.c 2007-07-22 19:31:56.0 -0400
@@ -5550,6 +5550,22 @@
 		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_27),
 		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
 	},
+	{	/* MCP73 Ethernet Controller */
+		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_28),
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+	},
+	{	/* MCP73 Ethernet Controller */
+		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_29),
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+	},
+	{	/* MCP73 Ethernet Controller */
+		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_30),
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+	},
+	{	/* MCP73 Ethernet Controller */
+		PCI_DEVICE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NVENET_31),
+		.driver_data = DEV_NEED_TIMERIRQ|DEV_NEED_LINKTIMER|DEV_HAS_HIGH_DMA|DEV_HAS_POWER_CNTRL|DEV_HAS_MSI|DEV_HAS_PAUSEFRAME_TX|DEV_HAS_STATISTICS_V2|DEV_HAS_TEST_EXTENDED|DEV_HAS_MGMT_UNIT,
+	},
 	{0,},
 };
 


[PATCH 1/2] forcedeth: new device ids in pci_ids.h

2007-07-23 Thread Ayaz Abdulla

This patch contains new device ids for MCP73 chipset.

Signed-Off-By: Ayaz Abdulla <[EMAIL PROTECTED]>

--- old/include/linux/pci_ids.h 2007-07-22 18:57:26.0 -0400
+++ new/include/linux/pci_ids.h 2007-07-22 18:57:11.0 -0400
@@ -1223,6 +1223,10 @@
 #define PCI_DEVICE_ID_NVIDIA_NVENET_25  0x054D
 #define PCI_DEVICE_ID_NVIDIA_NVENET_26  0x054E
 #define PCI_DEVICE_ID_NVIDIA_NVENET_27  0x054F
+#define PCI_DEVICE_ID_NVIDIA_NVENET_28  0x07DC
+#define PCI_DEVICE_ID_NVIDIA_NVENET_29  0x07DD
+#define PCI_DEVICE_ID_NVIDIA_NVENET_30  0x07DE
+#define PCI_DEVICE_ID_NVIDIA_NVENET_31  0x07DF
 #define PCI_DEVICE_ID_NVIDIA_NFORCE_MCP67_IDE   0x0560
 #define PCI_DEVICE_ID_NVIDIA_NFORCE_MCP73_IDE   0x056C
 #define PCI_DEVICE_ID_NVIDIA_NFORCE_MCP77_IDE   0x0759


Re: tg3 issues

2007-07-23 Thread Michael Chan
On Sun, 2007-07-22 at 13:43 +0200, patric wrote:
> patric wrote:
> 
> > Hi,
> >
> > Think i got something working for me at least, and the fix is quite 
> > minimal and only downside that i could see from it was that you might 
> > get a small delay when bringing up the interface, but that's probably 
> > better than getting a non-functional interface that reports that it's up.
> >
> > The fix seems to be quite simple with just a random sleep at the end 
> > of  "tg3_setup_fiber_by_hand():"
> >
> >tw32_f(MAC_MODE, tp->mac_mode);
> >udelay(40);
> >}
> >
> > out:
> >udelay( net_random() % 400 );
> >return current_link_up;
> > }
> >
> > Not sure that this is a good fix or if it might break on other 
> > systems, but maybe you could have a quick look at that?
> >
> > Regards,
> > Patric
> >
> >
> Last update on this.
> 
> "udelay( 100 + net_random() % 300 )" seems to work much better and i have 
> not had a single problem getting the link up within 10 seconds of a cold 
> or warm-boot, and most often the link comes up directly without any sort 
> of delay instead like before when it could hang for 30 seconds before 
> getting a link, if you even got a link.
> 

We'll have to do some testing to see if we can find a better solution.
Adding up to 400 usec of busy wait is not ideal.  Are you connecting two
5701 fiber cards directly to each other in your setup?
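For reference, the range of the proposed delay is easy to pin down with a small userspace sketch. The helper name and the two constants are made up for illustration, and rnd stands in for whatever net_random() would return; this is not the driver code.

```c
#include <assert.h>

#define JITTER_BASE_USEC 100u  /* fixed part of the proposed delay */
#define JITTER_SPAN_USEC 300u  /* width of the random part */

/* Models udelay(100 + net_random() % 300); rnd is the random input. */
static unsigned int link_jitter_usec(unsigned int rnd)
{
    unsigned int delay = JITTER_BASE_USEC + rnd % JITTER_SPAN_USEC;

    /* The result always lies in the range 100..399 usec. */
    assert(delay >= 100u && delay < 400u);
    return delay;
}
```

So the revised formula busy-waits between 100 and 399 usec on every call, which is consistent with the "up to 400 usec" concern.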



Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Alexey Dobriyan
On Mon, Jul 23, 2007 at 01:24:31PM -0700, Andrew Morton wrote:
> On Mon, 23 Jul 2007 23:01:52 +0400
> Alexey Dobriyan <[EMAIL PROTECTED]> wrote:
> 
> > On Mon, Jul 23, 2007 at 10:38:39PM +0400, Alexey Dobriyan wrote:
> > > Managed to hit BUG_ON() in kmap_atomic_prot() three times while doing
> > > nothing unusual for this box (two times it was under X, so I can't
> > > guarantee, one time while trying to reproduce via ./configure in gdb
> > > tarball)
> 
> Yeah, I hit this several times a few days ago.  Same story: it just
> randomly went splat in response to no obvious stimulus.  Reported it to
> netdev, was greeted with stunned silence.
> 
> 
> > > Box has 2.5G of RAM. 2.6.22 was OK.
> > > 
> > > [dives into framebuffer console setup for complete oops]
> > 
> > kernel BUG at arch/i386/mm/highmem.c:38
> > PREEMPT DEBUG_PAGEALLOC SLAB
> > EIP at kmap_atomic_prot+0x32/0x93
> > get_page_from_freelist
> > __alloc_pages
> > cache_alloc_refill
> > cache_alloc_refill
> > kmem_cache_alloc
> > dst_alloc
> > dst_alloc
> > __ip_route_output_key
> > [some junk I don't trust]
> > 
> > eax: 000c
> > ebx: 0003
> > ecx: c065efe0
> > edx: 0003
> > edi: 0163
> > 
> > 
> > c010cc9b :
> > c010cc9b:   57                      push   %edi
> > c010cc9c:   56                      push   %esi
> > c010cc9d:   53                      push   %ebx
> > c010cc9e:   89 c6                   mov    %eax,%esi
> > c010cca0:   89 d3                   mov    %edx,%ebx
> > c010cca2:   89 cf                   mov    %ecx,%edi
> > c010cca4:   b8 01 00 00 00          mov    $0x1,%eax
> > c010cca9:   e8 dd 1b 00 00          call   c010e88b
> > c010ccae:   e8 b1 ac 0e 00          call   c01f7964
> > c010ccb3:   6b c0 0d                imul   $0xd,%eax,%eax
> > c010ccb6:   8d 14 03                lea    (%ebx,%eax,1),%edx
> > c010ccb9:   8d 04 95 00 00 00 00    lea    0x0(,%edx,4),%eax
> > c010ccc0:   8b 0d 30 a1 3e c0       mov    0xc03ea130,%ecx
> > c010ccc6:   29 c1                   sub    %eax,%ecx
> > c010ccc8:   83 39 00                cmpl   $0x0,(%ecx)
> > c010cccb:   74 04                   je     c010ccd1
> > c010cccd:   0f 0b                   ud2a
> 
> I had more complete info: http://article.gmane.org/gmane.linux.network/66966
> 
> You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out.
> 
> I haven't worked out where that kmap_atomic() call is coming from yet. 
> Both traces point up into the page allocator, but I _think_ that's stack
> gunk.

Ahh, you suspect networking.

Here, setup is 2 cheap-ass 100Mb realtek 8139 NICs, one to campus network
receiving ~20 junk packets per second, one gathering netconsole output
and ssh to it, no conntracks and fancy stuff.

[reboots with cables physically unplugged]



Re: [Bugme-new] [Bug 8778] New: Ocotea board: kernel reports access of bad area during boot with DEBUG_SLAB=y

2007-07-23 Thread Christoph Lameter
On Wed, 18 Jul 2007 09:55:37 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> hm.  It should be the case that providing SLAB_HWCACHE_ALIGN at
> kmem_cache_create() time will override slab-debugging's offsetting
> of the returned addresses.


That is true for SLUB but not in SLAB. SLAB has always ignored
SLAB_HWCACHE_ALIGN when debugging is on because of the issues involved
in placing the redzone values etc.  Could be fun to fix.



Re: 2.6.23-rc1: BUG_ON in kmap_atomic_prot()

2007-07-23 Thread Andrew Morton
On Mon, 23 Jul 2007 23:01:52 +0400
Alexey Dobriyan <[EMAIL PROTECTED]> wrote:

> On Mon, Jul 23, 2007 at 10:38:39PM +0400, Alexey Dobriyan wrote:
> > Managed to hit BUG_ON() in kmap_atomic_prot() three times while doing
> > nothing unusual for this box (two times it was under X, so I can't
> > guarantee, one time while trying to reproduce via ./configure in gdb
> > tarball)

Yeah, I hit this several times a few days ago.  Same story: it just
randomly went splat in response to no obvious stimulus.  Reported it to
netdev, was greeted with stunned silence.


> > Box has 2.5G of RAM. 2.6.22 was OK.
> > 
> > [dives into framebuffer console setup for complete oops]
> 
> kernel BUG at arch/i386/mm/highmem.c:38
> PREEMPT DEBUG_PAGEALLOC SLAB
> EIP at kmap_atomic_prot+0x32/0x93
>   get_page_from_freelist
>   __alloc_pages
>   cache_alloc_refill
>   cache_alloc_refill
>   kmem_cache_alloc
>   dst_alloc
>   dst_alloc
>   __ip_route_output_key
>   [some junk I don't trust]
> 
> eax: 000c
> ebx: 0003
> ecx: c065efe0
> edx: 0003
> edi: 0163
> 
> 
> c010cc9b :
> c010cc9b:   57                      push   %edi
> c010cc9c:   56                      push   %esi
> c010cc9d:   53                      push   %ebx
> c010cc9e:   89 c6                   mov    %eax,%esi
> c010cca0:   89 d3                   mov    %edx,%ebx
> c010cca2:   89 cf                   mov    %ecx,%edi
> c010cca4:   b8 01 00 00 00          mov    $0x1,%eax
> c010cca9:   e8 dd 1b 00 00          call   c010e88b
> c010ccae:   e8 b1 ac 0e 00          call   c01f7964
> c010ccb3:   6b c0 0d                imul   $0xd,%eax,%eax
> c010ccb6:   8d 14 03                lea    (%ebx,%eax,1),%edx
> c010ccb9:   8d 04 95 00 00 00 00    lea    0x0(,%edx,4),%eax
> c010ccc0:   8b 0d 30 a1 3e c0       mov    0xc03ea130,%ecx
> c010ccc6:   29 c1                   sub    %eax,%ecx
> c010ccc8:   83 39 00                cmpl   $0x0,(%ecx)
> c010cccb:   74 04                   je     c010ccd1
> c010cccd:   0f 0b                   ud2a

I had more complete info: http://article.gmane.org/gmane.linux.network/66966

You're using DEBUG_PAGEALLOC, but I was not, so I think we can rule that out.

I haven't worked out where that kmap_atomic() call is coming from yet. 
Both traces point up into the page allocator, but I _think_ that's stack
gunk.


Re: [Bugme-new] [Bug 8797] New: WARNING from skb_gso_segment

2007-07-23 Thread Andrew Morton
On Mon, 23 Jul 2007 09:36:08 -0700 (PDT)
[EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8797
> 
>Summary: WARNING from skb_gso_segment
>Product: Networking
>Version: 2.5
>  KernelVersion: 2.6.22.1
>   Platform: All
> OS/Version: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: IPV4
> AssignedTo: [EMAIL PROTECTED]
> ReportedBy: [EMAIL PROTECTED]
> CC: [EMAIL PROTECTED]
> 
> 
> Most recent kernel where this bug did not occur: 2.6.20.4
> 
> Distribution: kernel.org
> 
> Hardware Environment: x86, dual-core Xeon, tg3, e1000
> 
> Software Environment: ip_vs, bonding
> 
> Problem Description: After upgrading kernels from 2.6.20.4 to 2.6.22.1, I
> started getting tons (27,000+ a day) of the warning messages below resulting
> from this change:
> 
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.22.y.git;a=commitdiff;h=f9d106a6d53b57b78eae5544f9582c643343a764
> 
> This is on a system that uses ip_vs to load balance DNS traffic. Other than
> these warnings, everything appears to be normal with the load balanced DNS
> traffic.
> 
> Jul 17 09:29:40 hlb1 kernel: WARNING: at net/core/dev.c:1286 skb_gso_segment()
> Jul 17 09:29:40 hlb1 kernel:  [] skb_gso_segment+0x92/0x184
> Jul 17 09:29:40 hlb1 kernel:  [] dev_gso_segment+0x10/0x34
> Jul 17 09:29:40 hlb1 kernel:  [] dev_hard_start_xmit+0x4b/0xb1
> Jul 17 09:29:40 hlb1 kernel:  [] dev_queue_xmit+0x172/0x1fd
> Jul 17 09:29:40 hlb1 kernel:  [] ip_output+0x274/0x2ad
> Jul 17 09:29:40 hlb1 kernel:  [] ip_finish_output+0x0/0x20d
> Jul 17 09:29:40 hlb1 kernel:  [] ip_queue_xmit+0x35e/0x3a6
> Jul 17 09:29:40 hlb1 kernel:  [] do_ip_vs_get_ctl+0x6f5/0x704 [ip_vs]
> Jul 17 09:29:40 hlb1 kernel:  [] ip_output+0x274/0x2ad
> Jul 17 09:29:40 hlb1 kernel:  [] ip_vs_dr_xmit+0x345/0x37b [ip_vs]
> Jul 17 09:29:40 hlb1 kernel:  [] _spin_lock_bh+0x8/0x10
> Jul 17 09:29:40 hlb1 kernel:  [] tcp_cwnd_restart+0x17/0xad
> Jul 17 09:29:40 hlb1 kernel:  [] tcp_transmit_skb+0x3e0/0x402
> Jul 17 09:29:40 hlb1 kernel:  [] tso_fragment+0x15a/0x194
> Jul 17 09:29:40 hlb1 kernel:  [] tcp_write_xmit+0x1cc/0x22e
> Jul 17 09:29:40 hlb1 kernel:  [] __tcp_push_pending_frames+0x11/0x60
> Jul 17 09:29:40 hlb1 kernel:  [] tcp_sendmsg+0x9df/0xa95
> Jul 17 09:29:40 hlb1 kernel:  [] inet_sendmsg+0x39/0x43
> Jul 17 09:29:40 hlb1 kernel:  [] do_sock_write+0xab/0xb2
> Jul 17 09:29:40 hlb1 kernel:  [] sock_aio_write+0x50/0x5c
> Jul 17 09:29:40 hlb1 kernel:  [] do_sync_write+0xbf/0xfc
> Jul 17 09:29:40 hlb1 kernel:  [] mutex_lock+0x13/0x22
> Jul 17 09:29:40 hlb1 kernel:  [] autoremove_wake_function+0x0/0x33
> Jul 17 09:29:40 hlb1 kernel:  [] sock_common_getsockopt+0x1c/0x21
> Jul 17 09:29:40 hlb1 kernel:  [] sys_getsockopt+0x7d/0x9c
> Jul 17 09:29:40 hlb1 kernel:  [] vfs_write+0xc9/0x133
> Jul 17 09:29:40 hlb1 kernel:  [] sys_write+0x41/0x67
> Jul 17 09:29:40 hlb1 kernel:  [] sysenter_past_esp+0x5f/0x85
> Jul 17 09:29:40 hlb1 kernel:  ===
> 
> 
> Steps to reproduce: Unsure
> 
> 
> -- 
> Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
> --- You are receiving this mail because: ---
> You are on the CC list for the bug, or are watching someone who is.


RE: [PATCH] TIPC: fix tipc_link_create error handling

2007-07-23 Thread Stephens, Allan
Hi Florian:

Changes look pretty good to me.

I'd also recommend deferring the call to k_init_timer() to the same
point as your list_add_tail() call.  (If you don't, then there should
really be a k_term_timer() call in the clean up code that handles a
failure of tipc_node_attach_link().)
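The general shape being discussed (allocate everything that can fail first, and only publish the object on the shared list once nothing else can fail) can be sketched in userspace C roughly as follows. All names here are hypothetical stand-ins, not the TIPC structures; the attach_ok flag simply simulates whether tipc_node_attach_link() would succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the link and bearer structures. */
struct demo_link {
    char *print_buf;
    struct demo_link *next;
};

struct demo_bearer {
    struct demo_link *links;   /* list of published links */
};

/* attach_ok simulates the outcome of the node-attach step. */
static struct demo_link *demo_link_create(struct demo_bearer *b,
                                          size_t log_size, bool attach_ok)
{
    struct demo_link *l;

    assert(b != NULL);
    l = calloc(1, sizeof(*l));
    if (!l)
        return NULL;

    /* Step 1: allocate the print buffer early, before anyone can see l. */
    if (log_size) {
        l->print_buf = malloc(log_size);
        if (!l->print_buf) {
            free(l);
            return NULL;
        }
    }

    /* Step 2: the attach step; on failure undo everything done so far. */
    if (!attach_ok) {
        free(l->print_buf);
        free(l);
        return NULL;
    }

    /* Step 3: publish on the bearer's list only after all steps succeeded. */
    l->next = b->links;
    b->links = l;
    return l;
}
```

With this ordering a failed create leaves the bearer's list untouched, so no stale reference survives. Any per-link resource that needs explicit teardown (the timer, in the case above) would either be set up in step 3 as well or be torn down in the step 2 failure path.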

Regards,
Al 

-Original Message-
From: Florian Westphal [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 23, 2007 2:34 PM
To: netdev@vger.kernel.org
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; Stephens, Allan;
[EMAIL PROTECTED]
Subject: [PATCH] TIPC: fix tipc_link_create error handling

if printbuf allocation or tipc_node_attach_link() fails, invalid
references to the link are left in the associated node and bearer
structures.
Fix by doing printbuf allocation early and adding the new link to
b_ptr->links after tipc_node_attach_link() succeeded.

Signed-off-by: Florian Westphal <[EMAIL PROTECTED]>
---
 net/tipc/link.c |   26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

Allan/Jon/Per: I'd appreciate if you could check whether I missed
something.

diff --git a/net/tipc/link.c b/net/tipc/link.c index 5adfdfd..9917c64
100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -423,6 +423,17 @@ struct link *tipc_link_create(struct bearer *b_ptr,
const u32 peer,
return NULL;
}
 
+   if (LINK_LOG_BUF_SIZE) {
+   char *pb = kmalloc(LINK_LOG_BUF_SIZE, GFP_ATOMIC);
+
+   if (!pb) {
+   kfree(l_ptr);
+   warn("Link creation failed, no memory for print
buffer\n");
+   return NULL;
+   }
+   tipc_printbuf_init(&l_ptr->print_buf, pb,
LINK_LOG_BUF_SIZE);
+   }
+
l_ptr->addr = peer;
if_name = strchr(b_ptr->publ.name, ':') + 1;
sprintf(l_ptr->name, "%u.%u.%u:%s-%u.%u.%u:", @@ -433,7 +444,6
@@ struct link *tipc_link_create(struct bearer *b_ptr, const u32 peer,
/* note: peer i/f is appended to link name by
reset/activate */
memcpy(&l_ptr->media_addr, media_addr, sizeof(*media_addr));
k_init_timer(&l_ptr->timer, (Handler)link_timeout, (unsigned
long)l_ptr);
-   list_add_tail(&l_ptr->link_list, &b_ptr->links);
l_ptr->checkpoint = 1;
l_ptr->b_ptr = b_ptr;
link_set_supervision_props(l_ptr, b_ptr->media->tolerance); @@
-459,21 +469,13 @@ struct link *tipc_link_create(struct bearer *b_ptr,
const u32 peer,
 
l_ptr->owner = tipc_node_attach_link(l_ptr);
if (!l_ptr->owner) {
+   if (LINK_LOG_BUF_SIZE)
+   kfree(l_ptr->print_buf.buf);
kfree(l_ptr);
return NULL;
}
 
-   if (LINK_LOG_BUF_SIZE) {
-   char *pb = kmalloc(LINK_LOG_BUF_SIZE, GFP_ATOMIC);
-
-   if (!pb) {
-   kfree(l_ptr);
-   warn("Link creation failed, no memory for print
buffer\n");
-   return NULL;
-   }
-   tipc_printbuf_init(&l_ptr->print_buf, pb,
LINK_LOG_BUF_SIZE);
-   }
-
+   list_add_tail(&l_ptr->link_list, &b_ptr->links);
tipc_k_signal((Handler)tipc_link_start, (unsigned long)l_ptr);
 
dbg("tipc_link_create(): tolerance = %u,cont intv = %u,
abort_limit = %u\n",


Re: [BUG] Lockup on boot when trying to bring up r8169 NIC

2007-07-23 Thread Andy Gospodarek
On Thu, Jul 19, 2007 at 04:11:35PM +0200, Thomas Müller wrote:
> Hi,
> 
> I already sent this two days ago, but I have the feeling it was
> overlooked or filtered because of a large attachment.
> 
> 
> If I try to boot 2.6.21.6, 2.6.22.1 or 2.6.22-git8 the system completely
> hangs when init tries to bring up my r8169-based NIC. Not even the
> keyboard lights are working anymore.
> 
> If I unplug the network cable, boot continues just fine and everything
> works as it should.
> If I boot with the cable unplugged, the system also hangs and continues
> after I plug in the cable.
> 
> 
> Everything works fine with 2.6.20.15.
> 
> 
> Configuration:
>   http://www.mathtm.de/config_2.6.20.15_fc6based
>   http://www.mathtm.de/config_2.6.21.6_f7based
> 
> 
> Using a Fedora kernel (based on 2.6.21.5) I get the following kernel
> message:
> r8169: eth0: link down
> BUG: soft lockup detected on CPU#0!
>  [] softlockup_tick+0xa5/0xb4
>  [] update_process_times+0x3b/0x5e
>  [] tick_sched_timer+0x57/0x9a
>  [] hrtimer_interrupt+0x12b/0x1b6
>  [] tick_sched_timer+0x0/0x9a
>  [] timer_interrupt+0x2c/0x32
>  [] handle_IRQ_event+0x1a/0x3f
>  [] handle_level_irq+0x81/0xc7
>  [] do_IRQ+0xb8/0xd1
>  [] common_interrupt+0x23/0x28
>  [] handle_IRQ_event+0x11/0x3f
>  [] handle_level_irq+0x81/0xc7
>  [] handle_level_irq+0x0/0xc7
>  [] do_IRQ+0xac/0xd1
>  [] common_interrupt+0x23/0x28
>  [] __do_softirq+0x54/0xba
>  [] do_softirq+0x59/0xb1
>  [] handle_level_irq+0x0/0xc7
>  [] irq_exit+0x38/0x6b
>  [] do_IRQ+0xbd/0xd1
>  [] common_interrupt+0x23/0x28
>  [] find_busiest_group+0x264/0x4c5
>  [] _spin_unlock_irqrestore+0x8/0x9
>  [] __mod_timer+0xa1/0xab
>  [] rtl8169_open+0x12e/0x194 [r8169]
>  [] dev_open+0x2b/0x62
>  [] dev_change_flags+0x47/0xe4
>  [] devinet_ioctl+0x250/0x56a
>  [] copy_to_user+0x3c/0x50
>  [] sock_ioctl+0x19f/0x1be
>  [] sock_ioctl+0x0/0x1be
>  [] do_ioctl+0x1f/0x62
>  [] vfs_ioctl+0x244/0x256
>  [] sys_ioctl+0x4c/0x64
>  [] syscall_call+0x7/0xb
>  ===
> r8169: eth0: link up
> 
> 
> 
> There already is a bugzilla entry at
> http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242572
> I know, not everyone is a fan of bugzilla, but maybe someone wants to
> take a look at what was discussed there.
> 
> 
> Please CC me as I'm not subscribed to the list and don't hesitate to
> tell me that I forgot to include some crucial information ;)
> 
> 
> Regards,
> Thomas
> 


As you already know, this seems to be caused by an undesirable
interaction between the r8169 driver and a kernel with
CONFIG_DEBUG_SHIRQ=y.  

It seems rtl8169_interrupt will spin (or hang) while booting until there
is an interrupt to service (like a link-change event).
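
For context, CONFIG_DEBUG_SHIRQ makes the kernel invoke the handler of a shared interrupt even though the device has raised nothing, so a handler has to cope with an empty status and return promptly. The following is a user-space mock of that expectation, not the r8169 code and not a diagnosis of the actual bug: struct mock_nic, mock_interrupt() and the loop budget of 16 are all invented for illustration; only the names IRQ_NONE and IRQ_HANDLED mirror the kernel's.

```c
/* User-space mock; IRQ_NONE/IRQ_HANDLED mirror the kernel's names. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

struct mock_nic {
	unsigned int status;	/* pending event bits, 0 when idle */
	unsigned int handled;	/* number of events serviced so far */
};

/*
 * Service pending events. When called with nothing pending (as a
 * spurious shared-IRQ test call would do) the loop body never runs
 * and the handler reports IRQ_NONE instead of waiting for an event.
 */
static enum irqreturn mock_interrupt(struct mock_nic *nic)
{
	enum irqreturn ret = IRQ_NONE;
	int budget = 16;	/* arbitrary bound so the loop always ends */

	while (nic->status != 0 && budget-- > 0) {
		unsigned int lowest = nic->status & -nic->status;

		nic->status &= ~lowest;	/* acknowledge one event */
		nic->handled++;
		ret = IRQ_HANDLED;
	}
	return ret;
}
```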

-andy



Re: [PATCH 5/5] dma: use dev_to_node to get node for device in dma_alloc_pages

2007-07-23 Thread Christoph Lameter
On Tue, 10 Jul 2007 16:53:09 -0700
Yinghai Lu <[EMAIL PROTECTED]> wrote:

> [PATCH 5/5] dma: use dev_to_node to get node for device in
> dma_alloc_pages

Acked-by: Christoph Lameter <[EMAIL PROTECTED]>


[PATCH [Bug 8756]] IPv6: Don't update ADVMSS on routes where the MTU is not also updated

2007-07-23 Thread Simon Arlott
The ADVMSS value was incorrectly updated for ALL routes when the MTU 
is updated because it's outside the effect of the if statement's 
condition.

Signed-off-by: Simon Arlott <[EMAIL PROTECTED]>
---
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=8756

 net/ipv6/route.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 919de68..55ea80f 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1983,9 +1983,10 @@ static int rt6_mtu_change_route(struct rt6_info *rt, 
void *p_arg)
!dst_metric_locked(&rt->u.dst, RTAX_MTU) &&
(dst_mtu(&rt->u.dst) > arg->mtu ||
 (dst_mtu(&rt->u.dst) < arg->mtu &&
- dst_mtu(&rt->u.dst) == idev->cnf.mtu6)))
+ dst_mtu(&rt->u.dst) == idev->cnf.mtu6))) {
rt->u.dst.metrics[RTAX_MTU-1] = arg->mtu;
-   rt->u.dst.metrics[RTAX_ADVMSS-1] = ipv6_advmss(arg->mtu);
+   rt->u.dst.metrics[RTAX_ADVMSS-1] = ipv6_advmss(arg->mtu);
+   }
return 0;
 }
 
-- 
1.5.0.1

-- 
Simon Arlott
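
The effect of the missing braces can be reproduced in a few lines of user-space C. This is a simplified sketch, not the kernel function: struct route, the reduced condition, and advmss_for() (which just subtracts an assumed 40-byte IPv6 header and 20-byte TCP header, with no clamping) are stand-ins invented for illustration.

```c
/* Hypothetical, simplified route entry; not the kernel's struct rt6_info. */
struct route {
	unsigned int mtu;
	unsigned int advmss;
	int mtu_locked;
};

/* Assumed: 40-byte IPv6 header plus 20-byte TCP header, no clamping. */
static unsigned int advmss_for(unsigned int mtu)
{
	return mtu - 60;
}

/* Before the fix: only the first assignment is guarded by the if. */
static void mtu_change_buggy(struct route *rt, unsigned int new_mtu)
{
	if (!rt->mtu_locked && rt->mtu > new_mtu)
		rt->mtu = new_mtu;
	rt->advmss = advmss_for(new_mtu);	/* runs for every route */
}

/* After the fix: both assignments sit inside the braces. */
static void mtu_change_fixed(struct route *rt, unsigned int new_mtu)
{
	if (!rt->mtu_locked && rt->mtu > new_mtu) {
		rt->mtu = new_mtu;
		rt->advmss = advmss_for(new_mtu);
	}
}
```

With a route whose MTU is locked, the first variant still rewrites advmss while leaving mtu alone; the second leaves both untouched.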




Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)

2007-07-23 Thread Rick Jones

David Miller wrote:

From: Rick Jones <[EMAIL PROTECTED]>
Date: Fri, 13 Jul 2007 09:55:10 -0700



Fine, but so?  I suspect the point of the patch is to provide a
lower cap on the accumulated backoff so data starts flowing over the
connection within that lower cap once the link is
restored/failed-over.



The backoff is there for a reason.


I'm not disputing the general value of the backoff, nor about the value of an 
initial value of 60 seconds.  In terms of avoiding congestive collapse one does 
indeed want the exponential backoff.  I'm just in agreement with the person from 
Hitachi that allowing someone to tweak the backoff has a certain value.


60 seconds is already a trade-off between a pure (non capped) exponential 
backoff and capping the value.
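
To make the trade-off concrete, here is a small sketch of a capped exponential backoff. It is not the kernel's implementation; rto_after() is a hypothetical helper, and the 3-second initial value and 60-second cap in the usage note are illustrative numbers only.

```c
/*
 * Retransmission timeout (in seconds) in effect after 'retries'
 * consecutive timeouts: the value doubles each time until it
 * reaches 'cap' and then stays there.
 */
static unsigned int rto_after(unsigned int initial, unsigned int cap,
			      unsigned int retries)
{
	unsigned int rto = initial < cap ? initial : cap;

	while (retries-- > 0 && rto < cap)
		rto = (rto > cap / 2) ? cap : rto * 2;	/* double, clamp at cap */

	return rto;
}
```

With initial = 3 and cap = 60 the successive values are 3, 6, 12, 24, 48, 60, 60, ... so a lower cap bounds how long a restored link can sit idle waiting for the next retransmission.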


rick


specifying scopid's for link-local IPv6 addrs

2007-07-23 Thread Rick Jones

Folks -

People running netperf have reported that they have trouble with IPv6 under 
Linux.  Specifically, whereas the use of link-local IPv6 addresses "just works" 
in netperf under a number of "other OSes" they do not under Linux.  I'm 
ass-u-me-ing 2.6 here, but not sure exactly which ones - I've seen it on a 
2.6.18-based RHEL5.


Some poking about and conversation has suggested that one has to set a 
sin6_scope_id in the sockaddr_in6.  This needs to be an index of one of the 
interfaces in the system, which I presume means walking some additional structures.
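
As a rough sketch of what setting the scope id can look like, the interface index can be obtained from an interface name with if_nametoindex() rather than by walking interface structures by hand. The helper name make_link_local() and its parameters are invented for illustration; the interface name has to come from the user or from configuration.

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Fill 'sa' for a link-local destination reached through 'ifname'.
 * Returns 0 on success, -1 if the address or interface name is invalid.
 */
static int make_link_local(struct sockaddr_in6 *sa, const char *addr,
			   const char *ifname, unsigned short port)
{
	unsigned int idx;

	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	sa->sin6_port = htons(port);

	if (inet_pton(AF_INET6, addr, &sa->sin6_addr) != 1)
		return -1;

	idx = if_nametoindex(ifname);	/* 0 means no such interface */
	if (idx == 0)
		return -1;

	sa->sin6_scope_id = idx;
	return 0;
}
```

An alternative that avoids the explicit lookup is to hand getaddrinfo() an address with a zone suffix such as "fe80::1%eth0" and use the sockaddr_in6 it returns, which should already carry the scope id.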


Is this a requirement which might be expected to remain in the future, or is it 
something which might just go away?  That will have an effect on netperf future 
development.


thanks,

rick jones


Re: [PATCH 0/1] ixgbe: Support for Intel(R) 10GbE PCI Express adapters - Take #2

2007-07-23 Thread Rick Jones



Bidirectional test.
87380  65536  65536    60.01      7809.57   28.66    30.02    2.405   2.519   TX
87380  65536  65536    60.01      7592.90   28.66    30.02    2.474   2.591   RX
--
87380  65536  65536    60.01      7629.73   28.32    29.64    2.433   2.546   RX
87380  65536  65536    60.01      7926.99   28.32    29.64    2.342   2.450   TX

Single netperf stream between 2 quad-core Xeon based boxes. Tested on
2.6.20 and 2.6.22 kernels. Driver uses NAPI and LRO.


The bidirectional test looks like two concurrent streams (TCP_STREAM + TCP_MAERTS), 
right?


If you want a single-stream bidirectional test, then with the top of trunk 
netperf you can use:


./configure --enable-burst
make install # yadda yadda
netperf -t TCP_RR -H  -f m -v 2 -l 60 -c -C -- -r 64K -b 12

which will cause netperf to have 13 64K transactions in flight at one time on 
the connection, which for a 64K request size has been sufficient, thus far 
anyway, to saturate things.  As there is no select/poll/whatever call in netperf 
TCP_RR it might be necessary to include test-specific -s and -S options to make 
sure the socket buffer (SO_SNDBUF) is large enough that none of those send() 
calls ever block, lest both ends end up blocked in a send() call.


The -f m will switch the output from transactions/s to megabits per second and 
is the part requiring the top of trunk netperf.  The -v 2 causes extra 
output giving bitrates in each direction and the transaction/s rate as well as 
computed average latency.  That is also in top of trunk; otherwise, for 2.4.3 
you can skip that and do the math to convert to megabits/s yourself and not get 
all the other derived values.
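
For anyone doing that math by hand, a minimal sketch follows. It assumes that "megabits" means 10^6 bits per second and that 64K means 65536 bytes; the helper name trans_to_mbit() is made up for this example.

```c
/*
 * Convert a TCP_RR transaction rate into megabits per second for one
 * direction, given the number of bytes a transaction moves that way.
 * Assumes 1 megabit = 10^6 bits.
 */
static double trans_to_mbit(double trans_per_sec, double bytes_per_trans)
{
	return trans_per_sec * bytes_per_trans * 8.0 / 1e6;
}
```

For example, 1000 transactions/s with a 65536-byte request works out to 1000 * 65536 * 8 / 10^6 = 524.288 megabits/s in the request direction; the response direction is computed the same way from the response size.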


rick jones


[PATCH] TIPC: fix tipc_link_create error handling

2007-07-23 Thread Florian Westphal
if printbuf allocation or tipc_node_attach_link() fails, invalid
references to the link are left in the associated node and bearer
structures.
Fix by doing printbuf allocation early and adding the new link
to b_ptr->links after tipc_node_attach_link() succeeded.

Signed-off-by: Florian Westphal <[EMAIL PROTECTED]>
---
 net/tipc/link.c |   26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

Allan/Jon/Per: I'd appreciate if you could check whether I missed something.

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 5adfdfd..9917c64 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -423,6 +423,17 @@ struct link *tipc_link_create(struct bearer *b_ptr, const 
u32 peer,
return NULL;
}
 
+   if (LINK_LOG_BUF_SIZE) {
+   char *pb = kmalloc(LINK_LOG_BUF_SIZE, GFP_ATOMIC);
+
+   if (!pb) {
+   kfree(l_ptr);
+   warn("Link creation failed, no memory for print 
buffer\n");
+   return NULL;
+   }
+   tipc_printbuf_init(&l_ptr->print_buf, pb, LINK_LOG_BUF_SIZE);
+   }
+
l_ptr->addr = peer;
if_name = strchr(b_ptr->publ.name, ':') + 1;
sprintf(l_ptr->name, "%u.%u.%u:%s-%u.%u.%u:",
@@ -433,7 +444,6 @@ struct link *tipc_link_create(struct bearer *b_ptr, const 
u32 peer,
/* note: peer i/f is appended to link name by reset/activate */
memcpy(&l_ptr->media_addr, media_addr, sizeof(*media_addr));
k_init_timer(&l_ptr->timer, (Handler)link_timeout, (unsigned 
long)l_ptr);
-   list_add_tail(&l_ptr->link_list, &b_ptr->links);
l_ptr->checkpoint = 1;
l_ptr->b_ptr = b_ptr;
link_set_supervision_props(l_ptr, b_ptr->media->tolerance);
@@ -459,21 +469,13 @@ struct link *tipc_link_create(struct bearer *b_ptr, const 
u32 peer,
 
l_ptr->owner = tipc_node_attach_link(l_ptr);
if (!l_ptr->owner) {
+   if (LINK_LOG_BUF_SIZE)
+   kfree(l_ptr->print_buf.buf);
kfree(l_ptr);
return NULL;
}
 
-   if (LINK_LOG_BUF_SIZE) {
-   char *pb = kmalloc(LINK_LOG_BUF_SIZE, GFP_ATOMIC);
-
-   if (!pb) {
-   kfree(l_ptr);
-   warn("Link creation failed, no memory for print 
buffer\n");
-   return NULL;
-   }
-   tipc_printbuf_init(&l_ptr->print_buf, pb, LINK_LOG_BUF_SIZE);
-   }
-
+   list_add_tail(&l_ptr->link_list, &b_ptr->links);
tipc_k_signal((Handler)tipc_link_start, (unsigned long)l_ptr);
 
dbg("tipc_link_create(): tolerance = %u,cont intv = %u, abort_limit = 
%u\n",


[2.6 patch] remove Documentation/networking/net-modules.txt

2007-07-23 Thread Adrian Bunk
According to git, the only one who touched this file during the last
5 years was me when removing drivers...

modinfo offers a less ancient version of this information.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 Documentation/networking/00-INDEX|2 
 Documentation/networking/net-modules.txt |  315 ---
 2 files changed, 317 deletions(-)

--- linux-2.6.22-rc6-mm1/Documentation/networking/00-INDEX.old  2007-07-23 
20:28:51.0 +0200
+++ linux-2.6.22-rc6-mm1/Documentation/networking/00-INDEX  2007-07-23 
20:28:59.0 +0200
@@ -80,8 +80,6 @@
- Behaviour of cards under Multicast
 ncsa-telnet
- notes on how NCSA telnet (DOS) breaks with MTU discovery enabled.
-net-modules.txt
-   - info and "insmod" parameters for all network driver modules.
 netdevices.txt
- info on network device driver functions exported to the kernel.
 olympic.txt
--- linux-2.6.22-rc6-mm1/Documentation/networking/net-modules.txt   
2007-07-12 20:43:37.0 +0200
+++ /dev/null   2006-09-19 00:45:31.0 +0200
@@ -1,315 +0,0 @@
-Wed 2-Aug-95  <[EMAIL PROTECTED]>
-
-   Linux network driver modules
-
-   Do not mistake this for "README.modules" at the top-level
-   directory!  That document tells about modules in general, while
-   this one tells only about network device driver modules.
-
-   This is a potpourri of INSMOD-time(*) configuration options
-   (if such exists) and their default values of various modules
-   in the Linux network drivers collection.
-
-   Some modules have also hidden (= non-documented) tunable values.
-   The choice of not documenting them is based on general belief, that
-   the less the user needs to know, the better.  (There are things that
-   driver developers can use, others should not confuse themselves.)
-
-   In many cases it is highly preferred that insmod:ing is done
-   ONLY with defining an explicit address for the card, AND BY
-   NOT USING AUTO-PROBING!
-
-   Now most cards have some explicitly defined base address that they
-   are compiled with (to avoid auto-probing, among other things).
-   If that compiled value does not match your actual configuration,
-   do use the "io=0xXXX" -parameter for the insmod, and give there
-   a value matching your environment.
-
-   If you are adventurous, you can ask the driver to autoprobe
-   by using the "io=0" parameter, however it is a potentially dangerous
-   thing to do in a live system.  (If you don't know where the
-   card is located, you can try autoprobing, and after possible
-   crash recovery, insmod with proper IO-address..)
-
-   --
-   (*) "INSMOD-time" means when you load module with
-   /sbin/insmod  you can feed it optional parameters.
-   See "man insmod".
-   --
-
-
-   8390 based Network Modules  (Paul Gortmaker, Nov 12, 1995)
-   --
-
-(Includes: smc-ultra, ne, wd, 3c503, hp, hp-plus, e2100 and ac3200)
-
-The 8390 series of network drivers now support multiple card systems without 
-reloading the same module multiple times (memory efficient!) This is done by 
-specifying multiple comma separated values, such as:
-
-   insmod 3c503.o io=0x280,0x300,0x330,0x350  xcvr=0,1,0,1
-
-The above would have the one module controlling four 3c503 cards, with card 2
-and 4 using external transceivers. The "insmod" manual describes the usage
-of comma separated value lists.
-
-It is *STRONGLY RECOMMENDED* that you supply "io=" instead of autoprobing.
-If an "io=" argument is not supplied, then the ISA drivers will complain
-about autoprobing being not recommended, and begrudgingly autoprobe for
-a *SINGLE CARD ONLY* -- if you want to use multiple cards you *have* to 
-supply an "io=0xNNN,0xQQQ,..." argument.
-
-The ne module is an exception to the above. A NE2000 is essentially an
-8390 chip, some bus glue and some RAM. Because of this, the ne probe is
-more invasive than the rest, and so at boot we make sure the ne probe is 
-done last of all the 8390 cards (so that it won't trip over other 8390 based
-cards) With modules we can't ensure that all other non-ne 8390 cards have
-already been found. Because of this, the ne module REQUIRES an "io=0xNNN" 
-argument passed in via insmod. It will refuse to autoprobe.
-
-It is also worth noting that auto-IRQ probably isn't as reliable during 
-the flurry of interrupt activity on a running machine. Cards such as the 
-ne2000 that can't get the IRQ setting from an EEPROM or configuration
-register are probably best supplied with an "irq=M" argument as well.
-
-
---
-Card/Module List - Configurable Parameters and Default Values
---
-
-3c501.c:
-   

Re: [Bugme-new] [Bug 8789] New: Error inserting ipt_LOG (mod_path): Device or resource busy

2007-07-23 Thread Patrick McHardy
Andrew Morton wrote:
> On Sat, 21 Jul 2007 00:44:08 -0700 (PDT) [EMAIL PROTECTED] wrote:
> 
> 
>>http://bugzilla.kernel.org/show_bug.cgi?id=8789
>>
>>   Summary: Error inserting ipt_LOG (mod_path): Device or resource
>>busy
> 
> 
> A 2.6.12 -> 2.6.22 regression.


Crap, that was my fault.

T., can you please test whether this patch fixes it?

diff --git a/net/bridge/netfilter/ebt_log.c b/net/bridge/netfilter/ebt_log.c
index 031bfa4..984e9c6 100644
--- a/net/bridge/netfilter/ebt_log.c
+++ b/net/bridge/netfilter/ebt_log.c
@@ -196,10 +196,8 @@ static int __init ebt_log_init(void)
ret = ebt_register_watcher(&log);
if (ret < 0)
return ret;
-   ret = nf_log_register(PF_BRIDGE, &ebt_log_logger);
-   if (ret < 0 && ret != -EEXIST)
-   ebt_unregister_watcher(&log);
-   return ret;
+   nf_log_register(PF_BRIDGE, &ebt_log_logger);
+   return 0;
 }
 
 static void __exit ebt_log_fini(void)
diff --git a/net/bridge/netfilter/ebt_ulog.c b/net/bridge/netfilter/ebt_ulog.c
index 9411db6..6fec352 100644
--- a/net/bridge/netfilter/ebt_ulog.c
+++ b/net/bridge/netfilter/ebt_ulog.c
@@ -308,12 +308,8 @@ static int __init ebt_ulog_init(void)
else if ((ret = ebt_register_watcher(&ulog)))
sock_release(ebtulognl->sk_socket);
 
-   if (nf_log_register(PF_BRIDGE, &ebt_ulog_logger) < 0) {
-   printk(KERN_WARNING "ebt_ulog: not logging via ulog "
-  "since somebody else already registered for 
PF_BRIDGE\n");
-   /* we cannot make module load fail here, since otherwise
-* ebtables userspace would abort */
-   }
+   if (ret == 0)
+   nf_log_register(PF_BRIDGE, &ebt_ulog_logger);
 
return ret;
 }
diff --git a/net/ipv4/netfilter/ipt_LOG.c b/net/ipv4/netfilter/ipt_LOG.c
index 5937ad1..127a5e8 100644
--- a/net/ipv4/netfilter/ipt_LOG.c
+++ b/net/ipv4/netfilter/ipt_LOG.c
@@ -479,10 +479,8 @@ static int __init ipt_log_init(void)
ret = xt_register_target(&ipt_log_reg);
if (ret < 0)
return ret;
-   ret = nf_log_register(PF_INET, &ipt_log_logger);
-   if (ret < 0 && ret != -EEXIST)
-   xt_unregister_target(&ipt_log_reg);
-   return ret;
+   nf_log_register(PF_INET, &ipt_log_logger);
+   return 0;
 }
 
 static void __exit ipt_log_fini(void)
diff --git a/net/ipv6/netfilter/ip6t_LOG.c b/net/ipv6/netfilter/ip6t_LOG.c
index b05327e..6ab9900 100644
--- a/net/ipv6/netfilter/ip6t_LOG.c
+++ b/net/ipv6/netfilter/ip6t_LOG.c
@@ -493,10 +493,8 @@ static int __init ip6t_log_init(void)
ret = xt_register_target(&ip6t_log_reg);
if (ret < 0)
return ret;
-   ret = nf_log_register(PF_INET6, &ip6t_logger);
-   if (ret < 0 && ret != -EEXIST)
-   xt_unregister_target(&ip6t_log_reg);
-   return ret;
+   nf_log_register(PF_INET6, &ip6t_logger);
+   return 0;
 }
 
 static void __exit ip6t_log_fini(void)


Re: [PATCH] make xfrm_audit_log more generic

2007-07-23 Thread Steve Grubb
On Monday 23 July 2007 13:49:17 Joy Latten wrote:
> > Will this cause existing applications to break?
>
> Perhaps someone in audit list could help answer this.

Probably. It's better to take a new number and let the old ones sit idle.

-Steve


Re: [PATCH] make xfrm_audit_log more generic

2007-07-23 Thread Joy Latten
On Thu, 2007-07-19 at 21:45 -0400, James Morris wrote:
> On Thu, 19 Jul 2007, Joy Latten wrote:
> 
> > --- linux-2.6.22/include/linux/audit.h  2007-07-19 13:17:22.0 
> > -0500
> > +++ linux-2.6.22.patch/include/linux/audit.h2007-07-19 
> > 13:21:29.0 -0500
> > @@ -108,10 +108,7 @@
> >  #define AUDIT_MAC_CIPSOV4_DEL  1408/* NetLabel: del CIPSOv4 DOI 
> > entry */
> >  #define AUDIT_MAC_MAP_ADD  1409/* NetLabel: add LSM domain mapping */
> >  #define AUDIT_MAC_MAP_DEL  1410/* NetLabel: del LSM domain mapping */
> > -#define AUDIT_MAC_IPSEC_ADDSA  1411/* Add a XFRM state */
> > -#define AUDIT_MAC_IPSEC_DELSA  1412/* Delete a XFRM state */
> > -#define AUDIT_MAC_IPSEC_ADDSPD 1413/* Add a XFRM policy */
> > -#define AUDIT_MAC_IPSEC_DELSPD 1414/* Delete a XFRM policy */
> > +#define AUDIT_MAC_IPSEC_EVENT  1411/* Audit IPSec events */
> 
> Will this cause existing applications to break?
> 

Perhaps someone in audit list could help answer this. 

During testing, because I changed the above defines, all
IPSec events are listed as "MAC_IPSEC_ADDSA" for now without
a userspace change. Is this ok? Or is there a better way to 
migrate this change in? Perhaps leave the previous IPsec defines 
and just add in a new one and use it? If that is a better
approach, let me know and I will change the code to accommodate.
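
The alternative of leaving the previous defines in place and adding a new one could look roughly like the sketch below. The value 1415 is an assumption (simply the next number after the four existing IPsec types shown in the hunk above), and audit_is_ipsec() is a hypothetical helper added only to make the fragment testable.

```c
/* Existing message types stay defined so old userspace still decodes them. */
#define AUDIT_MAC_IPSEC_ADDSA	1411	/* Add a XFRM state */
#define AUDIT_MAC_IPSEC_DELSA	1412	/* Delete a XFRM state */
#define AUDIT_MAC_IPSEC_ADDSPD	1413	/* Add a XFRM policy */
#define AUDIT_MAC_IPSEC_DELSPD	1414	/* Delete a XFRM policy */
/* New generic type takes its own number (assumed here to be 1415). */
#define AUDIT_MAC_IPSEC_EVENT	1415	/* Audit IPSec events */

/* Hypothetical helper: true for any IPsec audit type, old or new. */
static int audit_is_ipsec(int type)
{
	return type >= AUDIT_MAC_IPSEC_ADDSA && type <= AUDIT_MAC_IPSEC_EVENT;
}
```

With this layout, existing tools keep reporting 1411 through 1414 under their old names, and only the new number needs a userspace update.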

Regards,
Joy
 


Re: [PATCH]: Resurrect napi_poll patch.

2007-07-23 Thread Andi Kleen
On Mon, Jul 23, 2007 at 10:58:22AM +0100, Stephen Hemminger wrote:
> On 21 Jul 2007 15:26:00 +0200
> Andi Kleen <[EMAIL PROTECTED]> wrote:
> 
> > David Miller <[EMAIL PROTECTED]> writes:
> > > 
> > > Good candidates for taking advantage of multi-napi are:
> > > 
> > > 1) e1000
> > > 2) ucc_geth
> > > 3) ehea
> > > 4) sunvnet
> > 
> > s2io.c
> 
> sky2.c could use it because of issues with dual-port that share
> one napi for status.

Sorry, I didn't parse the sentence. Did you mean "couldn't use it" ...? 
Also can you elaborate why it shouldn't work?

-Andi
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GENETLINK]: Question: global lock (genl_mutex) possible refinement?

2007-07-23 Thread Richard MUSIL
Thomas Graf wrote:
> Actually there is no reason to not use separate locks for the
> message serialization and the protection of the list of registered
> families. There is only one lock simply for the reason that I've
> never thought of anybody could think of registering a new genetlink
> family while processing a message.

Thomas,
I have been giving it a second thought and came up with something more
complex. The idea is to have locking granularity at the level of
individual families.

Message processing will lock only the family it currently processes
*and* global messaging lock, while family management routines will take
global family lock, and particular lock for family.
This should allow running registration/dereg. from message handler as
long as the family involved is not the same family in which context the
message is being processed.

On the other hand, using lock per family, ensures that message handlers
are not (accidentally) called with invalid references. This should not
be so much problem for dumpit handler, but doit uses for example family
attrs. So I added another patch which implements above described
functionality below.
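
The locking scheme described above can be sketched in user space with pthread mutexes. This is only an illustration of which locks each path takes: struct family, family_add_op() and process_message() are invented names, not the genetlink API.

```c
#include <pthread.h>

/* Hypothetical stand-in for struct genl_family with its private lock. */
struct family {
	pthread_mutex_t lock;
	int nr_ops;
};

/* Global lock for family list management. */
static pthread_mutex_t fam_list_mutex = PTHREAD_MUTEX_INITIALIZER;
/* Global lock for message serialization. */
static pthread_mutex_t msg_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Management path: global family lock first, then the family's own lock. */
static void family_add_op(struct family *f)
{
	pthread_mutex_lock(&fam_list_mutex);
	pthread_mutex_lock(&f->lock);
	f->nr_ops++;
	pthread_mutex_unlock(&f->lock);
	pthread_mutex_unlock(&fam_list_mutex);
}

/*
 * Message path: global messaging lock plus the lock of the family being
 * processed. The "handler" may manage a different family; managing the
 * same family would try to retake f->lock and deadlock, so it is skipped.
 */
static void process_message(struct family *f, struct family *other)
{
	pthread_mutex_lock(&msg_mutex);
	pthread_mutex_lock(&f->lock);
	if (other && other != f)
		family_add_op(other);
	pthread_mutex_unlock(&f->lock);
	pthread_mutex_unlock(&msg_mutex);
}
```

The sketch only mirrors the description; whether this lock ordering is safe when the message path and the management path run concurrently on overlapping families is not examined here.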

> Alternatively you could also postpone the registration of the new
> genetlink family to a workqueue.

To be honest, I cannot judge whether the gains outweigh the additional
complexity I propose. In my book, they definitely do, since I like the
ease of use of the doit and dumpit handlers, and an implementation using
those is pretty straightforward.
In the long term I believe that refining the locking could also help
in situations where there is heavy traffic over genetlink, then
all family manipulations will not be blocked (which is currently the case).

Let me know, if you accept it as a patch, or should I eventually go for
plan B ;).

--
Richard


From 63b3ee722402533aed6e137347e41ab1a1fa1127 Mon Sep 17 00:00:00 2001
From: Richard Musil <[EMAIL PROTECTED]>
Date: Mon, 23 Jul 2007 15:12:09 +0200
Subject: [PATCH] Added private mutex for each genetlink family (struct 
genl_family).
This mutex is used to synchronize access to particular family between message
processing handlers and management routines for families
(registering/unregistering families/ops).

This should ensure that another family can be registered or unregistered from
inside genetlink message handler. Trying to register or unregister family
from its own handler will still cause deadlock.
---
 include/net/genetlink.h |1 +
 net/netlink/genetlink.c |   98 +--
 2 files changed, 70 insertions(+), 29 deletions(-)

diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index b6eaca1..681ad13 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -25,6 +25,7 @@ struct genl_family
struct nlattr **attrbuf;/* private */
struct list_headops_list;   /* private */
struct list_headfamily_list;/* private */
+   struct mutexlock;   /* private */
 };
 
 /**
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 5ee18eb..0104267 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -40,16 +40,30 @@ static void genl_unlock(void)
 
 static DEFINE_MUTEX(genl_fam_mutex);   /* serialization for family list 
management */
 
-static inline void genl_fam_lock(void)
+static inline void genl_fam_lock(struct genl_family *family)
 {
mutex_lock(&genl_fam_mutex);
+   if (family)
+   mutex_lock(&family->lock);
 }
 
-static inline genl_fam_unlock(void)
+static inline void genl_fam_unlock(struct genl_family *family)
 {
+   if (family)
+   mutex_unlock(&family->lock);
mutex_unlock(&genl_fam_mutex);
 }
 
+static inline void genl_onefam_lock(struct genl_family *family)
+{
+   mutex_lock(&family->lock);
+}
+
+static inline void genl_onefam_unlock(struct genl_family *family)
+{
+   mutex_unlock(&family->lock);
+}
+
 #define GENL_FAM_TAB_SIZE  16
 #define GENL_FAM_TAB_MASK  (GENL_FAM_TAB_SIZE - 1)
 
@@ -162,9 +176,9 @@ int genl_register_ops(struct genl_family *family, struct 
genl_ops *ops)
if (ops->policy)
ops->flags |= GENL_CMD_CAP_HASPOL;
 
-   genl_fam_lock();
+   genl_fam_lock(family);
list_add_tail(&ops->ops_list, &family->ops_list);
-   genl_fam_unlock();
+   genl_fam_unlock(family);
 
genl_ctrl_event(CTRL_CMD_NEWOPS, ops);
err = 0;
@@ -192,16 +206,16 @@ int genl_unregister_ops(struct genl_family *family, 
struct genl_ops *ops)
 {
struct genl_ops *rc;
 
-   genl_fam_lock();
+   genl_fam_lock(family);
list_for_each_entry(rc, &family->ops_list, ops_list) {
if (rc == ops) {
list_del(&ops->ops_list);
-   genl_fam_unlock();
+   genl_fam_unlock(family);
genl_ctrl_event(CTRL_CMD_DELOPS, ops);
return 

[PATCH net-2.6.22-rc7] xfrm state selection update to use inner addresses

2007-07-23 Thread Joakim Koskela
This patch modifies the xfrm state selection logic to use the inner
addresses where the outer have been (incorrectly) used. This is
required for beet mode in general and interfamily setups in both
tunnel and beet mode.

Signed-off-by: Joakim Koskela <[EMAIL PROTECTED]>
Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: Diego Beltrami <[EMAIL PROTECTED]>
Signed-off-by: Miika Komu <[EMAIL PROTECTED]>
---

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 157bfbd..75fdb7d 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1299,7 +1299,8 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, struct 
flowi *fl,
xfrm_address_t *local  = saddr;
struct xfrm_tmpl *tmpl = &policy->xfrm_vec[i];
 
-   if (tmpl->mode == XFRM_MODE_TUNNEL) {
+   if (tmpl->mode == XFRM_MODE_TUNNEL ||
+   tmpl->mode == XFRM_MODE_BEET) {
remote = &tmpl->id.daddr;
local = &tmpl->saddr;
family = tmpl->encap_family;
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index e070c3f..f5d30c4 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -611,7 +611,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t 
*saddr,
  selector.
 */
if (x->km.state == XFRM_STATE_VALID) {
-   if (!xfrm_selector_match(&x->sel, fl, family) ||
+   if (!xfrm_selector_match(&x->sel, fl, 
x->sel.family) ||
!security_xfrm_state_pol_flow_match(x, pol, 
fl))
continue;
if (!best ||
@@ -623,7 +623,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t 
*saddr,
acquire_in_progress = 1;
} else if (x->km.state == XFRM_STATE_ERROR ||
   x->km.state == XFRM_STATE_EXPIRED) {
-   if (xfrm_selector_match(&x->sel, fl, family) &&
+   if (xfrm_selector_match(&x->sel, fl, 
x->sel.family) &&
security_xfrm_state_pol_flow_match(x, pol, 
fl))
error = -ESRCH;
}


Re: [2.6 patch] eepro100 resume patch

2007-07-23 Thread Pavel Machek
On Fri 2007-07-13 21:11:28, Kok, Auke wrote:
> [adding netdev]
> 
> David Fries wrote:
> >When I did a software suspend to disk then resumed the Intel network
> >card using eepro100 driver would be unable to transmit packets.  I
> >tracked this down and found a register write after the print message
> >"DP83840 specific setup" which wasn't being executed when the system
> >was restored.  This fix moves that write and another write which
> >forces the link speed and duplex.
> >
> >After doing this work and preparing the patch I checked out the
> >mailing list only to find a patch that removes the eepro100.  I then
> >updated Kconfig, though I wonder why it didn't have a similar message
> >in it long time ago.
> >
> >I too had tried the e100 driver some time ago and it didn't work,
> 
> That argument is pretty useless right now. Please *test* e100 and *report 
> issues*. I recently did some very intensive suspend/resume testing (and 
> fixes) on e100 and I have yet to hear of any problems with it since... that 
> was 2.6.18 or so even.
> 
> >eepro100 did and I've been using it so long that I've almost forgotten
> >about that.  I just gave the e100 driver a try and I've been running
> >for about an hour now without any problems and it does resume after a
> >suspend to disk operation.
> >
> >Signed-off-by: David Fries <[EMAIL PROTECTED]>
> 
> I don't think I need to NAK this. I doubt that Jeff Garzik will apply this 
> in the first place. eepro100 is on its way out, so let's focus on what
> matters.

Well, I believe we should either remove eepro100 _now_ or fix issues
with it. "Keep it in the tree, and broken at the same time" is not nice.


> >@@ -743,20 +746,22 @@
> >phys[(eeprom[7]>>8)&7]);
> > if (((eeprom[6]>>8) & 0x3f) == DP83840
> > ||  ((eeprom[6]>>8) & 0x3f) == DP83840A) {
> >-int mdi_reg23 = mdio_read(dev, eeprom[6] & 0x1f, 23) 
> >| 0x0422;
> >+int mdi_reg23_orig = mdio_read(dev, eeprom[6] & 
> >0x1f, 23);
> >+int mdi_reg23 = mdi_reg23_orig | 0x0422;
> > if (congenb)
> >   mdi_reg23 |= 0x0100;
> >-printk(KERN_INFO"  DP83840 specific setup, setting 
> >register 23 to %4.4x.\n",
> >-   mdi_reg23);
> >-mdio_write(dev, eeprom[6] & 0x1f, 23, mdi_reg23);
> >+/* Print the message here, write in speedo_resume, 
> >which
> >+ * is called both on module load and from
> >+ * eepro100_resume.
> >+ */
> >+printk(KERN_INFO"  DP83840 specific setup, setting 
> >register 23 to %4.4x, was %4.4x.\n",
> >+   mdi_reg23, mdi_reg23_orig);
> > }
> > if ((option >= 0) && (option & 0x70)) {
> >+/* Print here, write in speedo_resume. */
> > printk(KERN_INFO "  Forcing %dMbs %s-duplex 
> > operation.\n",
> >(option & 0x20 ? 100 : 10),
> >(option & 0x10 ? "full" : "half"));
> >-mdio_write(dev, eeprom[6] & 0x1f, MII_BMCR,
> >-   ((option & 0x20) ? 0x2000 : 0) |  
> >/* 100mbps? */
> >-   ((option & 0x10) ? 0x0100 : 0)); 
> >/* Full duplex? */
> > }
> > 
> > /* Perform a system self-test. */
> >@@ -1050,6 +1055,24 @@
> > struct speedo_private *sp = netdev_priv(dev);
> > void __iomem *ioaddr = sp->regs;
> > 
> >+/* DP83840 specific setup, moved here from from speedo_found1 because
> >+ * it needs to called after resume, ie suspend to disk.
> >+ * 07-11-2007 David Fries <[EMAIL PROTECTED]>
> >+ */
> >+if (((sp->phy[0]>>8) & 0x3f) == DP83840
> >+||  ((sp->phy[0]>>8) & 0x3f) == DP83840A) {
> >+int mdi_reg23 = mdio_read(dev, sp->phy[0] & 0x1f, 23) | 
> >0x0422;
> >+if (congenb)
> >+mdi_reg23 |= 0x0100;
> >+mdio_write(dev, sp->phy[0] & 0x1f, 23, mdi_reg23);
> >+}
> >+if ((sp->option >= 0) && (sp->option & 0x70)) {
> >+mdio_write(dev, sp->phy[0] & 0x1f, MII_BMCR,
> >+((sp->option & 0x20) ? 0x2000 : 0) |/* 100mbps? 
> >*/
> >+((sp->option & 0x10) ? 0x0100 : 0)); /* Full duplex? 
> >*/
> >+}
> >+
> >+
> > /* Start with a Tx threshold of 256 (0x..20 8 byte units). */
> > sp->tx_threshold = 0x01208000;
> > 
> >
> >
> 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff

[PATCH] eHEA: net_poll support

2007-07-23 Thread Jan-Bernd Themann
net_poll support for eHEA added

Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]>
---


 drivers/net/ehea/ehea.h  |2 +-
 drivers/net/ehea/ehea_main.c |   22 +-
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h
index 489c8b2..8ee2c2c 100644
--- a/drivers/net/ehea/ehea.h
+++ b/drivers/net/ehea/ehea.h
@@ -39,7 +39,7 @@
 #include 
 
 #define DRV_NAME   "ehea"
-#define DRV_VERSION"EHEA_0071"
+#define DRV_VERSION"EHEA_0072"
 
 /* eHEA capability flags */
 #define DLPAR_PORT_ADD_REM 1
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 4c70a93..58702f5 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -589,6 +589,23 @@ static int ehea_poll(struct net_device *dev, int *budget)
return 1;
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void ehea_netpoll(struct net_device *dev)
+{
+   struct ehea_port *port = netdev_priv(dev);
+
+   netif_rx_schedule(port->port_res[0].d_netdev);
+}
+#endif
+
+static int ehea_poll_firstqueue(struct net_device *dev, int *budget)
+{
+   struct ehea_port *port = netdev_priv(dev);
+   struct net_device *d_dev = port->port_res[0].d_netdev;
+
+   return ehea_poll(d_dev, budget);
+}
+
 static irqreturn_t ehea_recv_irq_handler(int irq, void *param)
 {
struct ehea_port_res *pr = param;
@@ -2626,7 +2643,10 @@ struct ehea_port *ehea_setup_single_port(struct 
ehea_adapter *adapter,
memcpy(dev->dev_addr, &port->mac_addr, ETH_ALEN);
 
dev->open = ehea_open;
-   dev->poll = ehea_poll;
+   dev->poll = ehea_poll_firstqueue;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+   dev->poll_controller = ehea_netpoll;
+#endif
dev->weight = 64;
dev->stop = ehea_stop;
dev->hard_start_xmit = ehea_start_xmit;
-- 
1.5.2



Re: [PATCH 00/10] Implement batching skb API

2007-07-23 Thread jamal
KK,

On Mon, 2007-23-07 at 10:19 +0530, Krishna Kumar2 wrote:

> Hmmm ? Evgeniy has not even tested my code to find some regression :) And
> you may possibly not find much improvement in E1000 when you run iperf
> (which is what I do) compared to pktgen. 

Pktgen is the correct test (or the closest to correct) because it tests
the driver tx path. iperf/netperf test the effect of batching on
tcp/udp. In fact i would start with udp first. What you need to do if
testing end-2-end is see where the effects occur. For example, it is
feasible that batching is a little too aggressive and the receiver can't
keep up (netstat -s before and after will be helpful).
Maybe by such insight we can improve things.

> > My experiments show it is useful (in a very visible way using pktgen)
> > for e1000 to have the prep() interface.
> 
> I meant : have you compared results of batching with prep on vs prep off,
> and
> what is the difference in BW ?

Yes, and these results were sent to you as well a while back.
When i get back and have the time, i will look them up on my test machine
and resend.

> No. I see value only in non-LLTX drivers which also gets the same TX lock
> in the RX path.

So _which_ non-LLTX driver doesnt do that? ;->

> > The value is also there in LLTX drivers even if in just formating a skb
> > ready for transmit. If this is not clear i could do a much longer
> > writeup on my thought evolution towards adding prep().
> 
> In LLTX drivers, the driver does the 'prep' without holding the tx_lock in
> any case, so there should be no improvement. Could you send the write-up

I will - please give me sometime; i am overloaded at the moment.

> There is *nothing* IPoIB specific or focus in my code. 
> I said adding prep
> doesn't
> work for IPoIB and so it is pointless to add bloat to the code until some
> code can

tun driver doesnt use it either - but i doubt that makes it "bloat"

>  What I meant to say
> is that there isn't much point in saying that your code is not ready or
> you are using old code base, or has multiple restart functions, or is not
> tested enough, etc, and then say let's re-do/rethink the whole
> implementation when my code is already working and giving good results.

The suggestive hand gesturing is the kind of thing that bothers me. What
do you think: Would i be submitting patches based on 2.6.22-rc4? Would
it make sense to include parallel qdisc paths? For heaven's sake, i have
told you i would be fine with accepting such changes once the qdisc
restart changes went in first.
You waltz in, have the luxury of looking at my code, presentations, many
discussions with me etc ...
When i ask for differences to code you produced, they now seem to sum up
to the two below. You don't think there's some honest issue with this
picture?

> OTOH, if you find some cases that are better handled with :
>   1. prep handler
>   2. xmit_win (which I don't have now),
> then please send me patches and I will also test out and incorporate.
> 

And then of course you will end up adding those because they are both
useful, just calling them some other name. And then you will end up
incorporating all the drivers i invested many hours (as a gratuitous
volunteer) to change and test - maybe you will change variable names or
rearrange some function.
I am a very compromising person; i have no problem coauthoring these
patches if you actually invest useful time like fixing things up and
doing proper tests. But you are not doing that - instead you are being
extremely aggressive and hijacking the whole thing. It is courteous, if
you find somebody else has a patch, to point out what's wrong, preferably
with some proof.

> > It sounds disingenuous but i may have misread you.
> 
> ("lacking in frankness, candor, or sincerity; falsely or hypocritically
> ingenuous; insincere")  Sorry, no response to personal comments and
> have a flame-war :)

Give me a better description. 

cheers,
jamal



Re: [PATCH 4/4] Initialize and fill IPv6 route age

2007-07-23 Thread Stephen Hemminger
On Mon, 23 Jul 2007 10:13:18 +0530
Varun Chandramohan <[EMAIL PROTECTED]> wrote:

> The age field of the IPv6 route structure is initialized with the current
> timeval at the time of route creation. When the route dump is called,
> the route age value stored in the structure is subtracted from the
> present timeval and the difference is passed on as the route age.
> 
> Signed-off-by: Varun Chandramohan <[EMAIL PROTECTED]>
> ---
>  include/net/ip6_fib.h   |1 +
>  include/net/ip6_route.h |3 +++
>  net/ipv6/addrconf.c |5 +
>  net/ipv6/route.c|   23 +++
>  4 files changed, 28 insertions(+), 4 deletions(-)
> 
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index c48ea87..e30a1cf 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -98,6 +98,7 @@ struct rt6_info
>   
>   u32 rt6i_flags;
>   u32 rt6i_metric;
> + time_t  rt6i_age;
>   atomic_t  rt6i_ref;
>   struct fib6_table   *rt6i_table;
>  
> diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
> index 5456fdd..fc9716c 100644
> --- a/include/net/ip6_route.h
> +++ b/include/net/ip6_route.h
> @@ -36,6 +36,9 @@ struct route_info {
>  #define RT6_LOOKUP_F_REACHABLE   0x2
>  #define RT6_LOOKUP_F_HAS_SADDR   0x4
>  
> +#define RT6_SET_ROUTE_INFO 0x0
> +#define RT6_GET_ROUTE_INFO 0x1
> +
>  extern struct rt6_info   ip6_null_entry;
>  
>  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 5a5f8bd..715c766 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -4187,6 +4187,7 @@ EXPORT_SYMBOL(unregister_inet6addr_notif
>  
>  int __init addrconf_init(void)
>  {
> + struct timeval tv;
>   int err = 0;
>  
>   /* The addrconf netdev notifier requires that loopback_dev
> @@ -4214,10 +4215,14 @@ int __init addrconf_init(void)
>   if (err)
>   return err;
>  
> + do_gettimeofday(&tv);

Better to use ktime_t or timespec in new code.


Re: [PATCH 2/4] Add new timeval_to_sec function

2007-07-23 Thread Patrick McHardy
Varun Chandramohan wrote:
>  /**
> + * timeval_to_sec - Convert timeval to seconds
> + * @tv: pointer to the timeval variable to be converted
> + *
> + * Returns the seconds representation of timeval parameter.
> + */
> +static inline time_t timeval_to_sec(const struct timeval *tv)
> +{
> + return (tv->tv_sec + (tv->tv_usec + 50)/100);
> +}


I don't think you should round down timeout values.
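
For illustration, a conversion that never shortens a non-zero timeout could look like the userspace sketch below. It assumes tv_usec holds microseconds in the range 0..999999, so the divisor is 1000000 rather than the 100 used in the quoted patch. The name timeval_to_sec_roundup and the USEC_PER_SEC definition are hypothetical and not taken from the patch series.

```c
#include <sys/time.h>   /* struct timeval */
#include <time.h>       /* time_t */

/* Microseconds per second; defined locally for this sketch. */
#define USEC_PER_SEC 1000000L

/*
 * Hypothetical helper: convert a timeval to whole seconds, rounding any
 * non-zero fractional part up, so a pending timeout never becomes zero.
 */
time_t timeval_to_sec_roundup(const struct timeval *tv)
{
	return tv->tv_sec + (tv->tv_usec + USEC_PER_SEC - 1) / USEC_PER_SEC;
}
```

With this rounding, a timeval of 0 s and 1 us converts to 1 second, while a timeval with tv_usec equal to 0 converts to exactly tv_sec.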


Re: [PATCH 03/12 -Rev2] dev.c changes.

2007-07-23 Thread Krishna Kumar2
Hi Evgeniy,

Evgeniy Polyakov <[EMAIL PROTECTED]> wrote on 07/23/2007 04:14:28 PM:
> > +/*
> > + * dev_change_tx_batching - Enable or disable batching for a driver that
> > + * supports batching.

> > +   /* Check if new value is same as the current */
> > +   if (!!(dev->features & NETIF_F_BATCH_ON) == !!new_batch_skb)
> > +  goto out;
>
> o_O
>
> Scratched my head for too long before I understood what it means :)

Is there an easier way to do this?
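
One possible way to make the check easier to read is to name the two truth values before comparing them. The sketch below is a userspace illustration only: the bit value given to NETIF_F_BATCH_ON and the helper name batching_unchanged are assumptions, not part of the posted patch. The "!!x" idiom in the patch collapses any non-zero value to 1, which is the same as the explicit "!= 0" tests here.

```c
#include <stdbool.h>

/* Hypothetical bit value, chosen only so the sketch compiles. */
#define NETIF_F_BATCH_ON 0x40000UL

/*
 * Returns true when the requested setting (any non-zero value means
 * "on") matches the current state of the feature bit.
 */
bool batching_unchanged(unsigned long features, unsigned long new_batch_skb)
{
	bool enabled = (features & NETIF_F_BATCH_ON) != 0;
	bool wanted = new_batch_skb != 0;

	return enabled == wanted;
}
```

The comparison is done on two bools, so a request of 5 and a request of 1 are both treated as "enable".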

> > +   spin_lock(&dev->queue_lock);
> > +   if (new_batch_skb) {
> > +  dev->features |= NETIF_F_BATCH_ON;
> > +  dev->tx_queue_len >>= 1;
> > +   } else {
> > +  if (!skb_queue_empty(&dev->skb_blist))
> > + skb_queue_purge(&dev->skb_blist);
> > +  dev->features &= ~NETIF_F_BATCH_ON;
> > +  dev->tx_queue_len <<= 1;
> > +   }
> > +   spin_unlock(&dev->queue_lock);
>
> Hmm, should this also stop interrupts?

That is a good question, and I am not sure. I thought it
was not required, though adding it would not hurt
either. Can someone tell me whether disabling bh is required
and why? (I couldn't figure out why dev_queue_xmit disables
bh either; is it to disable preemption?)

Thanks,

- KK



Re: [PATCH 11/12 -Rev2] IPoIB xmit API addition

2007-07-23 Thread Krishna Kumar2
Hi Evgeniy,

Evgeniy Polyakov <[EMAIL PROTECTED]> wrote on 07/23/2007 04:18:26 PM:

> >  static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
> >  {
> > struct ipoib_dev_priv *priv = netdev_priv(dev);
> > +   int i = 0, num_completions;
> > +   int tx_ring_index = priv->tx_tail & (ipoib_sendq_size - 1);
> > unsigned int wr_id = wc->wr_id;
> > -   struct ipoib_tx_buf *tx_req;
> > unsigned long flags;
> >
> > ipoib_dbg_data(priv, "send completion: id %d, status: %d\n",
> > @@ -255,23 +256,57 @@ static void ipoib_ib_handle_tx_wc(struct
> >return;
> > }
> >
> > -   tx_req = &priv->tx_ring[wr_id];
> > +   num_completions = wr_id - tx_ring_index + 1;
> > +   if (num_completions <= 0)
> > +  num_completions += ipoib_sendq_size;
>
> Can this still be less than zero?

That should never happen; otherwise the TX code would already have
written to bad/unallocated memory and crashed first.
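
The wrap-around arithmetic can be checked in isolation. The sketch below is a standalone illustration with plain ints, assuming both indices lie in the range 0..ring_size-1; the function name ring_completions is hypothetical and the code is not the driver's. Under that assumption the raw difference is at least 2 - ring_size, so a single correction always yields a count between 1 and ring_size.

```c
/*
 * Number of ring entries from tail_index through wr_id inclusive,
 * accounting for wrap-around. Assumes 0 <= wr_id, tail_index < ring_size.
 */
int ring_completions(int wr_id, int tail_index, int ring_size)
{
	int n = wr_id - tail_index + 1;

	if (n <= 0)
		n += ring_size;   /* wr_id has wrapped past the end of the ring */
	return n;
}
```

For a ring of 64 entries, a tail at index 62 and a completion at index 1 cover entries 62, 63, 0 and 1, so the function returns 4.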

Thanks,

- KK



Re: [PATCH 03/12 -Rev2] dev.c changes.

2007-07-23 Thread Evgeniy Polyakov
Hi Krishna.

On Sun, Jul 22, 2007 at 02:35:25PM +0530, Krishna Kumar ([EMAIL PROTECTED]) wrote:
> diff -ruNp org/net/core/dev.c rev2/net/core/dev.c
> --- org/net/core/dev.c2007-07-20 07:49:28.0 +0530
> +++ rev2/net/core/dev.c   2007-07-21 23:08:33.0 +0530
> @@ -875,6 +875,48 @@ void netdev_state_change(struct net_devi
>   }
>  }
>  
> +/*
> + * dev_change_tx_batching - Enable or disable batching for a driver that
> + * supports batching.
> + */
> +int dev_change_tx_batching(struct net_device *dev, unsigned long 
> new_batch_skb)
> +{
> + int ret;
> +
> + if (!dev->hard_start_xmit_batch) {
> + /* Driver doesn't support skb batching */
> + ret = -ENOTSUPP;
> + goto out;
> + }
> +
> + /* Handle invalid argument */
> + if (new_batch_skb < 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + ret = 0;
> +
> + /* Check if new value is same as the current */
> + if (!!(dev->features & NETIF_F_BATCH_ON) == !!new_batch_skb)
> + goto out;

o_O

Scratched my head for too long before I understood what it means :)

> + spin_lock(&dev->queue_lock);
> + if (new_batch_skb) {
> + dev->features |= NETIF_F_BATCH_ON;
> + dev->tx_queue_len >>= 1;
> + } else {
> + if (!skb_queue_empty(&dev->skb_blist))
> + skb_queue_purge(&dev->skb_blist);
> + dev->features &= ~NETIF_F_BATCH_ON;
> + dev->tx_queue_len <<= 1;
> + }
> + spin_unlock(&dev->queue_lock);

Hmm, should this also stop interrupts?

-- 
Evgeniy Polyakov


Re: [PATCH 11/12 -Rev2] IPoIB xmit API addition

2007-07-23 Thread Evgeniy Polyakov
On Sun, Jul 22, 2007 at 02:36:49PM +0530, Krishna Kumar ([EMAIL PROTECTED]) wrote:
> diff -ruNp org/drivers/infiniband/ulp/ipoib/ipoib_ib.c 
> rev2/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> --- org/drivers/infiniband/ulp/ipoib/ipoib_ib.c   2007-07-20 
> 07:49:28.0 +0530
> +++ rev2/drivers/infiniband/ulp/ipoib/ipoib_ib.c  2007-07-22 
> 00:08:37.0 +0530
> @@ -242,8 +242,9 @@ repost:
>  static void ipoib_ib_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
>  {
>   struct ipoib_dev_priv *priv = netdev_priv(dev);
> + int i = 0, num_completions;
> + int tx_ring_index = priv->tx_tail & (ipoib_sendq_size - 1);
>   unsigned int wr_id = wc->wr_id;
> - struct ipoib_tx_buf *tx_req;
>   unsigned long flags;
>  
>   ipoib_dbg_data(priv, "send completion: id %d, status: %d\n",
> @@ -255,23 +256,57 @@ static void ipoib_ib_handle_tx_wc(struct
>   return;
>   }
>  
> - tx_req = &priv->tx_ring[wr_id];
> + num_completions = wr_id - tx_ring_index + 1;
> + if (num_completions <= 0)
> + num_completions += ipoib_sendq_size;

Can this still be less than zero?

-- 
Evgeniy Polyakov


Re: [RFC 1/1] lro: Generic Large Receive Offload for TCP traffic

2007-07-23 Thread Evgeniy Polyakov
Hi Jan-Bernd.

On Fri, Jul 20, 2007 at 05:41:48PM +0200, Jan-Bernd Themann ([EMAIL PROTECTED]) wrote:
> Generic LRO patch
> 
> Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]>

Besides a couple of trivial coding/formatting nits, I did not find any
problematic places during review. Details below.

Thanks.

> +#define TCP_PAYLOAD_LENGTH(iph, tcph) \
> +(ntohs(iph->tot_len) - IP_HDR_LEN(iph) - TCP_HDR_LEN(tcph))

A tab?

> +static void lro_add_common(struct net_lro_desc *lro_desc, struct iphdr *iph,
> +struct tcphdr *tcph, int tcp_data_len)
> +{
> + struct sk_buff *parent = lro_desc->parent;
> + u32 *topt;
> +
> + lro_desc->pkt_aggr_cnt++;
> + lro_desc->ip_tot_len += tcp_data_len;
> + lro_desc->tcp_next_seq += tcp_data_len;
> + lro_desc->tcp_window = tcph->window;
> + lro_desc->tcp_ack = tcph->ack_seq;
> +
> + /* don't update tcp_rcv_tsval, would not work with PAWS */
> + if (lro_desc->tcp_saw_tstamp) {
> + topt = (u32 *) (tcph + 1);
> + lro_desc->tcp_rcv_tsecr = *(topt + 2);
> + }
> +
> + parent->len += tcp_data_len;
> + parent->data_len += tcp_data_len;
> +
> + lro_desc->data_csum = csum_add(lro_desc->data_csum,
> +lro_tcp_data_csum(iph, tcph,
> +  tcp_data_len));
> + return;
> +}

return from void? And in other places too.

> +int __lro_proc_skb(struct net_lro_mgr *lro_mgr, struct sk_buff *skb,
> +struct vlan_group *vgrp, u16 vlan_tag, void *priv)
> +{
> + struct net_lro_desc *lro_desc;
> +struct iphdr *iph;
> +struct tcphdr *tcph;
> + u64 flags;

Broken tab and spaces.

> +struct sk_buff *lro_gen_skb(struct net_lro_mgr *lro_mgr,
> + struct skb_frag_struct *frags,
> + int len, int true_size,
> + void *mac_hdr,
> + int hlen)
> +{
> + struct sk_buff *skb;
> +struct skb_frag_struct *skb_frags;
> + int data_len = len;

The same.

> + skb = netdev_alloc_skb(lro_mgr->dev, hlen);
> + if (!skb)
> + return NULL;
> +
> +skb->len = len;
> + skb->data_len = len - hlen;

Here too.
There are a number of such places; I omitted the others.


-- 
Evgeniy Polyakov


Re: [GENETLINK]: Question: global lock (genl_mutex) possible refinement?

2007-07-23 Thread Thomas Graf
* Richard MUSIL <[EMAIL PROTECTED]> 2007-07-20 18:15
> Patrick McHardy wrote:
> > Export the lock/unlock/.. functions. You'll also need a new version 
> > similar to __rtnl_unlock.
> 
> Patrick, you might feel, I am not reading your lines, but in fact I do.
> The problem is that I do not feel competent to follow/propose such
> changes. So what I propose here (in included patch) is the least change
> scenario, which I can think of and on which I feel safe.
> 
> If there are some other changes required, as you suggested for example
> exporting lock from genetlink module, I hope authors of genetlink will
> comment on that. Currently, I do not see any reason for that, but this
> could be due to my limited knowledge.

Actually there is no reason not to use separate locks for the
message serialization and the protection of the list of registered
families. There is only one lock simply because I never thought
anybody would want to register a new genetlink family while
processing a message.

Alternatively you could also postpone the registration of the new
genetlink family to a workqueue.


Re: [1/2] 2.6.23-rc1: known regressions

2007-07-23 Thread Al Viro
On Mon, Jul 23, 2007 at 11:47:30AM +0200, Michal Piotrowski wrote:
> Subject : drivers/misc/asus-laptop.c:*: error: 'struct led_classdev' 
> has no member named 'class_dev'
> References  : http://lkml.org/lkml/2007/7/22/299
> Last known good : ?
> Submitter   : Gabriel C <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?
> Status  : unknown

From 2a7e1148a9d3ee860dc2650c9a45288b120e250f Mon Sep 17 00:00:00 2001
From: Al Viro <[EMAIL PROTECTED]>
Date: Mon, 23 Jul 2007 06:20:22 -0400
Subject: [PATCH] Fix failure exits in asus-laptop

Fallout from f8a7c6fe14f556ca8eeddce258cb21392d0c3a2f.  However, looking
at it shows that checks done in ASUS_LED_UNREGISTER() can't trigger
at all (we never get to asus_led_exit() if registration fails) and
if that registration fails, we actually leak stuff.  IOW, it's worse
than just replacing class_dev with dev in there - the tests themselves
had been papering over the lousy cleanup logics.

Signed-off-by: Al Viro <[EMAIL PROTECTED]>
---
 drivers/misc/asus-laptop.c |   32 +---
 1 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/misc/asus-laptop.c b/drivers/misc/asus-laptop.c
index f753060..6b89854 100644
--- a/drivers/misc/asus-laptop.c
+++ b/drivers/misc/asus-laptop.c
@@ -1067,19 +1067,16 @@ static void asus_backlight_exit(void)
 }
 
 #define  ASUS_LED_UNREGISTER(object)   \
-   if(object##_led.class_dev   \
-  && !IS_ERR(object##_led.class_dev))  \
-   led_classdev_unregister(&object##_led)
+   led_classdev_unregister(&object##_led)
 
 static void asus_led_exit(void)
 {
+   destroy_workqueue(led_workqueue);
ASUS_LED_UNREGISTER(mled);
ASUS_LED_UNREGISTER(tled);
ASUS_LED_UNREGISTER(pled);
ASUS_LED_UNREGISTER(rled);
ASUS_LED_UNREGISTER(gled);
-
-   destroy_workqueue(led_workqueue);
 }
 
 static void __exit asus_laptop_exit(void)
@@ -1135,29 +1132,42 @@ static int asus_led_init(struct device *dev)
 
rv = ASUS_LED_REGISTER(mled, dev);
if (rv)
-   return rv;
+   goto out;
 
rv = ASUS_LED_REGISTER(tled, dev);
if (rv)
-   return rv;
+   goto out1;
 
rv = ASUS_LED_REGISTER(rled, dev);
if (rv)
-   return rv;
+   goto out2;
 
rv = ASUS_LED_REGISTER(pled, dev);
if (rv)
-   return rv;
+   goto out3;
 
rv = ASUS_LED_REGISTER(gled, dev);
if (rv)
-   return rv;
+   goto out4;
 
led_workqueue = create_singlethread_workqueue("led_workqueue");
if (!led_workqueue)
-   return -ENOMEM;
+   goto out5;
 
return 0;
+out5:
+   rv = -ENOMEM;
+   ASUS_LED_UNREGISTER(gled);
+out4:
+   ASUS_LED_UNREGISTER(pled);
+out3:
+   ASUS_LED_UNREGISTER(rled);
+out2:
+   ASUS_LED_UNREGISTER(tled);
+out1:
+   ASUS_LED_UNREGISTER(mled);
+out:
+   return rv;
 }
 
 static int __init asus_laptop_init(void)
-- 
1.5.3.GIT
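
The unwinding pattern used in the patch above can be illustrated outside the kernel. The sketch below is a minimal userspace analogue: fake_register(), fake_unregister(), init_all() and count_registered() are hypothetical stand-ins, not the LED class API. Each failure label undoes only the steps that had already succeeded, in reverse order, so a failed init leaves nothing registered.

```c
/* Tracks which of the three pretend resources are registered. */
static int registered[3];

/* Stand-in for a registration call; fails when id == fail_at. */
static int fake_register(int id, int fail_at)
{
	if (id == fail_at)
		return -1;
	registered[id] = 1;
	return 0;
}

/* Stand-in for the matching unregister call. */
static void fake_unregister(int id)
{
	registered[id] = 0;
}

/* Register resources 0..2; on failure, unwind what already succeeded. */
int init_all(int fail_at)
{
	int rv;

	rv = fake_register(0, fail_at);
	if (rv)
		goto out;
	rv = fake_register(1, fail_at);
	if (rv)
		goto out1;
	rv = fake_register(2, fail_at);
	if (rv)
		goto out2;
	return 0;
out2:
	fake_unregister(1);
out1:
	fake_unregister(0);
out:
	return rv;
}

/* Number of resources currently registered. */
int count_registered(void)
{
	return registered[0] + registered[1] + registered[2];
}
```

Calling init_all() with a fail_at of 0, 1 or 2 returns an error and leaves count_registered() at 0; a fail_at of -1 never matches an id, so all three registrations succeed.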



Re: [linux-usb-devel] [PATCH] [IrDA] KS959 USB IrDA dongle support

2007-07-23 Thread Alan Cox
> >Too late, Linus just closed the merge window, you had two
> >weeks to submit this :-)
> Too bad...
> I submitted it as soon as it was ready. Will we have to wait until the
> 2.6.24 merge window, or can it be applied earlier as it's a fully
> standalone driver ?

You can always send Linus a copy and ask, but if it's just finished then
it is far better to send it to Andrew so that it can get tested and picked
at in -mm, then merged next time around when polished.

Alan


Re: [PATCH 04/10] net-sysfs.c changes.

2007-07-23 Thread Stephen Hemminger
On Sat, 21 Jul 2007 12:16:30 +0530
Krishna Kumar2 <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger <[EMAIL PROTECTED]> wrote on 07/20/2007
> 09:52:03 PM:
> > Patrick McHardy <[EMAIL PROTECTED]> wrote:
> >
> > > Krishna Kumar2 wrote:
> > > > Patrick McHardy <[EMAIL PROTECTED]> wrote on 07/20/2007 03:37:20 PM:
> > > >
> > > >
> > > >
> > > >> rtnetlink support seems more important than sysfs to me.
> > > >>
> > > >
> > > > Thanks, I will add that as a patch. The reason to add to sysfs is that
> > > >
> > >
> >
> > But since batching is so similar to TSO, it really should be part of the
> > flags and controlled by ethtool like other offload flags.
> 
> So should I add all three interfaces (or which ones) :
> 
>   1. /sys (like for tx_queue_len)
>   2. netlink
>   3. ethtool.
> 
> Or only 2 & 3 are enough ?
> 

Yes, please do #3 and maybe #2.
Sysfs APIs are a long-term ABI problem.


Re: [PATCH]: Resurrect napi_poll patch.

2007-07-23 Thread Stephen Hemminger
On 21 Jul 2007 15:26:00 +0200
Andi Kleen <[EMAIL PROTECTED]> wrote:

> David Miller <[EMAIL PROTECTED]> writes:
> > 
> > Good candidates for taking advantage of multi-napi are:
> > 
> > 1) e1000
> > 2) ucc_geth
> > 3) ehea
> > 4) sunvnet
> 
> s2io.c

sky2.c could use it because of issues with dual-port that share
one napi for status.


Re: [PATCH 00/12 -Rev2] Implement batching skb API

2007-07-23 Thread Krishna Kumar2
> I have started a 10 run test for various buffer sizes and processes, and
> will post the results on Monday.

The 10 iteration run results for Rev2 are (average) :

----------------------------------------------------------
Test Case           Org          New        %Change
----------------------------------------------------------

        TCP 1 Process
Size:32            2703         3063         13.31
Size:128          12948        12217         -5.64
Size:512          48108        55384         15.12
Size:4096        129089       132586          2.70
Average:         192848       203250          5.39

        TCP 4 Processes
Size:32           10389        10768          3.64
Size:128          39694        42265          6.47
Size:512         159563       156373         -1.99
Size:4096        268094       256008         -4.50
Average:         477740       465414         -2.58

        TCP No Delay 1 Process
Size:32            2606         2950         13.20
Size:128           8115        11864         46.19
Size:512          39113        42608          8.93
Size:4096        103966       105333          1.31
Average:         153800       162755          5.82

        TCP No Delay 4 Processes
Size:32            4213         8727        107.14
Size:128          17579        35143         99.91
Size:512          70803       123936         75.04
Size:4096        203541       225259         10.67
Average:         296136       393065         32.73

----------------------------------------------------------
Average:        1120524      1224484          9.28%

There are three cases that degrade a little (up to -5.6%), but there are 13
cases that improve, and many of those (7 cases) are in the range of 13% to
over 100%.

Thanks,

- KK

Krishna Kumar2/India/[EMAIL PROTECTED] wrote on 07/22/2007 02:34:57 PM:

> This set of patches implements the batching API, and makes the following
> changes resulting from the review of the first set:
>
> Changes :
> -
> 1.  Changed skb_blist from pointer to static as it saves only 12 bytes
>     (i386), but bloats the code.
> 2.  Removed requirement for driver to set "features & NETIF_F_BATCH_SKBS"
>     in register_netdev to enable batching as it is redundant. Changed this
>     flag to NETIF_F_BATCH_ON and it is set by register_netdev, and other
>     user changeable calls can modify this bit to enable/disable batching.
> 3.  Added ethtool support to enable/disable batching (not tested).
> 4.  Added rtnetlink support to enable/disable batching (not tested).
> 5.  Removed MIN_QUEUE_LEN_BATCH for batching as high performance drivers
>     should not have a small queue anyway (adding bloat).
> 6.  skbs are purged from dev_deactivate instead of from unregister_netdev
>     to drop all references to the device.
> 7.  Removed changelog in source code in sch_generic.c, and unrelated renames
>     from sch_generic.c (lockless, comments).
> 8.  Removed xmit_slots entirely, as it was adding bloat (code and header)
>     and not adding value (it is calculated and set twice in internal send
>     routine and handle work completion, and referenced once in batch xmit;
>     and can instead be calculated once in xmit).
>
> Issues :
> 
> 1. Remove /sysfs support completely ?
> 2. Whether rtnetlink support is required as GSO has only ethtool ?
>
> Patches are described as:
>Mail 0/12  : This mail.
>Mail 1/12  : HOWTO documentation.
>Mail 2/12  : Changes to netdevice.h
>Mail 3/12  : dev.c changes.
>Mail 4/12  : Ethtool changes.
>Mail 5/12  : sysfs changes.
>Mail 6/12  : rtnetlink changes.
>Mail 7/12  : Change in qdisc_run & qdisc_restart API, modify callers
>to use this API.
>Mail 8/12  : IPoIB include file changes.
>Mail 9/12  : IPoIB verbs changes
>Mail 10/12 : IPoIB multicast, CM changes
>Mail 11/12 : IPoIB xmit API addition
>Mail 12/12 : IPoIB xmit internals changes (ipoib_ib.c)
>
> I have started a 10 run test for various buffer sizes and processes, and
> will post the results on Monday.
>
> Please review and provide feedback/ideas; and consider for inclusion.
>
> Thanks,
>
> - KK



[1/2] 2.6.23-rc1: known regressions

2007-07-23 Thread Michal Piotrowski
Hi all,

Here is a list of some known regressions in 2.6.23-rc1.

Feel free to add new regressions/remove fixed etc.
http://kernelnewbies.org/known_regressions

List of Aces

Name                    Regressions fixed since 21-Jun-2007
Andi Kleen              4
Linus Torvalds          4
Adrian Bunk             3
Andrew Morton           3
Jens Axboe              3
Al Viro                 2
David Woodhouse         2
Hugh Dickins            2
Tejun Heo               2



Unclassified

Subject : /usr/bin/ld: section .text [ff700500 -> 
ff7007e3] overlaps section .gnu.version_d [ff7004d8 -> 
ff70050f]
References  : http://lkml.org/lkml/2007/7/22/239
Last known good : ?
Submitter   : Andre Noll <[EMAIL PROTECTED]>
Caused-By   : ?
Handled-By  : Andi Kleen <[EMAIL PROTECTED]>
Status  : problem is being debugged

Subject : pcwd_init_module(): WARNING: at lib/kref.c:33 kref_get()
References  : http://lkml.org/lkml/2007/7/22/94
Last known good : ?
Submitter   : Ingo Molnar <[EMAIL PROTECTED]>
Caused-By   : ?
Handled-By  : ?
Status  : problem is being debugged

Subject : konqueror suddenly vanishing, "konqueror: Fatal IO error: 
client killed"
References  : http://lkml.org/lkml/2007/7/22/86
Last known good : ?
Submitter   : Markus <[EMAIL PROTECTED]>
Caused-By   : ?
Handled-By  : Ingo Molnar <[EMAIL PROTECTED]>
Status  : problem is being debugged



ACPI

Subject         : drivers/misc/asus-laptop.c:*: error: 'struct led_classdev' has no member named 'class_dev'
References      : http://lkml.org/lkml/2007/7/22/299
Last known good : ?
Submitter       : Gabriel C <[EMAIL PROTECTED]>
Caused-By       : ?
Handled-By      : ?
Status          : unknown



Networking

Subject         : New wake ups from sky2
References      : http://lkml.org/lkml/2007/7/20/386
Last known good : ?
Submitter       : Thomas Meyer <[EMAIL PROTECTED]>
Caused-By       : Stephen Hemminger <[EMAIL PROTECTED]>
                  commit eb35cf60e462491249166182e3e755d3d5d91a28
Handled-By      : Stephen Hemminger <[EMAIL PROTECTED]>
Status          : unknown



Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/


Re: TCP and batching WAS(Re: [PATCH 00/10] Implement batching skb API

2007-07-23 Thread Stephen Hemminger
On Sat, 21 Jul 2007 09:46:19 -0400
jamal <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-20-07 at 08:18 +0100, Stephen Hemminger wrote:
> 
> > You may see worse performance with batching in the real world when
> > running over WAN's.  Like TSO, batching will generate back to back packet
> > trains that are subject to multi-packet synchronized loss. 
> 
> Has someone done any study on TSO effect? 
Not that I have seen. TCP research tends to turn off NAPI and TSO because they
cause other effects which are too confusing for measurement. The discussion
of TSO usually shows up in discussions of pacing. I have seen arguments both
for and against pacing. The most convincing arguments are that pacing doesn't
help in the general case (and therefore TSO would be ok).

> Doesn't ECN with a RED router
> help on something like this?
Yes, but RED is not deployed on backbones, and ECN only slightly.
Oversized FIFO queues are the most common setup.

> I find it surprising that a single flow doing TSO would overwhelm a
> router's buffer. I actually think the value of batching as far as TCP is
> concerned is proportional to the number of flows, i.e. the more flows you
> have the more batching you will end up doing. And if TCP's fairness is
> the legend talk it has been made to be, then I don't see this as
> problematic.

It is not that TSO would overwhelm the router by itself, just that any
congested link will have periods when there is only a small number of
available slots left. When this happens a TSO burst will get truncated.

The argument against pacing, and for TSO, is that the busy sender with a
large congestion window is the one most likely to send large bursts.
For fairness, the system works better if the busy sender gets penalized more,
and dropping the latter part of the burst does that.  With pacing, the sender
may be able to saturate the router more and not detect that it is monopolizing
the bandwidth.
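
The burst truncation described above can be sketched with a toy model. This is
only an illustration under stated assumptions: simulate_drops() and its
parameters are hypothetical names, the queue is a plain tail-drop FIFO, and
"pacing" is modelled as the router draining a fixed number of slots between
two arrivals of the same sender.

```c
/*
 * Toy tail-drop FIFO model (hypothetical, not kernel code).
 *
 * packets             - number of packets the sender emits
 * free_slots          - queue slots still free when the first packet arrives
 * drained_per_arrival - slots the router frees between two arrivals;
 *                       0 models a back-to-back TSO burst,
 *                       >0 models a paced sender
 *
 * Returns how many of the sender's packets are dropped.
 */
static unsigned int simulate_drops(unsigned int packets,
				   unsigned int free_slots,
				   unsigned int drained_per_arrival)
{
	unsigned int drops = 0;
	unsigned int i;

	for (i = 0; i < packets; i++) {
		/* Slots freed since the previous packet of this sender. */
		free_slots += drained_per_arrival;

		if (free_slots == 0)
			drops++;	/* queue full: tail drop */
		else
			free_slots--;	/* packet takes one slot */
	}

	return drops;
}
```

In this model an 8 packet back-to-back burst into a queue with 3 free slots
loses its last 5 packets, while the same 8 packets paced so that one slot
drains per arrival see no loss, which is the situation where the paced sender
does not notice it is filling the queue.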


> BTW, something I noticed with regard to GSO when testing batching:
> For TCP packets slightly above MDU (up to 2K), GSO gives worse
> performance than non-GSO. Actually has nothing to do with batching,
> rather it works the same way with or without batching changes.
> 
> Another oddity:
> Looking at the flow rate from a purely packets/second (I know that's a
> router centric view, but I found it strange nevertheless) - you see that
> as packet size goes up, the pps also goes up. I tried mucking around
> with nagle etc, but saw no observable changes. Any insight?
> My expectation was that the pps would stay at least the same or get
> better with smaller packets (assuming there's less data to push around).
> 
> cheers,
> jamal
> 
> 
> 


Re: 2.6.20->2.6.21 - networking dies after random time

2007-07-23 Thread Jarek Poplawski
On Mon, Jul 23, 2007 at 07:44:58AM +0200, Marcin Ślusarz wrote:
> Ok, I've bisected this problem and found that this patch broke my NIC:

Congratulations!

> 
> 76d2160147f43f982dfe881404cfde9fd0a9da21 is first bad commit
> commit 76d2160147f43f982dfe881404cfde9fd0a9da21
> Author: Ingo Molnar <[EMAIL PROTECTED]>
> Date:   Fri Feb 16 01:28:24 2007 -0800
> 
>[PATCH] genirq: do not mask interrupts by default
...
> So I cooked patch like below and everything is working fine (so far)
> 
> Fix default_disable interrupt function (broken by [PATCH] genirq: do
> not mask interrupts by default) - revert removal of codepath which was
> invoked when removed flag (IRQ_DELAYED_DISABLE) was NOT set
> 
> Signed-off-by: Marcin Slusarz <[EMAIL PROTECTED]>
> ---
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 76a9106..0bb23cd 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -230,6 +230,8 @@ static void default_enable(unsigned int irq)
>  */
> static void default_disable(unsigned int irq)
> {
> + struct irq_desc *desc = irq_desc + irq;
> + desc->chip->mask(irq);
> }
> 
> /*

I think your patch points very well to the source of the problem
and would help many people, but it looks too arbitrary for
those who didn't have such problems. It seems it was mainly with
x86_64, so maybe something like the patch below would be enough?

Cheers,
Jarek P.

PS: not tested!

---

diff -Nurp 2.6.22-/arch/x86_64/kernel/io_apic.c 2.6.22/arch/x86_64/kernel/io_apic.c
--- 2.6.22-/arch/x86_64/kernel/io_apic.c	2007-07-09 01:32:17.0 +0200
+++ 2.6.22/arch/x86_64/kernel/io_apic.c	2007-07-23 10:33:05.0 +0200
@@ -1427,6 +1427,7 @@ static struct irq_chip ioapic_chip __rea
 	.name		= "IO-APIC",
 	.startup	= startup_ioapic_irq,
 	.mask		= mask_IO_APIC_irq,
+	.disable	= mask_IO_APIC_irq,
 	.unmask		= unmask_IO_APIC_irq,
 	.ack		= ack_apic_edge,
 	.eoi		= ack_apic_level,
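
The difference between the two approaches can be sketched in a small userspace
model. Everything here is an assumption made for illustration: the struct and
the model_* helpers are hypothetical and only mimic the idea that a chip
without its own .disable hook falls back to a default that does not mask,
while a chip that sets .disable to its mask routine is masked immediately.

```c
#include <stddef.h>

/* Simplified stand-in for an irq chip; not the kernel's struct irq_chip. */
struct irq_chip_model {
	const char *name;
	void (*mask)(unsigned int irq);
	void (*disable)(unsigned int irq);	/* optional per-chip hook */
};

static int masked[16];	/* 1 when the modelled line is masked in "hardware" */

static void model_mask(unsigned int irq)
{
	masked[irq] = 1;
}

/* Lazy default: leave the line unmasked, as after the bisected commit. */
static void model_default_disable(unsigned int irq)
{
	(void)irq;
}

/* Fill in the default only when the chip did not provide its own hook. */
static void model_set_defaults(struct irq_chip_model *chip)
{
	if (!chip->disable)
		chip->disable = model_default_disable;
}

static void model_disable_irq(struct irq_chip_model *chip, unsigned int irq)
{
	chip->disable(irq);
}
```

With this shape, changing the default affects every chip, whereas setting
.disable in one chip definition (as in the io_apic.c hunk) changes behaviour
for that chip only.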


[2.6 patch] drivers/net/acenic.c: fix check-after-use

2007-07-23 Thread Adrian Bunk
The Coverity checker noted that we've already dereferenced "dev" when we 
check whether it's NULL.

Since it's impossible that "dev" is NULL at this place this patch 
removes the NULL check.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
--- linux-2.6.22-rc6-mm1/drivers/net/acenic.c.old	2007-07-23 04:06:05.0 +0200
+++ linux-2.6.22-rc6-mm1/drivers/net/acenic.c	2007-07-23 04:08:11.0 +0200
@@ -3124,20 +3124,14 @@ static int __devinit read_eeprom_byte(st
 	struct ace_private *ap = netdev_priv(dev);
 	struct ace_regs __iomem *regs = ap->regs;
 	unsigned long flags;
 	u32 local;
 	int result = 0;
 	short i;
 
-	if (!dev) {
-		printk(KERN_ERR "No device!\n");
-		result = -ENODEV;
-		goto out;
-	}
-
 	/*
 	 * Don't take interrupts on this CPU will bit banging
 	 * the [EMAIL PROTECTED] I2C device
 	 */
 	local_irq_save(flags);
 
 	eeprom_start(regs);



[PATCH] [2.6.22] Fix a potential NULL pointer dereference in write_bulk_callback() in drivers/net/usb/pegasus.c

2007-07-23 Thread Micah Gruber
This patch fixes a potential null dereference bug where we dereference 
pegasus before a null check. This patch simply moves the dereferencing 
after the null check.


Signed-off-by: Micah Gruber <[EMAIL PROTECTED]>

---

--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -768,11 +768,13 @@
 static void write_bulk_callback(struct urb *urb)
 {
 	pegasus_t *pegasus = urb->context;
-	struct net_device *net = pegasus->net;
+	struct net_device *net;
 
 	if (!pegasus)
 		return;
 
+	net = pegasus->net;
+
 	if (!netif_device_present(net) || !netif_running(net))
 		return;
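
The same ordering can be tried outside the kernel with a minimal sketch. The
*_model types and the integer return codes below are hypothetical stand-ins
(the real callback returns void); the point is only that the pointer is read
from the context after the NULL check, not in the initializer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device, pegasus_t and struct urb. */
struct net_model {
	int present;
};

struct pegasus_model {
	struct net_model *net;
};

struct urb_model {
	void *context;
};

/*
 * Returns -1 when there is no context, 0 when the device is gone and
 * 1 when the callback would go on to do its work.
 */
static int write_bulk_callback_model(struct urb_model *urb)
{
	struct pegasus_model *pegasus = urb->context;
	struct net_model *net;		/* not initialised from pegasus yet */

	if (!pegasus)
		return -1;

	net = pegasus->net;		/* safe: pegasus is known non-NULL */

	if (!net->present)
		return 0;

	return 1;
}
```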


Re: [PATCH 4/4] Initialize and fill IPv6 route age

2007-07-23 Thread Krishna Kumar2
> +   if (dumpflg)
> +  NLA_PUT_U32(skb, RTA_AGE, timeval_to_sec(&tv) - rt->rt6i_age);
> +   else
> +  NLA_PUT_U32(skb, RTA_AGE, rt->rt6i_age);

It makes more sense (and is easier to understand) if you use:

if (dumpflg == RT6_GET_ROUTE_INFO)
  ...
so that your code does not break if someone changes the #define values.
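
A minimal userspace sketch of that comparison, assuming the two #define values
from the quoted patch; route_age_attr() and its parameters are hypothetical
names used only for illustration:

```c
#include <stdint.h>

/* Values as defined in the quoted patch (include/net/ip6_route.h). */
#define RT6_SET_ROUTE_INFO 0x0
#define RT6_GET_ROUTE_INFO 0x1

/*
 * Hypothetical helper: picks the value that would be put into RTA_AGE.
 * Comparing against the named constant keeps the branch correct even if
 * the numeric values of the two defines are changed later.
 */
static uint32_t route_age_attr(int dumpflg, uint32_t now_sec,
			       uint32_t created_sec)
{
	if (dumpflg == RT6_GET_ROUTE_INFO)
		return now_sec - created_sec;	/* age of the route in seconds */

	return created_sec;			/* raw creation timestamp */
}
```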

- KK

[EMAIL PROTECTED] wrote on 07/23/2007 10:13:18 AM:

> The age field of the ipv6 route structures is initialized with the current
> timeval at the time of route creation. When the route dump is called the
> route age value stored in the structure is subtracted from the present
> timeval and the difference is passed on as the route age.
>
> Signed-off-by: Varun Chandramohan <[EMAIL PROTECTED]>
> ---
>  include/net/ip6_fib.h   |    1 +
>  include/net/ip6_route.h |    3 +++
>  net/ipv6/addrconf.c     |    5 +
>  net/ipv6/route.c        |   23 +++
>  4 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
> index c48ea87..e30a1cf 100644
> --- a/include/net/ip6_fib.h
> +++ b/include/net/ip6_fib.h
> @@ -98,6 +98,7 @@ struct rt6_info
>
>     u32                rt6i_flags;
>     u32                rt6i_metric;
> +   time_t             rt6i_age;
>     atomic_t           rt6i_ref;
>     struct fib6_table  *rt6i_table;
>
> diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
> index 5456fdd..fc9716c 100644
> --- a/include/net/ip6_route.h
> +++ b/include/net/ip6_route.h
> @@ -36,6 +36,9 @@ struct route_info {
>  #define RT6_LOOKUP_F_REACHABLE   0x2
>  #define RT6_LOOKUP_F_HAS_SADDR   0x4
>
> +#define RT6_SET_ROUTE_INFO 0x0
> +#define RT6_GET_ROUTE_INFO 0x1
> +
>  extern struct rt6_info   ip6_null_entry;
>
>  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 5a5f8bd..715c766 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -4187,6 +4187,7 @@ EXPORT_SYMBOL(unregister_inet6addr_notif
>
>  int __init addrconf_init(void)
>  {
> +   struct timeval tv;
> int err = 0;
>
> /* The addrconf netdev notifier requires that loopback_dev
> @@ -4214,10 +4215,14 @@ int __init addrconf_init(void)
> if (err)
>return err;
>
> +   do_gettimeofday(&tv);
> ip6_null_entry.rt6i_idev = in6_dev_get(&loopback_dev);
> +   ip6_null_entry.rt6i_age = timeval_to_sec(&tv);
>  #ifdef CONFIG_IPV6_MULTIPLE_TABLES
> ip6_prohibit_entry.rt6i_idev = in6_dev_get(&loopback_dev);
> +   ip6_prohibit_entry.rt6i_age = timeval_to_sec(&tv);
> ip6_blk_hole_entry.rt6i_idev = in6_dev_get(&loopback_dev);
> +   ip6_blk_hole_entry.rt6i_age = timeval_to_sec(&tv);
>  #endif
>
> register_netdevice_notifier(&ipv6_dev_notf);
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index fe8d983..9386c05 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -600,7 +600,14 @@ static int __ip6_ins_rt(struct rt6_info
>  {
> int err;
> struct fib6_table *table;
> +   struct timeval tv;
>
> +   do_gettimeofday(&tv);
> +   /* Update the timeval for new routes
> +* We add it here to make it common irrespective
> +* of how the new route is added.
> +*/
> +   rt->rt6i_age = timeval_to_sec(&tv);
> table = rt->rt6i_table;
> write_lock_bh(&table->tb6_lock);
> err = fib6_add(&table->tb6_root, rt, info);
> @@ -2111,6 +2118,7 @@ static inline size_t rt6_nlmsg_size(void
>+ nla_total_size(4) /* RTA_IIF */
>+ nla_total_size(4) /* RTA_OIF */
>+ nla_total_size(4) /* RTA_PRIORITY */
> +  + nla_total_size(4) /*RTA_AGE*/
>+ RTAX_MAX * nla_total_size(4) /* RTA_METRICS */
>+ nla_total_size(sizeof(struct rta_cacheinfo));
>  }
> @@ -2118,10 +2126,11 @@ static inline size_t rt6_nlmsg_size(void
>  static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt,
>struct in6_addr *dst, struct in6_addr *src,
>int iif, int type, u32 pid, u32 seq,
> -  int prefix, unsigned int flags)
> +  int prefix, unsigned int flags, int dumpflg)
>  {
> struct rtmsg *rtm;
> struct nlmsghdr *nlh;
> +   struct timeval tv;
> long expires;
> u32 table;
>
> @@ -2185,6 +2194,12 @@ static int rt6_fill_node(struct sk_buff
>if (ipv6_get_saddr(&rt->u.dst, dst, &saddr_buf) == 0)
>   NLA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf);
> }
> +
> +   do_gettimeofday(&tv);
> +   if (dumpflg)
> +  NLA_PUT_U32(skb, RTA_AGE, timeval_to_sec(&tv) - rt->rt6i_age);
> +   else
> +  NLA_PUT_U32(skb, RTA_AGE, rt->rt6i_age);
>
> if (rtnetlink_put_metrics(skb, rt->u.dst.metrics) < 0)
>goto nla_put_failure;
> @@ -,7 +2237,7 @@ int rt6_dump_route(struct rt6_info *rt,
>
> return rt6_fill_node(arg->skb, rt, NULL, NULL, 0, RTM_NEWROUTE,
> NETLINK_CB(arg->cb->skb).pid, arg->cb->nlh->nlmsg_seq,
> -   prefix, NLM_F_MULTI);
> +   prefix, NLM_F_MULTI, RT6_GET_ROUTE_INFO);
>  }
>
>  static int inet6_rtm_getroute(struct sk

Re: net/bluetooth/rfcomm/tty.c: use-after-free

2007-07-23 Thread Marcel Holtmann
Hi Adrian,

> Commit 8de0a15483b357d0f0b821330ec84d1660cadc4e added the following 
> use-after-free in net/bluetooth/rfcomm/tty.c:
> 
> <--  snip  -->
> 
> ...
> static int rfcomm_dev_add(struct rfcomm_dev_req *req, struct rfcomm_dlc *dlc)
> {
> ...
> if (IS_ERR(dev->tty_dev)) {
> list_del(&dev->list);
> kfree(dev);
> return PTR_ERR(dev->tty_dev);
> }
> ...
> 
> <--  snip  -->
> 
> Spotted by the Coverity checker.

Really good catch. I completely overlooked that one. The attached patch
should fix it.

Signed-off-by: Marcel Holtmann <[EMAIL PROTECTED]>

Regards

Marcel

diff --git a/net/bluetooth/rfcomm/tty.c b/net/bluetooth/rfcomm/tty.c
index 23ba61a..22a8320 100644
--- a/net/bluetooth/rfcomm/tty.c
+++ b/net/bluetooth/rfcomm/tty.c
@@ -267,7 +267,7 @@ static int rfcomm_dev_add(struct rfcomm_dev_req *req, struct rfcomm_dlc *dlc)
 out:
 	write_unlock_bh(&rfcomm_dev_lock);
 
-	if (err) {
+	if (err < 0) {
 		kfree(dev);
 		return err;
 	}
@@ -275,9 +275,10 @@ out:
 	dev->tty_dev = tty_register_device(rfcomm_tty_driver, dev->id, NULL);
 
 	if (IS_ERR(dev->tty_dev)) {
+		err = PTR_ERR(dev->tty_dev);
 		list_del(&dev->list);
 		kfree(dev);
-		return PTR_ERR(dev->tty_dev);
+		return err;
 	}
 
 	return dev->id;
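
The second hunk follows a general rule: copy anything needed out of an object
before freeing it. A userspace sketch of that rule is given below, under
stated assumptions: the *_model helpers only imitate the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR() convention, dev_add_model() is a hypothetical
function, and the final free() exists only so the sketch does not leak.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_MODEL 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *err_ptr_model(long error)
{
	return (void *)(intptr_t)error;
}

static int is_err_model(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

static long ptr_err_model(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct dev_model {
	int id;
	void *tty_dev;
};

/* Returns the device id on success or a negative errno value on failure. */
static int dev_add_model(int id, void *tty_dev)
{
	struct dev_model *dev = malloc(sizeof(*dev));
	int err;

	if (!dev)
		return -ENOMEM;

	dev->id = id;
	dev->tty_dev = tty_dev;

	if (is_err_model(dev->tty_dev)) {
		/* Read the error code first, then free the object. */
		err = (int)ptr_err_model(dev->tty_dev);
		free(dev);
		return err;
	}

	id = dev->id;
	free(dev);	/* model only: the real driver keeps the device */
	return id;
}
```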


Re: [PATCH] [IrDA] KS959 USB IrDA dongle support

2007-07-23 Thread Samuel Ortiz

On 7/23/2007, "David Miller" <[EMAIL PROTECTED]> wrote:

>From: Samuel Ortiz <[EMAIL PROTECTED]>
>Date: Mon, 23 Jul 2007 00:32:17 +0300
>
>> Last IrDA patch for 2.6.23-rc1, if it's not too late...
>
>Too late, Linus just closed the merge window, you had two
>weeks to submit this :-)
Too bad...
I submitted it as soon as it was ready. Will we have to wait until the
2.6.24 merge window, or can it be applied earlier as it's a fully
standalone driver ?

Cheers,
Samuel.


Re: [PATCH] [IrDA] KS959 USB IrDA dongle support

2007-07-23 Thread David Miller
From: "Samuel Ortiz" <[EMAIL PROTECTED]>
Date: 23 Jul 2007 07:25:06 -

> I submitted it as soon as it was ready. Will we have to wait until the
> 2.6.24 merge window, or can it be applied earlier as it's a fully
> standalone driver ?

Please wait until I open up a net-2.6.24 tree, you can submit
it to me then.

Thanks.