date:20050901

Re: Question on debugging reference counts on Net devices.

2005-09-01 Thread Ben Greear

David S. Miller wrote:

> From: Ben Greear <[EMAIL PROTECTED]>
> Date: Thu, 01 Sep 2005 23:11:28 -0700
>
> > A quick optimization is to kmalloc chunks of 1000 or so structs at once
> > and then write a cheap foomalloc method to grab and release them.  We
> > already take a lock (at least in my implementation), so this should be
> > small overhead.
>
> Better to eliminate all the kmalloc()'s altogether.  It's bad
> for a debugging facility to have a known failure mode.
>
> > Maybe it's just too late..but I am not understanding this.  How
> > do you print all of the current users of the device when you are
> > trying to release the device and something has a reference to it?
>
> The netdev_pointer's get linked in to a head pointer in
> the netdev itself.
>
> struct netdev {
> 	...
> 	struct hlist_head   debug_ref_list;
> 	...
> };
>
> struct netdev_pointer {
> 	...
> 	struct hlist_node   debug_ref_node;
> 	...
> };
>
> So you walk the debug_ref_list to get all the references
> grabbed of the object.

Ok, so each object that now has a net_device* dev pointer instead
gets this netdev_pointer object.  When a reference is taken,
this netdev_pointer object is linked into the netdevice object
in a list.  This requires a lock and is O(1) even with debugging
disabled?

On dev_put, you remove the reference from the netdev's list.
This would be O(N) and require a lock with or without debugging enabled, right?
(N == number of references)
You are also going to pay the memory price of the hlist_node object at least,
even with debugging disabled.  In things like routes and sockets this might
add up...

All of the code that currently goes foo->dev would have to be changed to
foo->dev.dev for reference, and/or we'd change most methods to take a pointer
to netdev_pointer instead of netdevice?

Please correct if I am wrong here.
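
A minimal userspace sketch of the linkage under discussion, for illustration only. The names tracked_dev, dev_pointer and ref_node are hypothetical stand-ins and do not come from the patch. The node keeps a next pointer and a pprev pointer, which is the layout the kernel's hlist_node uses, so unlinking on put needs no list walk: both hold and put are O(1). Locking is left out of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
struct ref_node {
	struct ref_node *next;
	struct ref_node **pprev;	/* address of the pointer that points at us */
};

struct tracked_dev {
	struct ref_node *refs;		/* head of the list of current holders */
	int refcnt;
};

struct dev_pointer {
	struct tracked_dev *dev;
	struct ref_node node;
	const char *file;		/* where the reference was taken */
	int line;
};

/* Take a reference and link the holder in at the head of the list: O(1). */
void dev_pointer_hold(struct dev_pointer *p, struct tracked_dev *dev,
		      const char *file, int line)
{
	p->dev = dev;
	p->file = file;
	p->line = line;
	p->node.next = dev->refs;
	if (dev->refs)
		dev->refs->pprev = &p->node.next;
	dev->refs = &p->node;
	p->node.pprev = &dev->refs;
	dev->refcnt++;
}

/* Drop the reference; pprev lets us unlink without searching: O(1). */
void dev_pointer_put(struct dev_pointer *p)
{
	*p->node.pprev = p->node.next;
	if (p->node.next)
		p->node.next->pprev = p->node.pprev;
	p->node.next = NULL;
	p->node.pprev = NULL;
	p->dev->refcnt--;
	p->dev = NULL;
}

/* Walk the list of holders; a real facility would print file:line here. */
int dev_count_holders(const struct tracked_dev *dev)
{
	const struct ref_node *it;
	int n = 0;

	for (it = dev->refs; it; it = it->next)
		n++;
	return n;
}
```

The memory cost Ben mentions is visible here: every holder carries two extra pointers for the node, plus the file and line fields, unless they are compiled out.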

> > > Perhaps an even better idea for class #2 is to use a stack local
> > > "struct netdev_pointer", since that is the mode in which this
> > > case typically occurs, all within the same function so the function
> > > stack frame is live during the entire dummy reference grab.
> >
> > Could you actually detect a reference leak in this case?  The struct goes out
> > of scope when the method ends..but since we don't have auto-destructors
> > like C++, then there is nothing to force the release of the reference.
>
> Most of the cases in this class are in the scope of the method,
> ie. we do the put before the function returns.

Right, but we want to catch the cases where someone forgets, so the
pointer object cannot be on the stack (we need something to reference
X time later when we try to free the device and want to know who has
a handle.)  This implies kmalloc to me, since setting any static amount
of them in the netdevice is just asking for trouble (or known failure
mode, to use your term :)

> > Bad caps I can quickly fix..but I actually went to some trouble to
> > try to get the tabbing correct.  Is my patch coming through with spaces
> > instead of tabs or something like that?
>
> Yes, spaces instead of tabs:
>
> -   slave_dev  = dev_get_by_name(srq.slave_name);
> +/* Need this up here so we know what the eventual reference for
> + * the slave_dev will be.  This is to help with debugging netdev
> + * ref counts. --Ben
> + */
> +s = kmalloc(sizeof(*s), GFP_KERNEL);
> +if (!s) {
> +   return -ENOMEM;
> +}
> +
> +   slave_dev  = dev_get_by_name(srq.slave_name, &s->dev);

Ok, will search for those again.

> and the unnecessary bracing there needs to be eliminated as well.
> Some of us still read code in 24 line terminals from time to time :)

Bleh, some of us with actual monitors hate when commenting
out a line of code changes the logic flow!  But, not my ball game,
so I'll comply.

I went ahead and ported the rest of the kernel to my current method
earlier today.  Will update the patch with those changes as well as
the fixed capitalization and tabbing in a few minutes.

I do like that your method gets rid of the kmalloc in at least one
case, but to be honest, I still am partial to my method at this point
because it has little to no cost when disabled (and it's done :))

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Orinoco-devel] [PATCH 0/8] orinoco: Bringing in-sync with CVS

2005-09-01 Thread David Gibson

On Thu, Sep 01, 2005 at 08:00:33PM -0400, Pavel Roskin wrote:
> Hello!
> 
> Following 8 patches bring orinoco drivers in the kernel fully in-sync
> with the CVS repository at http://savannah.nongnu.org/projects/orinoco/
> ("for_linus" branch).
> 
> The patches include 2 new front-end drivers: orinoco_nortel and
> spectrum_cs.  They also remove EXPERIMENTAL designation of the PCI
> front-ends (PCI, PLX and TMD).  The rest is pretty minor.
> 
> I'm placing the patches to http://red-bean.com/proski/orinoco/
> 
> Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

All look good to me.

Acked-by: David Gibson <[EMAIL PROTECTED]>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/people/dgibson

Re: Possible BUG in IPv4 TCP window handling, all recent 2.4.x/2.6.x kernels

2005-09-01 Thread David S. Miller

From: John Heffner <[EMAIL PROTECTED]>
Date: Thu, 1 Sep 2005 22:51:48 -0400

> I have an idea why this is going on.  Packets are pre-allocated by the 
> driver to be a max packet size, so when you send small packets, it 
> wastes a lot of memory.  Currently Linux uses the packets at the 
> beginning of a connection to make a guess at how best to advertise its 
> window so as not to overflow the socket's memory bounds.  Since you 
> start out with big segments then go to small ones, this is defeating 
> that mechanism.  It's actually documented in the comments in 
> tcp_input.c. :)
> 
>   * The scheme does not work when sender sends good segments opening
>   * window and then starts to feed us spagetti. But it should work
>   * in common situations. Otherwise, we have to rely on queue collapsing.

That's a strong possibility, good catch John.

Although, I'm still not ruling out some box in the middle
even though I consider it less likely than your theory.

So you're suggesting that tcp_prune_queue() should do the:

if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
tcp_clamp_window(sk, tp);

check after attempting to collapse the queue.

But, that window clamping should fix the problem, as we recalculate
the window to advertise.

Re: Question on debugging reference counts on Net devices.

2005-09-01 Thread David S. Miller

From: Ben Greear <[EMAIL PROTECTED]>
Date: Thu, 01 Sep 2005 23:11:28 -0700

> A quick optimization is to kmalloc chunks of 1000 or so structs at once
> and then write a cheap foomalloc method to grab and release them.  We
> already take a lock (at least in my implementation), so this should be
> small overhead.

Better to eliminate all the kmalloc()'s altogether.  It's bad
for a debugging facility to have a known failure mode.

> Maybe it's just too late..but I am not understanding this.  How
> do you print all of the current users of the device when you are
> trying to release the device and something has a reference to it?

The netdev_pointer's get linked in to a head pointer in
the netdev itself.

struct netdev {
...
struct hlist_head   debug_ref_list;
...
};

struct netdev_pointer {
...
struct hlist_node   debug_ref_node;
...
};

So you walk the debug_ref_list to get all the references
grabbed of the object.

> > Perhaps an even better idea for class #2 is to use a stack local
> > "struct netdev_pointer", since that is the mode in which this
> > case typically occurs, all within the same function so the function
> > stack frame is live during the entire dummy reference grab.
> 
> Could you actually detect a reference leak in this case?  The struct goes out
> of scope when the method ends..but since we don't have auto-destructors
> like C++, then there is nothing to force the release of the reference.

Most of the cases in this class are in the scope of the method,
ie. we do the put before the function returns.

> Bad caps I can quickly fix..but I actually went to some trouble to
> try to get the tabbing correct.  Is my patch coming through with spaces
> instead of tabs or something like that?

Yes, spaces instead of tabs:

-   slave_dev  = dev_get_by_name(srq.slave_name);
+/* Need this up here so we know what the eventual reference for
+ * the slave_dev will be.  This is to help with debugging netdev
+ * ref counts. --Ben
+ */
+s = kmalloc(sizeof(*s), GFP_KERNEL);
+if (!s) {
+   return -ENOMEM;
+}
+
+   slave_dev  = dev_get_by_name(srq.slave_name, &s->dev);

and the unnecessary bracing there needs to be eliminated as well.
Some of us still read code in 24 line terminals from time to time :)

Re: Question on debugging reference counts on Net devices.

2005-09-01 Thread Ben Greear

David S. Miller wrote:

> From: Ben Greear <[EMAIL PROTECTED]>
> Date: Mon, 29 Aug 2005 16:01:11 -0700
>
> > Latest netdevice ref-count debugging patch is up.  The
> > patch is against 2.6.13:
> >
> > http://www.candelatech.com/oss/rfcnt.patch
>
> I reviewed this, and I think the approach can be refined and made more
> robust.
>
> The worst part is that you do a kmalloc() on every reference, and you
> shouldn't need to do that.

A quick optimization is to kmalloc chunks of 1000 or so structs at once
and then write a cheap foomalloc method to grab and release them.  We
already take a lock (at least in my implementation), so this should be
small overhead.
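
A minimal userspace sketch of the chunked allocator described above, with malloc() standing in for kmalloc(). The names ref_rec, ref_pool, ref_pool_get and ref_pool_put are hypothetical (the "foomalloc" in the mail is itself a placeholder), and the caller is assumed to hold whatever lock already protects the reference list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define REF_CHUNK 1000	/* the "1000 or so" from the mail; any size works */

/* Hypothetical record; the struct used by the real patch is not shown here. */
struct ref_rec {
	struct ref_rec *next_free;
	const char *file;
	int line;
};

struct ref_pool {
	struct ref_rec *free_list;
	size_t chunks;		/* number of large allocations made so far */
};

/*
 * Hand out one record, refilling the free list with a whole chunk when it
 * runs dry.  Chunks are never returned to the system in this sketch.
 */
struct ref_rec *ref_pool_get(struct ref_pool *pool)
{
	struct ref_rec *rec;
	size_t i;

	if (!pool->free_list) {
		struct ref_rec *chunk = malloc(REF_CHUNK * sizeof(*chunk));

		if (!chunk)
			return NULL;	/* allocation can still fail, just less often */
		for (i = 0; i < REF_CHUNK; i++) {
			chunk[i].next_free = pool->free_list;
			pool->free_list = &chunk[i];
		}
		pool->chunks++;
	}
	rec = pool->free_list;
	pool->free_list = rec->next_free;
	rec->next_free = NULL;
	return rec;
}

/* Return a record to the pool; it becomes the next one handed out. */
void ref_pool_put(struct ref_pool *pool, struct ref_rec *rec)
{
	assert(rec != NULL);
	rec->next_free = pool->free_list;
	pool->free_list = rec;
}
```

This amortizes the allocator cost across many references, but it does not remove the failure path: a refill can still return NULL.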

> As you have noted, a netdev refcount grab falls into two categories:
>
> 1) something->dev = dev;
>    dev_get(dev);
>
> 2) dev_get(dev);
>    /* some code where dev might get released but we want
>     * to preserve the device over all the calls we're doing
>     */
>    func1();
>    func2();
>    whatever();
>    /* Ok we're done, now can let it go for real.  */
>    dev_put(dev);
>
> For the first case, just make a structure to hold this as the
> place where the device pointer gets assigned to.  This would
> allow abstraction, get rid of the memory allocation, and
> allow for proper refcounting in fact.  So you'd have something like:
>
> struct netdev;
> struct netdev_pointer {
> 	struct netdev		*dev;
> #ifdef CONFIG_NETDEV_REFCNT_DEBUG
> 	struct hlist_head	hash;
> 	const char		*file;
> 	int			line;
> #endif
> };
>
> Then in places where we have "struct netdev *dev", you replace it
> with "struct netdev_pointer *__dev".  You have a routine like:
>
> #define dev_set(netdev_pointer, netdev) \
> do { \
> 	netdev_pointer->dev = netdev; \
> 	dev_get(netdev_pointer, netdev); \
> 	netdev_pointer->file = __FILE__; \
> 	netdev_pointer->line = __LINE__; \
> } while (0)
>
> that you invoke instead of the standard:
>
> 	x->dev = dev;
> 	dev_get(dev);
>
> sequence.

Maybe it's just too late..but I am not understanding this.  How
do you print all of the current users of the device when you are
trying to release the device and something has a reference to it?

> Class #2 is so transient that you can probably just use an array
> of dummy struct netdev_pointers that sit in struct netdev itself.
> Use, say, 4, and allocate them one by one until you run out.
> And you do the linkage into there, and thus these act as the "object"
> you're assigning the netdev pointer to that requires the refcount.
> You'll be able to store the __FILE__ and __LINE__ information
> quite neatly in there.

As soon as we choose 4, someone will need 5.  I'd prefer to keep
the storage of the reference information separate, though I am
fine with optimizing so that we should have very few kmallocs/frees
by allocing large chunks of objects at once.

> Perhaps an even better idea for class #2 is to use a stack local
> "struct netdev_pointer", since that is the mode in which this
> case typically occurs, all within the same function so the function
> stack frame is live during the entire dummy reference grab.

Could you actually detect a reference leak in this case?  The struct goes out
of scope when the method ends..but since we don't have auto-destructors
like C++, then there is nothing to force the release of the reference.

> This whole concept is actually quite generic, and we could thus
> apply it to sockets, routes, neighbour table entries, etc.
>
> And Ben you really need to get up to snuff with Linux coding
> standards.  All of your "StudLyCaps" function names and bad
> tabbing hurt my eyes quite a bit and make your patches nearly
> impossible to review :-/

Bad caps I can quickly fix..but I actually went to some trouble to
try to get the tabbing correct.  Is my patch coming through with spaces
instead of tabs or something like that?

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com


Re: Question on debugging reference counts on Net devices.

2005-09-01 Thread David S. Miller

From: Ben Greear <[EMAIL PROTECTED]>
Date: Mon, 29 Aug 2005 16:01:11 -0700

> Latest netdevice ref-count debugging patch is up.  The
> patch is against 2.6.13:
> 
> http://www.candelatech.com/oss/rfcnt.patch

I reviewed this, and I think the approach can be refined and made more
robust.

The worst part is that you do a kmalloc() on every reference, and you
shouldn't need to do that.

As you have noted, a netdev refcount grab falls into two categories:

1) something->dev = dev;
   dev_get(dev);

2) dev_get(dev);
   /* some code where dev might get released but we want
    * to preserve the device over all the calls we're doing
    */
   func1();
   func2();
   whatever();
   /* Ok we're done, now can let it go for real.  */
   dev_put(dev);

For the first case, just make a structure to hold this as the
place where the device pointer gets assigned to.  This would
allow abstraction, get rid of the memory allocation, and
allow for proper refcounting in fact.  So you'd have something like:

struct netdev;
struct netdev_pointer {
	struct netdev		*dev;
#ifdef CONFIG_NETDEV_REFCNT_DEBUG
	struct hlist_head	hash;
	const char		*file;
	int			line;
#endif
};

Then in places where we have "struct netdev *dev", you replace it
with "struct netdev_pointer *__dev".  You have a routine like:

#define dev_set(netdev_pointer, netdev) \
do { \
	netdev_pointer->dev = netdev; \
	dev_get(netdev_pointer, netdev); \
	netdev_pointer->file = __FILE__; \
	netdev_pointer->line = __LINE__; \
} while (0)

that you invoke instead of the standard:

x->dev = dev;
dev_get(dev);

sequence.
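
A userspace sketch of the same idea, for illustration only. It assumes a simplified struct netdev with a plain integer count, and the helper names dev_set_at and dev_clear are hypothetical. The work is done in an inline function so each argument is evaluated once, and a thin macro supplies __FILE__ and __LINE__ so the recorded location is the caller's.

```c
#include <assert.h>
#include <stddef.h>

struct netdev {
	int refcnt;		/* stand-in for the real reference counter */
};

struct netdev_pointer {
	struct netdev *dev;
	const char *file;	/* call site of the most recent dev_set() */
	int line;
};

/* Function form: every argument is evaluated exactly once. */
static inline void dev_set_at(struct netdev_pointer *p, struct netdev *dev,
			      const char *file, int line)
{
	assert(p->dev == NULL);	/* overwriting a live pointer would leak a ref */
	p->dev = dev;
	p->file = file;
	p->line = line;
	dev->refcnt++;
}

/* Thin wrapper so __FILE__/__LINE__ name the caller, not this helper. */
#define dev_set(p, dev) dev_set_at((p), (dev), __FILE__, __LINE__)

/* Counterpart of dev_set(): drop the reference and forget the device. */
static inline void dev_clear(struct netdev_pointer *p)
{
	p->dev->refcnt--;
	p->dev = NULL;
}
```

Because the file and line live in the holder itself, nothing has to be allocated when the reference is taken.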

Class #2 is so transient that you can probably just use an array
of dummy struct netdev_pointers that sit in struct netdev itself.
Use, say, 4, and allocate them one by one until you run out.
And you do the linkage into there, and thus these act as the "object"
you're assigning the netdev pointer to that requires the refcount.
You'll be able to store the __FILE__ and __LINE__ information
quite neatly in there.
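
A minimal sketch of the fixed-slot variant, assuming a hypothetical netdev_dbg structure that stands in for the debug part of struct netdev. The names dbg_slot, dbg_slot_get and dbg_slot_put are illustrative only. No memory is allocated, but a grab fails once every slot is busy, so callers need a policy for that case.

```c
#include <assert.h>
#include <stddef.h>

#define NETDEV_DBG_SLOTS 4	/* the "say, 4" from the mail */

struct dbg_slot {
	int in_use;
	const char *file;	/* where the transient hold was taken */
	int line;
};

/* Hypothetical stand-in for the debug fields embedded in struct netdev. */
struct netdev_dbg {
	struct dbg_slot slots[NETDEV_DBG_SLOTS];
};

/* Claim a free slot for a transient hold; NULL means all slots are busy. */
struct dbg_slot *dbg_slot_get(struct netdev_dbg *dev, const char *file,
			      int line)
{
	int i;

	for (i = 0; i < NETDEV_DBG_SLOTS; i++) {
		if (!dev->slots[i].in_use) {
			dev->slots[i].in_use = 1;
			dev->slots[i].file = file;
			dev->slots[i].line = line;
			return &dev->slots[i];
		}
	}
	return NULL;
}

/* Release a slot when the matching put happens. */
void dbg_slot_put(struct dbg_slot *slot)
{
	assert(slot->in_use);
	slot->in_use = 0;
}
```

With four slots, a fifth concurrent transient hold on the same device gets NULL back and goes unrecorded.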

Perhaps an even better idea for class #2 is to use a stack local
"struct netdev_pointer", since that is the mode in which this
case typically occurs, all within the same function so the function
stack frame is live during the entire dummy reference grab.

This whole concept is actually quite generic, and we could thus
apply it to sockets, routes, neighbour table entries, etc.

And Ben you really need to get up to snuff with Linux coding
standards.  All of your "StudLyCaps" function names and bad
tabbing hurt my eyes quite a bit and make your patches nearly
impossible to review :-/

Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread Herbert Xu

On Thu, Sep 01, 2005 at 09:44:09PM -0700, David S. Miller wrote:
> 
> Ok, how does this look?

Thanks Dave, this one looks perfect.
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread David S. Miller

From: Herbert Xu <[EMAIL PROTECTED]>
Subject: Re: [PATCHES]: Two TSO refinements
Date: Fri, 2 Sep 2005 14:30:26 +1000

> On Thu, Sep 01, 2005 at 09:28:16PM -0700, David S. Miller wrote:
> >
> > > Therefore,
> > > 
> > >   tp->lost_out >= diff
> > 
> > I assume the same applies to tp->left_out as well?
> 
> Yes it does.

Ok, how does this look?

[TCP]: Keep TSO enabled even during loss events.

All we need to do is resegment the queue so that
we record SACK information accurately.  The edges
of the SACK blocks guide our resegmenting decisions.

With help from Herbert Xu.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/include/net/tcp.h b/include/net/tcp.h
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -454,6 +454,7 @@ extern int tcp_retransmit_skb(struct soc
 extern void tcp_xmit_retransmit_queue(struct sock *);
 extern void tcp_simple_retransmit(struct sock *);
 extern int tcp_trim_head(struct sock *, struct sk_buff *, u32);
+extern int tcp_fragment(struct sock *, struct sk_buff *, u32, unsigned int);
 
 extern void tcp_send_probe0(struct sock *);
 extern void tcp_send_partial(struct sock *);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -923,14 +923,6 @@ tcp_sacktag_write_queue(struct sock *sk,
int flag = 0;
int i;
 
-   /* So, SACKs for already sent large segments will be lost.
-* Not good, but alternative is to resegment the queue. */
-   if (sk->sk_route_caps & NETIF_F_TSO) {
-   sk->sk_route_caps &= ~NETIF_F_TSO;
-   sock_set_flag(sk, SOCK_NO_LARGESEND);
-   tp->mss_cache = tp->mss_cache;
-   }
-
if (!tp->sacked_out)
tp->fackets_out = 0;
prior_fackets = tp->fackets_out;
@@ -978,20 +970,40 @@ tcp_sacktag_write_queue(struct sock *sk,
flag |= FLAG_DATA_LOST;
 
sk_stream_for_retrans_queue(skb, sk) {
-   u8 sacked = TCP_SKB_CB(skb)->sacked;
-   int in_sack;
+   int in_sack, pcount;
+   u8 sacked;
 
/* The retransmission queue is always in order, so
 * we can short-circuit the walk early.
 */
-   if(!before(TCP_SKB_CB(skb)->seq, end_seq))
+   if (!before(TCP_SKB_CB(skb)->seq, end_seq))
break;
 
-   fack_count += tcp_skb_pcount(skb);
+   pcount = tcp_skb_pcount(skb);
+
+   if (pcount > 1 &&
+   (after(start_seq, TCP_SKB_CB(skb)->seq) ||
+before(end_seq, TCP_SKB_CB(skb)->end_seq))) {
+   unsigned int pkt_len;
+
+   if (after(start_seq, TCP_SKB_CB(skb)->seq))
+   pkt_len = (start_seq -
+  TCP_SKB_CB(skb)->seq);
+   else
+   pkt_len = (end_seq -
+  TCP_SKB_CB(skb)->seq);
+   if (tcp_fragment(sk, skb, pkt_len, skb_shinfo(skb)->tso_size))
+   break;
+   pcount = tcp_skb_pcount(skb);
+   }
+
+   fack_count += pcount;
 
in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
!before(end_seq, TCP_SKB_CB(skb)->end_seq);
 
+   sacked = TCP_SKB_CB(skb)->sacked;
+
/* Account D-SACK for retransmitted packet. */
if ((dup_sack && in_sack) &&
(sacked & TCPCB_RETRANS) &&
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -428,11 +428,11 @@ static void tcp_set_skb_tso_segs(struct 
  * packet to the list.  This won't be called frequently, I hope. 
  * Remember, these are still headerless SKBs at this point.
  */
-static int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss_now)
+int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss_now)
 {
struct tcp_sock *tp = tcp_sk(sk);
struct sk_buff *buff;
-   int nsize;
+   int nsize, old_factor;
u16 flags;
 
nsize = skb_headlen(skb) - len;
@@ -490,18 +490,29 @@ static int tcp_fragment(struct sock *sk,
tp->left_out -= tcp_skb_pcount(skb);
}
 
+   old_factor = tcp_skb_pcount(skb);
+
/* Fix up tso_factor for both original and new SKB.  */
tcp_set_skb_tso_segs(sk, skb, mss_now);
tcp_set_skb_tso_segs(sk, buff, mss_now);
 
-   if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
-   tp->lost_

Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread Herbert Xu

On Thu, Sep 01, 2005 at 09:28:16PM -0700, David S. Miller wrote:
>
> > Therefore,
> > 
> > tp->lost_out >= diff
> 
> I assume the same applies to tp->left_out as well?

Yes it does.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread David S. Miller

From: Herbert Xu <[EMAIL PROTECTED]>
Date: Fri, 2 Sep 2005 11:08:12 +1000

> Yes, because
> 
>   diff = pcount(orig_skb) - (pcount(skb) + pcount(buff))
>        <= pcount(orig_skb)
> 
> Now if orig_skb is marked as TCPCB_LOST, then by definition
> 
>   tp->lost_out >= pcount(orig_skb)
> 
> Therefore,
> 
>   tp->lost_out >= diff

I assume the same applies to tp->left_out as well?


Re: ieee80211 patches

2005-09-01 Thread Jouni Malinen

On Thu, Sep 01, 2005 at 05:26:07PM +0200, Jiri Benc wrote:

> The current implementation of ieee80211 as is in ieee80211 branch
> contains ugly hack so it works with ethernet frames externally (which
> are internally converted to and from 802.11 frames). Because 802.3 and
> 802.11 have the same format of MAC address, it is somehow possible -
> until you get to WDS and similar features. On the other hand, it allows
> direct bridging between ethernet and 802.11 network.

WDS works fine with Ethernet connection. WDS links are point-to-point
links with two addresses (RA, TA) coming from the link end-points and
two addresses (DA, SA) from the Ethernet frame.
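
A minimal sketch of that address mapping, covering only the four address fields of a four-address (ToDS=1, FromDS=1) 802.11 data frame and none of the other header fields. The struct and function names are hypothetical and are not taken from the ieee80211 code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

struct eth_hdr {
	uint8_t dst[ETH_ALEN];
	uint8_t src[ETH_ALEN];
	uint16_t ethertype;
};

/* Address fields of a four-address 802.11 data frame, in header order. */
struct wds_addrs {
	uint8_t addr1[ETH_ALEN];	/* RA: receiving end of the WDS link */
	uint8_t addr2[ETH_ALEN];	/* TA: transmitting end of the WDS link */
	uint8_t addr3[ETH_ALEN];	/* DA: final destination */
	uint8_t addr4[ETH_ALEN];	/* SA: original source */
};

/*
 * RA/TA come from the link end-points, DA/SA from the Ethernet frame.
 * "local" is this end of the link, "peer" is the other end.
 */
void wds_fill_addrs(struct wds_addrs *out, const struct eth_hdr *eth,
		    const uint8_t *local, const uint8_t *peer)
{
	assert(out && eth && local && peer);
	memcpy(out->addr1, peer, ETH_ALEN);
	memcpy(out->addr2, local, ETH_ALEN);
	memcpy(out->addr3, eth->dst, ETH_ALEN);
	memcpy(out->addr4, eth->src, ETH_ALEN);
}
```

Nothing from the Ethernet header is lost in this direction, which is why the Ethernet-facing interface can carry WDS traffic.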

> One of our patches fixes this so ieee80211 works with 802.11 frames.
> This breaks bridging for now (uhm... it isn't change in userspace, is
> it?). Later, the 802.11<->802.3 conversion interface will be added to
> the bridging code where it logically belongs.

Please do not break bridging between wlan and Ethernet interfaces. IEEE
802.11 development really needs to start looking at what is needed for
AP (Master) mode and WDS links. Both of these are often used with
bridging. I believe that the order here should be to first implement the
conversion interface and only after that change to 802.11 frames. In
many cases, I'm not even sure that a move to 802.11 frames is that good of
an idea. Let's at least make sure it does not break more than what is
really needed.

Things like variable header size (e.g., QoS vs. non-QoS for data frames)
are likely to cause problems in many areas. Even if most of the kernel
code would handle this (would it?), there are a number of important user
space programs that may be quite confused since they are used to sending
and receiving Ethernet frames with fixed header size. Does this mean that
user space programs will need to learn when to include QoS fields in the
headers if they are required to send frames for which they need to set
both the source and destination MAC addresses (e.g., RSN
pre-authentication)? With 802.11<->802.3 conversion, these kinds of
problems are solved by removing the need to know about 802.11 details
from user space.

-- 
Jouni Malinen                                            PGP id EFC895FA

Re: Possible BUG in IPv4 TCP window handling, all recent 2.4.x/2.6.x kernels

2005-09-01 Thread John Heffner


On Sep 1, 2005, at 6:53 PM, Ion Badulescu wrote:


> A few minutes later it has finally caught up to present time and it
> starts receiving smaller packets containing real-time data. The TCP
> window is still 16534 at this point.

[tcpdump output removed]

> This is where things start going bad. The window starts shrinking from
> 15340 all the way down to 2355 over the course of 0.3 seconds. Notice
> the many duplicate acks that serve no purpose (there are no lost
> packets and the tcpdump is taken on the receiver so there are no
> packets/acks crossed in flight).


I have an idea why this is going on.  Packets are pre-allocated by the 
driver to be a max packet size, so when you send small packets, it 
wastes a lot of memory.  Currently Linux uses the packets at the 
beginning of a connection to make a guess at how best to advertise its 
window so as not to overflow the socket's memory bounds.  Since you 
start out with big segments then go to small ones, this is defeating 
that mechanism.  It's actually documented in the comments in 
tcp_input.c. :)


 * The scheme does not work when sender sends good segments opening
 * window and then starts to feed us spagetti. But it should work
 * in common situations. Otherwise, we have to rely on queue collapsing.

If you overflow the socket's memory bound, it ends up calling 
tcp_clamp_window().  (I'm not sure this is really the right thing to do 
here before trying to collapse the queue.)  If the receiving 
application doesn't fall too far behind, it might help you to set a 
much larger receiver buffer.
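
A toy calculation of the effect, assuming each received packet is charged a fixed buffer size against the socket no matter how little payload it carries. The function name and the figures in the note below are made up for illustration and are not kernel constants.

```c
#include <assert.h>

/*
 * Payload bytes that fit before the memory charged to the socket reaches
 * rcvbuf, when every packet is charged buf_size regardless of its payload.
 */
unsigned long payload_capacity(unsigned long rcvbuf, unsigned long buf_size,
			       unsigned long payload)
{
	assert(buf_size > 0);
	assert(payload <= buf_size);
	return (rcvbuf / buf_size) * payload;
}
```

With a 65536-byte bound and a 2048-byte charge per packet, 32 packets fit. At 1448 bytes of payload each that is 46336 bytes of data, but at 100 bytes each it is only 3200, so a window sized from the early large segments promises far more data than the socket can hold once the segments get small.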


  -John


Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread Herbert Xu

On Thu, Sep 01, 2005 at 05:53:03PM -0700, David S. Miller wrote:
> 
> > > + tp->lost_out -= diff;
> > > + if ((int)tp->lost_out < 0)
> > > + tp->lost_out = 0;
> > 
> > These checks aren't necessary.
> 
> Are you sure this can't happen if the MSS changes?

Yes, because

diff = pcount(orig_skb) - (pcount(skb) + pcount(buff))
 <= pcount(orig_skb)

Now if orig_skb is marked as TCPCB_LOST, then by definition

tp->lost_out >= pcount(orig_skb)

Therefore,

tp->lost_out >= diff

It's the same reason why we don't check this for packets_out.  fackets_out
is the odd one out because it's determined by the number of packets the
peer has ACKed which doesn't necessarily encompass orig_skb.
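
The argument can be written as a small userspace check. This is a sketch with a hypothetical helper name, and it models only the lost_out counter rather than the real tcp_fragment(). Given the premise that a lost orig_skb has already been counted in lost_out, subtracting diff cannot take the counter below zero.

```c
#include <assert.h>

/*
 * New value of lost_out after an skb counted as lost is resegmented into
 * skb + buff.  The asserts spell out the two steps of the argument.
 */
unsigned int lost_out_after(unsigned int lost_out, unsigned int orig_pcount,
			    unsigned int skb_pcount, unsigned int buff_pcount)
{
	int diff = (int)orig_pcount - (int)(skb_pcount + buff_pcount);

	/* Premise: orig_skb is marked lost, so it is included in lost_out. */
	assert(lost_out >= orig_pcount);
	/* Packet counts are never negative, so diff <= pcount(orig_skb). */
	assert(diff <= (int)orig_pcount);

	/* Hence lost_out >= diff and the result stays non-negative. */
	return (unsigned int)((int)lost_out - diff);
}
```

When the new segments add up to more packets than the original, diff is negative and the counter grows instead.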

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[ANNOUNCE] 1.0.3 release of lksctp-tools

2005-09-01 Thread Sridhar Samudrala

Hi All,

A new release of lksctp-tools 1.0.3 is now available for download. It is
based on the latest release of linux kernel 2.6.13 and can be downloaded
from
http://sourceforge.net/project/showfiles.php?group_id=26529

The release includes
* the following set of 4 RPM packages and 1 gzipped tar ball.
  - lksctp-tools-1.0.3-1.i386.rpm
      SCTP run-time library, sample sctp applications and
      withsctp, a tool that replaces TCP with SCTP when used with
      existing TCP binaries.
  - lksctp-tools-devel-1.0.3-1.i386.rpm
      SCTP header file, source code for sample sctp applications and
      man pages.
  - lksctp-tools-doc-1.0.3-1.i386.rpm
      SCTP RFC's and internet drafts.
  - lksctp-tools-1.0.3-1.src.rpm
      Source RPM
  - lksctp-tools-1.0.3.tar.gz
      Source in tar format

Here are some of the major enhancements and bugfixes that went in
since 1.0.2 release.

List of changes that went into lksctp-tools since 1.0.2 release
- sctp_connectx() API.
- Fix inconsistencies in license stmt in the library files.
- Fix subscript out of range bug in sctp_test.c
- Update frametests to run on linux 2.6.13.
- Update SCTP internet drafts in doc directory

List of changes that went into linux kernel since linux 2.6.10
- sctp_connectx() API support.
- Fix potential NULL pointer dereference while handling an ICMP
  error.
- Make init & delayed sack timeouts configurable by user.
- Fix incorrect setting of sk_bound_dev_if when binding/sending
  to a ipv6 link local address.
- Support IP_FREEBIND socket option and ip_nonlocal_bind sysctl.
- Extend the info exported via /proc/net/sctp to support
  netstat for SCTP.
- Support SO_BINDTODEVICE option on incoming packets.
- Fix bug in restart of peeled-off association.
- Add sctp send buffer accounting.
- Implement Sec 2.41 of SCTP Implementers guide.
- Fix SCTP_ASSOCINFO getsockopt for 1-1 style sockets.
- Add sctp receive buffer accounting.

Regards
Sridhar


Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread David S. Miller

From: Herbert Xu <[EMAIL PROTECTED]>
Date: Fri, 2 Sep 2005 08:11:37 +1000

> On Thu, Sep 01, 2005 at 03:06:47PM -0700, David S. Miller wrote:
> >
> > -   if (TCP_SKB_CB(buff)->sacked&TCPCB_LOST) {
> > -   tp->lost_out += tcp_skb_pcount(buff);
> > -   tp->left_out += tcp_skb_pcount(buff);
> > +   tp->packets_out -= diff;
> > +   if (diff > 0) {
> > +   tp->fackets_out -= diff;
> > +   if ((int)tp->fackets_out < 0)
> > +   tp->fackets_out = 0;
> > +   if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
> 
> The TCPCB_LOST stuff should be outside the diff > 0 if block.

Ok.

> > +   tp->lost_out -= diff;
> > +   if ((int)tp->lost_out < 0)
> > +   tp->lost_out = 0;
> 
> These checks aren't necessary.

Are you sure this can't happen if the MSS changes?

Re: [2/2] [TCP] Fix sk_forward_alloc underflow in tcp_sendmsg

2005-09-01 Thread David S. Miller

From: Herbert Xu <[EMAIL PROTECTED]>
Date: Fri, 2 Sep 2005 10:43:40 +1000

> I've finally found a potential cause of the sk_forward_alloc underflows
> that people have been reporting sporadically.
> 
> When tcp_sendmsg tacks on extra bits to an existing TCP_PAGE we don't
> check sk_forward_alloc even though a large amount of time may have
> elapsed since we allocated the page.  In the mean time someone could've
> come along and liberated packets and reclaimed sk_forward_alloc memory.
> 
> This patch makes tcp_sendmsg check sk_forward_alloc every time as we
> do in do_tcp_sendpages.
>  
> Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Ugh, good catch, both patches applied.

I think this is a 2.6.13.1 stable tree candidate, can you forward
this thusly to [EMAIL PROTECTED]  Thanks a lot.

[2/2] [TCP] Fix sk_forward_alloc underflow in tcp_sendmsg

2005-09-01 Thread Herbert Xu

Hi:

I've finally found a potential cause of the sk_forward_alloc underflows
that people have been reporting sporadically.

When tcp_sendmsg tacks on extra bits to an existing TCP_PAGE we don't
check sk_forward_alloc even though a large amount of time may have
elapsed since we allocated the page.  In the mean time someone could've
come along and liberated packets and reclaimed sk_forward_alloc memory.

This patch makes tcp_sendmsg check sk_forward_alloc every time as we
do in do_tcp_sendpages.
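
The accounting rule described above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `toy_sock`, `toy_mem_schedule`, `toy_charge`, `toy_reclaim` and the 4096-byte quantum are all hypothetical, and only the shape of `toy_wmem_schedule` mirrors the sk_stream_wmem_schedule helper from patch 1/2.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_sock {
    int forward_alloc;   /* bytes already reserved for this socket */
    int pool_left;       /* bytes still available in the shared pool */
};

enum { TOY_QUANTUM = 4096 };  /* assumed reservation granularity */

/* Reserve enough whole quanta from the pool to cover size bytes. */
static bool toy_mem_schedule(struct toy_sock *sk, int size)
{
    int need = (size + TOY_QUANTUM - 1) / TOY_QUANTUM * TOY_QUANTUM;

    if (need > sk->pool_left)
        return false;
    sk->pool_left -= need;
    sk->forward_alloc += need;
    return true;
}

/* Same shape as sk_stream_wmem_schedule: cheap check first, then reserve. */
static bool toy_wmem_schedule(struct toy_sock *sk, int size)
{
    return size <= sk->forward_alloc || toy_mem_schedule(sk, size);
}

/* Charge copy bytes against the reservation, checking every time.
 * Returns false where the real code would go and wait for memory. */
static bool toy_charge(struct toy_sock *sk, int copy)
{
    assert(copy >= 0);
    if (!toy_wmem_schedule(sk, copy))
        return false;
    sk->forward_alloc -= copy;
    return true;
}

/* Model of the reservation being reclaimed while the sender was idle. */
static void toy_reclaim(struct toy_sock *sk)
{
    sk->pool_left += sk->forward_alloc;
    sk->forward_alloc = 0;
}
```

In this model, skipping the check in toy_charge after a toy_reclaim would drive forward_alloc negative, which is the underflow the patch is meant to prevent.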
 
Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/net/sock.h b/include/net/sock.h
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1232,9 +1232,8 @@ static inline struct page *sk_stream_all
 {
struct page *page = NULL;
 
-   if (sk_stream_wmem_schedule(sk, PAGE_SIZE))
-   page = alloc_pages(sk->sk_allocation, 0);
-   else {
+   page = alloc_pages(sk->sk_allocation, 0);
+   if (!page) {
sk->sk_prot->enter_memory_pressure();
sk_stream_moderate_sndbuf(sk);
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -769,19 +769,23 @@ new_segment:
if (off == PAGE_SIZE) {
put_page(page);
TCP_PAGE(sk) = page = NULL;
+   TCP_OFF(sk) = off = 0;
}
-   }
+   } else
+   BUG_ON(off);
+
+   if (copy > PAGE_SIZE - off)
+   copy = PAGE_SIZE - off;
+
+   if (!sk_stream_wmem_schedule(sk, copy))
+   goto wait_for_memory;
 
if (!page) {
/* Allocate new cache page. */
if (!(page = sk_stream_alloc_page(sk)))
goto wait_for_memory;
-   off = 0;
}
 
-   if (copy > PAGE_SIZE - off)
-   copy = PAGE_SIZE - off;
-
/* Time to copy data. We are close to
 * the end! */
err = skb_copy_to_page(sk, from, skb, page,

Re: netdevice refcount question for mirred.c

2005-09-01 Thread Patrick McHardy

Ben Greear wrote:
> 
> At about line 132 of mirred.c, there is this code:
> 
> if (parm->ifindex) {
> p->ifindex = parm->ifindex;
> if (ret != ACT_P_CREATED)
> 
> *** It appears that this check could allow over-writing of p->dev below
> without ever calling dev_put on the p->dev.  Should it maybe put the
> p->dev if it is not NULL?

The check is equivalent, when the action is newly created it can't hold
a reference to the device, otherwise it definitely does. I wrote this
code and still had to think about it for a couple of minutes, so I agree
a simple != NULL check would be easier to understand.

> 
> dev_put(p->dev);
> p->dev = dev;
> dev_hold(dev);
> p->ok_push = ok_push;
> }
> 
> 
> Also, earlier in the method it does a __dev_get_by_index(parm->ifindex),
> and continues to use the returned value after that.  Couldn't this lead
> to a reference-after-free, or does external locking prohibit this?

The rtnl semaphore guarantees the device can't go away.

BTW, since you're hunting for leaked device references, the mirred
action should register to the netdev notifier to kill actions holding
device references when the device goes down, no? Jamal?
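
The "simple != NULL check" mentioned above could look roughly like the following user-space sketch. The types and helpers (`toy_dev`, `toy_action`, `toy_dev_hold`, `toy_dev_put`, `toy_action_set_dev`) are hypothetical stand-ins for the kernel ones, and taking the new reference before dropping the old one is a choice made for this example, not something taken from mirred.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; all names are hypothetical. */
struct toy_dev {
    int refcnt;
};

struct toy_action {
    struct toy_dev *dev;    /* NULL until the action holds a device */
};

static void toy_dev_hold(struct toy_dev *d)
{
    d->refcnt++;
}

static void toy_dev_put(struct toy_dev *d)
{
    assert(d->refcnt > 0);
    d->refcnt--;
}

/* Point the action at new_dev, dropping any reference it already holds. */
static void toy_action_set_dev(struct toy_action *a, struct toy_dev *new_dev)
{
    toy_dev_hold(new_dev);      /* hold first: safe even if new_dev == a->dev */
    if (a->dev != NULL)
        toy_dev_put(a->dev);    /* release the previous device, if any */
    a->dev = new_dev;
}
```

The check on a->dev makes the invariant visible at the call site: a freshly created action holds no reference, an existing one always does.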


[1/2] [NET] Add sk_stream_wmem_schedule

2005-09-01 Thread Herbert Xu

Hi Dave:

This patch introduces sk_stream_wmem_schedule as a short-hand for
the sk_forward_alloc checking on egress.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/include/net/sock.h b/include/net/sock.h
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -709,6 +709,12 @@ static inline int sk_stream_rmem_schedul
sk_stream_mem_schedule(sk, skb->truesize, 1);
 }
 
+static inline int sk_stream_wmem_schedule(struct sock *sk, int size)
+{
+   return size <= sk->sk_forward_alloc ||
+  sk_stream_mem_schedule(sk, size, 0);
+}
+
 /* Used by processes to "lock" a socket state, so that
  * interrupts and bottom half handlers won't change it
  * from under us. It essentially blocks any incoming
@@ -1203,8 +1209,7 @@ static inline struct sk_buff *sk_stream_
skb = alloc_skb_fclone(size + hdr_len, gfp);
if (skb) {
skb->truesize += mem;
-   if (sk->sk_forward_alloc >= (int)skb->truesize ||
-   sk_stream_mem_schedule(sk, skb->truesize, 0)) {
+   if (sk_stream_wmem_schedule(sk, skb->truesize)) {
skb_reserve(skb, hdr_len);
return skb;
}
@@ -1227,8 +1232,7 @@ static inline struct page *sk_stream_all
 {
struct page *page = NULL;
 
-   if (sk->sk_forward_alloc >= (int)PAGE_SIZE ||
-   sk_stream_mem_schedule(sk, PAGE_SIZE, 0))
+   if (sk_stream_wmem_schedule(sk, PAGE_SIZE))
page = alloc_pages(sk->sk_allocation, 0);
else {
sk->sk_prot->enter_memory_pressure();
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -552,8 +552,7 @@ new_segment:
tcp_mark_push(tp, skb);
goto new_segment;
}
-   if (sk->sk_forward_alloc < copy &&
-   !sk_stream_mem_schedule(sk, copy, 0))
+   if (!sk_stream_wmem_schedule(sk, copy))
goto wait_for_memory;

if (can_coalesce) {

[PATCH 8/8] orinoco: New driver - spectrum_cs.

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree dee4f325520d4ea29397dd67ca657b7235bb1790 (from c88faac230cc9775445e5c644991c352e35c72a1)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 17:46:39 2005 -0400

New driver - spectrum_cs.

Driver for 802.11b cards using RAM-loadable Symbol firmware, such as
Symbol Wireless Networker LA4100, CompactFlash cards by Socket
Communications and Intel PRO/Wireless 2011B.

The driver implements Symbol firmware download.  The rest is handled
in hermes.c and orinoco.c.

Utilities for downloading the Symbol firmware are available at
http://sourceforge.net/projects/orinoco/

diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -269,10 +269,23 @@ config PCMCIA_HERMES
 
  You will also very likely also need the Wireless Tools in order to
  configure your card and that /etc/pcmcia/wireless.opts works:
  .
 
+config PCMCIA_SPECTRUM
+   tristate "Symbol Spectrum24 Trilogy PCMCIA card support"
+   depends on NET_RADIO && PCMCIA && HERMES
+   ---help---
+
+ This is a driver for 802.11b cards using RAM-loadable Symbol
+ firmware, such as Symbol Wireless Networker LA4100, CompactFlash
+ cards by Socket Communications and Intel PRO/Wireless 2011B.
+
+ This driver requires firmware download on startup.  Utilities
+ for downloading Symbol firmware are available at
+ 
+
 config AIRO_CS
tristate "Cisco/Aironet 34X/35X/4500/4800 PCMCIA cards"
depends on NET_RADIO && PCMCIA && (BROKEN || !M32R)
---help---
  This is the standard Linux driver to support Cisco/Aironet PCMCIA
diff --git a/drivers/net/wireless/Makefile b/drivers/net/wireless/Makefile
--- a/drivers/net/wireless/Makefile
+++ b/drivers/net/wireless/Makefile
@@ -17,10 +17,11 @@ obj-$(CONFIG_PCMCIA_HERMES) += orinoco_c
 obj-$(CONFIG_APPLE_AIRPORT)+= airport.o
 obj-$(CONFIG_PLX_HERMES)   += orinoco_plx.o
 obj-$(CONFIG_PCI_HERMES)   += orinoco_pci.o
 obj-$(CONFIG_TMD_HERMES)   += orinoco_tmd.o
 obj-$(CONFIG_NORTEL_HERMES)+= orinoco_nortel.o
+obj-$(CONFIG_PCMCIA_SPECTRUM)  += spectrum_cs.o
 
 obj-$(CONFIG_AIRO) += airo.o
 obj-$(CONFIG_AIRO_CS)  += airo_cs.o airo.o
 
 obj-$(CONFIG_ATMEL) += atmel.o
diff --git a/drivers/net/wireless/spectrum_cs.c b/drivers/net/wireless/spectrum_cs.c
new file mode 100644
--- /dev/null
+++ b/drivers/net/wireless/spectrum_cs.c
@@ -0,0 +1,1120 @@
+/*
+ * Driver for 802.11b cards using RAM-loadable Symbol firmware, such as
+ * Symbol Wireless Networker LA4100, CompactFlash cards by Socket
+ * Communications and Intel PRO/Wireless 2011B.
+ *
+ * The driver implements Symbol firmware download.  The rest is handled
+ * in hermes.c and orinoco.c.
+ *
+ * Utilities for downloading the Symbol firmware are available at
+ * http://sourceforge.net/projects/orinoco/
+ *
+ * Copyright (C) 2002-2005 Pavel Roskin <[EMAIL PROTECTED]>
+ * Portions based on orinoco_cs.c:
+ * Copyright (C) David Gibson, Linuxcare Australia
+ * Portions based on Spectrum24tDnld.c from original spectrum24 driver:
+ * Copyright (C) Symbol Technologies.
+ *
+ * See copyright notice in file orinoco.c.
+ */
+
+#define DRIVER_NAME "spectrum_cs"
+#define PFX DRIVER_NAME ": "
+
+#include 
+#ifdef  __IN_PCMCIA_PACKAGE__
+#include 
+#endif /* __IN_PCMCIA_PACKAGE__ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "orinoco.h"
+
+/*
+ * If SPECTRUM_FW_INCLUDED is defined, the firmware is hardcoded into
+ * the driver.  Use get_symbol_fw script to generate spectrum_fw.h and
+ * copy it to the same directory as spectrum_cs.c.
+ *
+ * If SPECTRUM_FW_INCLUDED is not defined, the firmware is loaded at the
+ * runtime using hotplug.  Use the same get_symbol_fw script to generate
+ * files symbol_sp24t_prim_fw symbol_sp24t_sec_fw, copy them to the
+ * hotplug firmware directory (typically /usr/lib/hotplug/firmware) and
+ * make sure that you have hotplug installed and enabled in the kernel.
+ */
+/* #define SPECTRUM_FW_INCLUDED 1 */
+
+#ifdef SPECTRUM_FW_INCLUDED
+/* Header with the firmware */
+#include "spectrum_fw.h"
+#else  /* !SPECTRUM_FW_INCLUDED */
+#include 
+static unsigned char *primsym;
+static unsigned char *secsym;
+static const char primary_fw_name[] = "symbol_sp24t_prim_fw";
+static const char secondary_fw_name[] = "symbol_sp24t_sec_fw";
+#endif /* !SPECTRUM_FW_INCLUDED */
+
+//
+/* Module stuff*/
+/*

[PATCH 7/8] orinoco: New driver - orinoco_nortel.

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree dce61aef99ceb57370b70222dc34d788666c0ac3 (from ceb6695092be8dcdfe2dec6ee5097d613011489d)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 15:50:55 2005 -0400

New driver - orinoco_nortel.

This is a driver for Nortel emobility PCI adaptors, which consist of an
Orinoco compatible PCMCIA card and a simple PCI-to-PCMCIA bridge.  The
driver initializes the device and uses Orinoco core driver for actual
wireless networking.

diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -203,10 +203,19 @@ config TMD_HERMES
  orinoco) driver when used in TMD7160 based PCI adaptors.  These
  adaptors are not a full PCMCIA controller but act as a more limited
  PCI <-> PCMCIA bridge.  Several vendors sell such adaptors so that
  802.11b PCMCIA cards can be used in desktop machines.
 
+config NORTEL_HERMES
+   tristate "Nortel emobility PCI adaptor support"
+   depends on PCI && HERMES
+   help
+ Enable support for PCMCIA cards supported by the "Hermes" (aka
+ orinoco) driver when used in Nortel emobility PCI adaptors.  These
+ adaptors are not full PCMCIA controllers, but act as a more limited
+ PCI <-> PCMCIA bridge.
+
 config PCI_HERMES
tristate "Prism 2.5 PCI 802.11b adaptor support"
depends on PCI && HERMES
help
  Enable support for PCI and mini-PCI 802.11b wireless NICs based on
diff --git a/drivers/net/wireless/Makefile b/drivers/net/wireless/Makefile
--- a/drivers/net/wireless/Makefile
+++ b/drivers/net/wireless/Makefile
@@ -16,10 +16,11 @@ obj-$(CONFIG_HERMES)+= orinoco.o herme
 obj-$(CONFIG_PCMCIA_HERMES)+= orinoco_cs.o
 obj-$(CONFIG_APPLE_AIRPORT)+= airport.o
 obj-$(CONFIG_PLX_HERMES)   += orinoco_plx.o
 obj-$(CONFIG_PCI_HERMES)   += orinoco_pci.o
 obj-$(CONFIG_TMD_HERMES)   += orinoco_tmd.o
+obj-$(CONFIG_NORTEL_HERMES)+= orinoco_nortel.o
 
 obj-$(CONFIG_AIRO) += airo.o
 obj-$(CONFIG_AIRO_CS)  += airo_cs.o airo.o
 
 obj-$(CONFIG_ATMEL) += atmel.o
diff --git a/drivers/net/wireless/orinoco_nortel.c b/drivers/net/wireless/orinoco_nortel.c
new file mode 100644
--- /dev/null
+++ b/drivers/net/wireless/orinoco_nortel.c
@@ -0,0 +1,324 @@
+/* orinoco_nortel.c
+ * 
+ * Driver for Prism II devices which would usually be driven by orinoco_cs,
+ * but are connected to the PCI bus by a Nortel PCI-PCMCIA-Adapter. 
+ *
+ * Copyright (C) 2002 Tobias Hoffmann
+ *   (C) 2003 Christoph Jungegger <[EMAIL PROTECTED]>
+ *
+ * Some of this code is borrowed from orinoco_plx.c
+ * Copyright (C) 2001 Daniel Barlow
+ * Some of this code is borrowed from orinoco_pci.c 
+ *  Copyright (C) 2001 Jean Tourrilhes
+ * Some of this code is "inspired" by linux-wlan-ng-0.1.10, but nothing
+ * has been copied from it. linux-wlan-ng-0.1.10 is originally :
+ * Copyright (C) 1999 AbsoluteValue Systems, Inc.  All Rights Reserved.
+ * 
+ * The contents of this file are subject to the Mozilla Public License
+ * Version 1.1 (the "License"); you may not use this file except in
+ * compliance with the License. You may obtain a copy of the License
+ * at http://www.mozilla.org/MPL/
+ *
+ * Software distributed under the License is distributed on an "AS IS"
+ * basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
+ * the License for the specific language governing rights and
+ * limitations under the License.
+ *
+ * Alternatively, the contents of this file may be used under the
+ * terms of the GNU General Public License version 2 (the "GPL"), in
+ * which case the provisions of the GPL are applicable instead of the
+ * above.  If you wish to allow the use of your version of this file
+ * only under the terms of the GPL and not to allow others to use your
+ * version of this file under the MPL, indicate your decision by
+ * deleting the provisions above and replace them with the notice and
+ * other provisions required by the GPL.  If you do not delete the
+ * provisions above, a recipient may use your version of this file
+ * under either the MPL or the GPL.
+ */
+
+#define DRIVER_NAME "orinoco_nortel"
+#define PFX DRIVER_NAME ": "
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "hermes.h"
+#include "orinoco.h"
+
+#define COR_OFFSET    (0xe0)   /* COR attribute offset of Prism2 PC card */
+#define COR_VALUE (COR_LEVEL_REQ | COR_FUNC_ENA)   /* Enable PC card with interrupt in level trigger */
+
+
+/* Nortel specific data */
+struct nortel_pci_card {
+   unsigned long iobase1;
+   unsigned long iobase2;
+};
+
+/*
+ * Do a soft reset of the PCI card using the Configuration

[PATCH 6/8] orinoco: Remove EXPERIMENTAL mark from PLX_HERMES, TMD_HERMES and PCI_HERMES.

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree ceb6695092be8dcdfe2dec6ee5097d613011489d (from 6b39374a27eb4be7e9d82145ae270ba02ea90dc8)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 14:50:10 2005 -0400

Remove EXPERIMENTAL mark from PLX_HERMES, TMD_HERMES and PCI_HERMES.

Those drivers have been used for a long time, and there have been very
few problem reports.

diff --git a/drivers/net/wireless/Kconfig b/drivers/net/wireless/Kconfig
--- a/drivers/net/wireless/Kconfig
+++ b/drivers/net/wireless/Kconfig
@@ -183,39 +183,33 @@ config APPLE_AIRPORT
  built into the Macintosh iBook and other recent PowerPC-based
  Macintosh machines. This is essentially a Lucent Orinoco card with 
  a non-standard interface
 
 config PLX_HERMES
-   tristate "Hermes in PLX9052 based PCI adaptor support (Netgear MA301 etc.) (EXPERIMENTAL)"
-   depends on PCI && HERMES && EXPERIMENTAL
+   tristate "Hermes in PLX9052 based PCI adaptor support (Netgear MA301 etc.)"
+   depends on PCI && HERMES
help
  Enable support for PCMCIA cards supported by the "Hermes" (aka
  orinoco) driver when used in PLX9052 based PCI adaptors.  These
  adaptors are not a full PCMCIA controller but act as a more limited
  PCI <-> PCMCIA bridge.  Several vendors sell such adaptors so that
  802.11b PCMCIA cards can be used in desktop machines.  The Netgear
  MA301 is such an adaptor.
 
- Support for these adaptors is so far still incomplete and buggy.
- You have been warned.
-
 config TMD_HERMES
-   tristate "Hermes in TMD7160 based PCI adaptor support (EXPERIMENTAL)"
-   depends on PCI && HERMES && EXPERIMENTAL
+   tristate "Hermes in TMD7160 based PCI adaptor support"
+   depends on PCI && HERMES
help
  Enable support for PCMCIA cards supported by the "Hermes" (aka
  orinoco) driver when used in TMD7160 based PCI adaptors.  These
  adaptors are not a full PCMCIA controller but act as a more limited
  PCI <-> PCMCIA bridge.  Several vendors sell such adaptors so that
  802.11b PCMCIA cards can be used in desktop machines.
 
- Support for these adaptors is so far still incomplete and buggy.
- You have been warned.
-
 config PCI_HERMES
-   tristate "Prism 2.5 PCI 802.11b adaptor support (EXPERIMENTAL)"
-   depends on PCI && HERMES && EXPERIMENTAL
+   tristate "Prism 2.5 PCI 802.11b adaptor support"
+   depends on PCI && HERMES
help
  Enable support for PCI and mini-PCI 802.11b wireless NICs based on
  the Prism 2.5 chipset.  These are true PCI cards, not the 802.11b
  PCMCIA cards bundled with PCI<->PCMCIA adaptors which are also
  common.  Some of the built-in wireless adaptors in laptops are of


-- 
Regards,
Pavel Roskin


[PATCH 5/8] orinoco: Optimize orinoco_join_ap()

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree cb289b9f9b2a0f3ae7070a008f22e383b37526ee (from 56bfcdb38b3d04c1f8c1fd705e411f4be53b663c)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 19:05:16 2005 -0400

Optimize orinoco_join_ap() - break from loop once the requested
BSSID is found.
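
The control flow this patch moves to (a flag plus break instead of a goto into the success path) can be sketched on its own. The record layout below is a simplified, hypothetical stand-in; the real prism2_scan_apinfo has more fields, and `toy_find_ap` is not a function in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_ETH_ALEN = 6 };

/* Hypothetical, simplified scan record. */
struct toy_apinfo {
    unsigned short channel;
    unsigned char bssid[TOY_ETH_ALEN];
};

/* Return the first record whose BSSID matches, or NULL if none does. */
static const struct toy_apinfo *toy_find_ap(const struct toy_apinfo *aps,
                                            size_t n,
                                            const unsigned char *bssid)
{
    const struct toy_apinfo *found = NULL;
    size_t i;

    assert(aps != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (memcmp(aps[i].bssid, bssid, TOY_ETH_ALEN) == 0) {
            found = &aps[i];
            break;              /* stop scanning once the match is found */
        }
    }
    return found;
}
```

A NULL result corresponds to the "Requested AP not found in scan results" branch in the patch.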

diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -1049,12 +1049,13 @@ static void orinoco_join_ap(struct net_d
struct join_req {
u8 bssid[ETH_ALEN];
u16 channel;
} __attribute__ ((packed)) req;
const int atom_len = offsetof(struct prism2_scan_apinfo, atim);
-   struct prism2_scan_apinfo *atom;
+   struct prism2_scan_apinfo *atom = NULL;
int offset = 4;
+   int found = 0;
u8 *buf;
u16 len;
 
/* Allocate buffer for scan results */
buf = kmalloc(MAX_SCAN_LEN, GFP_KERNEL);
@@ -1085,19 +1086,22 @@ static void orinoco_join_ap(struct net_d
 
/* Go through the scan results looking for the channel of the AP
 * we were requested to join */
for (; offset + atom_len <= len; offset += atom_len) {
atom = (struct prism2_scan_apinfo *) (buf + offset);
-   if (memcmp(&atom->bssid, priv->desired_bssid, ETH_ALEN) == 0)
-   goto found;
+   if (memcmp(&atom->bssid, priv->desired_bssid, ETH_ALEN) == 0) {
+   found = 1;
+   break;
+   }
}
 
-   DEBUG(1, "%s: Requested AP not found in scan results\n",
- dev->name);
-   goto out;
+   if (! found) {
+   DEBUG(1, "%s: Requested AP not found in scan results\n",
+ dev->name);
+   goto out;
+   }
 
- found:
memcpy(req.bssid, priv->desired_bssid, ETH_ALEN);
req.channel = atom->channel;/* both are little-endian */
err = HERMES_WRITE_RECORD(hw, USER_BAP, HERMES_RID_CNFJOINREQUEST,
  &req);
if (err)


-- 
Regards,
Pavel Roskin


[PATCH 4/8] orinoco: Fix memory leak on error in processing hostscan frames.

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree ca955293cdfd3139e150d3b4fed3922a7eb651fb (from cb289b9f9b2a0f3ae7070a008f22e383b37526ee)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 19:08:00 2005 -0400

Fix memory leak on error in processing hostscan frames.
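
The pattern being fixed is an early exit that skips the free. A minimal user-space sketch follows, with hypothetical names throughout (`toy_alloc`, `toy_free`, `toy_read`, `toy_fetch_scan`) and a counter standing in for a leak checker; it is not the driver code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int toy_live_buffers;    /* buffers allocated and not yet freed */

static void *toy_alloc(size_t len)
{
    void *p = malloc(len);

    if (p != NULL)
        toy_live_buffers++;
    return p;
}

static void toy_free(void *p)
{
    if (p != NULL) {
        toy_live_buffers--;
        free(p);
    }
}

/* Hypothetical reader: fills buf, or fails when fail is nonzero. */
static int toy_read(unsigned char *buf, size_t len, int fail)
{
    if (fail)
        return -1;
    memset(buf, 0, len);
    return 0;
}

/* Allocate a buffer and read into it; on success the caller owns it. */
static unsigned char *toy_fetch_scan(size_t len, int fail)
{
    unsigned char *buf = toy_alloc(len);

    if (buf == NULL)
        return NULL;
    if (toy_read(buf, len, fail) != 0) {
        toy_free(buf);          /* the fix: release before bailing out */
        return NULL;
    }
    return buf;
}
```

Without the toy_free call on the error path, every failed read would leave toy_live_buffers one higher than it should be.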

diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -1284,12 +1284,14 @@ static void __orinoco_ev_info(struct net
break;
 
/* Read scan data */
err = hermes_bap_pread(hw, IRQ_BAP, (void *) buf, len,
   infofid, sizeof(info));
-   if (err)
+   if (err) {
+   kfree(buf);
break;
+   }
 
 #ifdef ORINOCO_DEBUG
{
int i;
printk(KERN_DEBUG "Scan result [%02X", buf[0]);


-- 
Regards,
Pavel Roskin


[PATCH 3/8] orinoco: Remove entry for Intel PRO/Wireless 2011B.

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree c88faac230cc9775445e5c644991c352e35c72a1 (from dce61aef99ceb57370b70222dc34d788666c0ac3)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 17:09:45 2005 -0400

Remove entry for Intel PRO/Wireless 2011B.

It is not supported by this driver because it has no firmware in
flash.  spectrum_cs is needed for this device.

diff --git a/drivers/net/wireless/orinoco_cs.c b/drivers/net/wireless/orinoco_cs.c
--- a/drivers/net/wireless/orinoco_cs.c
+++ b/drivers/net/wireless/orinoco_cs.c
@@ -602,11 +602,10 @@ static char version[] __initdata = DRIVE
" (David Gibson <[EMAIL PROTECTED]>, "
"Pavel Roskin <[EMAIL PROTECTED]>, et al)";
 
 static struct pcmcia_device_id orinoco_cs_ids[] = {
PCMCIA_DEVICE_MANF_CARD(0x000b, 0x7300),
-   PCMCIA_DEVICE_MANF_CARD(0x0089, 0x0001),
PCMCIA_DEVICE_MANF_CARD(0x0138, 0x0002),
PCMCIA_DEVICE_MANF_CARD(0x0156, 0x0002),
PCMCIA_DEVICE_MANF_CARD(0x01eb, 0x080a),
PCMCIA_DEVICE_MANF_CARD(0x0261, 0x0002),
PCMCIA_DEVICE_MANF_CARD(0x0268, 0x0001),


-- 
Regards,
Pavel Roskin


[PATCH 2/8] orinoco: Change orinoco_translate_scan() to return error code on error.

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree 8fc038ec51acf5f777fade80c5e38112b766aeee (from ca955293cdfd3139e150d3b4fed3922a7eb651fb)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 19:10:12 2005 -0400

Change orinoco_translate_scan() to return error code on error.
Adjust the caller to check for errors and clean up if needed.
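
The convention adopted here (non-negative length on success, negative errno on failure, with the caller checking the sign) can be illustrated with a small stand-alone sketch. The functions and parameters below are hypothetical, and the toy translator returns an atom count rather than a message length in bytes.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical translator: returns the number of whole atoms in the
 * scan data, or a negative errno value if the data is malformed. */
static int toy_translate_scan(int scan_len, int offset, int atom_len,
                              int min_atom_len)
{
    assert(min_atom_len > 0 && scan_len >= offset);

    if (atom_len < min_atom_len)
        return -EIO;            /* atom too short to be valid */
    if ((scan_len - offset) % atom_len)
        return -EIO;            /* not a whole number of atoms */
    return (scan_len - offset) / atom_len;
}

/* Hypothetical caller: propagates errors and only stores the length
 * when translation succeeded. */
static int toy_getscan(int scan_len, int offset, int atom_len,
                       int min_atom_len, int *length_out)
{
    int ret = toy_translate_scan(scan_len, offset, atom_len, min_atom_len);

    if (ret < 0)
        return ret;             /* the real caller also frees scan_result */
    *length_out = ret;
    return 0;
}
```

Returning 0 for both "no results" and "error", as the old code did, would leave the caller unable to tell the two apart.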

diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -4023,11 +4023,12 @@ static int orinoco_ioctl_setscan(struct 
orinoco_unlock(priv, &flags);
return err;
 }
 
 /* Translate scan data returned from the card to a card independant
- * format that the Wireless Tools will understand - Jean II */
+ * format that the Wireless Tools will understand - Jean II
+ * Return message length or -errno for fatal errors */
 static inline int orinoco_translate_scan(struct net_device *dev,
 char *buffer,
 char *scan,
 int scan_len)
 {
@@ -4063,25 +4064,31 @@ static inline int orinoco_translate_scan
atom_len = 68;
offset = 0;
break;
case FIRMWARE_TYPE_INTERSIL:
offset = 4;
-   if (priv->has_hostscan)
-   atom_len = scan[0] + (scan[1] << 8);
-   else
+   if (priv->has_hostscan) {
+   atom_len = le16_to_cpup((u16 *)scan);
+   /* Sanity check for atom_len */
+   if (atom_len < sizeof(struct prism2_scan_apinfo)) {
+   printk(KERN_ERR "%s: Invalid atom_len in scan data: %d\n",
+   dev->name, atom_len);
+   return -EIO;
+   }
+   } else
atom_len = offsetof(struct prism2_scan_apinfo, atim);
break;
default:
-   return 0;
+   return -EOPNOTSUPP;
}
 
/* Check that we got an whole number of atoms */
if ((scan_len - offset) % atom_len) {
printk(KERN_ERR "%s: Unexpected scan data length %d, "
   "atom_len %d, offset %d\n", dev->name, scan_len,
   atom_len, offset);
-   return 0;
+   return -EIO;
}
 
/* Read the entries one by one */
for (; offset + atom_len <= scan_len; offset += atom_len) {
/* Get next atom */
@@ -4212,37 +4219,45 @@ static int orinoco_ioctl_getscan(struct 
err = -ENODATA;
} else {
/* We have some results to push back to user space */
 
/* Translate to WE format */
-   srq->length = orinoco_translate_scan(dev, extra,
-priv->scan_result,
-priv->scan_len);
-
-   /* Return flags */
-   srq->flags = (__u16) priv->scan_mode;
-
-   /* Results are here, so scan no longer in progress */
-   priv->scan_inprogress = 0;
-
-   /* In any case, Scan results will be cleaned up in the
-* reset function and when exiting the driver.
-* The person triggering the scanning may never come to
-* pick the results, so we need to do it in those places.
-* Jean II */
+   int ret = orinoco_translate_scan(dev, extra,
+priv->scan_result,
+priv->scan_len);
+
+   if (ret < 0) {
+   err = ret;
+   kfree(priv->scan_result);
+   priv->scan_result = NULL;
+   } else {
+   srq->length = ret;
+
+   /* Return flags */
+   srq->flags = (__u16) priv->scan_mode;
+
+   /* In any case, Scan results will be cleaned up in the
+* reset function and when exiting the driver.
+* The person triggering the scanning may never come to
+* pick the results, so we need to do it in those places.
+* Jean II */
 
 #ifdef SCAN_SINGLE_READ
-   /* If you enable this option, only one client (the first
-* one) will be able to read the result (and only one
-* time). If there is multiple concurent clients that
-* want to read scan results, this behavior is not
-* advisable - Jean II */
-   kfree(priv->scan_result);
-   priv->scan_result = NULL;
+   /* If you enable this option, only one client (the f

[PATCH 1/8] orinoco: Stop using "ieee802_11.h".

2005-09-01 Thread Pavel Roskin

Signed-off-by: Pavel Roskin <[EMAIL PROTECTED]>

diff-tree 56bfcdb38b3d04c1f8c1fd705e411f4be53b663c (from dee4f325520d4ea29397dd67ca657b7235bb1790)
Author: Pavel Roskin <[EMAIL PROTECTED]>
Date:   Thu Sep 1 18:15:07 2005 -0400

Stop using "ieee802_11.h".

Use equivalent constants from 

diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -99,11 +99,10 @@
 #include 
 
 #include "hermes.h"
 #include "hermes_rid.h"
 #include "orinoco.h"
-#include "ieee802_11.h"
 
 //
 /* Module information   */
 //
 
@@ -148,11 +147,11 @@ MODULE_PARM_DESC(force_monitor, "Allow m
 /* 802.2 LLC/SNAP header used for Ethernet encapsulation over 802.11 */
 static const u8 encaps_hdr[] = {0xaa, 0xaa, 0x03, 0x00, 0x00, 0x00};
 #define ENCAPS_OVERHEAD        (sizeof(encaps_hdr) + 2)
 
 #define ORINOCO_MIN_MTU        256
-#define ORINOCO_MAX_MTU        (IEEE802_11_DATA_LEN - ENCAPS_OVERHEAD)
+#define ORINOCO_MAX_MTU        (IEEE80211_DATA_LEN - ENCAPS_OVERHEAD)
 
 #define SYMBOL_MAX_VER_LEN (14)
 #define USER_BAP   0
 #define IRQ_BAP    1
 #define MAX_IRQLOOPS_PER_IRQ   10
@@ -440,11 +439,11 @@ static int orinoco_change_mtu(struct net
struct orinoco_private *priv = netdev_priv(dev);
 
if ( (new_mtu < ORINOCO_MIN_MTU) || (new_mtu > ORINOCO_MAX_MTU) )
return -EINVAL;
 
-   if ( (new_mtu + ENCAPS_OVERHEAD + IEEE802_11_HLEN) >
+   if ( (new_mtu + ENCAPS_OVERHEAD + IEEE80211_HLEN) >
 (priv->nicbuf_size - ETH_HLEN) )
return -EINVAL;
 
dev->mtu = new_mtu;
 
@@ -916,11 +915,11 @@ static void __orinoco_ev_rx(struct net_d
/* At least on Symbol firmware with PCF we get quite a
lot of these legitimately - Poll frames with no
data. */
return;
}
-   if (length > IEEE802_11_DATA_LEN) {
+   if (length > IEEE80211_DATA_LEN) {
printk(KERN_WARNING "%s: Oversized frame received (%d bytes)\n",
   dev->name, length);
stats->rx_length_errors++;
goto update_stats;
}
@@ -2270,11 +2269,11 @@ static int orinoco_init(struct net_devic
 
TRACE_ENTER(dev->name);
 
/* No need to lock, the hw_unavailable flag is already set in
 * alloc_orinocodev() */
-   priv->nicbuf_size = IEEE802_11_FRAME_LEN + ETH_HLEN;
+   priv->nicbuf_size = IEEE80211_FRAME_LEN + ETH_HLEN;
 
/* Initialize the firmware */
err = orinoco_reinit_firmware(dev);
if (err != 0) {
printk(KERN_ERR "%s: failed to initialize firmware (err = %d)\n",


-- 
Regards,
Pavel Roskin


Re: Possible BUG in IPv4 TCP window handling, all recent 2.4.x/2.6.x kernels

2005-09-01 Thread Jesper Juhl

On 9/2/05, Ion Badulescu <[EMAIL PROTECTED]> wrote:
> Hi David,
> 
> On Thu, 1 Sep 2005, David S. Miller wrote:
> 
> > Thanks for the empty posting.  Please provide the content you
> > intended to post, and furthermore please post it to the network
> > developer mailing list, netdev@vger.kernel.org
> 
> First of all, thanks for the reply (even to an empty posting :).
> 
> The posting wasn't actually empty, it was probably too long (94K according

Two solutions are commonly applied to that problem:

 - put the big file(s) online somewhere and include a URL in the email
 - compress the file(s) and attach the compressed files to the email

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html

Re: Possible BUG in IPv4 TCP window handling, all recent 2.4.x/2.6.x kernels

2005-09-01 Thread Ion Badulescu


Hi David,

On Thu, 1 Sep 2005, David S. Miller wrote:


> Thanks for the empty posting.  Please provide the content you
> intended to post, and furthermore please post it to the network
> developer mailing list, netdev@vger.kernel.org


First of all, thanks for the reply (even to an empty posting :).

The posting wasn't actually empty, it was probably too long (94K according 
to my sent-mail folder) and majordomo truncated it to zero. It has some 
tcpdump snippets, that's what made it so long... unfortunately, they're 
all necessary to understand the nature of the bug. I wasn't sure about 
netdev, that's why I posted it only to linux-kernel and linux-net.


I can provide the full tcpdump out-of-band to interested people, since I 
don't think I can get it past majordomo.


Here is the text of the message without the tcpdump inserts:

---
Hello,

I've been tracking down this bug for some time, and I'm fairly convinced 
at this point that it's a kernel bug.


Under certain conditions, the TCP stack starts shrinking the TCP window 
down to some ridiculously low values (hundreds of bytes, as low as 181) 
and never recovers. The certain conditions I mentioned are not well 
understood at this point, but they include a long-lived connection with a 
very one-sided, fluctuating traffic flowing through it.


So far I've been able to reproduce it on plain-vanilla 2.4.9, 2.4.11.9, 
and 2.4.12.2, as well as on the RHEL3 kernels 2.4.21-20 and 2.4.21-31. The 
hardware is dual Opteron 250, running both 32- and 64-bit SMP kernels 
(seems to make no difference). I've also seen the bug occur on a single 
Athlon XP running 2.6.11.9 UP.


The bug occurs with all sysctl settings at their default values. I've 
tried enabling and disabling pretty much all the tcp-related sysctl's in 
/proc/sys/net/ipv4, to no visible improvement.


Here are a few tcpdump snippets of a TCP connection exhibiting the bug 
(the complete tcpdump is available upon request, but it's very large). 
10.2.20.246 is the data receiver and is the box exhibiting the bug (I'm 
not sure what 10.2.224.182 is running, I don't have access to it). The 
data being sent through is real-time financial data; the session begins by 
catching up (at line speed) to present time, then continues to receive 
real-time data as it is being generated. For what it's worth, we've never 
seen the bug occur while the session is still catching up (and 
receiving a few large packets at a time); it always seems to happen while 
receiving real-time data (many small packets, variably interspaced).


[I apologize for the amount of tcpdump data, but it's the only way to show 
the bug in action.]


[tcpdump output removed]

The connection is established and the receiver's TCP window quickly ramps 
up to 8192.


[tcpdump output removed]

Shortly thereafter the TCP window increases further to 16534. It remains 
around 16534 for the next 5 minutes or so.


[tcpdump output removed]

A few minutes later it has finally caught up to present time and it starts 
receiving smaller packets containing real-time data. The TCP window is 
still 16534 at this point.


[tcpdump output removed]

This is where things start going bad. The window starts shrinking from 
15340 all the way down to 2355 over the course of 0.3 seconds. Notice the 
many duplicate acks that serve no purpose (there are no lost packets and 
the tcpdump is taken on the receiver so there are no packets/acks crossed
in flight).


[tcpdump output removed]

Five minutes later the TCP window is still at 2355, having never 
recovered. The window is so small that the available bandwidth for this 
connection is too small to keep up with the real-time data so it is 
falling behind, hence large packets are again being used. The application 
processing the data (Java-based) is mostly idle at this point, and netstat 
shows its recv queue to be empty. There is no apparent reason why the 
kernel shouldn't enlarge the window.


In fact, if I let it continue, it eventually shrinks the window even 
further (by 18:19:29, the time I'm writing this email, it's gone all the 
way down to 1373). As I mentioned earlier, I've seen it go as low as 181.


We are kind of stumped at this point, and it's proving to be a 
show-stopping bug for our purposes, especially over WAN links that have 
higher latency (for obvious reasons). Any kind of assistance would be 
greatly appreciated.
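
The "obvious reasons" can be made concrete: a sender can keep at most one
receive window of unacknowledged data in flight per round trip, so throughput
is bounded by window / RTT. The sketch below is only an illustration of that
arithmetic; the helper names and the 50 ms RTT used in the note afterwards are
assumptions, not measurements from this report.

```c
#include <stdio.h>

/* Upper bound on TCP throughput in bytes per second: at most one
 * receive window of data can be in flight per round trip.
 * Illustrative helper, not part of any kernel API. */
static double max_throughput_bps(unsigned int window_bytes, double rtt_seconds)
{
	return (double)window_bytes / rtt_seconds;
}

/* Print the bound for a given window and RTT. */
static void print_bound(unsigned int window_bytes, double rtt_seconds)
{
	printf("window=%u rtt=%.3fs -> at most %.0f bytes/s\n",
	       window_bytes, rtt_seconds,
	       max_throughput_bps(window_bytes, rtt_seconds));
}
```

With an assumed 50 ms RTT, the healthy 16534-byte window allows roughly
330 KB/s, while the collapsed 2355-byte window caps the connection at
about 47 KB/s, and the bound shrinks further as latency grows.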


Thanks,
-Ion
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Repair Incoming Interface Handling for Raw Socket.

2005-09-01 Thread David S. Miller

From: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>
Date: Thu, 01 Sep 2005 20:51:14 +0900 (JST)

> This patch fixes the issue by using appropriate incoming interface,
> in the sense of scoping architecture.
> 
> Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>

Applied, thanks.

Re: netdevice refcount question for mirred.c

2005-09-01 Thread David S. Miller

From: Ben Greear <[EMAIL PROTECTED]>
Date: Thu, 01 Sep 2005 15:39:31 -0700

> Also, earlier in the method it does a __dev_get_by_index(parm->ifindex),
> and continues to use the returned value after that.  Couldn't this lead
> to a reference-after-free, or does external locking prohibit this?

Probably the RTNL semaphore helps ensure this is OK.

Re: Perf problem with qdisc ? dev_queue_xmit_nit() can be called many times for the same packet

2005-09-01 Thread David S. Miller

From: "Michael Chan" <[EMAIL PROTECTED]>
Date: Mon, 29 Aug 2005 13:29:50 -0700

> On Sat, 2005-08-27 at 22:38 +0200, Eric Dumazet wrote:
> 
> > 
> > - [TG3] : tx_lock spinlock is taken in tg3_tx() only when really needed.
> > 
> 
> This is similar to your tx_lock patch for tg3 but takes it one step
> further to eliminate the tx_lock in the tx_completion path when the tx
> queue is not stopped.
> 
> Signed-off-by: Michael Chan <[EMAIL PROTECTED]>

This looks really nice, patch applied.

Thanks everyone.

Re: [PATCH] Fix typo (memcpy length) of CLUSTERIP target

2005-09-01 Thread David S. Miller

From: Harald Welte <[EMAIL PROTECTED]>
Date: Tue, 30 Aug 2005 16:11:19 +0200

> Hi Dave, Please apply.

Applied, thanks.

Re: [PATCH] Fix netpoll bug in Sun GEM Ether driver

2005-09-01 Thread David S. Miller

From: Geoff Levand <[EMAIL PROTECTED]>
Date: Mon, 29 Aug 2005 15:04:24 -0700

> Sure, your fix works, and seems to be the best way to do it.
> 
> Signed-off-by: Geoff Levand <[EMAIL PROTECTED]>

Applied, thanks everyone.

[git patches] 2.6.x net driver updates

2005-09-01 Thread Jeff Garzik


Please pull from 'upstream' branch of
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git

to receive the tulip and iseries_veth updates described below:




 drivers/net/iseries_veth.h |   46 --
 drivers/net/iseries_veth.c |  869 +++--
 drivers/net/tulip/de2104x.c|2 
 drivers/net/tulip/tulip_core.c |1 
 4 files changed, 583 insertions(+), 335 deletions(-)



Jeff Garzik:
  [netdrvr tulip] new PCI ID
  [netdrvr de2104x] store PCI bus addresses in unsigned long

Michael Ellerman:
  iseries_veth: Cleanup error and debug messages
  iseries_veth: Remove a FIXME WRT deletion of the ack_timer
  iseries_veth: Try to avoid pathological reset behaviour
  iseries_veth: Fix broken promiscuous handling
  iseries_veth: Remove redundant message stack lock
  iseries_veth: Replace lock-protected atomic with an ordinary variable
  iseries_veth: Only call dma_unmap_single() if dma_map_single() succeeded
  iseries_veth: Make init_connection() & destroy_connection() symmetrical
  iseries_veth: Use kobjects to track lifecycle of connection structs
  iseries_veth: Remove TX timeout code
  iseries_veth: Add a per-connection ack timer
  iseries_veth: Simplify full-queue handling
  iseries_veth: Fix bogus counting of TX errors
  iseries_veth: Add sysfs support for connection structs
  iseries_veth: Add sysfs support for port structs
  iseries_veth: Incorporate iseries_veth.h in iseries_veth.c
  iseries_veth: Remove studly caps from iseries_veth.c
  iseries_veth: Be consistent about driver name, increment version



diff --git a/drivers/net/iseries_veth.c b/drivers/net/iseries_veth.c
--- a/drivers/net/iseries_veth.c
+++ b/drivers/net/iseries_veth.c
@@ -79,12 +79,55 @@
 #include 
 #include 
 
-#include "iseries_veth.h"
+#undef DEBUG
 
 MODULE_AUTHOR("Kyle Lucke <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("iSeries Virtual ethernet driver");
 MODULE_LICENSE("GPL");
 
+#define VETH_EVENT_CAP (0)
+#define VETH_EVENT_FRAMES  (1)
+#define VETH_EVENT_MONITOR (2)
+#define VETH_EVENT_FRAMES_ACK  (3)
+
+#define VETH_MAX_ACKS_PER_MSG  (20)
+#define VETH_MAX_FRAMES_PER_MSG        (6)
+
+struct veth_frames_data {
+   u32 addr[VETH_MAX_FRAMES_PER_MSG];
+   u16 len[VETH_MAX_FRAMES_PER_MSG];
+   u32 eofmask;
+};
+#define VETH_EOF_SHIFT (32-VETH_MAX_FRAMES_PER_MSG)
+
+struct veth_frames_ack_data {
+   u16 token[VETH_MAX_ACKS_PER_MSG];
+};
+
+struct veth_cap_data {
+   u8 caps_version;
+   u8 rsvd1;
+   u16 num_buffers;
+   u16 ack_threshold;
+   u16 rsvd2;
+   u32 ack_timeout;
+   u32 rsvd3;
+   u64 rsvd4[3];
+};
+
+struct veth_lpevent {
+   struct HvLpEvent base_event;
+   union {
+   struct veth_cap_data caps_data;
+   struct veth_frames_data frames_data;
+   struct veth_frames_ack_data frames_ack_data;
+   } u;
+
+};
+
+#define DRV_NAME   "iseries_veth"
+#define DRV_VERSION    "2.0"
+
 #define VETH_NUMBUFFERS        (120)
 #define VETH_ACKTIMEOUT        (100) /* microseconds */
 #define VETH_MAX_MCAST (12)
@@ -113,9 +156,9 @@ MODULE_LICENSE("GPL");
 
 struct veth_msg {
struct veth_msg *next;
-   struct VethFramesData data;
+   struct veth_frames_data data;
int token;
-   unsigned long in_use;
+   int in_use;
struct sk_buff *skb;
struct device *dev;
 };
@@ -125,23 +168,28 @@ struct veth_lpar_connection {
struct work_struct statemachine_wq;
struct veth_msg *msgs;
int num_events;
-   struct VethCapData local_caps;
+   struct veth_cap_data local_caps;
 
+   struct kobject kobject;
struct timer_list ack_timer;
 
+   struct timer_list reset_timer;
+   unsigned int reset_timeout;
+   unsigned long last_contact;
+   int outstanding_tx;
+
spinlock_t lock;
unsigned long state;
HvLpInstanceId src_inst;
HvLpInstanceId dst_inst;
-   struct VethLpEvent cap_event, cap_ack_event;
+   struct veth_lpevent cap_event, cap_ack_event;
u16 pending_acks[VETH_MAX_ACKS_PER_MSG];
u32 num_pending_acks;
 
int num_ack_events;
-   struct VethCapData remote_caps;
+   struct veth_cap_data remote_caps;
u32 ack_timeout;
 
-   spinlock_t msg_stack_lock;
struct veth_msg *msg_stack_head;
 };
 
@@ -151,15 +199,17 @@ struct veth_port {
u64 mac_addr;
HvLpIndexMap lpar_map;
 
-   spinlock_t pending_gate;
-   struct sk_buff *pending_skb;
-   HvLpIndexMap pending_lpmask;
+   /* queue_lock protects the stopped_map and dev's queue. */
+   spinlock_t queue_lock;
+   HvLpIndexMap stopped_map;
 
+   /* mcast_gate protects promiscuous, num_mcast & mcast_addr. */
rwlock_t mcast_gate;
int promiscuous;
-   int all_mcast;
int num_mcast;
u64 mcast_addr[VETH_MAX_MCAST];
+
+   struct kobject kobject;
 };
 
 static

Re: [2.6 patch] include/net/ip_vs.h: "extern inline" -> "static inline"

2005-09-01 Thread David S. Miller

From: Adrian Bunk <[EMAIL PROTECTED]>
Date: Wed, 24 Aug 2005 17:58:06 +0200

> "extern inline" doesn't make much sense.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

Applied, thanks Adrian.

Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread David S. Miller


Ok, here is the current version of the patch, and what I intend
to push to Linus:

diff-tree 4980a059ef42741e80e9efa0dabdf520f9ba0c5a (from 6b39374a27eb4be7e9d82145ae270ba02ea90dc8)
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Thu Sep 1 15:06:18 2005 -0700

[TCP]: Keep TSO enabled even during loss events.

All we need to do is resegment the queue so that
we record SACK information accurately.  The edges
of the SACK blocks guide our resegmenting decisions.

With help from Herbert Xu.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/include/net/tcp.h b/include/net/tcp.h
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -454,6 +454,7 @@ extern int tcp_retransmit_skb(struct soc
 extern void tcp_xmit_retransmit_queue(struct sock *);
 extern void tcp_simple_retransmit(struct sock *);
 extern int tcp_trim_head(struct sock *, struct sk_buff *, u32);
+extern int tcp_fragment(struct sock *, struct sk_buff *, u32, unsigned int);
 
 extern void tcp_send_probe0(struct sock *);
 extern void tcp_send_partial(struct sock *);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -923,14 +923,6 @@ tcp_sacktag_write_queue(struct sock *sk,
int flag = 0;
int i;
 
-   /* So, SACKs for already sent large segments will be lost.
-* Not good, but alternative is to resegment the queue. */
-   if (sk->sk_route_caps & NETIF_F_TSO) {
-   sk->sk_route_caps &= ~NETIF_F_TSO;
-   sock_set_flag(sk, SOCK_NO_LARGESEND);
-   tp->mss_cache = tp->mss_cache;
-   }
-
if (!tp->sacked_out)
tp->fackets_out = 0;
prior_fackets = tp->fackets_out;
@@ -978,20 +970,40 @@ tcp_sacktag_write_queue(struct sock *sk,
flag |= FLAG_DATA_LOST;
 
sk_stream_for_retrans_queue(skb, sk) {
-   u8 sacked = TCP_SKB_CB(skb)->sacked;
-   int in_sack;
+   int in_sack, pcount;
+   u8 sacked;
 
/* The retransmission queue is always in order, so
 * we can short-circuit the walk early.
 */
-   if(!before(TCP_SKB_CB(skb)->seq, end_seq))
+   if (!before(TCP_SKB_CB(skb)->seq, end_seq))
break;
 
-   fack_count += tcp_skb_pcount(skb);
+   pcount = tcp_skb_pcount(skb);
+
+   if (pcount > 1 &&
+   (after(start_seq, TCP_SKB_CB(skb)->seq) ||
+before(end_seq, TCP_SKB_CB(skb)->end_seq))) {
+   unsigned int pkt_len;
+
+   if (after(start_seq, TCP_SKB_CB(skb)->seq))
+   pkt_len = (start_seq -
+  TCP_SKB_CB(skb)->seq);
+   else
+   pkt_len = (end_seq -
+  TCP_SKB_CB(skb)->seq);
+   if (tcp_fragment(sk, skb, pkt_len, skb_shinfo(skb)->tso_size))
+   break;
+   pcount = tcp_skb_pcount(skb);
+   }
+
+   fack_count += pcount;
 
in_sack = !after(start_seq, TCP_SKB_CB(skb)->seq) &&
!before(end_seq, TCP_SKB_CB(skb)->end_seq);
 
+   sacked = TCP_SKB_CB(skb)->sacked;
+
/* Account D-SACK for retransmitted packet. */
if ((dup_sack && in_sack) &&
(sacked & TCPCB_RETRANS) &&
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -428,11 +428,11 @@ static void tcp_set_skb_tso_segs(struct 
  * packet to the list.  This won't be called frequently, I hope. 
  * Remember, these are still headerless SKBs at this point.
  */
-static int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss_now)
+int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss_now)
 {
struct tcp_sock *tp = tcp_sk(sk);
struct sk_buff *buff;
-   int nsize;
+   int nsize, old_factor;
u16 flags;
 
nsize = skb_headlen(skb) - len;
@@ -490,18 +490,33 @@ static int tcp_fragment(struct sock *sk,
tp->left_out -= tcp_skb_pcount(skb);
}
 
+   old_factor = tcp_skb_pcount(skb);
+
/* Fix up tso_factor for both original and new SKB.  */
tcp_set_skb_tso_segs(sk, skb, mss_now);
tcp_set_skb_tso_segs(sk, buff, mss_now);
 
-   if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
-   tp->lost_out += tcp_skb_pcount(skb);
-
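
The resegmenting decision in tcp_sacktag_write_queue() above can be read as:
if a SACK block edge falls inside a multi-segment skb, split the skb at that
edge. The following is a standalone userspace sketch of that arithmetic only.
seq_before(), seq_after() and sack_split_len() are hypothetical names modelled
on the kernel's before()/after() helpers, and the sketch assumes the skb
actually overlaps the SACK block.

```c
#include <stdint.h>

/* Wraparound-safe sequence comparisons, in the style of the kernel's
 * before()/after() helpers. */
static int seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static int seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/* Given an skb covering [seq, end_seq) and a SACK block
 * [start_seq, sack_end), return the offset from seq at which the skb
 * should be split so the first piece stops at a SACK edge, or 0 if the
 * skb is already fully inside the block.  Assumes the two overlap. */
static uint32_t sack_split_len(uint32_t seq, uint32_t end_seq,
			       uint32_t start_seq, uint32_t sack_end)
{
	if (seq_after(start_seq, seq))
		return start_seq - seq;    /* block starts inside the skb */
	if (seq_before(sack_end, end_seq))
		return sack_end - seq;     /* block ends inside the skb */
	return 0;                          /* skb fully covered, no split */
}
```

The signed cast makes the comparisons behave across 32-bit sequence
wraparound, which is why plain < and > are not used here.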

Re: [Patch] Set link type on tun/tap, v2

2005-09-01 Thread David S. Miller

From: Mike Kershaw <[EMAIL PROTECTED]>
Date: Tue, 23 Aug 2005 19:48:37 -0400

> Same patch as before, only following Dave's advice to not change the
> link type of an interface while it's up.

Applied, thanks Mike.

Re: [PATCHES]: Two TSO refinements

2005-09-01 Thread Herbert Xu

On Thu, Sep 01, 2005 at 03:06:47PM -0700, David S. Miller wrote:
>
> - if (TCP_SKB_CB(buff)->sacked&TCPCB_LOST) {
> - tp->lost_out += tcp_skb_pcount(buff);
> - tp->left_out += tcp_skb_pcount(buff);
> + tp->packets_out -= diff;
> + if (diff > 0) {
> + tp->fackets_out -= diff;
> + if ((int)tp->fackets_out < 0)
> + tp->fackets_out = 0;
> + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {

The TCPCB_LOST stuff should be outside the diff > 0 if block.

> + tp->lost_out -= diff;
> + if ((int)tp->lost_out < 0)
> + tp->lost_out = 0;

These checks aren't necessary.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: ATA over Ethernet potential reference after free of net-device.

2005-09-01 Thread Ben Greear


Ed L Cashin wrote:

> Ben Greear <[EMAIL PROTECTED]> writes:
>
>> Hello!
>>
>> I believe that the aoecmd_cfg method in aoecmd.c can reference
>> the net-devices after free.  The reason is that it grabs and releases
>> the interface with dev_hold, dev_put, but after putting everything, it
>> then does the transmit of the skbs...
>
> Thanks, I'll make a note to look into that.  (I'll be out of town soon
> for a few days.)
>
> Based on your understanding of the issue, can you trigger any
> problematic behavior?


No, it's purely theoretical..and I may be completely wrong.  I'm
just auditing the netdevice reference counting code and thought
that part looked a little funny.
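
To illustrate the ordering concern in general terms (this is not the aoe
code), here is a small userspace model of a reference-counted device. All
names (toy_dev, toy_hold, toy_put, toy_xmit, send_safely) are hypothetical;
the point is only that the transmit has to happen before the final put.

```c
#include <stddef.h>

/* Toy stand-in for a reference-counted net device (hypothetical). */
struct toy_dev {
	int refcnt;
	int freed;
	int tx_count;
};

static void toy_hold(struct toy_dev *dev) { dev->refcnt++; }

static void toy_put(struct toy_dev *dev)
{
	if (--dev->refcnt == 0)
		dev->freed = 1;   /* a real device would be released here */
}

/* Transmit is only valid while the caller still holds a reference. */
static int toy_xmit(struct toy_dev *dev)
{
	if (dev->freed)
		return -1;        /* would be a use-after-free in real code */
	dev->tx_count++;
	return 0;
}

/* Safe ordering: hold, transmit, and only then put. */
static int send_safely(struct toy_dev *dev)
{
	int rc;

	toy_hold(dev);
	rc = toy_xmit(dev);
	toy_put(dev);
	return rc;
}
```

If the put comes first and happens to drop the last reference, the later
transmit touches a device that may already be gone, which is the pattern
the audit is looking for.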

Thanks,
Ben


--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com


Re: ATA over Ethernet potential reference after free of net-device.

2005-09-01 Thread Ed L Cashin

Ben Greear <[EMAIL PROTECTED]> writes:

> Hello!
>
> I believe that the aoecmd_cfg method in aoecmd.c can reference
> the net-devices after free.  The reason is that it grabs and releases
> the interface with dev_hold, dev_put, but after putting everything, it
> then does the transmit of the skbs...

Thanks, I'll make a note to look into that.  (I'll be out of town soon
for a few days.)

Based on your understanding of the issue, can you trigger any
problematic behavior?

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: Netdevice reference leak in af_ax25.c ??

2005-09-01 Thread Ralf Baechle

On Thu, Sep 01, 2005 at 08:56:19PM +0200, Patrick McHardy wrote:

> > I believe the SO_BINDTODEVICE case in net/ax25/af_x25.c  (line 613 or so)
> > leaks a reference to a net device.  It does a dev_get_by_name,
> > which holds a reference, but since it never assigns the pointer
> > anywhere, I do not see how it can ever free it later.
> > 
> > Please clue me in as to where it's released if it actually is.
> 
> I can't find the code you're talking about, there's no dev_get* in my
> version of af_x25.c. Please paste the code you're talking about in
> your bugreports, thanks.

Ben meant net/ax25/af_ax25.  The dev value is stored in the ax25_cb
indirectly after converting it to an ax25dev pointer and will be freed
when that ax25_cb (which really is the protocol-specific part of the
socket) is closed.

You poked my nose at a bug though - it is possible to leak references by
performing multiple SO_BINDTODEVICE operations; we should either only
permit the first one to succeed or to drop the reference of the old
device in case of a repeated SO_BINDTODEVICE.  After the weekend ...
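
The second option (dropping the old reference on a repeated bind) can be
sketched with a toy userspace model. This is not the ax25 code; toy_dev,
toy_sock, toy_bind and toy_close are hypothetical names used only to show
the bookkeeping.

```c
#include <stddef.h>

struct toy_dev  { int refcnt; };
struct toy_sock { struct toy_dev *bound; };

static void toy_hold(struct toy_dev *dev) { dev->refcnt++; }
static void toy_put(struct toy_dev *dev)  { dev->refcnt--; }

/* Bind sk to dev, dropping the reference to any previously bound device. */
static void toy_bind(struct toy_sock *sk, struct toy_dev *dev)
{
	toy_hold(dev);                /* take the new reference first */
	if (sk->bound)
		toy_put(sk->bound);   /* drop the old one so it cannot leak */
	sk->bound = dev;
}

/* Closing the socket releases whatever device is still bound. */
static void toy_close(struct toy_sock *sk)
{
	if (sk->bound) {
		toy_put(sk->bound);
		sk->bound = NULL;
	}
}
```

Taking the new reference before dropping the old one keeps the sketch
correct even when the same device is bound twice in a row.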

  Ralf

[PATCH] 8139cp: Catch all interrupts

2005-09-01 Thread Pierre Ossman

Register interrupt handler when net device is registered. Avoids missing
interrupts if the interrupt mask gets out of sync.

Signed-off-by: Pierre Ossman <[EMAIL PROTECTED]>

---

The reason this patch is needed for me is that the resume function is
broken. It enables interrupts unconditionally, but the interrupt handler
is only registered when the device is up.

I don't have enough knowledge about the driver to fix the resume
function so this patch will instead make sure that the interrupt handler
is registered at all times (which can be a nice safeguard even when the
resume function gets fixed).
Index: linux-wbsd/drivers/net/8139cp.c
===
--- linux-wbsd/drivers/net/8139cp.c	(revision 165)
+++ linux-wbsd/drivers/net/8139cp.c	(working copy)
@@ -1204,20 +1204,11 @@
 
 	cp_init_hw(cp);
 
-	rc = request_irq(dev->irq, cp_interrupt, SA_SHIRQ, dev->name, dev);
-	if (rc)
-		goto err_out_hw;
-
 	netif_carrier_off(dev);
 	mii_check_media(&cp->mii_if, netif_msg_link(cp), TRUE);
 	netif_start_queue(dev);
 
 	return 0;
-
-err_out_hw:
-	cp_stop_hw(cp);
-	cp_free_rings(cp);
-	return rc;
 }
 
 static int cp_close (struct net_device *dev)
@@ -1238,7 +1229,6 @@
 	spin_unlock_irqrestore(&cp->lock, flags);
 
 	synchronize_irq(dev->irq);
-	free_irq(dev->irq, dev);
 
 	cp_free_rings(cp);
 	return 0;
@@ -1813,6 +1803,10 @@
 	if (rc)
 		goto err_out_iomap;
 
+	rc = request_irq(dev->irq, cp_interrupt, SA_SHIRQ, dev->name, dev);
+	if (rc)
+		goto err_out_unreg;
+
 	printk (KERN_INFO "%s: RTL-8139C+ at 0x%lx, "
 		"%02x:%02x:%02x:%02x:%02x:%02x, "
 		"IRQ %d\n",
@@ -1832,6 +1826,8 @@
 
 	return 0;
 
+err_out_unreg:
+	unregister_netdev(dev);
 err_out_iomap:
 	iounmap(regs);
 err_out_res:
@@ -1852,6 +1848,7 @@
 
 	if (!dev)
 		BUG();
+	free_irq(dev->irq, dev);
 	unregister_netdev(dev);
 	iounmap(cp->regs);
 	if (cp->wol_enabled) pci_set_power_state (pdev, PCI_D0);

Re: Netlink connector. Revisited.

2005-09-01 Thread Evgeniy Polyakov

If no one has any objection, please consider for inclusion.

Thank you.

On Fri, Aug 26, 2005 at 02:09:38PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote:
> Kernel connector is a new, easy-to-use userspace <-> kernel space
> communication module which implements a bidirectional
> message bus using netlink as its backend.
> Connector was created to eliminate complex skb handling both in send and
> receive message bus direction.
> 
> The connector driver adds the possibility to connect various agents using
> a netlink-based network as one of its backends.
> One must register callback and identifier. When driver receives
> special netlink message with appropriate identifier, appropriate
> callback will be called.
> 
> From the userspace point of view it's quite straightforward:
> socket();
> bind();
> send();
> recv();
> 
> But if kernelspace wants to use the full power of such connections, the driver
> writer must create special sockets and must know about struct sk_buff
> handling...
> Connector allows any kernelspace agents to use netlink based
> networking for inter-process communication in a significantly easier
> way:
> 
> int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *));
> void cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask);
> 
> struct cb_id
> {
>   __u32   idx;
>   __u32   val;
> };
> 
> idx and val are unique identifiers which must be registered in
> connector.h for in-kernel usage.
> void (*callback) (void *) - is a callback function which will be called
> when a message with the above idx.val is received by the connector core.
> 
> Using connector completely hides the low-level transport layer from
> its users.
> 
> Connector uses new netlink ability to have many groups in one socket.
> 
> Sorry for long carbon-copy list - I've added all people answering about
> connector and related stuff before.
> 
> Thank you.
> 
> Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>
> 
> diff --git a/include/linux/netlink.h b/include/linux/netlink.h
> --- a/include/linux/netlink.h
> +++ b/include/linux/netlink.h
> @@ -15,6 +15,7 @@
>  #define NETLINK_ISCSI          8   /* Open-iSCSI */
>  #define NETLINK_AUDIT          9   /* auditing */
>  #define NETLINK_FIB_LOOKUP     10
> +#define NETLINK_CONNECTOR      11
>  #define NETLINK_NETFILTER      12  /* netfilter subsystem */
>  #define NETLINK_IP6_FW         13
>  #define NETLINK_DNRTMSG        14  /* DECnet routing messages */
> 
> diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c
> new file mode 100644
> --- /dev/null
> +++ b/Documentation/connector/cn_test.c
> @@ -0,0 +1,195 @@
> +/*
> + *   cn_test.c
> + * 
> + * 2004-2005 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
> + * All rights reserved.
> + * 
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "connector.h"
> +
> +static struct cb_id cn_test_id = { 0x123, 0x456 };
> +static char cn_test_name[] = "cn_test";
> +static struct sock *nls;
> +static struct timer_list cn_test_timer;
> +
> +void cn_test_callback(void *data)
> +{
> + struct cn_msg *msg = (struct cn_msg *)data;
> +
> + printk("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n",
> +__func__, jiffies, msg->id.idx, msg->id.val,
> +msg->seq, msg->ack, msg->len, (char *)msg->data);
> +}
> +
> +static int cn_test_want_notify(void)
> +{
> + struct cn_ctl_msg *ctl;
> + struct cn_notify_req *req;
> + struct cn_msg *msg = NULL;
> + int size, size0;
> + struct sk_buff *skb;
> + struct nlmsghdr *nlh;
> + u32 group = 1;
> +
> + size0 = sizeof(*msg) + sizeof(*ctl) + 3 * sizeof(*req);
> +
> + size = NLMSG_SPACE(size0);
> +
> + skb = alloc_skb(size, GFP_ATOMIC);
> + if (!skb) {
> + printk(KERN_ERR "Failed to allocate new skb with size=%u.\n",
> +size);
> +
> + return -ENOMEM;
> + }
> +
> + nlh = NLMSG_PUT(skb, 0, 0x123, NLMSG_DONE, size - sizeof(*nlh));
> +
> + msg = (struct cn_msg *)NLMSG_DATA(nlh);
> +
> + memset(msg, 0, size0);
> +
> +
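
The callback model described at the top of the quoted posting (register an
identifier, get called when a message with a matching idx.val arrives) can be
pictured as a small dispatch table. The sketch below is a userspace toy, not
the connector implementation: only the (idx, val) pair comes from the posting,
and toy_add_callback(), toy_dispatch() and the fixed-size table are
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the (idx, val) pair from the posting; everything else is a toy. */
struct toy_cb_id { uint32_t idx; uint32_t val; };

struct toy_entry {
	struct toy_cb_id id;
	void (*callback)(void *);
};

#define TOY_MAX_CALLBACKS 8
static struct toy_entry toy_table[TOY_MAX_CALLBACKS];
static size_t toy_count;

/* Register a callback; returns 0 on success, -1 if full or duplicate. */
static int toy_add_callback(struct toy_cb_id id, void (*callback)(void *))
{
	size_t i;

	for (i = 0; i < toy_count; i++)
		if (toy_table[i].id.idx == id.idx &&
		    toy_table[i].id.val == id.val)
			return -1;
	if (toy_count == TOY_MAX_CALLBACKS)
		return -1;
	toy_table[toy_count].id = id;
	toy_table[toy_count].callback = callback;
	toy_count++;
	return 0;
}

/* Deliver data to the callback registered for id; -1 if there is none. */
static int toy_dispatch(struct toy_cb_id id, void *data)
{
	size_t i;

	for (i = 0; i < toy_count; i++) {
		if (toy_table[i].id.idx == id.idx &&
		    toy_table[i].id.val == id.val) {
			toy_table[i].callback(data);
			return 0;
		}
	}
	return -1;
}

/* Example callback: counts how many messages it has seen. */
static void toy_count_cb(void *data)
{
	(*(int *)data)++;
}
```

A message whose identifier was never registered is simply not delivered,
which matches the "appropriate identifier, appropriate callback" wording.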

Re: Netdevice reference leak in af_ax25.c ??

2005-09-01 Thread Patrick McHardy

Ben Greear wrote:
> 
> I believe the SO_BINDTODEVICE case in net/ax25/af_x25.c  (line 613 or so)
> leaks a reference to a net device.  It does a dev_get_by_name,
> which holds a reference, but since it never assigns the pointer
> anywhere, I do not see how it can ever free it later.
> 
> Please clue me in as to where it's released if it actually is.

I can't find the code you're talking about, there's no dev_get* in my
version of af_x25.c. Please paste the code you're talking about in
your bugreports, thanks.

Re: Netlink connector. Revisited.

2005-09-01 Thread David S. Miller

From: Evgeniy Polyakov <[EMAIL PROTECTED]>
Date: Thu, 1 Sep 2005 23:51:31 +0400

> If no one has any objection, please consider for inclusion.

I intend to review this soon,  but I'm backlogged with other
tasks at the moment.  I'm hoping someone else can at least
give some feedback, meanwhile.

Re: Netdevice reference leak in af_ax25.c ??

2005-09-01 Thread Ben Greear


Ralf Baechle wrote:

> On Thu, Sep 01, 2005 at 08:56:19PM +0200, Patrick McHardy wrote:
>
>>> I believe the SO_BINDTODEVICE case in net/ax25/af_x25.c  (line 613 or so)
>>> leaks a reference to a net device.  It does a dev_get_by_name,
>>> which holds a reference, but since it never assigns the pointer
>>> anywhere, I do not see how it can ever free it later.
>>>
>>> Please clue me in as to where it's released if it actually is.
>>
>> I can't find the code you're talking about, there's no dev_get* in my
>> version of af_x25.c. Please paste the code you're talking about in
>> your bugreports, thanks.
>
> Ben meant net/ax25/af_ax25.  The dev value is stored in the ax25_cb
> indirectly after converting it to an ax25dev pointer and will be freed
> when that ax25_cb (which really is the protocol-specific part of the
> socket) is closed.


Ok, I'm getting hopelessly lost in the ax25 code trying to follow
references, so I'm just going to use the generic ref counting debugging.

That will still point to the right module, but not the line of code,
should a leak occur (and should the patch be accepted) :)


> You poked my nose at a bug though - it is possible to leak references by
> performing multiple SO_BINDTODEVICE operations; we should either only
> permit the first one to succeed or to drop the reference of the old
> device in case of a repeated SO_BINDTODEVICE.  After the weekend ...


Thanks for taking a look.

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com


Re: ieee80211 patches

2005-09-01 Thread Jean Tourrilhes

On Thu, Sep 01, 2005 at 07:36:34PM +0100, Pedro Ramalhais wrote:
> 
> Oops, my brain had censored that part of the iwconfig manual.

Yeah, it comes at the end of a long page ;-)

> Reading that, and since i imagine that few people use it,

Not explicitly, but it's used internally.

> it comes to
> mind that we could abuse this, and redefine it to be THE association
> command.

As stated earlier, the problem is *NOT* specific to wireless
(some USB-Ethernet or Firewire adapters may need to load firmware), so
I believe the solution should not be wireless specific (i.e. DHCP
should wait for the link up event, card could be identified by
businfo).

> ex:
> iwconfig eth0 commit would cause association and no other command would
> cause it.

The current expected behavior is that if commit has not been
done at "ifconfig up" time, it should be done then by the driver. I
would keep that; it's sane behavior that makes sure the system behaves
in the expected way if the user/scripts do stupid things.
One change that could be made is to have the commit handler
called explicitly by the kernel when the user does "ifconfig up",
instead of the current implicit behavior. I did not do that because I
did not want to change generic Ethernet code, but now that we have our
own device type, it may make sense.

> I also fail to see what other uses it could have besides setting the
> configuration on the card all at once.

Well, it was designed explicitly for that ;-)

> Which wireless settings could be
> left to be applied on the card "later"?

Manual override. At any point in time, even if the card is up,
the user/tools can use ifconfig to change the IP address and ethtool
to change the Ethernet media (speed/duplex). I don't see any reason to
get rid of that (and the commit stuff behaves sanely in this case).
Only a single driver doesn't allow manual override of wireless
params, ray_cs, and I don't want this list to grow. Otherwise, we
might as well just use module parameters and be done with it...

> hmmm, the only thing coming to
> mind is waiting for a scan to finish, does that count? That would be
> better exposed as a BLOCKING vs NONBLOCKING setting.

The scan API is quite different from the other APIs for this
specific reason (trigger cmd + event + read cmd). It's usually easier
to emulate a blocking behaviour from a non-blocking API than vice
versa, so I would not want the scanning API to be blocking.
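
To illustrate that last point, here is a userspace toy showing how a blocking
call can be layered on a trigger + event + read interface. It does not use
the wireless extensions at all: toy_scanner and its functions are hypothetical,
a pipe stands in for the event channel, and the toy trigger completes
immediately where a real driver would signal completion later.

```c
#include <poll.h>
#include <unistd.h>

/* Toy scanner: event_fd[0] becomes readable when results are ready. */
struct toy_scanner {
	int event_fd[2];
	int result;
};

static int toy_scanner_init(struct toy_scanner *s)
{
	s->result = 0;
	return pipe(s->event_fd);
}

/* Non-blocking trigger.  The toy finishes at once and signals the event;
 * a real driver would signal later, from its completion path. */
static int toy_scan_trigger(struct toy_scanner *s)
{
	char done = 1;

	s->result = 42;   /* placeholder result */
	return write(s->event_fd[1], &done, 1) == 1 ? 0 : -1;
}

/* Non-blocking read of the results; valid once the event has fired. */
static int toy_scan_read(const struct toy_scanner *s)
{
	return s->result;
}

/* Blocking wrapper: trigger, wait for the event (with timeout), read. */
static int toy_scan_blocking(struct toy_scanner *s, int timeout_ms, int *out)
{
	struct pollfd pfd = { .fd = s->event_fd[0], .events = POLLIN };
	char ev;

	if (toy_scan_trigger(s) < 0)
		return -1;
	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;                /* timeout or error */
	if (read(s->event_fd[0], &ev, 1) != 1)
		return -1;
	*out = toy_scan_read(s);
	return 0;
}
```

Going the other way (making a blocking call look non-blocking) would need a
helper thread or process, which is the asymmetry the paragraph above refers to.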

> Thanks!
> -- 
> Pedro Ramalhais <[EMAIL PROTECTED]>

Have fun...

Jean

Re: Netdevice reference leak in af_ax25.c ??

2005-09-01 Thread Ben Greear


Patrick McHardy wrote:

> Ben Greear wrote:
>
>> I believe the SO_BINDTODEVICE case in net/ax25/af_x25.c  (line 613 or so)
>> leaks a reference to a net device.  It does a dev_get_by_name,
>> which holds a reference, but since it never assigns the pointer
>> anywhere, I do not see how it can ever free it later.
>>
>> Please clue me in as to where it's released if it actually is.
>
> I can't find the code you're talking about, there's no dev_get* in my
> version of af_x25.c. Please paste the code you're talking about in
> your bugreports, thanks.


Please ignore the NDRK thing..I am adding reference counting debugging
to the netdevice code.  This is from the 2.6.13 kernel:

In this method:

/*
 *  Handling for system calls applied via the various interfaces to an
 *  AX25 socket object
 */

static int ax25_setsockopt(struct socket *sock, int level, int optname,
        char __user *optval, int optlen)
{

        ...

        case SO_BINDTODEVICE:
                if (optlen > IFNAMSIZ)
                        optlen = IFNAMSIZ;
                if (copy_from_user(devname, optval, optlen)) {
                        res = -EFAULT;
                        break;
                }

                dev = dev_get_by_name(devname, NDRK_GENERIC);
                if (dev == NULL) {
                        res = -ENODEV;
                        break;
                }

                if (sk->sk_type == SOCK_SEQPACKET &&
                   (sock->state != SS_UNCONNECTED ||
                    sk->sk_state == TCP_LISTEN)) {
                        res = -EADDRNOTAVAIL;
                        dev_put(dev, NDRK_GENERIC);
                        break;
                }

                ax25->ax25_dev = ax25_dev_ax25dev(dev);
                ax25_fillin_cb(ax25, ax25->ax25_dev);
                dev_put(dev, NDRK_GENERIC); /* TODO:  Verify we should put it here. */
                break;
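
The balance being asked about can be sketched in userspace as follows.
This is only an illustration with hypothetical names (fake_dev,
fake_get, fake_put, bind_to_device); it does not answer whether the
ax25 code may keep using state derived from the device after the put,
which is the open question in the TODO above.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev {
    int refcnt;
};

/* Stand-in for dev_get_by_name(): returns the device with a reference
 * held, or NULL when the lookup fails. */
struct fake_dev *fake_get(struct fake_dev *dev)
{
    if (dev != NULL)
        dev->refcnt++;
    return dev;
}

/* Stand-in for dev_put(): drops one reference. */
void fake_put(struct fake_dev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* Same shape as the SO_BINDTODEVICE case: look up, validate, use.
 * 'busy' stands in for the socket-state check.  Returns 0 on success
 * or a negative value on error. */
int bind_to_device(struct fake_dev *lookup_result, int busy)
{
    struct fake_dev *dev = fake_get(lookup_result);

    if (dev == NULL)
        return -1;        /* no reference was taken, nothing to put */
    if (busy) {
        fake_put(dev);    /* error path drops the reference */
        return -2;
    }
    /* ... use dev here without storing the pointer ... */
    fake_put(dev);        /* success path drops it as well */
    return 0;
}
```

If the pointer were stored for later use, the success path would
instead keep the reference and the put would move to wherever the
stored pointer is cleared.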

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com


Netdevice reference leak in af_ax25.c ??

2005-09-01 Thread Ben Greear



I believe the SO_BINDTODEVICE case in net/ax25/af_x25.c  (line 613 or so)
leaks a reference to a net device.  It does a dev_get_by_name,
which holds a reference, but since it never assigns the pointer
anywhere, I do not see how it can ever free it later.

Please clue me in as to where it's released if it actually is.

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com


Re: ieee80211 patches

2005-09-01 Thread Pedro Ramalhais

On Thu, 2005-09-01 at 10:59 -0700, Jean Tourrilhes wrote:
> Jiri Benc wrote :
> > On Thu, 01 Sep 2005 11:09:16 +0100, Pedro Ramalhais wrote:
> > > Right, that would need a new interface where all parameters are passed
> > > at once,
> > 
> > Then you will lose the possibility of having default parameters.
> 
>   Just for your information, it's actually trivial to cache
> parameters in the driver and to apply them in one go using the WE
> commit mechanism. It's actually even strongly advised, as it make the
> startup performance much better (fewer reset of the hardware).
>   Many drivers do implement this, such as orinoco.c, atmel.c,
> airo.c (partial) and ray_cs.c. The performance benefit of implementing
> it in orinoco.c was actually noticeable.
>   The current API is flexible and allow you to have it both
> way. When I designed it, I actually thought about passing a big struct
> with all the parameters to the driver and rejected it because too
> inflexible. It's even documented in iw_handler.h, and I list other
> drawbacks of the approach.
> 
>   Pedro, if you want more detail of the commit stuff, please
> yell...
>   Have fun...
> 
>   Jean
> 

Oops, my brain had censored that part of the iwconfig manual.

"commit Some cards may not apply changes done through Wireless
Extensions immediately (they may wait to aggregate the changes or
apply it only when the card is brought up via ifconfig).
This command (when available) forces the card to apply all pending
changes.
This is normally not needed, because the card will
eventually apply the changes, but can be useful for debugging."

Reading that, and since I imagine that few people use it, it comes to
mind that we could abuse this and redefine it to be THE association
command. For example, "iwconfig eth0 commit" would cause association
and no other command would cause it.
I also fail to see what other uses it could have besides setting the
configuration on the card all at once. Which wireless settings could be
left to be applied on the card "later"? hmmm, the only thing coming to
mind is waiting for a scan to finish, does that count? That would be
better exposed as a BLOCKING vs NONBLOCKING setting.

Thanks!
-- 
Pedro Ramalhais <[EMAIL PROTECTED]>


Re: ieee80211 patches

2005-09-01 Thread Pedro Ramalhais

On Thu, 2005-09-01 at 16:48 +0200, Jiri Benc wrote:
> On Thu, 01 Sep 2005 11:09:16 +0100, Pedro Ramalhais wrote:
> > The scheme looks good to me. Wireless cards mostly map to a regular
> > network card. Only difference is that you need to do something to
> > configure the link to have "carrier detected" and DHCP should only be
> > started after "carrier detected" (IFF_RUNNING IIRC).
> 
> The fact that DHCP should be only started when carrier is detected is
> not wireless-specific.
> 

Sorry, English is not my native language. I was just stating the
obvious. Not related to the difference I was talking about.

> > Regarding association only on explicit userspace request, that's fixable
> > in the drivers (some drivers automatically associate once they're
> > ifconfig'ed up, the ipw2x00 drivers have a module parameter to change
> > this behaviour).
> 
> ipw drivers are broken in this manner as they try to associate right
> after modprobe. No card should do anything unless it is explicitly told
> so - policy is the matter of user-space, not the kernel. Startup scripts
> in distributions are the right place for instructing a card to
> associate.

scripts, daemons, apps, whatever the distro/user wants to use.

> 
> Sure, this is fixable in drivers. And it really needs to be fixed. Even
> more, I think it should be forced by ieee80211 layer.
> 

You mean something like not allowing the driver to send an event
association through the ieee80211 layer or accept traffic until the
layer is configured correctly? That would probably work.

> > > And yes, this brings up the problem with firmware loading. It should
> > > really be solved, but trying to solve it by requiring to bring the card
> > > up before it is configured is the bad way.
> > 
> > Why is it the "wrong way"? I don't see a big problem with this, the card
> > is only going to be used after it's UP. The only problem i see is that
> > it doesn't behave exactly like most network drivers, where they are able
> > to detect a link even when they're DOWN. Is there a good reason for a
> > card to do anything even when it's DOWN?
> 
> You seem to agree with me :-) A card in DOWN state should do nothing. In
> particular, it shouldn't associate.
> 

At least i don't see any reason why it should do anything. If someone
knows of something that could be useful during the DOWN state i'd like
to hear it.

> But it should be configurable. And it is really necessary that we can
> tell a card that we are done with configuration so it can associate. The
> easiest place for doing this is bringing the card to UP state.
> 

I fail to see why configuration needs to be done in the ifconfig up
stage.

I tend to think of a wireless card as a normal ethernet network device.
And you can map one to the other easily.
In the case of a normal network card, link state means the state of the
path between the card and another equipment (a switch or another card
with a crossover cable) and successful ethernet negotiation.
In the case of a wireless card, it means the successful association with
an access point, ad-hoc network (always link up?) or master mode (always
link up?).
There are user configurable things in both cards regarding the link: in
network cards you have the cable, in wireless cards you have wireless
association configuration AND policy.
Having said that, think about this: with a network card you can ifconfig
the card UP and connect the cable later (bringing the link up), so you
should be able to do the same thing with a wireless card: ifconfig up
and then configure the card so that it establishes a link.

> > Right, that would need a new interface where all parameters are passed
> > at once,
> 
> Then you will lose the possibility of having default parameters.
> 

Not really, there can be default parameters like channel=0 or -1,
essid_len=-1, AP bssid=00:00:00:00:00:00 or FF:FF:FF:FF:FF:FF, etc... or
a bitfield with flags for each parameter being set. The bitfield would
be nice so that you wouldn't need to pass all the parameters (even the
default ones) like the radiotap headers.
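
A minimal sketch of the bitfield idea, assuming a single "set
everything at once" call. The struct layout, flag names and
wcfg_apply() are hypothetical and only show how unset fields keep their
defaults.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits, one per optional parameter. */
#define WCFG_HAS_CHANNEL  (1u << 0)
#define WCFG_HAS_ESSID    (1u << 1)
#define WCFG_HAS_BSSID    (1u << 2)

struct wcfg {
    uint32_t present;    /* which of the fields below are valid */
    int channel;
    char essid[33];      /* 32 bytes of ESSID plus a terminator */
    uint8_t bssid[6];
};

/* Apply only the fields flagged in 'req'; everything else in 'cur'
 * keeps its current (default) value. */
void wcfg_apply(struct wcfg *cur, const struct wcfg *req)
{
    assert(cur != NULL && req != NULL);
    if (req->present & WCFG_HAS_CHANNEL)
        cur->channel = req->channel;
    if (req->present & WCFG_HAS_ESSID) {
        strncpy(cur->essid, req->essid, sizeof(cur->essid) - 1);
        cur->essid[sizeof(cur->essid) - 1] = '\0';
    }
    if (req->present & WCFG_HAS_BSSID)
        memcpy(cur->bssid, req->bssid, sizeof(cur->bssid));
    cur->present |= req->present;
}
```

With this shape, a caller that only cares about the ESSID sets one flag
and one field, and no magic values such as channel=-1 are needed.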

> > or keep the existing interface and add another just to
> > explicitly associate.
> 
> Is there any reason not to do this by bringing the card up ("ifconfig up")?

Yes, i might want to bring the card UP so that it can scan, but don't
want to associate. Or bring the card UP and configure the card in
Monitor mode. Or bring the card UP and configure the card in Master
mode. Maybe ad-hoc too, not sure.

> 
> > Besides that, there should also be a way to configure if you want to
> > auto-associate to new access points if the old access point becomes
> > unavailable or with a weaker signal than the new one, or if you want to
> > manually associate, ex: association is done once and never tried again
> > until you tell it to do so. The manual association would be a good thing
> > for a wireless managed, where it would have the work of handling new
> > networks, APs becoming unavailable and available again,

Re: Very strange Marvell/Yukon Gigabit NIC networking problems

2005-09-01 Thread Stephen Hemminger

On Tue, 30 Aug 2005 12:54:31 +0100
Daniel Drake <[EMAIL PROTECTED]> wrote:

> Hi Stephen,
> 
> This looks like an issue I reported previously. After you use a recent skge, 
> you can't use any older drivers or the windows driver, but skge still works 
> fine every time.
> 
>   http://marc.theaimsgroup.com/?l=linux-netdev&m=112268414417743&w=2
> 
> The Gentoo bug report is here:
> 
>   http://bugs.gentoo.org/show_bug.cgi?id=100258
> 
> I closed the Gentoo bug as I hoped this patch would solve it:
> 
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=0eedf4ac5b536c7922263adf1b1d991d2e2397b9;hp=acdd80d514a08800380c9f92b1bf4d4c9e818125
> 
> But according to Steve Kieu, the problem is still there in 2.6.13. It's 
> slightly odd as Steve was previously a sk98lin user and initially reported 
> this problem for sk98lin in 2.6.13 whereas it did not happen with sk98lin in 
> 2.6.12.
> 
> Any ideas?
> 
> Thanks.
> 
> Steve Kieu wrote:
> > Tested , not broken, working now but the same problem,
> > that is if I reboot to winXP or 2.6.12, 2.6.11, the
> > NIC is unusable. In XP it always says link is down,
> > or media disconnected (from ipconfig command output in
> > XP)
> > is it because the firmware of NIC has changed or any
> > reason?
> > 
> > 
> > I noticed  warning messages only with 2.6.13 
> > 
> > PCI: Failed to allocate mem resource #10:[EMAIL PROTECTED] for
> > :02:01.0
> > 
> > and modem device in 2.6.13 IRQ is disabled.

This is a different problem related to ACPI and other bus
changes in 2.6.13.


> > ACPI: PCI Interrupt Link [LKMO] enabled at IRQ 20
> > ACPI: PCI Interrupt :00:06.1[B] -> Link [LKMO] ->
> > GSI 20 (level, low) -> IRQ
> >  17
> > ACPI: PCI interrupt for device :00:06.1 disabled
> > 
> > not sure if it gives more information.
> > 
> > skge addr 0xfeaf8000 irq 19 chip Yukon-Lite rev 9
> > skge eth0: addr 00:11:d8:f2:1f:18
> > ACPI: PCI Interrupt :02:01.0[A] -> Link [LNKB] ->
> > GSI 18 (level, low) -> IRQ
> >  16
> > Yenta: CardBus bridge found at :02:01.0
> > [1043:1987]
> > skge eth0: enabling interface
> > 
> > skge eth0: Link is up at 10 Mbps, half duplex, flow
> > control none
> > 
> > Not sure how can I restore this thing back to normal
> > (sigh)


Is this the correct summary of the problem scenarios?
Assume each one starts from cold boot (power off).

* 2.6.13 (skge) boot               => Good
* 2.6.13 (sk98lin) boot            => Good
* 2.6.13 + SK version of sk98lin   => Good
* XP boot                          => Good

Okay, now the cases where one OS is run first and
a reboot is done to a second, or in the case of
just Linux, this should be the same as rmmod'ing one
driver and modprobing the other.

                Second
First   | skge | sk98lin | XP
skge    | OK   | BAD     | BAD
sk98lin | ok   | OK      | ok
XP      | ok   | ok      | OK



Re: Very strange Marvell/Yukon Gigabit NIC networking problems

2005-09-01 Thread Stephen Hemminger

On Wed, 31 Aug 2005 10:09:48 +1000 (EST)
Steve Kieu <[EMAIL PROTECTED]> wrote:

> 
> --- Stephen Hemminger <[EMAIL PROTECTED]> wrote:
> 
> > On Wed, 31 Aug 2005 07:49:37 +1000 (EST)
> 
> > > 
> > > install-8_23.tar.bz2
> > 
> > Just look for references to CHIP_REV_YU_LITE_A3 in
> > the driver
> > sk98lin/skgeinit.c and sk98lin/skxmac2.c
> > The comparison should always be:
> 
> Have a look but no clue to patch it, there are one
> instance of comparing

I don't fix the out of tree vendor driver. sorry.

Re: ieee80211 patches

2005-09-01 Thread Jean Tourrilhes

Jiri Benc wrote :
> On Thu, 01 Sep 2005 11:09:16 +0100, Pedro Ramalhais wrote:
> > Right, that would need a new interface where all parameters are passed
> > at once,
> 
> Then you will lose the possibility of having default parameters.

	Just for your information, it's actually trivial to cache
parameters in the driver and to apply them in one go using the WE
commit mechanism. It's actually even strongly advised, as it makes the
startup performance much better (fewer resets of the hardware).
	Many drivers do implement this, such as orinoco.c, atmel.c,
airo.c (partial) and ray_cs.c. The performance benefit of implementing
it in orinoco.c was actually noticeable.
	The current API is flexible and allows you to have it both
ways. When I designed it, I actually thought about passing a big struct
with all the parameters to the driver and rejected it because it was too
inflexible. It's even documented in iw_handler.h, and I list other
drawbacks of the approach.
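
The caching idea can be sketched in userspace roughly like this. It is
a toy model with hypothetical names (fake_card, set_channel, set_rate,
commit), not the actual handler interface from iw_handler.h; it only
shows why batching saves hardware resets.

```c
#include <assert.h>
#include <stddef.h>

/* Toy card model.  The pending_* fields must start out equal to the
 * live settings (both zero in this sketch). */
struct fake_card {
    int channel, rate;                  /* settings live in hardware */
    int pending_channel, pending_rate;  /* cached, not yet applied */
    int dirty;                          /* cached settings differ */
    int resets;                         /* hardware resets so far */
};

/* Setters only update the cache; the hardware is left alone. */
void set_channel(struct fake_card *c, int channel)
{
    c->pending_channel = channel;
    c->dirty = 1;
}

void set_rate(struct fake_card *c, int rate)
{
    c->pending_rate = rate;
    c->dirty = 1;
}

/* Commit applies every cached change with a single reset. */
void commit(struct fake_card *c)
{
    assert(c != NULL);
    if (!c->dirty)
        return;
    c->channel = c->pending_channel;
    c->rate = c->pending_rate;
    c->resets++;
    c->dirty = 0;
}
```

Two setter calls followed by one commit cost one reset here, where
applying each setting immediately would cost two.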

Pedro, if you want more detail of the commit stuff, please
yell...
Have fun...

Jean


Re: [resend][PATCH net-drivers-2.6 3/8] e1000: Fixes for packet split related issues

2005-09-01 Thread Malli

There are no changes in the other patches;
I didn't want to flood your mailbox.
-Malli

On 9/1/05, Jeff Garzik <[EMAIL PROTECTED]> wrote:
> I only received parts 2[abc], 3, and 8.
> 
>Jeff
> 
> 
> 

Re: ieee80211 patches

2005-09-01 Thread Stephen Hemminger

By the way, last time I looked at the ifplugd source it was using
outdated and incorrect ways to detect carrier. It should just
open a netlink socket and wait for carrier event. Instead it seems
to muck around looking at MII, wireless API and other ways
that only work on some devices.  In current kernels, all
network devices report carrier the same way.

[PATCH] mv643xx: fix skb memory leak

2005-09-01 Thread Dale Farnsworth

This patch fixes an skb memory leak under heavy receive load
(whenever more packets have been received than the NAPI budget
allows to be processed).

Signed-off-by: Dale Farnsworth <[EMAIL PROTECTED]>

Index: linux-2.6.13-rc6-mm2-mv643xx-enet/drivers/net/mv643xx_eth.c
===
--- linux-2.6.13-rc6-mm2-mv643xx-enet.orig/drivers/net/mv643xx_eth.c
+++ linux-2.6.13-rc6-mm2-mv643xx-enet/drivers/net/mv643xx_eth.c
@@ -412,15 +412,13 @@ static int mv643xx_eth_receive_queue(str
struct pkt_info pkt_info;
 
 #ifdef MV643XX_NAPI
-   while (eth_port_receive(mp, &pkt_info) == ETH_OK && budget > 0) {
+   while (budget-- > 0 && eth_port_receive(mp, &pkt_info) == ETH_OK) {
 #else
while (eth_port_receive(mp, &pkt_info) == ETH_OK) {
 #endif
mp->rx_ring_skbs--;
received_packets++;
-#ifdef MV643XX_NAPI
-   budget--;
-#endif
+
/* Update statistics. Note byte count includes 4 byte CRC count 
*/
stats->rx_packets++;
stats->rx_bytes += pkt_info.byte_cnt;
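
Why the order of the two loop conditions matters can be shown with a
small userspace model. The names (fake_ring, ring_receive, poll_old,
poll_new) are hypothetical stand-ins, not the driver's own functions.

```c
#include <assert.h>

struct fake_ring {
    int pending;    /* packets waiting in the receive ring */
    int taken;      /* packets removed from the ring so far */
};

/* Stand-in for eth_port_receive(): removes one packet from the ring
 * and returns 1, or returns 0 when the ring is empty. */
int ring_receive(struct fake_ring *r)
{
    if (r->pending == 0)
        return 0;
    r->pending--;
    r->taken++;
    return 1;
}

/* Old ordering: a packet is dequeued before the budget is checked, so
 * with the budget used up one packet is taken and never processed. */
int poll_old(struct fake_ring *r, int budget)
{
    int processed = 0;

    assert(budget >= 0);
    while (ring_receive(r) && budget > 0) {
        budget--;
        processed++;
    }
    return processed;
}

/* Patched ordering: the budget is checked first and short-circuit
 * evaluation skips the dequeue once the budget is exhausted. */
int poll_new(struct fake_ring *r, int budget)
{
    int processed = 0;

    assert(budget >= 0);
    while (budget-- > 0 && ring_receive(r))
        processed++;
    return processed;
}
```

With 5 packets pending and a budget of 3, poll_old() processes 3 but
removes 4 from the ring, while poll_new() processes and removes 3.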

Re: ieee80211 patches

2005-09-01 Thread Jean Tourrilhes

On Thu, Sep 01, 2005 at 02:13:21PM +0200, Jiri Benc wrote:
> On Wed, 31 Aug 2005 10:52:54 -0700, Jean Tourrilhes wrote:
> > I personally consider that a bug in ifplugd. For example, the
> > hp100 Ethernet driver will start media sensing only in the open()
> > call, which means that ifplugd won't work on the hp100 driver.
> > It would be trivial to fix ifplugd to open the device without
> > an IP configuration before doing media sensing. If the device doesn't
> > have an IP config, it won't be used by the networking
> > layer. Similarly, when ifplugd doesn't want anymore to use an
> > interface, it remove the IP address from it and leave it up.
> 
> I thought this was done by not passing the -a parameter to ifplugd.

I have not used ifplugd, so I'm pretty clueless to what it
does. James was complaining that ifplugd require carrier sensing
before "ifconfig up" and that was the main root of his issues. If you
tell me that ifplugd can "ifconfig up" before carrier sensing, then we
can just use that and then everybody is happy.
Note that, as I've stated, this is not a wireless specific
issue (it can happen with Ethernet, and I expect some USB Ethernet
devices require firmware loading), so the solution should not be
wireless specific.

> > Anyway, it seems that ifplugd itself see media sensing has not
> > trivial and has many options to do media sensing, and had many changes
> > in that part, so they don't see that as totally clear and
> > straightforward.
> 
> Because there is no common standard? Every driver should do this the
> same way.

Yeah, the Ethernet API are fluctuating, but at least the
Wireless API has always been well defined. The other part of the
reason is that carrier sensing doesn't always have the same meaning,
depending on technology.

> > Having the MAC address before open() is not mandatory. I
> > specifically designed ifrename with a *wide* variety of selectors, so
> > that user could address trivially cases where the device doesn't have
> > a MAC address at boot up, and the man page makes it clear that the MAC
> > address is not always present.
> 
> But (at least part of) distributions recognize cards by MAC addresses.
> And I'm not sure there exists something better than MAC address to
> distinguish cards.
> 
> > For example :
> > 
> > ipw*  driver ipw2100
> > 
> > would work properly on your driver even if you don't read the
> > MAC address until open().
> 
> And if you have two ipw2100 cards?

There can't be two ipw2100 in the same device, because laptops
only have a single mini-PCI slot. But yeah, it could happen for other
devices (prism54 comes to mind).
In the example above, they would be called ipw0 and ipw1,
which is already good enough for most users (the two interfaces offer
the same functionality, so are interchangeable). If you want to really
separate the two interfaces, you could do for example :

ipwfirst  driver ipw2100 businfo :02:05.0
ipwsecond driver ipw2100 businfo :02:07.0

This is why ifrename is infinitely superior to all the
alternatives ;-) And if you have a neat idea for another interesting
selector, I'll add it to ifrename.

> > Correct, however the problem is not WE per say. We are not
> > going to change the networking scripts of all existing distribution to
> > add a specific call to associate wireless devices. And then change all
> > the existing wireless drivers.
> 
> I'm not for specific call for association. But I don't think that the
> problem you described is really a problem - when the device is opened
> without performing association call, association can be requested
> automatically by ieee80211 layer before driver's open method is called.

Yep.
And let's not forget that you can always force an association
when the device is down with "iwconfig commit". But as I was saying
above, as the problem is not wireless specific, the solution should
not be wireless specific.

> It's about the problem with MAC address availability.

I would be worried about making everything too dependent on the
MAC address, as it is something that can be changed by the system or
the user via "ifconfig hw".
Most other OSes seems to identify devices by businfo. Of
course, in the past it was not possible with Linux, because before
2.6.X we did not have a unified bus model (and Pcmcia is only getting
there now).
I'm not sure identifying netdev by businfo is always the right
solution either. But, if we support both, I think we can please
everybody...

> > There are patches on my web page adding Wireless Extensions
> > over RtNetlink, which seems exactly like what everybody is clamoring
> > about. They have been sent a couple of time to this mailing list. To
> > date, I've received *zero* feed

Re: ieee80211 patches

2005-09-01 Thread Jiri Benc

On Wed, 31 Aug 2005 22:40:00 +0200, Pavel Machek wrote:
> AFAICS, with your patches ifconfig shows counts of wifi packets. How
> do I get ethernet packet counts? Will tcpdump wlan0 work on ethernet
> or wifi level?

Ethernet corresponds to 802.3, wifi is 802.11. These are different
standards describing connections over different media. These standards
have different frame formats, different access methods, etc. How can you
get 802.3 (or ethernet) packet from 802.11 device?

The current implementation of ieee80211 as is in ieee80211 branch
contains ugly hack so it works with ethernet frames externally (which
are internally converted to and from 802.11 frames). Because 802.3 and
802.11 have the same format of MAC address, it is somehow possible -
until you get to WDS and similar features. On the other hand, it allows
direct bridging between ethernet and 802.11 network.

One of our patches fixes this so ieee80211 works with 802.11 frames.
This breaks bridging for now (uhm... it isn't change in userspace, is
it?). Later, the 802.11<->802.3 conversion interface will be added to
the bridging code where it logically belongs.

So tcpdump will of course work with 802.11 frames. It understands them
already - try and see :-)

> wlan... there are no other wireless LANs in common use :-).

Actually, I don't really care about it. If you (or anybody else) think
that it is needed, send a patch.

-- 
Jiri Benc
SUSE Labs

Re: ieee80211 patches

2005-09-01 Thread Jiri Benc

On Thu, 01 Sep 2005 11:09:16 +0100, Pedro Ramalhais wrote:
> The scheme looks good to me. Wireless cards mostly map to a regular
> network card. Only difference is that you need to do something to
> configure the link to have "carrier detected" and DHCP should only be
> started after "carrier detected" (IFF_RUNNING IIRC).

The fact that DHCP should be only started when carrier is detected is
not wireless-specific.

> Regarding association only on explicit userspace request, that's fixable
> in the drivers (some drivers automatically associate once they're
> ifconfig'ed up, the ipw2x00 drivers have a module parameter to change
> this behaviour).

ipw drivers are broken in this manner as they try to associate right
after modprobe. No card should do anything unless it is explicitly told
so - policy is the matter of user-space, not the kernel. Startup scripts
in distributions are the right place for instructing a card to
associate.

Sure, this is fixable in drivers. And it really needs to be fixed. Even
more, I think it should be forced by ieee80211 layer.

> > And yes, this brings up the problem with firmware loading. It should
> > really be solved, but trying to solve it by requiring to bring the card
> > up before it is configured is the bad way.
> 
> Why is it the "wrong way"? I don't see a big problem with this, the card
> is only going to be used after it's UP. The only problem i see is that
> it doesn't behave exactly like most network drivers, where they are able
> to detect a link even when they're DOWN. Is there a good reason for a
> card to do anything even when it's DOWN?

You seem to agree with me :-) A card in DOWN state should do nothing. In
particular, it shouldn't associate.

But it should be configurable. And it is really necessary that we can
tell a card that we are done with configuration so it can associate. The
easiest place for doing this is bringing the card to UP state.

> Right, that would need a new interface where all parameters are passed
> at once,

Then you will lose the possibility of having default parameters.

> or keep the existing interface and add another just to
> explicitly associate.

Is there any reason not to do this by bringing the card up ("ifconfig up")?

> Besides that, there should also be a way to configure if you want to
> auto-associate to new access points if the old access point becomes
> unavailable or with a weaker signal than the new one, or if you want to
> manually associate, ex: association is done once and never tried again
> until you tell it to do so. The manual association would be a good thing
> for a wireless managed, where it would have the work of handling new
> networks, APs becoming unavailable and available again, etc.

Yes. If the new AP is in the same ESS, that should be done automatically
(if not explicitly disabled e.g. by setting of explicit BSSID). Of
course you may sometimes want to force reassociation manually - and
there should be some call available for this. (Maybe setting BSSID while
the card is running should force reassociation?)

If you want to change SSID (i.e. associate to a completely new network)
I can imagine that you will be forced to bring the card down, change
settings and bring it up again (but I don't insist on it as it is not
necessary - bringing the card down and up again can be done internally
by ieee80211 layer in such case).

-- 
Jiri Benc
SUSE Labs

Re: ieee80211 patches

2005-09-01 Thread Jiri Benc

On Wed, 31 Aug 2005 17:06:19 -0400, Peter Jones wrote:
> Not necessarily started "by hotplug", but started by something like
> ifplugd or NetworkManager.  And that class of programs is already
> responsible for things like choosing what AP to associate, so it's an
> extra degree of control for them, but not really a hassle for end users,
> who shouldn't have to run "ifconfig up" or anything else 99% of the
> time.

I apologize if I misunderstood you.

We're actually not talking about ifconfig. In this thread, "ifconfig up"
almost always means setting the IFF_UP flag, regardless of the program
actually used. So "that class of programs" is exactly what we're
talking about. And it's not about an extra degree of control, it's about
the right way to do this (i.e. no side effects, no unwanted
associations, etc.).

> You also don't want to "ifconfig down" when you're hopping from AP to AP
> with the same ESSID -- you generally want to tell it to reassociate with
> the other AP, but leave the connection up the entire time.

Sure. And not even that, you want the ieee80211 layer to do this
automatically (based on signal strength) if it is told so.

-- 
Jiri Benc
SUSE Labs

[PATCH] Fix MCAST_EXCLUDE line dupes in igmp/mcast

2005-09-01 Thread Denis Lukianov

No reply from ipv4/6 maintainers, forwarding this to the networking  
list.

Begin forwarded message:

From: Denis Lukianov <[EMAIL PROTECTED]>
Date: 27 August 2005 16:54:05 GMT+04:00
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: [PATCH] Fix MCAST_EXCLUDE line dupe

Grepping for "sfcount[MCAST_EXCLUDE] =" revealed the following dupes:

./net/ipv4/igmp.c:1606:pmc->sfcount[MCAST_EXCLUDE] = 0;
./net/ipv4/igmp.c:1607:pmc->sfcount[MCAST_EXCLUDE] = 1;

./net/ipv6/mcast.c:1971:   pmc->mca_sfcount[MCAST_EXCLUDE] = 0;
./net/ipv6/mcast.c:1972:   pmc->mca_sfcount[MCAST_EXCLUDE] = 1;

The attached patches use MCAST_INCLUDE on the first line in each case  
(separate patches for 2.4 and 2.6, -p1). Please check these as they  
are my first ever kernel patches.

patch_for_2.4.32-pre3
Description: Binary data

patch_for_2.6.13-rc7
Description: Binary data

Cheers,
Denis Lukianov

Re: ieee80211 patches

2005-09-01 Thread Jiri Benc

On Wed, 31 Aug 2005 10:52:54 -0700, Jean Tourrilhes wrote:
>   I personally consider that a bug in ifplugd. For example, the
> hp100 Ethernet driver will start media sensing only in the open()
> call, which means that ifplugd won't work on the hp100 driver.
>   It would be trivial to fix ifplugd to open the device without
> an IP configuration before doing media sensing. If the device doesn't
> have an IP config, it won't be used by the networking
> layer. Similarly, when ifplugd doesn't want anymore to use an
> interface, it remove the IP address from it and leave it up.

I thought this was done by not passing the -a parameter to ifplugd.

>   The other problem is that not all drivers/technologies
> implement media sensing. The media-sensing API is part of the specific
> Ethernet API, not the generic network API, which make sense. If you
> think about it 2sec, media sensing does not make sense in 802.11
> Ad-Hoc mode, neither in Master mode. Having tools or distro depending
> on it at this stage seems quite foolish to me.

It does make sense in ad-hoc mode. AP mode is indeed a special case -
and not only in media sensing. Startup scripts have to deal with AP mode
specifically anyway.

>   Anyway, it seems that ifplugd itself see media sensing has not
> trivial and has many options to do media sensing, and had many changes
> in that part, so they don't see that as totally clear and
> straightforward.

Because there is no common standard? Every driver should do this the
same way.

>   Having the MAC address before open() is not mandatory. I
> specifically designed ifrename with a *wide* variety of selectors, so
> that user could address trivially cases where the device doesn't have
> a MAC address at boot up, and the man page makes it clear that the MAC
> address is not always present.

But (at least part of) distributions recognize cards by MAC addresses.
And I'm not sure there exists something better than MAC address to
distinguish cards.

>   For example :
> 
> ipw*  driver ipw2100
> 
>   would work properly on your driver even if you don't read the
> MAC address until open().

And if you have two ipw2100 cards?

>   Correct, however the problem is not WE per say. We are not
> going to change the networking scripts of all existing distribution to
> add a specific call to associate wireless devices. And then change all
> the existing wireless drivers.

I'm not for specific call for association. But I don't think that the
problem you described is really a problem - when the device is opened
without performing association call, association can be requested
automatically by ieee80211 layer before driver's open method is called.

>   And the actual solution is quite trivial: the driver just
> needs to cache the wireless settings if the card is not up. A few
> drivers do that already, such as orinoco.c, airo.c, atmel.c and
> ray_cs.c. And the Wireless Extension provides a bit of help through
> the "commit" mechanism.
>   Personally, I don't see what the fuss is about.

It's about the problem with MAC address availability.
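
The settings-caching approach quoted above can be sketched roughly as follows. This is a minimal user-space model, not code from orinoco.c or any other driver; `struct wl_dev`, `wl_set_essid()`, `wl_commit()`, `wl_open()` and the `hw_essid` field are hypothetical names introduced only for illustration, and the "hardware" is simulated by a string copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define WL_ESSID_MAX 32   /* 802.11 SSIDs are at most 32 bytes */

/* Hypothetical device state: what userspace asked for versus what the
 * (simulated) hardware has actually been programmed with. */
struct wl_dev {
	bool up;                          /* interface is open (ifconfig up) */
	char essid[WL_ESSID_MAX + 1];     /* setting cached from userspace */
	char hw_essid[WL_ESSID_MAX + 1];  /* what the "hardware" currently uses */
};

/* Push the cached settings to the hardware (simulated by a copy). */
static void wl_commit(struct wl_dev *dev)
{
	strcpy(dev->hw_essid, dev->essid);
}

/* Set the ESSID. Always cache it; only touch the hardware when up.
 * Returns 0 on success, -1 if the name is too long. */
static int wl_set_essid(struct wl_dev *dev, const char *essid)
{
	assert(dev != NULL && essid != NULL);
	if (strlen(essid) > WL_ESSID_MAX)
		return -1;
	strcpy(dev->essid, essid);
	if (dev->up)
		wl_commit(dev);
	return 0;
}

/* open(): bring the interface up and apply whatever was cached. */
static void wl_open(struct wl_dev *dev)
{
	dev->up = true;
	wl_commit(dev);
}
```

In this model a setting made while the interface is down only lands in the cache, and `wl_open()` is the point where it reaches the hardware, so configuring before or after "ifconfig up" ends in the same state.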

>   Actually, it's already unified, and the right way. And it
> works today for a wide range of cards/drivers.

Great to hear it. Unfortunately it doesn't seem that everybody agrees.

>   By the way, how do you define "fully configured" ? If I want
> to connect at home or at a random hotspot, the default configuration
> of most devices (mode=managed ; essid=any ; wep=off ; ip=dhcp) is just
> what I want, and therefore I don't need to "configure" the device.

So you have the card fully configured (i.e. the default configuration is
enough for you) and you can just do ifconfig up.

>   There are patches on my web page adding Wireless Extensions
> over RtNetlink, which seems exactly like what everybody is clamoring
> about. They have been sent a couple of times to this mailing list. To
> date, I've received *zero* feedback on those.

We will take a look.


-- 
Jiri Benc
SUSE Labs
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] Repair Incoming Interface Handling for Raw Socket.

2005-09-01 Thread YOSHIFUJI Hideaki / 吉藤英明

Hello.

Due to changes to enforce checking interface bindings,
sockets did not see loopback packets bound for our local address
on our interface.

e.g.)
  When we ping6 fe80::1%eth0, skb->dev points loopback_dev while
  IP6CB(skb)->iif indicates eth0.

This patch fixes the issue by using appropriate incoming interface,
in the sense of scoping architecture.

Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>

diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -549,7 +549,7 @@ static void icmpv6_notify(struct sk_buff
 	read_lock(&raw_v6_lock);
 	if ((sk = sk_head(&raw_v6_htable[hash])) != NULL) {
 		while((sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr,
-					    skb->dev->ifindex))) {
+					    IP6CB(skb)->iif))) {
 			rawv6_err(sk, skb, NULL, type, code, inner_offset, info);
 			sk = sk_next(sk);
 		}
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -166,7 +166,7 @@ int ipv6_raw_deliver(struct sk_buff *skb
 	if (sk == NULL)
 		goto out;
 
-	sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, skb->dev->ifindex);
+	sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);
 
 	while (sk) {
 		delivered = 1;
@@ -178,7 +178,7 @@ int ipv6_raw_deliver(struct sk_buff *skb
 				rawv6_rcv(sk, clone);
 		}
 		sk = __raw_v6_lookup(sk_next(sk), nexthdr, daddr, saddr,
-				     skb->dev->ifindex);
+				     IP6CB(skb)->iif);
 	}
  out:
 	read_unlock(&raw_v6_lock);
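
The effect of the change can be illustrated with a small stand-alone model. This is only a sketch: `struct pkt`, `struct raw_sock` and `raw_match()` are hypothetical simplifications, not the kernel's structures or `__raw_v6_lookup()` itself, and the matching rule shown (an unbound socket accepts any interface, a bound socket only its own) is an assumption about the interface check being discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layout. */
struct pkt {
	int dev_ifindex;  /* models skb->dev->ifindex (device that delivered it) */
	int iif;          /* models IP6CB(skb)->iif (scoped incoming interface) */
};

struct raw_sock {
	int bound_dev_if; /* 0 means "not bound to any interface" */
};

/* A socket bound to an interface only accepts packets for that interface;
 * an unbound socket accepts packets from any interface. */
static bool raw_match(const struct raw_sock *sk, int ifindex)
{
	assert(sk != NULL);
	return sk->bound_dev_if == 0 || sk->bound_dev_if == ifindex;
}
```

For the ping6 fe80::1%eth0 case, a packet would carry the loopback device's index in `dev_ifindex` and eth0's index in `iif`. A socket bound to eth0 fails `raw_match()` against the former and passes against the latter, which is the behaviour the patch restores.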

-- 
YOSHIFUJI Hideaki @ USAGI Project  <[EMAIL PROTECTED]>
GPG-FP  : 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA

Re: [Ieee80211-devel] Re: ieee80211 patches

2005-09-01 Thread Pedro Ramalhais

Jiri Benc wrote:
> On Sat, 27 Aug 2005 11:21:37 -0500, James Ketrenos wrote:
> 
>>The order required of user space is:
>>
>>   kernel hotplug   hotplug script
>>   --------
>>1. module load
>>2. netdev device registered
>>3.new device
>>4.  ifconfig up
>>5. open: load firmware,
>>6. init device, etc.
>>7.  configure wireless
>>8. scan
>>9. associate
>>A. link
>>B.  carrier detected
>>C.  configure link (dhcp)
> 
> 
> I don't agree with this scheme. Association should be started on
> explicit userspace request (*). As we definitely don't want to add a new
> WE (or some other) call to perform this, the only call we can use for
> telling the driver "ok, now it's the time to associate" is ifconfig up.
> So opening the card should follow its configuration.
> 

The scheme looks good to me. Wireless cards mostly map to a regular
network card. Only difference is that you need to do something to
configure the link to have "carrier detected" and DHCP should only be
started after "carrier detected" (IFF_RUNNING IIRC).
Regarding association only on explicit userspace request, that's fixable
in the drivers (some drivers automatically associate once they're
ifconfig'ed up, the ipw2x00 drivers have a module parameter to change
this behaviour).

> And yes, this brings up the problem with firmware loading. It should
> really be solved, but trying to solve it by requiring to bring the card
> up before it is configured is the bad way.

Why is it the "bad way"? I don't see a big problem with this; the card
is only going to be used after it's UP. The only problem I see is that
it doesn't behave exactly like most network drivers, which are able
to detect a link even when they're DOWN. Is there a good reason for a
card to do anything even when it's DOWN?

> 
> Today, some cards require to be brought up before they are configured
> and some require it in the other order. Distributions have to deal with
> it if they want to support different devices. It definitely needs to be
> unified. And when the unification is performed, why not do this the
> right way?
> 

Agreed, there should be a guideline for the setup behaviour.

> (*) Because it's not good idea to start association before wireless is
> fully configured. Trying to associate to some random AP because
> semi-entered configuration matches the AP settings is very unexpected
> behaviour. And there might arise some problems with allowed/forbidden
> channels as well.
> 

Right, that would need a new interface where all parameters are passed
at once, or keeping the existing interface and adding another one just to
explicitly associate.
Besides that, there should also be a way to configure whether you want to
auto-associate to a new access point if the old access point becomes
unavailable or has a weaker signal than the new one, or whether you want to
associate manually, i.e. association is done once and never tried again
until you tell it to do so. Manual association would be a good thing
for a wireless manager, which would have the job of handling new
networks, APs becoming unavailable and available again, etc.
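
The two association modes described above could be modelled as a small policy check. This is purely a sketch of the idea: `enum assoc_policy`, `struct ap_state` and `should_reassociate()` are hypothetical names that do not exist in the ieee80211 code or Wireless Extensions, and comparing a single signal value is an assumed simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy knob: who decides when to switch access points. */
enum assoc_policy {
	ASSOC_MANUAL, /* associate once; never retry until userspace asks */
	ASSOC_AUTO    /* driver may move to another AP on its own */
};

/* Hypothetical view of an access point as seen in scan results. */
struct ap_state {
	bool available; /* AP is currently visible */
	int signal;     /* signal level, higher is better (e.g. dBm) */
};

/* Decide whether the driver should switch from cur to cand by itself. */
static bool should_reassociate(enum assoc_policy policy,
			       const struct ap_state *cur,
			       const struct ap_state *cand)
{
	assert(cur != NULL && cand != NULL);

	if (policy == ASSOC_MANUAL)
		return false;           /* left to the wireless manager */
	if (!cand->available)
		return false;           /* nothing to switch to */
	return !cur->available || cand->signal > cur->signal;
}
```

With `ASSOC_MANUAL` the function always returns false, leaving every such decision to a userspace manager.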
-- 
Pedro Ramalhais

Re: [resend][PATCH net-drivers-2.6 3/8] e1000: Fixes for packet split related issues

2005-09-01 Thread Jeff Garzik


I only received parts 2[abc], 3, and 8.

Jeff



