Re: [PATCH 0/7] NetXen: Make driver use multiple PCI functions

2007-04-25 Thread Mithlesh Thukral
On Tuesday 24 April 2007 23:31, Jeff Garzik wrote:
> Mithlesh Thukral wrote:
> > hi All,
> >
> > Thanks Stephen for your suggestion. I am resending the 7 patches
> > after incorporating the suggestion.
> > These patches are with respect to netdev#upstream and we wish their
> > inclusion in 2.6.22 kernel.
> >
> > Out of these the first 2 patches were already accepted into the netdev
> > tree, but we have requested them to be dropped. So we are resending those
> > 2. Please see the following thread for more details :
> > http://www.spinics.net/lists/netdev/msg26805.html
>
> So what does that mean?
>
> If the patches were accepted, then you must send further patches
> cumulative to what is currently in the tree.  If it is already accepted,
> you cannot drop a patch.

My apologies for the confusion caused by my email.
We would like these 7 patches to be included in the netdev tree. All of them are
cumulative with what is currently in the tree.

Thanks,
Mithlesh Thukral


> 	Jeff
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH]:Replace with time_before in net/ipv4/ipip.c

2007-04-25 Thread Shani Moideen
Hi,

Replacing (jiffies - errtime < TIMEOUT) with time_before in net/ipv4/ipip.c

thanks.

Signed-off-by: Shani Moideen [EMAIL PROTECTED]



diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 3ec5ce0..d2bc835 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -108,6 +108,7 @@
 #include <linux/init.h>
 #include <linux/netfilter_ipv4.h>
 #include <linux/if_ether.h>
+#include <linux/jiffies.h>
 
 #include <net/sock.h>
 #include <net/ip.h>
@@ -324,7 +325,7 @@ static int ipip_err(struct sk_buff *skb, u32 info)
 	if (t->parms.iph.ttl == 0 && type == ICMP_TIME_EXCEEDED)
 		goto out;
 
-	if (jiffies - t->err_time < IPTUNNEL_ERR_TIMEO)
+	if (time_before(jiffies, t->err_time + IPTUNNEL_ERR_TIMEO))
 		t->err_count++;
 	else
 		t->err_count = 1;
@@ -590,7 +591,7 @@ static int ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	if (tunnel->err_count > 0) {
-		if (jiffies - tunnel->err_time < IPTUNNEL_ERR_TIMEO) {
+		if (time_before(jiffies, tunnel->err_time + IPTUNNEL_ERR_TIMEO)) {
 			tunnel->err_count--;
 			dst_link_failure(skb);
 		} else

-- 
Shani 


[PATCH]:Replacing with time_before in net/ipv4/ip_gre.c

2007-04-25 Thread Shani Moideen
Hi,

Replacing with time_before in net/ipv4/ip_gre.c

thanks.

Signed-off-by: Shani Moideen [EMAIL PROTECTED]


diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 9151da6..05cd63b 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -28,6 +28,7 @@
 #include <linux/igmp.h>
 #include <linux/netfilter_ipv4.h>
 #include <linux/if_ether.h>
+#include <linux/jiffies.h>
 
 #include <net/sock.h>
 #include <net/ip.h>
@@ -376,7 +377,7 @@ static void ipgre_err(struct sk_buff *skb, u32 info)
 	if (t->parms.iph.ttl == 0 && type == ICMP_TIME_EXCEEDED)
 		goto out;
 
-	if (jiffies - t->err_time < IPTUNNEL_ERR_TIMEO)
+	if (time_before(jiffies, t->err_time + IPTUNNEL_ERR_TIMEO))
 		t->err_count++;
 	else
 		t->err_count = 1;
@@ -801,7 +802,7 @@ static int ipgre_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 #endif
 
 	if (tunnel->err_count > 0) {
-		if (jiffies - tunnel->err_time < IPTUNNEL_ERR_TIMEO) {
+		if (time_before(jiffies, tunnel->err_time + IPTUNNEL_ERR_TIMEO)) {
 			tunnel->err_count--;
 
 			dst_link_failure(skb);


Shani



Re: Fix ipOutNoRoutes counter error for TCP and UDP

2007-04-25 Thread weidong
Hi Mr. David,
I have modified my patch according to your advice. I think -EHOSTUNREACH
applies only to the input path. In the output path we can simply check for
-ENETUNREACH (^_^). The patch is at the end of this mail.

BTW: my E-mail has been changed to [EMAIL PROTECTED]


> > Functions that need fixing:
> > tcp_v4_connect(); ip4_datagram_connect(); udp_sendmsg();
>
> I think we need to make these checks more carefully.
>
> Route lookup can fail for several reasons other than
> no route being available.  Two examples are:
>
> 1) Out of memory error while creating route
> 2) IPSEC disallows communication to that flow ID
>
> As a result, we're probably best off limiting the counter
> increment to when the error is either -EHOSTUNREACH or
> -ENETUNREACH.
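
The check suggested in the quoted text can be sketched in plain C. This is only
an illustration under stated assumptions: counts_as_no_route() is a hypothetical
helper, not a kernel function, and the errno values come from the userspace
<errno.h> header.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of the kernel): decide whether a failed
 * route lookup should bump a "no route" counter.  Only the unreachable
 * errors qualify; -ENOMEM, policy rejections and so on do not.
 */
static int counts_as_no_route(int err)
{
	return err == -ENETUNREACH || err == -EHOSTUNREACH;
}
```

A caller would test the lookup's return value with this helper before
incrementing the counter, and return the error unchanged either way.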


Signed-off-by: Wei Dong [EMAIL PROTECTED]


diff -ruNp a/net/ipv4/datagram.c b/net/ipv4/datagram.c
--- a/net/ipv4/datagram.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/datagram.c   2007-04-25 15:21:42.0 +0800
@@ -50,8 +50,12 @@ int ip4_datagram_connect(struct sock *sk
 			       RT_CONN_FLAGS(sk), oif,
 			       sk->sk_protocol,
 			       inet->sport, usin->sin_port, sk);
-	if (err)
+	if (err) {
+		if (err == -ENETUNREACH)
+			IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
 		return err;
+	}
+
 	if ((rt->rt_flags & RTCF_BROADCAST) && !sock_flag(sk, SOCK_BROADCAST)) {
 		ip_rt_put(rt);
 		return -EACCES;
diff -ruNp a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
--- a/net/ipv4/tcp_ipv4.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/tcp_ipv4.c   2007-04-25 15:21:42.0 +0800
@@ -192,8 +192,11 @@ int tcp_v4_connect(struct sock *sk, stru
 			       RT_CONN_FLAGS(sk), sk->sk_bound_dev_if,
 			       IPPROTO_TCP,
 			       inet->sport, usin->sin_port, sk);
-	if (tmp < 0)
+	if (tmp < 0) {
+		if (tmp == -ENETUNREACH)
+			IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
 		return tmp;
+	}
 
 	if (rt->rt_flags & (RTCF_MULTICAST | RTCF_BROADCAST)) {
 		ip_rt_put(rt);
diff -ruNp a/net/ipv4/udp.c b/net/ipv4/udp.c
--- a/net/ipv4/udp.c   2007-04-25 15:20:19.0 +0800
+++ b/net/ipv4/udp.c   2007-04-25 15:21:42.0 +0800
@@ -630,8 +630,11 @@ int udp_sendmsg(struct kiocb *iocb, stru
 						 .dport = dport } } };
 		security_sk_classify_flow(sk, &fl);
 		err = ip_route_output_flow(&rt, &fl, sk, !(msg->msg_flags&MSG_DONTWAIT));
-		if (err)
+		if (err) {
+			if (err == -ENETUNREACH)
+				IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES);
 			goto out;
+		}
 
 		err = -EACCES;
 		if ((rt->rt_flags & RTCF_BROADCAST) &&



Re: Getting the new RxRPC patches upstream

2007-04-25 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

> Yes sure. Note that this is documented:
>
> 	/*
> 	 * Kill off a pending schedule_delayed_work().  Note that the work callback
> 	 * function may still be running on return from cancel_delayed_work().  Run
> 	 * flush_workqueue() or cancel_work_sync() to wait on it.
> 	 */

No, it isn't documented.  It says that the *work* callback may be running, but
does not mention the timer callback.  However, just looking at the
cancellation function source made it clear that this would wait for the timer
handler to return first.


However, is it worth just making cancel_delayed_work() a void function and not
returning anything?  I'm not sure the return value is very useful.

David


Re: [PATCH]:Replace with time_before in net/ipv4/ipip.c

2007-04-25 Thread David Miller
From: Shani Moideen [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 11:30:13 +0530

> Replacing (jiffies - errtime < TIMEOUT) with time_before in net/ipv4/ipip.c
>
> thanks.
>
> Signed-off-by: Shani Moideen [EMAIL PROTECTED]

The test you are replacing actually gives a larger window
of acceptance than time_before() does.

It's been a long standing issue whether we should give up
this semantic advantage for the sake of code cleanliness
in the networking code.
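
To make the difference concrete, here is a small userspace sketch. It assumes
32-bit tick counters and a made-up timeout value; within_sub() and
within_before() are hypothetical names, and the second only mimics the
signed-difference idea behind time_before() rather than reproducing the kernel
macro.

```c
#include <stdint.h>

/* Illustrative timeout in ticks; not the kernel's IPTUNNEL_ERR_TIMEO. */
#define ERR_TIMEO 30000u

/* Open-coded test: unsigned subtraction that wraps modulo 2^32. */
static int within_sub(uint32_t now, uint32_t err_time)
{
	return (uint32_t)(now - err_time) < ERR_TIMEO;
}

/*
 * time_before()-style test: the difference is reinterpreted as signed,
 * so it is only meaningful while the two values are less than half the
 * counter range apart (assumes the usual two's complement conversion).
 */
static int within_before(uint32_t now, uint32_t err_time)
{
	uint32_t deadline = (uint32_t)(err_time + ERR_TIMEO);

	return (int32_t)(now - deadline) < 0;
}
```

With an elapsed time of 10 ticks both helpers return 1, and at exactly
ERR_TIMEO ticks both return 0. With an elapsed time of 0x90000000 ticks, which
is more than half the counter range, the subtraction test still returns 0
while the signed comparison wraps around and returns 1.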



Re: Fix ipOutNoRoutes counter error for TCP and UDP

2007-04-25 Thread David Miller

Please do not post in HTML, nobody will read it, including
me.

Please use plain ASCII text for all mailing list postings,
especially those containing patches.


Thank you.


Re: [Bugme-new] [Bug 8057] New: slab corruption running ip6sic

2007-04-25 Thread Eric Sesterhenn / Snakebyte
* Herbert Xu ([EMAIL PROTECTED]) wrote:
> Jarek Poplawski [EMAIL PROTECTED] wrote:
> >
> > My proposal is: maybe Eric could change this in
> > xfrm6_tunnel_rcv() from xfrm6_tunnel.c e.g. like this:
> >
> > return xfrm6_rcv_spi(skb, spi) > 0 ? : 0;
> >
> > and, if no errors in testing, he could resubmit this patch?
>
> I agree, this is the right fix.


The fix proposed by Jarek does fix the problem. I tested it on two boxes, one
with an -rc5 kernel and one with yesterday's git tree.

Acked-by: Eric Sesterhenn [EMAIL PROTECTED]

--- linux-2.6/net/ipv6/xfrm6_tunnel.c.orig  2007-04-25 00:22:30.0 +0200
+++ linux-2.6/net/ipv6/xfrm6_tunnel.c   2007-04-25 00:22:45.0 +0200
@@ -261,7 +261,7 @@ static int xfrm6_tunnel_rcv(struct sk_bu
 	__be32 spi;
 
 	spi = xfrm6_tunnel_spi_lookup((xfrm_address_t *)&iph->saddr);
-	return xfrm6_rcv_spi(skb, spi);
+	return xfrm6_rcv_spi(skb, spi) > 0 ? : 0;
 }
 
 static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
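
For readers unfamiliar with the "? :" form in the changed line: it is the GNU C
conditional with the middle operand omitted, which yields the value of the
condition itself when that value is non-zero. A minimal sketch, assuming GCC or
Clang; clamp_rcv_result() is a hypothetical stand-in for the return statement,
not a kernel function.

```c
/*
 * Hypothetical stand-in for the new return statement.
 * "rc > 0 ? : 0" is GNU C shorthand for "(rc > 0) ? (rc > 0) : 0",
 * so the result is 1 for a positive rc and 0 for zero or negative rc.
 */
static int clamp_rcv_result(int rc)
{
	return rc > 0 ? : 0;
}
```

Under this reading, a negative return code from the inner call is mapped to 0
and any positive one to 1.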






Re: Problem with commit a0ee18b9b7d3847976c6fb315c06a34fb296de0e

2007-04-25 Thread Ismail Dönmez
Hi,
On Tuesday 24 April 2007 00:23:13 Ismail Dönmez wrote:
> On Tuesday 24 April 2007 00:17:40 Thomas Graf wrote:
> > * Ismail Dönmez [EMAIL PROTECTED] 2007-04-23 22:09
> >
> > > Yes I know the fix is in but I wondered why it's creating such problems
> > > with 2.6.18 kernel, guess it depends on some other commits.
> >
> > As long as you apply the complete patch including the additional
> > sanity check for RTN_MAX it should work perfectly fine on 2.6.18.
>
> The sanity check part doesn't seem to apply to 2.6.18.
>
> > I can't think of any connection between the patch and the errors
> > you are seeing.
> >
> > Are you absolutely sure the errors you see are directly connected
> > to applying the patch?
>
> Yes actually I am but I'll re-test and see. Thanks.

I was able to reproduce the same problem with Linus' GIT tree too. Since I
started seeing these errors after applying commit
a0ee18b9b7d3847976c6fb315c06a34fb296de0e to the 2.6.18 tree, there is a strong
possibility that the commit is the culprit.

I attach the relevant dmesg messages. The problem happened after 12 hours of
uptime, and the network connection became stable again after 1-2 minutes.

Regards,
ismail

-- 
Life is a game, and if you aren't in it to win,
what the heck are you still doing here?

-- Linus Torvalds (talking about open source development)
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
printk: 1 messages suppressed.
Neighbour table overflow.
printk: 4 messages suppressed.
Neighbour table overflow.
printk: 3 messages suppressed.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
printk: 3 messages suppressed.
Neighbour table overflow.
printk: 15 messages suppressed.
Neighbour table overflow.
printk: 6 messages suppressed.
Neighbour table overflow.
printk: 13 messages suppressed.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
printk: 2 messages suppressed.
Neighbour table overflow.
printk: 1 messages suppressed.
Neighbour table overflow.
Neighbour table overflow.



Re: Getting the new RxRPC patches upstream

2007-04-25 Thread Oleg Nesterov
On 04/25, David Howells wrote:

> Oleg Nesterov [EMAIL PROTECTED] wrote:
>
> > Yes sure. Note that this is documented:
> >
> > 	/*
> > 	 * Kill off a pending schedule_delayed_work().  Note that the work callback
> > 	 * function may still be running on return from cancel_delayed_work().  Run
> > 	 * flush_workqueue() or cancel_work_sync() to wait on it.
> > 	 */
>
> No, it isn't documented.  It says that the *work* callback may be running, but
> does not mention the timer callback.  However, just looking at the
> cancellation function source made it clear that this would wait for the timer
> handler to return first.

Ah yes, it says nothing about what the returned value means...

> However, is it worth just making cancel_delayed_work() a void function and not
> returning anything?  I'm not sure the return value is very useful.

cancel_rearming_delayed_work() needs the return value, as does tty_io.c, and
probably some other callers.

Oleg.



Re: Getting the new RxRPC patches upstream

2007-04-25 Thread David Howells
Oleg Nesterov [EMAIL PROTECTED] wrote:

> Ah yes, it says nothing about what the returned value means...

Yeah...  If you could amend that as part of your patch, that'd be great.

David


[PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Howells

The first of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

http://people.redhat.com/~dhowells/rxrpc/series
http://people.redhat.com/~dhowells/rxrpc/01-move-skb-generic.diff
http://people.redhat.com/~dhowells/rxrpc/02-cancel_delayed_work.diff
http://people.redhat.com/~dhowells/rxrpc/03-keys.diff
http://people.redhat.com/~dhowells/rxrpc/04-timer-exports.diff
http://people.redhat.com/~dhowells/rxrpc/05-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

http://people.redhat.com/~dhowells/rxrpc/06-afs-cleanup.diff
http://people.redhat.com/~dhowells/rxrpc/07-af_rxrpc-kernel.diff
http://people.redhat.com/~dhowells/rxrpc/08-af_rxrpc-afs.diff
http://people.redhat.com/~dhowells/rxrpc/09-af_rxrpc-delete-old.diff

And then the rest of the patches extend AFS to provide automatic unmounting of
automount trees, security support and directory-level write support (create,
mkdir, etc.):

http://people.redhat.com/~dhowells/rxrpc/10-afs-multimount.diff
http://people.redhat.com/~dhowells/rxrpc/11-afs-security.diff
http://people.redhat.com/~dhowells/rxrpc/12-afs-doc.diff

http://people.redhat.com/~dhowells/rxrpc/13-netlink-support-MSG_TRUNC.diff
http://people.redhat.com/~dhowells/rxrpc/14-afs-get-capabilities.diff
http://people.redhat.com/~dhowells/rxrpc/15-afs-initcallbackstate3.diff
http://people.redhat.com/~dhowells/rxrpc/16-afs-dir-write-support.diff

Note that file-level write support is not yet complete and so is not included
in this patch set.


The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

http://people.redhat.com/~dhowells/rxrpc/klog.c
http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
http://people.redhat.com/~dhowells/rxrpc/listen.c

The klog program is provided to go and get a Kerberos IV key from the AFS
kaserver.  Currently it must be edited before compiling to note the right
server IP address and the appropriate credentials.

These programs can be compiled by:

	make klog rxrpc listen CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

Then a ticket can be obtained by:

./klog

If a security key is acquired in this way, then all subsequent AFS operations -
including VL lookups and mounts - performed with that session keyring will be
authenticated using that key.  The key can be viewed like so:

[EMAIL PROTECTED] ~]# keyctl show
Session Keyring
   -3 --alswrv  0 0  keyring: _ses.3268
2 --alswrv  0 0   \_ keyring: _uid.0
111416553 --als--v  0 0   \_ rxrpc: [EMAIL PROTECTED]

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
 configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've added a facility whereby calls can be made to destinations other than
 the connect() address of a client socket by making use of msg_name in the
 msghdr struct when using sendmsg() to send the first data packet of a
 call.  Indeed, a client socket need not be connected before being used
 so.

 (*) I've also added a facility whereby client calls may also be made on
 server sockets, again by using msg_name in the msghdr struct.  In such a
 case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
 (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
 the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
 messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
 to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
 added the direct access interface conflicts with patches Herbert Xu is
 producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse work from the
 af_rxrpc-kernel patch to the af_rxrpc main patch.

 (*) Make RxRPC use its own private work queues rather than keventd's to avoid
 deadlocks when AFS tries to use keventd too.  This also puts encryption
 in the private work queue rather than keventd's queue as that might take
 a relatively long time to 

[PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC [try #3]

2007-04-25 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 
 include/linux/key.h |2 ++
 security/keys/keyring.c |2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 	 */
 	union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
 	} type_data;
 
/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle
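
As an illustration of how the widened type_data union might be used by a key
type that has no need for a list link, here is a userspace sketch. The struct
list_head below is a stand-in for the kernel's definition, and
stash_pointers() is a hypothetical helper, not something the patch adds.

```c
/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Mirrors the shape of the type_data union after the patch. */
union type_data {
	struct list_head link;
	unsigned long x[2];
	void *p[2];
};

/* Hypothetical user: store two private pointers instead of a list link. */
static void stash_pointers(union type_data *td, void *a, void *b)
{
	td->p[0] = a;
	td->p[1] = b;
}
```

The members share storage, so a key type has to pick one view of the union
and stick to it for the lifetime of the key.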



[PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code [try #3]

2007-04-25 Thread David Howells
Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions need to be checked by whoever
wrote the functions, as I had to make some guesses about how they work.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/skbuff.h |6 ++
 include/net/esp.h  |2 -
 net/core/skbuff.c  |  188 
 net/xfrm/xfrm_algo.c   |  169 ---
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5992f65..c905d42 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -361,6 +362,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)	kfree_skb(a)
 extern void	      skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 336958f..aa02bd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2005,6 +2006,190 @@ void __init skb_init(void)
 					      NULL, NULL);
 }
 
+/**
+ * skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ * @skb: Socket buffer containing the buffers to be mapped
+ * @sg: The scatter-gather list to map into
+ * @offset: The offset into the buffer's contents to start mapping
+ * @len: Length of buffer space to be mapped
+ *
+ * Fill the specified scatter-gather list with mappings/pointers into a
+ * region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0)
[PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() [try #3]

2007-04-25 Thread David Howells
del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

	before this patch:
		work->func may still be running or queued

	after this patch:
		work->func may still be running or queued, or
		delayed_work_timer_fn->__queue_work() in progress.

		The latter doesn't differ from the caller's POV,
		delayed_work_timer_fn() is called with _PENDING
		bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

	delayed_work->work was cancelled, but delayed_work_timer_fn
	is still running (this is only possible for the re-arming
	works on single-threaded workqueue).

	In this case the timer was re-started by work->func(), nobody
	else can do this. This in turn means that delayed_work_timer_fn
	has already passed __queue_work() (and won't touch delayed_work)
	because nobody else can queue delayed_work->work.

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]
Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..b8abfc7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -191,14 +191,15 @@ int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
- * function may still be running on return from cancel_delayed_work().  Run
- * flush_scheduled_work() to wait on it.
+ * function may still be running on return from cancel_delayed_work(), unless
+ * it returns 1 and the work doesn't re-arm itself. Run flush_workqueue() or
+ * cancel_work_sync() to wait on it.
  */
 static inline int cancel_delayed_work(struct delayed_work *work)
 {
 	int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
 	return ret;



[PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly [try #3]

2007-04-25 Thread David Howells
Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.

Also delete the silent parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/super.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
 	struct afs_mount_params *params = data;
 	struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
 		goto error;
 	}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);



[PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 [try #3]

2007-04-25 Thread David Howells
[NETLINK]: Mirror UDP MSG_TRUNC semantics.

If the user passes MSG_TRUNC in via msg_flags, return
the full packet size not the truncated size.

Idea from Herbert Xu and Thomas Graf.

Signed-off-by: David S. Miller [EMAIL PROTECTED]
---

 net/netlink/af_netlink.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c48b0f4..5890210 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1242,6 +1242,9 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 
 	scm_recv(sock, msg, siocb->scm, flags);
 
+	if (flags & MSG_TRUNC)
+		copied = skb->len;
+
 out:
 	netlink_rcv_wake(sk);
 	return err ? : copied;
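
As a rough illustration of the semantics described in the commit message above, the return-length rule can be modelled in plain userspace C. This is only a sketch under assumptions, not the kernel code: recv_reported_len() and its parameters are hypothetical names, with pkt_len standing in for skb->len.

```c
#include <stddef.h>
#include <sys/socket.h>   /* MSG_TRUNC */

/*
 * Hypothetical model of the length a receive call reports; not kernel code.
 *   buf_len: size of the caller's buffer
 *   pkt_len: full length of the queued message (skb->len in the patch)
 *   flags:   flags the caller passed to the receive call
 */
static size_t recv_reported_len(size_t buf_len, size_t pkt_len, int flags)
{
    /* The bytes actually copied are capped at the buffer size. */
    size_t copied = pkt_len < buf_len ? pkt_len : buf_len;

    /* If the caller asked for MSG_TRUNC, report the full message size. */
    if (flags & MSG_TRUNC)
        copied = pkt_len;

    return copied;
}
```

With a 16-byte buffer and a 100-byte message, the model returns 16 without MSG_TRUNC and 100 with it, which lets a caller detect truncation and size its buffer for a retry.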



[PATCH 15/16] AFS: Implement the CB.InitCallBackState3 operation [try #3]

2007-04-25 Thread David Howells
Implement the CB.InitCallBackState3 operation for the fileserver to call.
This reduces the amount of network traffic because if this op is aborted, the
fileserver will then attempt a CB.InitCallBackState operation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/AFS_CM.h    |    1 +
 fs/afs/cmservice.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/fs/afs/AFS_CM.h b/fs/afs/AFS_CM.h
index d4bd201..7b4d4fa 100644
--- a/fs/afs/AFS_CM.h
+++ b/fs/afs/AFS_CM.h
@@ -23,6 +23,7 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBInitCallBackState3	= 213,	/* initialise callback state, version 3 */
 	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 5139723..3d58861 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -20,6 +20,8 @@ struct workqueue_struct *afs_cm_workqueue;
 
 static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *,
+						struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
@@ -47,6 +49,16 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
 };
 
 /*
+ * CB.InitCallBackState3 operation type
+ */
+static const struct afs_call_type afs_SRXCBInitCallBackState3 = {
+	.name		= "CB.InitCallBackState3",
+	.deliver	= afs_deliver_cb_init_call_back_state3,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
@@ -83,6 +95,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBInitCallBackState:
 		call->type = &afs_SRXCBInitCallBackState;
 		return true;
+	case CBInitCallBackState3:
+		call->type = &afs_SRXCBInitCallBackState3;
+		return true;
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
@@ -312,6 +327,37 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call,
 }
 
 /*
+ * deliver request data to a CB.InitCallBackState3 call
+ */
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *call,
+						struct sk_buff *skb,
+						bool last)
+{
+	struct afs_server *server;
+	struct in_addr addr;
+
+	_enter(",{%u},%d", skb->len, last);
+
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	/* we'll need the file server record as that tells us which set of
+	 * vnodes to operate upon */
+	memcpy(&addr, &skb->nh.iph->saddr, 4);
+	server = afs_find_server(&addr);
+	if (!server)
+		return -ENOTCONN;
+	call->server = server;
+
+	INIT_WORK(&call->work, SRXAFSCB_InitCallBackState);
+	schedule_work(&call->work);
+	return 0;
+}
+
+/*
  * allow the fileserver to see if the cache manager is still alive
  */
 static void SRXAFSCB_Probe(struct work_struct *work)



[PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation [try #3]

2007-04-25 Thread David Howells
Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
 plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
 is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
 to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
 to pull out the MAC address of the lowest index interface to use in UUID
 construction.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/AFS_CM.h|3 
 fs/afs/Makefile|1 
 fs/afs/cmservice.c |   98 ++
 fs/afs/internal.h  |   42 
 fs/afs/main.c  |   49 +
 fs/afs/rxrpc.c |   39 
 fs/afs/use-rtnetlink.c |  473 
 7 files changed, 705 insertions(+), 0 deletions(-)

diff --git a/fs/afs/AFS_CM.h b/fs/afs/AFS_CM.h
index 7c8e3d4..d4bd201 100644
--- a/fs/afs/AFS_CM.h
+++ b/fs/afs/AFS_CM.h
@@ -23,6 +23,9 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
+#define AFS_CAP_ERROR_TRANSLATION	0x1
+
 #endif /* AFS_FS_H */
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index cca198b..01545eb 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -18,6 +18,7 @@ kafs-objs := \
 	security.o \
 	server.o \
 	super.o \
+	use-rtnetlink.o \
 	vlclient.o \
 	vlocation.o \
 	vnode.o \
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 9cb3ac5..5139723 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -22,6 +22,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
+static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
+					   bool);
 static void afs_cm_destructor(struct afs_call *);
 
 /*
@@ -55,6 +57,16 @@ static const struct afs_call_type afs_SRXCBProbe = {
 };
 
 /*
+ * CB.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_SRXCBGetCapabilites = {
+	.name		= "CB.GetCapabilities",
+	.deliver	= afs_deliver_cb_get_capabilities,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * route an incoming cache manager call
  * - return T if supported, F if not
  */
@@ -74,6 +86,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
+	case CBGetCapabilities:
+		call->type = &afs_SRXCBGetCapabilites;
+		return true;
 	default:
 		return false;
 	}
@@ -328,3 +343,86 @@ static int afs_deliver_cb_probe(struct afs_call *call, struct sk_buff *skb,
 	schedule_work(&call->work);
 	return 0;
 }
+
+/*
+ * allow the fileserver to ask about the cache manager's capabilities
+ */
+static void SRXAFSCB_GetCapabilities(struct work_struct *work)
+{
+	struct afs_interface *ifs;
+	struct afs_call *call = container_of(work, struct afs_call, work);
+	int loop, nifs;
+
+	struct {
+		struct /* InterfaceAddr */ {
+			__be32 nifs;
+			__be32 uuid[11];
+			__be32 ifaddr[32];
+			__be32 netmask[32];
+			__be32 mtu[32];
+		} ia;
+		struct /* Capabilities */ {
+			__be32 capcount;
+			__be32 caps[1];
+		} cap;
+	} reply;
+
+	_enter("");
+
+	nifs = 0;
+	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
+	if (ifs) {
+		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		if (nifs < 0) {
+			kfree(ifs);
+			ifs = NULL;
+			nifs = 0;
+		}
+	}
+
+	memset(&reply, 0, sizeof(reply));
+	reply.ia.nifs = htonl(nifs);
+
+	reply.ia.uuid[0] = htonl(afs_uuid.time_low);
+	reply.ia.uuid[1] = htonl(afs_uuid.time_mid);
+	reply.ia.uuid[2] = htonl(afs_uuid.time_hi_and_version);
+   reply.ia.uuid[3] = htonl((s8) 

[PATCH 12/16] AFS: Update the AFS fs documentation [try #3]

2007-04-25 Thread David Howells
Update the AFS fs documentation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/filesystems/afs.txt |  214 +++--
 1 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237d..12ad6c7 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+		    ====================
 		    kAFS: AFS FILESYSTEM
 		    ====================
 
-ABOUT
-=====
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+========
+OVERVIEW
+========
 
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set.  The features
+it does support include:
 
-   (*) Write support.
-   (*) Communications security.
-   (*) Local caching.
-   (*) pioctl() system call.
-   (*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
 
+ (*) File reading.
 
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===========
+COMPILATION
+===========
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+   CONFIG_AF_RXRPC - The RxRPC protocol transport
+   CONFIG_RXKAD- The RxRPC Kerberos security handler
+   CONFIG_AFS  - The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+   CONFIG_AF_RXRPC_DEBUG   - Permit AF_RXRPC debugging to be enabled
+   CONFIG_AFS_DEBUG- Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+   /sys/module/af_rxrpc/parameters/debug
+   /sys/module/afs/parameters/debug
+
+
+=====
 USAGE
 =====
 
 When inserting the driver modules the root cell must be specified along with a
 list of volume location server IP addresses:
 
-   insmod rxrpc.o
+   insmod af_rxrpc.o
+   insmod rxkad.o
insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
 
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver.  This provides the
+RxRPC remote operation protocol and may also be accessed from userspace.  See:
+
+   Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
 
 Once the module has been loaded, more modules can be added by the following
 procedure:
@@ -33,7 +84,7 @@ procedure:
 	echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
 
 Where the parameters to the add command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
 
 Filesystems can be mounted anywhere by commands similar to the following:
 
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
 	mount -t afs "#root.afs." /afs
 	mount -t afs "#root.cell." /afs/cambridge
 
-  NB: When using this on Linux 2.4, the mount command has to be different,
-  since the filesystem doesn't have access to the device name argument:
-
-	mount -t afs none /afs -ovol="#root.afs."
-
 Where the initial character is either a hash or a percent symbol depending on
 whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
 volume, but are willing to use a R/W volume instead (percent).
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
 Additional cells can be added through /proc (see later section).
 
 
+===========
 MOUNTPOINTS
 ===========
 
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the device name passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the device name passed to mount).  kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics).  If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
 
-  (*) They cannot 

Re: [PATCH] IPROUTE: Modify tc for new PRIO multiqueue behavior

2007-04-25 Thread jamal
On Tue, 2007-24-04 at 21:05 -0700, Stephen Hemminger wrote:
> Peter P Waskiewicz Jr wrote:
>
> Only if this is binary compatible with older kernels.

It is not. But I think that is a lesser problem; the bigger question is:
Why would you need to change a qdisc just so you can support egress
multiqueues?

BTW, is there any reason this is being cced to lkml?

cheers,
jamal

PS: I haven't read the kernel patches (I am congested and about 1000
messages behind on netdev) and my opinions may be influenced by an
approach I have in trying to help someone fix up a wireless driver with
multiqueue support.



Re: [Bugme-new] [Bug 8057] New: slab corruption running ip6sic

2007-04-25 Thread Jarek Poplawski
On Wed, Apr 25, 2007 at 10:27:59AM +0200, Eric Sesterhenn / Snakebyte wrote:
> * Herbert Xu ([EMAIL PROTECTED]) wrote:
> > Jarek Poplawski [EMAIL PROTECTED] wrote:
> > >
> > > My proposal is: maybe Eric could change this in
> > > xfrm6_tunnel_rcv() from xfrm6_tunnel.c e.g. like this:
> > >
> > > return xfrm6_rcv_spi(skb, spi) > 0 ? : 0;
> > >
> > > and, if no errors in testing, he could resubmit this patch?
> >
> > I agree, this is the right fix.
>
>
> The fix proposed by Jarek indeed fixes the problem, tested on two boxes,
> with an -rc5 kernel and a yesterdays git
>
> Acked-by: Eric Sesterhenn [EMAIL PROTECTED]
>
> --- linux-2.6/net/ipv6/xfrm6_tunnel.c.orig	2007-04-25 00:22:30.000000000 +0200
> +++ linux-2.6/net/ipv6/xfrm6_tunnel.c	2007-04-25 00:22:45.000000000 +0200
> @@ -261,7 +261,7 @@ static int xfrm6_tunnel_rcv(struct sk_bu
>  	__be32 spi;
>  
>  	spi = xfrm6_tunnel_spi_lookup((xfrm_address_t *)&iph->saddr);
> -	return xfrm6_rcv_spi(skb, spi);
> +	return xfrm6_rcv_spi(skb, spi) > 0 ? : 0;
>  }
>  
>  static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
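
For readers unfamiliar with the idiom in the quoted fix: the archive stripped the comparison operator, and the sketch below assumes it is `>`. The expression relies on the GNU C conditional with an omitted middle operand, so `x > 0 ? : 0` parses as `(x > 0) ?: 0` and yields 1 for a positive value and 0 otherwise. The function name clamp_rcv_result() is hypothetical and only stands in for the return statement.

```c
/*
 * GNU C extension: `a ?: b` evaluates to a if a is nonzero, else b.
 * Hypothetical stand-in for the return statement in the quoted fix.
 */
static int clamp_rcv_result(int ret)
{
    /* Parsed as (ret > 0) ?: 0, so the result is 1 when ret is
     * positive, and 0 when ret is zero or negative. */
    return ret > 0 ? : 0;
}
```

Under that assumption a negative return value is never passed up to the caller, which only ever sees 0 or 1.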

I think the credit for the main idea of the fix, plus the testing, is Eric's
alone; I only proposed a change in placement, of cosmetic value.

So, Eric, considering all massive work you've done with rather feeble
support, please, fix the comment and sign this patch (at least I'm not
going to).

I would also ask Andrew to change the assignment of this patch in -mm.

Thanks & regards,
Jarek P.


[GIT PATCH] [net-2.6.22] IPv6, IPv4 Updates

2007-04-25 Thread YOSHIFUJI Hideaki / 吉藤英明
Dave,

Please consider pulling following commits available on
net-2.6.22-20070425a-inet6-cleanup-20070425
branch at
git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev.git.

HEADLINES
---------

[IPV6] SIT: Unify code path to get hash array index.
[IPV4] IPIP: Unify code path to get hash array index.
[IPV4] IP_GRE: Unify code path to get hash array index.
[IPV6]: Export in6addr_any for future use.
[IPV6] XFRM: Use ip6addr_any where applicable.
[IPV6] NDISC: Unify main process of sending ND messages.

DIFFSTAT
--------

 include/linux/in6.h|2 
 net/ipv4/ip_gre.c  |   23 ++--
 net/ipv4/ipip.c|   22 +---
 net/ipv6/addrconf.c|2 
 net/ipv6/ndisc.c   |  283 ++--
 net/ipv6/sit.c |   23 ++--
 net/ipv6/xfrm6_input.c |4 -
 7 files changed, 112 insertions(+), 247 deletions(-)

CHANGESETS
----------

commit ed808452811f1b5b55727ab6c5336a488d5689b4
Author: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date:   Tue Apr 24 20:44:47 2007 +0900

[IPV6] SIT: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 27fe10f..1efa95a 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -99,10 +99,10 @@ static struct ip_tunnel * ipip6_tunnel_lookup(__be32 remote, __be32 local)
 	return NULL;
 }
 
-static struct ip_tunnel ** ipip6_bucket(struct ip_tunnel *t)
+static struct ip_tunnel **__ipip6_bucket(struct ip_tunnel_parm *parms)
 {
-	__be32 remote = t->parms.iph.daddr;
-	__be32 local = t->parms.iph.saddr;
+	__be32 remote = parms->iph.daddr;
+	__be32 local = parms->iph.saddr;
 	unsigned h = 0;
 	int prio = 0;
 
@@ -117,6 +117,11 @@ static struct ip_tunnel ** ipip6_bucket(struct ip_tunnel *t)
 	return &tunnels[prio][h];
 }
 
+static inline struct ip_tunnel **ipip6_bucket(struct ip_tunnel *t)
+{
+	return __ipip6_bucket(&t->parms);
+}
+
 static void ipip6_tunnel_unlink(struct ip_tunnel *t)
 {
 	struct ip_tunnel **tp;
@@ -147,19 +152,9 @@ static struct ip_tunnel * ipip6_tunnel_locate(struct ip_tunnel_parm *parms, int
 	__be32 local = parms->iph.saddr;
 	struct ip_tunnel *t, **tp, *nt;
 	struct net_device *dev;
-	unsigned h = 0;
-	int prio = 0;
 	char name[IFNAMSIZ];
 
-	if (remote) {
-		prio |= 2;
-		h ^= HASH(remote);
-	}
-	if (local) {
-		prio |= 1;
-		h ^= HASH(local);
-	}
-	for (tp = &tunnels[prio][h]; (t = *tp) != NULL; tp = &t->next) {
+	for (tp = __ipip6_bucket(parms); (t = *tp) != NULL; tp = &t->next) {
 		if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr)
 			return t;
 	}
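
The bucket selection that this changeset moves into one helper can be sketched in standalone C. This is an illustration only: struct bucket_idx and tunnel_bucket() are hypothetical names, and the HASH() definition is an assumption (a 4-bit fold of the address), not something shown in the diff.

```c
#include <stdint.h>

/* Assumed 4-bit fold of an address; the real macro is not in the diff. */
#define HASH(addr) (((addr) ^ ((addr) >> 4)) & 0xF)

/* Hypothetical pair standing in for the tunnels[prio][h] indices. */
struct bucket_idx {
    int prio;        /* bit 1: remote set, bit 0: local set */
    unsigned int h;  /* hash slot, 0..15 */
};

static struct bucket_idx tunnel_bucket(uint32_t remote, uint32_t local)
{
    struct bucket_idx b = { 0, 0 };

    if (remote) {
        b.prio |= 2;
        b.h ^= HASH(remote);
    }
    if (local) {
        b.prio |= 1;
        b.h ^= HASH(local);
    }
    return b;
}
```

Because both the lookup by tunnel and the lookup by parameters go through the same computation, the two call sites cannot drift apart, which appears to be the point of the unification.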

---
commit 2f66586f53dd6319323c7d0c6ac0d4a4fb522865
Author: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date:   Tue Apr 24 20:44:47 2007 +0900

[IPV4] IPIP: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]

diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 37ab391..ebd2f2d 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -157,10 +157,10 @@ static struct ip_tunnel * ipip_tunnel_lookup(__be32 remote, __be32 local)
 	return NULL;
 }
 
-static struct ip_tunnel **ipip_bucket(struct ip_tunnel *t)
+static struct ip_tunnel **__ipip_bucket(struct ip_tunnel_parm *parms)
 {
-	__be32 remote = t->parms.iph.daddr;
-	__be32 local = t->parms.iph.saddr;
+	__be32 remote = parms->iph.daddr;
+	__be32 local = parms->iph.saddr;
 	unsigned h = 0;
 	int prio = 0;
 
@@ -175,6 +175,10 @@ static struct ip_tunnel **ipip_bucket(struct ip_tunnel *t)
 	return &tunnels[prio][h];
 }
 
+static inline struct ip_tunnel **ipip_bucket(struct ip_tunnel *t)
+{
+	return __ipip_bucket(&t->parms);
+}
 
 static void ipip_tunnel_unlink(struct ip_tunnel *t)
 {
@@ -206,19 +210,9 @@ static struct ip_tunnel * ipip_tunnel_locate(struct ip_tunnel_parm *parms, int c
 	__be32 local = parms->iph.saddr;
 	struct ip_tunnel *t, **tp, *nt;
 	struct net_device *dev;
-	unsigned h = 0;
-	int prio = 0;
 	char name[IFNAMSIZ];
 
-	if (remote) {
-		prio |= 2;
-		h ^= HASH(remote);
-	}
-	if (local) {
-		prio |= 1;
-		h ^= HASH(local);
-	}
-	for (tp = &tunnels[prio][h]; (t = *tp) != NULL; tp = &t->next) {
+	for (tp = __ipip_bucket(parms); (t = *tp) != NULL; tp = &t->next) {
 		if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr)
 			return t;
 	}

---
commit e8b22bea08420e24a09e32972f455c21206fe102
Author: YOSHIFUJI Hideaki [EMAIL PROTECTED]
Date:   Tue Apr 24 20:44:48 2007 +0900

[IPV4] IP_GRE: Unify code path to get hash array index.

Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]

diff --git a/net/ipv4

Re: netlink locking warnings in 2.6.21-rc7-mm1

2007-04-25 Thread Patrick McHardy
David Miller wrote:
> I think I see what might be the problem, nlk->cb_mutex is set
> to rtnl_mutex and this is used for other purposes in various
> code paths here, maybe there is a double mutex_unlock() or
> similar due to that?


Nothing in the callbacks should be touching the rtnl, that would
have been broken before since we already used to hold it during
the first invocation of the dump callback, the only difference
is that we now hold it during the entire dump operation. The
cb_mutex is only set on socket creation, so there's also nothing
that should be rewriting it.

I can't see what's wrong here.


Re: netlink locking warnings in 2.6.21-rc7-mm1

2007-04-25 Thread Patrick McHardy
Herbert Xu wrote:
> David Miller [EMAIL PROTECTED] wrote:
>
>> I think I see what might be the problem, nlk->cb_mutex is set
>> to rtnl_mutex and this is used for other purposes in various
>> code paths here, maybe there is a double mutex_unlock() or
>> similar due to that?
>
>
> Indeed, the RTNL is held during the processing of all RTNETLINK
> messages so we'd be trying to lock it recursively here which is
> not allowed.


No, it is released before calling netlink_dump_start().

> Actually I'm not quite sure what the benefit is for allowing an
> override CB mutex.  Since we still have to take it and we always
> allocate memory for a mutex anyway this would seem to be strictly
> worse than just using our own mutex.


The idea was that netlink families that don't want to consistently
hold the same mutex used for queue processing during the entire
dump operation can still have per-socket mutexes just to protect
the callback data and have concurrent dump continuations.



Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

> I'm ducking all feature and cleanup patches now, and probably shall
> continue to do so for some weeks.  The priority (which I believe to be
> increasingly urgent) is to fix the 2.6.21 regressions and to stabilise
> the things which we presently have queued for 2.6.22.  Not to
> mention the 1000ish unaddressed bug reports in bugzilla and elsewhere.

Fair enough.  I think the idea is for them (or at least some of them) to go
through one of DaveM's net git trees anyway.

David


Re: Problem with commit a0ee18b9b7d3847976c6fb315c06a34fb296de0e

2007-04-25 Thread Ismail Dönmez
On Wednesday 25 April 2007 11:52:52 Ismail Dönmez wrote:
> Hi,
>
> On Tuesday 24 April 2007 00:23:13 Ismail Dönmez wrote:
> > On Tuesday 24 April 2007 00:17:40 Thomas Graf wrote:
> > > * Ismail Dönmez [EMAIL PROTECTED] 2007-04-23 22:09
> > >
> > > > Yes I know the fix is in but I wondered why its creating such
> > > > problems with 2.6.18 kernel, guess it depends on some other commits.
> > >
> > > As long as you apply the complete patch including the additional
> > > sanity check for RTN_MAX it should work perfectly fine on 2.6.18.
> >
> > The sanity check part doesn't seem to apply to 2.6.18.
> >
> > > I can't think of any connection between the patch and the errors
> > > you are seeing.
> > >
> > > Are you absolutely sure the errors you see are directly connected
> > > to applying the patch?
> >
> > Yes actually I am but I'll re-test and see. Thanks.
>
> I was able to reproduce the same problem with Linus' GIT tree too. Since I
> started to see these after I applied the commit
> a0ee18b9b7d3847976c6fb315c06a34fb296de0e to 2.6.18 tree, there is a big
> possibility that the commit is the culprit.
>
> I attach the relevant dmesg messages. Problem happened after 12 hours of
> uptime and net connection gets stable again after 1-2 minutes.

Ignore this; my laptop seems to be totally dying now, including the wireless
card. Sorry for the noise.

Regards,
ismail

-- 
Life is a game, and if you aren't in it to win,
what the heck are you still doing here?

-- Linus Torvalds (talking about open source development)




Re: Getting the new RxRPC patches upstream

2007-04-25 Thread David Howells

David Miller [EMAIL PROTECTED] wrote:

> Is it possible for your changes to be purely networking
> and not need those changes outside of the networking?

See my latest patchset release.  I've reduced the dependencies on
non-networking changes to:

 (1) Oleg Nesterov's patch to change cancel_delayed_work() to use del_timer()
 rather than del_timer_sync() [patch 02/16].

 This patch can be discarded without compilation failure at the expense of
 making AFS slightly less efficient. It also makes AF_RXRPC slightly less
 efficient, but only in the rmmod path.

 (2) A symbol export in the keyring stuff plus a proliferation of the types
 available in the struct key::type_data union [patch 03/16].  This does
 not conflict with any other patches that I know about.

 (3) A symbol export in the timer stuff [patch 04/16].

Everything else that remains after the reduction is confined to the AF_RXRPC
or AFS code, save for a couple of networking patches in my patchset that you
already have and I just need to make the thing compile.

I'm not sure that I can make the AF_RXRPC patches totally independent of the
AFS patches as the two sets need to interleave since the last AF_RXRPC patch
deletes the old RxRPC code - which the old AFS code depends on.

David


Re: 2.6.20.7 mss negotiation and path mtu discovery mostly broken?

2007-04-25 Thread Andi Kleen
Ristuccia, Brian [EMAIL PROTECTED] writes:

> I'm seeing a problem where the kernel attempts to send packets with a
> MSS larger than the one negotiated when the TCP connection is
> established. Even after ICMP can't fragment messages arrive, the
> kernel still attempts to increase the MSS rather aggressively. The end
> result is extremely poor throughput when sending to a network with a
> smaller MTU.
>
> In /proc/sys/net/ipv4:
> ip_no_pmtu_disc:0
> tcp_mtu_probing:0
>
> The sending host (10.2.10.254) has an MTU of 9000. The destination host
> (12.33.234.69) has an MTU of 1500. There is one router between the hosts
> which will drop packets with the DF flag when they don't fit the
> destination interface's MTU and generates the required icmp can't
> fragment message.
>
> The dump shows the initial handshake with correct mss options sent:
>
> 08:39:55.493029 IP 12.33.234.69.35026 > 10.2.10.254.22: S 2768979373:2768979373(0) win 5840 <mss 1460,sackOK,timestamp 3873837730 0,nop,wscale 2>
> 08:39:55.493119 IP 10.2.10.254.22 > 12.33.234.69.35026: S 963242385:963242385(0) ack 2768979374 win 17896 <mss 8960,sackOK,timestamp 413751

The MSS clamp for sending to 10.2.10.254.22 is 8960. MSS is only
one way -- each uses what the other tells it.
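
The "one way" rule can be made concrete with a little arithmetic. The sketch below is a simplified model, not kernel code: effective_send_payload() is a hypothetical helper that assumes IPv4 with 40 bytes of base IP and TCP headers, and a caller-supplied TCP option length (12 bytes when timestamps are in use).

```c
/*
 * Hypothetical model of the largest payload one side may put in a segment.
 *   peer_mss:    MSS the other side advertised in its SYN
 *   path_mtu:    current MTU toward the peer (interface MTU or learned PMTU)
 *   tcp_opt_len: bytes of TCP options carried per segment
 */
static unsigned int effective_send_payload(unsigned int peer_mss,
                                           unsigned int path_mtu,
                                           unsigned int tcp_opt_len)
{
    /* Assumes IPv4: 20 bytes of IP header plus 20 bytes of TCP header. */
    unsigned int mtu_mss = path_mtu - 40;
    unsigned int mss = peer_mss < mtu_mss ? peer_mss : mtu_mss;

    /* Options take space out of each segment. */
    return mss - tcp_opt_len;
}
```

Plugging in the values from the dump, a sender with a 9000-byte MTU whose peer advertised 1460 gets 1460 - 12 = 1448 bytes of payload per segment, which matches the deliverable segment size mentioned in the report.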

> In the following dump, the system eventually gets in a state where it
> oscillates between sending undeliverable 2896 byte packets and
> deliverable 1448 byte ones.

This should only happen on PMTU expire, which is normally ~15mins.
Perhaps you misconfigured it manually using sysctl.

-And


RE: 2.6.20.7 mss negotiation and path mtu discovery mostly broken?

2007-04-25 Thread Ristuccia, Brian

> > 08:39:55.493029 IP 12.33.234.69.35026 > 10.2.10.254.22: S 2768979373:2768979373(0) win 5840 <mss 1460,sackOK,timestamp 3873837730 0,nop,wscale 2>
> > 08:39:55.493119 IP 10.2.10.254.22 > 12.33.234.69.35026: S 963242385:963242385(0) ack 2768979374 win 17896 <mss 8960,sackOK,timestamp 413751
>
> The MSS clamp for sending to 10.2.10.254.22 is 8960. MSS is
> only one way -- each uses what the other tells it.
>

Right - except that 10.2.10.254 keeps sending to 12.33.234.69 with an
increasingly large MSS, even though 12.33.234.69 has asked for no larger
than 1460.

> > In the following dump, the system eventually gets in a state where it
> > oscillates between sending undeliverable 2896 byte packets and
> > deliverable 1448 byte ones.
>
> This should only happen on PMTU expire, which is normally ~15mins.
> Perhaps you misconfigured it manually using sysctl.
>

This is /proc/sys/net/ipv4/route/mtu_expires? It's 600.

-- 
Brian Ristuccia




Re: [PATCH] usb-net/pegasus: fix pegasus carrier detection

2007-04-25 Thread Petko Manolov
In general I agree with the reasoning below.  However, isn't it better to
remove the code that sets carrier on/off in intr_callback()?


There's a reliable way of getting the link status by reading the MII.
Once the return value from read_mii_word() is checked correctly,
set_carrier() is good enough.  If 2 seconds is too long an
interval we could reduce it to 1 second or, if needed, less.


I'd like to avoid adding additional flags per device as it will take
forever to collect information about their correct behavior and update
pegasus.h.  In short, I think this part of your patch should be enough:


---

@@ -847,10 +848,16 @@ static void intr_callback(struct urb *urb)
 		 * d[0].NO_CARRIER kicks in only with failed TX.
 		 * ... so monitoring with MII may be safest.
 		 */
-		if (d[0] & NO_CARRIER)
-			netif_carrier_off(net);
-		else
-			netif_carrier_on(net);
-
 		/* bytes 3-4 == rx_lostpkt, reg 2E/2F */
 		pegasus->stats.rx_missed_errors += ((d[3] & 0x7f) << 8) | d[4];
@@ -950,7 +957,7 @@ static void set_carrier(struct net_device *net)
 	pegasus_t *pegasus = netdev_priv(net);
 	u16 tmp;

-	if (!read_mii_word(pegasus, pegasus->phy, MII_BMSR, &tmp))
+	if (read_mii_word(pegasus, pegasus->phy, MII_BMSR, &tmp))
 		return;

---


cheers,
Petko


On Tue, 24 Apr 2007, Dan Williams wrote:


On Tue, 2007-04-24 at 20:48 +0300, [EMAIL PROTECTED] wrote:

On Tue, Apr 24, 2007 at 12:49:12PM -0400, Jeff Garzik wrote:

 Long term, Greg seemed OK with moving the net drivers from drivers/usb/net
 to drivers/net/usb, in line with the current policy of placing net drivers
 in drivers/net/*, bus agnostic.  After that move, sending to netdev and me
 (as you did here) would be the preferred avenue.


Speaking of which, do you want me to do this in the 2.6.22-rc1
timeframe?  Usually big code moves like this are good to do right after
rc1 comes out as the major churn is usually completed then.


Sorry to interfere, but could you guys wait until tomorrow before applying
the patch to your respective GIT trees?  I'd like to check if the code is
doing the right thing and avoid patch reversal.


Original problem was that the patch I referenced in the commit message
from Jan 6 2006 switched the return value semantics from
read_mii_word().  Before the patch, read_mii_word returned 1 on success,
0 on error.  After the patch, it returns the generally accepted 0 on
success and !0 on error.

That causes set_carrier() to return immediately rather than fiddle with
netif_carrier_*.  When the Jan 6 2006 patch went in changing the return
values, set_carrier() was not updated for the new return values.
Nothing else in the code cares about read_mii_word()'s return value
except set_carrier().

But when the card is brought up and no cable is plugged in,
intr_callback() gets called repeatedly, which itself repeatedly calls
netif_carrier_on() due to the NO_CARRIER check.  The comment there about
NO_CARRIER kicks in on TX failure seems accurate, because even with no
cable plugged in, and therefore no packets getting transmitted, the
NO_CARRIER check is never true on the Belkin part.  Therefore,
netif_carrier_on() is always called as a result of the failure of d[0] &
NO_CARRIER, turning carrier back on even if there is no cable plugged
in.  This bulldozes over the MII carrier_check routine too.

I don't think the intr_callback() code should ever turn the carrier
_on_, because there's that 2*HZ MII carrier check which can certainly
handle the carrier on/off stuff.

LINK_STATUS appears valid on the Belkin part too, so we can add that as
a reverse-quirk and use LINK_STATUS on parts where it works.  If you
think that the NO_CARRIER check should be in _addition_ to the
LINK_STATUS check, that's fine with me, provided that the NO_CARRIER
check only turns carrier off.

Dan
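
The return-value mix-up described above can be sketched in userspace as follows. This is only an illustration, not the driver code: fake_read_mii_word() and the two set_carrier_*() helpers are hypothetical stand-ins, and the link-status bit is assumed to be BMSR_LSTATUS (0x0004) as defined for the MII basic status register.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS 0x0004 /* link-status bit of the MII basic status register */

/* Hypothetical stand-in for read_mii_word(): 0 on success, non-zero on error. */
static int fake_read_mii_word(bool fail, uint16_t bmsr, uint16_t *out)
{
    if (fail)
        return -1;
    *out = bmsr;
    return 0;
}

/* Inverted check: treats 0 (success) as failure, so a good read never
 * reaches the carrier update. */
static void set_carrier_inverted(bool fail, uint16_t bmsr, bool *carrier)
{
    uint16_t tmp = 0;

    if (!fake_read_mii_word(fail, bmsr, &tmp))
        return;
    *carrier = (tmp & BMSR_LSTATUS) != 0;
}

/* Corrected check: bail out only when the read actually failed. */
static void set_carrier_fixed(bool fail, uint16_t bmsr, bool *carrier)
{
    uint16_t tmp = 0;

    if (fake_read_mii_word(fail, bmsr, &tmp))
        return;
    *carrier = (tmp & BMSR_LSTATUS) != 0;
}
```

With the inverted check, a successful read of a link-up status leaves the carrier flag untouched; with the corrected check the flag follows the link-status bit, and a failed read leaves the previous state alone.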






Re: [PATCH] usb-net/pegasus: fix pegasus carrier detection

2007-04-25 Thread Dan Williams
On Wed, 2007-04-25 at 17:58 +0300, Petko Manolov wrote:
> In general i agree with the reasoning below.  However, isn't it better to
> remove the code that sets carrier on/off in intr_callback()?

I'm fine with this; whatever makes carrier status work makes me happy :)

Dan



Re: [PATCH] usb-net/pegasus: fix pegasus carrier detection

2007-04-25 Thread Petko Manolov

On Wed, 25 Apr 2007, Dan Williams wrote:

> On Wed, 2007-04-25 at 17:58 +0300, Petko Manolov wrote:
> > In general i agree with the reasoning below.  However, isn't it better to
> > remove the code that sets carrier on/off in intr_callback()?
>
> I'm fine with this; whatever makes carrier status work makes me happy :)

Great.  Are you going to submit the new patch, or will this hard labor fall
on my shoulders? :)



Petko






[PATCH][XFRM] export SAD info

2007-04-25 Thread jamal
Dave,

Something I've been meaning to do since you made the hash changes. I will
be doing one for policy as well. This is against the latest Linus tree
because I am having strange challenges syncing net-2.6.22.

cheers,
jamal

[XFRM] export SAD info

On a system with a lot of SAs, counting SAD entries chews useful
CPU time since you need to dump the whole SAD to user space;
i.e. something like:
	ip xfrm state ls | grep -i src | wc -l
I have seen this take literally minutes with 40K SAs when the system
is swapping.
With this patch, some of the SAD info (that was already being tracked)
is exposed to user space, i.e. you do:
	ip xfrm state count
And you get the count; you can also pass -s on the command line and
get the hash info.
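
To illustrate why reporting an already-tracked counter is cheaper than dumping the table, here is a small userspace sketch. The demo_sad and demo_sa types and their helpers are hypothetical and only model the idea; they are not the XFRM data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry and table, standing in for SAs and the SAD. */
struct demo_sa {
    struct demo_sa *next;
};

struct demo_sad {
    struct demo_sa *head;
    uint32_t count; /* running count, maintained on insert */
};

static void demo_sad_insert(struct demo_sad *sad, struct demo_sa *sa)
{
    sa->next = sad->head;
    sad->head = sa;
    sad->count++;
}

/* O(n): roughly what dumping every entry and counting lines amounts to. */
static uint32_t demo_sad_count_by_walk(const struct demo_sad *sad)
{
    uint32_t n = 0;

    for (const struct demo_sa *sa = sad->head; sa; sa = sa->next)
        n++;
    return n;
}

/* O(1): report the counter that is tracked anyway. */
static uint32_t demo_sad_count(const struct demo_sad *sad)
{
    return sad->count;
}
```

Both functions return the same number; only the second one stays cheap as the table grows.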

Signed-off-by: Jamal Hadi Salim [EMAIL PROTECTED]

---
commit 1fb99604e38f27c1ad4cb74b11f148c34d0d3be6
tree 1bb35db627ac5d3d2f370d0fc993ba6b80392696
parent 146d97b89c83c9460012185bfd584d21a3b5fe19
author Jamal Hadi Salim [EMAIL PROTECTED] Wed, 25 Apr 2007 11:30:21 -0400
committer Jamal Hadi Salim [EMAIL PROTECTED] Wed, 25 Apr 2007 11:30:21 -0400

 include/linux/xfrm.h  |   25 ++
 include/net/xfrm.h    |    8 +++
 net/xfrm/xfrm_state.c |   12 ++-
 net/xfrm/xfrm_user.c  |   56 +
 4 files changed, 100 insertions(+), 1 deletions(-)

diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
index 15ca89e..9c656a5 100644
--- a/include/linux/xfrm.h
+++ b/include/linux/xfrm.h
@@ -181,6 +181,10 @@ enum {
XFRM_MSG_MIGRATE,
 #define XFRM_MSG_MIGRATE XFRM_MSG_MIGRATE
 
+   XFRM_MSG_NEWSADINFO,
+#define XFRM_MSG_NEWSADINFO XFRM_MSG_NEWSADINFO
+   XFRM_MSG_GETSADINFO,
+#define XFRM_MSG_GETSADINFO XFRM_MSG_GETSADINFO
__XFRM_MSG_MAX
 };
 #define XFRM_MSG_MAX (__XFRM_MSG_MAX - 1)
@@ -234,6 +238,17 @@ enum xfrm_ae_ftype_t {
 #define XFRM_AE_MAX (__XFRM_AE_MAX - 1)
 };
 
+/* SAD Table filter flags  */
+enum xfrm_sad_ftype_t {
+   XFRM_SAD_UNSPEC,
+   XFRM_SAD_HMASK=1,
+   XFRM_SAD_HMAX=2,
+   XFRM_SAD_CNT=4,
+   __XFRM_SAD_MAX
+
+#define XFRM_SAD_MAX (__XFRM_SAD_MAX - 1)
+};
+
 struct xfrm_userpolicy_type {
 	__u8		type;
 	__u16		reserved1;
@@ -265,6 +280,16 @@ enum xfrm_attr_type_t {
 #define XFRMA_MAX (__XFRMA_MAX - 1)
 };
 
+enum xfrm_sadattr_type_t {
+   XFRMA_SAD_UNSPEC,
+   XFRMA_SADHMASK,
+   XFRMA_SADHMAX,
+   XFRMA_SADCNT,
+   __XFRMA_SAD_MAX
+
+#define XFRMA_SAD_MAX (__XFRMA_SAD_MAX - 1)
+};
+
 struct xfrm_usersa_info {
 	struct xfrm_selector		sel;
 	struct xfrm_id			id;
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 5a00aa8..4922e9f 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -416,6 +416,13 @@ struct xfrm_audit
u32 secid;
 };
 
+/* SAD metadata, add more later */
+struct xfrm_sadinfo
+{
+   u32 sadhcnt; /* current hash bkts */
+   u32 sadhmcnt; /* max allowed hash bkts */
+   u32 sadcnt; /* current running count */
+};
 #ifdef CONFIG_AUDITSYSCALL
 extern void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
struct xfrm_policy *xp, struct xfrm_state *x);
@@ -938,6 +945,7 @@ static inline int xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **s
 extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq);
 extern int xfrm_state_delete(struct xfrm_state *x);
 extern void xfrm_state_flush(u8 proto, struct xfrm_audit *audit_info);
+extern void xfrm_sad_getinfo(struct xfrm_sadinfo *si);
 extern int xfrm_replay_check(struct xfrm_state *x, __be32 seq);
 extern void xfrm_replay_advance(struct xfrm_state *x, __be32 seq);
 extern void xfrm_replay_notify(struct xfrm_state *x, int event);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index c1581fb..98e5ce3 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -53,7 +53,7 @@ static struct hlist_head *xfrm_state_bysrc __read_mostly;
 static struct hlist_head *xfrm_state_byspi __read_mostly;
 static unsigned int xfrm_state_hmask __read_mostly;
 static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
-static u32 xfrm_state_num;
+static unsigned int xfrm_state_num;
 static unsigned int xfrm_state_genid;
 
 static inline unsigned int xfrm_dst_hash(xfrm_address_t *daddr,
@@ -421,6 +421,16 @@ restart:
 }
 EXPORT_SYMBOL(xfrm_state_flush);
 
+void xfrm_sad_getinfo(struct xfrm_sadinfo *si)
+{
+	spin_lock_bh(&xfrm_state_lock);
+	si->sadcnt = xfrm_state_num;
+	si->sadhcnt = xfrm_state_hmask;
+	si->sadhmcnt = xfrm_state_hashmax;
+	spin_unlock_bh(&xfrm_state_lock);
+}
+EXPORT_SYMBOL(xfrm_sad_getinfo);
+
 static int
 xfrm_init_tempsel(struct xfrm_state *x, struct flowi *fl,
  struct xfrm_tmpl *tmpl,
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 816e369..089159a 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -672,6 +672,61 @@ static struct sk_buff *xfrm_state_netlink(struct sk_buff 

[PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #4]

2007-04-25 Thread David Howells

The first of these patches together provide secure client-side RxRPC
connectivity as a Linux kernel socket family.  Only the RxRPC transport/session
side is supplied - the presentation side (marshalling the data) is left to the
client.  Copies of the patches can be found here:

http://people.redhat.com/~dhowells/rxrpc/series
http://people.redhat.com/~dhowells/rxrpc/01-move-skb-generic.diff
http://people.redhat.com/~dhowells/rxrpc/02-cancel_delayed_work.diff
http://people.redhat.com/~dhowells/rxrpc/03-keys.diff
http://people.redhat.com/~dhowells/rxrpc/04-timer-exports.diff
http://people.redhat.com/~dhowells/rxrpc/05-af_rxrpc.diff

Further patches make the in-kernel AFS filesystem use AF_RXRPC and delete the
old RxRPC implementation:

http://people.redhat.com/~dhowells/rxrpc/06-afs-cleanup.diff
http://people.redhat.com/~dhowells/rxrpc/07-af_rxrpc-kernel.diff
http://people.redhat.com/~dhowells/rxrpc/08-af_rxrpc-afs.diff
http://people.redhat.com/~dhowells/rxrpc/09-af_rxrpc-delete-old.diff

And then the rest of the patches extend AFS to provide automatic unmounting of
automount trees, security support and directory-level write support (create,
mkdir, etc.):

http://people.redhat.com/~dhowells/rxrpc/10-afs-multimount.diff
http://people.redhat.com/~dhowells/rxrpc/11-afs-security.diff
http://people.redhat.com/~dhowells/rxrpc/12-afs-doc.diff

http://people.redhat.com/~dhowells/rxrpc/13-netlink-support-MSG_TRUNC.diff
http://people.redhat.com/~dhowells/rxrpc/14-afs-get-capabilities.diff
http://people.redhat.com/~dhowells/rxrpc/15-afs-initcallbackstate3.diff
http://people.redhat.com/~dhowells/rxrpc/16-afs-dir-write-support.diff

Note that file-level write support is not yet complete and so is not included
in this patch set.


The userspace access methods make use of the control data passed to/by
sendmsg() and recvmsg().  See the three simple test programs:

http://people.redhat.com/~dhowells/rxrpc/klog.c
http://people.redhat.com/~dhowells/rxrpc/rxrpc.c
http://people.redhat.com/~dhowells/rxrpc/listen.c

The klog program is provided to go and get a Kerberos IV key from the AFS
kaserver.  Currently it must be edited before compiling to note the right
server IP address and the appropriate credentials.

These programs can be compiled by:

	make klog rxrpc listen CFLAGS="-Wall -g" LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"

Then a ticket can be obtained by:

./klog

If a security key is acquired in this way, then all subsequent AFS operations -
including VL lookups and mounts - performed with that session keyring will be
authenticated using that key.  The key can be viewed like so:

[EMAIL PROTECTED] ~]# keyctl show
Session Keyring
   -3 --alswrv  0 0  keyring: _ses.3268
2 --alswrv  0 0   \_ keyring: _uid.0
111416553 --als--v  0 0   \_ rxrpc: [EMAIL PROTECTED]

TODO:

 (*) Make certain parameters (such as connection timeouts) userspace
 configurable.

 (*) Make userspace utilities use it; librxrpc.

 (*) Userspace documentation.

 (*) KerberosV security.

Changes:

 (*) SOCK_RPC has been removed.  SOCK_DGRAM is now used instead.

 (*) I've added a facility whereby calls can be made to destinations other than
 the connect() address of a client socket by making use of msg_name in the
 msghdr struct when using sendmsg() to send the first data packet of a
 call.  Indeed, a client socket need not be connected before being used
 so.

 (*) I've also added a facility whereby client calls may also be made on
 server sockets, again by using msg_name in the msghdr struct.  In such a
 case, the server's local transport endpoint is used.

 (*) I've made the write buffer space check available to various callers
 (sk_write_space) and implemented poll support.

 (*) Rewrote rxrpc_recvmsg().  It now concatenates adjacent data messages from
 the same call when delivering them.

 (*) Updated the documentation to include notes on recvmsg, cover control
 messages and cover SOL_RXRPC-level socket options.

 (*) Provided an in-kernel interface to give in-kernel utilities easier access
 to the facility.

 (*) Made fs/afs/ use it.

 (*) Deleted the old contents of net/rxrpc/.

 (*) Use the scatterlist interface to the crypto API for now.  The patch that
 added the direct access interface conflicts with patches Herbert Xu is
 producing, so I've dropped it for the moment.

 (*) Moved a bug fix to make secure connection reuse work from the
 af_rxrpc-kernel patch to the af_rxrpc main patch.

 (*) Make RxRPC use its own private work queues rather than keventd's to avoid
 deadlocks when AFS tries to use keventd too.  This also puts encryption
 in the private work queue rather than keventd's queue as that might take
 a relatively long time to 

[PATCH 02/16] cancel_delayed_work: use del_timer() instead of del_timer_sync() [try #4]

2007-04-25 Thread David Howells
del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
efficient since it locks the timer unconditionally, and may wait for the
completion of the delayed_work_timer_fn().

cancel_delayed_work() == 0 means:

	before this patch:
		work->func may still be running or queued

	after this patch:
		work->func may still be running or queued, or
		delayed_work_timer_fn->__queue_work() in progress.

		The latter doesn't differ from the caller's POV,
		delayed_work_timer_fn() is called with _PENDING
		bit set.

cancel_delayed_work() == 1 with this patch adds a new possibility:

	delayed_work->work was cancelled, but delayed_work_timer_fn
	is still running (this is only possible for the re-arming
	works on single-threaded workqueue).

	In this case the timer was re-started by work->func(), nobody
	else can do this. This in turn means that delayed_work_timer_fn
	has already passed __queue_work() (and won't touch delayed_work)
	because nobody else can queue delayed_work->work.

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]
Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/workqueue.h |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2a7b38d..b8abfc7 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -191,14 +191,15 @@ int execute_in_process_context(work_func_t fn, struct execute_work *);
 
 /*
  * Kill off a pending schedule_delayed_work().  Note that the work callback
- * function may still be running on return from cancel_delayed_work().  Run
- * flush_scheduled_work() to wait on it.
+ * function may still be running on return from cancel_delayed_work(), unless
+ * it returns 1 and the work doesn't re-arm itself. Run flush_workqueue() or
+ * cancel_work_sync() to wait on it.
  */
 static inline int cancel_delayed_work(struct delayed_work *work)
 {
int ret;
 
-	ret = del_timer_sync(&work->timer);
+	ret = del_timer(&work->timer);
 	if (ret)
 		work_release(&work->work);
return ret;



[PATCH 01/16] AF_RXRPC: Move generic skbuff stuff from XFRM code to generic code [try #4]

2007-04-25 Thread David Howells
Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can
use it too.

The kdoc comments I've attached to the functions need to be checked by whoever
wrote them as I had to make some guesses about the workings of these functions.
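
As a rough illustration of what a function like skb_to_sgvec() does, the sketch below maps a byte range that spans several separate buffers into a list of (pointer, length) entries. It is a simplified userspace model under stated assumptions: struct chunk, struct sg_ent and chunks_to_sg() are hypothetical names, and pages, fragment lists and recursion are left out.

```c
/* One source buffer (stands in for the linear head or a fragment). */
struct chunk {
    const char *base;
    int len;
};

/* One scatter-gather entry: where a run of bytes lives and how long it is. */
struct sg_ent {
    const char *ptr;
    int length;
};

/*
 * Map the byte range [offset, offset + len) of the concatenated chunks
 * into sg[].  Returns the number of entries written.  The caller must
 * supply an sg[] array with room for at least nchunks entries.
 */
static int chunks_to_sg(const struct chunk *c, int nchunks,
                        struct sg_ent *sg, int offset, int len)
{
    int start = 0; /* logical offset at which chunk i begins */
    int elt = 0;

    for (int i = 0; i < nchunks && len > 0; i++) {
        int end = start + c[i].len;
        int copy = end - offset; /* bytes of this chunk at or after offset */

        if (copy > 0) {
            if (copy > len)
                copy = len;
            sg[elt].ptr = c[i].base + (offset - start);
            sg[elt].length = copy;
            elt++;
            len -= copy;
            offset += copy;
        }
        start = end;
    }
    return elt;
}
```

For two chunks "hello" and "world!", asking for 5 bytes at offset 3 yields two entries: 2 bytes from the end of the first chunk and 3 bytes from the start of the second.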

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 include/linux/skbuff.h |6 ++
 include/net/esp.h  |2 -
 net/core/skbuff.c  |  188 
 net/xfrm/xfrm_algo.c   |  169 ---
 4 files changed, 194 insertions(+), 171 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5992f65..c905d42 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -83,6 +83,7 @@
  */
 
 struct net_device;
+struct scatterlist;
 
 #ifdef CONFIG_NETFILTER
 struct nf_conntrack {
@@ -361,6 +362,11 @@ extern struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 extern struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 				       int newheadroom, int newtailroom,
 				       gfp_t priority);
+extern int	       skb_to_sgvec(struct sk_buff *skb,
+				    struct scatterlist *sg, int offset,
+				    int len);
+extern int	       skb_cow_data(struct sk_buff *skb, int tailbits,
+				    struct sk_buff **trailer);
 extern int	       skb_pad(struct sk_buff *skb, int pad);
 #define dev_kfree_skb(a)   kfree_skb(a)
 extern void  skb_over_panic(struct sk_buff *skb, int len,
diff --git a/include/net/esp.h b/include/net/esp.h
index 713d039..d05d8d2 100644
--- a/include/net/esp.h
+++ b/include/net/esp.h
@@ -40,8 +40,6 @@ struct esp_data
} auth;
 };
 
-extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len);
-extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
 extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len);
 
 static inline int esp_mac_digest(struct esp_data *esp, struct sk_buff *skb,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 336958f..aa02bd4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -55,6 +55,7 @@
 #include <linux/cache.h>
 #include <linux/rtnetlink.h>
 #include <linux/init.h>
+#include <linux/scatterlist.h>
 
 #include <net/protocol.h>
 #include <net/dst.h>
@@ -2005,6 +2006,190 @@ void __init skb_init(void)
NULL, NULL);
 }
 
+/**
+ * skb_to_sgvec - Fill a scatter-gather list from a socket buffer
+ * @skb: Socket buffer containing the buffers to be mapped
+ * @sg: The scatter-gather list to map into
+ * @offset: The offset into the buffer's contents to start mapping
+ * @len: Length of buffer space to be mapped
+ *
+ * Fill the specified scatter-gather list with mappings/pointers into a
+ * region of the buffer space attached to a socket buffer.
+ */
+int
+skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	int elt = 0;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		sg[elt].page = virt_to_page(skb->data + offset);
+		sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE;
+		sg[elt].length = copy;
+		elt++;
+		if ((len -= copy) == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		BUG_TRAP(start <= offset + len);
+
+		end = start + skb_shinfo(skb)->frags[i].size;
+		if ((copy = end - offset) > 0) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+			if (copy > len)
+				copy = len;
+			sg[elt].page = frag->page;
+			sg[elt].offset = frag->page_offset+offset-start;
+			sg[elt].length = copy;
+			elt++;
+			if (!(len -= copy))
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	if (skb_shinfo(skb)->frag_list) {
+		struct sk_buff *list = skb_shinfo(skb)->frag_list;
+
+		for (; list; list = list->next) {
+			int end;
+
+			BUG_TRAP(start <= offset + len);
+
+			end = start + list->len;
+			if ((copy = end - offset) > 0) {
+				if (copy > len)
+					copy = len;
+				elt += skb_to_sgvec(list, sg+elt, offset - start, copy);
+				if ((len -= copy) == 0)
[PATCH 03/16] AF_RXRPC: Key facility changes for AF_RXRPC [try #4]

2007-04-25 Thread David Howells
Export the keyring key type definition and document its availability.

Add alternative types into the key's type_data union to make it more useful.
Not all users necessarily want to use it as a list_head (AF_RXRPC doesn't, for
example), so make it clear that it can be used in other ways.
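
A minimal sketch of how such a union offers alternative views of the same storage follows. The demo_key wrapper, the simplified list_head definition and the helper are hypothetical; only the three union members mirror the patch.

```c
#include <stddef.h>

/* Hypothetical minimal list head, standing in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Same shape as the extended type_data union: three views of one storage area. */
union type_data {
    struct list_head link; /* for key types that chain keys together */
    unsigned long    x[2]; /* for key types that want two integers */
    void            *p[2]; /* for key types that want two private pointers */
};

struct demo_key {
    union type_data type_data;
};

/* A key type that does not need the list can stash two pointers instead. */
static void demo_key_set_private(struct demo_key *k, void *a, void *b)
{
    k->type_data.p[0] = a;
    k->type_data.p[1] = b;
}
```

Each key type picks exactly one view and sticks to it; the views overlap in memory, so mixing them on the same key would overwrite data.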

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/keys.txt  |   12 
 include/linux/key.h |2 ++
 security/keys/keyring.c |2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d..81d9aa0 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents for more information.
void unregister_key_type(struct key_type *type);
 
 
+Under some circumstances, it may be desirable to deal with a
+bundle of keys.  The facility provides access to the keyring type for managing
+such a bundle:
+
+   struct key_type key_type_keyring;
+
+This can be used with a function such as request_key() to find a specific
+keyring in a process's keyrings.  A keyring thus found can then be searched
+with keyring_search().  Note that it is not possible to use request_key() to
+search a specific keyring, so using keyrings in this way is of limited utility.
+
+
 ===
 NOTES ON ACCESSING PAYLOAD CONTENTS
 ===
diff --git a/include/linux/key.h b/include/linux/key.h
index 169f05e..a9220e7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -160,6 +160,8 @@ struct key {
 */
 	union {
 		struct list_head	link;
+		unsigned long		x[2];
+		void			*p[2];
 	} type_data;
 
/* key data
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index ad45ce7..88292e3 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -66,6 +66,8 @@ struct key_type key_type_keyring = {
.read   = keyring_read,
 };
 
+EXPORT_SYMBOL(key_type_keyring);
+
 /*
  * semaphore to serialise link/link calls to prevent two link calls in parallel
  * introducing a cycle



[PATCH 10/16] AFS: Handle multiple mounts of an AFS superblock correctly [try #4]

2007-04-25 Thread David Howells
Handle multiple mounts of an AFS superblock correctly, checking to see whether
the superblock is already initialised after calling sget() rather than just
unconditionally stamping all over it.

Also delete the silent parameter to afs_fill_super() as it's not used and
can, in any case, be obtained from sb->s_flags.
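
The "initialise only if not already initialised" pattern described above can be sketched in userspace like this. All names here (demo_sb, demo_sget, demo_fill_super, demo_get_sb) are hypothetical stand-ins and the lookup is a toy table, not the VFS sget() machinery.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_sb {
    int volume_id;
    void *root;     /* NULL until the superblock has been filled in */
    int fill_count; /* how many times fill was run, for illustration */
    bool in_use;
};

#define MAX_SB 4
static struct demo_sb sb_table[MAX_SB];
static int demo_root_marker;

/* Stand-in for sget(): return the existing entry for a volume, else a fresh one. */
static struct demo_sb *demo_sget(int volume_id)
{
    for (int i = 0; i < MAX_SB; i++)
        if (sb_table[i].in_use && sb_table[i].volume_id == volume_id)
            return &sb_table[i];
    for (int i = 0; i < MAX_SB; i++)
        if (!sb_table[i].in_use) {
            sb_table[i].in_use = true;
            sb_table[i].volume_id = volume_id;
            sb_table[i].root = NULL;
            sb_table[i].fill_count = 0;
            return &sb_table[i];
        }
    return NULL;
}

static int demo_fill_super(struct demo_sb *sb)
{
    sb->root = &demo_root_marker;
    sb->fill_count++;
    return 0;
}

static struct demo_sb *demo_get_sb(int volume_id)
{
    struct demo_sb *sb = demo_sget(volume_id);

    if (!sb)
        return NULL;
    if (!sb->root) {
        /* initial creation: fill in the superblock exactly once */
        if (demo_fill_super(sb) < 0) {
            sb->in_use = false;
            return NULL;
        }
    }
    /* otherwise: reuse the already initialised superblock untouched */
    return sb;
}
```

Mounting the same volume twice returns the same object and runs the fill step only once, which is the behaviour the patch is after.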

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/super.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/afs/super.c b/fs/afs/super.c
index efc4fe6..77e6875 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -212,7 +212,7 @@ static int afs_test_super(struct super_block *sb, void *data)
 /*
  * fill in the superblock
  */
-static int afs_fill_super(struct super_block *sb, void *data, int silent)
+static int afs_fill_super(struct super_block *sb, void *data)
 {
struct afs_mount_params *params = data;
struct afs_super_info *as = NULL;
@@ -319,17 +319,23 @@ static int afs_get_sb(struct file_system_type *fs_type,
goto error;
}
 
-	sb->s_flags = flags;
-
-	ret = afs_fill_super(sb, &params, flags & MS_SILENT ? 1 : 0);
-	if (ret < 0) {
-		up_write(&sb->s_umount);
-		deactivate_super(sb);
-		goto error;
+	if (!sb->s_root) {
+		/* initial superblock/root creation */
+		_debug("create");
+		sb->s_flags = flags;
+		ret = afs_fill_super(sb, &params);
+		if (ret < 0) {
+			up_write(&sb->s_umount);
+			deactivate_super(sb);
+			goto error;
+		}
+		sb->s_flags |= MS_ACTIVE;
+	} else {
+		_debug("reuse");
+		ASSERTCMP(sb->s_flags, &, MS_ACTIVE);
 	}
-	sb->s_flags |= MS_ACTIVE;
-	simple_set_mnt(mnt, sb);
 
+	simple_set_mnt(mnt, sb);
 	afs_put_volume(params.volume);
 	afs_put_cell(params.default_cell);
 	_leave(" = 0 [%p]", sb);



[PATCH 15/16] AFS: Implement the CB.InitCallBackState3 operation [try #4]

2007-04-25 Thread David Howells
Implement the CB.InitCallBackState3 operation for the fileserver to call.
This reduces the amount of network traffic because if this op is aborted, the
fileserver will then attempt a CB.InitCallBackState operation.
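
The dispatch idea can be sketched as below: an incoming operation ID is mapped to a call type, and an unknown ID is rejected so that the caller aborts the call, which is what makes the peer fall back to the older operation. The demo_* names are hypothetical; the value 213 comes from the patch, while 205 for the older operation is an assumption about the existing enum.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_cm_op {
    DEMO_CBInitCallBackState  = 205, /* assumed value of the older operation */
    DEMO_CBInitCallBackState3 = 213, /* value added by the patch */
};

struct demo_call_type {
    const char *name;
};

static const struct demo_call_type demo_init_cb_state  = { "CB.InitCallBackState" };
static const struct demo_call_type demo_init_cb_state3 = { "CB.InitCallBackState3" };

struct demo_call {
    unsigned int operation_id;
    const struct demo_call_type *type;
};

/* Returns true if the operation is handled; false means "abort the call". */
static bool demo_incoming_call(struct demo_call *call)
{
    switch (call->operation_id) {
    case DEMO_CBInitCallBackState:
        call->type = &demo_init_cb_state;
        return true;
    case DEMO_CBInitCallBackState3:
        call->type = &demo_init_cb_state3;
        return true;
    default:
        return false;
    }
}
```

Handling operation 213 directly saves the abort and the retry with the older operation that would otherwise follow.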

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/afs_cm.h|1 +
 fs/afs/cmservice.c |   46 ++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/fs/afs/afs_cm.h b/fs/afs/afs_cm.h
index d4bd201..7b4d4fa 100644
--- a/fs/afs/afs_cm.h
+++ b/fs/afs/afs_cm.h
@@ -23,6 +23,7 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBInitCallBackState3	= 213,	/* initialise callback state, version 3 */
 	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index f8ad36b..32deb04 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -20,6 +20,8 @@ struct workqueue_struct *afs_cm_workqueue;
 
 static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *,
+						struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
@@ -47,6 +49,16 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = {
 };
 
 /*
+ * CB.InitCallBackState3 operation type
+ */
+static const struct afs_call_type afs_SRXCBInitCallBackState3 = {
+	.name		= "CB.InitCallBackState3",
+	.deliver	= afs_deliver_cb_init_call_back_state3,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * CB.Probe operation type
  */
 static const struct afs_call_type afs_SRXCBProbe = {
@@ -83,6 +95,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBInitCallBackState:
 		call->type = &afs_SRXCBInitCallBackState;
 		return true;
+	case CBInitCallBackState3:
+		call->type = &afs_SRXCBInitCallBackState3;
+		return true;
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
@@ -312,6 +327,37 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *call,
 }
 
 /*
+ * deliver request data to a CB.InitCallBackState3 call
+ */
+static int afs_deliver_cb_init_call_back_state3(struct afs_call *call,
+						struct sk_buff *skb,
+						bool last)
+{
+	struct afs_server *server;
+	struct in_addr addr;
+
+	_enter(",{%u},%d", skb->len, last);
+
+	if (!last)
+		return 0;
+
+	/* no unmarshalling required */
+	call->state = AFS_CALL_REPLYING;
+
+	/* we'll need the file server record as that tells us which set of
+	 * vnodes to operate upon */
+	memcpy(&addr, &skb->nh.iph->saddr, 4);
+	server = afs_find_server(&addr);
+	if (!server)
+		return -ENOTCONN;
+	call->server = server;
+
+	INIT_WORK(&call->work, SRXAFSCB_InitCallBackState);
+	schedule_work(&call->work);
+	return 0;
+}
+
+/*
  * allow the fileserver to see if the cache manager is still alive
  */
 static void SRXAFSCB_Probe(struct work_struct *work)
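
The routing change in this patch can be pictured with a small userspace model. This is only a sketch under stated assumptions: `fake_call` and `fake_cm_incoming_call` are hypothetical stand-ins for `struct afs_call` and `afs_cm_incoming_call()`, and only the two operation numbers that appear in afs_cm.h in this series (213 and 65538) are used. A `false` return models an unsupported operation, which is presumably the case in which the call is aborted and the fileserver falls back to CB.InitCallBackState.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct afs_call. */
struct fake_call {
	unsigned int operation_ID;
	const char *type_name;	/* stands in for call->type */
};

/* Route an incoming cache manager call: true if supported, false if not. */
static bool fake_cm_incoming_call(struct fake_call *call)
{
	switch (call->operation_ID) {
	case 213:	/* CBInitCallBackState3, value taken from afs_cm.h */
		call->type_name = "CB.InitCallBackState3";
		return true;
	case 65538:	/* CBGetCapabilities, value taken from afs_cm.h */
		call->type_name = "CB.GetCapabilities";
		return true;
	default:
		/* unknown op: the caller is expected to reject the call */
		return false;
	}
}
```

With the new case in place, operation 213 is accepted directly, so the extra round trip for the fallback operation should no longer be needed.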



Re: [PATCH][XFRM] export SAD info

2007-04-25 Thread jamal

That patch has xfrm_state_num being mucked with; just ignore that bit.
I need to send a patch against net-2.6.22 and i will clean that up -
just need some feedback.

Would it make sense to have those vars as u32 instead of unsigned int?

cheers,
jamal



[PATCH 13/16] commit ad495d7b6cfcd1bc2eaf06c42699be0bb5d84234 [try #4]

2007-04-25 Thread David Howells
[NETLINK]: Mirror UDP MSG_TRUNC semantics.

If the user passes MSG_TRUNC in via msg_flags, return
the full packet size not the truncated size.

Idea from Herbert Xu and Thomas Graf.

Signed-off-by: David S. Miller [EMAIL PROTECTED]
---

 net/netlink/af_netlink.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c48b0f4..5890210 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1242,6 +1242,9 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 
 	scm_recv(sock, msg, siocb->scm, flags);
 
+	if (flags & MSG_TRUNC)
+		copied = skb->len;
+
 out:
 	netlink_rcv_wake(sk);
 	return err ? : copied;
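
The length rule this hunk introduces can be modelled in userspace as follows. This is a minimal sketch, not kernel code: `reported_len` is a hypothetical helper that only mirrors how `copied` is chosen, with `skb_len` standing in for `skb->len` and `buf_len` for the size of the caller's buffer.

```c
#include <stddef.h>
#include <sys/socket.h>	/* MSG_TRUNC */

/*
 * Model of the length a datagram receive reports to the caller.
 * skb_len: size of the queued message; buf_len: size of the user buffer.
 */
static size_t reported_len(size_t skb_len, size_t buf_len, int flags)
{
	/* Without MSG_TRUNC, report what actually fit into the buffer. */
	size_t copied = skb_len < buf_len ? skb_len : buf_len;

	/* With MSG_TRUNC, report the full message size instead. */
	if (flags & MSG_TRUNC)
		copied = skb_len;

	return copied;
}
```

A caller can use this to detect truncation: if the reported length exceeds the buffer size, the message did not fit.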



[PATCH 14/16] AFS: Add support for the CB.GetCapabilities operation [try #4]

2007-04-25 Thread David Howells
Add support for the CB.GetCapabilities operation with which the fileserver can
ask the client for the following information:

 (1) The list of network interfaces it has available as IPv4 address + netmask
 plus the MTUs.

 (2) The client's UUID.

 (3) The extended capabilities of the client, for which the only current one
 is unified error mapping (abort code interpretation).

To support this, the patch adds the following routines to AFS:

 (1) A function to iterate through all the network interfaces using RTNETLINK
 to extract IPv4 addresses and MTUs.

 (2) A function to iterate through all the network interfaces using RTNETLINK
 to pull out the MAC address of the lowest index interface to use in UUID
 construction.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 fs/afs/Makefile|1 
 fs/afs/afs_cm.h|3 
 fs/afs/cmservice.c |   98 ++
 fs/afs/internal.h  |   42 
 fs/afs/main.c  |   49 +
 fs/afs/rxrpc.c |   39 
 fs/afs/use-rtnetlink.c |  473 
 7 files changed, 705 insertions(+), 0 deletions(-)

diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index cca198b..01545eb 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -18,6 +18,7 @@ kafs-objs := \
 	security.o \
 	server.o \
 	super.o \
+	use-rtnetlink.o \
 	vlclient.o \
 	vlocation.o \
 	vnode.o \
diff --git a/fs/afs/afs_cm.h b/fs/afs/afs_cm.h
index 7c8e3d4..d4bd201 100644
--- a/fs/afs/afs_cm.h
+++ b/fs/afs/afs_cm.h
@@ -23,6 +23,9 @@ enum AFS_CM_Operations {
 	CBGetCE			= 208,	/* get cache file description */
 	CBGetXStatsVersion	= 209,	/* get version of extended statistics */
 	CBGetXStats		= 210,	/* get contents of extended statistics data */
+	CBGetCapabilities	= 65538, /* get client capabilities */
 };
 
+#define AFS_CAP_ERROR_TRANSLATION	0x1
+
 #endif /* AFS_FS_H */
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c
index 7e184bb..f8ad36b 100644
--- a/fs/afs/cmservice.c
+++ b/fs/afs/cmservice.c
@@ -22,6 +22,8 @@ static int afs_deliver_cb_init_call_back_state(struct afs_call *,
 					       struct sk_buff *, bool);
 static int afs_deliver_cb_probe(struct afs_call *, struct sk_buff *, bool);
 static int afs_deliver_cb_callback(struct afs_call *, struct sk_buff *, bool);
+static int afs_deliver_cb_get_capabilities(struct afs_call *, struct sk_buff *,
+					   bool);
 static void afs_cm_destructor(struct afs_call *);
 
 /*
@@ -55,6 +57,16 @@ static const struct afs_call_type afs_SRXCBProbe = {
 };
 
 /*
+ * CB.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_SRXCBGetCapabilites = {
+	.name		= "CB.GetCapabilities",
+	.deliver	= afs_deliver_cb_get_capabilities,
+	.abort_to_error	= afs_abort_to_error,
+	.destructor	= afs_cm_destructor,
+};
+
+/*
  * route an incoming cache manager call
  * - return T if supported, F if not
  */
@@ -74,6 +86,9 @@ bool afs_cm_incoming_call(struct afs_call *call)
 	case CBProbe:
 		call->type = &afs_SRXCBProbe;
 		return true;
+	case CBGetCapabilities:
+		call->type = &afs_SRXCBGetCapabilites;
+		return true;
 	default:
 		return false;
 	}
@@ -328,3 +343,86 @@ static int afs_deliver_cb_probe(struct afs_call *call, struct sk_buff *skb,
 	schedule_work(&call->work);
 	return 0;
 }
+
+/*
+ * allow the fileserver to ask about the cache manager's capabilities
+ */
+static void SRXAFSCB_GetCapabilities(struct work_struct *work)
+{
+	struct afs_interface *ifs;
+	struct afs_call *call = container_of(work, struct afs_call, work);
+	int loop, nifs;
+
+	struct {
+		struct /* InterfaceAddr */ {
+			__be32 nifs;
+			__be32 uuid[11];
+			__be32 ifaddr[32];
+			__be32 netmask[32];
+			__be32 mtu[32];
+		} ia;
+		struct /* Capabilities */ {
+			__be32 capcount;
+			__be32 caps[1];
+		} cap;
+	} reply;
+
+	_enter("");
+
+	nifs = 0;
+	ifs = kcalloc(32, sizeof(*ifs), GFP_KERNEL);
+	if (ifs) {
+		nifs = afs_get_ipv4_interfaces(ifs, 32, false);
+		if (nifs < 0) {
+			kfree(ifs);
+			ifs = NULL;
+			nifs = 0;
+		}
+	}
+
+	memset(&reply, 0, sizeof(reply));
+	reply.ia.nifs = htonl(nifs);
+
+	reply.ia.uuid[0] = htonl(afs_uuid.time_low);
+	reply.ia.uuid[1] = htonl(afs_uuid.time_mid);
+	reply.ia.uuid[2] = htonl(afs_uuid.time_hi_and_version);
+   reply.ia.uuid[3] = htonl((s8) 

[PATCH 12/16] AFS: Update the AFS fs documentation [try #4]

2007-04-25 Thread David Howells
Update the AFS fs documentation.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 Documentation/filesystems/afs.txt |  214 +++--
 1 files changed, 154 insertions(+), 60 deletions(-)

diff --git a/Documentation/filesystems/afs.txt 
b/Documentation/filesystems/afs.txt
index 2f4237d..12ad6c7 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@
+			     ====================
 			     kAFS: AFS FILESYSTEM
 			     ====================
 
-ABOUT
-=====
+Contents:
+
+ - Overview.
+ - Usage.
+ - Mountpoints.
+ - Proc filesystem.
+ - The cell database.
+ - Security.
+ - Examples.
+
+
+========
+OVERVIEW
+========
 
-This filesystem provides a fairly simple AFS filesystem driver. It is under
-development and only provides very basic facilities. It does not yet support
-the following AFS features:
+This filesystem provides a fairly simple secure AFS filesystem driver. It is
+under development and does not yet provide the full feature set.  The features
+it does support include:
 
-   (*) Write support.
-   (*) Communications security.
-   (*) Local caching.
-   (*) pioctl() system call.
-   (*) Automatic mounting of embedded mountpoints.
+ (*) Security (currently only AFS kaserver and KerberosIV tickets).
 
+ (*) File reading.
 
+ (*) Automounting.
+
+It does not yet support the following AFS features:
+
+ (*) Write support.
+
+ (*) Local caching.
+
+ (*) pioctl() system call.
+
+
+===========
+COMPILATION
+===========
+
+The filesystem should be enabled by turning on the kernel configuration
+options:
+
+   CONFIG_AF_RXRPC - The RxRPC protocol transport
+   CONFIG_RXKAD- The RxRPC Kerberos security handler
+   CONFIG_AFS  - The AFS filesystem
+
+Additionally, the following can be turned on to aid debugging:
+
+   CONFIG_AF_RXRPC_DEBUG   - Permit AF_RXRPC debugging to be enabled
+   CONFIG_AFS_DEBUG- Permit AFS debugging to be enabled
+
+They permit the debugging messages to be turned on dynamically by manipulating
+the masks in the following files:
+
+   /sys/module/af_rxrpc/parameters/debug
+   /sys/module/afs/parameters/debug
+
+
+=====
 USAGE
 =====
 
 When inserting the driver modules the root cell must be specified along with a
 list of volume location server IP addresses:
 
-   insmod rxrpc.o
+   insmod af_rxrpc.o
+   insmod rxkad.o
insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
 
-The first module is a driver for the RxRPC remote operation protocol, and the
-second is the actual filesystem driver for the AFS filesystem.
+The first module is the AF_RXRPC network protocol driver.  This provides the
+RxRPC remote operation protocol and may also be accessed from userspace.  See:
+
+   Documentation/networking/rxrpc.txt
+
+The second module is the kerberos RxRPC security driver, and the third module
+is the actual filesystem driver for the AFS filesystem.
 
 Once the module has been loaded, more modules can be added by the following
 procedure:
@@ -33,7 +84,7 @@ procedure:
 	echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells
 
 Where the parameters to the add command are the name of a cell and a list of
-volume location servers within that cell.
+volume location servers within that cell, with the latter separated by colons.
 
 Filesystems can be mounted anywhere by commands similar to the following:
 
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following:
 	mount -t afs "#root.afs." /afs
 	mount -t afs "#root.cell." /afs/cambridge
 
-  NB: When using this on Linux 2.4, the mount command has to be different,
-  since the filesystem doesn't have access to the device name argument:
-
-	mount -t afs none /afs -ovol="#root.afs."
-
-
 Where the initial character is either a hash or a percent symbol depending on
 whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O
 volume, but are willing to use a R/W volume instead (percent).
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod.
 Additional cells can be added through /proc (see later section).
 
 
+===========
 MOUNTPOINTS
 ===========
 
 
-AFS has a concept of mountpoints. These are specially formatted symbolic links
-(of the same form as the device name passed to mount). kAFS presents these
-to the user as directories that have special properties:
+AFS has a concept of mountpoints. In AFS terms, these are specially formatted
+symbolic links (of the same form as the device name passed to mount).  kAFS
+presents these to the user as directories that have a follow-link capability
+(ie: symbolic link semantics).  If anyone attempts to access them, they will
+automatically cause the target volume to be mounted (if possible) on that site.
 
-  (*) They cannot 

Re: [PATCH] usb-net/pegasus: fix pegasus carrier detection

2007-04-25 Thread Dan Williams
On Wed, 2007-04-25 at 18:09 +0300, Petko Manolov wrote:
 On Wed, 25 Apr 2007, Dan Williams wrote:
 
  On Wed, 2007-04-25 at 17:58 +0300, Petko Manolov wrote:
  In general i agree with the reasoning below.  However, isn't it better to
  remove the code that sets carrier on/off in intr_callback()?
 
  I'm fine with this; whatever makes carrier status work makes me happy :)
 
 Great.  Are you going to submit the new patch or this hard labor will lay 
 on my shoulders? :)

Well, it looked like you already had one; but if you'd like I'll whip up
a new one.

Dan

 
   Petko
 
 
 
  There's a reliable way of getting the link status by reading the MII.
  After correct checking of the return value from read_mii_word(),
  set_carrier() is what is good enough.  If 2 seconds is too long of an
  interval we could reduce it to 1 second or, if needed, less.
 
  I'd like to avoid adding additional flags per device as it will take
  forever to collect information about their correct behavior and update
  pegasus.h.  In short i think this part of your patch should be enough:
 
  ---
 
  @@ -847,10 +848,16 @@ static void intr_callback(struct urb *urb)
  * d[0].NO_CARRIER kicks in only with failed TX.
  * ... so monitoring with MII may be safest.
  */
  -  if (d[0] & NO_CARRIER)
  -  netif_carrier_off(net);
  -  else
  -  netif_carrier_on(net);
  -
 /* bytes 3-4 == rx_lostpkt, reg 2E/2F */
 pegasus->stats.rx_missed_errors += ((d[3] & 0x7f) << 8) | d[4];
  @@ -950,7 +957,7 @@ static void set_carrier(struct net_device *net)
 pegasus_t *pegasus = netdev_priv(net);
 u16 tmp;
 
  -  if (!read_mii_word(pegasus, pegasus->phy, MII_BMSR, &tmp))
  +  if (read_mii_word(pegasus, pegasus->phy, MII_BMSR, &tmp))
 return;
 
  ---
 
 
  cheers,
  Petko
 
 
  On Tue, 24 Apr 2007, Dan Williams wrote:
 
  On Tue, 2007-04-24 at 20:48 +0300, [EMAIL PROTECTED] wrote:
  On Tue, Apr 24, 2007 at 12:49:12PM -0400, Jeff Garzik wrote:
   Long term, Greg seemed OK with moving the net drivers from
  drivers/usb/net
   to drivers/net/usb, in line with the current policy of placing net
  drivers
   in drivers/net/*, bus agnostic.  After that move, sending to netdev 
  and
  me
   (as you did here) would be the preferred avenue.
 
  Speaking of which, do you want me to do this in the 2.6.22-rc1
  timeframe?  Usually big code moves like this are good to do right after
  rc1 comes out as the major churn is usually completed then.
 
  Sorry to interfere, but could you guys wait until tomorrow before 
  applying
  the patch to your respective GIT trees?  I'd like to check if the code is
  doing the right thing and avoid patch reversal.
 
  Original problem was that the patch I referenced in the commit message
  from Jan 6 2006 switched the return value semantics from
  read_mii_word().  Before the patch, read_mii_word returned 1 on success,
  0 on error.  After the patch, it returns the generally accepted 0 on
  success and !0 on error.
 
  That causes set_carrier() to return immediately rather than fiddle with
  netif_carrier_*.  When the Jan 6 2006 patch went in changing the return
  values, set_carrier() was not updated for the new return values.
  Nothing else in the code cares about read_mii_word()'s return value
  except set_carrier().
 
  But when the card is brought up and no cable is plugged in,
  intr_callback() gets called repeatedly, which itself repeatedly calls
  netif_carrier_on() due to the NO_CARRIER check.  The comment there about
  NO_CARRIER kicks in on TX failure seems accurate, because even with no
  cable plugged in, and therefore no packets getting transmitted, the
  NO_CARRIER check is never true on the Belkin part.  Therefore,
  netif_carrier_on() is always called as a result of the failure of d[0] &
  NO_CARRIER, turning carrier back on even if there is no cable plugged
  in.  This bulldozes over the MII carrier_check routine too.
 
  I don't think the intr_callback() code should ever turn the carrier
  _on_, because there's that 2*HZ MII carrier check which can certainly
  handle the carrier on/off stuff.
 
  LINK_STATUS appears valid on the Belkin part too, so we can add that as
  a reverse-quirk and use LINK_STATUS on parts where it works.  If you
  think that the NO_CARRIER check should be in _addition_ to the
  LINK_STATUS check, that's fine with me, provided that the NO_CARRIER
  check only turns carrier off.
 
  Dan
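
The return-value mix-up described above can be sketched in a few lines of userspace C. Everything here is a hypothetical stand-in: `fake_nic`, `fake_read_mii_word` and `fake_set_carrier` only model the convention (0 on success, non-zero on error) and are not the driver's real code. `BMSR_LSTATUS` is defined locally with the standard MII link-status bit value.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS 0x0004	/* link-status bit of the MII status register */

/* Hypothetical model of the adapter state. */
struct fake_nic {
	bool     mii_ok;	/* does the MII read succeed? */
	uint16_t bmsr;		/* value the PHY would report */
	bool     carrier;	/* stands in for netif_carrier_on/off() */
};

/* Stand-in for read_mii_word(): 0 on success, non-zero on error. */
static int fake_read_mii_word(const struct fake_nic *nic, uint16_t *val)
{
	if (!nic->mii_ok)
		return -1;
	*val = nic->bmsr;
	return 0;
}

static void fake_set_carrier(struct fake_nic *nic)
{
	uint16_t tmp;

	/* A non-zero return is the failure case: leave carrier untouched. */
	if (fake_read_mii_word(nic, &tmp))
		return;

	nic->carrier = (tmp & BMSR_LSTATUS) != 0;
}
```

With the test inverted (`if (!fake_read_mii_word(...)) return;`), every successful read would return early and the carrier state would never be updated, which matches the behaviour described for set_carrier() after the January 2006 change.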
 
 
 
 
 



[PATCH 04/16] AF_RXRPC: Make it possible to merely try to cancel timers from a module [try #4]

2007-04-25 Thread David Howells
Export try_to_del_timer_sync() for use by the AF_RXRPC module.

Signed-Off-By: David Howells [EMAIL PROTECTED]
---

 kernel/timer.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index dd6c2c1..b22bd39 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -505,6 +505,8 @@ out:
return ret;
 }
 
+EXPORT_SYMBOL(try_to_del_timer_sync);
+
 /**
  * del_timer_sync - deactivate a timer and wait for the handler to finish.
  * @timer: the timer to be deactivated



Re: [PATCH] usb-net/pegasus: fix pegasus carrier detection

2007-04-25 Thread Jeff Garzik
The patch went upstream ~24 hours ago: 
c43c49bd61fdb9bb085ddafcaadb17d06f95ec43


Upstream is the base for any new patches.

Jeff




Re: [PATCH] cfg80211: new wireless config infrastructure

2007-04-25 Thread Ingo Oeser
Hi there,

John W. Linville schrieb:
 From: Johannes Berg [EMAIL PROTECTED]
 --- /dev/null
 +++ b/net/wireless/core.c
 @@ -0,0 +1,209 @@
 +/*
 + * This is the linux wireless configuration interface.
 + *
 + * Copyright 2006, 2007  Johannes Berg [EMAIL PROTECTED]
 + */
 +
 +#include <linux/if.h>
 +#include <linux/module.h>
 +#include <linux/err.h>
 +#include <linux/mutex.h>
 +#include <linux/list.h>
 +#include <linux/nl80211.h>
 +#include <linux/debugfs.h>
 +#include <linux/notifier.h>
 +#include <linux/device.h>
 +#include <net/genetlink.h>
 +#include <net/cfg80211.h>
 +#include <net/wireless.h>
 +#include "core.h"
 +#include "sysfs.h"
 +
 +/* name for sysfs, %d is appended */
 +#define PHY_NAME "phy"
 +
 +MODULE_AUTHOR("Johannes Berg");
 +MODULE_LICENSE("GPL");
 +MODULE_DESCRIPTION("wireless configuration support");
 +
 +/* RCU might be appropriate here since we usually
 + * only read the list, and that can happen quite
 + * often because we need to do it for each command */
 +LIST_HEAD(cfg80211_drv_list);
 +DEFINE_MUTEX(cfg80211_drv_mutex);
 +static int wiphy_counter;
 +
 +/* for debugfs */
 +static struct dentry *ieee80211_debugfs_dir;
 +
 +/* exported functions */
 +
 +struct wiphy *wiphy_new(struct cfg80211_ops *ops, int sizeof_priv)
 +{
 + struct cfg80211_registered_device *drv;
 + int alloc_size;
 +
 + alloc_size = sizeof(*drv) + sizeof_priv;
 +
 + drv = kzalloc(alloc_size, GFP_KERNEL);
 + if (!drv)
 + return NULL;
 +
 + drv->ops = ops;
 +
 + mutex_lock(&cfg80211_drv_mutex);
 +
 + if (unlikely(wiphy_counter<0)) {

mutex_unlock(&cfg80211_drv_mutex);

 + /* ugh, wrapped! */
 + kfree(drv);
 + return NULL;
 + }
 + drv->idx = wiphy_counter;
 +
 + /* give it a proper name */
 + snprintf(drv->wiphy.dev.bus_id, BUS_ID_SIZE,
 +  PHY_NAME "%d", drv->idx);
 +
 + /* now increase counter for the next time */
 + wiphy_counter++;
 + mutex_unlock(&cfg80211_drv_mutex);

Since drv and its contents are not visible to anyone yet, 
I suggest the following code flow for that:

	mutex_lock(&cfg80211_drv_mutex);

	drv->idx = wiphy_counter;

	/* increase counter for the next time, if id didn't wrap */
	if (drv->idx >= 0)
		wiphy_counter++;

	mutex_unlock(&cfg80211_drv_mutex);

	if (drv->idx < 0) {
		kfree(drv);
		return NULL;
	}

	/* give it a proper name */
	snprintf(drv->wiphy.dev.bus_id, BUS_ID_SIZE,
		 PHY_NAME "%d", drv->idx);

[enqueue to all lists here]
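
For illustration, the suggested flow can be tried out in userspace with a pthread mutex in place of the kernel mutex. This is a sketch only: `fake_drv`, `fake_wiphy_new` and `drv_mutex` are hypothetical names, `calloc`/`free` stand in for `kzalloc`/`kfree`, and the wrapped state is modelled simply by a negative counter.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t drv_mutex = PTHREAD_MUTEX_INITIALIZER;
static int wiphy_counter;

/* Hypothetical stand-in for struct cfg80211_registered_device. */
struct fake_drv {
	int idx;
};

static struct fake_drv *fake_wiphy_new(void)
{
	struct fake_drv *drv = calloc(1, sizeof(*drv));

	if (!drv)
		return NULL;

	/* Only the shared counter needs the lock. */
	pthread_mutex_lock(&drv_mutex);
	drv->idx = wiphy_counter;
	if (drv->idx >= 0)
		wiphy_counter++;
	pthread_mutex_unlock(&drv_mutex);

	/* drv is still private to this caller, so the error path runs unlocked. */
	if (drv->idx < 0) {
		free(drv);
		return NULL;
	}
	return drv;
}
```

The point of the arrangement is that the critical section shrinks to the counter access, and there is a single unlock instead of one on each exit path.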


Rest looks good so far.

Regards

Ingo Oeser


Re: [Fwd: [PATCH] [TIPC]: Enhancements to msg_set_bits() routine]

2007-04-25 Thread Ingo Oeser
Hi Jon,

Jon Paul Maloy schrieb:
 2) The code has been optimized to minimize the number of run-time
endianness conversion operations by leveraging the fact that the
mask (and, in some cases, the value as well) is constant and the
necessary conversion can be performed by the compiler.

3) It can be checked by sparse, if you use proper types.
 
 diff --git a/net/tipc/msg.h b/net/tipc/msg.h
 index 62d5490..5c64e55 100644
 --- a/net/tipc/msg.h
 +++ b/net/tipc/msg.h
 @@ -71,8 +71,11 @@ static inline void msg_set_word(struct tipc_msg *m, u32
 w, u32 val) static inline void msg_set_bits(struct tipc_msg *m, u32 w,
   u32 pos, u32 mask, u32 val)

static inline void msg_set_bits(struct tipc_msg *m, u32 w,
u32 pos, __be32 mask, __be32 val)


Care to resubmit?
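
To make the idea concrete, here is a userspace sketch of a bit-field setter that converts the mask and the value once each and keeps the header word in network byte order. It is an assumption-laden illustration, not the TIPC code: `fake_msg`, `fake_msg_set_bits` and `fake_msg_bits` are hypothetical, and `be32_t` is a plain typedef standing in for the kernel's sparse-checked `__be32`, so no type checking actually happens here.

```c
#include <stdint.h>
#include <arpa/inet.h>	/* htonl(), ntohl() */

typedef uint32_t be32_t;	/* stand-in for __be32; not checked by sparse */

struct fake_msg {
	be32_t hdr[4];		/* header words, kept in network byte order */
};

static inline void fake_msg_set_bits(struct fake_msg *m, uint32_t w,
				     uint32_t pos, uint32_t mask, uint32_t val)
{
	/* Shift in host order, then convert each operand exactly once. */
	be32_t be_mask = htonl(mask << pos);
	be32_t be_val  = htonl((val & mask) << pos);

	m->hdr[w] &= ~be_mask;
	m->hdr[w] |= be_val;
}

static inline uint32_t fake_msg_bits(const struct fake_msg *m, uint32_t w,
				     uint32_t pos, uint32_t mask)
{
	return (ntohl(m->hdr[w]) >> pos) & mask;
}
```

When `mask` and `pos` are compile-time constants, the compiler can fold `htonl(mask << pos)`, which is the saving the patch description refers to.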


Best Regards

Ingo Oeser


RE: [PATCH] IPROUTE: Modify tc for new PRIO multiqueue behavior

2007-04-25 Thread Waskiewicz Jr, Peter P
 -Original Message-
 From: J Hadi Salim [mailto:[EMAIL PROTECTED] On Behalf Of jamal
 Sent: Wednesday, April 25, 2007 4:37 AM
 To: Stephen Hemminger
 Cc: Waskiewicz Jr, Peter P; netdev@vger.kernel.org; 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]; cramerj; 
 Kok, Auke-jan H; Leech, Christopher; [EMAIL PROTECTED]
 Subject: Re: [PATCH] IPROUTE: Modify tc for new PRIO 
 multiqueue behavior
 
 On Tue, 2007-24-04 at 21:05 -0700, Stephen Hemminger wrote:
  Peter P Waskiewicz Jr wrote:
 
  Only if this is binary compatible with older kernels.
 
 It is not. But i think that is a lesser problem, the bigger 
 question is:
 Why would you need to change a qdisc just so you can support 
 egress multiqueues?

The previous version of my multiqueue patches I sent for consideration
had feedback from Patrick McHardy asking that the user be able to
configure the PRIO qdisc to run with multiqueue support or not.  That is
why TC needed a modification, since I agreed with Patrick that this
would be a useful option.

All the versions of multiqueue network device support I've sent for
consideration had PRIO modified to support multiqueue devices, since it
lends itself well for the model of multiple, independent flows.

 
 BTW, is there any reason this is being cced to lkml?

Since this change affects how tc interacts with the qdisc layer, I cced
lkml.

 
 cheers,
 jamal
 
 PS:- I havent read the kernel patches (i am congested and 
 about 1000 messages behind on netdev) and my opinions may be 
 influenced by an approach i have in trying to help someone 
 fixup a wireless driver with multiqueue support.

As long as someone is looking at them, I'll be happy.  :-)

Thanks,

-PJ Waskiewicz


[PATCH] infinite recursion in netlink

2007-04-25 Thread Alexey Kuznetsov
Hello!

Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.

The bug is present in all kernel versions since the feature appeared.

The patch also makes some minimal cleanup:

1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup

Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED]


diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index fc920f6..cac06c4 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -776,6 +776,8 @@ static void nl_fib_lookup(struct fib_res
 				       .nl_u = { .ip4_u = { .daddr = frn->fl_addr,
 							    .tos = frn->fl_tos,
 							    .scope = frn->fl_scope } } };
+
+	frn->err = -ENOENT;
 	if (tb) {
 		local_bh_disable();
 
@@ -787,6 +789,7 @@ static void nl_fib_lookup(struct fib_res
 			frn->nh_sel = res.nh_sel;
 			frn->type = res.type;
 			frn->scope = res.scope;
+			fib_res_put(&res);
 		}
 		local_bh_enable();
 	}
@@ -801,6 +804,9 @@ static void nl_fib_input(struct sock *sk
 	struct fib_table *tb;
 
 	skb = skb_dequeue(&sk->sk_receive_queue);
+	if (skb == NULL)
+		return;
+
 	nlh = (struct nlmsghdr *)skb->data;
 	if (skb->len < NLMSG_SPACE(0) || skb->len < nlh->nlmsg_len ||
 	    nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*frn))) {
@@ -813,7 +819,7 @@ static void nl_fib_input(struct sock *sk
 
 	nl_fib_lookup(frn, tb);
 
-	pid = nlh->nlmsg_pid;		/*pid of sending process */
+	pid = NETLINK_CB(skb).pid;	/* pid of sending process */
 	NETLINK_CB(skb).pid = 0;	/* from kernel */
 	NETLINK_CB(skb).dst_group = 0;	/* unicast */
 	netlink_unicast(sk, skb, pid, MSG_DONTWAIT);
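
Why the choice of `pid` matters can be shown with a tiny model. This is a sketch under assumptions, not kernel code: `fake_request` and the helper functions are hypothetical, address 0 is taken to mean the kernel side of the socket, and the header field is assumed to be whatever the sender wrote (often 0), while the control-block field is assumed to be filled in by the kernel with the sender's real address.

```c
#include <stdint.h>

/* Hypothetical view of the two places a sender address can be read from. */
struct fake_request {
	uint32_t header_pid;	/* nlmsg_pid: written by the sender, unchecked */
	uint32_t sender_pid;	/* stands in for NETLINK_CB(skb).pid */
};

/* Address 0 denotes the kernel side, so a reply sent there comes straight back. */
static int reply_loops_to_kernel(uint32_t dst_pid)
{
	return dst_pid == 0;
}

/* Old behaviour: trust the header field. */
static uint32_t reply_dst_old(const struct fake_request *rq)
{
	return rq->header_pid;
}

/* New behaviour: use the address recorded when the message was sent. */
static uint32_t reply_dst_new(const struct fake_request *rq)
{
	return rq->sender_pid;
}
```

In this model a request with a zero header pid sends the old reply back to the kernel input handler, which is the recursion the patch description reports.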


sysctls

2007-04-25 Thread Andrew Morton

I note that the networking tree is adding new sysctls:

<<<<<<< HEAD/include/linux/sysctl.h
	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
=======
	NET_IPV6_OPTIMISTIC_DAD=24,
	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
>>>>>>> /include/linux/sysctl.h

(Well, it's trying to - there are some git rejects in net-2.6.22)

But we kind-of decided a while back to stop doing that and to
use CTL_UNNUMBERED.

Frankly, I don't 100% remember the thinking - Eric, can you please remind
us?


Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Miller
From: David Howells [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 14:38:32 +0100

 I think the idea is for them (or at least some of them) to go
 through one of DaveM's net git trees anyway.

Then please generate your patches against my net-2.6.21 GIT
tree.  Most of your initial patches in the series (the SKB
routine one for example) are already in my tree.


Re: netlink locking warnings in 2.6.21-rc7-mm1

2007-04-25 Thread Andrew Morton

I just retested bare net-2.6.22, pulled 30 minutes ago.  I got just one
warning:


PM: Removing info for No Bus::06:0b.0
eth0: no IPv6 routers present
ipw2200: Radio Frequency Kill Switch is On:
Kill switch must be turned off for wireless networking to work.
PM: Adding info for No Bus:eth1
ipw2200: Detected geography ZZA (11 802.11bg channels, 13 802.11a channels)
ipw2200: Failed to send WEP_KEY: Aborted due to RF kill switch.
ipw2200: Failed to send WEP_KEY: Command timed out.
ipw2200: Failed to send WEP_KEY: Command timed out.
BUG: at kernel/mutex-debug.c:82 debug_mutex_unlock()
 [c012d18a] debug_mutex_unlock+0x5a/0x134
 [c02d67e2] __mutex_unlock_slowpath+0x9d/0xcf
 [f8c3618b] ipw_wx_set_encode+0x0/0x82 [ipw2200]
 [c028b92c] rtnl_unlock+0xa/0x29
 [c0286651] dev_ioctl+0x3d0/0x402
 [c014b078] __handle_mm_fault+0x7c6/0x7e8
 [c01a649b] selinux_file_alloc_security+0x1f/0x40
 [c027b943] sock_ioctl+0x0/0x1be
 [c0162925] do_ioctl+0x19/0x4d
 [c0162b58] vfs_ioctl+0x1ff/0x216
 [c0162bbb] sys_ioctl+0x4c/0x65
 [c0103b0c] syscall_call+0x7/0xb
 [c02d] unix_dgram_sendmsg+0x76/0x400
 ===

It's 100% reproducible here, using
http://userweb.kernel.org/~akpm/config-sony.txt





The weird ASSERT_RTNL warnings aren't there, so something else in -mm
(prior to git-net.patch in the series file) would appear to be interacting
with net changes.



Re: sysctls

2007-04-25 Thread Eric W. Biederman
Andrew Morton [EMAIL PROTECTED] writes:

 I note that the networking tree is adding new sysctls:

 <<<<<<< HEAD/include/linux/sysctl.h
 	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 =======
 	NET_IPV6_OPTIMISTIC_DAD=24,
 	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 >>>>>>> /include/linux/sysctl.h

 (Well, it's trying to - there are some git rejects in net-2.6.22)

 But we kind-of decided a while back to stop doing that and to
 use CTL_UNNUMBERED.

 Frankly, I don't 100% remember the thinking - Eric, can you please remind
 us?

The thinking is this:

  Binary sysctl numbers are a problem because of patch conflicts like
  the above, and the related user space breakage they cause.

  In practice no one uses binary sysctl numbers.

So the policy should be to add new sysctl's using CTL_UNNUMBERED
(to prevent patch conflicts and user space breakage).

There may be cases where someone actually needs the binary sysctl
interface.  Once there is a demonstrated need we can go back
and very carefully add numbers for these very few cases, with
a strong review process.

Adding binary sysctl numbers should be done as carefully as and with
as much review as adding syscall numbers, and distro kernels and other
stable kernels should never get a sysctl number backport until the
number first reaches Linus's tree.  To avoid difference in meaning
between different kernels.

Given that no one except on BSD uses the binary sysctl interface
anyway my personal preference is to just freeze it and to reduce the
number of binary sysctls we support if possible.

Eric


Re: sysctls

2007-04-25 Thread Neil Horman
On Wed, Apr 25, 2007 at 01:45:19PM -0600, Eric W. Biederman wrote:
 Andrew Morton [EMAIL PROTECTED] writes:
 
  I note that the networking tree is adding new sysctls:
 
  <<<<<<< HEAD/include/linux/sysctl.h
  	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
  =======
  	NET_IPV6_OPTIMISTIC_DAD=24,
  	NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
  >>>>>>> /include/linux/sysctl.h
 
  (Well, it's trying to - there are some git rejects in net-2.6.22)
 
  But we kind-of decided a while back to stop doing that and to
  use CTL_UNNUMBERED.
 
  Frankly, I don't 100% remember the thinking - Eric, can you please remind
  us?
 
 The thinking is this:
 
   Binary sysctl numbers are a problem because of patch conflicts like
   the above, and the related user space breakage they cause.
 
   In practice no one uses binary sysctl numbers.
 
 So the policy should be to add new sysctl's using CTL_UNNUMBERED
 (to prevent patch conflicts and user space breakage).
 
 There may be cases where someone actually needs the binary sysctl
 interface.  Once there is a demonstrated need we can go back
 and very carefully add numbers for these very few cases, with
 a strong review process.
 
 Adding binary sysctl numbers should be done as carefully as and with
 as much review as adding syscall numbers, and distro kernels and other
 stable kernels should never get a sysctl number backport until the
 number first reaches Linus's tree.  To avoid difference in meaning
 between different kernels.
 
 Given that no one except on BSD uses the binary sysctl interface
 anyway my personal preference is to just freeze it and to reduce the
 number of binary sysctls we support if possible.
 
 Eric

I did the optimistic dad sysctl, and have no strict use for numbered sysctls (I
was just unaware of the policy).  I'll work up a patch to use
register_sysctl_table with CTL_UNNUMBERED in the next few days.

Regards
Neil



Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Howells
David Miller [EMAIL PROTECTED] wrote:

 Then please generate your patches against my net-2.6.21 GIT
 tree.  Most of your initial patches in the series (the SKB
 routine one for example) are already in my tree.

Do you mean your net-2.6.22 GIT tree?

Do you want me to make it available as a GIT tree for you to pull?  Or would
you prefer patches?

David


Re: sysctls

2007-04-25 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 12:29:24 -0700

 
 I note that the networking tree is adding new sysctls:
 
 <<<<<<< HEAD/include/linux/sysctl.h
 NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 =======
 NET_IPV6_OPTIMISTIC_DAD=24,
 NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 >>>>>>> /include/linux/sysctl.h
 
 (Well, it's trying to - there are some git rejects in net-2.6.22)

I knew this was going to happen because of Yoshifuji's
security fix, the conflict is trivial to resolve.

I'll rebase the net-2.6.22 tree later today since all
we should have before 2.6.21-final is the netlink
OOPS'er fix Alexey just posted.


Re: [PATCH 00/16] AF_RXRPC socket family and AFS rewrite [try #3]

2007-04-25 Thread David Miller
From: David Howells [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 20:56:47 +0100

 David Miller [EMAIL PROTECTED] wrote:
 
  Then please generate your patches against my net-2.6.21 GIT
  tree.  Most of your initial patches in the series (the SKB
  routine one for example) are already in my tree.
 
 Do you mean your net-2.6.22 GIT tree?
 
 Do you want me to make it available as a GIT tree for you to pull?  Or would
 you prefer patches?

Just patches is perfectly fine.

Also, if it's easier to diff against -mm, that works too
since Andrew integrates my net-2.6.22 tree into -mm most
of the time.


Re: [PATCH] infinite recursion in netlink

2007-04-25 Thread Greg KH
On Wed, Apr 25, 2007 at 10:38:56PM +0400, Alexey Kuznetsov wrote:
 Hello!
 
 Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
 which resulted in infinite recursion and stack overflow.
 
 The bug is present in all kernel versions since the feature appeared.

Any hint on when this feature appeared so that we can notify the distros
for older releases?

thanks,

greg k-h


Re: [PATCH] infinite recursion in netlink

2007-04-25 Thread David Miller
From: Greg KH [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 12:59:41 -0700

 On Wed, Apr 25, 2007 at 10:38:56PM +0400, Alexey Kuznetsov wrote:
  Hello!
  
  Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
  which resulted in infinite recursion and stack overflow.
  
  The bug is present in all kernel versions since the feature appeared.
 
 Any hint on when this feature appeared so that we can notify the distros
 for older releases?

It's been there since Jun 20th, 2005

commit 246955fe4c38bd706ae30e37c64892c94213775d
Author: Robert Olsson [EMAIL PROTECTED]
Date:   Mon Jun 20 13:36:39 2005 -0700

[NETLINK]: fib_lookup() via netlink

Below is a more generic patch to do fib_lookup via netlink. For others
we should say that we discussed this as a way to verify route selection.
It's also possible there are other uses for this.

In short the first half of struct fib_result_nl is filled in by caller
and netlink call fills in the other half and returns it.

In case anyone is interested there is a corresponding user app to compare
the full routing table this was used to test implementation of the LC-trie.

Signed-off-by: David S. Miller [EMAIL PROTECTED]

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index e38407a..561d4dc 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -14,6 +14,7 @@
 #define NETLINK_SELINUX     7   /* SELinux event notifications */
 #define NETLINK_ARPD        8
 #define NETLINK_AUDIT       9   /* auditing */
+#define NETLINK_FIB_LOOKUP  10
 #define NETLINK_ROUTE6      11  /* af_inet6 route comm channel */
 #define NETLINK_IP6_FW      13
 #define NETLINK_DNRTMSG     14  /* DECnet routing messages */
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index e5a5f6b..a4208a3 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -109,6 +109,20 @@ struct fib_result {
 #endif
 };
 
+struct fib_result_nl {
+   u32 fl_addr;   /* To be looked up*/ 
+   u32 fl_fwmark; 
+   unsigned char   fl_tos;
+   unsigned char   fl_scope;
+   unsigned char   tb_id_in;
+
+   unsigned char   tb_id;  /* Results */
+   unsigned char   prefixlen;
+   unsigned char   nh_sel;
+   unsigned char   type;
+   unsigned char   scope;
+   int err;  
+};
 
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
 
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 563e7d6..cd8e45a 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -516,6 +516,60 @@ static void fib_del_ifaddr(struct in_ifaddr *ifa)
 #undef BRD1_OK
 }
 
+static void nl_fib_lookup(struct fib_result_nl *frn, struct fib_table *tb )
+{
+
+	struct fib_result       res;
+	struct flowi            fl = { .nl_u = { .ip4_u = { .daddr = frn->fl_addr,
+							    .fwmark = frn->fl_fwmark,
+							    .tos = frn->fl_tos,
+							    .scope = frn->fl_scope } } };
+	if (tb) {
+		local_bh_disable();
+
+		frn->tb_id = tb->tb_id;
+		frn->err = tb->tb_lookup(tb, &fl, &res);
+
+		if (!frn->err) {
+			frn->prefixlen = res.prefixlen;
+			frn->nh_sel = res.nh_sel;
+			frn->type = res.type;
+			frn->scope = res.scope;
+		}
+		local_bh_enable();
+	}
+}
+
+static void nl_fib_input(struct sock *sk, int len)
+{
+	struct sk_buff *skb = NULL;
+	struct nlmsghdr *nlh = NULL;
+	struct fib_result_nl *frn;
+	int err;
+	u32 pid;
+	struct fib_table *tb;
+
+	skb = skb_recv_datagram(sk, 0, 0, &err);
+	nlh = (struct nlmsghdr *)skb->data;
+
+	frn = (struct fib_result_nl *) NLMSG_DATA(nlh);
+	tb = fib_get_table(frn->tb_id_in);
+
+	nl_fib_lookup(frn, tb);
+
+	pid = nlh->nlmsg_pid;           /*pid of sending process */
+	NETLINK_CB(skb).groups = 0;     /* not in mcast group */
+	NETLINK_CB(skb).pid = 0;         /* from kernel */
+	NETLINK_CB(skb).dst_pid = pid;
+	NETLINK_CB(skb).dst_groups = 0;  /* unicast */
+	netlink_unicast(sk, skb, pid, MSG_DONTWAIT);
+}
+
+static void nl_fib_lookup_init(void)
+{
+	netlink_kernel_create(NETLINK_FIB_LOOKUP, nl_fib_input);
+}
+
 static void fib_disable_ip(struct net_device *dev, int force)
 {
if (fib_sync_down(0, dev, force))
@@ -604,6 +658,7 @@ void __init ip_fib_init(void)
 
 	register_netdevice_notifier(&fib_netdev_notifier);
 	register_inetaddr_notifier(&fib_inetaddr_notifier);
+   nl_fib_lookup_init();
 }
 
 EXPORT_SYMBOL(inet_addr_type);

Re: sysctls

2007-04-25 Thread Eric W. Biederman
David Miller [EMAIL PROTECTED] writes:

 From: Andrew Morton [EMAIL PROTECTED]
 Date: Wed, 25 Apr 2007 12:29:24 -0700

 
 I note that the networking tree is adding new sysctls:
 
 <<<<<<< HEAD/include/linux/sysctl.h
 NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 =======
 NET_IPV6_OPTIMISTIC_DAD=24,
 NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
 >>>>>>> /include/linux/sysctl.h
 
 (Well, it's trying to - there are some git rejects in net-2.6.22)

 I knew this was going to happen because of Yoshifuji's
 security fix, the conflict is trivial to resolve.

 I'll rebase the net-2.6.22 tree later today since all
 we should have before 2.6.21-final is the netlink
 OOPS'er fix Alexey just posted.

David for clarity do you happen to know of anyone using binary
sysctl values?

In particular is there any reason not to use CTL_UNNUMBERED
for new networking sysctls?

Eric



Re: [PATCH] infinite recursion in netlink

2007-04-25 Thread David Miller
From: Alexey Kuznetsov [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 22:38:56 +0400

 Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
 which resulted in infinite recursion and stack overflow.
 
 The bug is present in all kernel versions since the feature appeared.
 
 The patch also makes some minimal cleanup:
 
 1. Return something consistent (-ENOENT) when fib table is missing
 2. Do not crash when queue is empty (does not happen, but yet)
 3. Put result of lookup
 
 Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED]

Applied, thanks a lot Alexey.


2.6.20.7 mss negotiation and path mtu discovery mostly broken?

2007-04-25 Thread Ristuccia, Brian
I had previously posted this message to linux-kernel, but David Miller
asked me to post here instead. Some replies to my message on l-k have
already been copied here. I'm seeing a problem where the kernel attempts
to send packets with a MSS larger than the one negotiated when the TCP
connection is established. Even after ICMP can't fragment messages
arrive, the kernel still attempts to increase the MSS rather
aggressively. The end result is extremely poor throughput when sending
to a network with a smaller MTU. 

In /proc/sys/net/ipv4:
ip_no_pmtu_disc:0
tcp_mtu_probing:0

The sending host (10.2.10.254) has an MTU of 9000. The destination host
(12.33.234.69) has an MTU of 1500. There is one router between the hosts
which will drop packets with the DF flag when they don't fit the
destination interface's MTU and generates the required icmp can't
fragment message. 

The dump shows the initial handshake with correct mss options sent:

08:39:55.493029 IP 12.33.234.69.35026 > 10.2.10.254.22: S 2768979373:2768979373(0) win 5840 <mss 1460,sackOK,timestamp 3873837730 0,nop,wscale 2>
08:39:55.493119 IP 10.2.10.254.22 > 12.33.234.69.35026: S 963242385:963242385(0) ack 2768979374 win 17896 <mss 8960,sackOK,timestamp 413751 3873837730,nop,wscale 5>

Then I see the system send larger packets (larger than the mss),
provoking a can't fragment from the router. Now I suppose it might be
reasonable to occasionally probe a larger MSS when the current MSS is a
result of reductions due to path mtu discovery. After all, the path
taken could change over time. But when the current MSS is at the value
negotiated by the MSS option during the TCP handshake, it seems like
there's no sense in trying to send with a larger MSS. Even if there were,
there's certainly no justification for making such an attempt every
other packet (2.6.18) or every fourth packet (2.6.20.7). 

In the following dump, the system eventually gets in a state where it
oscillates between sending undeliverable 2896 byte packets and
deliverable 1448 byte ones. 

08:39:55.649689 IP 10.2.10.254.22 > 12.33.234.69.35026: . 5906:10250(4344) ack 1794 win 674 <nop,nop,timestamp 413790 3873837887>
08:39:55.650532 IP 10.2.10.1 > 10.2.10.254: icmp 92: 12.33.234.69 unreachable - need to frag (mtu 1500)
08:39:55.689774 IP 12.33.234.69.35026 > 10.2.10.254.22: . ack 5906 win 4544 <nop,nop,timestamp 3873837927 413790>
08:39:55.689784 IP 10.2.10.254.22 > 12.33.234.69.35026: . 10250:13146(2896) ack 1794 win 674 <nop,nop,timestamp 413800 3873837927>
08:39:55.690497 IP 10.2.10.1 > 10.2.10.254: icmp 92: 12.33.234.69 unreachable - need to frag (mtu 1500)
08:39:55.902494 IP 10.2.10.254.22 > 12.33.234.69.35026: . 5906:7354(1448) ack 1794 win 674 <nop,nop,timestamp 413853 3873837927>

Since any sane router will only generate can't fragment ICMP's at a
limited rate, for two hosts on gigabit ethernet, one on a MTU 1500
subnet and another on a MTU 9000 subnet, I can move only 40-50KB/sec
over an affected TCP connection. 

I was unable to find any reference to this problem in the kernel
changelogs, or even any reports of anyone else having a similar problem.
The above dumps are from 2.6.19.7. I could also reproduce the problem in
2.6.18, although the dumps looked slightly different. I was unable to
reproduce this problem with the 2.6.9-42.0.10.ELsmp kernel which ships
in RHEL4. 

I can send a pcap dump to anyone interested. 

-- 
Brian Ristuccia
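
The segment sizes in the dumps above can be checked with a little arithmetic.
The following is a sketch under stated assumptions: IPv4 and TCP headers of 20
bytes each without IP options, and 12 bytes of TCP options per data segment
for the padded timestamp option seen in the dump. The helper names are
hypothetical. With the peer's advertised MSS of 1460 this gives 1448 bytes of
payload per packet, and the 2896 and 4344 byte sends are exactly two and three
such segments.

```c
#include <assert.h>

enum { TOY_IPV4_HDR = 20, TOY_TCP_HDR = 20 };

/* Largest TCP payload per packet: the smaller of the peer's advertised
 * MSS and what the path MTU allows, minus per-segment TCP options. */
int toy_payload_per_segment(int peer_mss, int path_mtu, int tcp_option_bytes)
{
    int by_mtu;
    int mss;

    assert(peer_mss > 0 && tcp_option_bytes >= 0);
    assert(path_mtu > TOY_IPV4_HDR + TOY_TCP_HDR);

    by_mtu = path_mtu - TOY_IPV4_HDR - TOY_TCP_HDR;
    mss = peer_mss < by_mtu ? peer_mss : by_mtu;
    return mss - tcp_option_bytes;
}

/* Number of full segments a captured payload length corresponds to,
 * or 0 if the length is not a whole multiple of the segment payload. */
int toy_full_segments(int captured_len, int payload_per_segment)
{
    assert(payload_per_segment > 0);
    if (captured_len <= 0 || captured_len % payload_per_segment != 0)
        return 0;
    return captured_len / payload_per_segment;
}
```

For example, toy_payload_per_segment(1460, 9000, 12) yields 1448, and
toy_full_segments(2896, 1448) yields 2.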




Re: sysctls

2007-04-25 Thread David Miller
From: [EMAIL PROTECTED] (Eric W. Biederman)
Date: Wed, 25 Apr 2007 14:06:34 -0600

 David for clarity do you happen to know of anyone using binary
 sysctl values?

None at all.

 In particular is there any reason not to use CTL_UNNUMBERED
 for new networking sysctls?

Neil said he would send me a patch to do that.

Thanks.


RE: 2.6.20.7 mss negotiation and path mtu discovery mostly broken?

2007-04-25 Thread Ristuccia, Brian

 I'm seeing a 
 problem where the kernel attempts to send packets with a MSS 
 larger than the one negotiated when the TCP connection is 
 established. Even after ICMP can't fragment messages 
 arrive, the kernel still attempts to increase the MSS rather 
 aggressively. The end result is extremely poor throughput 
 when sending to a network with a smaller MTU. 

I've tracked this problem to the TSO feature in the bnx2 driver. Turning
off TSO with ethtool -K eth1 tso off seems to work around the problem.
It appears that the bnx2 device is not using the correct mss when
performing segmentation offload. 

-Brian




Re: sysctls

2007-04-25 Thread Eric W. Biederman
David Miller [EMAIL PROTECTED] writes:

 From: [EMAIL PROTECTED] (Eric W. Biederman)
 Date: Wed, 25 Apr 2007 14:06:34 -0600

 David for clarity do you happen to know of anyone using binary
 sysctl values?

 None at all.

 In particular is there any reason not to use CTL_UNNUMBERED
 for new networking sysctls?

 Neil said he would send me a patch to do that.

Thanks. I just wanted to be certain I wasn't missing something,
when asking people not to use binary sysctl values.

Eric


Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread Linus Torvalds


On Wed, 25 Apr 2007, Alexey Kuznetsov wrote:
 
 Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
 which resulted in infinite recursion and stack overflow.

So I assume it's this line that actually _fixes_ it:

 - pid = nlh->nlmsg_pid;   /*pid of sending process */
 + pid = NETLINK_CB(skb).pid;   /* pid of sending process */
   NETLINK_CB(skb).pid = 0; /* from kernel */
   NETLINK_CB(skb).dst_group = 0;  /* unicast */
   netlink_unicast(sk, skb, pid, MSG_DONTWAIT);

No?

If so, shouldn't we also have some safety-net to make sure it doesn't 
still get routed back forever, ie adding something like

if (!pid) {
skb_free(skb);
return -EINVAL;
}

or similar? I don't know the netlink layer from a dolphin, but if the old 
code could cause infinite recursion, it sounds like the new code could too 
with the right pid, since the only change is the choice of pid.

Yes/No/This is why Linus is a dickweed and doesn't understand the problem?

Linus


Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread David Miller
From: Linus Torvalds [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 13:15:12 -0700 (PDT)

 If so, shouldn't we also have some safety-net to make sure it doesn't 
 still get routed back forever, ie adding something like
 
   if (!pid) {
   skb_free(skb);
   return -EINVAL;
   }
 
 or similar? I don't know the netlink layer from a dolphin, but if the old 
 code could cause infinite recursion, it sounds like the new code could too 
 with the right pid, since the only change is the choice of pid.

Netlink pids are more like port numbers in the socket sense, do
not confuse them with process pids or similar.

The kernel explicitly assigns them to sockets, and zero is special.
The fact that the process pid of the socket creator is used as
an initial selection heuristic, is just that, a heuristic.

Alexey's fix is 100% the right way to go IMHO.
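
The difference between the two choices of pid can be sketched with a userspace
toy model. Everything here is hypothetical (struct, field and function names);
it only assumes what is said above, namely that port id 0 stands for the
kernel and that the transport, not the sender, records the sending socket's
port id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_KERNEL_PORT 0u  /* port id 0 stands for the kernel in this sketch */

struct toy_msg {
    uint32_t hdr_pid;  /* field inside the message, written by the sender */
    uint32_t cb_pid;   /* sender's port id, recorded by the transport */
};

/* Old behaviour: address the reply using the sender-written header field. */
uint32_t toy_reply_port_from_header(const struct toy_msg *m)
{
    assert(m != NULL);
    return m->hdr_pid;
}

/* Fixed behaviour: address the reply using the transport-recorded port id. */
uint32_t toy_reply_port_from_cb(const struct toy_msg *m)
{
    assert(m != NULL);
    return m->cb_pid;
}

/* A reply addressed to the kernel port lands in the kernel's own input
 * handler again, which is the loop described in this thread. */
int toy_reply_loops_back(uint32_t reply_port)
{
    return reply_port == TOY_KERNEL_PORT;
}
```

A sender that leaves the header field at 0 makes the old choice loop back,
while the recorded port id of a user socket is nonzero in this model.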


Re: netlink locking warnings in 2.6.21-rc7-mm1

2007-04-25 Thread Patrick McHardy
Andrew Morton wrote:
 I just retested bare net-2.6.22, pulled 30 minutes ago.  I got just one
 warning:
 
 BUG: at kernel/mutex-debug.c:82 debug_mutex_unlock()
  [c012d18a] debug_mutex_unlock+0x5a/0x134
  [c02d67e2] __mutex_unlock_slowpath+0x9d/0xcf
  [f8c3618b] ipw_wx_set_encode+0x0/0x82 [ipw2200]
  [c028b92c] rtnl_unlock+0xa/0x29
  [c0286651] dev_ioctl+0x3d0/0x402
  [c014b078] __handle_mm_fault+0x7c6/0x7e8
  [c01a649b] selinux_file_alloc_security+0x1f/0x40
  [c027b943] sock_ioctl+0x0/0x1be
  [c0162925] do_ioctl+0x19/0x4d
  [c0162b58] vfs_ioctl+0x1ff/0x216
  [c0162bbb] sys_ioctl+0x4c/0x65
  [c0103b0c] syscall_call+0x7/0xb
  [c02d] unix_dgram_sendmsg+0x76/0x400
  ===
 
 It's 100% reproducible here, using
 http://userweb.kernel.org/~akpm/config-sony.txt
 
 
 The weird ASSERT_RTNL warnings aren't there, so something else in -mm
 (prior to git-net.patch in the series file) would appear to be interacting
 with net changes.


I think I found the problem, the rtnl_mutex was reinitialized on every
rtnetlink socket creation. This is most likely responsible for both
warnings.

[NETLINK]: don't reinitialize callback mutex

Don't reinitialize the callback mutex the netlink_kernel_create caller
handed in, it is supposed to already be initialized and could already
be held by someone.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 9cc4e9c2d8b022c10ded98610a3cd76a8b89cf49
tree e53f10a158858e20ef2e9922cabc5bf43980708d
parent 7255fbb088e3f1b8be97472a38f645a8da595fe2
author Patrick McHardy [EMAIL PROTECTED] Wed, 25 Apr 2007 22:47:20 +0200
committer Patrick McHardy [EMAIL PROTECTED] Wed, 25 Apr 2007 22:47:20 +0200

 net/netlink/af_netlink.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index ec16c9b..64d4b27 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -388,8 +388,12 @@ static int __netlink_create(struct socket *sock, struct mutex *cb_mutex,
 	sock_init_data(sock, sk);
 
 	nlk = nlk_sk(sk);
-	nlk->cb_mutex = cb_mutex ? : &nlk->cb_def_mutex;
-	mutex_init(nlk->cb_mutex);
+	if (cb_mutex)
+		nlk->cb_mutex = cb_mutex;
+	else {
+		nlk->cb_mutex = &nlk->cb_def_mutex;
+		mutex_init(nlk->cb_mutex);
+	}
 	init_waitqueue_head(&nlk->wait);
 
 	sk->sk_destruct = netlink_sock_destruct;
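
The effect of the change can be shown with a userspace toy model. This is a
sketch only: toy_mutex is a stand-in that merely tracks an "initialized" and a
"held" flag, and all names are hypothetical. It assumes, as the commit message
says, that a caller-supplied mutex is already initialized and may be held.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mutex {
    int initialized;
    int held;
};

/* Initializing resets the lock state, forgetting any current holder. */
void toy_mutex_init(struct toy_mutex *m)
{
    assert(m != NULL);
    m->initialized = 1;
    m->held = 0;
}

void toy_mutex_lock(struct toy_mutex *m)
{
    assert(m != NULL && m->initialized && !m->held);
    m->held = 1;
}

struct toy_sock {
    struct toy_mutex *cb_mutex;     /* mutex actually used */
    struct toy_mutex cb_def_mutex;  /* private default */
};

/* Old behaviour: always initialize, even a mutex handed in by the caller. */
void toy_sock_setup_buggy(struct toy_sock *s, struct toy_mutex *cb_mutex)
{
    assert(s != NULL);
    s->cb_mutex = cb_mutex ? cb_mutex : &s->cb_def_mutex;
    toy_mutex_init(s->cb_mutex);
}

/* New behaviour: initialize only the private default mutex. */
void toy_sock_setup_fixed(struct toy_sock *s, struct toy_mutex *cb_mutex)
{
    assert(s != NULL);
    if (cb_mutex) {
        s->cb_mutex = cb_mutex;
    } else {
        s->cb_mutex = &s->cb_def_mutex;
        toy_mutex_init(s->cb_mutex);
    }
}
```

If a shared mutex is held while another socket is created, the buggy setup
clears its held flag and the fixed setup leaves it alone.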


Re: netlink locking warnings in 2.6.21-rc7-mm1

2007-04-25 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 22:51:43 +0200

 [NETLINK]: don't reinitialize callback mutex
 
 Don't reinitialize the callback mutex the netlink_kernel_create caller
 handed in, it is supposed to already be initialized and could already
 be held by someone.
 
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

Applied, thanks a lot for tracking this down Patrick.


very strange inet_sock corruption with rpc

2007-04-25 Thread Vlad Yasevich
Hi All

To support a piece of custom functionality, we needed to add
2 members to the struct inet_sock.  During testing, we started
seeing an interesting corruption.  Following a hunch, we've
completely ripped out all of our code with the exception of
5 lines that do this:

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index ce6da97..605f5c0 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -140,6 +140,8 @@ struct inet_sock {
 		__be32			addr;
 		struct flowi		fl;
 	} cork;
+	void *foo;
+	u32  bar;
 };
 
 #define IPCORK_OPT 1   /* ip-options has been held in ipcork.opt */
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index cf358c8..98ad2c2 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -335,6 +335,9 @@ lookup_protocol:
 
 	sk_refcnt_debug_inc(sk);
 
+	inet->foo = NULL;
+	inet->bar = 0;
+
 	if (inet->num) {
 		/* It assumes that any protocol which allows
 		 * the user to assign a number at socket

(Variables were really named something else, but I hacked this into
 net-2.6 to see if I could reproduce).

With just the above patch, I can catch a corruption of the inet_sock
in the inet_csk_bind_conflict() with this:

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 43fb160..5cd5b6d 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -45,6 +45,18 @@ int inet_csk_bind_conflict(const struct sock *sk,
 	int reuse = sk->sk_reuse;
 
 	sk_for_each_bound(sk2, node, &tb->owners) {
+		if (inet_sk(sk2)->foo) {
+			printk(KERN_WARN "sk2 might be corrupt.  Info:\n");
+			printk(KERN_WARN "\tsk2 = %p\n", sk2);
+			printk(KERN_WARN "\ttb->port = %d\n", tb->port);
+			printk(KERN_WARN "\tinet_sk(sk2)->num = %d\n",
+				inet_sk(sk2)->num);
+			printk(KERN_WARN "\tinet_sk(sk2)->foo = %p\n",
+				inet_sk(sk2)->foo);
+			printk(KERN_WARN "\tinet_sk(sk2)->bar = %p\n",
+				inet_sk(sk2)->bar);
+			WARN_ON(1);
+		}

Nobody outside of inet_create() writes to the foo pointer so it should
always be NULL.  I've enabled SLAB debugging, stack overflow debugging, VM
debugging and nothing triggers.

The corruption is triggered after about 10 minutes of running the following
script:

nfspath=$1
localpath=$2
while true; do
    mount "$nfspath" "$localpath"
    sleep 5
    cp /boot/vmlinuz "$localpath"
    sleep 5
    rm "$localpath/vmlinuz"
    sleep 5
    umount "$localpath"
done


And looks like this:

sk2 might be corrupt.  Info:
sk2 = 8100f004d080
tb->port = 844
inet_sk(sk2)->num = 61695
inet_sk(sk2)->foo = 24242424243f243f
inet_sk(sk2)->bar = 3f24243f
BUG: at net/ipv4/inet_connection_sock.c:58 inet_csk_bind_conflict()

Call Trace:
 [803cc591] inet_csk_bind_conflict+0xcb/0x178
 [803cc4c6] inet_csk_bind_conflict+0x0/0x178
 [803cc2ff] inet_csk_get_port+0x11a/0x1ef
 [803ddf51] inet_bind+0x117/0x1f5
 [88184e13] :sunrpc:xs_bindresvport+0x4e/0xbf
 [881853a4] :sunrpc:xs_tcp_connect_worker+0x0/0x2a0
 [88185433] :sunrpc:xs_tcp_connect_worker+0x8f/0x2a0
 [80248bd3] run_workqueue+0x8f/0x137
 [80245687] worker_thread+0x0/0x14a
 [8024579b] worker_thread+0x114/0x14a
 [8027e544] default_wake_function+0x0/0xe
 [8022ff49] kthread+0xd1/0x100
 [80258f68] child_rip+0xa/0x12
 [8022fe78] kthread+0x0/0x100
 [80258f5e] child_rip+0x0/0x12


It looks like someone is stepping all over the inet_sock.
We'll continue looking, but if anyone has any ideas of what might
be going on, I'd appreciate it.

It looks like a serious bug lurking somewhere.

-vlad

p.s  the mount is using nfsv3 over UDP (nothing fancy at all)


Re: sysctls

2007-04-25 Thread Andrew Morton
On Wed, 25 Apr 2007 15:53:19 -0400
Neil Horman [EMAIL PROTECTED] wrote:

 I did the optimistic dad sysctl, and have no strict use for numbered sysctls (I
 was just unaware of the policy).  I'll work up a patch to use
 register_sysctl_table with CTL_UNNUMBERED in the next few days.

I don't think you need to add a call to register_sysctl_table(), if that's
what you're proposing.  Just drop the changes to sysctl.h and use CTL_UNNUMBERED
in sysctl.c.


Re: [GIT PATCH] [net-2.6.22] IPv6, IPv4 Updates

2007-04-25 Thread David Miller
From: YOSHIFUJI Hideaki / 吉藤英明 [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 21:55:21 +0900 (JST)

 Please consider pulling following commits available on
   net-2.6.22-20070425a-inet6-cleanup-20070425
 branch at
   git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev.git.
 
 HEADLINES
 -
 
 [IPV6] SIT: Unify code path to get hash array index.
 [IPV4] IPIP: Unify code path to get hash array index.
 [IPV4] IP_GRE: Unify code path to get hash array index.
 [IPV6]: Export in6addr_any for future use.
 [IPV6] XFRM: Use ip6addr_any where applicable.
 [IPV6] NDISC: Unify main process of sending ND messages.

Pulled, thanks a lot!


Re: very strange inet_sock corruption with rpc

2007-04-25 Thread Sridhar Samudrala
On Wed, 2007-04-25 at 17:03 -0400, Vlad Yasevich wrote:
 Hi All
 
 To support a piece of custom functionality, we needed to add
 2 members to the struct inet_sock.  During testing, we started
 seeing an interesting corruption.  Following a hunch, we've
 completely ripped out all of our code with the exception of
 5 lines that do this:
 
 diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
 index ce6da97..605f5c0 100644
 --- a/include/net/inet_sock.h
 +++ b/include/net/inet_sock.h
 @@ -140,6 +140,8 @@ struct inet_sock {
  		__be32			addr;
  		struct flowi		fl;
  	} cork;
 +	void *foo;
 +	u32  bar;
  };
 
  #define IPCORK_OPT 1   /* ip-options has been held in ipcork.opt */
 diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
 index cf358c8..98ad2c2 100644
 --- a/net/ipv4/af_inet.c
 +++ b/net/ipv4/af_inet.c
 @@ -335,6 +335,9 @@ lookup_protocol:
 
  	sk_refcnt_debug_inc(sk);
 
 +	inet->foo = NULL;
 +	inet->bar = 0;
 +
  	if (inet->num) {
  		/* It assumes that any protocol which allows
  		 * the user to assign a number at socket
 
 (Variables were really named something else, but I hacked this into
  net-2.6 to see if I could reproduce).
 
 With just the above patch, I can catch a corruption of the inet_sock
 in the inet_csk_bind_conflict() with this:
 
 diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
 index 43fb160..5cd5b6d 100644
 --- a/net/ipv4/inet_connection_sock.c
 +++ b/net/ipv4/inet_connection_sock.c
 @@ -45,6 +45,18 @@ int inet_csk_bind_conflict(const struct sock *sk,
  	int reuse = sk->sk_reuse;
 
  	sk_for_each_bound(sk2, node, &tb->owners) {
 +		if (inet_sk(sk2)->foo) {
 +			printk(KERN_WARN "sk2 might be corrupt.  Info:\n");
 +			printk(KERN_WARN "\tsk2 = %p\n", sk2);
 +			printk(KERN_WARN "\ttb->port = %d\n", tb->port);
 +			printk(KERN_WARN "\tinet_sk(sk2)->num = %d\n",
 +				inet_sk(sk2)->num);
 +			printk(KERN_WARN "\tinet_sk(sk2)->foo = %p\n",
 +				inet_sk(sk2)->foo);
 +			printk(KERN_WARN "\tinet_sk(sk2)->bar = %p\n",
 +				inet_sk(sk2)->bar);
 +			WARN_ON(1);
 +		}
 
 Nobody outside of inet_create() writes to the foo pointer so it should
 always be NULL.  I've enabled SLAB debugging, stack overflow debugging, VM
 debugging and nothing triggers.
 
 The corruption is triggered after about 10 minutes of running the following
 script:
 
 nfspath=$1
 localpath=$2
 while true; do
     mount "$nfspath" "$localpath"
     sleep 5
     cp /boot/vmlinuz "$localpath"
     sleep 5
     rm "$localpath/vmlinuz"
     sleep 5
     umount "$localpath"
 done
 
 
 And looks like this:
 
 sk2 might be corrupt.  Info:
 sk2 = 8100f004d080
 tb->port = 844
 inet_sk(sk2)->num = 61695
 inet_sk(sk2)->foo = 24242424243f243f
 inet_sk(sk2)->bar = 3f24243f
 BUG: at net/ipv4/inet_connection_sock.c:58 inet_csk_bind_conflict()
 
 Call Trace:
  [803cc591] inet_csk_bind_conflict+0xcb/0x178
  [803cc4c6] inet_csk_bind_conflict+0x0/0x178
  [803cc2ff] inet_csk_get_port+0x11a/0x1ef
  [803ddf51] inet_bind+0x117/0x1f5
  [88184e13] :sunrpc:xs_bindresvport+0x4e/0xbf
  [881853a4] :sunrpc:xs_tcp_connect_worker+0x0/0x2a0
  [88185433] :sunrpc:xs_tcp_connect_worker+0x8f/0x2a0

If you are using NFS over UDP, why is a TCP routine
getting called by sunrpc?

  [80248bd3] run_workqueue+0x8f/0x137
  [80245687] worker_thread+0x0/0x14a
  [8024579b] worker_thread+0x114/0x14a
  [8027e544] default_wake_function+0x0/0xe
  [8022ff49] kthread+0xd1/0x100
  [80258f68] child_rip+0xa/0x12
  [8022fe78] kthread+0x0/0x100
  [80258f5e] child_rip+0x0/0x12
 
 
 It looks like someone is stepping all over the inet_sock.
 We'll continue looking, but if anyone has any ideas of what might
 be going on, I'd appreciate it.
 
 It looks like a serious bug lurking somewhere.
 
 -vlad
 
 p.s  the mount is using nfsv3 over UDP (nothing fancy at all)


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.20.7 mss negotiation and path mtu discovery mostly broken?

2007-04-25 Thread Herbert Xu
Ristuccia, Brian [EMAIL PROTECTED] wrote:
 
 08:39:55.649689 IP 10.2.10.254.22 > 12.33.234.69.35026: .
 5906:10250(4344) ack 1
 794 win 674 <nop,nop,timestamp 413790 3873837887>
 08:39:55.650532 IP 10.2.10.1 > 10.2.10.254: icmp 92: 12.33.234.69
 unreachable -
 need to frag (mtu 1500)

Where was this dump taken, on 10.2.10.254?

If so could you either take the dump further down the route or show
the full contents (with tcpdump -x) of the ICMP error here so that
we can see what the actual packet size was?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] infinite recursion in netlink

2007-04-25 Thread Jaco Kroon

Greg KH wrote:

On Wed, Apr 25, 2007 at 10:38:56PM +0400, Alexey Kuznetsov wrote:

Hello!

Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.

The bug is present in all kernel versions since the feature appeared.


Any hint on when this feature appeared so that we can notify the distros
for older releases?

thanks,


2.6.13 if I'm not mistaken, confirmed on debian testing by Simeon 
Miteff.  From man 7 netlink:


NETLINK_W1 and NETLINK_FIB_LOOKUP appeared in Linux 2.6.13.

Jaco


Re: Bluetooth patches for 2.6.21-rc7

2007-04-25 Thread David Miller
From: Marcel Holtmann [EMAIL PROTECTED]
Date: Thu, 26 Apr 2007 01:05:55 +0200

 I have two last minute patches before the final 2.6.21 kernel hits the
 streets. One is a kernel memory leak that has been classified as
 security issue. The second one is a sysfs fix to correct a wrong use of
 class and bus devices.

I don't think this one will make it as Linus is very eager
to get the release out at this point :-)

If it doesn't, I will make sure to push it into the -stable
branch, so no worries.


Bluetooth patches for 2.6.21-rc7

2007-04-25 Thread Marcel Holtmann
Hi Dave,

I have two last minute patches before the final 2.6.21 kernel hits the
streets. One is a kernel memory leak that has been classified as
security issue. The second one is a sysfs fix to correct a wrong use of
class and bus devices.

Regards

Marcel


Please pull from

git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6.git

This will update the following files:

 net/bluetooth/hci_sock.c  |9 +
 net/bluetooth/hci_sysfs.c |9 -
 net/bluetooth/l2cap.c |6 ++
 3 files changed, 23 insertions(+), 1 deletion(-)

through these ChangeSets:

Commit: 9457de6253a222a8c340b0442fb63c172069d962 
Author: Marcel Holtmann [EMAIL PROTECTED] Wed, 25 Apr 2007 22:38:39 +0200 

[Bluetooth] Attach host adapters to the Bluetooth bus

The Bluetooth host adapters are attached to the Bluetooth class and the
low-level connections are children of these class devices. Having class
devices as parent of bus devices breaks a lot of reasonable assumptions
about sysfs. The host adapters should be attached to the Bluetooth bus
to simplify the dependency resolving. For compatibility an additional
symlink from the Bluetooth class will be used.

Signed-off-by: Marcel Holtmann [EMAIL PROTECTED]

Commit: 32f1cf0a4643018f8473065d645dbc6b5772e93c 
Author: Marcel Holtmann [EMAIL PROTECTED] Wed, 25 Apr 2007 22:38:34 +0200 

[Bluetooth] Fix L2CAP and HCI setsockopt() information leaks

The L2CAP and HCI setsockopt() implementations have a small information
leak that makes it possible to leak kernel stack memory to userspace.

If the optlen parameter is 0, no data will be copied by copy_from_user(),
but the uninitialized stack buffer will be read and stored later. A call
to getsockopt() can now retrieve the leaked information.

To fix this problem the stack buffer given to copy_from_user() must be
initialized with the current settings.

Signed-off-by: Marcel Holtmann [EMAIL PROTECTED]
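
The leak described in the commit message above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the Bluetooth code itself: struct demo_opts, demo_copy_from_user() and demo_setsockopt() are hypothetical names invented for the illustration. It shows the fixed pattern, in which the temporary buffer starts out as a copy of the current settings, so a zero-length optlen stores those settings back rather than whatever happened to be on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-socket option block. */
struct demo_opts {
	unsigned int type_mask;
	unsigned int event_mask;
};

/* Stand-in for copy_from_user(): copies len bytes, returns 0 on success. */
static int demo_copy_from_user(void *dst, const void *src, size_t len)
{
	if (len)
		memcpy(dst, src, len);
	return 0;
}

/*
 * Fixed pattern: seed the stack buffer with the current settings, then
 * overwrite at most sizeof(tmp) bytes with what the caller supplied.
 */
static int demo_setsockopt(struct demo_opts *cur, const void *optval,
			   size_t optlen)
{
	struct demo_opts tmp;
	size_t len;

	assert(cur != NULL);
	tmp = *cur;	/* the fix: never start from uninitialized stack */
	len = optlen < sizeof(tmp) ? optlen : sizeof(tmp);

	if (demo_copy_from_user(&tmp, optval, len))
		return -1;

	*cur = tmp;	/* with optlen == 0 this stores the old values back */
	return 0;
}
```

Without the initialization of tmp, the final assignment would publish stack contents that a later getsockopt() could read back, which is the leak the commit message describes.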





Re: Bluetooth patches for 2.6.21-rc7

2007-04-25 Thread Marcel Holtmann
Hi Dave,

  I have two last minute patches before the final 2.6.21 kernel hits the
  streets. One is a kernel memory leak that has been classified as
  security issue. The second one is a sysfs fix to correct a wrong use of
  class and bus devices.
 
 I don't think this one will make it as Linus is very eager
 to get the release out at this point :-)

I realized that 2.6.21 is almost out of the door. This is why I put
Linus on CC. His call.

Regards

Marcel




Re: sysctls

2007-04-25 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 14:50:18 -0700

 On Wed, 25 Apr 2007 15:53:19 -0400
 Neil Horman [EMAIL PROTECTED] wrote:
 
  I did the optimistic dad sysctl, and have no strict use for numbered
  sysctls (I was just unaware of the policy).  I'll work up a patch to use
  register_sysctl_table with CTL_UNNUMBERED in the next few days.
 
 I don't think you need to add a call to register_sysctl_table(), if that's
 what you're proposing.  Just drop the changes to sysctl.h and use
 CTL_UNNUMBERED in sysctl.c.

Ok, I'll take care of this.


Re: netlink locking warnings in 2.6.21-rc7-mm1

2007-04-25 Thread Andrew Morton
On Wed, 25 Apr 2007 22:51:43 +0200 Patrick McHardy [EMAIL PROTECTED] wrote:

 I think I found the problem, the rtnl_mutex was reinitialized on every
 rtnetlink socket creation. This is most likely responsible for both
 warnings.

Yup, no warnings any more, thanks.


[PATCH 1/4] bridge: don't change packet type

2007-04-25 Thread Stephen Hemminger
The change to forward STP bpdu's (for usermode STP) through normal path,
changed the packet type in the process. Since link local stuff is multicast, it
should stay pkt_type = PACKET_MULTICAST.  The code was probably copy/pasted
incorrectly from the bridge pseudo-device receive path.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- bridge-2.6.22.orig/net/bridge/br_input.c
+++ bridge-2.6.22/net/bridge/br_input.c
@@ -131,12 +131,9 @@ struct sk_buff *br_handle_frame(struct n
         if (!is_valid_ether_addr(eth_hdr(skb)->h_source))
                 goto drop;
 
-        if (unlikely(is_link_local(dest))) {
-                skb->pkt_type = PACKET_HOST;
-
+        if (unlikely(is_link_local(dest)))
                 return (NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
                                 NULL, br_handle_local_finish) == 0) ? skb : NULL;
-        }
 
         switch (p->state) {
         case BR_STATE_FORWARDING:

-- 



[PATCH 0/4] Bridge patches for 2.6.22

2007-04-25 Thread Stephen Hemminger
Here are some patches for bridge code in 2.6.22. The user mode RSTP
from Aji is working. Anyone who wants to test it can get it from:
   git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/rstp.git

Thanks

-- 



[PATCH 2/4] bridge: drop PAUSE frames

2007-04-25 Thread Stephen Hemminger
Pause frames should never make it out of the network device into
the stack. But if a device was misconfigured, it might happen.
So drop pause frames in bridge.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

--- bridge-2.6.22.orig/include/linux/if_ether.h
+++ bridge-2.6.22/include/linux/if_ether.h
@@ -61,6 +61,7 @@
 #define ETH_P_8021Q     0x8100          /* 802.1Q VLAN Extended Header  */
 #define ETH_P_IPX       0x8137          /* IPX over DIX                 */
 #define ETH_P_IPV6      0x86DD          /* IPv6 over bluebook           */
+#define ETH_P_PAUSE     0x8808          /* IEEE Pause frames. See 802.3 31B */
 #define ETH_P_SLOW      0x8809          /* Slow Protocol. See 802.3ad 43B */
 #define ETH_P_WCCP      0x883E          /* Web-cache coordination protocol
                                          * defined in draft-wilson-wrec-wccp-v2-00.txt */
--- bridge-2.6.22.orig/net/bridge/br_input.c
+++ bridge-2.6.22/net/bridge/br_input.c
@@ -131,9 +131,14 @@ struct sk_buff *br_handle_frame(struct n
         if (!is_valid_ether_addr(eth_hdr(skb)->h_source))
                 goto drop;
 
-        if (unlikely(is_link_local(dest)))
+        if (unlikely(is_link_local(dest))) {
+                /* Pause frames shouldn't be passed up by driver anyway */
+                if (skb->protocol == htons(ETH_P_PAUSE))
+                        goto drop;
+
                 return (NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
                                 NULL, br_handle_local_finish) == 0) ? skb : NULL;
+        }
 
         switch (p->state) {
         case BR_STATE_FORWARDING:
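
The check added by this patch compares the frame's EtherType against 0x8808. As a rough illustration only, the same test can be written against a raw Ethernet frame in userspace. The names demo_is_pause_frame(), DEMO_ETH_HLEN and DEMO_ETH_P_PAUSE are hypothetical; the kernel code compares skb->protocol with htons(ETH_P_PAUSE) instead of reading bytes by hand. Note that 0x8808 is the MAC Control EtherType in general, which the patch treats as "pause".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_HLEN		14	/* dst MAC + src MAC + EtherType */
#define DEMO_ETH_P_PAUSE	0x8808	/* same value as the patch's ETH_P_PAUSE */

/* Returns 1 if the raw frame carries EtherType 0x8808, 0 otherwise. */
static int demo_is_pause_frame(const uint8_t *frame, size_t len)
{
	uint16_t ethertype;

	assert(frame != NULL);
	if (len < DEMO_ETH_HLEN)
		return 0;	/* too short to hold an Ethernet header */

	/* The EtherType sits at offsets 12..13 in network byte order. */
	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	return ethertype == DEMO_ETH_P_PAUSE;
}
```

A bridge-like receive path would drop any frame for which this returns 1 before handing link-local traffic to the local input hook.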

-- 



[PATCH 4/4] bridge: missing rtnl

2007-04-25 Thread Stephen Hemminger
Writing to /sys/class/net/brX/bridge/stp_state causes a warning because
RTNL is not held when calling into br_stp_if.c

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 net/bridge/br_sysfs_br.c |2 ++
 1 file changed, 2 insertions(+)

--- bridge-2.6.22.orig/net/bridge/br_sysfs_br.c
+++ bridge-2.6.22/net/bridge/br_sysfs_br.c
@@ -149,9 +149,11 @@ static ssize_t show_stp_state(struct dev
 
 static void set_stp_state(struct net_bridge *br, unsigned long val)
 {
+        rtnl_lock();
         spin_unlock_bh(&br->lock);
         br_stp_set_enabled(br, val);
         spin_lock_bh(&br->lock);
+        rtnl_unlock();
 }
 
 static ssize_t store_stp_state(struct device *d,

-- 



[PATCH 3/4] bridge: if no STP then forward all BPDUs

2007-04-25 Thread Stephen Hemminger
If a bridge is not running STP, then it has no way to detect a cycle
in the network. But if it is not running STP and some other machine
or device is running STP, then that device can detect the cycle as long
as the STP BPDUs get forwarded to it.

This is how the old 2.4 and early 2.6 code worked.

Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

---
 net/bridge/br_input.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- bridge-2.6.22.orig/net/bridge/br_input.c
+++ bridge-2.6.22/net/bridge/br_input.c
@@ -136,8 +136,14 @@ struct sk_buff *br_handle_frame(struct n
                 if (skb->protocol == htons(ETH_P_PAUSE))
                         goto drop;
 
-                return (NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
-                                NULL, br_handle_local_finish) == 0) ? skb : NULL;
+                /* Process STP BPDU's through normal netif_receive_skb() path */
+                if (p->br->stp_enabled != BR_NO_STP) {
+                        if (NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
+                                    NULL, br_handle_local_finish))
+                                return NULL;
+                        else
+                                return skb;
+                }
         }
 
         switch (p->state) {

-- 



[PATCH] usb-net/pegasus: simplify carrier detection

2007-04-25 Thread Dan Williams
Simplify pegasus carrier detection; rely only on the periodic MII
polling.  Reverts pieces of c43c49bd61fdb9bb085ddafcaadb17d06f95ec43.

Signed-off-by: Dan Williams [EMAIL PROTECTED]

--- a/drivers/usb/net/pegasus.h 2007-04-25 21:21:00.0 -0400
+++ b/drivers/usb/net/pegasus.h 2007-04-25 21:21:13.0 -0400
@@ -11,7 +11,6 @@
 
 #define PEGASUS_II          0x8000
 #define HAS_HOME_PNA        0x4000
-#define TRUST_LINK_STATUS   0x2000
 
 #define PEGASUS_MTU         1536
 #define RX_SKBS             4
@@ -204,7 +203,7 @@
 PEGASUS_DEV( "Allied Telesyn Int. AT-USB100", VENDOR_ALLIEDTEL, 0xb100,
                 DEFAULT_GPIO_RESET | PEGASUS_II )
 PEGASUS_DEV( "Belkin F5D5050 USB Ethernet", VENDOR_BELKIN, 0x0121,
-                DEFAULT_GPIO_RESET | PEGASUS_II | TRUST_LINK_STATUS )
+                DEFAULT_GPIO_RESET | PEGASUS_II )
 PEGASUS_DEV( "Billionton USB-100", VENDOR_BILLIONTON, 0x0986,
                 DEFAULT_GPIO_RESET )
 PEGASUS_DEV( "Billionton USBLP-100", VENDOR_BILLIONTON, 0x0987,
--- a/drivers/usb/net/pegasus.c 2007-04-25 21:20:32.0 -0400
+++ b/drivers/usb/net/pegasus.c 2007-04-25 21:22:15.0 -0400
@@ -848,16 +848,6 @@
          * d[0].NO_CARRIER kicks in only with failed TX.
          * ... so monitoring with MII may be safest.
          */
-        if (pegasus->features & TRUST_LINK_STATUS) {
-                if (d[5] & LINK_STATUS)
-                        netif_carrier_on(net);
-                else
-                        netif_carrier_off(net);
-        } else {
-                /* Never set carrier _on_ based on ! NO_CARRIER */
-                if (d[0] & NO_CARRIER)
-                        netif_carrier_off(net);
-        }
 
         /* bytes 3-4 == rx_lostpkt, reg 2E/2F */
         pegasus->stats.rx_missed_errors += ((d[3] & 0x7f) << 8) | d[4];




Re: [RFC][PATCH -mm take4 2/6] support multiple logging

2007-04-25 Thread Keiichi KII

Well..  before you can finish this work we need to decide upon what the
interface to userspace will be.

- The miscdev isn't appropriate

Why isn't miscdev appropriate? 
We just shouldn't use miscdev for networking conventionally?



Yes it's rather odd, especially for networking.

What does the miscdev _do_ anyway?  Is it purely a target for the ioctls?


Yes, I purely use miscdev for the ioctls.

I want to use sysfs and ioctl to implement the dynamic configurability.
sysfs shows and changes the netconsole configuration (IP address, port and so on).
A userland application uses the ioctl to add and remove netconsole ports.

I first thought that the dynamic configurability could be realized in the
kernel only (e.g. only sysfs, no userland application).
But I think we need a function that automatically resolves the destination
MAC address from the IP address, and because of the cost of that resolution
I should implement it in a userland application, not in the netconsole
kernel module.

netconsole will become more useful once the above function is implemented.


Some other speculations:
1. Would it be possible to add ioctl's to /dev/console? This would be more in
keeping with older Unix style model.

2. Using sysfs makes sense if there is a device object that exists to
   add the sysfs attributes to.

3. Procfs is handy for summary type tables.

4. Netlink does feel like overkill for this. Although newer generic netlink
   makes it easier.


If I use sysfs, is /sys/class/misc/netconsole/port[0-9]* the proper location
for the attributes of each netconsole port, or should they go elsewhere in /sys/?


Stephen Hemminger said "The configuration of netconsole's looks like the
configuration of routes."

I think so too.
So I am thinking of ioctl commands for adding/removing ports, plus a userland
application, like the route(8) command, that uses those ioctls.


e.g.
1. add port
# netconfig add 192.168.0.10 

2. remove port
# netconfig remove 1

3. show port info
# netconfig
id status  Source IP   Source Port Destination IP Destination Port Destination MAC
1  enable  192.168.0.1 6665        192.168.0.10                    00:11:22:33:44:55
2  disable 192.168.0.1 6665        192.168.0.20                    00:11:22:33:44:66

route(8) command uses ioctl for Netlink.
But, I'm going to implement ioctl's to /dev/console because of the above 
comments.

Thank you for your comments.
Any comments very welcome.
--
Keiichi KII
NEC Corporation OSS Promotion Center
E-mail: [EMAIL PROTECTED]



Re: [RFC][PATCH -mm take4 2/6] support multiple logging

2007-04-25 Thread David Miller
From: Keiichi KII [EMAIL PROTECTED]
Date: Thu, 26 Apr 2007 13:02:04 +0900

 Stephen Hemminger said The configuration of netconsole's looks like the 
 configuration of routes.
 I think so too.
 So I think ioctl commands for adding/removing port and the following userland 
 application like route(8) command by using the ioctl.

Like the route command itself, the route changing ioctl()s are
old deprecated BSD compatible functionality.

All current routing configuration is done using netlink and the 'ip'
utility.



Re: [Bugme-new] [Bug 8342] New: sctp_getsockopt_local_addrs_old() calls copy_to_user() while a spinlock is held

2007-04-25 Thread David Miller
From: Vlad Yasevich [EMAIL PROTECTED]
Date: Mon, 23 Apr 2007 13:43:35 -0400

 [PATCH] [SCTP] Fix sctp_getsockopt_local_addrs_old() to use local storage
 
 sctp_getsockopt_local_addrs_old() in net/sctp/socket.c calls copy_to_user()
 while the spinlock addr_lock is held. This should not be done as
 copy_to_user() might sleep. The call to sctp_copy_laddrs_to_user() while
 holding the lock is also problematic as it calls copy_to_user().
 
 Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
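
The "use local storage" approach named in the patch subject can be sketched in userspace. This is an illustration under stated assumptions, not the SCTP code: struct demo_addr_list, demo_lock(), demo_unlock(), demo_copy_to_user() and demo_get_addrs() are all hypothetical, and the "spinlock" is just a flag so that the sketch can assert that the copy which may sleep never runs while the lock is held.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ADDRS 8

/* Hypothetical stand-in for a lock-protected address list. */
struct demo_addr_list {
	int locked;			/* 1 while the pretend spinlock is held */
	size_t count;
	unsigned int addrs[DEMO_MAX_ADDRS];
};

static void demo_lock(struct demo_addr_list *l)   { l->locked = 1; }
static void demo_unlock(struct demo_addr_list *l) { l->locked = 0; }

/* Stand-in for copy_to_user(): it may sleep, so it must run unlocked. */
static int demo_copy_to_user(const struct demo_addr_list *l, void *dst,
			     const void *src, size_t len)
{
	assert(!l->locked);
	if (len)
		memcpy(dst, src, len);
	return 0;
}

/* Returns the number of addresses copied, or -1 on allocation failure. */
static long demo_get_addrs(struct demo_addr_list *l, unsigned int *ubuf,
			   size_t max)
{
	size_t n = max < DEMO_MAX_ADDRS ? max : DEMO_MAX_ADDRS;
	unsigned int *tmp;

	if (n == 0)
		return 0;

	/* Allocate before taking the lock: allocation may sleep too. */
	tmp = malloc(n * sizeof(*tmp));
	if (!tmp)
		return -1;

	demo_lock(l);
	if (n > l->count)
		n = l->count;
	memcpy(tmp, l->addrs, n * sizeof(*tmp));	/* snapshot under lock */
	demo_unlock(l);

	/* Only the private snapshot is copied out, with the lock dropped. */
	demo_copy_to_user(l, ubuf, tmp, n * sizeof(*tmp));
	free(tmp);
	return (long)n;
}
```

The design choice is to keep the locked region down to a plain memory copy and to do everything that might block, both the allocation and the copy-out, outside it.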

As Andrew Morton just noticed and fixed in -mm, you're passing
in int pointers to arguments that should be size_t pointers,
specifically for some of the calls to sctp_copy_laddrs().

Please fix this, and please start testing builds on 64-bit
platforms (even if via cross compile) so that you can catch
these as the warnings generated by the compiler on this one
were obvious.
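
To make the type mismatch concrete: on common 64-bit Linux ABIs size_t is 8 bytes while int is 4, so a callee that stores through a size_t pointer would write past an int object. A minimal sketch of the correct calling convention follows; demo_copy_addrs() and demo_caller() are hypothetical names and do not reproduce the sctp_copy_laddrs() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee: reports the bytes it used through a size_t pointer. */
static int demo_copy_addrs(size_t space_left, size_t *bytes_copied)
{
	assert(bytes_copied != NULL);
	/* Pretend each call can fill at most 16 bytes. */
	*bytes_copied = space_left < 16 ? space_left : 16;
	return 0;
}

/* Correct caller: the local variable has exactly the pointed-to type. */
static int demo_caller(size_t space_left, size_t *out)
{
	size_t copied = 0;	/* not int: the callee stores a full size_t */
	int err = demo_copy_addrs(space_left, &copied);

	if (!err)
		*out = copied;
	return err;
}
```

Declaring copied as int and passing its address would draw an incompatible pointer type warning from the compiler, which is the kind of warning the message above refers to.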


Re: [PATCH 1/4] bridge: don't change packet type

2007-04-25 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 16:47:38 -0700

 The change to forward STP bpdu's (for usermode STP) through normal path,
 changed the packet type in the process. Since link local stuff is multicast, 
 it
 should stay pkt_type = PACKET_MULTICAST.  The code was probably copy/pasted
 incorrectly from the bridge pseudo-device receive path.
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied.



Re: [PATCH 2/4] bridge: drop PAUSE frames

2007-04-25 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 16:47:39 -0700

 Pause frames should never make it out of the network device into
 the stack. But if a device was misconfigured, it might happen.
 So drop pause frames in bridge.
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

It can happen when in promiscuous mode, but that's the only
legal case I can think of.

But that case shouldn't be hitting this path I don't think.

So this change is borderline and if anything we should put
an assertion somewhere maybe, but applied for now.



Re: [PATCH 4/4] bridge: missing rtnl

2007-04-25 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 16:47:41 -0700

 Writing to /sys/class/net/brX/bridge/stp_state causes a warning because
 RTNL is not held when call br_stp_if.c
 
 Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied, thanks Stephen.


Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread Greg KH
On Wed, Apr 25, 2007 at 01:15:12PM -0700, Linus Torvalds wrote:
 
 
 On Wed, 25 Apr 2007, Alexey Kuznetsov wrote:
  
  Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
  which resulted in infinite recursion and stack overflow.

Wait, I just had the bright idea of actually testing this before I
pushed out a 2.6.20.9 kernel with another fix in it, and nope, still
crashes, even with this patch  :(

Full stackdump in a picture (forgot to have netconsole running) at:
http://www.kroah.com/netlink_oops.jpg

Any thoughts?

I'll go try 2.6.21 now too...

thanks,

greg k-h


Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread David Miller
From: Greg KH [EMAIL PROTECTED]
Date: Wed, 25 Apr 2007 22:29:12 -0700

 On Wed, Apr 25, 2007 at 01:15:12PM -0700, Linus Torvalds wrote:
  
  
  On Wed, 25 Apr 2007, Alexey Kuznetsov wrote:
   
   Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
   which resulted in infinite recursion and stack overflow.
 
 Wait, I just had the bright idea of actually testing this before I
 pushed out a 2.6.20.9 kernel with another fix in it, and nope, still
 crashes, even with this patch  :(
 
 Full stackdump in a picture (forgot to have netconsole running) at:
   http://www.kroah.com/netlink_oops.jpg
 
 Any thoughts?
 
 I'll go try 2.6.21 now too...

Crap.  We should have let this one simmer for a day to get
more eyes on it.

Thanks for catching this Greg.



Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread Chris Wright
* Greg KH ([EMAIL PROTECTED]) wrote:
 On Wed, Apr 25, 2007 at 01:15:12PM -0700, Linus Torvalds wrote:
  
  
  On Wed, 25 Apr 2007, Alexey Kuznetsov wrote:
   
   Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
   which resulted in infinite recursion and stack overflow.
 
 Wait, I just had the bright idea of actually testing this before I
 pushed out a 2.6.20.9 kernel with another fix in it, and nope, still
 crashes, even with this patch  :(

Odd, I tested it too (on linus-git), and it's fixed (it was definitely
the problem, of sending back to kernel).

thanks,
-chris


Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread Greg KH
On Wed, Apr 25, 2007 at 10:32:01PM -0700, David Miller wrote:
 From: Greg KH [EMAIL PROTECTED]
 Date: Wed, 25 Apr 2007 22:29:12 -0700
 
  On Wed, Apr 25, 2007 at 01:15:12PM -0700, Linus Torvalds wrote:
   
   
   On Wed, 25 Apr 2007, Alexey Kuznetsov wrote:

Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.
  
  Wait, I just had the bright idea of actually testing this before I
  pushed out a 2.6.20.9 kernel with another fix in it, and nope, still
  crashes, even with this patch  :(
  
  Full stackdump in a picture (forgot to have netconsole running) at:
  http://www.kroah.com/netlink_oops.jpg
  
  Any thoughts?
  
  I'll go try 2.6.21 now too...
 
 Crap.  We should have let this one simmer for a day to get
 more eyes on it.
 
 Thanks for catching this Greg.

Odd, 2.6.21 doesn't crash at all.

Can anyone verify that I made the 2.6.20.8 release correctly with the
proper patch?

thanks,

greg k-h


Re: [Security] [PATCH] infinite recursion in netlink

2007-04-25 Thread Greg KH
On Wed, Apr 25, 2007 at 10:44:20PM -0700, Greg KH wrote:
 On Wed, Apr 25, 2007 at 10:32:01PM -0700, David Miller wrote:
  From: Greg KH [EMAIL PROTECTED]
  Date: Wed, 25 Apr 2007 22:29:12 -0700
  
   On Wed, Apr 25, 2007 at 01:15:12PM -0700, Linus Torvalds wrote:


On Wed, 25 Apr 2007, Alexey Kuznetsov wrote:
 
 Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
 which resulted in infinite recursion and stack overflow.
   
   Wait, I just had the bright idea of actually testing this before I
   pushed out a 2.6.20.9 kernel with another fix in it, and nope, still
   crashes, even with this patch  :(
   
   Full stackdump in a picture (forgot to have netconsole running) at:
 http://www.kroah.com/netlink_oops.jpg
   
   Any thoughts?
   
   I'll go try 2.6.21 now too...
  
  Crap.  We should have let this one simmer for a day to get
  more eyes on it.
  
  Thanks for catching this Greg.
 
 Odd, 2.6.21 doesn't crash at all.
 
 Can anyone verify that I made the 2.6.20.8 release correctly with the
 proper patch?

fyi, here's the patch that I applied, perhaps 2.6.20 needed something
else too?

thanks,

greg k-h


Subject: NETLINK: Infinite recursion in netlink.

From: Alexey Kuznetsov [EMAIL PROTECTED]

[NETLINK]: Infinite recursion in netlink.

Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.

The bug is present in all kernel versions since the feature appeared.

The patch also makes some minimal cleanup:

1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup

Signed-off-by: Alexey Kuznetsov [EMAIL PROTECTED]
Signed-off-by: David S. Miller [EMAIL PROTECTED]
Signed-off-by: Greg Kroah-Hartman [EMAIL PROTECTED]

---
 net/ipv4/fib_frontend.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -772,6 +772,8 @@ static void nl_fib_lookup(struct fib_res
                             .nl_u = { .ip4_u = { .daddr = frn->fl_addr,
                                                  .tos = frn->fl_tos,
                                                  .scope = frn->fl_scope } } };
+
+        frn->err = -ENOENT;
         if (tb) {
                 local_bh_disable();
 
@@ -783,6 +785,7 @@ static void nl_fib_lookup(struct fib_res
                         frn->nh_sel = res.nh_sel;
                         frn->type = res.type;
                         frn->scope = res.scope;
+                        fib_res_put(&res);
                 }
                 local_bh_enable();
         }
@@ -797,6 +800,9 @@ static void nl_fib_input(struct sock *sk
         struct fib_table *tb;
 
         skb = skb_dequeue(&sk->sk_receive_queue);
+        if (skb == NULL)
+                return;
+
         nlh = (struct nlmsghdr *)skb->data;
         if (skb->len < NLMSG_SPACE(0) || skb->len < nlh->nlmsg_len ||
             nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*frn))) {
@@ -809,7 +815,7 @@ static void nl_fib_input(struct sock *sk
 
         nl_fib_lookup(frn, tb);
 
-        pid = nlh->nlmsg_pid;           /*pid of sending process */
+        pid = NETLINK_CB(skb).pid;       /* pid of sending process */
         NETLINK_CB(skb).pid = 0;         /* from kernel */
         NETLINK_CB(skb).dst_group = 0;  /* unicast */
         netlink_unicast(sk, skb, pid, MSG_DONTWAIT);
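
The last hunk is the core of the fix: the reply destination is taken from the sender id recorded by the kernel rather than from the header field filled in by the sender. A simplified userspace model of that difference is sketched below. struct demo_msg and the demo_* functions are hypothetical and stand in for nlh->nlmsg_pid and NETLINK_CB(skb).pid; the only netlink fact assumed is that pid 0 addresses the kernel.

```c
#include <assert.h>

#define DEMO_KERNEL_PID 0	/* netlink uses pid 0 for the kernel */

/* Hypothetical, simplified message; not struct nlmsghdr or struct sk_buff. */
struct demo_msg {
	unsigned int hdr_pid;	/* like nlh->nlmsg_pid: written by the sender */
	unsigned int cb_pid;	/* like NETLINK_CB(skb).pid: set by the kernel */
};

/* Pre-patch behaviour: trust the header field. */
static unsigned int demo_reply_dst_old(const struct demo_msg *m)
{
	assert(m != 0);
	return m->hdr_pid;
}

/* Post-patch behaviour: use the sender id recorded on receipt. */
static unsigned int demo_reply_dst_new(const struct demo_msg *m)
{
	assert(m != 0);
	return m->cb_pid;
}

/* A reply addressed to pid 0 is delivered to the kernel socket again. */
static int demo_reply_loops(unsigned int dst)
{
	return dst == DEMO_KERNEL_PID;
}
```

A sender that leaves the header pid at 0 makes the old code address its reply to the kernel, whose input handler then replies again, which matches the recursion described in the changelog.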

