Re: [2.6 patch] let USB_USBNET always select MII

2007-11-07 Thread David Miller
From: Adrian Bunk [EMAIL PROTECTED]
Date: Thu, 1 Nov 2007 23:25:24 +0100

 All this USB_USBNET_MII trickery is simply not worth it considering how
 little code it saves.
 
 As a side effect, this also fixes the following compile error reported 
 by Toralf Förster:
 
 --  snip  --
 
 ...
   LD  .tmp_vmlinux1
 drivers/built-in.o: In function `usbnet_set_settings':
 (.text+0xf1876): undefined reference to `mii_ethtool_sset'
 drivers/built-in.o: In function `usbnet_get_settings':
 (.text+0xf1836): undefined reference to `mii_ethtool_gset'
 drivers/built-in.o: In function `usbnet_get_link':
 (.text+0xf18d6): undefined reference to `mii_link_ok'
 drivers/built-in.o: In function `usbnet_nway_reset':
 (.text+0xf18f6): undefined reference to `mii_nway_restart'
 make: *** [.tmp_vmlinux1] Error 1
 
 --  snip  --
 
 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

Applied, thanks Adrian.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':

2007-11-07 Thread David Miller

David, I hate to say this and point you out like this, but you are a
real cancer for bug fixes to USB things in the kernel, and I'm very
tired of seeing things stuck in the mud (and engineering resources
wasted) because of how you handle things.  It's very bad for Linux,
and the USB code in particular.

If I had a nickel for every patch from someone else that you ground into
the ground and stalled, I'd truly be a millionaire.

You absolutely stifle development progress.  I thought my OHCI
deadlock patch was an isolated case (and nothing is still applied,
which is just awesome, my original patch was posted more than a month
ago), but you're doing the same exact thing to Adrian here too.

You want to see things fixed your way.  But you can get away with that
if, and only if, you can spend every day working on your own version
of fixes when you don't like the submitter's version.  But unlike me
you don't have that luxury, so you have to give patch submitters a
larger level of freedom and, plainly, just let go.


[PATCH] net: Removing duplicit #includes

2007-11-07 Thread Jiri Olsa

Removing duplicit #includes for net/
Signed-off-by: Jiri Olsa [EMAIL PROTECTED] 
---

net/core/dst.c   |1 -
net/ieee80211/ieee80211_crypt_tkip.c |1 -
net/ieee80211/ieee80211_crypt_wep.c  |1 -
3 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/net/core/dst.c b/net/core/dst.c
index 16958e6..03daead 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -18,7 +18,6 @@
#include <linux/types.h>
#include <net/net_namespace.h>

-#include <net/net_namespace.h>
#include <net/dst.h>

/*
diff --git a/net/ieee80211/ieee80211_crypt_tkip.c b/net/ieee80211/ieee80211_crypt_tkip.c
index 4cce353..58b2261 100644
--- a/net/ieee80211/ieee80211_crypt_tkip.c
+++ b/net/ieee80211/ieee80211_crypt_tkip.c
@@ -25,7 +25,6 @@
#include <net/ieee80211.h>

#include <linux/crypto.h>
-#include <linux/scatterlist.h>
#include <linux/crc32.h>

MODULE_AUTHOR("Jouni Malinen");
diff --git a/net/ieee80211/ieee80211_crypt_wep.c b/net/ieee80211/ieee80211_crypt_wep.c
index 866fc04..3fa30c4 100644
--- a/net/ieee80211/ieee80211_crypt_wep.c
+++ b/net/ieee80211/ieee80211_crypt_wep.c
@@ -22,7 +22,6 @@
#include <net/ieee80211.h>

#include <linux/crypto.h>
-#include <linux/scatterlist.h>
#include <linux/crc32.h>

MODULE_AUTHOR("Jouni Malinen");


Re: [PATCH] net: Removing duplicit #includes

2007-11-07 Thread David Miller
From: Jiri Olsa [EMAIL PROTECTED]
Date: Wed, 07 Nov 2007 09:29:37 +0100

 Removing duplicit #includes for net/
 Signed-off-by: Jiri Olsa [EMAIL PROTECTED] 

Applied, thanks Jiri.


[PATCH] net: Removing duplicit #includes

2007-11-07 Thread Jiri Olsa
Removing duplicit #includes for net/
Signed-off-by: Jiri Olsa [EMAIL PROTECTED] 
---
 net/core/dst.c   |1 -
 net/ieee80211/ieee80211_crypt_tkip.c |1 -
 net/ieee80211/ieee80211_crypt_wep.c  |1 -
 3 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/net/core/dst.c b/net/core/dst.c
index 16958e6..03daead 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -18,7 +18,6 @@
 #include <linux/types.h>
 #include <net/net_namespace.h>
 
-#include <net/net_namespace.h>
 #include <net/dst.h>
 
 /*
diff --git a/net/ieee80211/ieee80211_crypt_tkip.c b/net/ieee80211/ieee80211_crypt_tkip.c
index 4cce353..58b2261 100644
--- a/net/ieee80211/ieee80211_crypt_tkip.c
+++ b/net/ieee80211/ieee80211_crypt_tkip.c
@@ -25,7 +25,6 @@
 #include <net/ieee80211.h>
 
 #include <linux/crypto.h>
-#include <linux/scatterlist.h>
 #include <linux/crc32.h>
 
 MODULE_AUTHOR("Jouni Malinen");
diff --git a/net/ieee80211/ieee80211_crypt_wep.c b/net/ieee80211/ieee80211_crypt_wep.c
index 866fc04..3fa30c4 100644
--- a/net/ieee80211/ieee80211_crypt_wep.c
+++ b/net/ieee80211/ieee80211_crypt_wep.c
@@ -22,7 +22,6 @@
 #include <net/ieee80211.h>
 
 #include <linux/crypto.h>
-#include <linux/scatterlist.h>
 #include <linux/crc32.h>
 
 MODULE_AUTHOR("Jouni Malinen");


Re: Endianness problem with u32 classifier hash masks

2007-11-07 Thread David Miller
From: Radu Rendec [EMAIL PROTECTED]
Date: Tue, 06 Nov 2007 19:00:16 +0200

 On Tue, 2007-11-06 at 09:43 -0500, jamal wrote:
  On Tue, 2007-06-11 at 15:25 +0100, Jarek Poplawski wrote:
  
   Yes, it saves one htonl() on the slow path!
  
  Would it feel better to say grew down exponentially from version 1 to
  3? ;-
 
 Not only does it save one htonl(), it also keeps the code readable :)
 Computing offsets within the rtnetlink response skb and applying htonl()
 there is quite tricky and might get broken if RTA_PUT() is changed.
 Unfortunately I spent about an hour figuring out how to do that :))
 
 The bad news is that today I haven't got the chance to work on the two
 patches. But the good news is that I managed to finish the (urgent) task
 that had been assigned to me at work, and tomorrow I will be able to
 work on the kernel and test it leisurely.

I've grown impatient and done the work for you :-)  I've applied
the patch below to my tree, thank you!

If someone wants to send me the ffs() thing relative to this,
I'd appreciate it.  Thanks again!

From 8e36263f10a054479636b57943cdeaf37470acc5 Mon Sep 17 00:00:00 2001
From: Radu Rendec [EMAIL PROTECTED]
Date: Wed, 7 Nov 2007 01:20:12 -0800
Subject: [PATCH] [PKT_SCHED] CLS_U32: Fix endianness problem with u32 
classifier hash masks.

From: Radu Rendec [EMAIL PROTECTED]

While trying to implement u32 hashes in my shaping machine I ran into
a possible bug in the u32 hash/bucket computing algorithm
(net/sched/cls_u32.c).

The problem occurs only with hash masks that extend over the octet
boundary, on little endian machines (where htonl() actually does
something).

Let's say that I would like to use 0x3fc0 as the hash mask. This means
8 contiguous 1 bits starting at b6. With such a mask, the expected
(and logical) behavior is to hash any address in, for instance,
192.168.0.0/26 in bucket 0, then any address in 192.168.0.64/26 in
bucket 1, then 192.168.0.128/26 in bucket 2 and so on.

This is exactly what would happen on a big endian machine, but on
little endian machines, what would actually happen with current
implementation is 0x3fc0 being reversed (into 0xc03f) by htonl()
in the userspace tool and then applied to 192.168.x.x in the u32
classifier. When shifting right by 16 bits (rank of first 1 bit in
the reversed mask) and applying the divisor mask (0xff for divisor
256), what would actually remain is 0x3f applied on the 168 octet of
the address.

One could say this can be easily worked around by taking endianness
into account in userspace and supplying an appropriate mask (0xfc03)
that would be turned into contiguous 1 bits when reversed
(0x03fc). But the actual problem is the network address (inside
the packet) not being converted to host order, but used as a
host-order value when computing the bucket.

Let's say the network address is written as n31 n30 ... n0, with n0
being the least significant bit. When used directly (without any
conversion) on a little endian machine, it becomes n7 ... n0 n15 ... n8
etc in the machine's registers. Thus bits n7 and n8 would no longer be
adjacent and 192.168.64.0/26 and 192.168.128.0/26 would no longer be
consecutive.

The fix is to apply ntohl() on the hmask before computing fshift,
and in u32_hash_fold() convert the packet data to host order before
shifting down by fshift.

With helpful feedback from Jamal Hadi Salim and Jarek Poplawski.

Signed-off-by: David S. Miller [EMAIL PROTECTED]
---
 net/sched/cls_u32.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 9e98c6e..5317102 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -91,7 +91,7 @@ static struct tc_u_common *u32_list;
 
 static __inline__ unsigned u32_hash_fold(u32 key, struct tc_u32_sel *sel, u8 fshift)
 {
-	unsigned h = (key & sel->hmask)>>fshift;
+	unsigned h = ntohl(key & sel->hmask)>>fshift;
 
 	return h;
 }
@@ -615,7 +615,7 @@ static int u32_change(struct tcf_proto *tp, unsigned long base, u32 handle,
 	n->handle = handle;
 {
 	u8 i = 0;
-	u32 mask = s->hmask;
+	u32 mask = ntohl(s->hmask);
 	if (mask) {
 		while (!(mask & 1)) {
 			i++;
-- 
1.5.3.5
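
The effect described in the commit message can be reproduced in
userspace. What follows is only a rough sketch under stated
assumptions, not the kernel code: swap32() stands in for
htonl()/ntohl() on a little endian host (it always swaps, so the
result does not depend on the build machine), and bucket_old(),
bucket_new() and lowest_set_bit() are hypothetical names made up
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for htonl()/ntohl() on a little endian host. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Rank of the lowest set bit, like the fshift loop in u32_change(). */
static unsigned lowest_set_bit(uint32_t mask)
{
	unsigned i = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		i++;
	}
	return i;
}

/* Before the fix: mask and key are used as raw network-order words. */
static unsigned bucket_old(uint32_t addr, uint32_t hmask_host)
{
	uint32_t key = swap32(addr);         /* packet word as a LE CPU loads it */
	uint32_t hmask = swap32(hmask_host); /* htonl() done by the userspace tool */
	unsigned fshift = lowest_set_bit(hmask);

	return ((key & hmask) >> fshift) & 0xff; /* divisor 256 */
}

/* After the fix: ntohl() on hmask for fshift, and on the masked key. */
static unsigned bucket_new(uint32_t addr, uint32_t hmask_host)
{
	uint32_t key = swap32(addr);
	uint32_t hmask = swap32(hmask_host);
	unsigned fshift = lowest_set_bit(swap32(hmask));

	return (swap32(key & hmask) >> fshift) & 0xff;
}
```

With hash mask 0x3fc0, bucket_new() maps 192.168.0.0, 192.168.0.64
and 192.168.0.128 (passed as 0xC0A80000, 0xC0A80040, 0xC0A80080) to
buckets 0, 1 and 2, while bucket_old() puts all three into bucket 0,
because only 0x3f applied to a neighbouring octet survives the shift.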



Re: [PATCH] net: Removing duplicit #includes

2007-11-07 Thread Jiri Olsa
sorry, the patch is mangled, I will resend another 

Jiri Olsa wrote:
 Removing duplicit #includes for net/
 Signed-off-by: Jiri Olsa [EMAIL PROTECTED]
 ---
 net/core/dst.c   |1 -
 net/ieee80211/ieee80211_crypt_tkip.c |1 -
 net/ieee80211/ieee80211_crypt_wep.c  |1 -
 3 files changed, 0 insertions(+), 3 deletions(-)
 
 diff --git a/net/core/dst.c b/net/core/dst.c
 index 16958e6..03daead 100644
 --- a/net/core/dst.c
 +++ b/net/core/dst.c
 @@ -18,7 +18,6 @@
 #include <linux/types.h>
 #include <net/net_namespace.h>
 
 -#include <net/net_namespace.h>
 #include <net/dst.h>
 
 /*
 diff --git a/net/ieee80211/ieee80211_crypt_tkip.c b/net/ieee80211/ieee80211_crypt_tkip.c
 index 4cce353..58b2261 100644
 --- a/net/ieee80211/ieee80211_crypt_tkip.c
 +++ b/net/ieee80211/ieee80211_crypt_tkip.c
 @@ -25,7 +25,6 @@
 #include <net/ieee80211.h>
 
 #include <linux/crypto.h>
 -#include <linux/scatterlist.h>
 #include <linux/crc32.h>
 
 MODULE_AUTHOR("Jouni Malinen");
 diff --git a/net/ieee80211/ieee80211_crypt_wep.c b/net/ieee80211/ieee80211_crypt_wep.c
 index 866fc04..3fa30c4 100644
 --- a/net/ieee80211/ieee80211_crypt_wep.c
 +++ b/net/ieee80211/ieee80211_crypt_wep.c
 @@ -22,7 +22,6 @@
 #include <net/ieee80211.h>
 
 #include <linux/crypto.h>
 -#include <linux/scatterlist.h>
 #include <linux/crc32.h>
 
 MODULE_AUTHOR("Jouni Malinen");
 



Re: [VLAN]: Fix SET_VLAN_INGRESS_PRIORITY_CMD ioctl

2007-11-07 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Sat, 03 Nov 2007 13:24:34 +0100

 Fix a regression in 2.6.23. Candidate for -stable IMO.

Applied, and I'll queue it up for -stable, thanks!


Why does a connect to IPv6 LLA address fail ?

2007-11-07 Thread Karsten Keil
Hi,

I am currently running certification tests for IPv6 with the TAHI ct test
suite. With the default-addr-select tests for compliance with RFC3484 there
are FAILs with Destination Address Selection Check Rule 2 (Prefer matching
scope). Yes, I know that Destination Address Selection is done in glibc,
but it seems that the kernel behaves wrongly in the connect system call
with IPv6 link-local addresses.

The glibc getaddrinfo function tries to verify whether an address is valid
and examines the source address. For this it creates a datagram socket with
protocol IPPROTO_IP and then tries to connect it to the destination
address. This fails in the case of an LLA, because connect returns EINVAL,
since there is no device bound to this socket at this time. So getaddrinfo
marks this candidate address as not reachable, and it will never be
preferred because of rule 1 of RFC3484 Destination Address Selection.

Why do we have this check in ip6_datagram_connect() ?

The POSIX man page for connect says about EINVAL:
EINVAL - The address_len argument is not a valid length for the address
family; or invalid address family in the sockaddr structure.

Which is not the case here.

-- 
Karsten Keil
SuSE Labs
ISDN and VOIP development
SUSE LINUX Products GmbH, Maxfeldstr.5 90409 Nuernberg, GF: Markus Rex, HRB 
16746 (AG Nuernberg)
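
To make the failing case concrete, here is a small userspace sketch
of the condition discussed above. It is an illustration only: it
assumes that the interface would normally be supplied through
sin6_scope_id, and lla_needs_scope() and make_sin6() are hypothetical
helpers, not glibc or kernel functions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Build a sockaddr_in6 from text; returns false on a malformed address. */
static bool make_sin6(const char *text, unsigned scope_id,
		      struct sockaddr_in6 *sa)
{
	assert(text != NULL && sa != NULL);
	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	sa->sin6_scope_id = scope_id;   /* 0 means "no interface given" */
	return inet_pton(AF_INET6, text, &sa->sin6_addr) == 1;
}

/*
 * Hypothetical predicate: true when the destination is link-local and
 * carries no interface index, which is the case reported to fail with
 * EINVAL on an unbound datagram socket.
 */
static bool lla_needs_scope(const struct sockaddr_in6 *sa)
{
	return IN6_IS_ADDR_LINKLOCAL(&sa->sin6_addr) && sa->sin6_scope_id == 0;
}
```

A destination such as fe80::1 with scope id 0 is the problematic
case; the same address with a non-zero scope id, or a global address
such as 2001:db8::1, is not.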


Re: Possible BUG on net/ipv4/ipcomp.c, line 358 (fwd)

2007-11-07 Thread David Miller
From: Herbert Xu [EMAIL PROTECTED]
Date: Wed, 7 Nov 2007 09:11:24 +0800

 [IPSEC]: Fix crypto_alloc_comp error checking
 
 The function crypto_alloc_comp returns an errno instead of NULL
 to indicate error.  So it needs to be tested with IS_ERR.
 
 This is based on a patch by Vicenç Beltran Querol.
 
 Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied, and I'll queue this up for -stable too.

Thanks!
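
For readers unfamiliar with the error-pointer convention the patch
description refers to, the following userspace sketch imitates it.
The helpers err_ptr(), is_err() and ptr_err() are simplified
imitations of the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), and
alloc_comp_sketch() is a made-up allocator, not crypto_alloc_comp().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True for pointers in the top MAX_ERRNO bytes of the address space. */
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that never returns NULL on failure. */
static void *alloc_comp_sketch(int fail)
{
	static int tfm;

	return fail ? err_ptr(-ENOMEM) : (void *)&tfm;
}
```

A caller that only tests the result against NULL treats the failure
value as a valid pointer; testing with is_err() catches it.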


Re: [PATCH] [AF_PACKET]: Allow multicast traffic to be caught by ORIGDEV when bonded

2007-11-07 Thread David Miller
From: PJ Waskiewicz [EMAIL PROTECTED]
Date: Tue, 06 Nov 2007 07:25:47 -0800

 The socket option for packet sockets to return the original ifindex instead
 of the bonded ifindex will not match multicast traffic.  Since this socket
 option is the most useful for layer 2 traffic and multicast traffic, make
 the option multicast-aware.
 
 Signed-off-by: Peter P Waskiewicz Jr [EMAIL PROTECTED]

I agree with you in principle, but I'd like to hear some feedback
from other folks.  In particular I'd like a discussion about
what this might break, if anything.


Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-07 Thread David Miller
From: Eric Dumazet [EMAIL PROTECTED]
Date: Sun, 04 Nov 2007 12:31:28 +0100

 [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table
 
 As done two years ago on the IP route cache table (commit
 22c047ccbc68fa8f3fa57f0e8f906479a062c426), we can avoid using one lock per
 hash bucket for the huge TCP/DCCP hash tables.
 
 On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for little
 performance difference. (We hit a different cache line for the rwlock, but
 then the bucket cache line has a better sharing factor among CPUs, since we
 dirty it less often.) For netstat or ss commands that want a full scan of
 the hash table, we perform fewer memory accesses.
 
 Using a 'small' table of hashed rwlocks should be more than enough to provide
 correct SMP concurrency between different buckets, without using too much
 memory. Sizing of this table depends on num_possible_cpus() and various
 CONFIG settings.
 
 This patch provides some locking abstraction that may ease future work using
 a different model for the TCP/DCCP table.
 
 Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
 Acked-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]

I'm going to push this current version to Linus; the space savings
really justify it, and if we want to refine things further we can do it
with follow-on work rather than blocking this patch.

Thanks Eric!
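
The idea of a small table of hashed locks shared by many buckets can
be sketched in userspace as below. This is not the patch itself: the
table sizes are arbitrary, the names are hypothetical, and a simple
C11 atomic_flag spinlock is swapped in for the kernel rwlock to keep
the example self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define EHASH_SIZE  1024u   /* number of hash buckets (illustrative) */
#define EHASH_LOCKS 16u     /* number of shared locks, a power of two */

struct ehash_sketch {
	void *buckets[EHASH_SIZE];       /* chain heads, no lock inside */
	atomic_flag locks[EHASH_LOCKS];  /* small shared lock table */
};

static void ehash_sketch_init(struct ehash_sketch *t)
{
	size_t i;

	assert((EHASH_LOCKS & (EHASH_LOCKS - 1)) == 0);
	for (i = 0; i < EHASH_SIZE; i++)
		t->buckets[i] = NULL;
	for (i = 0; i < EHASH_LOCKS; i++)
		atomic_flag_clear(&t->locks[i]);
}

/* Many buckets map onto one lock: only the low hash bits pick the lock. */
static atomic_flag *lock_for(struct ehash_sketch *t, unsigned hash)
{
	return &t->locks[hash & (EHASH_LOCKS - 1)];
}

static void bucket_lock(struct ehash_sketch *t, unsigned hash)
{
	while (atomic_flag_test_and_set_explicit(lock_for(t, hash),
						 memory_order_acquire))
		;	/* spin until the current holder releases the lock */
}

static void bucket_unlock(struct ehash_sketch *t, unsigned hash)
{
	atomic_flag_clear_explicit(lock_for(t, hash), memory_order_release);
}
```

The memory cost of locking is then proportional to EHASH_LOCKS rather
than to EHASH_SIZE, at the price of occasional contention between
unrelated buckets that happen to share a lock.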


Re: [NETLINK]: Fix unicast timeouts

2007-11-07 Thread David Miller
From: Patrick McHardy [EMAIL PROTECTED]
Date: Sun, 04 Nov 2007 17:52:56 +0100

 [NETLINK]: Fix unicast timeouts
 
 Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts by
 moving the schedule_timeout() call to a new function that doesn't propagate
 the remaining timeout back to the caller. This means on each retry we start
 with the full timeout again.
 
 ipc/mqueue.c seems to actually want to wait indefinitely so this behaviour
 is retained.
 
 Cc: Manfred Spraul [EMAIL PROTECTED]
 Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

Applied, and I'll queue this up for -stable, thanks Patrick!
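
The class of bug described above, a helper that waits but does not
hand the remaining timeout back to its caller, can be modelled in a
few lines. Everything here is a hypothetical model: wait_sketch()
merely imitates a sleep that returns the time left, and the retry
counters exist only to make the difference observable.

```c
#include <assert.h>

/* Imitates a sleep of at most `slice` ticks; returns the time left. */
static long wait_sketch(long timeo, long slice)
{
	assert(timeo >= 0 && slice > 0);
	return timeo > slice ? timeo - slice : 0;
}

/* Broken helper: takes the timeout by value and drops the remainder. */
static void attach_broken(long timeo, long slice)
{
	(void)wait_sketch(timeo, slice);
}

/* Fixed helper: writes the remaining timeout back to the caller. */
static void attach_fixed(long *timeo, long slice)
{
	*timeo = wait_sketch(*timeo, slice);
}

/* Number of retries until the timeout expires or max_retries is hit. */
static int retries_broken(long timeo, long slice, int max_retries)
{
	int n = 0;

	while (timeo > 0 && n < max_retries) {
		attach_broken(timeo, slice);  /* timeo never shrinks */
		n++;
	}
	return n;
}

static int retries_fixed(long timeo, long slice, int max_retries)
{
	int n = 0;

	while (timeo > 0 && n < max_retries) {
		attach_fixed(&timeo, slice);
		n++;
	}
	return n;
}
```

With a timeout of 100 ticks and waits of 30 ticks, the fixed variant
gives up after 4 retries, while the broken one starts from the full
timeout each time and only stops at the retry cap.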


Re: Stack Trace. Bad?

2007-11-07 Thread Evgeniy Polyakov
Hi Jon.

On Tue, Nov 06, 2007 at 02:23:03PM -0600, Jon Nelson ([EMAIL PROTECTED]) wrote:
 [linux-raid was also emailed this same information]

It looks like it was not :)

 I was testing some network throughput today and ran into this.
 I should note that this motherboard has 2x MCP55 Ethernet and one
 of them works fine and the other one gives lots and lots of frame
 errors under load.
 
 The following is only an harmless informational message.
 Unless you get a _continuous_flood_ of these messages it means
 everything is working fine. Allocations from irqs cannot be
 perfectly reliable and the kernel is designed to handle that.
 md0_raid5: page allocation failure. order:2, mode:0x20
 
 Call Trace:
  IRQ  [802684c2] __alloc_pages+0x324/0x33d
  [80283147] kmem_getpages+0x66/0x116
  [8028367a] fallback_alloc+0x104/0x174
  [80283330] kmem_cache_alloc_node+0x9c/0xa8
  [80396984] __alloc_skb+0x65/0x138
  [8821d82a] :forcedeth:nv_alloc_rx_optimized+0x4d/0x18f

What is the MTU for this card? Forcedeth supports jumbo frames, but does so
in a very unoptimized way, in particular by relying on being able to
allocate order-2 pages, which is wrong.

So, set the MTU to 1500 and things will be back in good shape.
I think adding fragment support is not a short-term solution because
of closed specs.

-- 
Evgeniy Polyakov
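
As a rough illustration of why a jumbo MTU leads to the order:2
request in the trace, the sketch below computes the allocation order
for a contiguous buffer. It assumes 4 KiB pages and a buffer only
slightly larger than the MTU; order_for() and the 64-byte overhead
used in the note are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u   /* assumption: 4 KiB pages */

/* Smallest order such that (page size << order) bytes can hold `size`. */
static unsigned order_for(size_t size)
{
	unsigned order = 0;
	size_t span = PAGE_SIZE_SKETCH;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions a 1500-byte MTU plus 64 bytes of overhead fits
in a single page (order 0), whereas a 9000-byte MTU plus the same
overhead needs four contiguous pages (order 2), which is much harder
to satisfy from interrupt context on a fragmented system.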


Re: [patch 1/2] ipvs: Bind connections on stanby if the destination exists

2007-11-07 Thread David Miller
From: [EMAIL PROTECTED]
Date: Mon, 05 Nov 2007 12:08:52 +0900

 From: Rumen G. Bogdanovski [EMAIL PROTECTED]
 
 This patch fixes the problem with node overload on director fail-over.
 Consider the scenario of 2 nodes, each accepting 3 connections at a time,
 and 2 directors. If director failover occurs when the nodes are fully
 loaded (6 connections to the cluster), the new director will assign
 another 6 connections to the cluster if the same real servers exist
 there.
 ...
 Acked-by: Julian Anastasov [EMAIL PROTECTED]
 Signed-off-by: Rumen G. Bogdanovski [EMAIL PROTECTED]
 Signed-off-by: Simon Horman [EMAIL PROTECTED]

Applied, thanks.


Re: [2.6 patch] remove Documentation/networking/routing.txt

2007-11-07 Thread David Miller
From: Adrian Bunk [EMAIL PROTECTED]
Date: Mon, 5 Nov 2007 18:06:01 +0100

 This file is so outdated that I can't see any value in keeping it.
 
 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

Agreed, and applied.



Re: [2.6 patch] remove Documentation/networking/ncsa-telnet

2007-11-07 Thread David Miller
From: Adrian Bunk [EMAIL PROTECTED]
Date: Mon, 5 Nov 2007 18:05:46 +0100

 Newsflash: There once was a version of NCSA telnet that had some bug.
 
 Spotted by Pekka Pietikainen.
 
 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

Applied.


Re: [PATCH] Clean proto_(un)register from in-code ifdefs

2007-11-07 Thread David Miller
From: Pavel Emelyanov [EMAIL PROTECTED]
Date: Tue, 06 Nov 2007 20:20:24 +0300

 The struct proto has the per-cpu inuse counter, which is handled
 with a special care. All the handling code hides under the ifdef 
 CONFIG_SMP and it introduces some code duplication and makes it 
 look worse than it could.
 
 Clean this.
 
 Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

Applied, thanks!


Re: [2.6 patch] remove Documentation/networking/Configurable

2007-11-07 Thread David Miller
From: Adrian Bunk [EMAIL PROTECTED]
Date: Mon, 5 Nov 2007 18:05:11 +0100

 After more than 11 years this file no longer contains much useful
 information.
 
 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

Applied.


Re: [2.6 patch] remove comx driver docs

2007-11-07 Thread David Miller
From: Alan Cox [EMAIL PROTECTED]
Date: Mon, 5 Nov 2007 17:18:37 +

 On Mon, 5 Nov 2007 18:04:45 +0100
 Adrian Bunk [EMAIL PROTECTED] wrote:
 
  The drivers were already removed 3.5 years ago.
  
  Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
 
 Acked-by: Alan Cox [EMAIL PROTECTED]

Applied.


Re: [2.6 patch] remove Documentation/networking/pt.txt

2007-11-07 Thread David Miller
From: Alan Cox [EMAIL PROTECTED]
Date: Mon, 5 Nov 2007 17:17:57 +

 On Mon, 5 Nov 2007 18:05:57 +0100
 Adrian Bunk [EMAIL PROTECTED] wrote:
 
  There's no point in keeping documentation for a driver that was
  removed many years ago.
  
  Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
 
 Definitely very dead
 
 Acked-by: Alan Cox [EMAIL PROTECTED]

Applied.


Re: [BUG] in inet6_create

2007-11-07 Thread David Miller
From: YOSHIFUJI Hideaki / 吉藤英明 [EMAIL PROTECTED]
Date: Mon, 05 Nov 2007 20:00:46 +0900 (JST)

 [IPV6]: Ensure to initialize inetsw6 array before we start accepting socket.
 
 Signed-off-by: YOSHIFUJI Hideaki [EMAIL PROTECTED]
 
 diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
 index ecbd388..9ecd41b 100644
 --- a/net/ipv6/af_inet6.c
 +++ b/net/ipv6/af_inet6.c
 @@ -789,6 +789,7 @@ static int __init inet6_init(void)
   /* Register the socket-side information for inet6_create.  */
   for(r = &inetsw6[0]; r < &inetsw6[SOCK_MAX]; ++r)
   INIT_LIST_HEAD(r);
 + synchronize_net();
  
   /* We MUST register RAW sockets before we create the ICMP6,
* IGMP6, or NDISC control sockets.
 

I don't see how this can make a difference.

sock_register() takes spinlocks, and therefore provides
a full memory barrier.  The list initializations MUST
appear before any code path can see inet6_create() and
friends.

I simply cannot see how this crash is even possible.

Also, the original bug reporter cannot provide an inet6.o image that
matches any of his OOPS traces, so we cannot analyze this bug properly.


Re: [patch 2/2] ipvs: Syncrhonise Closing of Connections

2007-11-07 Thread David Miller
From: [EMAIL PROTECTED]
Date: Mon, 05 Nov 2007 12:08:53 +0900

 From: Rumen G. Bogdanovski [EMAIL PROTECTED]
 
 This patch makes the master daemon sync the connection when it is about
 to close.  This makes the connections on the backup close or time out
 according to their state.  Before, the sync was performed only if the
 connection was in ESTABLISHED state, which always made the connections
 time out after the hard-coded 3 minutes. However, Andy Gospodarek's patch
 ([IPVS]: use proper timeout instead of fixed value) effectively did nothing
 more than increase this to 15 minutes (Established state timeout).  So
 this patch makes use of the proper timeout, since it syncs the connections
 on status changes to FIN_WAIT (2min timeout) and CLOSE (10sec timeout).
 However, if the backup misses CLOSE, hopefully it did not miss FIN_WAIT.
 Otherwise we will just have to wait for the ESTABLISHED state timeout, as
 is the case without this patch.  This way the number of hanging connections
 on the backup is kept to a minimum, and very few of them will be left to
 time out with a long timeout.
 
 This is important if we want to make use of the fix for the real server
 overcommit on master/backup fail-over.
 
 Signed-off-by: Rumen G. Bogdanovski [EMAIL PROTECTED]
 Signed-off-by: Simon Horman [EMAIL PROTECTED]

Also applied, thanks.


RE: [PATCH] [AF_PACKET]: Allow multicast traffic to be caught by ORIGDEV when bonded

2007-11-07 Thread Waskiewicz Jr, Peter P
  The socket option for packet sockets to return the original ifindex
  instead of the bonded ifindex will not match multicast traffic.  Since
  this socket option is the most useful for layer 2 traffic and
  multicast traffic, make the option multicast-aware.
  
  Signed-off-by: Peter P Waskiewicz Jr [EMAIL PROTECTED]
 
 I agree with you in principle, but I'd like to hear some feedback
 from other folks.  In particular I'd like a discussion about what
 this might break, if anything.

That's reasonable.  In any event, the only thing this could affect is if
the option is set on the socket, which shouldn't be very often at all.

I'm more than open to feedback on this change.

Thanks Dave,

-PJ Waskiewicz


Re: [PATCH] [AF_PACKET]: Allow multicast traffic to be caught by ORIGDEV when bonded

2007-11-07 Thread David Miller
From: Waskiewicz Jr, Peter P [EMAIL PROTECTED]
Date: Wed, 7 Nov 2007 03:00:42 -0800

 In any event, the only thing this could affect is if the option is
 set on the socket, which shouldn't be very often at all.

Any idea how many programs set this option, and which
ones?  You obviously noticed, so perhaps you know of at
least one, or was this discovered purely by code
inspection?


Re: Why does a connect to IPv6 LLA address fail ?

2007-11-07 Thread Jiri Bohac
Hi,

 For this it creates a datagram socket with
 protocol IPPROTO_IP and then tries to connect it to the destination
 address. This fails in the case of an LLA, because connect returns EINVAL,
 since there is no device bound to this socket at this time.

[snip]

 Why do we have this check in ip6_datagram_connect() ?

This problem has been nicely described in
http://www.linux-ipv6.org/ml/usagi-users/msg03062.html
without any response. 

RFC2461, Appendix A, really suggests performing neighbour
discovery on all the links. I like the idea, it would make LLAs
much more useful. 

Has anyone experimented with this? Is there any good reason why
we don't send NSs to all the links to find out which link the
destination LLA is on?

Regards,

-- 
Jiri Bohac [EMAIL PROTECTED]
SUSE Labs, SUSE CZ



[PATCH 1/2] cleanup pernet operation without CONFIG_NET_NS

2007-11-07 Thread Denis V. Lunev
If CONFIG_NET_NS is not set, only one namespace is possible.

This patch removes the list of pernet_operations and cleans up the code a
bit. This list is not needed if there are no namespaces. We should just call
the ->init method.

Additionally, ->exit will be called on module unloading only. This
case is safe - the code is not discarded. For in-kernel code, ->exit
should never be called.

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]

--- ./net/core/net_namespace.c.netinitdata  2007-10-15 13:55:25.0 +0400
+++ ./net/core/net_namespace.c  2007-11-06 14:33:14.0 +0300
@@ -179,6 +180,7 @@ static int __init net_ns_init(void)
 
 pure_initcall(net_ns_init);
 
+#ifdef CONFIG_NET_NS
 static int register_pernet_operations(struct list_head *list,
  struct pernet_operations *ops)
 {
@@ -220,6 +222,23 @@ static void unregister_pernet_operations
 		ops->exit(net);
 }
 
+#else
+
+static int register_pernet_operations(struct list_head *list,
+				      struct pernet_operations *ops)
+{
+	if (ops->init == NULL)
+		return 0;
+	return ops->init(&init_net);
+}
+
+static void unregister_pernet_operations(struct pernet_operations *ops)
+{
+	if (ops->exit)
+		ops->exit(&init_net);
+}
+#endif
+
 /**
  *  register_pernet_subsys - register a network namespace subsystem
  * @ops:  pernet operations structure for the subsystem


[PATCH 0/2] fix for OOPS in pernet list operations if CONFIG_NET_NS undefined

2007-11-07 Thread Denis V. Lunev
These patches address the oops reported by Cedric Le Goater
a week ago. The pernet_operations were discarded during kernel boot and
this breaks further operations as this 

However, the patch from Pavel Emelyanov was partially reverted
by Eric W. Biederman [commit 2b008b0a8e96b726c603c5e1a5a7a509b5f61e35]

So I revert Eric's patch (actually, Eric's one can simply be dropped) and
fix the original code. There is no need for such complex code if
CONFIG_NET_NS is not defined.


[PATCH 2/2] move unneeded data to initdata section

2007-11-07 Thread Denis V. Lunev
This patch reverts Eric's commit 2b008b0a8e96b726c603c5e1a5a7a509b5f61e35

It shrinks the .text & .data sections of the kernel if CONFIG_NET_NS is not
set. This is safe after the list operations cleanup.

Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]

--- ./drivers/net/loopback.c.reversed   2007-10-30 14:45:07.0 +0300
+++ ./drivers/net/loopback.c2007-11-01 17:30:55.0 +0300
@@ -284,7 +284,7 @@ static __net_exit void loopback_net_exit
unregister_netdev(dev);
 }
 
-static struct pernet_operations loopback_net_ops = {
+static struct pernet_operations __net_initdata loopback_net_ops = {
.init = loopback_net_init,
.exit = loopback_net_exit,
 };
--- ./fs/proc/proc_net.c.reversed   2007-10-30 14:45:07.0 +0300
+++ ./fs/proc/proc_net.c2007-11-01 17:30:57.0 +0300
@@ -185,7 +185,7 @@ static __net_exit void proc_net_ns_exit(
 	kfree(net->proc_net_root);
 }
 
-static struct pernet_operations proc_net_ns_ops = {
+static struct pernet_operations __net_initdata proc_net_ns_ops = {
.init = proc_net_ns_init,
.exit = proc_net_ns_exit,
 };
--- ./include/net/net_namespace.h.reversed  2007-10-30 14:45:07.0 +0300
+++ ./include/net/net_namespace.h   2007-11-01 17:30:58.0 +0300
@@ -102,9 +102,11 @@ static inline void release_net(struct ne
 #ifdef CONFIG_NET_NS
 #define __net_init
 #define __net_exit
+#define __net_initdata
 #else
 #define __net_init __init
 #define __net_exit __exit_refok
+#define __net_initdata __initdata
 #endif
 
 struct pernet_operations {
--- ./net/core/dev.c.reversed   2007-10-30 14:45:08.0 +0300
+++ ./net/core/dev.c2007-11-01 17:30:58.0 +0300
@@ -2676,7 +2676,7 @@ static void __net_exit dev_proc_net_exit
proc_net_remove(net, dev);
 }
 
-static struct pernet_operations dev_proc_ops = {
+static struct pernet_operations __net_initdata dev_proc_ops = {
.init = dev_proc_net_init,
.exit = dev_proc_net_exit,
 };
@@ -4336,7 +4336,7 @@ static void __net_exit netdev_exit(struc
 	kfree(net->dev_index_head);
 }
 
-static struct pernet_operations  netdev_net_ops = {
+static struct pernet_operations __net_initdata netdev_net_ops = {
.init = netdev_init,
.exit = netdev_exit,
 };
@@ -4367,7 +4367,7 @@ static void __net_exit default_device_ex
rtnl_unlock();
 }
 
-static struct pernet_operations  default_device_ops = {
+static struct pernet_operations __net_initdata default_device_ops = {
.exit = default_device_exit,
 };
 
--- ./net/core/dev_mcast.c.reversed 2007-10-30 14:45:08.0 +0300
+++ ./net/core/dev_mcast.c  2007-11-01 17:31:00.0 +0300
@@ -285,7 +285,7 @@ static void __net_exit dev_mc_net_exit(s
proc_net_remove(net, dev_mcast);
 }
 
-static struct pernet_operations dev_mc_net_ops = {
+static struct pernet_operations __net_initdata dev_mc_net_ops = {
.init = dev_mc_net_init,
.exit = dev_mc_net_exit,
 };
--- ./net/netlink/af_netlink.c.reversed 2007-10-30 14:45:08.0 +0300
+++ ./net/netlink/af_netlink.c  2007-11-01 17:31:01.0 +0300
@@ -1888,7 +1888,7 @@ static void __net_exit netlink_net_exit(
 #endif
 }
 
-static struct pernet_operations netlink_net_ops = {
+static struct pernet_operations __net_initdata netlink_net_ops = {
.init = netlink_net_init,
.exit = netlink_net_exit,
 };


Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-07 Thread Jarek Poplawski
On Wed, Nov 07, 2007 at 02:41:14AM -0800, David Miller wrote:
 From: Eric Dumazet [EMAIL PROTECTED]
 Date: Sun, 04 Nov 2007 12:31:28 +0100
 
  [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table
  
  As done two years ago on IP route cache table (commit
  22c047ccbc68fa8f3fa57f0e8f906479a062c426), we can avoid using one lock per
  hash bucket for the huge TCP/DCCP hash tables.
  
  On a typical x86_64 platform, this saves about 2MB or 4MB of ram, for little
  performance differences. (we hit a different cache line for the rwlock, but
  then the bucket cache line have a better sharing factor among cpus, since we
  dirty it less often). For netstat or ss commands that want a full scan of hash
  table, we perform fewer memory accesses.
  
  Using a 'small' table of hashed rwlocks should be more than enough to provide
  correct SMP concurrency between different buckets, without using too much
  memory. Sizing of this table depends on num_possible_cpus() and various CONFIG
  settings.
  
  This patch provides some locking abstraction that may ease a future work using
  a different model for TCP/DCCP table.
  
  Signed-off-by: Eric Dumazet [EMAIL PROTECTED]
  Acked-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
 
 I'm going to push this current version to Linus, the space saving
 really justify it and if we want to refine things further we do it
 with followon work rather than blocking this patch.
 
 Thanks Eric!

I hope my remarks didn't block anything?! I've written it's OK.
So, I'm not sure it's useful or expected, but anyway:

Acked-by: Jarek Poplawski [EMAIL PROTECTED]

Thanks,
Jarek P.
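
For readers following the thread, the "small table of hashed locks" idea can be sketched in userspace roughly as follows. This is only an illustration under assumptions: `NLOCKS`, `struct hashed_table` and the helper names are hypothetical, and a C11 `atomic_flag` spin flag stands in for the kernel rwlock. Many buckets share one lock, selected by masking the bucket index:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Assumed power of two; the patch derives the real size from
 * num_possible_cpus() and CONFIG settings. */
#define NLOCKS 16

struct bucket {
	int entries;
};

struct hashed_table {
	struct bucket *buckets;
	size_t nbuckets;            /* power of two */
	atomic_flag locks[NLOCKS];  /* far fewer locks than buckets */
};

/* Map a bucket slot to the lock that protects it. */
static size_t lock_index(size_t slot)
{
	return slot & (NLOCKS - 1);
}

static void table_init(struct hashed_table *t, struct bucket *b, size_t n)
{
	t->buckets = b;
	t->nbuckets = n;
	for (size_t i = 0; i < NLOCKS; i++)
		atomic_flag_clear(&t->locks[i]);
}

/* Take the shared lock for the bucket, update it, release the lock. */
static void bucket_add(struct hashed_table *t, size_t hash)
{
	size_t slot = hash & (t->nbuckets - 1);
	atomic_flag *lock = &t->locks[lock_index(slot)];

	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until the flag is free */
	t->buckets[slot].entries++;
	atomic_flag_clear_explicit(lock, memory_order_release);
}
```

The memory saving comes from the bucket no longer embedding a lock; the cost is that two unrelated buckets occasionally contend on the same lock.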


[PATCH] mdiobus_register: check bus not being NULL before dereferencing it.

2007-11-07 Thread Uwe Kleine-König
Signed-off-by: Uwe Kleine-König [EMAIL PROTECTED]
Cc: Andy Fleming [EMAIL PROTECTED]
---
 drivers/net/phy/mdio_bus.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index fc2f0e6..7ff55bb 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -49,13 +49,13 @@ int mdiobus_register(struct mii_bus *bus)
 	int i;
 	int err = 0;
 
-	spin_lock_init(&bus->mdio_lock);
-
 	if (NULL == bus || NULL == bus->name ||
 			NULL == bus->read ||
 			NULL == bus->write)
 		return -EINVAL;
 
+	spin_lock_init(&bus->mdio_lock);
+
 	if (bus->reset)
 		bus->reset(bus);
 
-- 
1.5.3.4
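
The ordering issue is easy to reproduce outside the kernel. In the minimal sketch below, `struct demo_bus` and its helpers are hypothetical stand-ins for `struct mii_bus`, and an integer flag stands in for the spinlock. The point is only that every field access happens after the NULL check:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mii_bus. */
struct demo_bus {
	const char *name;
	int (*read)(struct demo_bus *bus, int addr);
	int (*write)(struct demo_bus *bus, int addr, int val);
	int lock_ready;	/* stands in for the initialised spinlock */
};

/* Sample callbacks so a valid bus can be built. */
static int demo_read(struct demo_bus *bus, int addr)
{
	(void)bus;
	return addr;
}

static int demo_write(struct demo_bus *bus, int addr, int val)
{
	(void)bus;
	(void)addr;
	(void)val;
	return 0;
}

static int demo_bus_register(struct demo_bus *bus)
{
	/* Validate the pointer and its mandatory fields first ... */
	if (bus == NULL || bus->name == NULL ||
	    bus->read == NULL || bus->write == NULL)
		return -EINVAL;

	/* ... and only then touch anything inside the structure. */
	bus->lock_ready = 1;
	return 0;
}
```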



Re: Endianness problem with u32 classifier hash masks

2007-11-07 Thread Jarek Poplawski
On Wed, Nov 07, 2007 at 01:22:20AM -0800, David Miller wrote:
 From: Radu Rendec [EMAIL PROTECTED]
 Date: Tue, 06 Nov 2007 19:00:16 +0200
 
  On Tue, 2007-11-06 at 09:43 -0500, jamal wrote:
   On Tue, 2007-06-11 at 15:25 +0100, Jarek Poplawski wrote:
   
Yes, it saves one htonl() on the slow path!
   
   Would it feel better to say grew down exponentially from version 1 to
   3? ;-
  
  Not only it saves one htonl(), but also keeps the code readable :)
  Computing offsets within the rtnetlink response skb and applying htonl()
  there is quite tricky and might get broken if RTA_PUT() is changed.
  Unfortunately I spent about an hour figuring out how to do that :))
  
  The bad news is that today I haven't got the chance to work on the two
  patches. But the good news is that I managed to finish the (urgent) task
  that had been assigned to me at work, and tomorrow I will be able to
  work on the kernel and test it leisurely.
 
 I've grown impatient and done the work for you :-)  I've applied
 the patch below to my tree, thank you!
 
 If someone wants to send me the ffs() thing relative to this,
 I'd appreciate it.  Thanks again!

...And Radu has spent so much time on this git vs. which tree to cut
for the beginning... I hope this ffs() patch will be enough to check
on some bush at least!

 
 From 8e36263f10a054479636b57943cdeaf37470acc5 Mon Sep 17 00:00:00 2001
 From: Radu Rendec [EMAIL PROTECTED]
 Date: Wed, 7 Nov 2007 01:20:12 -0800
 Subject: [PATCH] [PKT_SCHED] CLS_U32: Fix endianness problem with u32 
 classifier hash masks.
 
 From: Radu Rendec [EMAIL PROTECTED]
 
 While trying to implement u32 hashes in my shaping machine I ran into
 a possible bug in the u32 hash/bucket computing algorithm
 (net/sched/cls_u32.c).
 
 The problem occurs only with hash masks that extend over the octet
 boundary, on little endian machines (where htonl() actually does
 something).
 
 Let's say that I would like to use 0x00003fc0 as the hash mask. This means
 8 contiguous 1 bits starting at b6. With such a mask, the expected
 (and logical) behavior is to hash any address in, for instance,
 192.168.0.0/26 in bucket 0, then any address in 192.168.0.64/26 in
 bucket 1, then 192.168.0.128/26 in bucket 2 and so on.
 
 This is exactly what would happen on a big endian machine, but on
 little endian machines, what would actually happen with current
 implementation is 0x00003fc0 being reversed (into 0xc03f0000) by htonl()
 in the userspace tool and then applied to 192.168.x.x in the u32
 classifier. When shifting right by 16 bits (rank of first 1 bit in
 the reversed mask) and applying the divisor mask (0xff for divisor
 256), what would actually remain is 0x3f applied on the 168 octet of
 the address.
 
 One could say this can be easily worked around by taking endianness
 into account in userspace and supplying an appropriate mask (0xfc030000)
 that would be turned into contiguous 1 bits when reversed
 (0x000003fc). But the actual problem is the network address (inside
 the packet) not being converted to host order, but used as a
 host-order value when computing the bucket.
 
 Let's say the network address is written as n31 n30 ... n0, with n0
 being the least significant bit. When used directly (without any
 conversion) on a little endian machine, it becomes n7 ... n0 n8 ..n15
 etc in the machine's registers. Thus bits n7 and n8 would no longer be
 adjacent and 192.168.64.0/26 and 192.168.128.0/26 would no longer be
 consecutive.
 
 The fix is to apply ntohl() on the hmask before computing fshift,
 and in u32_hash_fold() convert the packet data to host order before
 shifting down by fshift.
 
 With helpful feedback from Jamal Hadi Salim and Jarek Poplawski.

Acked-by: Jarek Poplawski [EMAIL PROTECTED]

 
 Signed-off-by: David S. Miller [EMAIL PROTECTED]
 ---
  net/sched/cls_u32.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
 index 9e98c6e..5317102 100644
 --- a/net/sched/cls_u32.c
 +++ b/net/sched/cls_u32.c
 @@ -91,7 +91,7 @@ static struct tc_u_common *u32_list;
  
  static __inline__ unsigned u32_hash_fold(u32 key, struct tc_u32_sel *sel, u8 fshift)
  {
 -	unsigned h = (key & sel->hmask)>>fshift;
 +	unsigned h = ntohl(key & sel->hmask)>>fshift;
  
  	return h;
  }
 @@ -615,7 +615,7 @@ static int u32_change(struct tcf_proto *tp, unsigned long base, u32 handle,
  	n->handle = handle;
  {
  	u8 i = 0;
 -	u32 mask = s->hmask;
 +	u32 mask = ntohl(s->hmask);
  	if (mask) {
  		while (!(mask & 1)) {
  			i++;
 -- 
 1.5.3.5
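
The bucket arithmetic described in the commit message can be checked with a small userspace sketch. This is an illustration, not the kernel code: `first_set_bit` and `bucket_of` are hypothetical helper names, and both the key and the mask are assumed to arrive in network byte order, as they do in the classifier. The masked key is converted to host order before the shift, which is what the fix does:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Rank of the lowest set bit of a host-order mask (0 for an empty mask). */
static unsigned first_set_bit(uint32_t mask)
{
	unsigned i = 0;

	if (mask == 0)
		return 0;
	while (!(mask & 1)) {
		i++;
		mask >>= 1;
	}
	return i;
}

/*
 * key_be and hmask_be are in network byte order. Convert the masked key
 * to host order, shift it down by the rank of the mask's lowest set bit,
 * and fold it with the divisor mask.
 */
static unsigned bucket_of(uint32_t key_be, uint32_t hmask_be,
			  unsigned divisor_mask)
{
	unsigned fshift = first_set_bit(ntohl(hmask_be));

	return (ntohl(key_be & hmask_be) >> fshift) & divisor_mask;
}
```

With the mask 0x00003fc0 from the example, `bucket_of(inet_addr("192.168.0.64"), htonl(0x00003fc0), 0xff)` evaluates to 1 and 192.168.0.128 lands in bucket 2, on either endianness.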
 


Re: [PATCH 13/24] [IPSEC]: Move x->outer_mode->output out of locked section

2007-11-07 Thread Ingo Oeser
Hi Herbert,

Herbert Xu schrieb:
 diff --git a/net/ipv6/xfrm6_mode_ro.c b/net/ipv6/xfrm6_mode_ro.c
 index a7bc8c6..4a01cb3 100644
 --- a/net/ipv6/xfrm6_mode_ro.c
 +++ b/net/ipv6/xfrm6_mode_ro.c
 @@ -53,7 +54,9 @@ static int xfrm6_ro_output(struct xfrm_state *x, struct sk_buff *skb)
  	__skb_pull(skb, hdr_len);
  	memmove(ipv6_hdr(skb), iph, hdr_len);
  
 +	spin_lock_bh(&x->lock);
  	x->lastused = get_seconds();
 +	spin_unlock_bh(&x->lock);
  
  	return 0;
  }

Can you move the retrieval of the seconds outside the spinlock?

e.g.

tmp = get_seconds();
spin_lock_bh(&x->lock);
x->lastused = tmp;
spin_unlock_bh(&x->lock);

or is it not really worth it?

Best Regards

Ingo Oeser


Re: Please pull 'fixes-davem' branch of wireless-2.6

2007-11-07 Thread John W. Linville
Dave,

Hold-off on this one for now if -- clearly Johannes and I need to
brush-up on our Kconfig skills... :-(

I'll post a new pull request soon.

Thanks,

John

On Tue, Nov 06, 2007 at 07:13:14PM -0500, John W. Linville wrote:
 Dave,
 
 Here are some fixes for 2.6.24...
 
 The iwlwifi patch is needed because the iwlwifi drivers routinely end-up
 associated with the simple rate control algorithm, yet those drivers
 really only work with their own custom algorithms.  The other rate
 control patches are there to satisfy dependencies for this patch.
 
 mac80211: remove ieee80211_common.h cleans-up an unused file left
 hanging-around after an earlier patch already in 2.6.24.
 
 mac80211: remove unused driver ops is really a clean-up, but
 mac80211: use IW_AUTH_PRIVACY_INVOKED rather than IW_AUTH_KEY_MGMT
 depends on it.
 
 ssb: Fix initcall ordering changes a subsys_initcall to an
 fs_initcall.  This seems like a bit of a hack, but it fixes a real
 problem and I'm not sure what cleaner solution is either reasonable
 or available.  The comment in the patch explains the reasoning for this
 somewhat unique situation.
 
 I think the other patches are plain enough to not require further
 comment.  Let me know if there are any problems!
 
 Thanks,
 
 John
 
 ---
 
 Individual patches are available here:
 
   
 http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/fixes-davem/
 
 ---
 
 The following changes since commit 2655e2cee2d77459fcb7e10228259e4ee0328697:
   Alan Cox (1):
 ata_piix: Add additional PCI identifier for 40 wire short cable
 
 are available in the git repository at:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git 
 fixes-davem
 
 Johannes Berg (9):
   softmac: fix wext MLME request reason code endianness
   mac80211: make simple rate control algorithm built-in
   mac80211: don't allow registering the same rate control twice
   mac80211: allow driver to ask for a rate control algorithm
   iwlwifi: select proper rate control algorithm
   softmac: MAINTAINERS update
   mac80211: remove ieee80211_common.h
   mac80211: remove unused driver ops
   mac80211: use IW_AUTH_PRIVACY_INVOKED rather than IW_AUTH_KEY_MGMT
 
 John W. Linville (1):
   mac80211: make decrypt failed messages conditional upon MAC80211_DEBUG
 
 Michael Buesch (5):
   ssb: Fix initcall ordering
   rfkill: Register LED triggers before registering switch
   rfkill: Use subsys_initcall
   rfkill: Use mutex_lock() at register and add sanity check
   rfkill: Fix sparse warning
 
  MAINTAINERS |7 +--
  drivers/net/wireless/iwlwifi/iwl3945-base.c |2 +
  drivers/net/wireless/iwlwifi/iwl4965-base.c |2 +
  drivers/ssb/main.c  |5 +-
  include/net/mac80211.h  |   26 ++--
  net/ieee80211/softmac/ieee80211softmac_wx.c |2 +-
  net/mac80211/Kconfig|   12 
  net/mac80211/Makefile   |3 +-
  net/mac80211/ieee80211.c|   16 +-
  net/mac80211/ieee80211_common.h |   91 
 ---
  net/mac80211/ieee80211_i.h  |2 +-
  net/mac80211/ieee80211_ioctl.c  |   21 +++
  net/mac80211/ieee80211_rate.c   |   24 ++-
  net/mac80211/ieee80211_rate.h   |3 +
  net/mac80211/ieee80211_sta.c|   18 +++--
  net/mac80211/rc80211_simple.c   |   25 +---
  net/mac80211/rx.c   |2 +
  net/mac80211/wep.c  |2 +
  net/mac80211/wpa.c  |   18 --
  net/rfkill/rfkill.c |   37 ++-
  20 files changed, 126 insertions(+), 192 deletions(-)
  delete mode 100644 net/mac80211/ieee80211_common.h

-- 
John W. Linville
[EMAIL PROTECTED]


RE: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
 

 -Original Message-
 From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, November 07, 2007 10:49 AM
 To: Templin, Fred L
 Cc: [EMAIL PROTECTED]; netdev@vger.kernel.org; [EMAIL PROTECTED]
 Subject: Re: [PATCH 02/05] ipv6: RFC4214 Support
 
 In article 
 [EMAIL PROTECTED]
 eing.com (at Wed, 7 Nov 2007 10:24:50 -0800), Templin, Fred 
 L [EMAIL PROTECTED] says:
 
   
  
   -Original Message-
   From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] 
   Sent: Wednesday, November 07, 2007 10:12 AM
   To: [EMAIL PROTECTED]
   Cc: Templin, Fred L; netdev@vger.kernel.org; 
 [EMAIL PROTECTED]
   Subject: Re: [PATCH 02/05] ipv6: RFC4214 Support
   
   Hello.
   
   In article [EMAIL PROTECTED] (at Wed, 7 
   Nov 2007 16:58:59 +0100), Ingo Oeser [EMAIL PROTECTED] says:
   
 + eui[0] = 0;
 +
 + /* Check for RFC3330 global address ranges */
 + if (((ipv4 >= 0x01000000) && (ipv4 < 0x0a000000)) ||
 +     ((ipv4 >= 0x0b000000) && (ipv4 < 0x7f000000)) ||
 +     ((ipv4 >= 0x80000000) && (ipv4 < 0xa9fe0000)) ||
 +     ((ipv4 >= 0xa9ff0000) && (ipv4 < 0xac100000)) ||
 +     ((ipv4 >= 0xac200000) && (ipv4 < 0xc0a80000)) ||
 +     ((ipv4 >= 0xc0a90000) && (ipv4 < 0xc6120000)) ||
 +     ((ipv4 >= 0xc6140000) && (ipv4 < 0xe0000000))) eui[0] |= 0x2;
 +

Instead of converting network to host byte order at runtime
and comparing the results to constants, let the compiler convert
the constants to network byte order and compare in network order.

so use:

 if (((*addr >= htonl(0x01000000)) && (*addr < htonl(0x0a000000))) ||

instead. The compiler will notice that 0x01000000 is a constant and will
use _constant_htonl() automatically.
   
   No, you cannot do this.
   When you check the range, you need to use host-byte order.
  
  I think the original poster was correct on this one; the addr comes
  in in network byte order, and the constants are depicted in host
  byte order. So, the suggested fix was to have htonl(const) to make
  all of the constants into network byte order while leaving addr
  alone.
 
 I don't understand.
 
 For example, 1.0.0.11 is valid IPv4 global address.
 In little-endian, this is not in the range of
 0x00000001 <= addr <= 0x0000000a (addr is 0x0b000001).

Maybe it is I who did not understand. Can you suggest a clean solution?

Fred
[EMAIL PROTECTED]


Re: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread Simon Arlott
On 07/11/07 18:52, Templin, Fred L wrote:
 +	eui[0] = 0;
 +
 +	/* Check for RFC3330 global address ranges */
 +	if (((ipv4 >= 0x01000000) && (ipv4 < 0x0a000000)) ||
 +	    ((ipv4 >= 0x0b000000) && (ipv4 < 0x7f000000)) ||
 +	    ((ipv4 >= 0x80000000) && (ipv4 < 0xa9fe0000)) ||
 +	    ((ipv4 >= 0xa9ff0000) && (ipv4 < 0xac100000)) ||
 +	    ((ipv4 >= 0xac200000) && (ipv4 < 0xc0a80000)) ||
 +	    ((ipv4 >= 0xc0a90000) && (ipv4 < 0xc6120000)) ||
 +	    ((ipv4 >= 0xc6140000) && (ipv4 < 0xe0000000))) eui[0] |= 0x2;
 I don't understand.
 
 For example, 1.0.0.11 is valid IPv4 global address.
 In little-endian, this is not in the range of
 0x00000001 <= addr <= 0x0000000a (addr is 0x0b000001).
 
 Maybe it is I who did not understand. Can you suggest a clean solution?

((ipv4 & htonl(0xFF000000)) == htonl(0x0A000000)) etc.?
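
A minimal sketch of that mask-and-compare style follows. It is an illustration only: `in_prefix` and `is_private_v4` are hypothetical names, and the three RFC 1918 prefixes are used merely as examples. Because only equality is tested after masking, the address can stay in network byte order and the constants are converted instead:

```c
#include <arpa/inet.h>
#include <stdint.h>

/*
 * addr_be is in network byte order; prefix and mask are written in host
 * order and converted, so the address itself is never byte-swapped.
 */
static int in_prefix(uint32_t addr_be, uint32_t prefix_host,
		     uint32_t mask_host)
{
	return (addr_be & htonl(mask_host)) == htonl(prefix_host);
}

/* Example use: the three RFC 1918 private ranges. */
static int is_private_v4(uint32_t addr_be)
{
	return in_prefix(addr_be, 0x0A000000, 0xFF000000) ||	/* 10/8 */
	       in_prefix(addr_be, 0xAC100000, 0xFFF00000) ||	/* 172.16/12 */
	       in_prefix(addr_be, 0xC0A80000, 0xFFFF0000);	/* 192.168/16 */
}
```

Unlike an ordered `<` or `>=` comparison, an equality test on masked bits gives the same answer on big and little endian hosts.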

-- 
Simon Arlott


Re: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread YOSHIFUJI Hideaki / 吉藤英明
In article [EMAIL PROTECTED] (at Wed, 7 Nov 2007 10:24:50 -0800), Templin, 
Fred L [EMAIL PROTECTED] says:

  
 
  -Original Message-
  From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] 
  Sent: Wednesday, November 07, 2007 10:12 AM
  To: [EMAIL PROTECTED]
  Cc: Templin, Fred L; netdev@vger.kernel.org; [EMAIL PROTECTED]
  Subject: Re: [PATCH 02/05] ipv6: RFC4214 Support
  
  Hello.
  
  In article [EMAIL PROTECTED] (at Wed, 7 
  Nov 2007 16:58:59 +0100), Ingo Oeser [EMAIL PROTECTED] says:
  
+	eui[0] = 0;
+
+	/* Check for RFC3330 global address ranges */
+	if (((ipv4 >= 0x01000000) && (ipv4 < 0x0a000000)) ||
+	    ((ipv4 >= 0x0b000000) && (ipv4 < 0x7f000000)) ||
+	    ((ipv4 >= 0x80000000) && (ipv4 < 0xa9fe0000)) ||
+	    ((ipv4 >= 0xa9ff0000) && (ipv4 < 0xac100000)) ||
+	    ((ipv4 >= 0xac200000) && (ipv4 < 0xc0a80000)) ||
+	    ((ipv4 >= 0xc0a90000) && (ipv4 < 0xc6120000)) ||
+	    ((ipv4 >= 0xc6140000) && (ipv4 < 0xe0000000))) eui[0] |= 0x2;
+
   
   Instead of converting network to host byte order at runtime 
   and comparing the results to constants, let the compiler convert
   the constants to network byte order and compare in network order.
   
   so use:
   
    if (((*addr >= htonl(0x01000000)) && (*addr < htonl(0x0a000000))) ||
   
   instead. The compiler will notice that 0x01000000 is a constant and will
   use _constant_htonl() automatically.
  
  No, you cannot do this.
  When you check the range, you need to use host-byte order.
 
 I think the original poster was correct on this one; the addr comes
 in in network byte order, and the constants are depicted in host
 byte order. So, the suggested fix was to have htonl(const) to make
 all of the constants into network byte order while leaving addr
 alone.

I don't understand.

For example, 1.0.0.11 is a valid IPv4 global address.
In little-endian, this is not in the range of
0x00000001 <= addr <= 0x0000000a (addr is 0x0b000001).

--yoshfuji
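
The point about range checks can be reproduced with a short userspace sketch. The two helper names below are hypothetical, and the address is assumed to arrive in network byte order. Comparing in host order gives the intended answer everywhere, while comparing the raw network-order value against `htonl()` constants only happens to work on big endian hosts:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Correct: bring the address to host order, then compare numerically. */
static int in_range_host_order(uint32_t addr_be, uint32_t lo, uint32_t hi)
{
	uint32_t a = ntohl(addr_be);

	return a >= lo && a < hi;
}

/*
 * Broken on little endian: byte-swapping both sides does not preserve
 * numeric ordering, so "<" and ">=" no longer describe an address range.
 */
static int in_range_network_order(uint32_t addr_be, uint32_t lo, uint32_t hi)
{
	return addr_be >= htonl(lo) && addr_be < htonl(hi);
}
```

For 1.0.0.11 and the range 0x01000000 to 0x0a000000, the first helper reports a match on any host, whereas on a little endian host the second one compares 0x0b000001 against 0x00000001 and 0x0000000a and rejects the address.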


Re: 2.6.23.1-smp kernel panic (network-related)

2007-11-07 Thread Marek Kierdelewicz
What is the test input that causes the crash??

Test box is treated with mirrored traffic that is routed by production
linux router (with non-smp kernel). It's usual traffic generated by
broadband clients. Some of the characteristics:

bandwidth used: ~40/40 Mbit (up/down)
pps: ~15k

number of clients: ~550

dump of packet sizes:
 Packet size   | Count
    1 to   75  | 186501
   76 to  150  |  14145
  151 to  225  |   3285
  226 to  300  |   2088
  301 to  375  |   3632
  376 to  450  |   2097
  451 to  525  |   1513
  526 to  600  |   3069
  601 to  675  |  20081
  676 to  750  |   1294
  751 to  825  |   1189
  826 to  900  |    885
  901 to  975  |   2207
  976 to 1050  |   1333
 1051 to 1125  |   1192
 1201 to 1275  |   3036
 1276 to 1350  |   3709
 1351 to 1425  |   3453
 1426 to 1500+ | 185318

protocol breakdown:
most of the traffic is IPv4, some UDP and a little bit of ICMP

Don't know if it's important, but box is connected to switch by a
802.1q trunk. Each of vlan interfaces has egress shaping on it +
dedicated ifb device attached to ingress qdisc and ingress shaping on
ifb device.

Is any additional information needed?

-- 
Marek Kierdelewicz
Kierownik Działu Systemów Sieciowych, KoBa
Manager of Network Systems Department, KoBa
tel. (85) 7406466; fax. (85) 7406467
e-mail: [EMAIL PROTECTED]


Re: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread YOSHIFUJI Hideaki / 吉藤英明
Hello.

In article [EMAIL PROTECTED] (at Wed, 7 Nov 2007 16:58:59 +0100), Ingo Oeser 
[EMAIL PROTECTED] says:

  +	eui[0] = 0;
  +
  +	/* Check for RFC3330 global address ranges */
  +	if (((ipv4 >= 0x01000000) && (ipv4 < 0x0a000000)) ||
  +	    ((ipv4 >= 0x0b000000) && (ipv4 < 0x7f000000)) ||
  +	    ((ipv4 >= 0x80000000) && (ipv4 < 0xa9fe0000)) ||
  +	    ((ipv4 >= 0xa9ff0000) && (ipv4 < 0xac100000)) ||
  +	    ((ipv4 >= 0xac200000) && (ipv4 < 0xc0a80000)) ||
  +	    ((ipv4 >= 0xc0a90000) && (ipv4 < 0xc6120000)) ||
  +	    ((ipv4 >= 0xc6140000) && (ipv4 < 0xe0000000))) eui[0] |= 0x2;
  +
 
 Instead of converting network to host byte order at runtime 
 and comparing the results to constants, let the compiler convert
 the constants to network byte order and compare in network order.
 
 so use:
 
  if (((*addr >= htonl(0x01000000)) && (*addr < htonl(0x0a000000))) ||
 
 instead. The compiler will notice that 0x01000000 is a constant and will
 use __constant_htonl() automatically.

No, you cannot do this.
When you check the range, you need to use host-byte order.

  +
  +static inline int ipv6_addr_is_isatap(const struct in6_addr *addr)
  +{
  +	return (addr->s6_addr32[2] == __constant_htonl(0x02005EFE) ||
  +		addr->s6_addr32[2] == __constant_htonl(0x00005EFE));
  +}
  +#endif
 
 The compiler will notice that 0x01000000 is a constant and will
 use __constant_htonl() automatically. Please use simply htonl().

Right.  And, maybe, you can write as follows:
	return ((addr->s6_addr32[2] | htonl(0x02000000)) == htonl(0x02005EFE));
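
As a quick sanity check that the single-comparison form accepts exactly the same two identifiers, here is a minimal userspace sketch. The function names are hypothetical, and the 32-bit word is assumed to be in network byte order, like `s6_addr32[2]`:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Two explicit comparisons, as in the submitted patch. */
static int is_isatap_two_compares(uint32_t word_be)
{
	return word_be == htonl(0x02005EFE) ||
	       word_be == htonl(0x00005EFE);
}

/* One comparison: force the 0x02000000 bit on, then compare once. */
static int is_isatap_or_mask(uint32_t word_be)
{
	return (word_be | htonl(0x02000000)) == htonl(0x02005EFE);
}
```

Setting the bit before comparing folds the two accepted values into one, and any other differing bit still makes the comparison fail.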

--yoshfuji


[PATCH 02/13] SCTP : Fix to process bundled ASCONF chunk correctly

2007-11-07 Thread Vlad Yasevich
From: Wei Yongjun [EMAIL PROTECTED]

If an ASCONF chunk is bundled with other chunks as the first chunk, then
when the ASCONF parameters are processed, the full packet data is treated
as the parameters of the ASCONF chunk, not only the real parameters. So if
you send an ASCONF chunk bundled with other chunks, you will get an
unexpected result.
This problem also exists when an ASCONF-ACK chunk is bundled with other chunks.

This patch fixes this problem.

Signed-off-by: Wei Yongjun [EMAIL PROTECTED]
Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/sm_make_chunk.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index c377e4e..5de4729 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -2848,10 +2848,11 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc,
 
 	__be16	err_code;
 	int	length = 0;
-	int	chunk_len = asconf->skb->len;
+	int	chunk_len;
 	__u32	serial;
 	int	all_param_pass = 1;
 
+	chunk_len = ntohs(asconf->chunk_hdr->length) - sizeof(sctp_chunkhdr_t);
 	hdr = (sctp_addiphdr_t *)asconf->skb->data;
 	serial = ntohl(hdr->serial);
 
@@ -2990,7 +2991,7 @@ static __be16 sctp_get_asconf_response(struct sctp_chunk *asconf_ack,
 	sctp_addip_param_t	*asconf_ack_param;
 	sctp_errhdr_t		*err_param;
 	int			length;
-	int			asconf_ack_len = asconf_ack->skb->len;
+	int			asconf_ack_len;
 	__be16			err_code;
 
 	if (no_err)
@@ -2998,6 +2999,9 @@ static __be16 sctp_get_asconf_response(struct sctp_chunk *asconf_ack,
 	else
 		err_code = SCTP_ERROR_REQ_REFUSED;
 
+	asconf_ack_len = ntohs(asconf_ack->chunk_hdr->length) -
+			     sizeof(sctp_chunkhdr_t);
+
+
/* Skip the addiphdr from the asconf_ack chunk and store a pointer to
 * the first asconf_ack parameter.
 */
-- 
1.5.2.4
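
The difference between "bytes left in the buffer" and "bytes belonging to this chunk" can be shown with a small userspace sketch. It is an illustration under assumptions: `struct chunk_hdr` and `chunk_payload_len` are hypothetical names, and only the common 4-byte chunk header layout (type, flags, 16-bit length in network order that includes the header) is modelled:

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified chunk header: the length field covers header plus payload. */
struct chunk_hdr {
	uint8_t  type;
	uint8_t  flags;
	uint16_t length;	/* network byte order */
};

/*
 * Return the number of payload bytes that belong to the chunk starting
 * at 'chunk', or -1 if the header is truncated or inconsistent. The size
 * of the surrounding buffer is used only for validation, never as the
 * chunk length.
 */
static int chunk_payload_len(const uint8_t *chunk, size_t buf_len)
{
	struct chunk_hdr hdr;
	uint16_t len;

	if (buf_len < sizeof(hdr))
		return -1;
	memcpy(&hdr, chunk, sizeof(hdr));
	len = ntohs(hdr.length);
	if (len < sizeof(hdr) || len > buf_len)
		return -1;
	return (int)(len - sizeof(hdr));
}
```

For a 20-byte packet holding a 12-byte first chunk followed by an 8-byte second chunk, the helper reports 8 payload bytes for the first chunk rather than the 16 that remain in the buffer.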



[PATCH 01/13] SCTP : Fix bad formatted comment in outqueue.c

2007-11-07 Thread Vlad Yasevich
From: Wei Yongjun [EMAIL PROTECTED]

Just fix the bad format of the comment in outqueue.c.

Signed-off-by: Wei Yongjun [EMAIL PROTECTED]
Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/outqueue.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 28f4fe7..e315c6c 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -641,7 +641,8 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 
 	/* If we are here due to a retransmit timeout or a fast
 	 * retransmit and if there are any chunks left in the retransmit
-	 * queue that could not fit in the PMTU sized packet, they need 	 * to be marked as ineligible for a subsequent fast retransmit.
+	 * queue that could not fit in the PMTU sized packet, they need
+	 * to be marked as ineligible for a subsequent fast retransmit.
 	 */
 	if (rtx_timeout && !lchunk) {
 		list_for_each(lchunk1, lqueue) {
-- 
1.5.2.4



[PATCH 08/13] SCTP: Use hashed lookup when looking for an association.

2007-11-07 Thread Vlad Yasevich
An SCTP endpoint may have a lot of associations on it, and walking
the list is fairly inefficient.  Instead, use a hashed lookup,
and filter the hash list based on the endpoint we already have.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/endpointola.c |   33 +
 1 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index 2d2d81e..f38fa0f 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -328,24 +328,33 @@ static struct sctp_association *__sctp_endpoint_lookup_assoc(
 	const union sctp_addr *paddr,
 	struct sctp_transport **transport)
 {
+	struct sctp_association *asoc = NULL;
+	struct sctp_transport *t = NULL;
+	struct sctp_hashbucket *head;
+	struct sctp_ep_common *epb;
+	int hash;
 	int rport;
-	struct sctp_association *asoc;
-	struct list_head *pos;
 
+	*transport = NULL;
 	rport = ntohs(paddr->v4.sin_port);
 
-	list_for_each(pos, &ep->asocs) {
-		asoc = list_entry(pos, struct sctp_association, asocs);
-		if (rport == asoc->peer.port) {
-			*transport = sctp_assoc_lookup_paddr(asoc, paddr);
-
-			if (*transport)
-				return asoc;
+	hash = sctp_assoc_hashfn(ep->base.bind_addr.port, rport);
+	head = &sctp_assoc_hashtable[hash];
+	read_lock(&head->lock);
+	for (epb = head->chain; epb; epb = epb->next) {
+		asoc = sctp_assoc(epb);
+		if (asoc->ep != ep || rport != asoc->peer.port)
+			goto next;
+
+		t = sctp_assoc_lookup_paddr(asoc, paddr);
+		if (t) {
+			*transport = t;
+			break;
 		}
+next:
+		asoc = NULL;
 	}
-
-	*transport = NULL;
-	return NULL;
+	return asoc;
 }
 
 /* Lookup association on an endpoint based on a peer address.  BH-safe.  */
-- 
1.5.2.4
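
The lookup strategy can be sketched in userspace as follows. This is an illustration only: the structures, `HASH_SIZE` and the XOR hash are hypothetical simplifications of the SCTP ones, and locking is omitted. The hash of the port pair picks a short chain, and the endpoint pointer filters out associations that merely share the bucket:

```c
#include <stddef.h>

#define HASH_SIZE 8	/* hypothetical table size, power of two */

struct endpoint {
	int local_port;
};

struct assoc {
	struct endpoint *ep;
	int peer_port;
	struct assoc *next;	/* hash chain link */
};

static struct assoc *assoc_table[HASH_SIZE];

/* Toy hash over the port pair; the real hash function differs. */
static unsigned assoc_hash(int lport, int rport)
{
	return ((unsigned)lport ^ (unsigned)rport) & (HASH_SIZE - 1);
}

static void assoc_insert(struct assoc *a)
{
	unsigned h = assoc_hash(a->ep->local_port, a->peer_port);

	a->next = assoc_table[h];
	assoc_table[h] = a;
}

/* Walk one hash chain and keep only entries owned by this endpoint. */
static struct assoc *endpoint_lookup_assoc(struct endpoint *ep, int rport)
{
	unsigned h = assoc_hash(ep->local_port, rport);

	for (struct assoc *a = assoc_table[h]; a; a = a->next)
		if (a->ep == ep && a->peer_port == rport)
			return a;
	return NULL;
}
```

The endpoint filter matters because several endpoints may be bound to the same local port, so their associations can hash to the same chain.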



[PATCH 09/13] SCTP: Convert custom hash lists to use hlist.

2007-11-07 Thread Vlad Yasevich
Convert the custom hash list traversals to use hlist functions.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/sctp.h|3 +++
 include/net/sctp/structs.h |   10 --
 net/sctp/endpointola.c |3 ++-
 net/sctp/input.c   |   43 +++
 net/sctp/proc.c|6 --
 net/sctp/protocol.c|6 +++---
 net/sctp/socket.c  |   14 +-
 7 files changed, 32 insertions(+), 53 deletions(-)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 7082730..67c997c 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -665,6 +665,9 @@ static inline int sctp_vtag_hashfn(__u16 lport, __u16 rport, __u32 vtag)
 	return (h & (sctp_assoc_hashsize-1));
 }
 
+#define sctp_for_each_hentry(epb, node, head) \
+   hlist_for_each_entry(epb, node, head, node)
+
 /* Is a socket of this style? */
 #define sctp_style(sk, style) __sctp_style((sk), (SCTP_SOCKET_##style))
 static inline int __sctp_style(const struct sock *sk, sctp_socket_type_t style)
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 44f2672..eb3113c 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -100,20 +100,19 @@ struct crypto_hash;
 struct sctp_bind_bucket {
unsigned short  port;
unsigned short  fastreuse;
-   struct sctp_bind_bucket *next;
-   struct sctp_bind_bucket **pprev;
+   struct hlist_node   node;
struct hlist_head   owner;
 };
 
 struct sctp_bind_hashbucket {
spinlock_t  lock;
-   struct sctp_bind_bucket *chain;
+   struct hlist_head   chain;
 };
 
 /* Used for hashing all associations.  */
 struct sctp_hashbucket {
rwlock_tlock;
-   struct sctp_ep_common  *chain;
+   struct hlist_head   chain;
 } __attribute__((__aligned__(8)));
 
 
@@ -1230,8 +1229,7 @@ typedef enum {
 
 struct sctp_ep_common {
/* Fields to help us manage our entries in the hash tables. */
-   struct sctp_ep_common *next;
-   struct sctp_ep_common **pprev;
+   struct hlist_node node;
int hashent;
 
/* Runtime type information.  What kind of endpoint is this? */
diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index f38fa0f..0bd3147 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -332,6 +332,7 @@ static struct sctp_association 
*__sctp_endpoint_lookup_assoc(
struct sctp_transport *t = NULL;
struct sctp_hashbucket *head;
struct sctp_ep_common *epb;
+   struct hlist_node *node;
int hash;
int rport;
 
@@ -341,7 +342,7 @@ static struct sctp_association *__sctp_endpoint_lookup_assoc(
 	hash = sctp_assoc_hashfn(ep->base.bind_addr.port, rport);
 	head = &sctp_assoc_hashtable[hash];
 	read_lock(&head->lock);
-	for (epb = head->chain; epb; epb = epb->next) {
+	sctp_for_each_hentry(epb, node, &head->chain) {
 		asoc = sctp_assoc(epb);
 		if (asoc->ep != ep || rport != asoc->peer.port)
 			goto next;
diff --git a/net/sctp/input.c b/net/sctp/input.c
index 86503e7..91ae463 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -656,7 +656,6 @@ discard:
 /* Insert endpoint into the hash table.  */
 static void __sctp_hash_endpoint(struct sctp_endpoint *ep)
 {
-   struct sctp_ep_common **epp;
struct sctp_ep_common *epb;
struct sctp_hashbucket *head;
 
@@ -666,12 +665,7 @@ static void __sctp_hash_endpoint(struct sctp_endpoint *ep)
head = &sctp_ep_hashtable[epb->hashent];
 
sctp_write_lock(&head->lock);
-   epp = &head->chain;
-   epb->next = *epp;
-   if (epb->next)
-   (*epp)->pprev = &epb->next;
-   *epp = epb;
-   epb->pprev = epp;
+   hlist_add_head(&epb->node, &head->chain);
sctp_write_unlock(&head->lock);
 }
 
@@ -691,19 +685,15 @@ static void __sctp_unhash_endpoint(struct sctp_endpoint *ep)
 
epb = &ep->base;
 
+   if (hlist_unhashed(&epb->node))
+   return;
+
epb->hashent = sctp_ep_hashfn(epb->bind_addr.port);
 
head = &sctp_ep_hashtable[epb->hashent];
 
sctp_write_lock(&head->lock);
-
-   if (epb->pprev) {
-   if (epb->next)
-   epb->next->pprev = epb->pprev;
-   *epb->pprev = epb->next;
-   epb->pprev = NULL;
-   }
-
+   __hlist_del(&epb->node);
sctp_write_unlock(&head->lock);
 }
 
@@ -721,12 +711,13 @@ static struct sctp_endpoint 
*__sctp_rcv_lookup_endpoint(const union sctp_addr *l
struct sctp_hashbucket *head;
struct sctp_ep_common *epb;
struct sctp_endpoint *ep;
+   struct hlist_node *node;
int hash;
 
hash = sctp_ep_hashfn(ntohs(laddr->v4.sin_port));
head = &sctp_ep_hashtable[hash];
read_lock(&head->lock);
-   for (epb = head->chain; epb; epb = epb->next) {
+   sctp_for_each_hentry(epb, 
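
For readers unfamiliar with the next/pprev idiom that the hunks above replace with hlist calls, here is a minimal user-space sketch of the same idea. It is not the kernel's list.h: the names hnode, hhead, hnode_add_head, hnode_del and hchain_len are made up for illustration. Note one deliberate difference: this sketch clears the node's pointers on delete so that a second delete is a no-op, which, as far as I know, __hlist_del itself does not do.

```c
#include <stddef.h>

/* Minimal user-space imitation of an hlist; not the kernel header. */
struct hnode {
	struct hnode *next;
	struct hnode **pprev;	/* address of the pointer that points at us */
};

struct hhead {
	struct hnode *first;
};

/* A node is unhashed when nothing points at it. */
static int hnode_unhashed(const struct hnode *n)
{
	return n->pprev == NULL;
}

/* Push n at the front of the chain. */
static void hnode_add_head(struct hnode *n, struct hhead *h)
{
	n->next = h->first;
	if (n->next)
		n->next->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n without walking the chain, then mark it unhashed. */
static void hnode_del(struct hnode *n)
{
	if (hnode_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* Count the nodes a for-each style walk would visit. */
static int hchain_len(const struct hhead *h)
{
	int len = 0;
	const struct hnode *n;

	for (n = h->first; n; n = n->next)
		len++;
	return len;
}
```

Because pprev points at whichever pointer currently refers to the node (the head's first field or the previous node's next field), removal needs neither the bucket head nor a list walk. That is the property the open-coded version in the removed lines relied on as well.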

Fw: 2.6.23.1-smp kernel panic (network-related)

2007-11-07 Thread Stephen Hemminger
What is the test input that causes the crash??

Forwarding message to proper group.

Begin forwarded message:

Date: Wed, 7 Nov 2007 17:52:11 +0100
From: Marek Kierdelewicz [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Newsgroups: gmane.linux.kernel
Subject: 2.6.23.1-smp  kernel panic (network-related)


Hi there,

My company's (ISP) business model requires dynamic resizing of the
client queues. This is achieved by regenerating the shaping rules and
loading them using the batch mode of the tc binary. On production
systems it's done once every 1 or 2 minutes. Unfortunately this causes
SMP kernels to panic. Non-SMP kernels don't have such problems. The bug
has been around for a long time. I first noticed it after migrating to
shaping configs that use IFB; it might have been the 2.6.18 era.

Test scenario:

I've put together a test machine with configuration copied from
production router. I'm feeding the machine with production traffic
by means of port mirroring. Test machine has the same config as
production one (including mac addresses), so it tries to route the
incoming traffic.

Tested kernels were 2.6.23.1 and 2.6.20.6 (config from 2.6.20.6 is in
attachment). Both panicked when compiled with SMP support and are stable
otherwise. The problem occurs only with cyclic shaping restarts. For the
test, the reload operation using tc -b ... was executed in an infinite
loop. 

The box's CPU usage was approximately 15%. Panics occur at intervals of
a few hours to one day.


Below I attach the panic message captured via serial console:
--
printk: 63 messages suppressed.
dst cache overflow
SMP
Modules linked in: ipt_LOG xt_hashlimit ipt_MASQUERADE ip_set_macipmap
ip_set_ipmap xt_state w83627hf hwmon_vid eeprom ifb ipt_SET ipt_set
ip_set ipip tunnel4 ip_gre e1000 i2c_i801 i2c_core CPU:1 EIP:
0060:[f255c08f]Not tainted VLI EFLAGS: 00010202   (2.6.23.1-smp
#2) EIP is at 0xf255c08f
eax: c196a000   ebx: 0100   ecx: ef2f408c   edx: f363
esi: f0332029   edi: f255c08c   ebp: 0001   esp: f3631ea8
ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
Process tc (pid: 27695, ti=f363 task=f26a9560 task.ti=f363)
Stack: c01280ad f26a9560 0001 f3631eb4 c0106f57 ef2f408c f0e8c08c
0031 c0495308 000a c0124eb6 0046  f7657740 f363
c0124f4c c180f120 c0114e62   f7bfd224 f74ed95c c01047e0
f7bfd224 Call Trace:
 [c01280ad] run_timer_softirq+0xf5/0x154
 [c0106f57] profile_pc+0x21/0x4a
 [c0124eb6] __do_softirq+0x5d/0xc1
 [c0124f4c] do_softirq+0x32/0x36
 [c0114e62] smp_apic_timer_interrupt+0x74/0x80
 [c01047e0] apic_timer_interrupt+0x28/0x30
 [c014c82e] remove_vma+0x1c/0x36
 [c014c912] exit_mmap+0xca/0xe1
 [c011eedc] mmput+0x1d/0x75
 [c0123688] do_exit+0x1be/0x68a
 [c0123bc0] sys_exit_group+0x0/0xd
 [c0103d12] sysenter_past_esp+0x5f/0x85
 ===
Code: 00 52 41 f7 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 fa 00 00 00 ea 05 00 00 7f 00 00 00 34 a5 96 c1 8c
00 fd f3 a5 2b 09 00 d1 d8 32 c0 00 c0 55 f2 00 a0 96 c1 EIP:
[f255c08f] 0xf255c08f SS:ESP 0068:f3631ea8 Kernel panic - not
syncing: Fatal exception in interrupt
--


-- 
Marek Kierdelewicz
Kierownik Działu Systemów Sieciowych, KoBa
Manager of Network Systems Department, KoBa
tel. (85) 7406466; fax. (85) 7406467
e-mail: [EMAIL PROTECTED]


-- 
Stephen Hemminger [EMAIL PROTECTED]


kernel_config
Description: Binary data


[PATCH 03/13] SCTP: Fix difference cases of retransmit.

2007-11-07 Thread Vlad Yasevich
Commit d0ce92910bc04e107b2f3f2048f07e94f570035d broke several retransmit
cases including fast retransmit.  The reason is that we should
only delay by rto while doing retransmits as a result of a timeout.
Retransmit as a result of path mtu discovery, fast retransmit, or
other events that should trigger immediate retransmissions got broken.

Also, since rto is doubled prior to marking of packets eligible for
retransmission, we never marked correct chunks anyway.

The fix is to provide a reason for a given retransmission so that we
can mark chunks appropriately, and to save the old rto value to do
comparisons against.
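
As a rough illustration of the marking rule this message describes, the sketch below reduces the eligibility test to a stand-alone function. It is only a sketch: the enum mirrors the reason names added by the patch, but rtx_reason, chunk_state and chunk_eligible are hypothetical names, and the rto-based delay check is left out.

```c
#include <stdbool.h>

/* Reason names follow the patch; the numeric values are arbitrary here. */
enum rtx_reason {
	RTXR_T3_RTX,
	RTXR_FAST_RTX,
	RTXR_PMTUD,
	RTXR_T1_RTX,
};

/* Hypothetical stand-in for the two chunk fields the test reads. */
struct chunk_state {
	int fast_retransmit;	/* > 0 when flagged for fast retransmit */
	bool tsn_gap_acked;	/* true once the peer has acked the chunk */
};

/* Return true when the chunk should go on the retransmit queue. */
static bool chunk_eligible(enum rtx_reason reason, const struct chunk_state *c)
{
	/* Fast retransmit only picks up chunks flagged for it. */
	if (reason == RTXR_FAST_RTX)
		return c->fast_retransmit > 0;
	/* Every other reason takes whatever is not yet acked. */
	return !c->tsn_gap_acked;
}
```

Passing the reason rather than a boolean lets the caller tell a timeout apart from PMTUD or fast retransmit, which is what the rto-based delay needs in order to apply to timeouts only.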

All regression tests passed with this code.

Spotted by Wei Yongjun [EMAIL PROTECTED]

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/command.h   |1 +
 include/net/sctp/constants.h |1 +
 include/net/sctp/sctp.h  |1 +
 include/net/sctp/structs.h   |5 +++--
 net/sctp/outqueue.c  |   33 +
 net/sctp/sm_sideeffect.c |   10 +-
 net/sctp/sm_statefuns.c  |2 +-
 net/sctp/transport.c |5 +++--
 8 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/include/net/sctp/command.h b/include/net/sctp/command.h
index b873336..c1f7976 100644
--- a/include/net/sctp/command.h
+++ b/include/net/sctp/command.h
@@ -103,6 +103,7 @@ typedef enum {
SCTP_CMD_ASSOC_CHANGE,   /* generate and send assoc_change event */
SCTP_CMD_ADAPTATION_IND, /* generate and send adaptation event */
SCTP_CMD_ASSOC_SHKEY,/* generate the association shared keys */
+   SCTP_CMD_T1_RETRAN,  /* Mark for retransmission after T1 timeout  */
SCTP_CMD_LAST
 } sctp_verb_t;
 
diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
index da8354e..73fbdf6 100644
--- a/include/net/sctp/constants.h
+++ b/include/net/sctp/constants.h
@@ -407,6 +407,7 @@ typedef enum {
SCTP_RTXR_T3_RTX,
SCTP_RTXR_FAST_RTX,
SCTP_RTXR_PMTUD,
+   SCTP_RTXR_T1_RTX,
 } sctp_retransmit_reason_t;
 
 /* Reasons to lower cwnd. */
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 93eb708..7082730 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -267,6 +267,7 @@ enum
SCTP_MIB_T5_SHUTDOWN_GUARD_EXPIREDS,
SCTP_MIB_DELAY_SACK_EXPIREDS,
SCTP_MIB_AUTOCLOSE_EXPIREDS,
+   SCTP_MIB_T1_RETRANSMITS,
SCTP_MIB_T3_RETRANSMITS,
SCTP_MIB_PMTUD_RETRANSMITS,
SCTP_MIB_FAST_RETRANSMITS,
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index ef892e0..482c2aa 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -873,10 +873,11 @@ struct sctp_transport {
 * address list derived from the INIT or INIT ACK chunk, a
 * number of data elements needs to be maintained including:
 */
-   __u32 rtt;  /* This is the most recent RTT.  */
-
/* RTO : The current retransmission timeout value.  */
unsigned long rto;
+   unsigned long last_rto;
+
+   __u32 rtt;  /* This is the most recent RTT.  */
 
/* RTTVAR  : The current RTT variation.  */
__u32 rttvar;
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index e315c6c..99a3db5 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -382,7 +382,7 @@ static void sctp_insert_list(struct list_head *head, struct list_head *new)
 /* Mark all the eligible packets on a transport for retransmission.  */
 void sctp_retransmit_mark(struct sctp_outq *q,
  struct sctp_transport *transport,
- __u8 fast_retransmit)
+ __u8 reason)
 {
struct list_head *lchunk, *ltemp;
struct sctp_chunk *chunk;
@@ -412,20 +412,20 @@ void sctp_retransmit_mark(struct sctp_outq *q,
continue;
}
 
-   /* If we are doing retransmission due to a fast retransmit,
-* only the chunk's that are marked for fast retransmit
-* should be added to the retransmit queue.  If we are doing
-* retransmission due to a timeout or pmtu discovery, only the
-* chunks that are not yet acked should be added to the
-* retransmit queue.
+   /* If we are doing  retransmission due to a timeout or pmtu
+* discovery, only the  chunks that are not yet acked should
+* be added to the retransmit queue.
 */
-   if ((fast_retransmit && (chunk->fast_retransmit > 0)) ||
-  (!fast_retransmit && !chunk->tsn_gap_acked)) {
+   if ((reason == SCTP_RTXR_FAST_RTX  &&
+   (chunk->fast_retransmit > 0)) ||
+   (reason != SCTP_RTXR_FAST_RTX  && !chunk->tsn_gap_acked)) {
/* If this chunk was sent less then 1 rto ago, do not
   

[PATCH 04/13] SCTP: Update RCU handling during the ADD-IP case

2007-11-07 Thread Vlad Yasevich
After learning more about rcu, it looks like the ADD-IP handling
doesn't need to call call_rcu_bh.  All the rcu critical sections
use rcu_read_lock, so using call_rcu_bh is wrong here.
Now, restore the local_bh_disable() code blocks and use normal
call_rcu() calls.  Also restore the missing return statement.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/structs.h |4 +---
 net/sctp/bind_addr.c   |   13 +++--
 net/sctp/sm_make_chunk.c   |6 +-
 net/sctp/socket.c  |2 +-
 4 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 482c2aa..a177017 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1185,9 +1185,7 @@ int sctp_bind_addr_copy(struct sctp_bind_addr *dest,
int flags);
 int sctp_add_bind_addr(struct sctp_bind_addr *, union sctp_addr *,
   __u8 use_as_src, gfp_t gfp);
-int sctp_del_bind_addr(struct sctp_bind_addr *, union sctp_addr *,
-   void fastcall (*rcu_call)(struct rcu_head *,
- void (*func)(struct rcu_head *)));
+int sctp_del_bind_addr(struct sctp_bind_addr *, union sctp_addr *);
 int sctp_bind_addr_match(struct sctp_bind_addr *, const union sctp_addr *,
 struct sctp_sock *);
 union sctp_addr *sctp_find_unmatch_addr(struct sctp_bind_addr  *bp,
diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index dfffa94..cae95af 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -180,9 +180,7 @@ int sctp_add_bind_addr(struct sctp_bind_addr *bp, union 
sctp_addr *new,
 /* Delete an address from the bind address list in the SCTP_bind_addr
  * structure.
  */
-int sctp_del_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *del_addr,
-   void fastcall (*rcu_call)(struct rcu_head *head,
-void (*func)(struct rcu_head *head)))
+int sctp_del_bind_addr(struct sctp_bind_addr *bp, union sctp_addr *del_addr)
 {
struct sctp_sockaddr_entry *addr, *temp;
 
@@ -198,15 +196,10 @@ int sctp_del_bind_addr(struct sctp_bind_addr *bp, union 
sctp_addr *del_addr,
}
}
 
-   /* Call the rcu callback provided in the args.  This function is
-* called by both BH packet processing and user side socket option
-* processing, but it works on different lists in those 2 contexts.
-* Each context provides it's own callback, whether call_rcu_bh()
-* or call_rcu(), to make sure that we wait for an appropriate time.
-*/
if (addr && !addr->valid) {
-   rcu_call(&addr->rcu, sctp_local_addr_free);
+   call_rcu(&addr->rcu, sctp_local_addr_free);
SCTP_DBG_OBJCNT_DEC(addr);
+   return 0;
}
 
return -EINVAL;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 5de4729..c60564d 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -2953,13 +2953,17 @@ static int sctp_asconf_param_success(struct sctp_association *asoc,
/* This is always done in BH context with a socket lock
 * held, so the list can not change.
 */
+   local_bh_disable();
list_for_each_entry(saddr, &bp->address_list, list) {
if (sctp_cmp_addr_exact(&saddr->a, &addr))
saddr->use_as_src = 1;
}
+   local_bh_enable();
break;
case SCTP_PARAM_DEL_IP:
-   retval = sctp_del_bind_addr(bp, &addr, call_rcu_bh);
+   local_bh_disable();
+   retval = sctp_del_bind_addr(bp, &addr);
+   local_bh_enable();
list_for_each(pos, &asoc->peer.transport_addr_list) {
transport = list_entry(pos, struct sctp_transport,
 transports);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index a7ecf31..6ce9b49 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -660,7 +660,7 @@ static int sctp_bindx_rem(struct sock *sk, struct sockaddr 
*addrs, int addrcnt)
 * socket routing and failover schemes. Refer to comments in
 * sctp_do_bind(). -daisy
 */
-   retval = sctp_del_bind_addr(bp, sa_addr, call_rcu);
+   retval = sctp_del_bind_addr(bp, sa_addr);
 
addr_buf += af->sockaddr_len;
 err_bindx_rem:
-- 
1.5.2.4

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 00/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
As further clarification, here is the US patent office
transaction history for the SRI application, which shows
that the application was rejected on 8/02/04:

http://portal.uspto.gov/external/portal/!ut/p/kcxml/04_Sj9SPykssy0xPLMnM
z0vM0Y_QjzKLN4gPMATJgFieAfqRqCLGpugijnABX4_83FT9IKBEpDlQxNDCRz8qJzU9MblS
P1jfWz9AvyA3NDSi3NsRAHxEBJg!/delta/base64xml/L0lJSk03dWlDU1lKSi9vQXd3QUF
NWWdBQ0VJUWhDRUVJaEZLQSEvNEZHZ2RZbktKMEZSb1hmckNIZGgvN18wXzE4TC81L3NhLmd
ldEJpYg!!?selectedTab=fileHistorytabisSubmitted=isSubmitteddosnum=0972
8253public_selectedSearchOption=

and here is the 12/01/04 IPR Status summary from KAME
stating the basis for including ISATAP in their product: 

http://www.kame.net/newsletter/20041201/

Fred
[EMAIL PROTECTED] 

 -Original Message-
 From: Templin, Fred L 
 Sent: Wednesday, November 07, 2007 6:42 AM
 To: David Stevens; Pekka Savola
 Cc: David Miller; netdev@vger.kernel.org; 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Subject: RE: [PATCH 00/05] ipv6: RFC4214 Support
 
 I think I can clear this up. The patent office rejected
 SRI's patent application, therefore there are no valid
 claims that could prevent ISATAP from being included
 in public domain software releases. Indeed, Microsoft,
 cisco, and FreeBSD/KAME are shipping ISATAP and have
 been doing so for a long time, and I believe there are
 also several others.
 
 Fred
 [EMAIL PROTECTED]
 
  -Original Message-
  From: David Stevens [mailto:[EMAIL PROTECTED] 
  Sent: Tuesday, November 06, 2007 11:54 PM
  To: Pekka Savola
  Cc: David Miller; Templin, Fred L; netdev@vger.kernel.org; 
  [EMAIL PROTECTED]; [EMAIL PROTECTED]
  Subject: Re: [PATCH 00/05] ipv6: RFC4214 Support
  
   give it away on this specific instance.  I'm not sure if you should
   attribute to hidden agendas what you can explain by doing the right
   thing (granted, very few companies do this which may make it suspect,
   but still..).
  
  Pekka,
  I'm not assuming hidden agendas here; I simply don't know what
  they mean by no license for implementers.  It doesn't say they
  relinquish *all* licensing, which would be clearer if that's what they
  mean. If implementers, distributors, and users are included, then
  who's left that does need licensing? If that answer really is nobody,
  then why bother with for implementers.?
  So, I don't think it's a hidden agenda, I think they said what
  they mean. I just don't know what they mean. :-)
  
  
 +-DLS
  
  


RE: [PATCH 04/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
 @@ -395,8 +451,6 @@ static int ipip6_rcv(struct sk_buff *skb
   }
  
   icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
 - kfree_skb(skb);
 - read_unlock(&ipip6_lock);
  out:
   return 0;
  }

Note that the above lines were incorrectly deleted.
This has been fixed and tested.

Fred
[EMAIL PROTECTED]

 -Original Message-
 From: Templin, Fred L 
 Sent: Tuesday, November 06, 2007 5:16 PM
 To: netdev@vger.kernel.org
 Subject: [PATCH 04/05] ipv6: RFC4214 Support
 
 From: Fred L. Templin [EMAIL PROTECTED]
 
 This is experimental support for the Intra-Site Automatic
 Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
 the SIT module, and is configured using the unmodified
 ip utility with device names beginning with: isatap.
 
 The following diffs are specific to the Linux 2.6.23
 kernel distribution.
 
 Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
 
 ---
 
 --- linux-2.6.23/net/ipv6/sit.c.orig  2007-10-09 13:31:38.0 -0700
 +++ linux-2.6.23/net/ipv6/sit.c   2007-11-06 15:32:27.0 -0800
 @@ -16,6 +16,7 @@
   *   Changes:
   * Roger Venning [EMAIL PROTECTED]:6to4 support
   * Nate Thompson [EMAIL PROTECTED]:6to4 support
 + * Fred L. Templin [EMAIL PROTECTED]:  isatap support
   */
  
  #include <linux/module.h>
 @@ -154,6 +155,14 @@ static struct ip_tunnel * ipip6_tunnel_l
   struct net_device *dev;
   char name[IFNAMSIZ];
  
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - router address in daddr */
 + if (!strncmp(parms->name, "isatap", 6)) {
 + parms->i_key = parms->iph.daddr;
 + parms->iph.daddr = remote = 0;
 + }
 +#endif
 +
   for (tp = __ipip6_bucket(parms); (t = *tp) != NULL; tp = &t->next) {
   if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr)
   return t;
 @@ -182,6 +191,11 @@ static struct ip_tunnel * ipip6_tunnel_l
   dev->init = ipip6_tunnel_init;
   nt->parms = *parms;
  
 +#if defined(CONFIG_IPV6_ISATAP)
 + if (!strncmp(dev->name, "isatap", 6))
 + dev->priv_flags |= IFF_ISATAP;
 +#endif
 +
   if (register_netdevice(dev) < 0) {
   free_netdev(dev);
   goto failed;
 @@ -382,6 +396,48 @@ static int ipip6_rcv(struct sk_buff *skb
   IPCB(skb)->flags = 0;
   skb->protocol = htons(ETH_P_IPV6);
   skb->pkt_type = PACKET_HOST;
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - check source address */
 + if (tunnel->dev->priv_flags & IFF_ISATAP) {
 + struct neighbour *neigh;
 + struct dst_entry *dst;
 + struct flowi fl;
 + struct in6_addr *addr6;
 + struct ipv6hdr *iph6;
 +
 + /* from ISATAP router */
 + if (iph->saddr == tunnel->parms.i_key) goto accept;
 +
 + iph6 = ipv6_hdr(skb);
 + addr6 = &iph6->saddr;
 +
 + /* from legitimate previous hop */
 + memset(&fl, 0, sizeof(fl));
 + fl.proto = iph6->nexthdr;
 + ipv6_addr_copy(&fl.fl6_dst, addr6);
 + fl.oif = tunnel->dev->ifindex;
 + security_skb_classify_flow(skb, &fl);
 +
 + if (!(dst = ip6_route_output(NULL, &fl)) ||
 +  (dst->dev != tunnel->dev) ||
 +  ((neigh = dst->neighbour) == NULL))
 + goto drop;
 +
 + addr6 = (struct in6_addr*)neigh->primary_key;
 +
 + if (!(ipv6_addr_is_isatap(addr6)) ||
 + (addr6->s6_addr32[3] != iph->saddr)) {
 +drop:
 + tunnel->stat.rx_errors++;
 + dst_release(dst);
 + kfree_skb(skb);
 + read_unlock(&ipip6_lock);
 + return 0;
 + }
 + dst_release(dst);
 + }
 +accept:
 +#endif
   tunnel->stat.rx_packets++;
   tunnel->stat.rx_bytes += skb->len;
   skb->dev = tunnel->dev;
 @@ -395,8 +451,6 @@ static int ipip6_rcv(struct sk_buff *skb
   }
  
   icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
 - kfree_skb(skb);
 - read_unlock(&ipip6_lock);
  out:
   return 0;
  }
 @@ -444,6 +498,31 @@ static int ipip6_tunnel_xmit(struct sk_b
   if (skb->protocol != htons(ETH_P_IPV6))
   goto tx_error;
  
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - must come before 6to4 */
 + if (dev->priv_flags & IFF_ISATAP) {
 + struct neighbour *neigh = NULL;
 +
 + if (skb->dst)
 + neigh = skb->dst->neighbour;
 +
 + if (neigh == NULL) {
 + if (net_ratelimit())
 + printk(KERN_DEBUG "sit: nexthop == NULL\n");
 + goto tx_error;
 + }
 +
 + addr6 = (struct 

[PATCH 1/2] [LIB]: Introduce struct pcounter

2007-11-07 Thread Arnaldo Carvalho de Melo
This just generalises what was introduced by Eric Dumazet for the struct proto
inuse field in 286ab3d46058840d68e5d7d52e316c1f7e98c59f:

[NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way.

Please look at the comment in there to see the rationale.
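
For anyone who does not have that comment at hand, the general idea can be sketched in user space as below. This is only an illustration under stated assumptions: NSLOTS stands in for the number of possible CPUs, the caller passes its CPU index explicitly, and slot_counter, slot_counter_add and slot_counter_getval are hypothetical names, not part of this patch.

```c
#include <stddef.h>

#define NSLOTS 4	/* hypothetical stand-in for the number of possible CPUs */

/* One slot per CPU; each writer touches only its own slot. */
struct slot_counter {
	int values[NSLOTS];
};

/* Add inc to the slot owned by the calling CPU (passed in explicitly here). */
static void slot_counter_add(struct slot_counter *c, size_t cpu, int inc)
{
	c->values[cpu % NSLOTS] += inc;
}

/* Reading is the slow path: sum every slot. */
static int slot_counter_getval(const struct slot_counter *c)
{
	int res = 0;
	size_t cpu;

	for (cpu = 0; cpu < NSLOTS; cpu++)
		res += c->values[cpu];
	return res;
}
```

Updates stay local to one slot, so the frequent add path shares no cache line with other CPUs, while the rare read pays for walking all slots. Individual slots may go negative; only the sum is meaningful.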

Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/linux/pcounter.h |  102 ++
 lib/Makefile |1 +
 lib/pcounter.c   |   26 
 3 files changed, 129 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/pcounter.h
 create mode 100644 lib/pcounter.c

diff --git a/include/linux/pcounter.h b/include/linux/pcounter.h
new file mode 100644
index 000..3d3891b
--- /dev/null
+++ b/include/linux/pcounter.h
@@ -0,0 +1,102 @@
+#ifndef __LINUX_PCOUNTER_H
+#define __LINUX_PCOUNTER_H
+
+struct pcounter {
+#ifdef CONFIG_SMP
+   void(*add)(struct pcounter *self, int inc);
+   int (*getval)(const struct pcounter *self);
+   int *per_cpu_values;
+#else
+   int val;
+#endif
+};
+
+/*
+ * Special macros to let pcounters use a fast version of {getvalue|add}
+ * using a static percpu variable per pcounter instead of an allocated one,
+ * saving one dereference.
+ * This might be changed if/when dynamic percpu vars become fast.
+ */
+#ifdef CONFIG_SMP
+#include <linux/cpumask.h>
+#include <linux/percpu.h>
+
+#define DEFINE_PCOUNTER(NAME)  \
+static DEFINE_PER_CPU(int, NAME##_pcounter_values);\
+static void NAME##_pcounter_add(struct pcounter *self, int inc)\
+{  \
+   __get_cpu_var(NAME##_pcounter_values) += inc;   \
+}  \
+   \
+static int NAME##_pcounter_getval(const struct pcounter *self) \
+{  \
+   int res = 0, cpu;   \
+   \
+   for_each_possible_cpu(cpu)  \
+   res += per_cpu(NAME##_pcounter_values, cpu);\
+   return res; \
+}
+   
+#define PCOUNTER_MEMBER_INITIALIZER(NAME, MEMBER)  \
+   MEMBER = {  \
+   .add= NAME##_pcounter_add,  \
+   .getval = NAME##_pcounter_getval,   \
+   }
+
+extern void pcounter_def_add(struct pcounter *self, int inc);
+extern int pcounter_def_getval(const struct pcounter *self);
+
+static inline int pcounter_alloc(struct pcounter *self)
+{
+   int rc = 0;
+   if (self->add == NULL) {
+   self->per_cpu_values = alloc_percpu(int);
+   if (self->per_cpu_values != NULL) {
+   self->add= pcounter_def_add;
+   self->getval = pcounter_def_getval;
+   } else
+   rc = 1;
+   }
+   return rc;
+}
+
+static inline void pcounter_free(struct pcounter *self)
+{
+   if (self->per_cpu_values != NULL) {
+   free_percpu(self->per_cpu_values);
+   self->per_cpu_values = NULL;
+   self->getval = NULL;
+   self->add = NULL;
+   }
+}
+
+static inline void pcounter_add(struct pcounter *self, int inc)
+{
+   self->add(self, inc);
+}
+
+static inline int pcounter_getval(const struct pcounter *self)
+{
+   return self->getval(self);
+}
+
+#else /* CONFIG_SMP */
+
+static inline void pcounter_add(struct pcounter *self, int inc)
+{
+   self->val += inc;
+}
+
+static inline int pcounter_getval(const struct pcounter *self)
+{
+   return self->val;
+}
+
+#define DEFINE_PCOUNTER(NAME)
+#define PCOUNTER_MEMBER_INITIALIZER(NAME, MEMBER)
+#define pcounter_alloc(self) 0
+#define pcounter_free(self)
+
+#endif /* CONFIG_SMP */
+
+#endif /* __LINUX_PCOUNTER_H */
diff --git a/lib/Makefile b/lib/Makefile
index 3a0983b..0fe94ec 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_TEXTSEARCH_KMP) += ts_kmp.o
 obj-$(CONFIG_TEXTSEARCH_BM) += ts_bm.o
 obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o
 obj-$(CONFIG_SMP) += percpu_counter.o
+obj-$(CONFIG_SMP) += pcounter.o
 obj-$(CONFIG_AUDIT_GENERIC) += audit.o
 
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
diff --git a/lib/pcounter.c b/lib/pcounter.c
new file mode 100644
index 000..e89880e
--- /dev/null
+++ b/lib/pcounter.c
@@ -0,0 +1,26 @@
+/*
+ * Define default pcounter functions 
+ * Note that often used pcounters use dedicated functions to get a speed increase.
+ * (see DEFINE_PCOUNTER/REF_PCOUNTER_MEMBER)
+ */
+
+#include <linux/module.h>
+#include <linux/pcounter.h>
+#include <linux/smp.h>
+

Re: [PATCH][PACKET] Remove unneeded packet_socks_nr variable

2007-11-07 Thread Pavel Emelyanov
Arnaldo Carvalho de Melo wrote:
 Em Wed, Nov 07, 2007 at 01:50:04PM -0200, Arnaldo Carvalho de Melo escreveu:
 Em Wed, Nov 07, 2007 at 06:32:51PM +0300, Pavel Emelyanov escreveu:
 This one is used only under ifdef PACKET_REFCNT_DEBUG in
 printk and is not needed otherwise. So hide all this stuff
 under the PACKET_REFCNT_DEBUG.

 Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]
 Look at sk_refcnt_debug_inc, etc and you'll see a more standard way. I
 forgot to make this when making all protocol families use sk_prot, even
 if just partially :-)
 
 As a bonus you'll get this information on /proc/net/protocols, removing
 '-1' from PACKET column for sockets.

Hm... I actually thought about this, but I decided that packet
sockets were not accounted in this way deliberately.

So, shall I break this compatibility (-1 in proc) and provide
a packet socket number in this file?

 - Arnaldo

Thanks,
Pavel


RE: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
Thanks for these, Ingo. Will fix and test immediately.

Fred
[EMAIL PROTECTED] 

 -Original Message-
 From: Ingo Oeser [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, November 07, 2007 7:59 AM
 To: Templin, Fred L
 Cc: netdev@vger.kernel.org
 Subject: Re: [PATCH 02/05] ipv6: RFC4214 Support
 
 Hi Fred,
 
 some comments.
 
 Templin, Fred L schrieb:
  From: Fred L. Templin [EMAIL PROTECTED]
  
  This is experimental support for the Intra-Site Automatic
  Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
  the SIT module, and is configured using the unmodified
  ip utility with device names beginning with: isatap.
  
  The following diffs are specific to the Linux 2.6.23
  kernel distribution.
  
  Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
  
  ---
  
  --- linux-2.6.23/include/net/addrconf.h.orig    2007-10-09 13:31:38.0 -0700
  +++ linux-2.6.23/include/net/addrconf.h 2007-10-26 10:49:40.0 -0700
  @@ -241,6 +241,34 @@ static inline int ipv6_addr_is_ll_all_ro
  addr->s6_addr32[3] == htonl(0x0002));
   }
   
  +#if defined(CONFIG_IPV6_ISATAP)
  +static inline int ipv6_isatap_eui64(u8 *eui, __be32 *addr)
 addr is only used for reading, not writing. No need to pass 
 it as a pointer.
 
  +{
  +   __be32 ipv4 = ntohl(*addr);
 
 ntohl(be32_value) != be32_value, so the __be32 annotation of ipv4
 is wrong here and sparse will scream.
 
  +
  +   eui[0] = 0;
  +
  +   /* Check for RFC3330 global address ranges */
  +   if (((ipv4 >= 0x0100) && (ipv4 < 0x0a00)) ||
  +   ((ipv4 >= 0x0b00) && (ipv4 < 0x7f00)) ||
  +   ((ipv4 >= 0x8000) && (ipv4 < 0xa9fe)) ||
  +   ((ipv4 >= 0xa9ff) && (ipv4 < 0xac10)) ||
  +   ((ipv4 >= 0xac20) && (ipv4 < 0xc0a8)) ||
  +   ((ipv4 >= 0xc0a9) && (ipv4 < 0xc612)) ||
  +   ((ipv4 >= 0xc614) && (ipv4 < 0xe000))) eui[0] |= 0x2;
  +
 
 Instead of converting network to host byte order at runtime 
 and comparing the results to constants, let the compiler convert
 the constants to network byte order and compare in network order.
 
 so use:
 
  if (((*addr >= htonl(0x0100)) && (*addr < htonl(0x0a00))) ||
 
 instead. The compiler will notice that 0x0100 is a constant and will
 use __constant_htonl() automatically.
 
  +   eui[1] = 0; eui[2] = 0x5E; eui[3] = 0xFE;
  +   memcpy (eui+4, addr, 4);
  +   return (0);
  +}
 
 Nitpick: 
   return is not a function. Please write return 0; instead.
 
  +
  +static inline int ipv6_addr_is_isatap(const struct in6_addr *addr)
  +{
  +   return (addr->s6_addr32[2] == __constant_htonl(0x02005EFE) ||
  +   addr->s6_addr32[2] == __constant_htonl(0x5EFE));
  +}
  +#endif
 
 The compiler will notice that 0x0100 is a constant and will
 use __constant_htonl() automatically. Please use simply htonl().
 
 
 Best Regards
 
 Ingo Oeser
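
One caveat on comparing in network byte order, sketched in user space below. Equality tests are unaffected by byte swapping, so they can be done against a swapped constant. Ordered comparisons (>=, <) are a different matter: on a little-endian host the swapped values do not sort the same way as the host-order ones, so the range checks in this sketch convert to host order first. The helper names is_addr_net and in_range_host are made up for illustration, and the constants used with them are ordinary full 32-bit host-order addresses, not the ones from the patch.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Equality survives byte swapping, so it can be tested in network order. */
static int is_addr_net(uint32_t addr_be, uint32_t host_const)
{
	return addr_be == htonl(host_const);
}

/* Range checks depend on numeric ordering, so convert to host order first. */
static int in_range_host(uint32_t addr_be, uint32_t lo, uint32_t hi)
{
	uint32_t v = ntohl(addr_be);

	return v >= lo && v < hi;
}
```

With this split, a check such as the ISATAP interface-identifier match fits the first helper, while the RFC3330 range tests fit the second.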
 


RE: [PATCH 00/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
I think I can clear this up. The patent office rejected
SRI's patent application, therefore there are no valid
claims that could prevent ISATAP from being included
in public domain software releases. Indeed, Microsoft,
cisco, and FreeBSD/KAME are shipping ISATAP and have
been doing so for a long time, and I believe there are
also several others.

Fred
[EMAIL PROTECTED]

 -Original Message-
 From: David Stevens [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, November 06, 2007 11:54 PM
 To: Pekka Savola
 Cc: David Miller; Templin, Fred L; netdev@vger.kernel.org; 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Subject: Re: [PATCH 00/05] ipv6: RFC4214 Support
 
  give it away on this specific instance.  I'm not sure if you should
  attribute to hidden agendas what you can explain by doing the right
  thing (granted, very few companies do this which may make it suspect,
  but still..).
 
 Pekka,
 I'm not assuming hidden agendas here; I simply don't know what
 they mean by no license for implementers.  It doesn't say they
 relinquish *all* licensing, which would be clearer if that's what they
 mean. If implementers, distributors, and users are included, then
 who's left that does need licensing? If that answer really is nobody,
 then why bother with for implementers.?
 So, I don't think it's a hidden agenda, I think they said what
 they mean. I just don't know what they mean. :-)
 
 +-DLS
 
 


Re: Please pull 'fixes-davem' branch of wireless-2.6

2007-11-07 Thread Michael Buesch
On Wednesday 07 November 2007 01:13:14 John W. Linville wrote:
 ssb: Fix initcall ordering changes a subsys_initcall to an
 fs_initcall.  This seems like a bit of a hack, but it fixes a real
 problem and I'm not sure what cleaner solution is either reasonable
 or available.  The comment in the patch explains the reasoning for this
 somewhat unique situation.

Well, ssb is not the only subsystem with this special requirement.
Grep for fs_initcall. In my opinion we need another initcall to fix
this issue. I think we need a post_subsys_initcall().
But that really is another issue that we can't decide in netdev.
For now, this fix is harmless and fixes the bug.

-- 
Greetings Michael.


Re: Endianness problem with u32 classifier hash masks

2007-11-07 Thread Radu Rendec
On Wed, 2007-11-07 at 01:22 -0800, David Miller wrote:
 I've grown impatient and done the work for you :-)  I've applied
 the patch below to my tree, thank you!
 
 If someone wants to send me the ffs() thing relative to this,
 I'd appreciate it.  Thanks again!

Thanks again for making the patch and applying. I've just tested it and
it works like a charm.

Now moving on to the ffs() thing. I will send the patch later if it
works.

Cheers,
Radu




[PATCH 14/24] [INET]: Give outer DSCP directly to ip*_copy_dscp

2007-11-07 Thread Herbert Xu
[INET]: Give outer DSCP directly to ip*_copy_dscp

This patch changes the prototype of ipv4_copy_dscp and ipv6_copy_dscp so
that they directly take the outer DSCP rather than the outer IP header.
This will help us to unify the code for inter-family tunnels.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/inet_ecn.h   |8 
 net/ipv4/xfrm4_mode_tunnel.c |2 +-
 net/ipv6/ip6_tunnel.c|2 +-
 net/ipv6/xfrm6_mode_tunnel.c |3 ++-
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/inet_ecn.h b/include/net/inet_ecn.h
index de8399a..ba33db0 100644
--- a/include/net/inet_ecn.h
+++ b/include/net/inet_ecn.h
@@ -83,9 +83,9 @@ static inline void IP_ECN_clear(struct iphdr *iph)
iph->tos &= ~INET_ECN_MASK;
 }
 
-static inline void ipv4_copy_dscp(struct iphdr *outer, struct iphdr *inner)
+static inline void ipv4_copy_dscp(unsigned int dscp, struct iphdr *inner)
 {
-   u32 dscp = ipv4_get_dsfield(outer) & ~INET_ECN_MASK;
+   dscp &= ~INET_ECN_MASK;
ipv4_change_dsfield(inner, INET_ECN_MASK, dscp);
 }
 
@@ -104,9 +104,9 @@ static inline void IP6_ECN_clear(struct ipv6hdr *iph)
*(__be32*)iph &= ~htonl(INET_ECN_MASK << 20);
 }
 
-static inline void ipv6_copy_dscp(struct ipv6hdr *outer, struct ipv6hdr *inner)
+static inline void ipv6_copy_dscp(unsigned int dscp, struct ipv6hdr *inner)
 {
-   u32 dscp = ipv6_get_dsfield(outer) & ~INET_ECN_MASK;
+   dscp &= ~INET_ECN_MASK;
ipv6_change_dsfield(inner, INET_ECN_MASK, dscp);
 }
 
diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c
index e4deecb..68a9f56 100644
--- a/net/ipv4/xfrm4_mode_tunnel.c
+++ b/net/ipv4/xfrm4_mode_tunnel.c
@@ -113,7 +113,7 @@ static int xfrm4_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
iph = ip_hdr(skb);
if (iph->protocol == IPPROTO_IPIP) {
if (x->props.flags & XFRM_STATE_DECAP_DSCP)
-   ipv4_copy_dscp(iph, ipip_hdr(skb));
+   ipv4_copy_dscp(ipv4_get_dsfield(iph), ipip_hdr(skb));
if (!(x->props.flags & XFRM_STATE_NOECN))
ipip_ecn_decapsulate(skb);
}
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 5383b33..a4051af 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -635,7 +635,7 @@ static void ip6ip6_dscp_ecn_decapsulate(struct ip6_tnl *t,
struct sk_buff *skb)
 {
if (t->parms.flags & IP6_TNL_F_RCV_DSCP_COPY)
-   ipv6_copy_dscp(ipv6h, ipv6_hdr(skb));
+   ipv6_copy_dscp(ipv6_get_dsfield(ipv6h), ipv6_hdr(skb));
 
if (INET_ECN_is_ce(ipv6_get_dsfield(ipv6h)))
IP6_ECN_set_ce(ipv6_hdr(skb));
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index fd84e22..9a43ea7 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -95,7 +95,8 @@ static int xfrm6_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
nh = skb_network_header(skb);
if (nh[IP6CB(skb)->nhoff] == IPPROTO_IPV6) {
if (x->props.flags & XFRM_STATE_DECAP_DSCP)
-   ipv6_copy_dscp(ipv6_hdr(skb), ipipv6_hdr(skb));
+   ipv6_copy_dscp(ipv6_get_dsfield(ipv6_hdr(skb)),
+  ipipv6_hdr(skb));
if (!(x->props.flags & XFRM_STATE_NOECN))
ipip6_ecn_decapsulate(skb);
} else {
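A minimal userspace sketch of what the new prototype buys, assuming the DS field is handled as a plain byte. `change_dsfield` and `copy_dscp` are illustrative names rather than the kernel helpers, and the header checksum update that ipv4_change_dsfield performs is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Low two bits of the DS field carry ECN (same value as INET_ECN_MASK). */
#define ECN_MASK 0x03u

/* Keep the bits of `field` selected by `mask`, then OR in `value`.
 * Assumption: this mirrors ipv4_change_dsfield() minus the checksum fixup. */
static uint8_t change_dsfield(uint8_t field, uint8_t mask, uint8_t value)
{
    return (uint8_t)((field & mask) | value);
}

/* New-style helper: takes the outer DS field value instead of a pointer to
 * the outer header, so the caller may read it from an IPv4 or IPv6 header. */
static uint8_t copy_dscp(unsigned int outer_dsfield, uint8_t inner_dsfield)
{
    unsigned int dscp;

    assert(outer_dsfield <= 0xffu);
    dscp = outer_dsfield & ~ECN_MASK;   /* drop the outer ECN bits */
    return change_dsfield(inner_dsfield, ECN_MASK, (uint8_t)dscp);
}
```

Because the helper no longer dereferences an outer header of a fixed family, the same inner-header routine can serve a tunnel whose outer and inner families differ.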


[PATCH 10/24] [IPSEC]: Move flow construction into xfrm_dst_lookup

2007-11-07 Thread Herbert Xu
[IPSEC]: Move flow construction into xfrm_dst_lookup

This patch moves the flow construction from the callers of xfrm_dst_lookup
into that function.  It also changes xfrm_dst_lookup so that it takes an
xfrm state as its argument instead of explicit addresses.

This removes any address-specific logic from the callers of xfrm_dst_lookup
which is needed to correctly support inter-family transforms.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/xfrm.h  |   10 +---
 net/ipv4/xfrm4_policy.c |   80 ++-
 net/ipv6/xfrm6_policy.c |   97 +---
 net/xfrm/xfrm_policy.c  |   25 +++-
 4 files changed, 91 insertions(+), 121 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index f96bae2..9b6af22 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -233,7 +233,8 @@ struct xfrm_policy_afinfo {
unsigned short  family;
struct dst_ops  *dst_ops;
void(*garbage_collect)(void);
-   int (*dst_lookup)(struct xfrm_dst **dst, struct 
flowi *fl);
+   struct dst_entry*(*dst_lookup)(int tos, xfrm_address_t *saddr,
+  xfrm_address_t *daddr);
int (*get_saddr)(xfrm_address_t *saddr, 
xfrm_address_t *daddr);
struct dst_entry*(*find_bundle)(struct flowi *fl, struct 
xfrm_policy *policy);
int (*bundle_create)(struct xfrm_policy *policy, 
@@ -1079,7 +1080,6 @@ extern int xfrm6_find_1stfragopt(struct xfrm_state *x, 
struct sk_buff *skb,
 #ifdef CONFIG_XFRM
 extern int xfrm4_udp_encap_rcv(struct sock *sk, struct sk_buff *skb);
 extern int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, 
int optlen);
-extern int xfrm_dst_lookup(struct xfrm_dst **dst, struct flowi *fl, unsigned 
short family);
 #else
 static inline int xfrm_user_policy(struct sock *sk, int optname, u8 __user 
*optval, int optlen)
 {
@@ -1092,13 +1092,9 @@ static inline int xfrm4_udp_encap_rcv(struct sock *sk, 
struct sk_buff *skb)
kfree_skb(skb);
return 0;
 }
-
-static inline int xfrm_dst_lookup(struct xfrm_dst **dst, struct flowi *fl, 
unsigned short family)
-{
-   return -EINVAL;
-} 
 #endif
 
+extern struct dst_entry *xfrm_dst_lookup(struct xfrm_state *x, int tos);
 struct xfrm_policy *xfrm_policy_alloc(gfp_t gfp);
 extern int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, 
int, void*), void *);
 int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl);
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index d903c8b..cebc847 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -8,7 +8,8 @@
  *
  */
 
-#include <linux/compiler.h>
+#include <linux/err.h>
+#include <linux/kernel.h>
 #include <linux/inetdevice.h>
 #include <net/dst.h>
 #include <net/xfrm.h>
@@ -17,28 +18,44 @@
 static struct dst_ops xfrm4_dst_ops;
 static struct xfrm_policy_afinfo xfrm4_policy_afinfo;
 
-static int xfrm4_dst_lookup(struct xfrm_dst **dst, struct flowi *fl)
+static struct dst_entry *xfrm4_dst_lookup(int tos, xfrm_address_t *saddr,
+ xfrm_address_t *daddr)
 {
-   return __ip_route_output_key((struct rtable**)dst, fl);
-}
-
-static int xfrm4_get_saddr(xfrm_address_t *saddr, xfrm_address_t *daddr)
-{
-   struct rtable *rt;
-   struct flowi fl_tunnel = {
+   struct flowi fl = {
.nl_u = {
.ip4_u = {
+   .tos = tos,
.daddr = daddr->a4,
},
},
};
+   struct dst_entry *dst;
+   struct rtable *rt;
+   int err;
 
-   if (!xfrm4_dst_lookup((struct xfrm_dst **)&rt, &fl_tunnel)) {
-   saddr->a4 = rt->rt_src;
-   dst_release(&rt->u.dst);
-   return 0;
-   }
-   return -EHOSTUNREACH;
+   if (saddr)
+   fl.fl4_src = saddr->a4;
+
+   err = __ip_route_output_key(&rt, &fl);
+   dst = &rt->u.dst;
+   if (err)
+   dst = ERR_PTR(err);
+   return dst;
+}
+
+static int xfrm4_get_saddr(xfrm_address_t *saddr, xfrm_address_t *daddr)
+{
+   struct dst_entry *dst;
+   struct rtable *rt;
+
+   dst = xfrm4_dst_lookup(0, NULL, daddr);
+   if (IS_ERR(dst))
+   return -EHOSTUNREACH;
+
+   rt = (struct rtable *)dst;
+   saddr->a4 = rt->rt_src;
+   dst_release(dst);
+   return 0;
 }
 
 static struct dst_entry *
@@ -73,15 +90,7 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
struct dst_entry *dst, *dst_prev;
struct rtable *rt0 = (struct rtable*)(*dst_p);
struct rtable *rt = rt0;
-   struct flowi fl_tunnel = {
-   .nl_u = {
-   .ip4_u = {
-   .saddr = 
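The patch above switches the lookup from "int status plus out-parameter" to a single pointer return that can also encode an error. A minimal userspace sketch of that convention follows; `err_ptr`, `ptr_err` and `is_err` are stand-ins written for this example (the kernel's ERR_PTR, PTR_ERR and IS_ERR live in linux/err.h), and `struct route`, `lookup` and the addresses are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins: small negative errno values are folded into the
 * top of the address range, where no valid object can live. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct route { unsigned int src; };   /* hypothetical stand-in for a dst entry */

/* Hypothetical lookup: exactly one destination is reachable. */
static struct route *lookup(unsigned int daddr)
{
    static struct route r = { 0x0a000001u };

    if (daddr != 0x0a000002u)
        return err_ptr(-EHOSTUNREACH);
    return &r;
}

/* Caller in the style of the reworked xfrm4_get_saddr(): a single
 * return value carries either the object or the error. */
static int get_saddr(unsigned int *saddr, unsigned int daddr)
{
    struct route *rt = lookup(daddr);

    assert(saddr != NULL);
    if (is_err(rt))
        return (int)ptr_err(rt);
    *saddr = rt->src;
    return 0;
}
```

One design consequence is that callers no longer need a cast such as `(struct xfrm_dst **)` to pass the output slot, which the removed code required.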

[PATCH 16/24] [IPSEC]: Separate inner/outer mode processing on input

2007-11-07 Thread Herbert Xu
[IPSEC]: Separate inner/outer mode processing on input

With inter-family transforms the inner mode differs from the outer mode.
Attempting to handle both sides from the same function means that it
needs to handle both IPv4 and IPv6 which creates duplication and confusion.

This patch separates the two parts on the input path so that each function
deals with one family only.

In particular, the functions xfrm4_extract_input/xfrm6_extract_input
move the pertinent fields from the IPv4/IPv6 IP headers into a neutral
format stored in skb->cb.  This is then used by the inner mode input
functions to modify the inner IP header.  In this way the input function
no longer has to know about the outer address family.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/xfrm.h   |   27 +
 net/ipv4/xfrm4_input.c   |7 +++-
 net/ipv4/xfrm4_mode_beet.c   |   67 +--
 net/ipv4/xfrm4_mode_tunnel.c |   44 ++--
 net/ipv4/xfrm4_state.c   |2 +
 net/ipv6/xfrm6_input.c   |7 +++-
 net/ipv6/xfrm6_mode_beet.c   |   36 +++
 net/ipv6/xfrm6_mode_tunnel.c |   31 +--
 net/ipv6/xfrm6_output.c  |1 
 net/ipv6/xfrm6_state.c   |5 ++-
 net/xfrm/xfrm_input.c|   13 
 11 files changed, 141 insertions(+), 99 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 9115baa..448a57f 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -258,6 +258,7 @@ extern int __xfrm_state_delete(struct xfrm_state *x);
 struct xfrm_state_afinfo {
unsigned intfamily;
unsigned intproto;
+   unsigned inteth_proto;
struct module   *owner;
struct xfrm_type*type_map[IPPROTO_MAX];
struct xfrm_mode*mode_map[XFRM_MODE_MAX];
@@ -268,6 +269,8 @@ struct xfrm_state_afinfo {
int (*tmpl_sort)(struct xfrm_tmpl **dst, struct 
xfrm_tmpl **src, int n);
int (*state_sort)(struct xfrm_state **dst, struct 
xfrm_state **src, int n);
int (*output)(struct sk_buff *skb);
+   int (*extract_input)(struct xfrm_state *x,
+struct sk_buff *skb);
int (*extract_output)(struct xfrm_state *x,
  struct sk_buff *skb);
 };
@@ -302,6 +305,27 @@ extern int xfrm_register_type(struct xfrm_type *type, 
unsigned short family);
 extern int xfrm_unregister_type(struct xfrm_type *type, unsigned short family);
 
 struct xfrm_mode {
+   /*
+* Remove encapsulation header.
+*
+* The IP header will be moved over the top of the encapsulation
+* header.
+*
+* On entry, the transport header shall point to where the IP header
+* should be and the network header shall be set to where the IP
+* header currently is.  skb->data shall point to the start of the
+* payload.
+*/
+   int (*input2)(struct xfrm_state *x, struct sk_buff *skb);
+
+   /*
+* This is the actual input entry point.
+*
+* For transport mode and equivalent this would be identical to
+* input2 (which does not need to be set).  While tunnel mode
+* and equivalent would set this to the tunnel encapsulation function
+* xfrm4_prepare_input that would in turn call input2.
+*/
int (*input)(struct xfrm_state *x, struct sk_buff *skb);
 
/*
@@ -1093,8 +1117,10 @@ extern void xfrm_replay_advance(struct xfrm_state *x, 
__be32 seq);
 extern void xfrm_replay_notify(struct xfrm_state *x, int event);
 extern int xfrm_state_mtu(struct xfrm_state *x, int mtu);
 extern int xfrm_init_state(struct xfrm_state *x);
+extern int xfrm_prepare_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm_output(struct sk_buff *skb);
 extern int xfrm4_extract_header(struct sk_buff *skb);
+extern int xfrm4_extract_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm4_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
   int encap_type);
 extern int xfrm4_rcv(struct sk_buff *skb);
@@ -1110,6 +1136,7 @@ extern int xfrm4_output(struct sk_buff *skb);
 extern int xfrm4_tunnel_register(struct xfrm_tunnel *handler, unsigned short 
family);
 extern int xfrm4_tunnel_deregister(struct xfrm_tunnel *handler, unsigned short 
family);
 extern int xfrm6_extract_header(struct sk_buff *skb);
+extern int xfrm6_extract_input(struct xfrm_state *x, struct sk_buff *skb);
 extern int xfrm6_rcv_spi(struct sk_buff *skb, int nexthdr, __be32 spi);
 extern int xfrm6_rcv(struct sk_buff *skb);
 extern int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index 5e95c8a..c0323d0 100644
--- 

[PATCH 20/24] [IPSEC]: Add async resume support on output

2007-11-07 Thread Herbert Xu
[IPSEC]: Add async resume support on output

This patch adds support for async resumptions on output.  To do so, the
transform would return -EINPROGRESS and subsequently invoke the function
xfrm_output_resume to resume processing.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/xfrm.h |1 
 net/xfrm/xfrm_output.c |   57 ++---
 2 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index e674a78..b6c26da 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1119,6 +1119,7 @@ extern void xfrm_replay_notify(struct xfrm_state *x, int 
event);
 extern int xfrm_state_mtu(struct xfrm_state *x, int mtu);
 extern int xfrm_init_state(struct xfrm_state *x);
 extern int xfrm_prepare_input(struct xfrm_state *x, struct sk_buff *skb);
+extern int xfrm_output_resume(struct sk_buff *skb, int err);
 extern int xfrm_output(struct sk_buff *skb);
 extern int xfrm4_extract_header(struct sk_buff *skb);
 extern int xfrm4_extract_input(struct xfrm_state *x, struct sk_buff *skb);
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index bcb3701..048d240 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -18,6 +18,8 @@
 #include <net/dst.h>
 #include <net/xfrm.h>
 
+static int xfrm_output2(struct sk_buff *skb);
+
 static int xfrm_state_check_space(struct xfrm_state *x, struct sk_buff *skb)
 {
struct dst_entry *dst = skb->dst;
@@ -41,17 +43,13 @@ err:
return err;
 }
 
-static int xfrm_output_one(struct sk_buff *skb)
+static int xfrm_output_one(struct sk_buff *skb, int err)
 {
struct dst_entry *dst = skb->dst;
struct xfrm_state *x = dst->xfrm;
-   int err;
 
-   if (skb->ip_summed == CHECKSUM_PARTIAL) {
-   err = skb_checksum_help(skb);
-   if (err)
-   goto error_nolock;
-   }
+   if (err <= 0)
+   goto resume;
 
do {
err = x->outer_mode->output(x, skb);
@@ -75,6 +73,8 @@ static int xfrm_output_one(struct sk_buff *skb)
spin_unlock_bh(&x->lock);
 
err = x->type->output(x, skb);
+
+resume:
if (err)
goto error_nolock;
 
@@ -97,18 +97,16 @@ error_nolock:
goto out_exit;
 }
 
-static int xfrm_output2(struct sk_buff *skb)
+int xfrm_output_resume(struct sk_buff *skb, int err)
 {
-   int err;
-
-   while (likely((err = xfrm_output_one(skb)) == 0)) {
+   while (likely((err = xfrm_output_one(skb, err)) == 0)) {
struct xfrm_state *x;
 
nf_reset(skb);
 
err = skb->dst->ops->local_out(skb);
if (unlikely(err != 1))
-   break;
+   goto out;
 
x = skb->dst->xfrm;
if (!x)
@@ -118,18 +116,25 @@ static int xfrm_output2(struct sk_buff *skb)
  x->inner_mode->afinfo->nf_post_routing, skb,
  NULL, skb->dst->dev, xfrm_output2);
if (unlikely(err != 1))
-   break;
+   goto out;
}
 
+   if (err == -EINPROGRESS)
+   err = 0;
+
+out:
return err;
 }
+EXPORT_SYMBOL_GPL(xfrm_output_resume);
 
-int xfrm_output(struct sk_buff *skb)
+static int xfrm_output2(struct sk_buff *skb)
 {
-   struct sk_buff *segs;
+   return xfrm_output_resume(skb, 1);
+}
 
-   if (!skb_is_gso(skb))
-   return xfrm_output2(skb);
+static int xfrm_output_gso(struct sk_buff *skb)
+{
+   struct sk_buff *segs;
 
segs = skb_gso_segment(skb, 0);
kfree_skb(skb);
@@ -157,4 +162,22 @@ int xfrm_output(struct sk_buff *skb)
 
return 0;
 }
+
+int xfrm_output(struct sk_buff *skb)
+{
+   int err;
+
+   if (skb_is_gso(skb))
+   return xfrm_output_gso(skb);
+
+   if (skb->ip_summed == CHECKSUM_PARTIAL) {
+   err = skb_checksum_help(skb);
+   if (err) {
+   kfree_skb(skb);
+   return err;
+   }
+   }
+
+   return xfrm_output2(skb);
+}
 EXPORT_SYMBOL_GPL(xfrm_output);
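A minimal userspace sketch of the resume convention used in the patch above, assuming a packet that must pass through a fixed number of transforms. `struct packet`, `run_step` and `NSTEPS` are hypothetical. Only the calling convention follows the patch: an `err` of 1 means "start the next transform", while `err <= 0` means "a transform has just completed with this status".

```c
#include <assert.h>
#include <errno.h>

#define NSTEPS 3

struct packet {
    int next_step;    /* index of the next transform to run */
    int async_mask;   /* bit i set: step i completes asynchronously */
    int done;         /* set once every step has finished */
};

/* Hypothetical transform: 0 when finished synchronously, -EINPROGRESS
 * when completion will be reported later through output_resume(). */
static int run_step(const struct packet *p)
{
    return ((p->async_mask >> p->next_step) & 1) ? -EINPROGRESS : 0;
}

static int output_one(struct packet *p, int err)
{
    if (err <= 0)
        goto resume;          /* re-entered after an async completion */

    err = run_step(p);

resume:
    if (err)
        return err;           /* real error, or -EINPROGRESS */
    p->next_step++;
    return 0;
}

static int output_resume(struct packet *p, int err)
{
    assert(p->next_step <= NSTEPS);
    while ((err = output_one(p, err)) == 0) {
        if (p->next_step == NSTEPS) {
            p->done = 1;
            return 0;
        }
        err = 1;              /* more work: start the next transform */
    }
    if (err == -EINPROGRESS)
        err = 0;              /* not a failure; the callback will resume */
    return err;
}
```

A synchronous caller starts with `output_resume(&p, 1)`; an asynchronous completion handler later calls `output_resume(&p, status)` with the transform's final status, and processing continues from the step after the one that went asynchronous.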


[PATCH 18/24] [IPV6]: Add ip6_local_out

2007-11-07 Thread Herbert Xu
[IPV6]: Add ip6_local_out

Most callers of the LOCAL_OUT chain will set the IP packet length
before doing so.  They also share the same output function dst_output.

This patch creates a new function called ip6_local_out which does all
of that and converts the appropriate users over to it.

Apart from removing duplicate code, it will also help in merging the
IPsec output path.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/ipv6.h   |7 +++
 net/ipv6/ip6_output.c|   35 +--
 net/ipv6/ip6_tunnel.c|4 +---
 net/ipv6/netfilter/ip6t_REJECT.c |4 +---
 net/ipv6/xfrm6_output.c  |7 +--
 5 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index ae328b6..5078605 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -509,6 +509,13 @@ extern int ip6_forward(struct sk_buff 
*skb);
 extern int ip6_input(struct sk_buff *skb);
 extern int ip6_mc_input(struct sk_buff *skb);
 
+extern int __ip6_local_out(struct sk_buff *skb);
+#ifdef CONFIG_NETFILTER
+extern int ip6_local_out(struct sk_buff *skb);
+#else
+#define ip6_local_out  __ip6_local_out
+#endif
+
 /*
  * Extension header (options) processing
  */
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 698d9d2..a0a366f 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -29,7 +29,7 @@
  */
 
 #include <linux/errno.h>
-#include <linux/types.h>
+#include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/socket.h>
 #include <linux/net.h>
@@ -70,6 +70,33 @@ static __inline__ void ipv6_select_ident(struct sk_buff *skb, struct frag_hdr *f
spin_unlock_bh(&ip6_id_lock);
 }
 
+int __ip6_local_out(struct sk_buff *skb)
+{
+   int len;
+
+   len = skb->len - sizeof(struct ipv6hdr);
+   if (len > IPV6_MAXPLEN)
+   len = 0;
+   ipv6_hdr(skb)->payload_len = htons(len);
+
+   return nf_hook(PF_INET6, NF_IP6_LOCAL_OUT, skb, NULL, skb->dst->dev,
+  dst_output);
+}
+
+#ifdef CONFIG_NETFILTER
+int ip6_local_out(struct sk_buff *skb)
+{
+   int err;
+
+   err = __ip6_local_out(skb);
+   if (likely(err == 1))
+   err = dst_output(skb);
+
+   return err;
+}
+#endif
+EXPORT_SYMBOL_GPL(ip6_local_out);
+
 static int ip6_output_finish(struct sk_buff *skb)
 {
struct dst_entry *dst = skb->dst;
@@ -1388,10 +1415,6 @@ int ip6_push_pending_frames(struct sock *sk)
*(__be32*)hdr = fl->fl6_flowlabel |
 htonl(0x60000000 | ((int)np->cork.tclass << 20));
 
-   if (skb->len <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN)
-   hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
-   else
-   hdr->payload_len = 0;
hdr->hop_limit = np->cork.hop_limit;
hdr->nexthdr = proto;
ipv6_addr_copy(&hdr->saddr, &fl->fl6_src);
@@ -1408,7 +1431,7 @@ int ip6_push_pending_frames(struct sock *sk)
ICMP6_INC_STATS_BH(idev, ICMP6_MIB_OUTMSGS);
}
 
-   err = NF_HOOK(PF_INET6, NF_IP6_LOCAL_OUT, skb, NULL, skb->dst->dev, dst_output);
+   err = ip6_local_out(skb);
if (err) {
if (err > 0)
err = np->recverr ? net_xmit_errno(err) : 0;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a4051af..29b5321 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -910,15 +910,13 @@ static int ip6_tnl_xmit2(struct sk_buff *skb,
*(__be32*)ipv6h = fl->fl6_flowlabel | htonl(0x60000000);
dsfield = INET_ECN_encapsulate(0, dsfield);
ipv6_change_dsfield(ipv6h, ~INET_ECN_MASK, dsfield);
-   ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
ipv6h->hop_limit = t->parms.hop_limit;
ipv6h->nexthdr = proto;
ipv6_addr_copy(&ipv6h->saddr, &fl->fl6_src);
ipv6_addr_copy(&ipv6h->daddr, &fl->fl6_dst);
nf_reset(skb);
pkt_len = skb->len;
-   err = NF_HOOK(PF_INET6, NF_IP6_LOCAL_OUT, skb, NULL,
- skb->dst->dev, dst_output);
+   err = ip6_local_out(skb);
 
if (net_xmit_eval(err) == 0) {
stats->tx_bytes += pkt_len;
diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c
index 1a7d291..c1c6634 100644
--- a/net/ipv6/netfilter/ip6t_REJECT.c
+++ b/net/ipv6/netfilter/ip6t_REJECT.c
@@ -121,7 +121,6 @@ static void send_reset(struct sk_buff *oldskb)
ip6h->version = 6;
ip6h->hop_limit = dst_metric(dst, RTAX_HOPLIMIT);
ip6h->nexthdr = IPPROTO_TCP;
-   ip6h->payload_len = htons(sizeof(struct tcphdr));
ipv6_addr_copy(&ip6h->saddr, &oip6h->daddr);
ipv6_addr_copy(&ip6h->daddr, &oip6h->saddr);
 
@@ -159,8 +158,7 @@ static void send_reset(struct sk_buff *oldskb)
 
nf_ct_attach(nskb, oldskb);
 
-   NF_HOOK(PF_INET6, NF_IP6_LOCAL_OUT, nskb, NULL, 
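The length handling that this patch centralises in __ip6_local_out can be sketched in userspace as below. This is an illustration under stated assumptions, not the kernel function: `ipv6_payload_len` is a hypothetical name, the packet is represented only by its total length, and `IPV6_HDR_LEN` stands for sizeof(struct ipv6hdr).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

#define IPV6_HDR_LEN 40u      /* size of the fixed IPv6 header */
#define IPV6_MAXPLEN 65535u   /* largest value the 16-bit field can hold */

/* Returns the payload length field, in network byte order, for a packet of
 * `total_len` bytes including the fixed header. A payload too large for the
 * 16-bit field is encoded as 0, as in the patch. */
static uint16_t ipv6_payload_len(size_t total_len)
{
    size_t len;

    assert(total_len >= IPV6_HDR_LEN);
    len = total_len - IPV6_HDR_LEN;
    if (len > IPV6_MAXPLEN)
        len = 0;
    return htons((uint16_t)len);
}
```

Doing this once in the common output helper is what lets the callers in the diff drop their own `payload_len` assignments.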

[PATCH 19/24] [IPSEC]: Merge most of the output path

2007-11-07 Thread Herbert Xu
[IPSEC]: Merge most of the output path

As part of the work on asynchronous cryptographic operations, we need to
be able to resume from the spot where they occur.  As such, it helps if
we isolate them to one spot.

This patch moves most of the remaining family-specific processing into
the common output code.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/dst.h   |1 
 include/net/xfrm.h  |1 
 net/ipv4/route.c|1 
 net/ipv4/xfrm4_output.c |   76 ++-
 net/ipv4/xfrm4_policy.c |1 
 net/ipv4/xfrm4_state.c  |2 +
 net/ipv6/route.c|1 
 net/ipv6/xfrm6_output.c |   77 
 net/ipv6/xfrm6_policy.c |1 
 net/ipv6/xfrm6_state.c  |2 +
 net/xfrm/xfrm_output.c  |   70 +--
 11 files changed, 88 insertions(+), 145 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 4df103b..13f7e32 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -98,6 +98,7 @@ struct dst_ops
struct dst_entry *  (*negative_advice)(struct dst_entry *);
void(*link_failure)(struct sk_buff *);
void(*update_pmtu)(struct dst_entry *dst, u32 mtu);
+   int (*local_out)(struct sk_buff *skb);
int entry_size;
 
atomic_tentries;
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 448a57f..e674a78 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -259,6 +259,7 @@ struct xfrm_state_afinfo {
unsigned intfamily;
unsigned intproto;
unsigned inteth_proto;
+   unsigned intnf_post_routing;
struct module   *owner;
struct xfrm_type*type_map[IPPROTO_MAX];
struct xfrm_mode*mode_map[XFRM_MODE_MAX];
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d5cbcee..db484a2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -166,6 +166,7 @@ static struct dst_ops ipv4_dst_ops = {
.negative_advice =  ipv4_negative_advice,
.link_failure = ipv4_link_failure,
.update_pmtu =  ip_rt_update_pmtu,
+   .local_out =ip_local_out,
.entry_size =   sizeof(struct rtable),
 };
 
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index 0ffc3d0..2fb4efa 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -59,7 +59,7 @@ int xfrm4_prepare_output(struct xfrm_state *x, struct sk_buff 
*skb)
return err;
 
memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
-   IPCB(skb)->flags |= IPSKB_XFRM_TUNNEL_SIZE;
+   IPCB(skb)->flags |= IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED;
 
skb->protocol = htons(ETH_P_IP);
 
@@ -67,87 +67,19 @@ int xfrm4_prepare_output(struct xfrm_state *x, struct 
sk_buff *skb)
 }
 EXPORT_SYMBOL(xfrm4_prepare_output);
 
-static inline int xfrm4_output_one(struct sk_buff *skb)
-{
-   int err;
-
-   err = xfrm_output(skb);
-   if (err)
-   goto error_nolock;
-
-   IPCB(skb)->flags |= IPSKB_XFRM_TRANSFORMED;
-   err = 0;
-
-out_exit:
-   return err;
-error_nolock:
-   kfree_skb(skb);
-   goto out_exit;
-}
-
-static int xfrm4_output_finish2(struct sk_buff *skb)
-{
-   int err;
-
-   while (likely((err = xfrm4_output_one(skb)) == 0)) {
-   nf_reset(skb);
-
-   err = __ip_local_out(skb);
-   if (unlikely(err != 1))
-   break;
-
-   if (!skb->dst->xfrm)
-   return dst_output(skb);
-
-   err = nf_hook(PF_INET, NF_IP_POST_ROUTING, skb, NULL,
- skb->dst->dev, xfrm4_output_finish2);
-   if (unlikely(err != 1))
-   break;
-   }
-
-   return err;
-}
-
 static int xfrm4_output_finish(struct sk_buff *skb)
 {
-   struct sk_buff *segs;
-
 #ifdef CONFIG_NETFILTER
if (!skb->dst->xfrm) {
IPCB(skb)->flags |= IPSKB_REROUTED;
return dst_output(skb);
}
-#endif
 
-   if (!skb_is_gso(skb))
-   return xfrm4_output_finish2(skb);
+   IPCB(skb)->flags |= IPSKB_XFRM_TRANSFORMED;
+#endif
 
skb->protocol = htons(ETH_P_IP);
-   segs = skb_gso_segment(skb, 0);
-   kfree_skb(skb);
-   if (unlikely(IS_ERR(segs)))
-   return PTR_ERR(segs);
-
-   do {
-   struct sk_buff *nskb = segs->next;
-   int err;
-
-   segs->next = NULL;
-   err = xfrm4_output_finish2(segs);
-
-   if (unlikely(err)) {
-   while ((segs = nskb)) {
-   nskb = segs->next;
-   segs->next = NULL;
-   kfree_skb(segs);
-   }
- 

[PATCH 24/24] [IPSEC]: Move state lock into x-type-input

2007-11-07 Thread Herbert Xu
[IPSEC]: Move state lock into x-type-input

This patch releases the lock on the state before calling x-type-input.
It also adds the lock to the spots where they're currently needed.

Most of those places (all except mip6) are expected to disappear with
async crypto.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/ah4.c|   14 ++
 net/ipv4/esp4.c   |   24 +++-
 net/ipv6/ah6.c|9 +++--
 net/ipv6/esp6.c   |   37 +++--
 net/ipv6/mip6.c   |   14 ++
 net/xfrm/xfrm_input.c |4 
 6 files changed, 69 insertions(+), 33 deletions(-)

diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index a989d29..d76803a 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -169,6 +169,8 @@ static int ah_input(struct xfrm_state *x, struct sk_buff *skb)
if (ip_clear_mutable_options(iph, &dummy))
goto out;
}
+
+   spin_lock(&x->lock);
{
u8 auth_data[MAX_AH_AUTH_LEN];
 
@@ -176,12 +178,16 @@ static int ah_input(struct xfrm_state *x, struct sk_buff *skb)
skb_push(skb, ihl);
err = ah_mac_digest(ahp, skb, ah->auth_data);
if (err)
-   goto out;
-   if (memcmp(ahp->work_icv, auth_data, ahp->icv_trunc_len)) {
+   goto unlock;
+   if (memcmp(ahp->work_icv, auth_data, ahp->icv_trunc_len))
err = -EBADMSG;
-   goto out;
-   }
}
+unlock:
+   spin_unlock(&x->lock);
+
+   if (err)
+   goto out;
+
skb->network_header += ah_hlen;
memcpy(skb_network_header(skb), work_buf, ihl);
skb->transport_header = skb->network_header;
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 7f1854c..de4592c 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -170,29 +170,31 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
if (elen <= 0 || (elen & (blksize-1)))
goto out;
 
+   if ((err = skb_cow_data(skb, 0, &trailer)) < 0)
+   goto out;
+   nfrags = err;
+
+   skb->ip_summed = CHECKSUM_NONE;
+
+   spin_lock(&x->lock);
+
/* If integrity check is required, do this. */
if (esp->auth.icv_full_len) {
u8 sum[alen];
 
err = esp_mac_digest(esp, skb, 0, skb->len - alen);
if (err)
-   goto out;
+   goto unlock;
 
if (skb_copy_bits(skb, skb->len - alen, sum, alen))
BUG();
 
if (unlikely(memcmp(esp->auth.work_icv, sum, alen))) {
err = -EBADMSG;
-   goto out;
+   goto unlock;
}
}
 
-   if ((err = skb_cow_data(skb, 0, &trailer)) < 0)
-   goto out;
-   nfrags = err;
-
-   skb->ip_summed = CHECKSUM_NONE;
-
esph = (struct ip_esp_hdr *)skb->data;
 
/* Get ivec. This can be wrong, check against another impls. */
@@ -205,7 +207,7 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
err = -ENOMEM;
sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC);
if (!sg)
-   goto out;
+   goto unlock;
}
sg_init_table(sg, nfrags);
skb_to_sgvec(skb, sg,
@@ -214,6 +216,10 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
err = crypto_blkcipher_decrypt(&desc, sg, sg, elen);
if (unlikely(sg != &esp->sgbuf[0]))
kfree(sg);
+
+unlock:
+   spin_unlock(&x->lock);
+
if (unlikely(err))
goto out;
 
diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index d4b59ec..1b51d1e 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -370,6 +370,7 @@ static int ah6_input(struct xfrm_state *x, struct sk_buff *skb)
ip6h->flow_lbl[2] = 0;
ip6h->hop_limit   = 0;
 
+   spin_lock(&x->lock);
{
u8 auth_data[MAX_AH_AUTH_LEN];
 
@@ -378,13 +379,17 @@ static int ah6_input(struct xfrm_state *x, struct sk_buff *skb)
skb_push(skb, hdr_len);
err = ah_mac_digest(ahp, skb, ah->auth_data);
if (err)
-   goto free_out;
+   goto unlock;
if (memcmp(ahp->work_icv, auth_data, ahp->icv_trunc_len)) {
LIMIT_NETDEBUG(KERN_WARNING "ipsec ah authentication error\n");
err = -EBADMSG;
-   goto free_out;
}
}
+unlock:
+   spin_unlock(&x->lock);
+
+   if (err)
+   goto free_out;
 
skb->network_header += ah_hlen;
memcpy(skb_network_header(skb), tmp_hdr, hdr_len);
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index c37982b..bb0e562 100644
--- a/net/ipv6/esp6.c

Re: Endianness problem with u32 classifier hash masks

2007-11-07 Thread Radu Rendec
On Wed, 2007-11-07 at 08:42 -0500, jamal wrote:
 On Wed, 2007-07-11 at 01:22 -0800, David Miller wrote:
 
  @@ -615,7 +615,7 @@ static int u32_change(struct tcf_proto *tp, unsigned long base, u32 handle,
  n->handle = handle;
   {
  u8 i = 0;
  -   u32 mask = s->hmask;
  +   u32 mask = ntohl(s->hmask);
 
 
 Is this line needed? Radu?

Yup. Without it, the number of bits to shift would be computed on the
network ordered mask. The shift in u32_hash_fold() is done on a host
ordered u32 (obtained from applying the mask on the packet data).

Shifting the host ordered u32 with number of bits obtained from network
ordered mask would most probably break things.

I've just compiled the kernel from a fresh clone of Dave's tree (u32
patch included) and I'm about to test it.

Dave, thanks a lot for adding the patch to your tree. But I guess you
didn't test it. I'll get back to you in max 1 hour and tell you the
results.

If everything goes well, then I'll move on to testing the ffs() patch.

Cheers,

Radu


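A minimal userspace sketch of the point made in the reply above: the shift count has to be derived from the mask in host byte order, because the fold shifts a host-ordered value. `hash_shift` and `hash_fold` are illustrative names rather than the classifier's functions, and the mask and key are assumed to be stored in network byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Number of trailing zero bits of the mask, counted in host byte order.
 * `hmask_net` is the mask as stored, in network byte order. */
static unsigned int hash_shift(uint32_t hmask_net)
{
    uint32_t mask = ntohl(hmask_net);
    unsigned int shift = 0;

    if (mask) {
        while (!(mask & 1)) {
            shift++;
            mask >>= 1;
        }
    }
    return shift;
}

/* Fold a packet word (network order) into a bucket index: apply the mask,
 * convert to host order, then shift by the host-order bit count. */
static unsigned int hash_fold(uint32_t key_net, uint32_t hmask_net)
{
    unsigned int shift = hash_shift(hmask_net);

    assert(shift < 32);
    return ntohl(key_net & hmask_net) >> shift;
}
```

With a mask of 0x0000ff00, counting trailing zeros on the unconverted value would give 16 on a little-endian host instead of 8, so the folded value would lose the selected byte entirely.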


[PATCH 4/24] [NET]: Eliminate duplicate copies of dst_discard

2007-11-07 Thread Herbert Xu
[NET]: Eliminate duplicate copies of dst_discard

We have a number of copies of dst_discard scattered around the place which
all do the same thing, namely free a packet on the input or output paths.

This patch deletes all of them except dst_discard and points all the users
to it.

The only non-trivial bit is decnet where it returns an error.  However,
conceptually this is identical to the blackhole functions used in IPv4
and IPv6 which do not return errors.  So they should either all return
errors or all return zero.  For now I've stuck with the majority and
picked zero as the return value.

It doesn't really matter in practice since few if any drivers would react
differently depending on a zero return value or NET_RX_DROP.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/dst.h |1 +
 net/core/dst.c|3 ++-
 net/decnet/dn_route.c |   13 +
 net/ipv4/route.c  |   11 +++
 net/ipv6/exthdrs.c|   13 ++---
 net/ipv6/route.c  |   21 -
 6 files changed, 13 insertions(+), 49 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 8f88c52..4df103b 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -172,6 +172,7 @@ static inline struct dst_entry *dst_pop(struct dst_entry 
*dst)
return child;
 }
 
+extern int dst_discard(struct sk_buff *skb);
 extern void * dst_alloc(struct dst_ops * ops);
 extern void __dst_free(struct dst_entry * dst);
 extern struct dst_entry *dst_destroy(struct dst_entry * dst);
diff --git a/net/core/dst.c b/net/core/dst.c
index 16958e6..21b1d0f 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -154,11 +154,12 @@ loop:
 #endif
 }
 
-static int dst_discard(struct sk_buff *skb)
+int dst_discard(struct sk_buff *skb)
 {
kfree_skb(skb);
return 0;
 }
+EXPORT_SYMBOL(dst_discard);
 
 void * dst_alloc(struct dst_ops * ops)
 {
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index 97eee5e..e1f6669 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -769,17 +769,6 @@ drop:
 }
 
 /*
- * Drop packet. This is used for endnodes and for
- * when we should not be forwarding packets from
- * this dest.
- */
-static int dn_blackhole(struct sk_buff *skb)
-{
-   kfree_skb(skb);
-   return NET_RX_DROP;
-}
-
-/*
  * Used to catch bugs. This should never normally get
  * called.
  */
@@ -1402,7 +1391,7 @@ make_route:
default:
case RTN_UNREACHABLE:
case RTN_BLACKHOLE:
-   rt->u.dst.input = dn_blackhole;
+   rt->u.dst.input = dst_discard;
}
rt->rt_flags = flags;
if (rt->u.dst.dev)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 21b12de..d5cbcee 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -92,6 +92,7 @@
 #include <linux/jhash.h>
 #include <linux/rcupdate.h>
 #include <linux/times.h>
+#include <net/dst.h>
 #include <net/net_namespace.h>
 #include <net/protocol.h>
 #include <net/ip.h>
@@ -2362,12 +2363,6 @@ static struct dst_ops ipv4_dst_blackhole_ops = {
 };
 
 
-static int ipv4_blackhole_output(struct sk_buff *skb)
-{
-   kfree_skb(skb);
-   return 0;
-}
-
 static int ipv4_dst_blackhole(struct rtable **rp, struct flowi *flp, struct 
sock *sk)
 {
struct rtable *ort = *rp;
@@ -2379,8 +2374,8 @@ static int ipv4_dst_blackhole(struct rtable **rp, struct 
flowi *flp, struct sock
 
atomic_set(&new->__refcnt, 1);
new->__use = 1;
-   new->input = ipv4_blackhole_output;
-   new->output = ipv4_blackhole_output;
+   new->input = dst_discard;
+   new->output = dst_discard;
memcpy(new->metrics, ort->u.dst.metrics, RTAX_MAX*sizeof(u32));
 
new->dev = ort->u.dst.dev;
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 1e89efd..cee06b1 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -32,6 +32,7 @@
 #include <linux/in6.h>
 #include <linux/icmpv6.h>
 
+#include <net/dst.h>
 #include <net/sock.h>
 #include <net/snmp.h>
 
@@ -318,18 +319,8 @@ void __init ipv6_destopt_init(void)
printk(KERN_ERR "ipv6_destopt_init: Could not register protocol\n");
 }
 
-/********************************
-  NONE header. No data in packet.
- ********************************/
-
-static int ipv6_nodata_rcv(struct sk_buff *skb)
-{
-   kfree_skb(skb);
-   return 0;
-}
-
 static struct inet6_protocol nodata_protocol = {
-   .handler=   ipv6_nodata_rcv,
+   .handler=   dst_discard,
.flags  =   INET6_PROTO_NOPOLICY,
 };
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 95f8e4a..7db3cdf 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -156,7 +156,6 @@ struct rt6_info ip6_null_entry = {
 
 static int ip6_pkt_prohibit(struct sk_buff *skb);
 static int ip6_pkt_prohibit_out(struct sk_buff *skb);
-static int ip6_pkt_blk_hole(struct sk_buff *skb);
 
 struct rt6_info ip6_prohibit_entry = {
 

[PATCH 9/24] [IPSEC]: Replace x->type->{local,remote}_addr with flags

2007-11-07 Thread Herbert Xu
[IPSEC]: Replace x->type->{local,remote}_addr with flags

The local_addr and remote_addr functions are more than is needed for
what they do.  The same thing can be done easily with flags on the type
object.  This patch does that and simplifies the wrapper functions in
xfrm6_policy accordingly.
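
To make the idea concrete, here is a minimal userspace sketch of choosing an address by testing a flag bit rather than calling through an optional function pointer. The two flag values are taken from the patch; struct demo_type, struct demo_state and the demo_addr_* helpers are reduced, hypothetical stand-ins for the xfrm structures and wrappers.

```c
#include <assert.h>
#include <stddef.h>

#define XFRM_TYPE_LOCAL_COADDR   4
#define XFRM_TYPE_REMOTE_COADDR  8

/* Hypothetical, heavily reduced versions of the kernel structures. */
typedef struct { unsigned int a6[4]; } xfrm_address_t;

struct demo_type {
	unsigned char flags;
};

struct demo_state {
	const struct demo_type *type;
	xfrm_address_t daddr;    /* stands in for x->id.daddr */
	xfrm_address_t saddr;    /* stands in for x->props.saddr */
	xfrm_address_t *coaddr;  /* care-of address */
};

/* Remote address: care-of address if the type asks for it, else daddr. */
static xfrm_address_t *demo_addr_remote(struct demo_state *x)
{
	assert(x != NULL && x->type != NULL);
	return (x->type->flags & XFRM_TYPE_REMOTE_COADDR) ? x->coaddr : &x->daddr;
}

/* Local address: care-of address if the type asks for it, else saddr. */
static xfrm_address_t *demo_addr_local(struct demo_state *x)
{
	assert(x != NULL && x->type != NULL);
	return (x->type->flags & XFRM_TYPE_LOCAL_COADDR) ? x->coaddr : &x->saddr;
}
```

The flag test replaces an indirect call plus two casts, and a type that needs neither behaviour simply leaves both bits clear.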

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/xfrm.h  |4 ++--
 net/ipv6/mip6.c |   11 ++-
 net/ipv6/xfrm6_policy.c |   20 
 3 files changed, 12 insertions(+), 23 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 58dfa82..f96bae2 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -282,6 +282,8 @@ struct xfrm_type
__u8flags;
 #define XFRM_TYPE_NON_FRAGMENT 1
 #define XFRM_TYPE_REPLAY_PROT  2
+#define XFRM_TYPE_LOCAL_COADDR 4
+#define XFRM_TYPE_REMOTE_COADDR8
 
int (*init_state)(struct xfrm_state *x);
void(*destructor)(struct xfrm_state *);
@@ -289,8 +291,6 @@ struct xfrm_type
int (*output)(struct xfrm_state *, struct sk_buff 
*pskb);
int (*reject)(struct xfrm_state *, struct sk_buff 
*, struct flowi *);
int (*hdr_offset)(struct xfrm_state *, struct 
sk_buff *, u8 **);
-   xfrm_address_t  *(*local_addr)(struct xfrm_state *, 
xfrm_address_t *);
-   xfrm_address_t  *(*remote_addr)(struct xfrm_state *, 
xfrm_address_t *);
/* Estimate maximal size of result of transformation of a dgram */
u32 (*get_mtu)(struct xfrm_state *, int size);
 };
diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index 7fd841d..edfd9cd 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -34,11 +34,6 @@
 #include <net/xfrm.h>
 #include <net/mip6.h>
 
-static xfrm_address_t *mip6_xfrm_addr(struct xfrm_state *x, xfrm_address_t *addr)
-{
-   return x->coaddr;
-}
-
 static inline unsigned int calc_padlen(unsigned int len, unsigned int n)
 {
return (n - len + 16) & 0x7;
@@ -337,14 +332,13 @@ static struct xfrm_type mip6_destopt_type =
.description= MIP6DESTOPT,
.owner  = THIS_MODULE,
.proto  = IPPROTO_DSTOPTS,
-   .flags  = XFRM_TYPE_NON_FRAGMENT,
+   .flags  = XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_LOCAL_COADDR,
.init_state = mip6_destopt_init_state,
.destructor = mip6_destopt_destroy,
.input  = mip6_destopt_input,
.output = mip6_destopt_output,
.reject = mip6_destopt_reject,
.hdr_offset = mip6_destopt_offset,
-   .local_addr = mip6_xfrm_addr,
 };
 
 static int mip6_rthdr_input(struct xfrm_state *x, struct sk_buff *skb)
@@ -467,13 +461,12 @@ static struct xfrm_type mip6_rthdr_type =
.description= MIP6RT,
.owner  = THIS_MODULE,
.proto  = IPPROTO_ROUTING,
-   .flags  = XFRM_TYPE_NON_FRAGMENT,
+   .flags  = XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_REMOTE_COADDR,
.init_state = mip6_rthdr_init_state,
.destructor = mip6_rthdr_destroy,
.input  = mip6_rthdr_input,
.output = mip6_rthdr_output,
.hdr_offset = mip6_rthdr_offset,
-   .remote_addr= mip6_xfrm_addr,
 };
 
 static int __init mip6_init(void)
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index f04e718..ed29bef 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -87,20 +87,16 @@ __xfrm6_find_bundle(struct flowi *fl, struct xfrm_policy 
*policy)
return dst;
 }
 
-static inline struct in6_addr*
-__xfrm6_bundle_addr_remote(struct xfrm_state *x, struct in6_addr *addr)
+static inline xfrm_address_t *__xfrm6_bundle_addr_remote(struct xfrm_state *x)
 {
-   return (x->type->remote_addr) ?
-   (struct in6_addr*)x->type->remote_addr(x, (xfrm_address_t *)addr) :
-   (struct in6_addr*)&x->id.daddr;
+   return (x->type->flags & XFRM_TYPE_REMOTE_COADDR) ? x->coaddr :
+   &x->id.daddr;
 }
 
-static inline struct in6_addr*
-__xfrm6_bundle_addr_local(struct xfrm_state *x, struct in6_addr *addr)
+static inline xfrm_address_t *__xfrm6_bundle_addr_local(struct xfrm_state *x)
 {
-   return (x->type->local_addr) ?
-   (struct in6_addr*)x->type->local_addr(x, (xfrm_address_t *)addr) :
-   (struct in6_addr*)&x->props.saddr;
+   return (x->type->flags & XFRM_TYPE_LOCAL_COADDR) ? x->coaddr :
+  &x->props.saddr;
 }
 
 /* Allocate chain of dst_entry's, attach known xfrm's, calculate
@@ -171,9 +167,9 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
fl_tunnel.fl4_src = xfrm[i]-props.saddr.a4;
break;
case 

[PATCH 6/24] [IPSEC]: Only set neighbour on top xfrm dst

2007-11-07 Thread Herbert Xu
[IPSEC]: Only set neighbour on top xfrm dst

The neighbour field is only used by dst_confirm which only ever happens on
the top-most xfrm dst.  So it's a waste to duplicate for every other xfrm
dst.  This patch moves its setting out of the loop so that only the top one
gets set.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/xfrm4_policy.c |5 +++--
 net/ipv6/xfrm6_policy.c |6 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 5ee3a2f..7d250a1 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -144,6 +144,9 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev-child = rt-u.dst;
dst-path = rt-u.dst;
 
+   /* Copy neighbout for reachability confirmation */
+   dst-neighbour = neigh_clone(rt-u.dst.neighbour);
+
*dst_p = dst;
dst = dst_prev;
 
@@ -164,8 +167,6 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev-trailer_len   = trailer_len;
memcpy(dst_prev-metrics, x-route-metrics, 
sizeof(dst_prev-metrics));
 
-   /* Copy neighbout for reachability confirmation */
-   dst_prev-neighbour = neigh_clone(rt-u.dst.neighbour);
dst_prev-input = rt-u.dst.input;
dst_prev-output = dst_prev-xfrm-outer_mode-afinfo-output;
if (rt0-peer)
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 9095dfc..15747f3 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -188,6 +188,10 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
 
dst_prev-child = rt-u.dst;
dst-path = rt-u.dst;
+
+   /* Copy neighbour for reachability confirmation */
+   dst-neighbour = neigh_clone(rt-u.dst.neighbour);
+
if (rt-rt6i_node)
((struct xfrm_dst *)dst)-path_cookie = 
rt-rt6i_node-fn_sernum;
 
@@ -210,8 +214,6 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev-trailer_len   = trailer_len;
memcpy(dst_prev-metrics, x-route-metrics, 
sizeof(dst_prev-metrics));
 
-   /* Copy neighbour for reachability confirmation */
-   dst_prev-neighbour = neigh_clone(rt-u.dst.neighbour);
dst_prev-input = rt-u.dst.input;
dst_prev-output = dst_prev-xfrm-outer_mode-afinfo-output;
/* Sheit... I remember I did this right. Apparently,


[PATCH 8/24] [IPSEC]: Make sure idev is consistent with dev in xfrm_dst

2007-11-07 Thread Herbert Xu
[IPSEC]: Make sure idev is consistent with dev in xfrm_dst

Previously we took the device from the bottom route and idev from the top
route.  This is bad because idev may well point to a different device.
This patch changes it so that we get the idev from the device directly.

It also makes it an error if either dev or idev is NULL.  This is consistent
with the rest of the routing code which also treats these cases as errors.

I've removed the err initialisation in xfrm6_policy.c because it achieves
no purpose and hid a bug when an initial version of this patch neglected
to set err to -ENODEV (fortunately the IPv4 version warned about it).
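
The error-handling shape described above can be sketched in plain C as follows. This is only an illustration under assumptions: struct demo_dev and demo_bundle_check() are hypothetical, the refcount bump stands in for dev_hold(), and the has_idev test stands in for in_dev_get() returning NULL. Unlike the real function, the sketch does not release anything on the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device record for this sketch. */
struct demo_dev {
	int refcnt;
	int has_idev;
};

/* Returns 0 on success, -ENODEV if any entry lacks a device or an idev. */
static int demo_bundle_check(struct demo_dev **devs, size_t n)
{
	int err;	/* no initialiser, so a path that forgets to set it draws a compiler warning */
	size_t i;

	assert(devs != NULL);

	err = -ENODEV;	/* set once, right before the loop that can fail with it */
	for (i = 0; i < n; i++) {
		if (!devs[i])
			goto error;
		devs[i]->refcnt++;		/* stands in for dev_hold() */
		if (!devs[i]->has_idev)		/* stands in for in_dev_get() failing */
			goto error;
	}
	err = 0;
error:
	return err;
}
```

Leaving err without an initialiser is the point made in the message: with "int err = 0" a forgotten assignment would silently report success, while an uninitialised variable gives the compiler a chance to warn.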

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/xfrm4_policy.c |   13 +
 net/ipv6/xfrm6_policy.c |   15 ++-
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index c40a71b..d903c8b 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -153,14 +153,21 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
 
dst_prev = *dst_p;
i = 0;
+   err = -ENODEV;
for (; dst_prev != &rt->u.dst; dst_prev = dst_prev->child) {
struct xfrm_dst *x = (struct xfrm_dst*)dst_prev;
x->u.rt.fl = *fl;
 
dst_prev->xfrm = xfrm[i++];
dst_prev->dev = rt->u.dst.dev;
-   if (rt->u.dst.dev)
-   dev_hold(rt->u.dst.dev);
+   if (!rt->u.dst.dev)
+   goto error;
+   dev_hold(rt->u.dst.dev);
+
+   x->u.rt.idev = in_dev_get(rt->u.dst.dev);
+   if (!x->u.rt.idev)
+   goto error;
+
dst_prev->obsolete  = -1;
dst_prev->flags|= DST_HOST;
dst_prev->lastuse   = jiffies;
@@ -181,8 +188,6 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
x-u.rt.rt_dst = rt0-rt_dst;
x-u.rt.rt_gateway = rt0-rt_gateway;
x-u.rt.rt_spec_dst = rt0-rt_spec_dst;
-   x-u.rt.idev = rt0-idev;
-   in_dev_hold(rt0-idev);
header_len -= x-u.dst.xfrm-props.header_len;
trailer_len -= x-u.dst.xfrm-props.trailer_len;
}
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index a1c6b7c..f04e718 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -123,7 +123,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
}
};
int i;
-   int err = 0;
+   int err;
int header_len = 0;
int trailer_len = 0;
 
@@ -201,13 +201,20 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
 
dst_prev = *dst_p;
i = 0;
+   err = -ENODEV;
for (; dst_prev != rt-u.dst; dst_prev = dst_prev-child) {
struct xfrm_dst *x = (struct xfrm_dst*)dst_prev;
 
dst_prev-xfrm = xfrm[i++];
dst_prev-dev = rt-u.dst.dev;
-   if (rt-u.dst.dev)
-   dev_hold(rt-u.dst.dev);
+   if (!rt-u.dst.dev)
+   goto error;
+   dev_hold(rt-u.dst.dev);
+
+   x-u.rt6.rt6i_idev = in6_dev_get(rt-u.dst.dev);
+   if (!x-u.rt6.rt6i_idev)
+   goto error;
+
dst_prev-obsolete  = -1;
dst_prev-flags|= DST_HOST;
dst_prev-lastuse   = jiffies;
@@ -226,8 +233,6 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
memcpy(x-u.rt6.rt6i_gateway, rt0-rt6i_gateway, 
sizeof(x-u.rt6.rt6i_gateway));
x-u.rt6.rt6i_dst  = rt0-rt6i_dst;
x-u.rt6.rt6i_src  = rt0-rt6i_src;
-   x-u.rt6.rt6i_idev = rt0-rt6i_idev;
-   in6_dev_hold(rt0-rt6i_idev);
header_len -= x-u.dst.xfrm-props.header_len;
trailer_len -= x-u.dst.xfrm-props.trailer_len;
}


[PATCH 3/24] [IPV6]: Move nfheader_len into rt6_info

2007-11-07 Thread Herbert Xu
[IPV6]: Move nfheader_len into rt6_info

The dst member nfheader_len is only used by IPv6.  It's also currently
creating a rather ugly alignment hole in struct dst.  Therefore this patch
moves it from there into struct rt6_info.

It also reorders the fields in rt6_info to minimize holes.
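
For readers who want to see what an alignment hole looks like, the sketch below measures the padding between two members with offsetof. The two structs are hypothetical excerpts modelled loosely on the fields visible in the diff, not the real struct dst_entry; on common ABIs (2-byte short, 4-byte aligned unsigned int) the first layout is expected to carry 2 bytes of padding before metrics and the second none, but the code only asserts the weaker, portable relation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical excerpt: three shorts followed by a 32-bit array. */
struct demo_dst_before {
	unsigned long  expires;
	unsigned short header_len;
	unsigned short nfheader_len;
	unsigned short trailer_len;
	unsigned int   metrics[4];
};

/* Same excerpt with nfheader_len moved elsewhere. */
struct demo_dst_after {
	unsigned long  expires;
	unsigned short header_len;
	unsigned short trailer_len;
	unsigned int   metrics[4];
};

/* Bytes of padding between the end of member 'prev' and the start of 'next'. */
#define DEMO_GAP(type, prev, next) \
	(offsetof(type, next) - \
	 (offsetof(type, prev) + sizeof(((type *)0)->prev)))

/* Removing the odd short must never make the layout worse. */
static void demo_check_layout(void)
{
	assert(DEMO_GAP(struct demo_dst_after, trailer_len, metrics) <=
	       DEMO_GAP(struct demo_dst_before, trailer_len, metrics));
	assert(sizeof(struct demo_dst_after) <= sizeof(struct demo_dst_before));
}
```

In the kernel tree a tool such as pahole gives the same information for the real structures without hand-written macros.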

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/dst.h   |1 -
 include/net/ip6_fib.h   |   11 ---
 net/ipv4/xfrm4_policy.c |1 -
 net/ipv6/ip6_output.c   |5 +++--
 net/ipv6/xfrm6_policy.c |3 ++-
 5 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index e9ff4a4..8f88c52 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -50,7 +50,6 @@ struct dst_entry
unsigned long   expires;
 
unsigned short  header_len; /* more space at head required 
*/
-   unsigned short  nfheader_len;   /* more non-fragment space at 
head required */
unsigned short  trailer_len;/* space to reserve at tail */
 
u32 metrics[RTAX_MAX];
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 8578213..4cefcff 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -99,16 +99,21 @@ struct rt6_info
u32 rt6i_flags;
u32 rt6i_metric;
atomic_trt6i_ref;
-   struct fib6_table   *rt6i_table;
 
-   struct rt6key   rt6i_dst;
-   struct rt6key   rt6i_src;
+   /* more non-fragment space at head required */
+   unsigned short  nfheader_len;
 
u8  rt6i_protocol;
 
+   struct fib6_table   *rt6i_table;
+
+   struct rt6key   rt6i_dst;
+
 #ifdef CONFIG_XFRM
u32 rt6i_flow_cache_genid;
 #endif
+
+   struct rt6key   rt6i_src;
 };
 
 static inline struct inet6_dev *ip6_dst_idev(struct dst_entry *dst)
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index cc86fb1..5ee3a2f 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -161,7 +161,6 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev-flags|= DST_HOST;
dst_prev-lastuse   = jiffies;
dst_prev-header_len= header_len;
-   dst_prev-nfheader_len  = 0;
dst_prev-trailer_len   = trailer_len;
memcpy(dst_prev-metrics, x-route-metrics, 
sizeof(dst_prev-metrics));
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6289dfc..698d9d2 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1097,7 +1097,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void 
*from, char *to,
sk-sk_sndmsg_page = NULL;
sk-sk_sndmsg_off = 0;
exthdrlen = rt-u.dst.header_len + (opt ? opt-opt_flen : 0) -
-   rt-u.dst.nfheader_len;
+   rt-nfheader_len;
length += exthdrlen;
transhdrlen += exthdrlen;
} else {
@@ -1112,7 +1112,8 @@ int ip6_append_data(struct sock *sk, int getfrag(void 
*from, char *to,
 
hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
 
-   fragheaderlen = sizeof(struct ipv6hdr) + rt->u.dst.nfheader_len + (opt ? opt->opt_nflen : 0);
+   fragheaderlen = sizeof(struct ipv6hdr) + rt->nfheader_len +
+   (opt ? opt->opt_nflen : 0);
maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen - sizeof(struct frag_hdr);
 
if (mtu <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN) {
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index e5170bc..9095dfc 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -157,7 +157,8 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev = dst1;
 
if (xfrm[i]-type-flags  XFRM_TYPE_NON_FRAGMENT)
-   dst-nfheader_len += xfrm[i]-props.header_len;
+   ((struct rt6_info *)dst)-nfheader_len +=
+   xfrm[i]-props.header_len;
header_len += xfrm[i]-props.header_len;
trailer_len += xfrm[i]-props.trailer_len;
 


[0/24] Merge IPv4/IPv6 IPsec bundle creation and input/output

2007-11-07 Thread Herbert Xu
Hi Dave:

Here's a dump of what I've currently got in my IPsec tree.
It contains all the patches I've posted previously which are
yet to be merged.

The first 11 patches are unchanged from the previous posting.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


RE: [PATCH 04/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
 

 -Original Message-
 From: Stephen Hemminger [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, November 07, 2007 10:45 AM
 To: Templin, Fred L
 Cc: YOSHIFUJI Hideaki / 吉藤英明; netdev@vger.kernel.org
 Subject: Re: [PATCH 04/05] ipv6: RFC4214 Support
 
 On Wed, 7 Nov 2007 10:41:49 -0800
 Templin, Fred L [EMAIL PROTECTED] wrote:
 
  Yoshifuji, 
  
   -Original Message-
   From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] 
   Sent: Wednesday, November 07, 2007 10:37 AM
   To: Templin, Fred L
   Cc: netdev@vger.kernel.org; [EMAIL PROTECTED]
   Subject: Re: [PATCH 04/05] ipv6: RFC4214 Support
   
   Hello.
   
   In article 
   [EMAIL PROTECTED]
   eing.com (at Tue, 6 Nov 2007 17:16:11 -0800), Templin, Fred 
   L [EMAIL PROTECTED] says:
   
@@ -154,6 +155,14 @@ static struct ip_tunnel * ipip6_tunnel_l
struct net_device *dev;
char name[IFNAMSIZ];
 
+#if defined(CONFIG_IPV6_ISATAP)
+   /* ISATAP (RFC4214) - router address in daddr */
+   if (!strncmp(parms->name, "isatap", 6)) {
+   parms->i_key = parms->iph.daddr;
+   parms->iph.daddr = remote = 0;
+   }
+#endif
+
for (tp = __ipip6_bucket(parms); (t = *tp) != NULL; tp = &t->next) {
if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr)
return t;
   
   I do not think it is a good idea to change the behavior based on
   the interface name.
  
  The goal was to avoid requiring changes to applications such as
  'iproute2', i.e., the intention was for a standalone code 
 insertion point
  within the kernel itself. What do you suggest?
 
 Agreed, magic names are evil.
 
 Change iproute2 utilities, if it is more logical for administration.

This being an experimental release, I would prefer to go
forward with a standalone kernel solution for the first
iteration then come back with the iproute2 changes at a
later time. IMHO, we should only touch iproute2 once, and
it should be an architected solution - not just a quick
hack. For the short term, timeliness of interoperability testing
with the other major OS's should be the highest priority, IMHO.

Other opinions?

Fred
[EMAIL PROTECTED]


Re: Please pull 'fixes-jgarzik' branch of wireless-2.6

2007-11-07 Thread John W. Linville
Jeff,

If you haven't already pulled this then please hold off.  I'll post
a new request soon.

Thanks,

John

On Tue, Nov 06, 2007 at 03:07:00PM -0500, John W. Linville wrote:
 Jeff,
 
 Here are a few fixes for 2.6.24.  The iwlwifi is_power_of_2 patch is
 a little questionable as a fix.  But it does bring the buildtime check
 in iwl_tx_queue_init in-line with the runtime check in iwl_queue_init,
 and it is 2x a one-liner -- so I think it is worthwhile.
 
 Thanks,
 
 John
 
 ---
 
 Individual patches available here:
 
   
 http://www.kernel.org/pub//linux/kernel/people/linville/wireless-2.6/fixes-jgarzik/
 
 ---
 
 The following changes since commit 2655e2cee2d77459fcb7e10228259e4ee0328697:
   Alan Cox (1):
 ata_piix: Add additional PCI identifier for 40 wire short cable
 
 are available in the git repository at:
 
   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git 
 fixes-jgarzik
 
 Holger Schurig (1):
   libertas: fixes for slow hardware
 
 Ivo van Doorn (1):
   rt2x00: Block adhoc & master mode
 
 John W. Linville (1):
   hermes: clarify Intel reference in Kconfig help
 
 Marcelo Tosatti (1):
   libertas: properly account for queue commands
 
 Michael Buesch (1):
   b43: pcmcia-host initialization bugfixes
 
 Pierre Ossman (1):
   libertas: make if_sdio align packets
 
 Randy Dunlap (1):
   hostap: fix section mismatch warning
 
 Robert P. J. Day (1):
   iwlwifi: Use more obvious is_power_of_2 macro.
 
 Roel Kluin (1):
   ipw2100: fix postfix decrement errors
 
  drivers/net/wireless/Kconfig|2 +-
  drivers/net/wireless/b43/pcmcia.c   |   44 +++---
  drivers/net/wireless/hostap/hostap_pci.c|6 ++--
  drivers/net/wireless/ipw2100.c  |4 +-
  drivers/net/wireless/iwlwifi/iwl3945-base.c |3 +-
  drivers/net/wireless/iwlwifi/iwl4965-base.c |3 +-
  drivers/net/wireless/libertas/cmd.c |   10 --
  drivers/net/wireless/libertas/if_cs.c   |7 +++-
  drivers/net/wireless/libertas/if_sdio.c |4 ++-
  drivers/net/wireless/rt2x00/rt2x00mac.c |8 +
  10 files changed, 58 insertions(+), 33 deletions(-)
-- 
John W. Linville
[EMAIL PROTECTED]


Re: Endianness problem with u32 classifier hash masks

2007-11-07 Thread jamal
On Wed, 2007-07-11 at 01:22 -0800, David Miller wrote:

 @@ -615,7 +615,7 @@ static int u32_change(struct tcf_proto *tp, unsigned long 
 base, u32 handle,
   n->handle = handle;
  {
   u8 i = 0;
 - u32 mask = s->hmask;
 + u32 mask = ntohl(s->hmask);


Is this line needed? Radu?

cheers,
jamal
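
For context on why the byte order matters in the quoted line, here is a small userspace sketch. It assumes, as the ntohl() in the hunk suggests, that the hash mask is kept in network byte order; demo_mask_shift() and demo_fshift() are hypothetical names. The shift is the number of zero bits below the lowest set bit, and that count only comes out the same on every machine if the mask is converted to host order first.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>

/* Count the zero bits below the lowest set bit of a host-order mask. */
static unsigned int demo_mask_shift(uint32_t host_mask)
{
	unsigned int shift = 0;

	assert(host_mask != 0);
	while (!(host_mask & 1)) {
		host_mask >>= 1;
		shift++;
	}
	return shift;
}

/* Assumption: the mask arrives in network byte order. */
static unsigned int demo_fshift(uint32_t net_hmask)
{
	return demo_mask_shift(ntohl(net_hmask));
}
```

Without the conversion, a little-endian host would count the zeros of the byte-swapped value and land on a different shift than a big-endian host given the same configuration.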





[PATCH] netns: init dev_base_lock only once

2007-11-07 Thread Alexey Dobriyan
* it is already statically initialized
* reinitializing a live global spinlock every time a netns is
  set up is also wrong

Signed-off-by: Alexey Dobriyan [EMAIL PROTECTED]
---

 net/core/dev.c |1 -
 1 file changed, 1 deletion(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4330,7 +4330,6 @@ static struct hlist_head *netdev_create_hash(void)
 static int __net_init netdev_init(struct net *net)
 {
INIT_LIST_HEAD(&net->dev_base_head);
-   rwlock_init(&dev_base_lock);
 
net->dev_name_head = netdev_create_hash();
if (net->dev_name_head == NULL)



[PATCH 1/1] [INET]: Remove leftover prototypes from include/net/inet_common.h

2007-11-07 Thread Arnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 include/net/inet_common.h |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index 227adcb..38d5a1e 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -13,9 +13,6 @@ struct sock;
 struct sockaddr;
 struct socket;
 
-extern voidinet_remove_sock(struct sock *sk1);
-extern voidinet_put_sock(unsigned short num, 
- struct sock *sk);
 extern int inet_release(struct socket *sock);
 extern int inet_stream_connect(struct socket *sock,
struct sockaddr * uaddr,
@@ -30,7 +27,6 @@ extern intinet_sendmsg(struct kiocb *iocb,
 struct msghdr *msg, 
 size_t size);
 extern int inet_shutdown(struct socket *sock, int how);
-extern unsigned intinet_poll(struct file * file, struct socket 
*sock, struct poll_table_struct *wait);
 extern int inet_listen(struct socket *sock, int backlog);
 
 extern voidinet_sock_destruct(struct sock *sk);
-- 
1.5.3.4



Re: Please pull 'fixes-jgarzik' branch of wireless-2.6

2007-11-07 Thread Jeff Garzik
On Wed, Nov 07, 2007 at 02:13:29PM -0500, John W. Linville wrote:
 Jeff,
 
 If you haven't already pulled this then please hold-off.  I'll post
 a new request soon.

Haven't pulled yet...

Jeff





Re: [PATCH 04/05] ipv6: RFC4214 Support

2007-11-07 Thread Stephen Hemminger
On Wed, 7 Nov 2007 10:41:49 -0800
Templin, Fred L [EMAIL PROTECTED] wrote:

 Yoshifuji, 
 
  -Original Message-
  From: YOSHIFUJI Hideaki / 吉藤英明 [mailto:[EMAIL PROTECTED] 
  Sent: Wednesday, November 07, 2007 10:37 AM
  To: Templin, Fred L
  Cc: netdev@vger.kernel.org; [EMAIL PROTECTED]
  Subject: Re: [PATCH 04/05] ipv6: RFC4214 Support
  
  Hello.
  
  In article 
  [EMAIL PROTECTED]
  eing.com (at Tue, 6 Nov 2007 17:16:11 -0800), Templin, Fred 
  L [EMAIL PROTECTED] says:
  
   @@ -154,6 +155,14 @@ static struct ip_tunnel * ipip6_tunnel_l
 struct net_device *dev;
 char name[IFNAMSIZ];

   +#if defined(CONFIG_IPV6_ISATAP)
   + /* ISATAP (RFC4214) - router address in daddr */
   + if (!strncmp(parms->name, "isatap", 6)) {
   + parms->i_key = parms->iph.daddr;
   + parms->iph.daddr = remote = 0;
   + }
   +#endif
   +
 for (tp = __ipip6_bucket(parms); (t = *tp) != NULL; tp = &t->next) {
 if (local == t->parms.iph.saddr && remote == t->parms.iph.daddr)
 return t;
  
  I do not think it is a good idea to change the behavior based on
  the interface name.
 
 The goal was to avoid requiring changes to applications such as
 'iproute2', i.e., the intention was for a standalone code insertion point
 within the kernel itself. What do you suggest?

Agreed, magic names are evil.

Change iproute2 utilities, if it is more logical for administration.


-- 
Stephen Hemminger [EMAIL PROTECTED]
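
To illustrate the objection to magic names, the following sketch contrasts a name-prefix test with an explicit flag. Everything in it is hypothetical: DEMO_IFF_ISATAP, struct demo_tunnel_parms and both helpers are invented here, and the flag is only one possible shape for what a userspace tool could set instead.

```c
#include <assert.h>
#include <string.h>

#define DEMO_IFF_ISATAP 0x1	/* hypothetical flag bit for this sketch */

/* Hypothetical, reduced tunnel parameters. */
struct demo_tunnel_parms {
	char name[16];
	unsigned int flags;
};

/* Behaviour keyed on the interface name: any "isatap*" name matches. */
static int demo_is_isatap_by_name(const struct demo_tunnel_parms *p)
{
	assert(p != NULL);
	return strncmp(p->name, "isatap", 6) == 0;
}

/* Behaviour keyed on an explicit flag set by whoever creates the tunnel. */
static int demo_is_isatap_by_flag(const struct demo_tunnel_parms *p)
{
	assert(p != NULL);
	return (p->flags & DEMO_IFF_ISATAP) != 0;
}
```

With the name test, an ordinary tunnel that happens to be called "isatapbackup0" silently changes behaviour, while an ISATAP tunnel named "tun7" does not get it; the flag test has neither problem.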


RE: [PATCH 03/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
 

 -Original Message-
 From: Stephen Hemminger [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, November 07, 2007 9:37 AM
 To: Templin, Fred L
 Cc: netdev@vger.kernel.org
 Subject: Re: [PATCH 03/05] ipv6: RFC4214 Support
 
 On Tue, 6 Nov 2007 17:16:07 -0800
 Templin, Fred L [EMAIL PROTECTED] wrote:
 
  From: Fred L. Templin [EMAIL PROTECTED]
  
  This is experimental support for the Intra-Site Automatic
  Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
  the SIT module, and is configured using the unmodified
  ip utility with device names beginning with: isatap.
  
  The following diffs are specific to the Linux 2.6.23
  kernel distribution.
  
  Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
  
  ---
  
  --- linux-2.6.23/net/ipv6/addrconf.c.orig   2007-10-09 13:31:38.0 -0700
  +++ linux-2.6.23/net/ipv6/addrconf.c        2007-10-31 13:08:45.0 -0700
  @@ -73,7 +73,11 @@
   #include <net/tcp.h>
   #include <net/ip.h>
   #include <net/netlink.h>
  +#if defined(CONFIG_IPV6_ISATAP)
  +#include <net/ipip.h>
  +#else
   #include <linux/if_tunnel.h>
  +#endif
 
 That seems odd, changing includes used based on config option.

The change was to remove the conditional and simply
include <net/ipip.h>, since it also includes
<linux/if_tunnel.h>.
 
   #include linux/rtnetlink.h
   
   #ifdef CONFIG_IPV6_PRIVACY
  @@ -1426,6 +1430,11 @@ static int ipv6_generate_eui64(u8 *eui, 
  return addrconf_ifid_arcnet(eui, dev);
  case ARPHRD_INFINIBAND:
  return addrconf_ifid_infiniband(eui, dev);
  +#if defined(CONFIG_IPV6_ISATAP)
  +   case ARPHRD_SIT:
  +   if (dev-priv_flagsIFF_ISATAP)
  +   return ipv6_isatap_eui64(eui, (__be32 *)dev-dev_addr);
  +#endif
 Missing indentation

Fixed.
 
  }
  return -1;
   }
  @@ -2138,7 +2147,6 @@ static void addrconf_add_linklocal(struc
  addr_flags |= IFA_F_OPTIMISTIC;
   #endif
   
  -
 
 avoid random whitespace changes

Fixed.
 
  ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, addr_flags);
  if (!IS_ERR(ifp)) {
  addrconf_prefix_route(ifp-addr, ifp-prefix_len,
  idev-dev, 0, 0);
  @@ -2192,6 +2200,32 @@ static void addrconf_sit_config(struct n
  return;
  }
   
  +#if defined(CONFIG_IPV6_ISATAP)
  +   /* ISATAP (RFC4214) - configure as NBMA link */
  +   if (dev->priv_flags&IFF_ISATAP) {
 
 missing spaces around & operator

Fixed.
 
  +   struct in6_addr addr;
  +
  +   addrconf_add_lroute(dev);
  +
  +   addr.s6_addr32[0] = htonl(0xFE800000);
 
 shouldn't this be defined somewhere rather than hardcoded
 magic constant?

Well, I see this occurring elsewhere within addrconf.c.
I did however change:

- addr.s6_addr32[0] = htonl(0xFE800000);
- addr.s6_addr32[1] = 0;

to:

+ ipv6_addr_set(&addr, htonl(0xFE800000), 0, 0, 0);

(Similar changes were made in 2 other places.)
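
For anyone following along outside the kernel tree, here is a userspace sketch of the address being assembled: the fe80::/64 link-local prefix followed by an ISATAP interface identifier, which RFC 4214 forms as 0000:5EFE plus the IPv4 address. demo_isatap_linklocal() is a hypothetical helper and is not the ipv6_isatap_eui64() from the patch; it also ignores the variant of the identifier that sets the universal/local bit.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Build fe80::5efe:a.b.c.d from an IPv4 address in network byte order. */
static void demo_isatap_linklocal(struct in6_addr *out, uint32_t v4_net)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->s6_addr[0] = 0xfe;		/* fe80::/64 link-local prefix */
	out->s6_addr[1] = 0x80;
	out->s6_addr[10] = 0x5e;	/* ISATAP identifier 0000:5efe */
	out->s6_addr[11] = 0xfe;
	memcpy(&out->s6_addr[12], &v4_net, sizeof(v4_net));
}
```

Writing the bytes directly sidesteps the byte-order question entirely, which is the same concern the htonl() in the kernel code addresses for 32-bit stores.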
 
  +   addr.s6_addr32[1] = 0;
  +
  +   if (ipv6_generate_eui64(addr.s6_addr + 8, dev) == 0) {
  +   struct inet6_ifaddr *ifp;
  +
  +   if (!IS_ERR(ifp = ipv6_add_addr(idev, addr, 64,
  +   IFA_LINK, IFA_F_PERMANENT))) {
 
 split assignment and conditional please

Fixed.

  +   addrconf_prefix_route(ifp-addr,
  ifp-prefix_len,
  + idev-dev, 0, 0);
  +   addrconf_dad_start(ifp, 0);
  +   in6_ifa_put(ifp);
  +   }
  +   }
  +
  +   return;
  +   }
  +#endif
  +
  sit_add_v4_addrs(idev);
   
  if (dev-flagsIFF_POINTOPOINT) {
  @@ -2521,6 +2555,16 @@ static void addrconf_rs_timer(unsigned l
   *  Announcement received after solicitation
   *  was sent
   */
  +#if defined(CONFIG_IPV6_ISATAP)
  +   /* ISATAP (RFC4214) - Re-DAD to trigger new RS/RA */
  +   if (ifp-idev-dev-priv_flags  IFF_ISATAP) {
  +   spin_lock(ifp-lock);
  +   ifp-probes = 0;
  +   ifp-idev-if_flags = ~(IF_RS_SENT|IF_RA_RCVD);
  +   addrconf_mod_timer(ifp, AC_DAD, HZ*120);
  +   spin_unlock(ifp-lock);
  +   }
  +#endif
  goto out;
  }
   
  @@ -2535,10 +2579,32 @@ static void addrconf_rs_timer(unsigned l
 ifp-idev-cnf.rtr_solicit_interval);
  spin_unlock(ifp-lock);
   
  +#if defined(CONFIG_IPV6_ISATAP)
  +   /* ISATAP (RFC4214) - unicast RS */
  +   if (ifp-idev-dev-priv_flags  IFF_ISATAP) {
  +   struct ip_tunnel *t = netdev_priv(ifp-idev-dev);
 
 Please follow kernel indentation standard of tabs (not 4 spaces).

Fixed everywhere in addrconf.c, and also in sit.c.

Fred
[EMAIL PROTECTED]

 
  +   __be32 rtr = t-parms.i_key;
  +
  +   if (!rtr) goto out;
  +   
  +   all_routers.s6_addr32[0] = htonl(0xFE800000);
  +   

Re: [PATCH] ethtool: add support for supporting 10000baseT

2007-11-07 Thread Kok, Auke
Ben Hutchings wrote:
 Auke Kok wrote:
 From: Jesse Brandeburg [EMAIL PROTECTED]

 there is missing support in ethtool for reporting 10000baseT
 as SUPPORTED_10000baseT_Full.  The code seems to be half
 implemented because the advertising field has the implementation.
 
 I reported this lack on Sourceforge a while back:
 http://sourceforge.net/tracker/index.php?func=detail&aid=1798807&group_id=3242&atid=103242
 Is anyone reading bugs reported there?

Not really. However, with plenty of new 10gig hardware going through here we're
currently looking at ethtool support and seeing if it misses anything. If you
spot issues, please Cc me or Jesse and netdev, and of course Jeff Garzik, so we
can get some patches out.

I just noticed an e1000-specific ethtool issue on there that we'll take a look
at as well.

Auke


[PATCH 05/13] SCTP: Correctly disable ADD-IP when AUTH is not supported.

2007-11-07 Thread Vlad Yasevich
Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/structs.h |1 -
 net/sctp/associola.c   |2 +-
 net/sctp/sm_make_chunk.c   |5 +++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index a177017..41f1039 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1540,7 +1540,6 @@ struct sctp_association {
__u8asconf_capable;  /* Does peer support ADDIP? */
__u8prsctp_capable;  /* Can peer do PR-SCTP? */
__u8auth_capable;/* Is peer doing SCTP-AUTH? */
-   __u8addip_capable;   /* Can peer do ADD-IP */
 
__u32   adaptation_ind;  /* Adaptation Code point. */
 
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 03158e3..eaad5c5 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -265,7 +265,7 @@ static struct sctp_association 
*sctp_association_init(struct sctp_association *a
/* Assume that the peer recongizes ASCONF until reported otherwise
 * via an ERROR chunk.
 */
-   asoc-peer.asconf_capable = 1;
+   asoc-peer.asconf_capable = 0;
 
/* Create an input queue.  */
sctp_inq_init(asoc-base.inqueue);
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index c60564d..2ff3a3d 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1847,7 +1847,7 @@ static void sctp_process_ext_param(struct 
sctp_association *asoc,
break;
case SCTP_CID_ASCONF:
case SCTP_CID_ASCONF_ACK:
-   asoc-peer.addip_capable = 1;
+   asoc-peer.asconf_capable = 1;
break;
default:
break;
@@ -2138,10 +2138,11 @@ int sctp_process_init(struct sctp_association *asoc, 
sctp_cid_t cid,
/* If the peer claims support for ADD-IP without support
 * for AUTH, disable support for ADD-IP.
 */
-   if (asoc-peer.addip_capable  !asoc-peer.auth_capable) {
+   if (asoc-peer.asconf_capable  !asoc-peer.auth_capable) {
asoc-peer.addip_disabled_mask |= (SCTP_PARAM_ADD_IP |
  SCTP_PARAM_DEL_IP |
  SCTP_PARAM_SET_PRIMARY);
+   asoc-peer.asconf_capable = 0;
}
 
/* Walk list of transports, removing transports in the UNKNOWN state. */
-- 
1.5.2.4



[PATCH 06/13] SCTP: Allow ADD-IP to work with AUTH for backward compatibility.

2007-11-07 Thread Vlad Yasevich
This patch adds a tunable that will allow ADD-IP to work without
AUTH for backward compatibility.  The default value is off since
the default value for ADD-IP is off as well.  People who need
to use ADD-IP with older implementations take risks of connection
hijacking and should consider upgrading or turning this tunable on.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/structs.h |2 ++
 net/sctp/associola.c   |8 ++--
 net/sctp/protocol.c|1 +
 net/sctp/sm_make_chunk.c   |4 +++-
 net/sctp/sysctl.c  |9 +
 5 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 41f1039..44f2672 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -212,6 +212,7 @@ extern struct sctp_globals {

/* Flag to indicate if addip is enabled. */
int addip_enable;
+   int addip_noauth_enable;
 
/* Flag to indicate if PR-SCTP is enabled. */
int prsctp_enable;
@@ -249,6 +250,7 @@ extern struct sctp_globals {
 #define sctp_local_addr_list   (sctp_globals.local_addr_list)
 #define sctp_local_addr_lock   (sctp_globals.addr_list_lock)
 #define sctp_addip_enable  (sctp_globals.addip_enable)
+#define sctp_addip_noauth  (sctp_globals.addip_noauth_enable)
 #define sctp_prsctp_enable (sctp_globals.prsctp_enable)
 #define sctp_auth_enable   (sctp_globals.auth_enable)
 
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index eaad5c5..013e3d3 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -262,10 +262,14 @@ static struct sctp_association 
*sctp_association_init(struct sctp_association *a
 */
asoc-peer.sack_needed = 1;
 
-   /* Assume that the peer recongizes ASCONF until reported otherwise
-* via an ERROR chunk.
+   /* Assume that the peer will tell us if he recognizes ASCONF
+* as part of INIT exchange.
+* The sctp_addip_noauth option is there for backward compatibility
+* and will revert old behavior.
 */
asoc-peer.asconf_capable = 0;
+   if (sctp_addip_noauth)
+   asoc-peer.asconf_capable = 1;
 
/* Create an input queue.  */
sctp_inq_init(asoc-base.inqueue);
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 40c1a47..ecfab03 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1179,6 +1179,7 @@ SCTP_STATIC __init int sctp_init(void)
 
/* Disable ADDIP by default. */
sctp_addip_enable = 0;
+   sctp_addip_noauth = 0;
 
/* Enable PR-SCTP by default. */
sctp_prsctp_enable = 1;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 2ff3a3d..43e8de1 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -2137,8 +2137,10 @@ int sctp_process_init(struct sctp_association *asoc, 
sctp_cid_t cid,
 
/* If the peer claims support for ADD-IP without support
 * for AUTH, disable support for ADD-IP.
+* Do this only if backward compatible mode is turned off.
 */
-   if (asoc-peer.asconf_capable  !asoc-peer.auth_capable) {
+   if (!sctp_addip_noauth 
+(asoc-peer.asconf_capable  !asoc-peer.auth_capable)) {
asoc-peer.addip_disabled_mask |= (SCTP_PARAM_ADD_IP |
  SCTP_PARAM_DEL_IP |
  SCTP_PARAM_SET_PRIMARY);
diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
index 0669778..da4f157 100644
--- a/net/sctp/sysctl.c
+++ b/net/sctp/sysctl.c
@@ -263,6 +263,15 @@ static ctl_table sctp_table[] = {
.proc_handler   = proc_dointvec,
.strategy   = sysctl_intvec
},
+   {
+   .ctl_name   = CTL_UNNUMBERED,
+   .procname   = addip_noauth_enable,
+   .data   = sctp_addip_noauth,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec,
+   .strategy   = sysctl_intvec
+   },
{ .ctl_name = 0 }
 };
 
-- 
1.5.2.4
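
The interaction between the new tunable and the AUTH check can be modelled in a few lines of userspace C. This is only a sketch of the decision that patches 05 and 06 appear to implement together; peer_asconf_allowed() and its parameter names are made up for illustration and do not exist in the kernel.

```c
#include <assert.h>

/*
 * Hypothetical model of the peer.asconf_capable decision.
 *   addip_noauth       - value of the backward-compatibility tunable
 *   peer_listed_asconf - peer named ASCONF/ASCONF-ACK as supported
 *   peer_auth_capable  - peer is doing SCTP-AUTH
 * Returns the resulting asconf_capable flag (0 or 1).
 */
static int peer_asconf_allowed(int addip_noauth, int peer_listed_asconf,
			       int peer_auth_capable)
{
	/* Association init: assume ASCONF only in compatibility mode. */
	int asconf_capable = addip_noauth ? 1 : 0;

	/* Supported-extensions processing during the INIT exchange. */
	if (peer_listed_asconf)
		asconf_capable = 1;

	/* ADD-IP without AUTH is refused unless the tunable is on. */
	if (!addip_noauth && asconf_capable && !peer_auth_capable)
		asconf_capable = 0;

	return asconf_capable;
}
```

With the tunable off, a peer that lists ASCONF but not AUTH ends up with ASCONF disabled. With it on, the older assumption that the peer handles ASCONF is kept. Given the procname in the table entry, the knob would presumably show up as addip_noauth_enable under the SCTP sysctl directory.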



[PATCH 10/13] SCTP: Make sctp_verify_param return multiple indications.

2007-11-07 Thread Vlad Yasevich
SCTP-AUTH and future ADD-IP updates have a requirement to
do additional verification of parameters and an ability to
ABORT the association if verification fails.  So, introduce
additional return code so that we can clearly signal a required
action.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/constants.h |2 +
 net/sctp/sm_make_chunk.c |  149 +-
 2 files changed, 77 insertions(+), 74 deletions(-)

diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
index 73fbdf6..f30b537 100644
--- a/include/net/sctp/constants.h
+++ b/include/net/sctp/constants.h
@@ -186,6 +186,8 @@ typedef enum {
SCTP_IERROR_AUTH_BAD_HMAC,
SCTP_IERROR_AUTH_BAD_KEYID,
SCTP_IERROR_PROTO_VIOLATION,
+   SCTP_IERROR_ERROR,
+   SCTP_IERROR_ABORT,
 } sctp_ierror_t;
 
 
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 43e8de1..5a9783c 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1788,9 +1788,14 @@ static int sctp_process_inv_paramlength(const struct 
sctp_association *asoc,
sizeof(sctp_paramhdr_t);
 
 
+   /* This is a fatal error.  Any accumulated non-fatal errors are
+* not reported.
+*/
+   if (*errp)
+   sctp_chunk_free(*errp);
+
/* Create an error chunk and fill it in with our payload. */
-   if (!*errp)
-   *errp = sctp_make_op_error_space(asoc, chunk, payload_len);
+   *errp = sctp_make_op_error_space(asoc, chunk, payload_len);
 
if (*errp) {
sctp_init_cause(*errp, SCTP_ERROR_PROTO_VIOLATION,
@@ -1813,9 +1818,15 @@ static int sctp_process_hn_param(const struct 
sctp_association *asoc,
 {
__u16 len = ntohs(param.p-length);
 
-   /* Make an ERROR chunk. */
-   if (!*errp)
-   *errp = sctp_make_op_error_space(asoc, chunk, len);
+   /* Processing of the HOST_NAME parameter will generate an
+* ABORT.  If we've accumulated any non-fatal errors, they
+* would be unrecognized parameters and we should not include
+* them in the ABORT.
+*/
+   if (*errp)
+   sctp_chunk_free(*errp);
+
+   *errp = sctp_make_op_error_space(asoc, chunk, len);
 
if (*errp) {
sctp_init_cause(*errp, SCTP_ERROR_DNS_FAILED, len);
@@ -1862,56 +1873,40 @@ static void sctp_process_ext_param(struct 
sctp_association *asoc,
  * taken if the processing endpoint does not recognize the
  * Parameter Type.
  *
- * 00 - Stop processing this SCTP chunk and discard it,
- * do not process any further chunks within it.
+ * 00 - Stop processing this parameter; do not process any further
+ * parameters within this chunk
  *
- * 01 - Stop processing this SCTP chunk and discard it,
- * do not process any further chunks within it, and report
- * the unrecognized parameter in an 'Unrecognized
- * Parameter Type' (in either an ERROR or in the INIT ACK).
+ * 01 - Stop processing this parameter, do not process any further
+ * parameters within this chunk, and report the unrecognized
+ * parameter in an 'Unrecognized Parameter' ERROR chunk.
  *
  * 10 - Skip this parameter and continue processing.
  *
  * 11 - Skip this parameter and continue processing but
  * report the unrecognized parameter in an
- * 'Unrecognized Parameter Type' (in either an ERROR or in
- * the INIT ACK).
+ * 'Unrecognized Parameter' ERROR chunk.
  *
  * Return value:
- * 0 - discard the chunk
- * 1 - continue with the chunk
+ * SCTP_IERROR_NO_ERROR - continue with the chunk
+ * SCTP_IERROR_ERROR- stop and report an error.
+ * SCTP_IERROR_NOMEME   - out of memory.
  */
-static int sctp_process_unk_param(const struct sctp_association *asoc,
- union sctp_params param,
- struct sctp_chunk *chunk,
- struct sctp_chunk **errp)
+static sctp_ierror_t sctp_process_unk_param(const struct sctp_association 
*asoc,
+   union sctp_params param,
+   struct sctp_chunk *chunk,
+   struct sctp_chunk **errp)
 {
-   int retval = 1;
+   int retval = SCTP_IERROR_NO_ERROR;
 
switch (param.p-type  SCTP_PARAM_ACTION_MASK) {
case SCTP_PARAM_ACTION_DISCARD:
-   retval =  0;
-   break;
-   case SCTP_PARAM_ACTION_DISCARD_ERR:
-   retval =  0;
-   /* Make an ERROR chunk, preparing enough room for
-* returning multiple unknown parameters.
-*/
-   if (NULL == *errp)
-   *errp = sctp_make_op_error_space(asoc, chunk,
-   ntohs(chunk-chunk_hdr-length));
-
-   if (*errp) {
- 
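
The two high-order bits described in the comment above select one of four actions for an unrecognized parameter. A small sketch of that classification follows, assuming the parameter type has already been converted to host byte order. The enum and helper names are illustrative and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Action encoded in the two high-order bits of a parameter type. */
enum unk_param_action {
	UNK_STOP = 0,		/* 00: stop processing, report nothing */
	UNK_STOP_REPORT = 1,	/* 01: stop processing and report */
	UNK_SKIP = 2,		/* 10: skip this parameter, continue */
	UNK_SKIP_REPORT = 3	/* 11: skip, continue, and report */
};

/* Classify a parameter type given in host byte order. */
static enum unk_param_action unk_param_action(uint16_t type_host)
{
	return (enum unk_param_action)(type_host >> 14);
}

/* Nonzero when the action asks for an 'Unrecognized Parameter' report. */
static int unk_param_reports(uint16_t type_host)
{
	return (type_host & 0x4000) != 0;
}
```

A caller would stop walking the parameter list for the first two actions and keep going for the last two, adding the offending parameter to the error chunk whenever unk_param_reports() is nonzero.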

Re: Why does a connect to IPv6 LLA address fail ?

2007-11-07 Thread Vlad Yasevich
Jiri Bohac wrote:
 Hi,
 
 For this it creates a datagram socket with
 protocol IPPROTO_IP and then tries to connect it to the destination
 address. This fails in the case of an LLA, because connect returns EINVAL,
 since there is no device bound to this socket at this time.
 
 [snip]
 
 Why do we have this check in ip6_datagram_connect() ?
 
 This problem has been nicely described in
 http://www.linux-ipv6.org/ml/usagi-users/msg03062.html
 without any response. 
 
 RFC2461, Appendix A, really suggests performing neighbour
 discovery on all the links. I like the idea, it would make LLAs
 much more useful. 

The reason this is in an Appendix is because it doesn't work all
the time.  It was there to document some experiments that were going
on.

 
 Has anyone experimented with this? Is there any good reason why
 we don't send NSs to all the links to find out which link the
 destination LLA is on?

The reason is that 2 different hosts may have the same link-local
address as long as they are on different links.  If the sender is
connected to both links then it may send the packet to the wrong
destination.

Link local addresses are unqualified without the scope/zone id.
The application must pass this information to the kernel as part of
the connect() call.

A different and some might say 'better' alternative is to define a
default link.  Thus when the zone id is not specified the default is used.
This will work fine for link-scoped addresses.  A default zone would also
need to be defined for other scopes as well.  That's just one idea.

-vlad
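
As a rough userspace illustration of passing the zone along with a link-local destination, the sketch below fills in a sockaddr_in6 before connect(). The helper name make_scoped_sockaddr() and its error convention are assumptions made for this note. The interface index would normally come from if_nametoindex().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill *sa for a connect() to the IPv6 address given as text.
 * ifindex is the zone: the index of the interface the link-local
 * address lives on.  Returns 0 on success, -1 if the text is not a
 * valid IPv6 address or a link-local address was given without a zone.
 */
static int make_scoped_sockaddr(const char *text, uint16_t port,
				unsigned int ifindex,
				struct sockaddr_in6 *sa)
{
	assert(sa != NULL);
	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	sa->sin6_port = htons(port);

	if (inet_pton(AF_INET6, text, &sa->sin6_addr) != 1)
		return -1;

	if (IN6_IS_ADDR_LINKLOCAL(&sa->sin6_addr)) {
		if (ifindex == 0)
			return -1;	/* unqualified link-local address */
		sa->sin6_scope_id = ifindex;
	}
	return 0;
}
```

The filled structure is then passed as connect(fd, (struct sockaddr *)&sa, sizeof(sa)). Rejecting a link-local address with no zone in the helper mirrors the EINVAL the application would otherwise get back from the kernel.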


[PATCH 12/13] SCTP: Clean-up some defines for regressions tests.

2007-11-07 Thread Vlad Yasevich
The SCTP regression tests now use the in-kernel version of proc_fs.h
and we don't need to undef any more.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/sctp.h |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 67c997c..34318a3 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -65,7 +65,6 @@
 
 
 #ifdef TEST_FRAME
-#undef CONFIG_PROC_FS
 #undef CONFIG_SCTP_DBG_OBJCNT
 #undef CONFIG_SYSCTL
 #endif /* TEST_FRAME */
-- 
1.5.2.4



[PATCH 11/13] SCTP: Fix PR-SCTP to deliver all the accumulated ordered chunks

2007-11-07 Thread Vlad Yasevich
There is a small bug when we process a FWD-TSN.  We deliver
anything up to, but not including, the current next expected SSN.
However, if the chunk with the next expected SSN is already in the
queue, it takes another chunk to trigger its delivery.  The fix is
simply to also accept a queued SSN that equals the next expected one.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/ulpqueue.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index 4be92d0..4908041 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -862,7 +862,7 @@ static inline void sctp_ulpq_reap_ordered(struct sctp_ulpq 
*ulpq, __u16 sid)
continue;
 
/* see if this ssn has been marked by skipping */
-   if (!SSN_lt(cssn, sctp_ssn_peek(in, csid)))
+   if (!SSN_lte(cssn, sctp_ssn_peek(in, csid)))
break;
 
__skb_unlink(pos, ulpq-lobby);
-- 
1.5.2.4
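
To see why a strict less-than test misses the chunk that is exactly the next expected one, here is a userspace sketch of serial-number comparison on 16-bit SSNs. The helpers ssn_lt(), ssn_lte() and may_deliver() are stand-ins written for this note; the kernel's own SSN_lt/SSN_lte may be implemented differently.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when s precedes t in 16-bit serial-number arithmetic. */
static int ssn_lt(uint16_t s, uint16_t t)
{
	return ((uint16_t)(s - t) & 0x8000) != 0;
}

/* Nonzero when s precedes or equals t. */
static int ssn_lte(uint16_t s, uint16_t t)
{
	return s == t || ssn_lt(s, t);
}

/*
 * Decide whether a queued chunk with SSN cssn may be delivered when
 * next_expected is the stream's next expected SSN.  use_lte = 1 models
 * the check after the fix, use_lte = 0 the check before it.
 */
static int may_deliver(uint16_t cssn, uint16_t next_expected, int use_lte)
{
	return use_lte ? ssn_lte(cssn, next_expected)
		       : ssn_lt(cssn, next_expected);
}
```

With cssn equal to next_expected, the old check returns 0 and the reaping loop stops, while the new check lets the chunk through. The wrap-around case (65535 precedes 0) behaves the same under both.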



Re: [PATCH 03/05] ipv6: RFC4214 Support

2007-11-07 Thread Stephen Hemminger
On Tue, 6 Nov 2007 17:16:07 -0800
Templin, Fred L [EMAIL PROTECTED] wrote:

 From: Fred L. Templin [EMAIL PROTECTED]
 
 This is experimental support for the Intra-Site Automatic
 Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
 the SIT module, and is configured using the unmodified
 ip utility with device names beginning with: isatap.
 
 The following diffs are specific to the Linux 2.6.23
 kernel distribution.
 
 Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
 
 ---
 
 --- linux-2.6.23/net/ipv6/addrconf.c.orig 2007-10-09
 13:31:38.0 -0700
 +++ linux-2.6.23/net/ipv6/addrconf.c  2007-10-31 13:08:45.0
 -0700
 @@ -73,7 +73,11 @@
  #include net/tcp.h
  #include net/ip.h
  #include net/netlink.h
 +#if defined(CONFIG_IPV6_ISATAP)
 +#include net/ipip.h
 +#else
  #include linux/if_tunnel.h
 +#endif

That seems odd, changing includes used based on config option.


  #include linux/rtnetlink.h
  
  #ifdef CONFIG_IPV6_PRIVACY
 @@ -1426,6 +1430,11 @@ static int ipv6_generate_eui64(u8 *eui, 
   return addrconf_ifid_arcnet(eui, dev);
   case ARPHRD_INFINIBAND:
   return addrconf_ifid_infiniband(eui, dev);
 +#if defined(CONFIG_IPV6_ISATAP)
 + case ARPHRD_SIT:
 + if (dev-priv_flagsIFF_ISATAP)
 + return ipv6_isatap_eui64(eui, (__be32 *)dev-dev_addr);
 +#endif
Missing indentation


   }
   return -1;
  }
 @@ -2138,7 +2147,6 @@ static void addrconf_add_linklocal(struc
   addr_flags |= IFA_F_OPTIMISTIC;
  #endif
  
 -

avoid random whitespace changes

   ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, addr_flags);
   if (!IS_ERR(ifp)) {
   addrconf_prefix_route(ifp-addr, ifp-prefix_len,
 idev-dev, 0, 0);
 @@ -2192,6 +2200,32 @@ static void addrconf_sit_config(struct n
   return;
   }
  
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - configure as NBMA link */
 + if (dev-priv_flagsIFF_ISATAP) {

missing spaces around & operator

 + struct in6_addr addr;
 +
 + addrconf_add_lroute(dev);
 +
 + addr.s6_addr32[0] = htonl(0xFE80);

Shouldn't this be defined somewhere rather than hardcoded as a
magic constant?

 + addr.s6_addr32[1] = 0;
 +
 + if (ipv6_generate_eui64(addr.s6_addr + 8, dev) == 0) {
 + struct inet6_ifaddr *ifp;
 +
 + if (!IS_ERR(ifp = ipv6_add_addr(idev, addr, 64,
 + IFA_LINK, IFA_F_PERMANENT))) {

split assignment and conditional please


 + addrconf_prefix_route(ifp-addr,
 ifp-prefix_len,
 +   idev-dev, 0, 0);
 + addrconf_dad_start(ifp, 0);
 + in6_ifa_put(ifp);
 + }
 + }
 +
 + return;
 + }
 +#endif
 +
   sit_add_v4_addrs(idev);
  
   if (dev-flagsIFF_POINTOPOINT) {
 @@ -2521,6 +2555,16 @@ static void addrconf_rs_timer(unsigned l
*  Announcement received after solicitation
*  was sent
*/
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - Re-DAD to trigger new RS/RA */
 + if (ifp-idev-dev-priv_flags  IFF_ISATAP) {
 + spin_lock(ifp-lock);
 + ifp-probes = 0;
 + ifp-idev-if_flags = ~(IF_RS_SENT|IF_RA_RCVD);
 + addrconf_mod_timer(ifp, AC_DAD, HZ*120);
 + spin_unlock(ifp-lock);
 + }
 +#endif
   goto out;
   }
  
 @@ -2535,10 +2579,32 @@ static void addrconf_rs_timer(unsigned l
  ifp-idev-cnf.rtr_solicit_interval);
   spin_unlock(ifp-lock);
  
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - unicast RS */
 + if (ifp-idev-dev-priv_flags  IFF_ISATAP) {
 + struct ip_tunnel *t = netdev_priv(ifp-idev-dev);

Please follow kernel indentation standard of tabs (not 4 spaces).

 + __be32 rtr = t-parms.i_key;
 +
 + if (!rtr) goto out;
 + 
 + all_routers.s6_addr32[0] = htonl(0xFE80);
 + all_routers.s6_addr32[1] = 0;
 + ipv6_isatap_eui64(all_routers.s6_addr + 8, rtr);
 +
 + } else
 +#endif
   ipv6_addr_all_routers(all_routers);
  
   ndisc_send_rs(ifp-idev-dev, ifp-addr, all_routers);
   } else {
 +#if defined(CONFIG_IPV6_ISATAP)
 + /* ISATAP (RFC4214) - Re-DAD to trigger new RS/RA */
 + if (ifp-idev-dev-priv_flags  IFF_ISATAP) {
 + ifp-probes = 0;
 + ifp-idev-if_flags = ~(IF_RS_SENT|IF_RA_RCVD);
 + addrconf_mod_timer(ifp, AC_DAD, HZ*120);
 + }
 +#endif
   spin_unlock(ifp-lock);
   /*
* Note: we do not support deprecated all on-link
 @@ 

RE: [PATCH 01/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
Thanks, and will fix.

Fred 

 -Original Message-
 From: Stephen Hemminger [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, November 07, 2007 9:31 AM
 To: Templin, Fred L
 Cc: netdev@vger.kernel.org
 Subject: Re: [PATCH 01/05] ipv6: RFC4214 Support
 
 On Tue, 6 Nov 2007 17:16:01 -0800
 Templin, Fred L [EMAIL PROTECTED] wrote:
 
  From: Fred L. Templin [EMAIL PROTECTED]
  
  This is experimental support for the Intra-Site Automatic
  Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
  the SIT module, and is configured using the unmodified
  ip utility with device names beginning with: isatap.
  
  The following diffs are specific to the Linux 2.6.23
  kernel distribution.
  
  Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
  
  ---
  
  --- linux-2.6.23/include/linux/if.h.orig2007-10-29
  09:22:26.0 -0700
  +++ linux-2.6.23/include/linux/if.h 2007-10-26 11:00:06.0
  -0700
  @@ -61,6 +61,9 @@
   #define IFF_MASTER_ALB 0x10/* bonding 
 master, balance-alb.
  */
   #define IFF_BONDING0x20/* bonding 
 master or slave
  */
   #define IFF_SLAVE_NEEDARP 0x40 /* need ARPs 
 for validation
  */
  +#if defined(CONFIG_IPV6_ISATAP)
  +#define IFF_ISATAP 0x80/* ISATAP interface (RFC4214)
  */
  +#endif
   
 
 Don't make this conditional. You always want the number assigned
 and available, plus this file is used from user space where kernel
 configuration is unknown or irrelevant.
 
 -- 
 Stephen Hemminger [EMAIL PROTECTED]
 


Re: [PATCH 01/05] ipv6: RFC4214 Support

2007-11-07 Thread Stephen Hemminger
On Tue, 6 Nov 2007 17:16:01 -0800
Templin, Fred L [EMAIL PROTECTED] wrote:

 From: Fred L. Templin [EMAIL PROTECTED]
 
 This is experimental support for the Intra-Site Automatic
 Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
 the SIT module, and is configured using the unmodified
 ip utility with device names beginning with: isatap.
 
 The following diffs are specific to the Linux 2.6.23
 kernel distribution.
 
 Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
 
 ---
 
 --- linux-2.6.23/include/linux/if.h.orig  2007-10-29
 09:22:26.0 -0700
 +++ linux-2.6.23/include/linux/if.h   2007-10-26 11:00:06.0
 -0700
 @@ -61,6 +61,9 @@
  #define IFF_MASTER_ALB   0x10/* bonding master, balance-alb.
 */
  #define IFF_BONDING  0x20/* bonding master or slave
 */
  #define IFF_SLAVE_NEEDARP 0x40   /* need ARPs for validation
 */
 +#if defined(CONFIG_IPV6_ISATAP)
 +#define IFF_ISATAP   0x80/* ISATAP interface (RFC4214)
 */
 +#endif
  

Don't make this conditional. You always want the number assigned
and available, plus this file is used from user space where kernel
configuration is unknown or irrelevant.

-- 
Stephen Hemminger [EMAIL PROTECTED]


[PATCH] ethtool: add support for supporting 10000baseT

2007-11-07 Thread Auke Kok
From: Jesse Brandeburg [EMAIL PROTECTED]

there is missing support in ethtool for reporting 10000baseT
as SUPPORTED_10000baseT_Full.  The code seems to be half
implemented because the advertising field has the implementation.

this patch just adds it for supported reporting.

Signed-off-by: Jesse Brandeburg [EMAIL PROTECTED]
Signed-off-by: Auke Kok [EMAIL PROTECTED]
---

 ethtool.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 6c7a2e3..888be57 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -725,6 +725,13 @@ static void dump_supported(struct ethtool_cmd *ep)
if (mask  SUPPORTED_2500baseX_Full) {
did1++; fprintf(stdout, 2500baseX/Full );
}
+   if (did1  (mask  SUPPORTED_1baseT_Full)) {
+   fprintf(stdout, \n);
+   fprintf(stdout,);
+   }
+   if (mask  SUPPORTED_1baseT_Full) {
+   did1++; fprintf(stdout, 1baseT/Full );
+   }
fprintf(stdout, \n);
 
fprintf(stdout,Supports auto-negotiation: );


Re: [PATCH][PACKET] Remove unneeded packet_socks_nr variable

2007-11-07 Thread Arnaldo Carvalho de Melo
Em Wed, Nov 07, 2007 at 01:50:04PM -0200, Arnaldo Carvalho de Melo escreveu:
 Em Wed, Nov 07, 2007 at 06:32:51PM +0300, Pavel Emelyanov escreveu:
  This one is used only under ifdef PACKET_REFCNT_DEBUG in
  printk and is not needed otherwise. So hide all this stuff
  under the PACKET_REFCNT_DEBUG.
  
  Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]
 
 Look at sk_refcnt_debug_inc, etc and you'll see a more standard way. I
 forgot to make this when making all protocol families use sk_prot, even
 if just partially :-)

As a bonus you'll get this information on /proc/net/protocols, removing
'-1' from PACKET column for sockets.

- Arnaldo


Re: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread Ingo Oeser
Hi Fred,

some comments.

Templin, Fred L schrieb:
 From: Fred L. Templin [EMAIL PROTECTED]
 
 This is experimental support for the Intra-Site Automatic
 Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses
 the SIT module, and is configured using the unmodified
 ip utility with device names beginning with: isatap.
 
 The following diffs are specific to the Linux 2.6.23
 kernel distribution.
 
 Signed-off-by: Fred L. Templin [EMAIL PROTECTED]
 
 ---
 
 --- linux-2.6.23/include/net/addrconf.h.orig  2007-10-09
 13:31:38.0 -0700
 +++ linux-2.6.23/include/net/addrconf.h   2007-10-26 10:49:40.0
 -0700
 @@ -241,6 +241,34 @@ static inline int ipv6_addr_is_ll_all_ro
   addr-s6_addr32[3] == htonl(0x0002));
  }
  
 +#if defined(CONFIG_IPV6_ISATAP)
 +static inline int ipv6_isatap_eui64(u8 *eui, __be32 *addr)
addr is only used for reading, not writing. No need to pass it as a pointer.

 +{
 + __be32 ipv4 = ntohl(*addr);

ntohl(be32_value) != be32_value, so the __be32 annotation of ipv4
is wrong here and sparse will scream.

 +
 + eui[0] = 0;
 +
 + /* Check for RFC3330 global address ranges */
 + if (((ipv4 = 0x0100)  (ipv4  0x0a00)) ||
 + ((ipv4 = 0x0b00)  (ipv4  0x7f00)) ||
 + ((ipv4 = 0x8000)  (ipv4  0xa9fe)) ||
 + ((ipv4 = 0xa9ff)  (ipv4  0xac10)) ||
 + ((ipv4 = 0xac20)  (ipv4  0xc0a8)) ||
 + ((ipv4 = 0xc0a9)  (ipv4  0xc612)) ||
 + ((ipv4 = 0xc614)  (ipv4  0xe000))) eui[0] |=
 0x2;
 +

Instead of converting network to host byte order at runtime 
and comparing the results to constants, let the compiler convert
the constants to network byte order and compare in network order.

so use:

 if (((*addr >= htonl(0x01000000)) && (*addr < htonl(0x0a000000))) ||

instead. The compiler will notice that 0x01000000 is a constant and will
use __constant_htonl() automatically.


 + eui[1] = 0; eui[2] = 0x5E; eui[3] = 0xFE;
 + memcpy (eui+4, addr, 4);
 + return (0);
 +}

Nitpick: 
return is not a function. Please write return 0; instead.

 +
 +static inline int ipv6_addr_is_isatap(const struct in6_addr *addr)
 +{
 +   return (addr-s6_addr32[2] == __constant_htonl(0x02005EFE) ||
 +   addr-s6_addr32[2] == __constant_htonl(0x5EFE));
 +}
 +#endif

Here too the compiler will notice that the argument is a constant and
will use __constant_htonl() automatically. Please simply use htonl().


Best Regards

Ingo Oeser
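
A userspace sketch of the identifier construction under review, with one caveat that seems worth checking before moving the range tests into network byte order: on a little-endian host, ordering comparisons (>= and <) on raw network-order words do not give the same result as on host-order values, so only equality tests survive the change unmodified. The sketch therefore keeps the range test in host order and uses htonl() on constants only for equality. All function names here are illustrative, and the caller is assumed to decide whether the IPv4 address is globally unique.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Range test: convert the address to host order once, then compare. */
static int ipv4_in_range(uint32_t addr_be, uint32_t lo, uint32_t hi)
{
	uint32_t a = ntohl(addr_be);

	return a >= lo && a < hi;
}

/* Equality test: safe in network order, constants converted instead. */
static int word_is_isatap(uint32_t word_be)
{
	return word_be == htonl(0x02005EFE) || word_be == htonl(0x00005EFE);
}

/*
 * Build the 64-bit ISATAP interface identifier: 00-00-5E-FE followed
 * by the IPv4 address, with the "u" bit (0x02) set in the first octet
 * when the caller says the IPv4 address is globally unique.
 */
static void isatap_iid(uint8_t eui[8], uint32_t addr_be, int is_global)
{
	eui[0] = is_global ? 0x02 : 0x00;
	eui[1] = 0x00;
	eui[2] = 0x5E;
	eui[3] = 0xFE;
	memcpy(eui + 4, &addr_be, 4);
}
```

Taking the address by value rather than by pointer follows the first review comment, since the function only reads it.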


Re: [PATCH][PACKET] Remove unneeded packet_socks_nr variable

2007-11-07 Thread Arnaldo Carvalho de Melo
Em Wed, Nov 07, 2007 at 06:32:51PM +0300, Pavel Emelyanov escreveu:
 This one is used only under ifdef PACKET_REFCNT_DEBUG in
 printk and is not needed otherwise. So hide all this stuff
 under the PACKET_REFCNT_DEBUG.
 
 Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

Look at sk_refcnt_debug_inc, etc and you'll see a more standard way. I
forgot to make this when making all protocol families use sk_prot, even
if just partially :-)

- Arnaldo


[PATCH][PACKET] Remove unneeded packet_socks_nr variable

2007-11-07 Thread Pavel Emelyanov
This one is used only under ifdef PACKET_REFCNT_DEBUG in
printk and is not needed otherwise. So hide all this stuff
under the PACKET_REFCNT_DEBUG.

Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

---

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 4cb2dfb..e6a96ee 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -139,8 +139,30 @@ dev-hard_header == NULL (ll header is added by device, we 
cannot control it)
 static HLIST_HEAD(packet_sklist);
 static DEFINE_RWLOCK(packet_sklist_lock);
 
+#ifdef PACKET_REFCNT_DEBUG
 static atomic_t packet_socks_nr;
 
+static void packet_sock_inc(void)
+{
+   atomic_inc(packet_socks_nr);
+}
+
+static void packet_sock_dec(void)
+{
+   atomic_dec(packet_socks_nr);
+   printk(KERN_DEBUG PACKET socket %p is free, %d are alive\n,
+   sk, atomic_read(packet_socks_nr));
+}
+#else
+static inline void packet_sock_inc(void)
+{
+}
+
+static inline void packet_sock_dec(void)
+{
+}
+#endif
+
 
 /* Private packet socket structures. */
 
@@ -236,10 +258,7 @@ static void packet_sock_destruct(struct sock *sk)
return;
}
 
-   atomic_dec(packet_socks_nr);
-#ifdef PACKET_REFCNT_DEBUG
-   printk(KERN_DEBUG PACKET socket %p is free, %d are alive\n, sk, 
atomic_read(packet_socks_nr));
-#endif
+   packet_sock_dec();
 }
 
 
@@ -1010,7 +1029,7 @@ static int packet_create(struct net *net, struct socket 
*sock, int protocol)
po-num = proto;
 
sk-sk_destruct = packet_sock_destruct;
-   atomic_inc(packet_socks_nr);
+   packet_sock_inc();
 
/*
 *  Attach a protocol block
-- 
1.5.3.4



[PATCH 12/24] [IPSEC]: Forbid BEET + ipcomp for now

2007-11-07 Thread Herbert Xu
[IPSEC]: Forbid BEET + ipcomp for now

While BEET can theoretically work with IPComp the current code can't do that
because it tries to construct a BEET mode tunnel type which doesn't (and
cannot) exist.  In fact as it is it won't even attach a tunnel object at
all for BEET which is bogus.

To support this fully we'd also need to change the policy checks on input
to recognise a plain tunnel as a legal variant of an optional BEET transform.

This patch simply fails such constructions for now.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/ipcomp.c  |   20 
 net/ipv6/ipcomp6.c |   19 ---
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c
index ca1b5fd..afcdbf4 100644
--- a/net/ipv4/ipcomp.c
+++ b/net/ipv4/ipcomp.c
@@ -181,7 +181,6 @@ static void ipcomp4_err(struct sk_buff *skb, u32 info)
 static struct xfrm_state *ipcomp_tunnel_create(struct xfrm_state *x)
 {
struct xfrm_state *t;
-   u8 mode = XFRM_MODE_TUNNEL;
 
t = xfrm_state_alloc();
if (t == NULL)
@@ -192,9 +191,7 @@ static struct xfrm_state *ipcomp_tunnel_create(struct 
xfrm_state *x)
t-id.daddr.a4 = x-id.daddr.a4;
memcpy(t-sel, x-sel, sizeof(t-sel));
t-props.family = AF_INET;
-   if (x-props.mode == XFRM_MODE_BEET)
-   mode = x-props.mode;
-   t-props.mode = mode;
+   t-props.mode = x-props.mode;
t-props.saddr.a4 = x-props.saddr.a4;
t-props.flags = x-props.flags;
 
@@ -388,15 +385,22 @@ static int ipcomp_init_state(struct xfrm_state *x)
if (x-encap)
goto out;
 
+   x-props.header_len = 0;
+   switch (x-props.mode) {
+   case XFRM_MODE_TRANSPORT:
+   break;
+   case XFRM_MODE_TUNNEL:
+   x-props.header_len += sizeof(struct iphdr);
+   break;
+   default:
+   goto out;
+   }
+
err = -ENOMEM;
ipcd = kzalloc(sizeof(*ipcd), GFP_KERNEL);
if (!ipcd)
goto out;
 
-   x-props.header_len = 0;
-   if (x-props.mode == XFRM_MODE_TUNNEL)
-   x-props.header_len += sizeof(struct iphdr);
-
mutex_lock(ipcomp_resource_mutex);
if (!ipcomp_alloc_scratches())
goto error;
diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c
index 85eb479..eca0595 100644
--- a/net/ipv6/ipcomp6.c
+++ b/net/ipv6/ipcomp6.c
@@ -189,7 +189,6 @@ static void ipcomp6_err(struct sk_buff *skb, struct 
inet6_skb_parm *opt,
 static struct xfrm_state *ipcomp6_tunnel_create(struct xfrm_state *x)
 {
struct xfrm_state *t = NULL;
-   u8 mode = XFRM_MODE_TUNNEL;
 
t = xfrm_state_alloc();
if (!t)
@@ -203,9 +202,7 @@ static struct xfrm_state *ipcomp6_tunnel_create(struct 
xfrm_state *x)
memcpy(t-id.daddr.a6, x-id.daddr.a6, sizeof(struct in6_addr));
memcpy(t-sel, x-sel, sizeof(t-sel));
t-props.family = AF_INET6;
-   if (x-props.mode == XFRM_MODE_BEET)
-   mode = x-props.mode;
-   t-props.mode = mode;
+   t-props.mode = x-props.mode;
memcpy(t-props.saddr.a6, x-props.saddr.a6, sizeof(struct in6_addr));
 
if (xfrm_init_state(t))
@@ -404,22 +401,22 @@ static int ipcomp6_init_state(struct xfrm_state *x)
if (x-encap)
goto out;
 
-   err = -ENOMEM;
-   ipcd = kzalloc(sizeof(*ipcd), GFP_KERNEL);
-   if (!ipcd)
-   goto out;
-
x-props.header_len = 0;
switch (x-props.mode) {
-   case XFRM_MODE_BEET:
case XFRM_MODE_TRANSPORT:
break;
case XFRM_MODE_TUNNEL:
x-props.header_len += sizeof(struct ipv6hdr);
+   break;
default:
-   goto error;
+   goto out;
}
 
+   err = -ENOMEM;
+   ipcd = kzalloc(sizeof(*ipcd), GFP_KERNEL);
+   if (!ipcd)
+   goto out;
+
mutex_lock(&ipcomp6_resource_mutex);
if (!ipcomp6_alloc_scratches())
goto error;


[PATCH 23/24] [IPSEC]: Move integrity stat collection into xfrm_input

2007-11-07 Thread Herbert Xu
[IPSEC]: Move integrity stat collection into xfrm_input

Similar to the moving out of the replay processing on the output,
this patch moves the integrity stat collection from x->type->input
into xfrm_input.

This would eventually allow transforms such as AH/ESP to be lockless.

The error value EBADMSG (currently unused in the crypto layer) is used
to indicate a failed integrity check.  In the future this error can be
directly returned by the crypto layer once we switch to AEAD algorithms.
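
The split of responsibilities described above can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct demo_state`, `demo_transform_input` and `demo_input` are hypothetical stand-ins, not the kernel's types or functions. The one idea taken from the patch is that the transform merely reports -EBADMSG and the caller alone updates the counter.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the per-state statistics. */
struct demo_state {
	unsigned int integrity_failed;
};

/*
 * Hypothetical transform: compares a received ICV with the computed one.
 * It does not touch the statistics; it only reports -EBADMSG.
 */
static int demo_transform_input(const unsigned char *icv,
				const unsigned char *computed, size_t len)
{
	assert(icv && computed);
	if (memcmp(icv, computed, len))
		return -EBADMSG;
	return 1;	/* pretend "next header" value on success */
}

/* Hypothetical caller: the only place where the counter is updated. */
static int demo_input(struct demo_state *x, const unsigned char *icv,
		      const unsigned char *computed, size_t len)
{
	int nexthdr = demo_transform_input(icv, computed, len);

	if (nexthdr == -EBADMSG)
		x->integrity_failed++;
	return nexthdr;
}
```

Keeping the counter update in one caller means a transform needs no write access to the state on this path, which fits the lockless goal mentioned above.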

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/ah4.c|3 +--
 net/ipv4/esp4.c   |   13 -
 net/ipv6/ah6.c|3 +--
 net/ipv6/esp6.c   |3 +--
 net/xfrm/xfrm_input.c |5 -
 5 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index 5fc346d..a989d29 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -177,9 +177,8 @@ static int ah_input(struct xfrm_state *x, struct sk_buff 
*skb)
err = ah_mac_digest(ahp, skb, ah->auth_data);
if (err)
goto out;
-   err = -EINVAL;
if (memcmp(ahp->work_icv, auth_data, ahp->icv_trunc_len)) {
-   x->stats.integrity_failed++;
+   err = -EBADMSG;
goto out;
}
}
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index c31bccb..7f1854c 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -162,7 +162,7 @@ static int esp_input(struct xfrm_state *x, struct sk_buff 
*skb)
u8 nexthdr[2];
struct scatterlist *sg;
int padlen;
-   int err;
+   int err = -EINVAL;
 
if (!pskb_may_pull(skb, sizeof(*esph)))
goto out;
@@ -182,13 +182,14 @@ static int esp_input(struct xfrm_state *x, struct sk_buff 
*skb)
BUG();
 
if (unlikely(memcmp(esp->auth.work_icv, sum, alen))) {
-   x->stats.integrity_failed++;
+   err = -EBADMSG;
goto out;
}
}
 
-   if ((nfrags = skb_cow_data(skb, 0, &trailer)) < 0)
+   if ((err = skb_cow_data(skb, 0, &trailer)) < 0)
goto out;
+   nfrags = err;
 
skb->ip_summed = CHECKSUM_NONE;
 
@@ -201,6 +202,7 @@ static int esp_input(struct xfrm_state *x, struct sk_buff 
*skb)
sg = &esp->sgbuf[0];
 
if (unlikely(nfrags > ESP_NUM_FAST_SG)) {
+   err = -ENOMEM;
sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC);
if (!sg)
goto out;
@@ -213,11 +215,12 @@ static int esp_input(struct xfrm_state *x, struct sk_buff 
*skb)
if (unlikely(sg != &esp->sgbuf[0]))
kfree(sg);
if (unlikely(err))
-   return err;
+   goto out;
 
if (skb_copy_bits(skb, skb->len-alen-2, nexthdr, 2))
BUG();
 
+   err = -EINVAL;
padlen = nexthdr[0];
if (padlen+2 >= elen)
goto out;
@@ -271,7 +274,7 @@ static int esp_input(struct xfrm_state *x, struct sk_buff 
*skb)
return nexthdr[1];
 
 out:
-   return -EINVAL;
+   return err;
 }
 
 static u32 esp4_get_mtu(struct xfrm_state *x, int mtu)
diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index 4eaf550..d4b59ec 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -379,10 +379,9 @@ static int ah6_input(struct xfrm_state *x, struct sk_buff 
*skb)
err = ah_mac_digest(ahp, skb, ah->auth_data);
if (err)
goto free_out;
-   err = -EINVAL;
if (memcmp(ahp->work_icv, auth_data, ahp->icv_trunc_len)) {
LIMIT_NETDEBUG(KERN_WARNING "ipsec ah authentication error\n");
-   x->stats.integrity_failed++;
+   err = -EBADMSG;
goto free_out;
}
}
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 7db66f1..c37982b 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -177,8 +177,7 @@ static int esp6_input(struct xfrm_state *x, struct sk_buff 
*skb)
BUG();
 
if (unlikely(memcmp(esp->auth.work_icv, sum, alen))) {
-   x->stats.integrity_failed++;
-   ret = -EINVAL;
+   ret = -EBADMSG;
goto out;
}
}
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 587f347..b7d68eb 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -147,8 +147,11 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 
spi, int encap_type)
goto drop_unlock;
 
nexthdr = x->type->input(x, skb);
-   if (nexthdr <= 0)
+   if (nexthdr <= 0) {
+   if (nexthdr == -EBADMSG)
+   x->stats.integrity_failed++;
 

[PATCH 11/24] [IPSEC]: Merge common code into xfrm_bundle_create

2007-11-07 Thread Herbert Xu
[IPSEC]: Merge common code into xfrm_bundle_create

Half of the code in xfrm4_bundle_create and xfrm6_bundle_create is common.
This patch extracts that logic and puts it into xfrm_bundle_create.  The
rest is then accessed through afinfo.

As a result this fixes the problem with inter-family transforms where we
treat every xfrm dst in the bundle as if it belongs to the top family.

This patch also fixes a long-standing error-path bug where we may free the
xfrm states twice.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/xfrm.h  |   11 +-
 net/ipv4/xfrm4_policy.c |  134 ++-
 net/ipv6/xfrm6_policy.c |  143 ++---
 net/xfrm/xfrm_policy.c  |  183 +---
 4 files changed, 215 insertions(+), 256 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 9b6af22..4178c2b 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -227,6 +227,7 @@ struct km_event
u32 event;
 };
 
+struct net_device;
 struct xfrm_type;
 struct xfrm_dst;
 struct xfrm_policy_afinfo {
@@ -237,13 +238,11 @@ struct xfrm_policy_afinfo {
   xfrm_address_t *daddr);
int (*get_saddr)(xfrm_address_t *saddr, 
xfrm_address_t *daddr);
struct dst_entry*(*find_bundle)(struct flowi *fl, struct 
xfrm_policy *policy);
-   int (*bundle_create)(struct xfrm_policy *policy, 
-struct xfrm_state **xfrm, 
-int nx,
-struct flowi *fl, 
-struct dst_entry **dst_p);
void(*decode_session)(struct sk_buff *skb,
  struct flowi *fl);
+   int (*get_tos)(struct flowi *fl);
+   int (*fill_dst)(struct xfrm_dst *xdst,
+   struct net_device *dev);
 };
 
 extern int xfrm_policy_register_afinfo(struct xfrm_policy_afinfo *afinfo);
@@ -1094,7 +1093,6 @@ static inline int xfrm4_udp_encap_rcv(struct sock *sk, 
struct sk_buff *skb)
 }
 #endif
 
-extern struct dst_entry *xfrm_dst_lookup(struct xfrm_state *x, int tos);
 struct xfrm_policy *xfrm_policy_alloc(gfp_t gfp);
 extern int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, 
int, void*), void *);
 int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl);
@@ -1113,7 +1111,6 @@ extern int xfrm_policy_flush(u8 type, struct xfrm_audit 
*audit_info);
 extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy 
*pol);
 extern int xfrm_bundle_ok(struct xfrm_policy *pol, struct xfrm_dst *xdst,
  struct flowi *fl, int family, int strict);
-extern void xfrm_init_pmtu(struct dst_entry *dst);
 
 #ifdef CONFIG_XFRM_MIGRATE
 extern int km_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index cebc847..1d75243 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -79,122 +79,39 @@ __xfrm4_find_bundle(struct flowi *fl, struct xfrm_policy 
*policy)
return dst;
 }
 
-/* Allocate chain of dst_entry's, attach known xfrm's, calculate
- * all the metrics... Shortly, bundle a bundle.
- */
-
-static int
-__xfrm4_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, 
int nx,
- struct flowi *fl, struct dst_entry **dst_p)
+static int xfrm4_get_tos(struct flowi *fl)
 {
-   struct dst_entry *dst, *dst_prev;
-   struct rtable *rt0 = (struct rtable*)(*dst_p);
-   struct rtable *rt = rt0;
-   int tos = fl->fl4_tos;
-   int i;
-   int err;
-   int header_len = 0;
-   int trailer_len = 0;
-
-   dst = dst_prev = NULL;
-   dst_hold(&rt->u.dst);
-
-   for (i = 0; i < nx; i++) {
-   struct dst_entry *dst1 = dst_alloc(&xfrm4_dst_ops);
-   struct xfrm_dst *xdst;
-
-   if (unlikely(dst1 == NULL)) {
-   err = -ENOBUFS;
-   dst_release(&rt->u.dst);
-   goto error;
-   }
+   return fl->fl4_tos;
+}
 
-   if (!dst)
-   dst = dst1;
-   else {
-   dst_prev->child = dst1;
-   dst1->flags |= DST_NOHASH;
-   dst_clone(dst1);
-   }
+static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev)
+{
+   struct rtable *rt = (struct rtable *)xdst->route;
 
-   xdst = (struct xfrm_dst *)dst1;
-   xdst->route = &rt->u.dst;
-   xdst->genid = xfrm[i]->genid;
+   xdst->u.rt.fl = rt->fl;
 
-   dst1->next = dst_prev;
-   dst_prev = dst1;
+   

[PATCH 7/24] [IPSEC]: Set dst->input to dst_discard

2007-11-07 Thread Herbert Xu
[IPSEC]: Set dst->input to dst_discard

The input function should never be invoked on IPsec dst objects.  This is
because we don't apply IPsec on input until after we've made the routing
decision.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/ipv4/xfrm4_policy.c |3 ++-
 net/ipv6/xfrm6_policy.c |3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 7d250a1..c40a71b 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -10,6 +10,7 @@
 
 #include <linux/compiler.h>
 #include <linux/inetdevice.h>
+#include <net/dst.h>
 #include <net/xfrm.h>
 #include <net/ip.h>
 
@@ -167,7 +168,7 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev->trailer_len   = trailer_len;
memcpy(&dst_prev->metrics, &x->route->metrics, 
sizeof(dst_prev->metrics));
 
-   dst_prev->input = rt->u.dst.input;
+   dst_prev->input = dst_discard;
dst_prev->output = dst_prev->xfrm->outer_mode->afinfo->output;
if (rt0->peer)
atomic_inc(&rt0->peer->refcnt);
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 15747f3..a1c6b7c 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -14,6 +14,7 @@
 #include <linux/compiler.h>
 #include <linux/netdevice.h>
 #include <net/addrconf.h>
+#include <net/dst.h>
 #include <net/xfrm.h>
 #include <net/ip.h>
 #include <net/ipv6.h>
@@ -214,7 +215,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct 
xfrm_state **xfrm, int
dst_prev->trailer_len   = trailer_len;
memcpy(&dst_prev->metrics, &x->route->metrics, 
sizeof(dst_prev->metrics));
 
-   dst_prev->input = rt->u.dst.input;
+   dst_prev->input = dst_discard;
dst_prev->output = dst_prev->xfrm->outer_mode->afinfo->output;
/* Sheit... I remember I did this right. Apparently,
 * it was magically lost, so this code needs audit */


[PATCH 22/24] [IPSEC]: Store xfrm states in security path directly

2007-11-07 Thread Herbert Xu
[IPSEC]: Store xfrm states in security path directly

As it is, xfrm_input first collects a list of xfrm states on the stack
before storing them in the packet's security path just before it returns.
For async crypto, this construction presents an obstacle since we may
need to leave the loop after each transform.

In fact, it's much easier to just skip the stack completely and always
store to the security path.  This is proven by the fact that this patch
actually shrinks the code.
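
To make the "store directly" idea concrete, here is a minimal user-space sketch. `struct demo_path`, `DEMO_MAX_DEPTH` and `demo_path_push` are hypothetical names invented for this illustration; they only mimic the shape of a bounded vector of state pointers and are not the kernel's sec_path API.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DEPTH 6	/* assumed bound, in the spirit of XFRM_MAX_DEPTH */

/* Hypothetical stand-in for the packet's security path. */
struct demo_path {
	int len;
	const void *xvec[DEMO_MAX_DEPTH];
};

/*
 * Append one state straight into the path. Returns 0 on success and -1
 * when the path is already full, so the caller can stop (or resume later)
 * after any single transform without a separate staging array.
 */
static int demo_path_push(struct demo_path *sp, const void *state)
{
	assert(sp != NULL);
	if (sp->len == DEMO_MAX_DEPTH)
		return -1;
	sp->xvec[sp->len++] = state;
	return 0;
}
```

Because every state lands in the path as soon as it is looked up, there is no second array to copy from at the end and no separate list to unwind on the error path.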

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/xfrm/xfrm_input.c |   42 +++---
 1 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index b980095..587f347 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -100,19 +100,29 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 
spi, int encap_type)
 {
int err;
__be32 seq;
-   struct xfrm_state *xfrm_vec[XFRM_MAX_DEPTH];
struct xfrm_state *x;
-   int xfrm_nr = 0;
int decaps = 0;
unsigned int nhoff = XFRM_SPI_SKB_CB(skb)->nhoff;
unsigned int daddroff = XFRM_SPI_SKB_CB(skb)->daddroff;
 
+   /* Allocate new secpath or COW existing one. */
+   if (!skb->sp || atomic_read(&skb->sp->refcnt) != 1) {
+   struct sec_path *sp;
+
+   sp = secpath_dup(skb->sp);
+   if (!sp)
+   goto drop;
+   if (skb->sp)
+   secpath_put(skb->sp);
+   skb->sp = sp;
+   }
+
seq = 0;
if (!spi && (err = xfrm_parse_spi(skb, nexthdr, &spi, &seq)) != 0)
goto drop;
 
do {
-   if (xfrm_nr == XFRM_MAX_DEPTH)
+   if (skb->sp->len == XFRM_MAX_DEPTH)
goto drop;
 
x = xfrm_state_lookup((xfrm_address_t *)
@@ -121,6 +131,8 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 
spi, int encap_type)
if (x == NULL)
goto drop;
 
+   skb->sp->xvec[skb->sp->len++] = x;
+
spin_lock(&x->lock);
if (unlikely(x->km.state != XFRM_STATE_VALID))
goto drop_unlock;
@@ -151,8 +163,6 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 
spi, int encap_type)
 
spin_unlock(&x->lock);
 
-   xfrm_vec[xfrm_nr++] = x;
-
if (x->inner_mode->input(x, skb))
goto drop;
 
@@ -166,24 +176,6 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 
spi, int encap_type)
goto drop;
} while (!err);
 
-   /* Allocate new secpath or COW existing one. */
-
-   if (!skb->sp || atomic_read(&skb->sp->refcnt) != 1) {
-   struct sec_path *sp;
-   sp = secpath_dup(skb->sp);
-   if (!sp)
-   goto drop;
-   if (skb->sp)
-   secpath_put(skb->sp);
-   skb->sp = sp;
-   }
-   if (xfrm_nr + skb->sp->len > XFRM_MAX_DEPTH)
-   goto drop;
-
-   memcpy(skb->sp->xvec + skb->sp->len, xfrm_vec,
-  xfrm_nr * sizeof(xfrm_vec[0]));
-   skb->sp->len += xfrm_nr;
-
nf_reset(skb);
 
if (decaps) {
@@ -197,11 +189,7 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 
spi, int encap_type)
 
 drop_unlock:
spin_unlock(&x->lock);
-   xfrm_state_put(x);
 drop:
-   while (--xfrm_nr >= 0)
-   xfrm_state_put(xfrm_vec[xfrm_nr]);
-
kfree_skb(skb);
return 0;
 }


[PATCH 5/24] [NET]: Remove unnecessary inclusion of dst.h

2007-11-07 Thread Herbert Xu
[NET]: Remove unnecessary inclusion of dst.h

The file net/netevent.h only refers to struct dst_entry * so it doesn't
need to include dst.h.  I've replaced it with a forward declaration.
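
As a small stand-alone illustration of why the forward declaration is enough: C allows pointers to an incomplete struct type to be declared, stored and compared without its definition. The names below (`struct demo_dst`, `struct demo_redirect`, `demo_redirect_changed`) are hypothetical and only mirror the shape of the header.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: the full definition is not needed for pointers. */
struct demo_dst;

/* A structure that only stores pointers to the incomplete type. */
struct demo_redirect {
	struct demo_dst *old_dst;
	struct demo_dst *new_dst;
};

/* Pointers to an incomplete type can be stored, copied and compared. */
static int demo_redirect_changed(const struct demo_redirect *r)
{
	assert(r != NULL);
	return r->old_dst != r->new_dst;
}
```

Only code that dereferences the pointer or needs the struct's size has to include the full definition.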

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 include/net/netevent.h |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/netevent.h b/include/net/netevent.h
index e5d2162..e82b7ba 100644
--- a/include/net/netevent.h
+++ b/include/net/netevent.h
@@ -12,7 +12,7 @@
  */
 #ifdef __KERNEL__
 
-#include <net/dst.h>
+struct dst_entry;
 
 struct netevent_redirect {
struct dst_entry *old;


[PATCH 2/24] [IPSEC]: Use dst->header_len when resizing on output

2007-11-07 Thread Herbert Xu
[IPSEC]: Use dst->header_len when resizing on output

Currently we use x->props.header_len when resizing on output.  However,
if we're resizing at all we might as well go the whole hog and do it
for the whole dst.
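
The arithmetic involved is small enough to sketch. The helper below is hypothetical; its three plain integers stand in for dst->header_len, LL_RESERVED_SPACE(dev) and skb_headroom(skb). It also assumes that the dst's header_len already covers every transform still to be applied, which is what "the whole dst" suggests but which is not shown in this message.

```c
#include <assert.h>

/*
 * Extra headroom that must be added before output, or 0 if the buffer
 * already has enough. Sizing by the whole dst means a single resize can
 * cover all remaining headers instead of one resize per transform.
 */
static int demo_extra_headroom(int dst_header_len, int ll_reserved,
			       int headroom)
{
	int nhead;

	assert(dst_header_len >= 0 && ll_reserved >= 0 && headroom >= 0);
	nhead = dst_header_len + ll_reserved - headroom;
	return nhead > 0 ? nhead : 0;
}
```

For example, with 60 bytes of headers still to be added, 16 bytes of link-layer reserve and 32 bytes of existing headroom, the helper asks for 44 more bytes in one go.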

Signed-off-by: Herbert Xu [EMAIL PROTECTED]
---

 net/xfrm/xfrm_output.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index f4bfd6c..58d5a74 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -19,7 +19,8 @@
 
 static int xfrm_state_check_space(struct xfrm_state *x, struct sk_buff *skb)
 {
-   int nhead = x->props.header_len + LL_RESERVED_SPACE(skb->dst->dev)
+   struct dst_entry *dst = skb->dst;
+   int nhead = dst->header_len + LL_RESERVED_SPACE(dst->dev)
- skb_headroom(skb);
 
if (nhead > 0)


[PATCH 1/3][UNIX] Make unix_tot_inflight counter non-atomic

2007-11-07 Thread Pavel Emelyanov
This counter is _always_ modified under the unix_gc_lock spinlock,
so the lock already serializes every update and no atomic type is needed.
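
A user-space analogue of the same reasoning, offered as a sketch only: a pthread mutex plays the role of the spinlock, and all `demo_` names are hypothetical. Since every writer holds the lock, a plain `unsigned int` is sufficient.

```c
#include <assert.h>
#include <pthread.h>

/* User-space stand-ins: a mutex plays the role of the spinlock. */
static pthread_mutex_t demo_gc_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int demo_tot_inflight;	/* plain counter, guarded by demo_gc_lock */

static void demo_inflight(void)
{
	pthread_mutex_lock(&demo_gc_lock);
	demo_tot_inflight++;		/* safe: every writer holds the lock */
	pthread_mutex_unlock(&demo_gc_lock);
}

static void demo_notinflight(void)
{
	pthread_mutex_lock(&demo_gc_lock);
	assert(demo_tot_inflight > 0);
	demo_tot_inflight--;
	pthread_mutex_unlock(&demo_gc_lock);
}

/* This sketch takes the lock for reads too, the conservative choice
 * in portable C. */
static int demo_any_inflight(void)
{
	int ret;

	pthread_mutex_lock(&demo_gc_lock);
	ret = demo_tot_inflight != 0;
	pthread_mutex_unlock(&demo_gc_lock);
	return ret;
}
```

The atomic operations were redundant work on top of the lock; dropping them changes nothing about ordering as long as no update happens outside the locked sections.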

Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]

---

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index 0864a77..a1c805d 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -12,7 +12,7 @@ extern void unix_gc(void);
 
 #define UNIX_HASH_SIZE 256
 
-extern atomic_t unix_tot_inflight;
+extern unsigned int unix_tot_inflight;
 
 struct unix_address {
atomic_t refcnt;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 515e7a6..ab9048a 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -457,7 +457,7 @@ static int unix_release_sock (struct sock *sk, int embrion)
 *What the above comment does talk about? --ANK(980817)
 */
 
-   if (atomic_read(&unix_tot_inflight))
+   if (unix_tot_inflight)
unix_gc();  /* Garbage collect fds */
 
return 0;
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 406b643..399717e 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -92,7 +92,7 @@ static LIST_HEAD(gc_inflight_list);
 static LIST_HEAD(gc_candidates);
 static DEFINE_SPINLOCK(unix_gc_lock);
 
-atomic_t unix_tot_inflight = ATOMIC_INIT(0);
+unsigned int unix_tot_inflight;
 
 
 static struct sock *unix_get_socket(struct file *filp)
@@ -133,7 +133,7 @@ void unix_inflight(struct file *fp)
} else {
BUG_ON(list_empty(&u->link));
}
-   atomic_inc(&unix_tot_inflight);
+   unix_tot_inflight++;
spin_unlock(&unix_gc_lock);
}
 }
@@ -147,7 +147,7 @@ void unix_notinflight(struct file *fp)
BUG_ON(list_empty(&u->link));
if (atomic_dec_and_test(&u->inflight))
list_del_init(&u->link);
-   atomic_dec(&unix_tot_inflight);
+   unix_tot_inflight--;
spin_unlock(&unix_gc_lock);
}
 }
-- 
1.5.3.4



RE: [PATCH 02/05] ipv6: RFC4214 Support

2007-11-07 Thread Templin, Fred L
 

 -Original Message-
 From: Simon Arlott [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, November 07, 2007 11:02 AM
 To: Templin, Fred L
 Cc: YOSHIFUJI Hideaki / 吉藤英明; [EMAIL PROTECTED]; netdev@vger.kernel.org
 Subject: Re: [PATCH 02/05] ipv6: RFC4214 Support
 
 On 07/11/07 18:52, Templin, Fred L wrote:
  +  eui[0] = 0;
  +
  +  /* Check for RFC3330 global address ranges */
  +  if (((ipv4 = 0x0100)  (ipv4  0x0a00)) ||
  +  ((ipv4 = 0x0b00)  (ipv4  0x7f00)) ||
  +  ((ipv4 = 0x8000)  (ipv4  0xa9fe)) ||
  +  ((ipv4 = 0xa9ff)  (ipv4  0xac10)) ||
  +  ((ipv4 = 0xac20)  (ipv4  0xc0a8)) ||
  +  ((ipv4 = 0xc0a9)  (ipv4  0xc612)) ||
  +  ((ipv4 = 0xc614)  (ipv4  
  0xe000))) eui[0] |=
  0x2;
  I don't understand.
  
  For example, 1.0.0.11 is valid IPv4 global address.
  In little-endian, this is not in the range of
  0x0001 = addr = 0x000a (addr is 0x0b01).
  
  Maybe it is I who did not understand. Can you suggest a 
 clean solution?
 
 ((ipv4  htonl(0xFF00)) == htonl(0x0A00)) etc.?

I'm not sure this works when we consider disjoint ranges
of globally-unique IP prefixes. Do you have a vision for
what the entire conditional would look like?
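
One possible shape for the whole conditional, offered only as a sketch: convert the address to host byte order once with ntohl(), then test it against a table of special-use prefixes, so the disjoint global ranges fall out as "everything not in the table". The table is an assumption read from RFC 3330 (it is not taken from the patch), and `demo_special` and `demo_ipv4_is_global` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Special-use prefixes in host byte order (assumed list, see RFC 3330). */
static const struct {
	uint32_t prefix;
	uint32_t mask;
} demo_special[] = {
	{ 0x00000000u, 0xff000000u },	/* 0.0.0.0/8 */
	{ 0x0a000000u, 0xff000000u },	/* 10.0.0.0/8 */
	{ 0x7f000000u, 0xff000000u },	/* 127.0.0.0/8 */
	{ 0xa9fe0000u, 0xffff0000u },	/* 169.254.0.0/16 */
	{ 0xac100000u, 0xfff00000u },	/* 172.16.0.0/12 */
	{ 0xc0a80000u, 0xffff0000u },	/* 192.168.0.0/16 */
	{ 0xc6120000u, 0xfffe0000u },	/* 198.18.0.0/15 */
	{ 0xe0000000u, 0xe0000000u },	/* 224.0.0.0/3: multicast and above */
};

/* addr_be is in network byte order, as stored in a packet or socket. */
static int demo_ipv4_is_global(uint32_t addr_be)
{
	uint32_t addr = ntohl(addr_be);	/* compare in host order */
	size_t i;

	for (i = 0; i < sizeof(demo_special) / sizeof(demo_special[0]); i++) {
		/* table sanity: a prefix must not have bits outside its mask */
		assert((demo_special[i].prefix & ~demo_special[i].mask) == 0);
		if ((addr & demo_special[i].mask) == demo_special[i].prefix)
			return 0;
	}
	return 1;
}
```

With this shape, 1.0.0.11 is classified the same way on little-endian and big-endian hosts, because the comparison never sees the raw network-order value.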

Thanks - Fred
[EMAIL PROTECTED]

 Simon Arlott

