date:20060609

Re: [PATCH 1/1] LSM-IPsec SELinux Authorize (with minor fix)

2006-06-09 Thread David Miller

From: Xiaolan Zhang [EMAIL PROTECTED]
Date: Tue, 6 Jun 2006 10:55:58 -0400

 Signed-off-by: Catherine Zhang [EMAIL PROTECTED]

 James, is this enough or do I need to modify the original patch to add the 
 above line?  The code was taken from various pieces of patches originally 
 from Trent and merged/modified by me.  Let me know what else I need to do.

That's good enough for me, patch applied, thanks a lot.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] LSM-IPsec SELinux Authorize (with minor fix)

2006-06-09 Thread David Miller

From: David Miller [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 23:40:03 -0700 (PDT)

 From: Xiaolan Zhang [EMAIL PROTECTED]
 Date: Tue, 6 Jun 2006 10:55:58 -0400

  Signed-off-by: Catherine Zhang [EMAIL PROTECTED]

  James, is this enough or do I need to modify the original patch to add the 
  above line?  The code was taken from various pieces of patches originally 
  from Trent and merged/modified by me.  Let me know what else I need to do.

 That's good enough for me, patch applied, thanks a lot.

BTW, can I ask you SELINUX folks to at least attempt to do
a build with SELINUX disabled when you submit networking
changes to me?  It would save me a lot of time, this one
failed with:

net/xfrm/xfrm_user.c: In function 'xfrm_del_sa':
net/xfrm/xfrm_user.c:430: warning: passing argument 1 of 'security_xfrm_state_delete' from incompatible pointer type
net/xfrm/xfrm_user.c:430: warning: suggest parentheses around assignment used as truth value
net/xfrm/xfrm_user.c: In function 'xfrm_get_policy':
net/xfrm/xfrm_user.c:1060: warning: suggest parentheses around assignment used as truth value

because the nop implementation of security_xfrm_state_delete()
was declared to take an xfrm_policy instead of an xfrm_state.

I've fixed all of this up, but please test this stuff out next
time around.  Thanks a lot.
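
A minimal sketch of the stub pattern behind that failure, written as a simplified userspace model rather than the kernel source. The struct bodies and the delete_state() caller are placeholders invented for illustration; only the function name security_xfrm_state_delete() and the two type names come from the message above. The point is that the no-op version compiled when the option is off has to take the same argument type as the real hook, otherwise only builds with the option disabled see the mismatch.

```c
/* Simplified userspace model; the struct bodies are placeholders. */
struct xfrm_state  { int id; };
struct xfrm_policy { int id; };

#ifdef CONFIG_SECURITY_NETWORK_XFRM
/* Real hook, supplied by the security module when the option is on. */
int security_xfrm_state_delete(struct xfrm_state *x);
#else
/* No-op stub: it must take the same argument type as the real hook. */
static inline int security_xfrm_state_delete(struct xfrm_state *x)
{
	(void)x;
	return 0;
}
#endif

/* Hypothetical caller, loosely modelled on a state-delete path. */
static int delete_state(struct xfrm_state *x)
{
	int err;

	/* The extra parentheses mark the assignment as intentional. */
	if ((err = security_xfrm_state_delete(x)) != 0)
		return err;
	return 0;
}
```

Building such code once with the option defined and once without is the cheap check being asked for here.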

Re: [patch 2/9] selinux: add security class for appletalk sockets

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:52 -0700

 From: Christopher J. PeBenito [EMAIL PROTECTED]

 Add a security class for appletalk sockets so that they can be
 distinguished in SELinux policy.  Please apply.

 Signed-off-by: Stephen Smalley [EMAIL PROTECTED]
 Acked-by: James Morris [EMAIL PROTECTED]
 Cc: David S. Miller [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Re: [patch 3/9] secmark: Add new flask definitions to SELinux

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:54 -0700

 This patch:

 Add support for a new object class ('packet'), and associated permissions
 ('send', 'recv', 'relabelto').  These are used to enforce security policy for
 network packets labeled with SECMARK, and for adding labeling rules.

 Signed-off-by: James Morris [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Re: [patch 4/9] secmark: Add SELinux exports

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:54 -0700

 From: James Morris [EMAIL PROTECTED]

 Add and export new functions to the in-kernel SELinux API in support of the
 new secmark-based packet controls.

 Signed-off-by: James Morris [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Re: [patch 5/9] secmark: Add secmark support to core networking.

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:55 -0700

 Add a secmark field to the skbuff structure, to allow security subsystems to
 place security markings on network packets.  This is similar to the nfmark
 field, except that it is intended for implementing security policy, rather than
 networking policy.

 This patch was already acked in principle by Dave Miller.

 Signed-off-by: James Morris [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Remember James, you're on the hook now to shrink sk_buff
when you get a chance :-)

Thanks again.


Re: [patch 6/9] secmark: Add xtables SECMARK target

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:56 -0700

 Add a SECMARK target to xtables, allowing the admin to apply security marks to
 packets via both iptables and ip6tables.

 The target currently handles SELinux security marking, but can be extended
 for other purposes as needed.

 Signed-off-by: James Morris [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Re: [patch 7/9] secmark: Add secmark support to conntrack

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:57 -0700

 Add a secmark field to IP and NF conntracks, so that security markings on
 packets can be copied to their associated connections, and also copied back to
 packets as required.  This is similar to the network mark field currently used
 with conntrack, although it is intended for enforcement of security policy
 rather than network policy.

 Signed-off-by: James Morris [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Re: [patch 8/9] secmark: Add CONNSECMARK xtables target

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:58 -0700

 Add a new xtables target, CONNSECMARK, which is used to specify rules for
 copying security marks from packets to connections, and for copying security
 marks back from connections to packets.  This is similar to the CONNMARK
 target, but is more limited in scope in that it only allows copying of
 security marks to and from packets, as this is all it needs to do.

 A typical scenario would be to apply a security mark to a 'new' packet with
 SECMARK, then copy that to its conntrack via CONNSECMARK, and then restore the
 security mark from the connection to established and related packets on that
 connection.

 Signed-off-by: James Morris [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks.

Re: [patch 9/9] secmark: Add new packet controls to SELinux

2006-06-09 Thread David Miller

From: [EMAIL PROTECTED]
Date: Thu, 08 Jun 2006 22:20:59 -0700

 Add new per-packet access controls to SELinux, replacing the old packet
 controls.
 ...
 Signed-off-by: James Morris [EMAIL PROTECTED]
 Cc: Stephen Smalley [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]

Applied to net-2.6.18, thanks a lot.

[2/3] [NET] ppp: Remove unnecessary pskb_may_pull

2006-06-09 Thread Herbert Xu

Hi:

[NET] ppp: Remove unnecessary pskb_may_pull

In ppp_receive_nonmp_frame, we call pskb_may_pull(skb, skb->len) if the
tailroom is >= 124.  This is pointless because this pskb_may_pull is only
needed if the skb is non-linear.  However, if it is non-linear then the
tailroom would be zero.

So it can be safely removed.
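
The argument can be sketched with a small userspace model. The struct and the model_* helper names below are stand-ins invented for illustration, not kernel definitions; the sketch assumes the rule stated above, that a non-linear buffer reports zero tailroom.

```c
#include <stddef.h>

/* Stand-in for struct sk_buff with only the fields used here. */
struct skb_model {
	unsigned char *tail;     /* end of the data in the linear area */
	unsigned char *end;      /* end of the linear buffer */
	unsigned int   data_len; /* bytes held in page fragments */
};

static int model_is_nonlinear(const struct skb_model *skb)
{
	return skb->data_len != 0;
}

/* A paged (non-linear) buffer reports no tailroom at all. */
static size_t model_tailroom(const struct skb_model *skb)
{
	return model_is_nonlinear(skb) ? 0 : (size_t)(skb->end - skb->tail);
}

/* Returns 1 only if a pull could still be needed on the ">= 124" branch. */
static int pull_needed_on_fast_path(const struct skb_model *skb)
{
	return model_tailroom(skb) >= 124 && model_is_nonlinear(skb);
}
```

Under that rule pull_needed_on_fast_path() can never return 1: whenever the tailroom test passes, the buffer is already linear.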

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
diff --git a/drivers/net/ppp_generic.c b/drivers/net/ppp_generic.c
--- a/drivers/net/ppp_generic.c
+++ b/drivers/net/ppp_generic.c
@@ -1609,8 +1609,6 @@ ppp_receive_nonmp_frame(struct ppp *ppp,
 			kfree_skb(skb);
 			skb = ns;
 		}
-		else if (!pskb_may_pull(skb, skb->len))
-			goto err;
 		else
 			skb->ip_summed = CHECKSUM_NONE;

[1/3] [NET]: Clean up skb_linearize

2006-06-09 Thread Herbert Xu

Hi:

The following patches are based on net-2.6.18.

[NET]: Clean up skb_linearize

The linearisation operation doesn't need to be super-optimised.  So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.

Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.

Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.

Last but not least, I've removed the gfp argument since nobody uses it
anymore.  If it's ever needed we can easily add it back.

Misc bugs fixed by this patch:

* via-velocity error handling (also, no SG => no frags)

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -116,8 +116,7 @@ aoenet_rcv(struct sk_buff *skb, struct n
 	skb = skb_share_check(skb, GFP_ATOMIC);
 	if (skb == NULL)
 		return 0;
-	if (skb_is_nonlinear(skb))
-	if (skb_linearize(skb, GFP_ATOMIC) < 0)
+	if (skb_linearize(skb))
 		goto exit;
 	if (!is_aoe_netif(ifp))
 		goto exit;
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -1200,7 +1200,7 @@ static int mv643xx_eth_start_xmit(struct
 	}
 
 	if (has_tiny_unaligned_frags(skb)) {
-		if ((skb_linearize(skb, GFP_ATOMIC) != 0)) {
+		if (__skb_linearize(skb)) {
 			stats->tx_dropped++;
 			printk(KERN_DEBUG "%s: failed to linearize tiny "
 					"unaligned fragment\n", dev->name);
diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -1899,6 +1899,13 @@ static int velocity_xmit(struct sk_buff 
 
 	int pktlen = skb->len;
 
+#ifdef VELOCITY_ZERO_COPY_SUPPORT
+	if (skb_shinfo(skb)->nr_frags > 6 && __skb_linearize(skb)) {
+		kfree_skb(skb);
+		return 0;
+	}
+#endif
+
 	spin_lock_irqsave(&vptr->lock, flags);
 
 	index = vptr->td_curr[qnum];
@@ -1914,8 +1921,6 @@ static int velocity_xmit(struct sk_buff 
 	 */
 	if (pktlen < ETH_ZLEN) {
 		/* Cannot occur until ZC support */
-		if(skb_linearize(skb, GFP_ATOMIC))
-			return 0; 
 		pktlen = ETH_ZLEN;
 		memcpy(tdinfo->buf, skb->data, skb->len);
 		memset(tdinfo->buf + skb->len, 0, ETH_ZLEN - skb->len);
@@ -1933,7 +1938,6 @@ static int velocity_xmit(struct sk_buff 
 		int nfrags = skb_shinfo(skb)->nr_frags;
 		tdinfo->skb = skb;
 		if (nfrags > 6) {
-			skb_linearize(skb, GFP_ATOMIC);
 			memcpy(tdinfo->buf, skb->data, skb->len);
 			tdinfo->skb_dma[0] = tdinfo->buf_dma;
 			td_ptr->tdesc0.pktsize = 
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1165,18 +1165,34 @@ static inline int skb_can_coalesce(struc
 	return 0;
 }
 
+static inline int __skb_linearize(struct sk_buff *skb)
+{
+	return __pskb_pull_tail(skb, skb->data_len) ? 0 : -ENOMEM;
+}
+
 /**
  * skb_linearize - convert paged skb to linear one
  * @skb: buffer to linarize
- * @gfp: allocation mode
  *
  * If there is no free memory -ENOMEM is returned, otherwise zero
  * is returned and the old skb data released.
  */
-extern int __skb_linearize(struct sk_buff *skb, gfp_t gfp);
-static inline int skb_linearize(struct sk_buff *skb, gfp_t gfp)
+static inline int skb_linearize(struct sk_buff *skb)
+{
+	return skb_is_nonlinear(skb) ? __skb_linearize(skb) : 0;
+}
+
+/**
+ * skb_linearize_cow - make sure skb is linear and writable
+ * @skb: buffer to process
+ *
+ * If there is no free memory -ENOMEM is returned, otherwise zero
+ * is returned and the old skb data released.
+ */
+static inline int skb_linearize_cow(struct sk_buff *skb)
 {
-	return __skb_linearize(skb, gfp);
+	return skb_is_nonlinear(skb) || skb_cloned(skb) ?
+	       __skb_linearize(skb) : 0;
 }
 
 /**
diff --git a/net/core/dev.c b/net/core/dev.c
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1220,64 +1220,6 @@ static inline int illegal_highdma(struct
 #define illegal_highdma(dev, skb)  (0)
 #endif
 
-/* Keep head the same: replace data */
-int __skb_linearize(struct sk_buff *skb, gfp_t gfp_mask)
-{
-   unsigned int size;

[3/3] [NET]: skb_trim audit

2006-06-09 Thread Herbert Xu

Hi:

[NET]: skb_trim audit

I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.

Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
instead of skb_trim.
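
For readers unfamiliar with the helper, here is a rough userspace model of what a trim-with-checksum-reset does, matching the open-coded pattern the diff below removes from br_netfilter. The struct, the enum values, and the model_* names are stand-ins invented for illustration and are not the kernel definitions; the model also ignores the allocation failure a real paged trim can hit.

```c
/* Stand-ins for the checksum states referred to in this patch. */
enum { CHECKSUM_NONE, CHECKSUM_HW, CHECKSUM_UNNECESSARY };

struct skb_model {
	unsigned int len;       /* total length: linear part plus fragments */
	unsigned int data_len;  /* bytes held in page fragments */
	int ip_summed;
};

/* Model of a trim that copes with paged data; it always succeeds here. */
static int model_pskb_trim(struct skb_model *skb, unsigned int len)
{
	if (len < skb->len) {
		unsigned int head = skb->len - skb->data_len;

		skb->len = len;
		skb->data_len = len > head ? len - head : 0;
	}
	return 0;
}

/* Trim, and drop a hardware checksum that covered the removed bytes. */
static int model_pskb_trim_rcsum(struct skb_model *skb, unsigned int len)
{
	if (len >= skb->len)
		return 0;	/* nothing to cut, checksum state still valid */
	if (skb->ip_summed == CHECKSUM_HW)
		skb->ip_summed = CHECKSUM_NONE;
	return model_pskb_trim(skb, len);
}
```

Folding the length check and the checksum reset into one call is what lets each call site shrink to a single line.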

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
diff --git a/include/net/sock.h b/include/net/sock.h
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -873,10 +873,7 @@ static inline int sk_filter(struct sock 
 		if (filter) {
 			unsigned int pkt_len = sk_run_filter(skb, filter->insns,
 							     filter->len);
-			if (!pkt_len)
-				err = -EPERM;
-			else
-				skb_trim(skb, pkt_len);
+			err = pkt_len ? pskb_trim(skb, pkt_len) : -EPERM;
 		}
 
 		if (needlock)
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -407,12 +407,8 @@ static unsigned int br_nf_pre_routing_ip
 	if (pkt_len || hdr->nexthdr != NEXTHDR_HOP) {
 		if (pkt_len + sizeof(struct ipv6hdr) > skb->len)
 			goto inhdr_error;
-		if (pkt_len + sizeof(struct ipv6hdr) < skb->len) {
-			if (__pskb_trim(skb, pkt_len + sizeof(struct ipv6hdr)))
-				goto inhdr_error;
-			if (skb->ip_summed == CHECKSUM_HW)
-				skb->ip_summed = CHECKSUM_NONE;
-		}
+		if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr)))
+			goto inhdr_error;
 	}
 	if (hdr->nexthdr == NEXTHDR_HOP && check_hbh_len(skb))
 		goto inhdr_error;
@@ -495,11 +491,7 @@ static unsigned int br_nf_pre_routing(un
 	if (skb->len < len || len < 4 * iph->ihl)
 		goto inhdr_error;
 
-	if (skb->len > len) {
-		__pskb_trim(skb, len);
-		if (skb->ip_summed == CHECKSUM_HW)
-			skb->ip_summed = CHECKSUM_NONE;
-	}
+	pskb_trim_rcsum(skb, len);
 
 	nf_bridge_put(skb->nf_bridge);
 	if (!nf_bridge_alloc(skb))
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -456,13 +456,9 @@ static int nf_ct_frag6_queue(struct nf_c
 		DEBUGP("queue: message is too short.\n");
 		goto err;
 	}
-	if (end-offset < skb->len) {
-		if (pskb_trim(skb, end - offset)) {
-			DEBUGP("Can't trim\n");
-			goto err;
-		}
-		if (skb->ip_summed != CHECKSUM_UNNECESSARY)
-			skb->ip_summed = CHECKSUM_NONE;
+	if (pskb_trim_rcsum(skb, end - offset)) {
+		DEBUGP("Can't trim\n");
+		goto err;
 	}
 
 	/* Find out which fragments are in front and at the back of us

[4/3] [NET]: Warn in __skb_trim if skb is paged

2006-06-09 Thread Herbert Xu

Hi:

[NET]: Warn in __skb_trim if skb is paged

It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.

Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Cheers,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 197f2d2..6ceec04 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -971,15 +971,16 @@ #ifndef NET_SKB_PAD
 #define NET_SKB_PAD	16
 #endif
 
-extern int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc);
+extern int ___pskb_trim(struct sk_buff *skb, unsigned int len);
 
 static inline void __skb_trim(struct sk_buff *skb, unsigned int len)
 {
-	if (!skb->data_len) {
-		skb->len  = len;
-		skb->tail = skb->data + len;
-	} else
-		___pskb_trim(skb, len, 0);
+	if (unlikely(skb->data_len)) {
+		WARN_ON(1);
+		return;
+	}
+	skb->len  = len;
+	skb->tail = skb->data + len;
 }
 
 /**
@@ -989,6 +990,7 @@ static inline void __skb_trim(struct sk_
  *
  * Cut the length of a buffer down by removing data from the tail. If
  * the buffer is already under the length specified it is not modified.
+ * The skb must be linear.
  */
 static inline void skb_trim(struct sk_buff *skb, unsigned int len)
 {
@@ -999,12 +1001,10 @@ static inline void skb_trim(struct sk_bu
 
 static inline int __pskb_trim(struct sk_buff *skb, unsigned int len)
 {
-	if (!skb->data_len) {
-		skb->len  = len;
-		skb->tail = skb->data+len;
-		return 0;
-	}
-	return ___pskb_trim(skb, len, 1);
+	if (skb->data_len)
+		return ___pskb_trim(skb, len);
+	__skb_trim(skb, len);
+	return 0;
 }
 
 static inline int pskb_trim(struct sk_buff *skb, unsigned int len)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index fb3770f..0af4861 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -800,12 +800,10 @@ struct sk_buff *skb_pad(struct sk_buff *
return nskb;
 }  
  
-/* Trims skb to length len. It can change skb pointers, if realloc is 1.
- * If realloc==0 and trimming is impossible without change of data,
- * it is BUG().
+/* Trims skb to length len. It can change skb pointers.
  */
 
-int ___pskb_trim(struct sk_buff *skb, unsigned int len, int realloc)
+int ___pskb_trim(struct sk_buff *skb, unsigned int len)
 {
 	int offset = skb_headlen(skb);
 	int nfrags = skb_shinfo(skb)->nr_frags;
@@ -815,7 +813,6 @@ int ___pskb_trim(struct sk_buff *skb, un
 		int end = offset + skb_shinfo(skb)->frags[i].size;
 		if (end > len) {
 			if (skb_cloned(skb)) {
-				BUG_ON(!realloc);
 				if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
 					return -ENOMEM;
 			}

Re: ipsec tunnel asymmetrical mtu

2006-06-09 Thread Marco Berizzi


Marco Berizzi wrote:


Marco Berizzi wrote:


Herbert Xu wrote:


However, the fact that the tcpdump causes more chunky packets to
make it through could be an indication that there is a bug somewhere
in our NAT/IPsec code or at least a suboptimal memory allocation
strategy that's somehow avoided when AF_PACKET pins the skb down.


JFYI: same problem with 2.6.17-rc4-git5


JFYI: same problem with 2.6.17-rc6



Re: [patch 5/9] secmark: Add secmark support to core networking.

2006-06-09 Thread James Morris

On Fri, 9 Jun 2006, David Miller wrote:

 Remember James, you're on the hook now to shrink sk_buff
 when you get a chance :-)

Yep, I remember.


-- 
James Morris
[EMAIL PROTECTED]

Re: [PATCH 2.6.17-rc6-mm1 ] net: RFC 3828-compliant UDP-Lite support

2006-06-09 Thread Gerrit Renker

Quoting David Miller:
|  From: Gerrit Renker [EMAIL PROTECTED]
|  Date: Thu, 8 Jun 2006 21:09:33 +0100
|  
|   That is why I held back regarding the IPv6 port:
snip 
|  
|  It's not like an ipv6 port is such a big pile of work.
|  
I see the point and will port to v6 (have asked colleagues for help).
Until then, I will keep an up-to-date (-mm) patch in the tarball

http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/udplite_linux.tar.gz

This has applications as well. I would value any more input: the suggestion to
use SOCK_DGRAM has already been integrated and proved a really good idea (much
less to patch, cleaner code). Usually an update is there on the same day the new
kernel comes out.
Thank you for your replies and comments, I will be back when the v6 side is
ready.

Re: Using netconsole for debugging suspend/resume

2006-06-09 Thread Rafael J. Wysocki

On Friday 09 June 2006 03:56, Jeremy Fitzhardinge wrote:
 Rafael J. Wysocki wrote:
  Please try doing "echo 8 > /proc/sys/kernel/printk" before suspend.

 Um, why?  That would increase the amount of log output, but I don't see 
 how it would help with netconsole preventing suspend, or not being able 
 to see console messages on a blank screen after resume.

Ah, that's after resume.  Sorry for the noise. :-)

Rafael

[Fwd: Packet Lost] ip_rt_bug error

2006-06-09 Thread Vasantha Kumar Puttappa

Hi,
  I am working on a small application using iptables/libipq. The
application captures specific packets based on their destination IP
address, then encapsulates each captured IP packet inside a new IP packet.

The encapsulation works fine with kernel-2.6.11-6 (Mandriva 2005) and
iptables v1.2.9: I can capture the encapsulated packets with tcpdump on
the sender side, i.e. the packets are being put onto the network.

It does not work with kernel-2.6.12-12 (Mandriva 2006) and
iptables 1.3.5. There are no errors after calling ipq_set_verdict, but
the packets are not put onto the network; they are lost after the call
to ipq_set_verdict.

In my syslog I can see ip_rt_bug, but doesn't ip_rt_bug occur only when
the destination address is changed to a local address?

In my application I am changing the destination IP to another non-local IP
address.

Please let me know if you need more information.

As you already have experience with these kinds of errors, please help me
out here.

Thanks







Re: Problem authenticating using WPA with bcm43xx-softmac

2006-06-09 Thread Johannes Berg

On Wed, 2006-06-07 at 13:12 -0500, Larry Finger wrote:
  (ie, add the hh before the x to tell the print that it's a char)
  
 That doesn't work - the result is
 
 %hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx%hx

Looks like the kernel doesn't support that modifier.

 I changed the line to cast the output byte as a u8 as follows:
 
  dprintk("%.2x", (u8)mac->wpa.IE[i]);
 
 This produces the line
 
 generic IE set to dd160050f2010150f2020150f2020150f202
 
 This is the WPA IE supplied by wpa_supplicant and it matches the one used in 
 the ndiswrapper case. 
 One mystery solved, 

Yeah good :)
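
A small userspace sketch of why the cast helps, on the assumption (not confirmed in this thread) that the IE bytes are stored in a signed char type and that the earlier bad output came from sign extension. The helper names are made up for illustration, and the first helper only models what default argument promotion does to a negative byte.

```c
#include <stdio.h>

/* Models handing a signed char straight to %x: the value is widened
 * with sign extension first, so a byte like 0xdd gains leading ff digits. */
static void hex_promoted(signed char b, char *out, size_t n)
{
	snprintf(out, n, "%.2x", (unsigned int)b);
}

/* Narrow to an unsigned 8-bit value first, as the (u8) cast does. */
static void hex_as_u8(signed char b, char *out, size_t n)
{
	snprintf(out, n, "%.2x", (unsigned int)(unsigned char)b);
}
```

For the first byte of the IE above (0xdd), hex_as_u8() yields "dd", while hex_promoted() yields a longer string that merely ends in "dd".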

 but why doesn't it work?

No idea. If we had a dump maybe we could tell :/

 Johannes - should I submit the patch to fix this printout, or would you like 
 to do it?

Please do.

johannes



Re: [patch 1/8] myri10ge: alpha build fix

2006-06-09 Thread Brice Goglin

A similar fix is included in the myri10ge update that Jeff merged into
netdev yesterday.

thanks,
Brice



[EMAIL PROTECTED] wrote:
 From: Andrew Morton [EMAIL PROTECTED]

 drivers/net/myri10ge/myri10ge.c: In function 'myri10ge_submit_8rx':
 drivers/net/myri10ge/myri10ge.c:772: error: 'DMA_32BIT_MASK' undeclared (first use in this function)
 drivers/net/myri10ge/myri10ge.c:772: error: (Each undeclared identifier is reported only once
 drivers/net/myri10ge/myri10ge.c:772: error: for each function it appears in.)
 drivers/net/myri10ge/myri10ge.c: In function 'myri10ge_probe':
 drivers/net/myri10ge/myri10ge.c:2607: error: 'DMA_64BIT_MASK' undeclared (first use in this function)
 drivers/net/myri10ge/myri10ge.c:2612: error: 'DMA_32BIT_MASK' undeclared (first use in this function)

 Cc: Brice Goglin [EMAIL PROTECTED]
 Cc: Jeff Garzik [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 ---

  drivers/net/myri10ge/myri10ge.c |2 ++
  1 file changed, 2 insertions(+)

 diff -puN drivers/net/myri10ge/myri10ge.c~myri10ge-alpha-build-fix drivers/net/myri10ge/myri10ge.c
 --- devel/drivers/net/myri10ge/myri10ge.c~myri10ge-alpha-build-fix	2006-06-03 21:13:30.0 -0700
 +++ devel-akpm/drivers/net/myri10ge/myri10ge.c	2006-06-03 21:13:43.0 -0700
 @@ -59,6 +59,8 @@
  #include <linux/crc32.h>
  #include <linux/moduleparam.h>
  #include <linux/io.h>
 +#include <linux/dma-mapping.h>
 +
  #include <net/checksum.h>
  #include <asm/byteorder.h>
  #include <asm/io.h>
 _
   


Re: Firewall question

2006-06-09 Thread Lennart Sorensen

On Fri, Jun 09, 2006 at 05:43:24AM +0200, Andi Kleen wrote:
 No one out on the internet, but it would be trivial for someone outside
 his house. All his traffic will be on a long unsecured cable. 
 
 That is why I would never bridge home ethernet traffic onto a DSL line.

Hmm, traffic sent between his machines would not go over the DSL since
the MAC address doesn't match the DSL modem (I would think so at
least).  It would be a mess if the DSL modem tried to forward all
traffic on an ethernet segment (well, it doesn't have the bandwidth, for
sure).  Maybe I am incorrectly assuming the DSL modem only forwards the
PPPoE traffic being sent to it.  I could see broadcast traffic being
forwarded, although ARPs and such are generally not that interesting.

Len Sorensen

[PATCH] ehea: IBM eHEA Ethernet Device Driver - first full release

2006-06-09 Thread Jan-Bernd Themann


Hello,

here is the URL for our device driver. It is a tarball containing
a patch set for kernel 2.6.17-rc6. This version should compile
without warning.

http://prdownloads.sourceforge.net/ibmehcad/ehea_EHEA_0005_2.6.17-rc6.tgz?download

Signed-off-by: Jan-Bernd Themann [EMAIL PROTECTED]
Changelog-by:  Jan-Bernd Themann [EMAIL PROTECTED]

Differences to patch set http://www.spinics.net/lists/netdev/msg05889.html

Changelog:

- Added Kconfig and Makefile patch in drivers/net
- Changed tarball to patches instead of .c files

Jan-Bernd


 Patches and new drivers should always go to the mailing
 list.. except for the case where they are too big to
 be posted to the mailing list.  For that special case,
 a URL to a patch should be posted.

Jeff

Re: [patch 6/8] drivers/char/hw_random.c: remove assert()'s

2006-06-09 Thread Jeff Garzik


[EMAIL PROTECTED] wrote:

From: Adrian Bunk [EMAIL PROTECTED]

Remove the assert()'s from drivers/char/hw_random.c since you both needed
to enable a manual option in the driver source to make them effective and
they only covered some obviously impossible cases.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]


100% NAK.  They are there, obviously, for driver debugging and 
development.  Just like libata's debug stuff, you certainly have to 
enable them manually.


Until this driver goes away (real soon, right?), the debugging facility 
should stay.
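
As a generic illustration of the kind of facility being defended, here is a minimal sketch of an assertion that costs nothing unless a developer switches it on by hand in the source. DRV_DEBUG, drv_assert, and checked_len are hypothetical names chosen for this example; this is not the code in hw_random.c.

```c
#include <stdio.h>

/* Hypothetical switch: uncomment by hand while debugging the driver. */
/* #define DRV_DEBUG */

static int drv_assert_failures;

#ifdef DRV_DEBUG
/* Debug build: report the failed expression and keep a count. */
#define drv_assert(expr)						\
	do {								\
		if (!(expr)) {						\
			drv_assert_failures++;				\
			fprintf(stderr, "Assertion failed! %s, %s:%d\n", \
				#expr, __FILE__, __LINE__);		\
		}							\
	} while (0)
#else
/* Normal build: the check compiles away entirely. */
#define drv_assert(expr) do { } while (0)
#endif

/* Example use: an "impossible" case is checked only in debug builds. */
static int checked_len(int len)
{
	drv_assert(len >= 0);
	return len;
}
```

With the switch left off, as it is here, checked_len() is a plain pass-through and no failure is ever counted.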


Jeff




Re: Problem authenticating using WPA with bcm43xx-softmac

2006-06-09 Thread Larry Finger


Johannes Berg wrote:

On Wed, 2006-06-07 at 13:12 -0500, Larry Finger wrote:

but why doesn't it work?


No idea. If we had a dump maybe we could tell :/


Do you mean a special dump, or is the kernel debug output and wpa_supplicant 
debug output sufficient?

Larry

Re: netif_tx_disable vs netif_stop_queue (possible races?)

2006-06-09 Thread Daniel Drake


Herbert Xu wrote:

Daniel Drake [EMAIL PROTECTED] wrote:
More specifically, we're talking about drivers/usb/net/usbnet.c and the 
usbnet_disconnect() function.  The race I am highlighting is that 
usbnet's hard_start_xmit handler (usbnet_start_xmit) may be running when 
the disconnect happens.


Is this a possible scenario?


It should be safe, if only because of the synchronize_net that occurs
before a netdev can be freed.


Can I interpret your response as: If the TX queue is disabled in 
advance, no hard_start_xmit functions will be running on any CPU after 
synchronize_net() has returned?


The synchronize_net() code doesn't make it very clear.

Thanks,
Daniel

Re: Problem authenticating using WPA with bcm43xx-softmac

2006-06-09 Thread Johannes Berg

On Fri, 2006-06-09 at 10:31 -0500, Larry Finger wrote:

 Do you mean a special dump, or is the kernel debug output and wpa_supplicant 
 debug output sufficient?

I was thinking of packet dumps but earlier you said you couldn't create
any so I'm out of ideas for now.

johannes



r8169: freeze at high speeds

2006-06-09 Thread Mourad De Clerck

Hello,


I have a problem where my machine freezes as soon as I send it data at
high speeds. It works perfectly fine when transferring files slowly
(over the internet for instance). But after sending some data for a few
seconds at relatively high speed (let's say 10MB/sec), the whole
machine just freezes. I've had it happen at relatively low speeds too
(1MB/sec), but it's much less frequent. When I stick to really slow
speeds, I can work without problems for days.

I'm using the latest Debian kernel, which is based on 2.6.16.17 at the
moment.

btw: I tried using ethtool to force it on 100Mbit, but it seems to have
little effect (it stays put on 1000Mbit). Autonegotiation stays on
even after trying to switch it off manually.

The machine is a nforce2-based k7 (no SMP). One possibly weird thing is
that my SATA controller and my RT8169 are both on the same PCI card
(behind a PCI bridge) - lspci is attached.

I tried the patch that Francois Romieu posted on 2006-04-18, but it
still locks up. I also tried the r1000 driver from Realtek themselves,
but that one locks up too.

btw2: I do often use nvidia's binary driver, but I made sure it had
never been loaded when testing (fresh reboot, without nvidia.ko ever
being loaded).

Is this a known issue? Can I do anything to track down this bug?


Thank you,


-- Mourad DC

00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) (rev c1)
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0
	Region 0: Memory at e000 (32-bit, prefetchable) [size=128M]
	Capabilities: [40] AGP version 3.0
		Status: RQ=32 Iso- ArqSz=2 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3+ Rate=x4,x8
		Command: RQ=1 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x8
	Capabilities: [60] HyperTransport: Host or Secondary Interface
		Command: WarmRst+ DblEnd-
		Link Control: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0
		Link Config: MLWI=8bit MLWO=8bit LWI=8bit LWO=8bit
		Revision ID: 0.16

00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev c1)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Unknown device f541
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev c1)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Unknown device f541
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev c1)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Unknown device f541
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev c1)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Unknown device f541
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev c1)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Unknown device f541
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a4)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Unknown device f541
	Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 0
	Capabilities: [48] HyperTransport: Slave or Primary Interface
		Command: BaseUnitID=1 UnitCnt=15 MastHost- DefDir-
		Link Control 0: CFlE- CST- CFE- <LkFail- Init+ EOC+ TXO- <CRCErr=0
		Link Config 0: MLWI=8bit MLWO=8bit LWI=8bit LWO=8bit
		Link Control 1: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO+ <CRCErr=0
		Link Config 1: MLWI=8bit MLWO=8bit LWI=8bit LWO=8bit
		Revision ID: 0.00

00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
	Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer

Re: Problem authenticating using WPA with bcm43xx-softmac

2006-06-09 Thread Larry Finger


Johannes Berg wrote:

On Fri, 2006-06-09 at 10:31 -0500, Larry Finger wrote:


Do you mean a special dump, or is the kernel debug output and wpa_supplicant 
debug output sufficient?


I was thinking of packet dumps but earlier you said you couldn't create
any so I'm out of ideas for now.


Actually, I will be able to get packet dumps. The main disk drive in my server, which is a laptop, 
died last night. This will be an opportunity to upgrade it to a newer OS that will be able to run my 
other Wifi card. Neither one will be able to authenticate, but one can packet dump for the other. 
I'll send them when I get the server running again.


Larry



Re: Using netconsole for debugging suspend/resume

2006-06-09 Thread Matt Mackall

On Fri, Jun 09, 2006 at 07:50:25AM +0200, Andi Kleen wrote:
 On Friday 09 June 2006 07:23, David Miller wrote:
  From: Auke Kok [EMAIL PROTECTED]
  Date: Thu, 08 Jun 2006 22:13:48 -0700

   netconsole should retry. There is no timeout programmed here since that
   might lose important information, and you would rather have netconsole
   survive an odd unplugged cable than lose vital debugging information when
   the system is busy, for instance. (Losing link will cause the interface to
   be down and thus the queue to be stopped.)

  I completely disagree that netpoll should loop when the ethernet
  cable is unplugged.

 Currently it is a bit dumb and doesn't distinguish the various cases
 well.

 I submitted a patch to make the loop a bit more clever at some point. It can
 still be found in the netdev archives.

Agreed that timeouts should happen.

IIRC, the trouble with your patch was that it a) timed out on far too
short a timescale and b) locked up on my box. Unfortunately, so did my
own patch, which made timeouts approximately 1ms.
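
The disagreement above is between retrying forever and giving up after a timeout. A minimal userspace sketch of the bounded variant follows. It assumes a hypothetical try_send hook standing in for the driver transmit path and an arbitrary attempt budget; none of these names are netpoll's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transmit hook: returns true once the frame was queued. */
typedef bool (*try_send_fn)(void *ctx);

/*
 * Try to send up to max_tries times instead of spinning forever.
 * Returns the number of attempts used on success, or -1 if the
 * budget ran out (e.g. the cable is unplugged and the queue stays
 * stopped).
 */
int send_with_budget(try_send_fn try_send, void *ctx, int max_tries)
{
	assert(try_send != NULL);

	for (int attempt = 1; attempt <= max_tries; attempt++) {
		if (try_send(ctx))
			return attempt;
		/* A real implementation would poll the device or delay here. */
	}
	return -1;
}

/* Example hook: fails until the counter in ctx reaches zero. */
bool fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

How large the budget should be is exactly the open question in this thread: too small and messages are dropped while the system is merely busy, too large and an unplugged cable stalls the machine.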

-- 
Mathematics is the supreme nostalgia of our time.

[PATCH] ipv6: order addresses by scope

2006-06-09 Thread Brian Haley

If IPv6 addresses are ordered by scope, then ipv6_dev_get_saddr() can
break out of the device addr_list for() loop as soon as the candidate
source address scope is less than the destination address scope.


Signed-off-by: Brian Haley [EMAIL PROTECTED]
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 445006e..e1d6a6f 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -509,6 +509,25 @@ void inet6_ifa_finish_destroy(struct ine
 	kfree(ifp);
 }
 
+static void
+ipv6_link_dev_addr(struct inet6_dev *idev, struct inet6_ifaddr *ifp)
+{
+	struct inet6_ifaddr *ifa, **ifap;
+
+	/*
+	 * Each device address list is sorted in order of scope -
+	 * global before linklocal.
+	 */
+	for (ifap = &idev->addr_list; (ifa = *ifap) != NULL;
+	     ifap = &ifa->if_next) {
+		if (ifp->scope > ifa->scope)
+			break;
+	}
+
+	ifp->if_next = *ifap;
+	*ifap = ifp;
+}
+
 /* On success it returns ifp with increased reference count */
 
 static struct inet6_ifaddr *
@@ -574,8 +593,7 @@ ipv6_add_addr(struct inet6_dev *idev, co
 
 	write_lock(&idev->lock);
 	/* Add to inet6_dev unicast addr list. */
-	ifa->if_next = idev->addr_list;
-	idev->addr_list = ifa;
+	ipv6_link_dev_addr(idev, ifa);
 
 #ifdef CONFIG_IPV6_PRIVACY
 	if (ifa->flags&IFA_F_TEMPORARY) {
@@ -982,7 +1000,7 @@ int ipv6_dev_get_saddr(struct net_device
 					continue;
 			} else if (score.scope < hiscore.scope) {
 				if (score.scope < daddr_scope)
-					continue;
+					break; /* addresses sorted by scope */
 				else {
 					score.rule = 2;
 					goto record_it;
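
The idea behind the patch can be tried outside the kernel. The sketch below is a simplified userspace illustration, not the kernel code: struct addr, link_addr() and visited_before_cutoff() are hypothetical names, and the scope is reduced to a plain integer where a larger value means a wider scope.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct inet6_ifaddr: only scope and next. */
struct addr {
	int scope;		/* larger value = wider scope */
	struct addr *next;
};

/*
 * Insert ifp so the list stays sorted by descending scope
 * (global before link-local). Walking a pointer-to-pointer avoids a
 * special case for inserting at the head.
 */
void link_addr(struct addr **head, struct addr *ifp)
{
	struct addr *ifa, **ifap;

	assert(head != NULL && ifp != NULL);

	for (ifap = head; (ifa = *ifap) != NULL; ifap = &ifa->next) {
		if (ifp->scope > ifa->scope)
			break;
	}
	ifp->next = *ifap;
	*ifap = ifp;
}

/*
 * Count how many entries a lookup has to visit when it may stop at the
 * first address whose scope is below min_scope: every later entry is
 * known to have an equal or smaller scope.
 */
int visited_before_cutoff(const struct addr *head, int min_scope)
{
	int visited = 0;

	for (const struct addr *ifa = head; ifa != NULL; ifa = ifa->next) {
		if (ifa->scope < min_scope)
			break;
		visited++;
	}
	return visited;
}
```

The early exit is only valid because every insertion goes through the sorted-insert helper; a single unsorted insertion elsewhere would make the break skip valid candidates.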

Re: [RFT] Realtek 8168 ethernet support

2006-06-09 Thread Francois Romieu

Jeff Garzik [EMAIL PROTECTED] :
 Randy.Dunlap wrote:
 Conversely, any reason to use the RealTek r1000 driver?
 
 FWIW, RealTek emailed me about merging r1000.  I suggested that, if the 

Which one ?

r1000_n.c where #define RELEASE_DATE 2006/02/23

-- 
Ueimor

Re: [PATCH 3/5] ehea: queue management

2006-06-09 Thread John Rose

Hi-

 +#define EHEA_MEM_START 0xc000

You probably don't want to hardcode this.  Maybe KERNELBASE from page.h?

 +
 +int ehea_reg_mr_adapter(struct ehea_adapter *adapter)
 +{
 +	int i;
 +	u64 hret;
 +	u64 start = EHEA_MEM_START;
 +	u64 end = (u64) high_memory;
 +	u64 nr_pages = (end - start) / PAGE_SIZE;
 +	u32 acc_ctrl = EHEA_MEM_ACC_CTRL;
 +
 +	EDEB_EN(7, "adapter=%p", adapter);
 +
 +	hret = ehea_h_alloc_resource_mr(adapter->handle,
 +					start,
 +					end - start,
 +					acc_ctrl,
 +					adapter->pd,
 +					&adapter->mr_handle,
 +					&adapter->lkey);
 +	if (hret != H_SUCCESS) {
 +		EDEB_EX(4, "Error: hret=%lX\n", hret);
 +		return -EINVAL;
 +	}
 +
 +	for (i = 0; i < nr_pages; i++) {
 +		hret = ehea_h_register_rpage_mr(adapter->handle,
 +						adapter->mr_handle,
 +						0,
 +						0,
 +						virt_to_abs(
 +						(void *)(((u64) start)
 +						+ (i * PAGE_SIZE))),
 +						1);
 +
 +		if (((hret != H_SUCCESS) && (hret != H_PAGE_REGISTERED))) {
 +			ehea_h_free_resource_mr(adapter->handle, adapter->mr_handle);
 +			EDEB_EX(4, " register rpage_mr: hret=%lX\n", hret);
 +			return -EINVAL;
 +		}
 +	}

This creates DMA mappings for the entirety of kernel memory, right?  Has
this been run by the ppc64 folks for possible impacts?
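
To get a feel for the cost behind this question: the quoted loop registers one page per call, so the number of calls grows linearly with the size of the mapped range. The helper below is a hypothetical userspace sketch of that arithmetic; the 4 KiB page size and 1 GiB range in the usage note are illustrative assumptions, not values taken from the ehea hardware or from ppc64 configurations.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of single-page registration calls needed to cover
 * [start, end) when each call registers exactly one page, as the
 * quoted loop does. page_size must be non-zero.
 */
uint64_t pages_to_register(uint64_t start, uint64_t end, uint64_t page_size)
{
	assert(page_size != 0 && end >= start);
	return (end - start) / page_size;
}
```

Under those assumptions, pages_to_register(0, 1ULL << 30, 4096) is 262144 calls for every GiB of kernel memory.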

Thanks-
John


Re: [RFT] Realtek 8168 ethernet support

2006-06-09 Thread Jeff Garzik


Francois Romieu wrote:

Jeff Garzik [EMAIL PROTECTED] :

Randy.Dunlap wrote:

Conversely, any reason to use the RealTek r1000 driver?
FWIW, RealTek emailed me about merging r1000.  I suggested that, if the 


Which one ?

r1000_n.c where #define RELEASE_DATE 2006/02/23


They didn't say.  Just r1000

Jeff




Re: [NET]: Add netif_tx_lock

2006-06-09 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 9 Jun 2006 15:48:16 +1000

 On Thu, Jun 01, 2006 at 09:15:03PM +1000, herbert wrote:

  OK, here is a patch which does this.

  [NET]: Add netif_tx_lock

 Just noticed that I showed dyslexia in winbond.c :) Here is the corrected
 version.

 [NET]: Add netif_tx_lock

:-)  Applied, thanks a lot Herbert.

Re: Using netconsole for debugging suspend/resume

2006-06-09 Thread Mark Lord


Andi Kleen wrote:


If your laptop has firewire you can also use firescope.
(ftp://ftp.suse.com/pub/people/ak/firescope/) 

..

FW keeps running as long as nobody resets the ieee1394 chip.


This looks interesting.  But how does one set it up for use
on the *other* end of that firewire cable?  The Quickstart and
manpage don't seem to describe this fully.

Thanks

Re: [patch 4/8] e1000: prevent statistics from getting garbled during reset

2006-06-09 Thread Auke Kok



Ack,

Jeff, please pull this patch from:

   git://lost.foo-projects.org/~ahkok/git/netdev-2.6 upstream

which is against netdev-2.6#upstream cac925a4aab1b7233d3beb591f53498816058a08

Cheers,

Auke


---

Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
Cc: Jesse Brandeburg [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Acked-by: Auke Kok [EMAIL PROTECTED]

---
 e1000_main.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletion(-)


[EMAIL PROTECTED] wrote:

From: Linas Vepstas [EMAIL PROTECTED]

If a PCI bus error/fault triggers a PCI bus reset, attempts to get the
ethernet packet count statistics from the hardware will fail, returning
garbage data upstream.  This patch skips statistics data collection if the
PCI device is not on the bus.

[snip]

e1000: prevent statistics from garbling during bus resets

If a PCI bus error/fault triggers a PCI bus reset, attempts to get
the ethernet packet count statistics from the hardware will fail,
returning garbage data upstream.  This patch skips statistics data
collection if the PCI device is not on the bus.

Signed-off-by: Linas Vepstas [EMAIL PROTECTED]
Cc: Jesse Brandeburg [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Acked-by: Auke Kok [EMAIL PROTECTED]

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 56c7492..a373ccb 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3045,14 +3045,20 @@ void
 e1000_update_stats(struct e1000_adapter *adapter)
 {
 	struct e1000_hw *hw = &adapter->hw;
+	struct pci_dev *pdev = adapter->pdev;
 	unsigned long flags;
 	uint16_t phy_tmp;
 
 #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
 
-	/* Prevent stats update while adapter is being reset */
+	/*
+	 * Prevent stats update while adapter is being reset, or if the pci
+	 * connection is down.
+	 */
 	if (adapter->link_speed == 0)
 		return;
+	if (pdev->error_state && pdev->error_state != pci_channel_io_normal)
+		return;
 
 	spin_lock_irqsave(&adapter->stats_lock, flags);
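
The two early returns in the patch amount to a small predicate. The sketch below restates it in userspace C so the cases can be checked in isolation. The enum and struct are hypothetical stand-ins, not the kernel's pci_channel_state or struct e1000_adapter, and the value 0 is assumed to mean that the error state was never set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI channel states; 0 means "not set". */
enum channel_state {
	CHANNEL_UNSET = 0,
	CHANNEL_IO_NORMAL,
	CHANNEL_IO_FROZEN,
	CHANNEL_IO_PERM_FAILURE,
};

struct adapter_view {
	int link_speed;			/* 0 while the adapter is being reset */
	enum channel_state error_state;
};

/*
 * Mirror of the two early returns in the patch: reading counters is only
 * safe when the link is up and the channel is either unset or normal.
 */
bool stats_readable(const struct adapter_view *a)
{
	assert(a != NULL);

	if (a->link_speed == 0)
		return false;
	if (a->error_state && a->error_state != CHANNEL_IO_NORMAL)
		return false;
	return true;
}
```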

Re: [patch 06/17] neighbour.c, pneigh_get_next() skips published entry

2006-06-09 Thread Jari Takkala


On Fri, 9 Jun 2006, Herbert Xu wrote:

 Could you post an exact sequence of commands that reproduces the bug?
 That would help us in verifying your fix.
 

Publish a large number of ARP entries (greater than 10 required on my
system):
'arp -Ds IP iface pub'

View output of /proc/net/arp:

'dd if=/proc/net/arp of=arp-1024.out bs=1024'

The produced output will be missing on average one entry for every ten
entries published. Occasionally, the output will vary and the missing
entry will be displayed.



[RFC] [patch 4/6] [Network namespace] Network inet devices isolation

2006-06-09 Thread dlezcano

The network isolation relies on the fact that an application cannot
use IP addresses that do not belong to the container in which it is
running. This patch isolates the inet device level by adding a
network namespace pointer to struct in_ifaddr. When an IP address is
set inside a network namespace, struct in_ifaddr is filled with the
current namespace pointer. The loopback address is a special case: it
belongs to all namespaces, which is marked by a NULL network namespace
pointer. This patch isolates the ifconfig and ip addr commands, so an
IP address set in one namespace is not visible from other network
namespaces.
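
The visibility rule described above (owned by the current namespace, or shared when the pointer is NULL) can be sketched as a small predicate. This is a userspace illustration with hypothetical names (struct ifaddr_view, ifa_visible); it is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct net_namespace { int id; };	/* placeholder contents */

/* Simplified address entry: only the owning namespace matters here. */
struct ifaddr_view {
	const struct net_namespace *net_ns;	/* NULL = shared (loopback) */
};

/*
 * An address is visible from namespace cur if it is shared by every
 * namespace (NULL owner) or owned by cur itself.
 */
bool ifa_visible(const struct ifaddr_view *ifa, const struct net_namespace *cur)
{
	assert(ifa != NULL);
	return ifa->net_ns == NULL || ifa->net_ns == cur;
}
```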

Replace-Subject: [Network namespace] Network inet devices isolation 
Signed-off-by: Daniel Lezcano [EMAIL PROTECTED] 
--
 include/linux/inetdevice.h |1 +
 net/ipv4/devinet.c |   28 +++-
 2 files changed, 28 insertions(+), 1 deletion(-)

Index: 2.6-mm/include/linux/inetdevice.h
===
--- 2.6-mm.orig/include/linux/inetdevice.h
+++ 2.6-mm/include/linux/inetdevice.h
@@ -99,6 +99,7 @@
 	unsigned char		ifa_flags;
 	unsigned char		ifa_prefixlen;
 	char			ifa_label[IFNAMSIZ];
+	struct net_namespace	*ifa_net_ns;
 };
 
 extern int register_inetaddr_notifier(struct notifier_block *nb);
Index: 2.6-mm/net/ipv4/devinet.c
===
--- 2.6-mm.orig/net/ipv4/devinet.c
+++ 2.6-mm/net/ipv4/devinet.c
@@ -54,6 +54,7 @@
 #include <linux/notifier.h>
 #include <linux/inetdevice.h>
 #include <linux/igmp.h>
+#include <linux/net_ns.h>
 #ifdef CONFIG_SYSCTL
 #include <linux/sysctl.h>
 #endif
@@ -257,6 +258,7 @@
 
 			if (!(ifa->ifa_flags & IFA_F_SECONDARY) ||
 			    ifa1->ifa_mask != ifa->ifa_mask ||
+			    ifa->ifa_net_ns != net_ns() ||
 			    !inet_ifa_match(ifa1->ifa_address, ifa)) {
 				ifap1 = &ifa->ifa_next;
 				prev_prom = ifa;
@@ -317,6 +319,8 @@
 	if (destroy) {
 		inet_free_ifa(ifa1);
 
+		put_net_ns(ifa1->ifa_net_ns);
+
 		if (!in_dev->ifa_list)
 			inetdev_destroy(in_dev);
 	}
@@ -343,6 +347,7 @@
 		    ifa->ifa_scope <= ifa1->ifa_scope)
 			last_primary = &ifa1->ifa_next;
 		if (ifa1->ifa_mask == ifa->ifa_mask &&
+		    ifa1->ifa_net_ns == ifa->ifa_net_ns &&
 		    inet_ifa_match(ifa1->ifa_address, ifa)) {
 			if (ifa1->ifa_local == ifa->ifa_local) {
 				inet_free_ifa(ifa);
@@ -437,6 +442,8 @@
 
 	for (ifap = &in_dev->ifa_list; (ifa = *ifap) != NULL;
 	     ifap = &ifa->ifa_next) {
+		if (ifa->ifa_net_ns != net_ns())
+			continue;
 		if ((rta[IFA_LOCAL - 1] &&
 		     memcmp(RTA_DATA(rta[IFA_LOCAL - 1]),
 			    &ifa->ifa_local, 4)) ||
@@ -497,6 +504,9 @@
 	ifa->ifa_scope = ifm->ifa_scope;
 	in_dev_hold(in_dev);
 	ifa->ifa_dev   = in_dev;
+	ifa->ifa_net_ns = net_ns();
+	get_net_ns(net_ns());
+
 	if (rta[IFA_LABEL - 1])
 		rtattr_strlcpy(ifa->ifa_label, rta[IFA_LABEL - 1], IFNAMSIZ);
 	else
@@ -631,10 +641,15 @@
 			for (ifap = &in_dev->ifa_list; (ifa = *ifap) != NULL;
 			     ifap = &ifa->ifa_next)
 				if (!strcmp(ifr.ifr_name, ifa->ifa_label))
-					break;
+					if (!ifa->ifa_net_ns ||
+					    ifa->ifa_net_ns == net_ns())
+						break;
 		}
 	}
 
+	if (ifa && ifa->ifa_net_ns && ifa->ifa_net_ns != net_ns())
+		goto done;
+
 	ret = -EADDRNOTAVAIL;
 	if (!ifa && cmd != SIOCSIFADDR && cmd != SIOCSIFFLAGS)
 		goto done;
@@ -678,6 +693,12 @@
 			ret = -ENOBUFS;
 			if ((ifa = inet_alloc_ifa()) == NULL)
 				break;
+			if (!LOOPBACK(sin->sin_addr.s_addr)) {
+				ifa->ifa_net_ns = net_ns();
+				get_net_ns(net_ns());
+			} else
+				ifa->ifa_net_ns = NULL;
+
 			if (colon)
 				memcpy(ifa->ifa_label, ifr.ifr_name, IFNAMSIZ);
 			else
@@ -782,6 +803,8 @@
 		goto out;
 
 	for (; ifa; ifa = ifa->ifa_next) {
+		if (ifa->ifa_net_ns && ifa->ifa_net_ns != net_ns())
+			continue;
 		if (!buf) {
 			done += sizeof(ifr);
 			continue;
@@ -1012,6

[RFC] [patch 2/6] [Network namespace] Network device sharing by view

2006-06-09 Thread dlezcano

Adds a device list view to the network namespace. This view is emptied
when the unshare is done. The view is filled/emptied by a set of
functions that can be called by an external module.
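
A minimal userspace sketch of such a view follows, assuming a plain singly linked list and leaving out the locking and device reference counting that the patch uses. All names here (struct view, view_register, view_find_by_name, view_clear) are hypothetical and are not the functions added by this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 16	/* same size as the kernel's IFNAMSIZ */

struct device { char name[NAME_MAX_LEN]; };

/* One entry of a per-namespace view: a reference to a shared device. */
struct view_entry {
	struct device *dev;
	struct view_entry *next;
};

struct view { struct view_entry *head; };

/* Add dev to the view. Returns 0 on success, -1 if allocation fails. */
int view_register(struct view *v, struct device *dev)
{
	struct view_entry *e;

	assert(v != NULL && dev != NULL);

	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;
	e->dev = dev;
	e->next = v->head;
	v->head = e;
	return 0;
}

/* Look a device up by name, but only among the devices in this view. */
struct device *view_find_by_name(const struct view *v, const char *name)
{
	for (const struct view_entry *e = v->head; e != NULL; e = e->next) {
		if (strncmp(e->dev->name, name, NAME_MAX_LEN) == 0)
			return e->dev;
	}
	return NULL;
}

/* Empty the view, as happens on unshare; the devices themselves stay. */
void view_clear(struct view *v)
{
	while (v->head != NULL) {
		struct view_entry *e = v->head;

		v->head = e->next;
		free(e);
	}
}
```

The point of the indirection is that the same device can appear in several views without being copied; a lookup sees only what was explicitly registered in the caller's view.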

Replace-Subject: [Network namespace] Network device sharing by view
Signed-off-by: Daniel Lezcano [EMAIL PROTECTED] 
--
 include/linux/net_ns.h |2 
 include/linux/net_ns_dev.h |   32 +++
 init/version.c |4 
 net/core/Makefile  |2 
 net/core/net_ns_dev.c  |  205 +
 net/net_ns.c   |6 +
 6 files changed, 250 insertions(+), 1 deletion(-)

Index: 2.6-mm/include/linux/net_ns_dev.h
===
--- /dev/null
+++ 2.6-mm/include/linux/net_ns_dev.h
@@ -0,0 +1,32 @@
+#ifndef _LINUX_NET_NS_DEV_H
+#define _LINUX_NET_NS_DEV_H
+
+struct net_device;
+
+struct net_ns_dev {
+   struct list_head list;
+   struct net_device *dev;
+};
+
+struct net_ns_dev_list {
+   struct list_head list;
+   rwlock_t lock;
+};
+
+extern int net_ns_dev_unregister(struct net_device *dev,
+struct net_ns_dev_list *devlist);
+
+extern int net_ns_dev_register(struct net_device *dev,
+  struct net_ns_dev_list *devlist);
+
+extern struct net_device *net_ns_dev_find_by_name(const char *devname,
+						  struct net_ns_dev_list *devlist);
+extern int net_ns_dev_remove(const char *devname,
+struct net_ns_dev_list *devlist);
+
+extern int net_ns_dev_add(const char *devname,
+ struct net_ns_dev_list *devlist);
+
+extern int free_net_ns_dev(struct net_ns_dev_list *devlist);
+
+#endif
Index: 2.6-mm/include/linux/net_ns.h
===
--- 2.6-mm.orig/include/linux/net_ns.h
+++ 2.6-mm/include/linux/net_ns.h
@@ -4,9 +4,11 @@
 #include <linux/kref.h>
 #include <linux/sched.h>
 #include <linux/nsproxy.h>
+#include <linux/net_ns_dev.h>
 
 struct net_namespace {
struct kref kref;
+   struct net_ns_dev_list dev_list;
 };
 
 extern struct net_namespace init_net_ns;
Index: 2.6-mm/net/core/net_ns_dev.c
===
--- /dev/null
+++ 2.6-mm/net/core/net_ns_dev.c
@@ -0,0 +1,205 @@
+/*
+ *  net_ns_dev.c - adds namespace network device view
+ *
+ *  Copyright (C) 2006 IBM
+ *
+ *  Author: Daniel Lezcano [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ */
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/netdevice.h>
+#include <linux/net_ns_dev.h>
+
+int free_net_ns_dev(struct net_ns_dev_list *devlist)
+{
+	struct list_head *l, *next;
+	struct net_ns_dev *db;
+	struct net_device *dev;
+
+	write_lock(&devlist->lock);
+	list_for_each_safe(l, next, &devlist->list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		dev = db->dev;
+		list_del(&db->list);
+		dev_put(dev);
+		kfree(db);
+	}
+	write_unlock(&devlist->lock);
+
+	return 0;
+}
+
+/*
+ * Remove a device from the namespace network devices list
+ * when unregistered from a namespace
+ * @dev : network device
+ * @dev_list: network namespace devices
+ * Return ENODEV if the device does not exist, 0 on success
+ */
+int net_ns_dev_unregister(struct net_device *dev,
+			  struct net_ns_dev_list *devlist)
+{
+	struct net_ns_dev *db;
+	struct list_head *l;
+	int ret = -ENODEV;
+
+	write_lock(&devlist->lock);
+	list_for_each(l, &devlist->list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		if (dev != db->dev)
+			continue;
+
+		list_del(&db->list);
+		dev_put(dev);
+		kfree(db);
+		ret = 0;
+		break;
+	}
+	write_unlock(&devlist->lock);
+	return ret;
+}
+
+EXPORT_SYMBOL_GPL(net_ns_dev_unregister);
+
+/*
+ * Add a device to the namespace network devices list
+ * when registered from a namespace
+ * @dev : network device
+ * @dev_list: network namespace devices
+ * Return ENOMEM if allocation fails, 0 on success
+ */
+int net_ns_dev_register(struct net_device *dev,
+			struct net_ns_dev_list *devlist)
+{
+	struct net_ns_dev *db;
+
+	db = kmalloc(sizeof(*db), GFP_KERNEL);
+	if (!db)
+		return -ENOMEM;
+
+	write_lock(&devlist->lock);
+	dev_hold(dev);
+	db->dev = dev;
+	list_add_tail(&db->list, &devlist->list);
+	write_unlock(&devlist->lock);
+
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(net_ns_dev_register);
+
+/*
+ * Add a device to the namespace network devices list
+ *

[RFC] [patch 3/6] [Network namespace] Network devices isolation

2006-06-09 Thread dlezcano

The dev list view is filled and used from here. The dev_base_list has
been replaced by the dev list view, and devices can be accessed only if
the view has the device in its list. All calls from userspace (ioctls,
netlink and procfs) will use the network devices view instead of the
global network device list.

Replace-Subject: [Network namespace] Network devices isolation 
Signed-off-by: Daniel Lezcano [EMAIL PROTECTED] 
--
 net/core/dev.c   |  147 ++-
 net/core/rtnetlink.c |   21 +--
 2 files changed, 126 insertions(+), 42 deletions(-)

Index: 2.6-mm/net/core/dev.c
===
--- 2.6-mm.orig/net/core/dev.c
+++ 2.6-mm/net/core/dev.c
@@ -115,6 +115,7 @@
 #include <net/iw_handler.h>
 #include <asm/current.h>
 #include <linux/audit.h>
+#include <linux/net_ns.h>
 #include <linux/dmaengine.h>
 
 /*
@@ -474,13 +475,16 @@
 
 struct net_device *__dev_get_by_name(const char *name)
 {
-	struct hlist_node *p;
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
+	struct list_head *l, *list = &dev_list->list;
+	struct net_ns_dev *db;
+	struct net_device *dev;
 
-	hlist_for_each(p, dev_name_hash(name)) {
-		struct net_device *dev
-			= hlist_entry(p, struct net_device, name_hlist);
+	list_for_each(l, list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		dev = db->dev;
 		if (!strncmp(dev->name, name, IFNAMSIZ))
-			return dev;
+			return dev;
 	}
 	return NULL;
 }
@@ -498,13 +502,14 @@
 
 struct net_device *dev_get_by_name(const char *name)
 {
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
 	struct net_device *dev;
 
-	read_lock(&dev_base_lock);
+	read_lock(&dev_list->lock);
 	dev = __dev_get_by_name(name);
 	if (dev)
 		dev_hold(dev);
-	read_unlock(&dev_base_lock);
+	read_unlock(&dev_list->lock);
 	return dev;
 }
 
@@ -521,11 +526,14 @@
 
 struct net_device *__dev_get_by_index(int ifindex)
 {
-	struct hlist_node *p;
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
+	struct list_head *l, *list = &dev_list->list;
+	struct net_ns_dev *db;
+	struct net_device *dev;
 
-	hlist_for_each(p, dev_index_hash(ifindex)) {
-		struct net_device *dev
-			= hlist_entry(p, struct net_device, index_hlist);
+	list_for_each(l, list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		dev = db->dev;
 		if (dev->ifindex == ifindex)
 			return dev;
 	}
@@ -545,13 +553,14 @@
 
 struct net_device *dev_get_by_index(int ifindex)
 {
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
 	struct net_device *dev;
 
-	read_lock(&dev_base_lock);
+	read_lock(&dev_list->lock);
 	dev = __dev_get_by_index(ifindex);
 	if (dev)
 		dev_hold(dev);
-	read_unlock(&dev_base_lock);
+	read_unlock(&dev_list->lock);
 	return dev;
 }
 
@@ -571,14 +580,24 @@
 
 struct net_device *dev_getbyhwaddr(unsigned short type, char *ha)
 {
-	struct net_device *dev;
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
+	struct list_head *l, *list = &dev_list->list;
+	struct net_ns_dev *db;
+	struct net_device *dev = NULL;
 
 	ASSERT_RTNL();
 
-	for (dev = dev_base; dev; dev = dev->next)
+	read_lock(&dev_list->lock);
+	list_for_each(l, list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		dev = db->dev;
 		if (dev->type == type &&
 		    !memcmp(dev->dev_addr, ha, dev->addr_len))
-			break;
+			goto out;
+	}
+	dev = NULL;
+out:
+	read_unlock(&dev_list->lock);
 	return dev;
 }
 
@@ -586,15 +605,25 @@
 
 struct net_device *dev_getfirstbyhwtype(unsigned short type)
 {
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
+	struct list_head *l, *list = &dev_list->list;
+	struct net_ns_dev *db;
 	struct net_device *dev;
 
 	rtnl_lock();
-	for (dev = dev_base; dev; dev = dev->next) {
+
+	read_lock(&dev_list->lock);
+	list_for_each(l, list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		dev = db->dev;
 		if (dev->type == type) {
 			dev_hold(dev);
-			break;
+			goto out;
 		}
 	}
+	dev = NULL;
+out:
+	read_unlock(&dev_list->lock);
 	rtnl_unlock();
 	return dev;
 }
@@ -614,16 +643,23 @@
 
 struct net_device * dev_get_by_flags(unsigned short if_flags, unsigned short mask)
 {
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
+	struct list_head *l, *list =

[RFC] [patch 6/6] [Network namespace] Network namespace debugfs

2006-06-09 Thread dlezcano

This patch is for testing purposes only. It makes it possible to read
which network devices are accessible and to add a network device to the
view. This RFC hack is purely for discussing the best way to do that.

After unsharing with the CLONE_NEWNET flag:
--
 To see which devices are accessible:
 cat /sys/kernel/debug/net_ns/dev

 To add a device:
 echo eth1 > /sys/kernel/debug/net_ns/dev

This functionality is intended to be implemented in a higher-level
container configuration.
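
Writing a device name to the debugfs file means extracting a short, newline-terminated string from the written bytes. The sketch below shows one way to do that parsing in userspace C. The function name parse_devname is hypothetical, and rejecting an empty name is an assumption of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 16	/* same size as the kernel's IFNAMSIZ */

/*
 * Copy the device name from buf (count bytes, as written by
 * "echo eth1 > file") into out, stopping at the first NUL or newline.
 * Returns the name length, or -1 if the name is empty or does not fit
 * in NAME_MAX_LEN including the terminating NUL.
 */
int parse_devname(const char *buf, size_t count, char out[NAME_MAX_LEN])
{
	size_t len = 0;

	assert(buf != NULL && out != NULL);

	while (len < count && buf[len] != '\0' && buf[len] != '\n')
		len++;

	if (len == 0 || len >= NAME_MAX_LEN)
		return -1;

	memcpy(out, buf, len);
	out[len] = '\0';
	return (int)len;
}
```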

Replace-Subject: [Network namespace] Network namespace debugfs
Signed-off-by: Daniel Lezcano [EMAIL PROTECTED] 
--
 fs/debugfs/Makefile |2 
 fs/debugfs/net_ns.c |  141 
 net/Kconfig |4 +
 3 files changed, 146 insertions(+), 1 deletion(-)

Index: 2.6-mm/fs/debugfs/Makefile
===
--- 2.6-mm.orig/fs/debugfs/Makefile
+++ 2.6-mm/fs/debugfs/Makefile
@@ -1,4 +1,4 @@
 debugfs-objs   := inode.o file.o
 
 obj-$(CONFIG_DEBUG_FS) += debugfs.o
-
+obj-$(CONFIG_NET_NS_DEBUG) += net_ns.o
Index: 2.6-mm/fs/debugfs/net_ns.c
===
--- /dev/null
+++ 2.6-mm/fs/debugfs/net_ns.c
@@ -0,0 +1,141 @@
+/*
+ *  net_ns.c - adds a net_ns/ directory to debug NET namespaces
+ *
+ *  Copyright (C) 2006 IBM
+ *
+ *  Author: Daniel Lezcano [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/pagemap.h>
+#include <linux/debugfs.h>
+#include <linux/sched.h>
+#include <linux/netdevice.h>
+#include <linux/net_ns.h>
+
+static struct dentry *net_ns_dentry;
+static struct dentry *net_ns_dentry_dev;
+
+static ssize_t net_ns_dev_read_file(struct file *file, char __user *user_buf,
+				    size_t count, loff_t *ppos)
+{
+	size_t len;
+	char *buf;
+	struct net_ns_dev_list *devlist = &(net_ns()->dev_list);
+	struct net_ns_dev *db;
+	struct net_device *dev;
+	struct list_head *l;
+
+	if (*ppos < 0)
+		return -EINVAL;
+	if (*ppos >= count)
+		return 0;
+
+	/* It's for debug, everything should fit */
+	buf = kmalloc(4096, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+	buf[0] = '\0';
+
+	read_lock(&devlist->lock);
+	list_for_each(l, &devlist->list) {
+		db = list_entry(l, struct net_ns_dev, list);
+		dev = db->dev;
+		strcat(buf,dev->name);
+		strcat(buf,"\n");
+	}
+	read_unlock(&devlist->lock);
+
+	len = strlen(buf);
+
+	if (len > count)
+		len = count;
+
+	if (copy_to_user(user_buf, buf, len)) {
+		kfree(buf);
+		return -EFAULT;
+	}
+
+	*ppos += count;
+	kfree(buf);
+
+	return count;
+}
+
+static ssize_t net_ns_dev_write_file(struct file *file,
+				     const char __user *user_buf,
+				     size_t count, loff_t *ppos)
+{
+	int ret;
+	size_t len;
+	const char __user *p;
+	char c;
+	char devname[IFNAMSIZ];
+	struct net_ns_dev_list *dev_list = &(net_ns()->dev_list);
+
+	len = 0;
+	p = user_buf;
+	while (len < count) {
+		if (get_user(c, p++))
+			return -EFAULT;
+		if (c == 0 || c == '\n')
+			break;
+		len++;
+	}
+
+	if (len >= IFNAMSIZ)
+		return -EINVAL;
+
+	if (copy_from_user(devname, user_buf, len))
+		return -EFAULT;
+
+	devname[len] = '\0';
+
+	ret = net_ns_dev_add(devname, dev_list);
+	if (ret)
+		return ret;
+
+	*ppos += count;
+	return count;
+}
+
+static int net_ns_dev_open_file(struct inode *inode, struct file *file)
+{
+   return 0;
+}
+
+static struct file_operations net_ns_dev_fops = {
+	.read =		net_ns_dev_read_file,
+	.write =	net_ns_dev_write_file,
+	.open =		net_ns_dev_open_file,
+};
+
+static int __init net_ns_init(void)
+{
+	net_ns_dentry = debugfs_create_dir("net_ns", NULL);
+
+	net_ns_dentry_dev = debugfs_create_file("dev", 0666,
+						net_ns_dentry,
+						NULL,
+						&net_ns_dev_fops);
+	return 0;
+}
+
+static void __exit net_ns_exit(void)
+{
+   debugfs_remove(net_ns_dentry_dev);
+   debugfs_remove(net_ns_dentry);
+}
+
+module_init(net_ns_init);
+module_exit(net_ns_exit);
+
+MODULE_DESCRIPTION("NET namespace debugfs");
+MODULE_AUTHOR(Daniel Lezcano [EMAIL

[RFC] [patch 5/6] [Network namespace] ipv4 isolation

2006-06-09 Thread dlezcano

This patch partially isolates ipv4 by adding the network namespace
pointer to struct sock, the bind bucket and the skbuff. When a socket
is created, the pointer to the network namespace is stored in
struct sock, which is how the socket comes to belong to the namespace.
That makes it possible to identify the sockets related to a namespace
for lookup and procfs.

The lookup is extended with a network namespace pointer in order to
tell apart listen points bound to the same port. That allows several
applications to be bound to INADDR_ANY:port in different network
namespaces without conflicting. The bind is checked against both port
and network namespace.

When an outgoing packet has a loopback destination address, the
skbuff is filled with the network namespace, so loopback packets
never leave the namespace. This approach facilitates the migration
of loopback because identification is done by network namespace and
not by address. The loopback path has been benchmarked with tbench and
the overhead is roughly 1.5%.
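
The bind check described above can be pictured as a lookup keyed on the pair (port, namespace) instead of the port alone. The sketch below is a deliberately reduced userspace illustration: struct bind_bucket and bind_conflicts() are hypothetical, the table is a flat array rather than a hash, and address and socket-option rules such as reuse flags are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct net_namespace { int id; };	/* placeholder contents */

/* Simplified bind bucket: a port plus the namespace that owns it. */
struct bind_bucket {
	unsigned short port;
	const struct net_namespace *net_ns;
};

/*
 * Return true if binding (port, ns) would collide with an existing
 * bucket. Only buckets in the same namespace count, so two namespaces
 * can each bind INADDR_ANY on the same port.
 */
bool bind_conflicts(const struct bind_bucket *buckets, size_t n,
		    unsigned short port, const struct net_namespace *ns)
{
	assert(buckets != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (buckets[i].port == port && buckets[i].net_ns == ns)
			return true;
	}
	return false;
}
```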

Replace-Subject: [Network namespace] ipv4 isolation
Signed-off-by: Daniel Lezcano [EMAIL PROTECTED] 
--
 include/linux/skbuff.h   |2 ++
 include/net/inet_hashtables.h|   34 --
 include/net/inet_timewait_sock.h |1 +
 include/net/sock.h   |4 
 net/dccp/ipv4.c  |7 ---
 net/ipv4/af_inet.c   |2 ++
 net/ipv4/inet_connection_sock.c  |3 ++-
 net/ipv4/inet_diag.c |3 ++-
 net/ipv4/inet_hashtables.c   |6 +-
 net/ipv4/inet_timewait_sock.c|1 +
 net/ipv4/ip_output.c |4 
 net/ipv4/tcp_ipv4.c  |   25 -
 net/ipv4/udp.c   |7 +--
 13 files changed, 72 insertions(+), 27 deletions(-)

Index: 2.6-mm/include/linux/skbuff.h
===
--- 2.6-mm.orig/include/linux/skbuff.h
+++ 2.6-mm/include/linux/skbuff.h
@@ -27,6 +27,7 @@
 #include <linux/poll.h>
 #include <linux/net.h>
 #include <linux/textsearch.h>
+#include <linux/net_ns.h>
 #include <net/checksum.h>
 #include <linux/dmaengine.h>
 
@@ -301,6 +302,7 @@
*data,
*tail,
*end;
+   struct net_namespace*net_ns;
 };
 
 #ifdef __KERNEL__
Index: 2.6-mm/include/net/inet_hashtables.h
===
--- 2.6-mm.orig/include/net/inet_hashtables.h
+++ 2.6-mm/include/net/inet_hashtables.h
@@ -23,6 +23,8 @@
 #include <linux/spinlock.h>
 #include <linux/types.h>
 #include <linux/wait.h>
+#include <linux/in.h>
+#include <linux/net_ns.h>
 
 #include <net/inet_connection_sock.h>
 #include <net/inet_sock.h>
@@ -78,6 +80,7 @@
signed shortfastreuse;
struct hlist_node   node;
struct hlist_head   owners;
+   struct net_namespace*net_ns;
 };
 
 #define inet_bind_bucket_for_each(tb, node, head) \
@@ -274,13 +277,15 @@
 extern struct sock *__inet_lookup_listener(const struct hlist_head *head,
   const u32 daddr,
   const unsigned short hnum,
-  const int dif);
+  const int dif,
+  const struct net_namespace *net_ns);
 
 /* Optimize the common listener case. */
 static inline struct sock *
inet_lookup_listener(struct inet_hashinfo *hashinfo,
 const u32 daddr,
-const unsigned short hnum, const int dif)
+const unsigned short hnum, const int dif,
+const struct net_namespace *net_ns)
 {
struct sock *sk = NULL;
const struct hlist_head *head;
@@ -294,8 +299,9 @@
(!inet->rcv_saddr || inet->rcv_saddr == daddr) &&
(sk->sk_family == PF_INET || !ipv6_only_sock(sk)) &&
!sk->sk_bound_dev_if)
-   goto sherry_cache;
-   sk = __inet_lookup_listener(head, daddr, hnum, dif);
+   if (sk->sk_net_ns == net_ns && LOOPBACK(daddr))
+   goto sherry_cache;
+   sk = __inet_lookup_listener(head, daddr, hnum, dif, net_ns);
}
if (sk) {
 sherry_cache:
@@ -358,7 +364,8 @@
__inet_lookup_established(struct inet_hashinfo *hashinfo,
  const u32 saddr, const u16 sport,
  const u32 daddr, const u16 hnum,
- const int dif)
+ const int dif,
+ const struct net_namespace *net_ns)
 {
INET_ADDR_COOKIE(acookie, saddr, daddr)
const __u32 ports = INET_COMBINED_PORTS(sport, hnum);
@@ -373,12

[RFC] [patch 1/6] [Network namespace] Network namespace structure

2006-06-09 Thread dlezcano

This patch adds the network namespace to the nsproxy, along with a
set of functions to unshare it. The network namespace structure will
be filled in later with the network resources identified as needed
for further isolation.
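
The share-or-clone behaviour that copy_network() implements in this
patch can be modelled in user space as follows. This is only a sketch:
the toy_* names and the TOY_CLONE_NEWNET value are hypothetical, a
plain int stands in for the kref, and the capability check is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag value; the real CLONE_NEWNET is defined elsewhere. */
#define TOY_CLONE_NEWNET 0x1

/* Hypothetical namespace object with a plain integer reference count. */
struct toy_net_ns {
    int refcount;
};

/*
 * Without TOY_CLONE_NEWNET the child shares the parent's namespace and
 * takes one extra reference. With the flag it gets a fresh namespace
 * whose reference count starts at 1. Returns NULL if allocation fails.
 */
struct toy_net_ns *toy_copy_network(int flags, struct toy_net_ns *parent)
{
    struct toy_net_ns *ns;

    if (!(flags & TOY_CLONE_NEWNET)) {
        parent->refcount++;
        return parent;
    }
    ns = malloc(sizeof(*ns));
    if (!ns)
        return NULL;
    ns->refcount = 1;
    return ns;
}

/* Drop one reference; free a heap-allocated namespace on the last put. */
void toy_put_net_ns(struct toy_net_ns *ns)
{
    if (--ns->refcount == 0)
        free(ns);
}
```

Note that toy_put_net_ns() may only drop the last reference on a
namespace obtained from malloc(); a statically allocated initial
namespace must keep at least one reference.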

Replace-Subject: [Network namespace] Network namespace structure
Signed-off-by: Daniel Lezcano [EMAIL PROTECTED] 
--
 include/linux/init_task.h |2 
 include/linux/net_ns.h|   59 
 include/linux/nsproxy.h   |2 
 include/linux/sched.h |1 
 init/version.c|8 +++
 kernel/fork.c |   24 +--
 kernel/nsproxy.c  |   38 +++---
 net/Kconfig   |9 
 net/Makefile  |1 
 net/net_ns.c  |   96 ++
 10 files changed, 222 insertions(+), 18 deletions(-)

Index: 2.6-mm/include/linux/net_ns.h
===
--- /dev/null
+++ 2.6-mm/include/linux/net_ns.h
@@ -0,0 +1,59 @@
+#ifndef _LINUX_NET_NS_H
+#define _LINUX_NET_NS_H
+
+#include <linux/kref.h>
+#include <linux/sched.h>
+#include <linux/nsproxy.h>
+
+struct net_namespace {
+   struct kref kref;
+};
+
+extern struct net_namespace init_net_ns;
+
+#ifdef CONFIG_NET_NS
+
+extern int unshare_network(unsigned long unshare_flags,
+  struct net_namespace **new_net);
+
+extern int copy_network(int flags, struct task_struct *tsk);
+
+static inline void get_net_ns(struct net_namespace *ns)
+{
+   kref_get(&ns->kref);
+}
+
+void free_net_ns(struct kref *kref);
+
+static inline void put_net_ns(struct net_namespace *ns)
+{
+   kref_put(&ns->kref, free_net_ns);
+}
+
+static inline void exit_network(struct task_struct *p)
+{
+   struct net_namespace *net_ns = p->nsproxy->net_ns;
+   if (net_ns)
+   put_net_ns(net_ns);
+}
+#else /* !CONFIG_NET_NS */
+static inline int unshare_network(unsigned long unshare_flags,
+ struct net_namespace **new_net)
+{
+   return -EINVAL;
+}
+static inline int copy_network(int flags, struct task_struct *tsk)
+{
+   return 0;
+}
+static inline void get_net_ns(struct net_namespace *ns) {}
+static inline void put_net_ns(struct net_namespace *ns) {}
+static inline void exit_network(struct task_struct *p) {}
+#endif /* CONFIG_NET_NS */
+
+static inline struct net_namespace *net_ns(void)
+{
+   return current->nsproxy->net_ns;
+}
+
+#endif
Index: 2.6-mm/net/net_ns.c
===
--- /dev/null
+++ 2.6-mm/net/net_ns.c
@@ -0,0 +1,96 @@
+/*
+ *  net_ns.c - adds support for network namespace
+ *
+ *  Copyright (C) 2006 IBM
+ *
+ *  Author: Daniel Lezcano [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ */
+
+#include <linux/net_ns.h>
+#include <linux/module.h>
+
+/*
+ * Clone a new ns copying an original, setting refcount to 1
+ * Cloned process will have
+ * @old_ns: namespace to clone
+ * Return NULL on error (failure to kmalloc), new ns otherwise
+ */
+struct net_namespace *clone_net_ns(struct net_namespace *old_ns)
+{
+   struct net_namespace *new_ns;
+
+   new_ns = kmalloc(sizeof(*new_ns), GFP_KERNEL);
+   if (!new_ns)
+   return NULL;
+   kref_init(&new_ns->kref);
+   return new_ns;
+}
+
+/*
+ * unshare the current process' network namespace.
+ * called only in sys_unshare()
+ */
+int unshare_network(unsigned long unshare_flags,
+   struct net_namespace **new_net)
+{
+   if (!(unshare_flags & CLONE_NEWNET))
+   return 0;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EPERM;
+
+   *new_net = clone_net_ns(current->nsproxy->net_ns);
+   if (!*new_net)
+   return -ENOMEM;
+
+   return 0;
+}
+
+/*
+ * Copy task tsk's network namespace, or clone it if flags specifies
+ * CLONE_NEWNET.  In latter case, changes to the network ressources of
+ * this process won't be seen by parent, and vice versa.
+ */
+int copy_network(int flags, struct task_struct *tsk)
+{
+   struct net_namespace *old_ns = tsk->nsproxy->net_ns;
+   struct net_namespace *new_ns;
+   int err = 0;
+
+   if (!old_ns)
+   return 0;
+
+   get_net_ns(old_ns);
+
+   if (!(flags & CLONE_NEWNET))
+   return 0;
+
+   if (!capable(CAP_SYS_ADMIN)) {
+   err = -EPERM;
+   goto out;
+   }
+
+   new_ns = clone_net_ns(old_ns);
+   if (!new_ns) {
+   err = -ENOMEM;
+   goto out;
+   }
+   tsk->nsproxy->net_ns = new_ns;
+
+out:
+   put_net_ns(old_ns);
+   return err;
+}
+
+void free_net_ns(struct kref *kref)
+{
+   struct net_namespace *ns;
+
+   ns = container_of(kref, struct net_namespace, kref);
+   kfree(ns);

[RFC] [patch 0/6] [Network namespace] introduction

2006-06-09 Thread dlezcano

The following patches create a private network namespace for use
within containers. This is intended for use with system containers
like vserver, but might also be useful for restricting individual
applications' access to the network stack.

These patches isolate traffic inside the network namespace. Network
resources and incoming and outgoing packets are each identified as
belonging to a namespace.

The namespace hides network resources that are not contained in it,
but still allows administration of the network with normal commands
like ifconfig.

It applies to the kernel version 2.6.17-rc6-mm1

It provides the following:
-
   - when an application unshares its network namespace, it loses its
     view of all network devices by default. The administrator can
     choose to make any device visible again. The container then
     gains a view of the device, but without the IP address
     configured on it. It is up to the container administrator to use
     the ifconfig or ip command to set up a new IP address. This IP
     address is only visible inside the container.

   - the loopback is isolated inside the container and it is not
     possible to communicate between containers via the loopback.

   - several containers can each have an application bound to the
     same address:port without conflicting.

What is it for ?
-
   - security : an application can be confined inside a container
     without interacting with the network used by another container

   - consolidation : several instances of the same application can be
     run in different containers because the network namespace allows
     each of them to bind to the same addr:port

What could be done ?

- because the network resources are related to a namespace, it is
  easy to identify them. This facilitates the implementation of
  network migration

How to use ?

   - call unshare with the CLONE_NEWNET flag as root
   - run echo eth0 > /sys/kernel/debug/net_ns/dev
   - use the ifconfig or ip command to set a new IP address
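
The first step can be sketched in C as below. This assumes a kernel
and C library that expose a CLONE_NEWNET flag through unshare(2), as
current mainline Linux and glibc do; whether that flag behaves exactly
like the one in this RFC is an assumption. The debugfs step is
specific to this patch set and is not shown. try_unshare_net() is a
hypothetical helper name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Try to move the calling process into a new network namespace.
 * Returns 0 on success or a negative errno value on failure, for
 * example -EPERM when the caller lacks the required privilege.
 */
int try_unshare_net(void)
{
    if (unshare(CLONE_NEWNET) == -1)
        return -errno;
    return 0;
}
```

A caller would typically check the return value and, on success,
continue with the device and address setup from inside the new
namespace.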

What is missing ?
-
Routes are not yet isolated, which implies:

   - binding to another container's address is allowed

   - an outgoing packet which has an unset source address can
     potentially get another container's address

   - an incoming packet can be routed to the wrong container if there
     are several containers listening on the same addr:port

--
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch] workaround zd1201 interference problem

2006-06-09 Thread Daniel Drake


Pavel Machek wrote:

> if you plug zd1201 into USB, it starts jamming radio,
> immediately. Enable/disable, or iwlist wlan0 scan, or basically any
> operation unjams the radio. This patch works it around:


Can we be any more specific?

What is the interference - is it transmitting random packets, or just 
emitting a magical cloud of invisible anti-wifi?


At which precise point does the interference start? Does it happen even 
without the driver loaded?


Which operation is the one which stops the interference, the enable or 
the disable?


Does this happen on every plug in, or just sometimes? Is it affected by 
usage patterns such as having the device plugged in throughout boot, 
reloading the module, etc?


Thanks,
Daniel


Re: [patch] workaround zd1201 interference problem

2006-06-09 Thread Pavel Machek

Hi!

I'll try to.

> if you plug zd1201 into USB, it starts jamming radio,
> immediately. Enable/disable, or iwlist wlan0 scan, or basically any
> operation unjams the radio. This patch works it around:
>
> Can we be any more specific?
>
> What is the interference - is it transmitting random packets, or just
> emitting a magical cloud of invisible anti-wifi?

Magical cloud, I'm afraid.

> At which precise point does the interference start?

When the card is inserted.

> Does it happen even
> without the driver loaded?

Will try.

> Which operation is the one which stops the interference, the enable or
> the disable?

Disable alone was not enough to stop interference.

> Does this happen on every plug in, or just sometimes?

In 70% or so.

> Is it affected by
> usage patterns such as having the device plugged in throughout boot,
> reloading the module, etc?

Will try.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [2/3] [NET] ppp: Remove unnecessary pskb_may_pull

2006-06-09 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 9 Jun 2006 17:43:44 +1000

> [NET] ppp: Remove unnecessary pskb_may_pull

Applied, thanks a lot.

Re: [1/3] [NET]: Clean up skb_linearize

2006-06-09 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 9 Jun 2006 17:42:34 +1000

> [NET]: Clean up skb_linearize

Looks good, applied to net-2.6.18

Re: [4/3] [NET]: Warn in __skb_trim if skb is paged

2006-06-09 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 9 Jun 2006 17:55:39 +1000

> [NET]: Warn in __skb_trim if skb is paged
>
> It's better to warn and fail rather than rarely triggering BUG on paths
> that incorrectly call skb_trim/__skb_trim on a non-linear skb.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Agreed, patch applied, thanks a lot Herbert.
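
For readers following along, the "warn and fail" behaviour can be
modelled in user space roughly like this. It is a sketch, not the
applied patch: struct toy_skb, toy_skb_trim() and its integer return
value are hypothetical, and data_len > 0 stands in for "the skb has
paged data".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced buffer: total length and paged (non-linear) part. */
struct toy_skb {
    unsigned int len;
    unsigned int data_len;
};

/*
 * Trim the buffer to at most len bytes. A buffer with paged data is
 * left untouched and a warning is printed instead of aborting.
 * Returns 0 on success and -1 when the trim was refused.
 */
int toy_skb_trim(struct toy_skb *skb, unsigned int len)
{
    if (skb->data_len) {
        fprintf(stderr, "warning: trim refused on non-linear buffer\n");
        return -1;
    }
    if (len < skb->len)
        skb->len = len;
    return 0;
}
```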

Re: [3/3] [NET]: skb_trim audit

2006-06-09 Thread David Miller

From: Herbert Xu [EMAIL PROTECTED]
Date: Fri, 9 Jun 2006 17:44:33 +1000

> [NET]: skb_trim audit
>
> I found a few more spots where pskb_trim_rcsum could be used but were not.
> This patch changes them to use it.
>
> Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
> instead of skb_trim.
>
> Signed-off-by: Herbert Xu [EMAIL PROTECTED]

Applied, thanks.

Re: netif_tx_disable vs netif_stop_queue (possible races?)

2006-06-09 Thread Herbert Xu

On Fri, Jun 09, 2006 at 04:29:13PM +0100, Daniel Drake wrote:
 
> Can I interpret your response as: If the TX queue is disabled in
> advance, no hard_start_xmit functions will be running on any CPU after
> synchronize_net() has returned?

Correct.  All callers of hard_start_xmit do so under RCU or equivalent
locks so they must be complete by the time synchronize_net() returns.

What you've got to watch out for, though, are paths where the driver
might re-enable the queue.  That could be a real bug.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC] [patch 5/6] [Network namespace] ipv4 isolation

2006-06-09 Thread James Morris

On Fri, 9 Jun 2006, [EMAIL PROTECTED] wrote:

> When an outgoing packet has the loopback destination address, the
> skbuff is filled with the network namespace. So the loopback packets
> never go outside the namespace. This approach facilitates the migration
> of loopback because identification is done by network namespace and
> not by address. The loopback has been benchmarked by tbench and the
> overhead is roughly 1.5 %

I think you'll need to make it so this code has zero impact when not 
configured.
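
One common way to get there is to compile both the extra field and the
extra comparison away when the option is off, in the same spirit as
the #else stubs in net_ns.h from patch 1/6. The sketch below is a
user-space illustration only; struct toy_sock, toy_sock_in_ns() and
toy_lookup_matches() are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct net_namespace. */
struct toy_net_ns {
    int id;
};

/* Hypothetical socket: the namespace pointer exists only when configured. */
struct toy_sock {
    int port;
#ifdef CONFIG_NET_NS
    const struct toy_net_ns *net_ns;
#endif
};

/* With the option off the check is the constant 1 and costs nothing. */
#ifdef CONFIG_NET_NS
#define toy_sock_in_ns(sk, ns) ((sk)->net_ns == (ns))
#else
#define toy_sock_in_ns(sk, ns) (1)
#endif

/* Lookup predicate: port match plus, if configured, namespace match. */
int toy_lookup_matches(const struct toy_sock *sk, int port,
                       const struct toy_net_ns *ns)
{
    (void)ns; /* unused when CONFIG_NET_NS is not defined */
    return sk->port == port && toy_sock_in_ns(sk, ns);
}
```

Built without -DCONFIG_NET_NS, struct toy_sock is no larger than
before and the lookup reduces to the plain port comparison.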


- James
-- 
James Morris
[EMAIL PROTECTED]

Re: [RFC] [patch 5/6] [Network namespace] ipv4 isolation

2006-06-09 Thread Rick Jones


James Morris wrote:

> On Fri, 9 Jun 2006, [EMAIL PROTECTED] wrote:
>
>> When an outgoing packet has the loopback destination address, the
>> skbuff is filled with the network namespace. So the loopback packets
>> never go outside the namespace. This approach facilitates the migration
>> of loopback because identification is done by network namespace and
>> not by address. The loopback has been benchmarked by tbench and the
>> overhead is roughly 1.5 %
>
> I think you'll need to make it so this code has zero impact when not
> configured.


Indeed, and over stuff other than loopback too.  I'll not so humbly 
suggest :)  netperf TCP_STREAM and TCP_RR figures _with_ CPU 
utilization/service demand measures.


rick jones

Re: [RFC] [patch 5/6] [Network namespace] ipv4 isolation

2006-06-09 Thread James Morris

On Fri, 9 Jun 2006, Rick Jones wrote:

>> I think you'll need to make it so this code has zero impact when not
>> configured.
>
> Indeed, and over stuff other than loopback too.  I'll not so humbly suggest :)

Yes, I meant the whole lot.



- James
-- 
James Morris
[EMAIL PROTECTED]
