[PATCHES] Networking -stable for 3.5.x

David Miller Sat, 06 Oct 2012 08:26:57 -0700

Please queue up the following networking bug fixes for
3.5-stable

Thanks!

From fcccb33fd0bc7880264a8521a729fb547229001b Mon Sep 17 00:00:00 2001
From: Michal Schmidt <[email protected]>
Date: Thu, 13 Sep 2012 12:59:44 +0000
Subject: [PATCH 01/34] bnx2x: fix rx checksum validation for IPv6


[ Upstream commit e488921f44765e8ab6c48ca35e3f6b78df9819df ]

Commit d6cb3e41 "bnx2x: fix checksum validation" caused a performance
regression for IPv6. Rx checksum offload does not work. IPv6 packets
are passed to the stack with CHECKSUM_NONE.

The hardware obviously cannot perform IP checksum validation for IPv6,
because there is no checksum in the IPv6 header. This should not prevent
us from setting CHECKSUM_UNNECESSARY.

Tested on BCM57711.

Signed-off-by: Michal Schmidt <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Acked-by: Eilon Greenstein <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 8098eea..2364f66 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -620,14 +620,16 @@ static int bnx2x_alloc_rx_data(struct bnx2x *bp,
 static void bnx2x_csum_validate(struct sk_buff *skb, union eth_rx_cqe *cqe,
                                struct bnx2x_fastpath *fp)
 {
-       /* Do nothing if no IP/L4 csum validation was done */
-
+       /* Do nothing if no L4 csum validation was done.
+        * We do not check whether IP csum was validated. For IPv4 we assume
+        * that if the card got as far as validating the L4 csum, it also
+        * validated the IP csum. IPv6 has no IP csum.
+        */
        if (cqe->fast_path_cqe.status_flags &
-           (ETH_FAST_PATH_RX_CQE_IP_XSUM_NO_VALIDATION_FLG |
-            ETH_FAST_PATH_RX_CQE_L4_XSUM_NO_VALIDATION_FLG))
+           ETH_FAST_PATH_RX_CQE_L4_XSUM_NO_VALIDATION_FLG)
                return;
 
-       /* If both IP/L4 validation were done, check if an error was found. */
+       /* If L4 validation was done, check if an error was found. */
 
        if (cqe->fast_path_cqe.type_error_flags &
            (ETH_FAST_PATH_RX_CQE_IP_BAD_XSUM_FLG |
-- 
1.7.11.4


From eb69fec480c82c63a5f445a3e4d2f0373964a04b Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Mon, 17 Sep 2012 12:51:39 +0000
Subject: [PATCH 02/34] tcp: fix regression in urgent data handling

[ Upstream commit 1d57f19539c074105791da6384a8ad674bba8037 ]

Stephan Springl found that commit 1402d366019fed "tcp: introduce
tcp_try_coalesce" introduced a regression for rlogin

It turns out problem comes from TCP urgent data handling and
a change in behavior in input path.

rlogin sends two one-byte packets with URG ptr set, and when next data
frame is coalesced, we lack sk_data_ready() calls to wakeup consumer.

Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Stephan Springl <[email protected]>
Cc: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv4/tcp_input.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 262316b..ab30c96 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4818,7 +4818,7 @@ queue_and_out:
 
                if (eaten > 0)
                        kfree_skb_partial(skb, fragstolen);
-               else if (!sock_flag(sk, SOCK_DEAD))
+               if (!sock_flag(sk, SOCK_DEAD))
                        sk->sk_data_ready(sk, 0);
                return;
        }
@@ -5680,8 +5680,7 @@ no_ack:
 #endif
                        if (eaten)
                                kfree_skb_partial(skb, fragstolen);
-                       else
-                               sk->sk_data_ready(sk, 0);
+                       sk->sk_data_ready(sk, 0);
                        return 0;
                }
        }
-- 
1.7.11.4


From 0dba3f1cd85c795d62cb83f53bf7cc9b104cef09 Mon Sep 17 00:00:00 2001
From: Steffen Klassert <[email protected]>
Date: Tue, 4 Sep 2012 00:03:29 +0000
Subject: [PATCH 03/34] xfrm: Workaround incompatibility of ESN and async
 crypto

[ Upstream commit 3b59df46a449ec9975146d71318c4777ad086744 ]

ESN for esp is defined in RFC 4303. This RFC assumes that the
sequence number counters are always up to date. However,
this is not true if an async crypto algorithm is employed.

If the sequence number counters are not up to date on sequence
number check, we may incorrectly update the upper 32 bit of
the sequence number. This leads to a DOS.

We workaround this by comparing the upper sequence number,
(used for authentication) with the upper sequence number
computed after the async processing. We drop the packet
if these numbers are different.

To do this, we introduce a recheck function that does this
check in the ESN case.

Signed-off-by: Steffen Klassert <[email protected]>
Acked-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 include/net/xfrm.h     |  3 +++
 net/xfrm/xfrm_input.c  |  2 +-
 net/xfrm/xfrm_replay.c | 15 +++++++++++++++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index e0a55df..887a232 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -269,6 +269,9 @@ struct xfrm_replay {
        int     (*check)(struct xfrm_state *x,
                         struct sk_buff *skb,
                         __be32 net_seq);
+       int     (*recheck)(struct xfrm_state *x,
+                          struct sk_buff *skb,
+                          __be32 net_seq);
        void    (*notify)(struct xfrm_state *x, int event);
        int     (*overflow)(struct xfrm_state *x, struct sk_buff *skb);
 };
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 54a0dc2..ab2bb42 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -212,7 +212,7 @@ resume:
                /* only the first xfrm gets the encap type */
                encap_type = 0;
 
-               if (async && x->repl->check(x, skb, seq)) {
+               if (async && x->repl->recheck(x, skb, seq)) {
                        XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATESEQERROR);
                        goto drop_unlock;
                }
diff --git a/net/xfrm/xfrm_replay.c b/net/xfrm/xfrm_replay.c
index 2f6d11d..3efb07d 100644
--- a/net/xfrm/xfrm_replay.c
+++ b/net/xfrm/xfrm_replay.c
@@ -420,6 +420,18 @@ err:
        return -EINVAL;
 }
 
+static int xfrm_replay_recheck_esn(struct xfrm_state *x,
+                                  struct sk_buff *skb, __be32 net_seq)
+{
+       if (unlikely(XFRM_SKB_CB(skb)->seq.input.hi !=
+                    htonl(xfrm_replay_seqhi(x, net_seq)))) {
+                       x->stats.replay_window++;
+                       return -EINVAL;
+       }
+
+       return xfrm_replay_check_esn(x, skb, net_seq);
+}
+
 static void xfrm_replay_advance_esn(struct xfrm_state *x, __be32 net_seq)
 {
        unsigned int bitnr, nr, i;
@@ -479,6 +491,7 @@ static void xfrm_replay_advance_esn(struct xfrm_state *x, 
__be32 net_seq)
 static struct xfrm_replay xfrm_replay_legacy = {
        .advance        = xfrm_replay_advance,
        .check          = xfrm_replay_check,
+       .recheck        = xfrm_replay_check,
        .notify         = xfrm_replay_notify,
        .overflow       = xfrm_replay_overflow,
 };
@@ -486,6 +499,7 @@ static struct xfrm_replay xfrm_replay_legacy = {
 static struct xfrm_replay xfrm_replay_bmp = {
        .advance        = xfrm_replay_advance_bmp,
        .check          = xfrm_replay_check_bmp,
+       .recheck        = xfrm_replay_check_bmp,
        .notify         = xfrm_replay_notify_bmp,
        .overflow       = xfrm_replay_overflow_bmp,
 };
@@ -493,6 +507,7 @@ static struct xfrm_replay xfrm_replay_bmp = {
 static struct xfrm_replay xfrm_replay_esn = {
        .advance        = xfrm_replay_advance_esn,
        .check          = xfrm_replay_check_esn,
+       .recheck        = xfrm_replay_recheck_esn,
        .notify         = xfrm_replay_notify_bmp,
        .overflow       = xfrm_replay_overflow_esn,
 };
-- 
1.7.11.4


From 706e09b031db9e8a585f306a3863648a30c46a73 Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Thu, 13 Sep 2012 11:41:26 +0000
Subject: [PATCH 04/34] xfrm_user: return error pointer instead of NULL

[ Upstream commit 864745d291b5ba80ea0bd0edcbe67273de368836 ]

When dump_one_state() returns an error, e.g. because of a too small
buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL
instead of an error pointer. But its callers expect an error pointer
and therefore continue to operate on a NULL skbuff.

This could lead to a privilege escalation (execution of user code in
kernel context) if the attacker has CAP_NET_ADMIN and is able to map
address 0.

Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 44293b3..d9d4f63 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -872,6 +872,7 @@ static struct sk_buff *xfrm_state_netlink(struct sk_buff 
*in_skb,
 {
        struct xfrm_dump_info info;
        struct sk_buff *skb;
+       int err;
 
        skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
        if (!skb)
@@ -882,9 +883,10 @@ static struct sk_buff *xfrm_state_netlink(struct sk_buff 
*in_skb,
        info.nlmsg_seq = seq;
        info.nlmsg_flags = 0;
 
-       if (dump_one_state(x, 0, &info)) {
+       err = dump_one_state(x, 0, &info);
+       if (err) {
                kfree_skb(skb);
-               return NULL;
+               return ERR_PTR(err);
        }
 
        return skb;
-- 
1.7.11.4


From 913615bcee091ead37ad7c2ae5b7826d0de4325c Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Fri, 14 Sep 2012 09:58:32 +0000
Subject: [PATCH 05/34] xfrm_user: return error pointer instead of NULL #2

[ Upstream commit c25463722509fef0ed630b271576a8c9a70236f3 ]

When dump_one_policy() returns an error, e.g. because of a too small
buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns
NULL instead of an error pointer. But its caller expects an error
pointer and therefore continues to operate on a NULL skbuff.

Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index d9d4f63..61e05bb 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1543,6 +1543,7 @@ static struct sk_buff *xfrm_policy_netlink(struct sk_buff 
*in_skb,
 {
        struct xfrm_dump_info info;
        struct sk_buff *skb;
+       int err;
 
        skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
        if (!skb)
@@ -1553,9 +1554,10 @@ static struct sk_buff *xfrm_policy_netlink(struct 
sk_buff *in_skb,
        info.nlmsg_seq = seq;
        info.nlmsg_flags = 0;
 
-       if (dump_one_policy(xp, dir, 0, &info) < 0) {
+       err = dump_one_policy(xp, dir, 0, &info);
+       if (err) {
                kfree_skb(skb);
-               return NULL;
+               return ERR_PTR(err);
        }
 
        return skb;
-- 
1.7.11.4


From 07b2fe9d06b394925110d89e36879704de8a1039 Mon Sep 17 00:00:00 2001
From: Li RongQing <[email protected]>
Date: Mon, 17 Sep 2012 22:40:10 +0000
Subject: [PATCH 06/34] xfrm: fix a read lock imbalance in make_blackhole

[ Upstream commit 433a19548061bb5457b6ab77ed7ea58ca6e43ddb ]

if xfrm_policy_get_afinfo returns 0, it has already released the read
lock, xfrm_policy_put_afinfo should not be called again.

Signed-off-by: Li RongQing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_policy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index ccfbd32..ce19467 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1763,7 +1763,7 @@ static struct dst_entry *make_blackhole(struct net *net, 
u16 family,
 
        if (!afinfo) {
                dst_release(dst_orig);
-               ret = ERR_PTR(-EINVAL);
+               return ERR_PTR(-EINVAL);
        } else {
                ret = afinfo->blackhole_route(net, dst_orig);
        }
-- 
1.7.11.4


From c13a67d1b857c25d64e79be02a66b81ad8765948 Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Wed, 19 Sep 2012 11:33:38 +0000
Subject: [PATCH 07/34] xfrm_user: fix info leak in copy_to_user_auth()

[ Upstream commit 4c87308bdea31a7b4828a51f6156e6f721a1fcc9 ]

copy_to_user_auth() fails to initialize the remainder of alg_name and
therefore discloses up to 54 bytes of heap memory via netlink to
userland.

Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name
with null bytes.

Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 61e05bb..67f2727 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -742,7 +742,7 @@ static int copy_to_user_auth(struct xfrm_algo_auth *auth, 
struct sk_buff *skb)
                return -EMSGSIZE;
 
        algo = nla_data(nla);
-       strcpy(algo->alg_name, auth->alg_name);
+       strncpy(algo->alg_name, auth->alg_name, sizeof(algo->alg_name));
        memcpy(algo->alg_key, auth->alg_key, (auth->alg_key_len + 7) / 8);
        algo->alg_key_len = auth->alg_key_len;
 
-- 
1.7.11.4


From aaddf260abe3d3e5a49024e656fa3807efc82a50 Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Wed, 19 Sep 2012 11:33:39 +0000
Subject: [PATCH 08/34] xfrm_user: fix info leak in copy_to_user_state()

[ Upstream commit f778a636713a435d3a922c60b1622a91136560c1 ]

The memory reserved to dump the xfrm state includes the padding bytes of
struct xfrm_usersa_info added by the compiler for alignment (7 for
amd64, 3 for i386). Add an explicit memset(0) before filling the buffer
to avoid the info leak.

Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 67f2727..d680eed 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -689,6 +689,7 @@ out:
 
 static void copy_to_user_state(struct xfrm_state *x, struct xfrm_usersa_info 
*p)
 {
+       memset(p, 0, sizeof(*p));
        memcpy(&p->id, &x->id, sizeof(p->id));
        memcpy(&p->sel, &x->sel, sizeof(p->sel));
        memcpy(&p->lft, &x->lft, sizeof(p->lft));
-- 
1.7.11.4


From d6aab458656080bbd46eae2842be27651168f876 Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Wed, 19 Sep 2012 11:33:40 +0000
Subject: [PATCH 09/34] xfrm_user: fix info leak in copy_to_user_policy()

[ Upstream commit 7b789836f434c87168eab067cfbed1ec4783dffd ]

The memory reserved to dump the xfrm policy includes multiple padding
bytes added by the compiler for alignment (padding bytes in struct
xfrm_selector and struct xfrm_userpolicy_info). Add an explicit
memset(0) before filling the buffer to avoid the heap info leak.

Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index d680eed..90d1c47 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1312,6 +1312,7 @@ static void copy_from_user_policy(struct xfrm_policy *xp, 
struct xfrm_userpolicy
 
 static void copy_to_user_policy(struct xfrm_policy *xp, struct 
xfrm_userpolicy_info *p, int dir)
 {
+       memset(p, 0, sizeof(*p));
        memcpy(&p->sel, &xp->selector, sizeof(p->sel));
        memcpy(&p->lft, &xp->lft, sizeof(p->lft));
        memcpy(&p->curlft, &xp->curlft, sizeof(p->curlft));
-- 
1.7.11.4


From a0bf8c28035418830d8f19fdaade9173fb7a07e3 Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Wed, 19 Sep 2012 11:33:41 +0000
Subject: [PATCH 10/34] xfrm_user: fix info leak in copy_to_user_tmpl()

[ Upstream commit 1f86840f897717f86d523a13e99a447e6a5d2fa5 ]

The memory used for the template copy is a local stack variable. As
struct xfrm_user_tmpl contains multiple holes added by the compiler for
alignment, not initializing the memory will lead to leaking stack bytes
to userland. Add an explicit memset(0) to avoid the info leak.

Initial version of the patch by Brad Spengler.

Cc: Brad Spengler <[email protected]>
Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 90d1c47..21733b4 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1417,6 +1417,7 @@ static int copy_to_user_tmpl(struct xfrm_policy *xp, 
struct sk_buff *skb)
                struct xfrm_user_tmpl *up = &vec[i];
                struct xfrm_tmpl *kp = &xp->xfrm_vec[i];
 
+               memset(up, 0, sizeof(*up));
                memcpy(&up->id, &kp->id, sizeof(up->id));
                up->family = kp->encap_family;
                memcpy(&up->saddr, &kp->saddr, sizeof(up->saddr));
-- 
1.7.11.4


From 48a86b1f8fd04276ad7e2878e0a27512fb8f1e40 Mon Sep 17 00:00:00 2001
From: Mathias Krause <[email protected]>
Date: Wed, 19 Sep 2012 11:33:43 +0000
Subject: [PATCH 11/34] xfrm_user: don't copy esn replay window twice for new
 states

[ Upstream commit e3ac104d41a97b42316915020ba228c505447d21 ]

The ESN replay window was already fully initialized in
xfrm_alloc_replay_state_esn(). No need to copy it again.

Cc: Steffen Klassert <[email protected]>
Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/xfrm/xfrm_user.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 21733b4..a6b2ad4 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -442,10 +442,11 @@ static void copy_from_user_state(struct xfrm_state *x, 
struct xfrm_usersa_info *
  * somehow made shareable and move it to xfrm_state.c - JHS
  *
 */
-static void xfrm_update_ae_params(struct xfrm_state *x, struct nlattr **attrs)
+static void xfrm_update_ae_params(struct xfrm_state *x, struct nlattr **attrs,
+                                 int update_esn)
 {
        struct nlattr *rp = attrs[XFRMA_REPLAY_VAL];
-       struct nlattr *re = attrs[XFRMA_REPLAY_ESN_VAL];
+       struct nlattr *re = update_esn ? attrs[XFRMA_REPLAY_ESN_VAL] : NULL;
        struct nlattr *lt = attrs[XFRMA_LTIME_VAL];
        struct nlattr *et = attrs[XFRMA_ETIMER_THRESH];
        struct nlattr *rt = attrs[XFRMA_REPLAY_THRESH];
@@ -555,7 +556,7 @@ static struct xfrm_state *xfrm_state_construct(struct net 
*net,
                goto error;
 
        /* override default values from above */
-       xfrm_update_ae_params(x, attrs);
+       xfrm_update_ae_params(x, attrs, 0);
 
        return x;
 
@@ -1819,7 +1820,7 @@ static int xfrm_new_ae(struct sk_buff *skb, struct 
nlmsghdr *nlh,
                goto out;
 
        spin_lock_bh(&x->lock);
-       xfrm_update_ae_params(x, attrs);
+       xfrm_update_ae_params(x, attrs, 1);
        spin_unlock_bh(&x->lock);
 
        c.event = nlh->nlmsg_type;
-- 
1.7.11.4


From 811914736639bd185b93514cff75bfec206124bf Mon Sep 17 00:00:00 2001
From: htbegin <[email protected]>
Date: Mon, 1 Oct 2012 16:42:43 +0000
Subject: [PATCH 12/34] net: ethernet: davinci_cpdma: decrease the desc count
 when cleaning up the remaining packets

[ Upstream commit ffb5ba90017505a19e238e986e6d33f09e4df765 ]

chan->count is used by rx channel. If the desc count is not updated by
the clean up loop in cpdma_chan_stop, the value written to the rxfree
register in cpdma_chan_start will be incorrect.

Signed-off-by: Tao Hou <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/net/ethernet/ti/davinci_cpdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c 
b/drivers/net/ethernet/ti/davinci_cpdma.c
index 3b5c457..28892eb 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -862,6 +862,7 @@ int cpdma_chan_stop(struct cpdma_chan *chan)
 
                next_dma = desc_read(desc, hw_next);
                chan->head = desc_from_phys(pool, next_dma);
+               chan->count--;
                chan->stats.teardown_dequeue++;
 
                /* issue callback without locks held */
-- 
1.7.11.4


From cdf822bd15e94b704184f32b2190cbe0182c7d52 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <[email protected]>
Date: Mon, 10 Sep 2012 14:06:58 +0200
Subject: [PATCH 13/34] ixp4xx_hss: fix build failure due to missing
 linux/module.h inclusion

[ Upstream commit 0b836ddde177bdd5790ade83772860940bd481ea ]

Commit 36a1211970193ce215de50ed1e4e1272bc814df1 (netprio_cgroup.h:
dont include module.h from other includes) made the following build
error on ixp4xx_hss pop up:

  CC [M]  drivers/net/wan/ixp4xx_hss.o
 drivers/net/wan/ixp4xx_hss.c:1412:20: error: expected ';', ',' or ')'
 before string constant
 drivers/net/wan/ixp4xx_hss.c:1413:25: error: expected ';', ',' or ')'
 before string constant
 drivers/net/wan/ixp4xx_hss.c:1414:21: error: expected ';', ',' or ')'
 before string constant
 drivers/net/wan/ixp4xx_hss.c:1415:19: error: expected ';', ',' or ')'
 before string constant
 make[8]: *** [drivers/net/wan/ixp4xx_hss.o] Error 1

This was previously hidden because ixp4xx_hss includes linux/hdlc.h which
includes linux/netdevice.h which includes linux/netprio_cgroup.h which
used to include linux/module.h. The real issue was actually present since
the initial commit that added this driver since it uses macros from
linux/module.h without including this file.

Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/net/wan/ixp4xx_hss.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c
index aaaca9a..3f575af 100644
--- a/drivers/net/wan/ixp4xx_hss.c
+++ b/drivers/net/wan/ixp4xx_hss.c
@@ -10,6 +10,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include <linux/module.h>
 #include <linux/bitops.h>
 #include <linux/cdev.h>
 #include <linux/dma-mapping.h>
-- 
1.7.11.4


From 1ef3dab1132c9839b1666d946abf614204161ded Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <[email protected]>
Date: Fri, 14 Sep 2012 05:50:03 +0000
Subject: [PATCH 14/34] netxen: check for root bus in
 netxen_mask_aer_correctable

[ Upstream commit e4d1aa40e363ed3e0486aeeeb0d173f7f822737e ]

Add a check if pdev->bus->self == NULL (root bus). When attaching
a netxen NIC to a VM it can be on the root bus and the guest would
crash in netxen_mask_aer_correctable() because of a NULL pointer
dereference if CONFIG_PCIEAER is present.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c 
b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
index 342b3a7..a77c558 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
@@ -1378,6 +1378,10 @@ static void netxen_mask_aer_correctable(struct 
netxen_adapter *adapter)
        struct pci_dev *root = pdev->bus->self;
        u32 aer_pos;
 
+       /* root bus? */
+       if (!root)
+               return;
+
        if (adapter->ahw.board_type != NETXEN_BRDTYPE_P3_4_GB_MM &&
                adapter->ahw.board_type != NETXEN_BRDTYPE_P3_10G_TP)
                return;
-- 
1.7.11.4


From 59ccff67a37eab1af67574467ed73d75930978d0 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Tue, 11 Sep 2012 13:11:12 +0000
Subject: [PATCH 15/34] net-sched: sch_cbq: avoid infinite loop

[ Upstream commit bdfc87f7d1e253e0a61e2fc6a75ea9d76f7fc03a ]

Its possible to setup a bad cbq configuration leading to
an infinite loop in cbq_classify()

DEV_OUT=eth0
ICMP="match ip protocol 1 0xff"
U32="protocol ip u32"
DST="match ip dst"
tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \
        bandwidth 100mbit
tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \
        rate 512kbit allot 1500 prio 5 bounded isolated
tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \
        $ICMP $DST 192.168.3.234 flowid 1:

Reported-by: Denys Fedoryschenko <[email protected]>
Tested-by: Denys Fedoryschenko <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/sched/sch_cbq.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 6aabd77..564b9fc 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -250,10 +250,11 @@ cbq_classify(struct sk_buff *skb, struct Qdisc *sch, int 
*qerr)
                        else if ((cl = defmap[res.classid & TC_PRIO_MAX]) == 
NULL)
                                cl = defmap[TC_PRIO_BESTEFFORT];
 
-                       if (cl == NULL || cl->level >= head->level)
+                       if (cl == NULL)
                                goto fallback;
                }
-
+               if (cl->level >= head->level)
+                       goto fallback;
 #ifdef CONFIG_NET_CLS_ACT
                switch (result) {
                case TC_ACT_QUEUED:
-- 
1.7.11.4


From e985d11c858cc02bc0d61c23f3439f9b842fd6c6 Mon Sep 17 00:00:00 2001
From: Paolo Valente <[email protected]>
Date: Sat, 15 Sep 2012 00:41:35 +0000
Subject: [PATCH 16/34] pkt_sched: fix virtual-start-time update in QFQ

[ Upstream commit 71261956973ba9e0637848a5adb4a5819b4bae83 ]

If the old timestamps of a class, say cl, are stale when the class
becomes active, then QFQ may assign to cl a much higher start time
than the maximum value allowed. This may happen when QFQ assigns to
the start time of cl the finish time of a group whose classes are
characterized by a higher value of the ratio
max_class_pkt/weight_of_the_class with respect to that of
cl. Inserting a class with a too high start time into the bucket list
corrupts the data structure and may eventually lead to crashes.
This patch limits the maximum start time assigned to a class.

Signed-off-by: Paolo Valente <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/sched/sch_qfq.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 9af01f3..d92f169 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -831,7 +831,10 @@ static void qfq_update_start(struct qfq_sched *q, struct 
qfq_class *cl)
                if (mask) {
                        struct qfq_group *next = qfq_ffs(q, mask);
                        if (qfq_gt(roundedF, next->F)) {
-                               cl->S = next->F;
+                               if (qfq_gt(limit, next->F))
+                                       cl->S = next->F;
+                               else /* preserve timestamp correctness */
+                                       cl->S = limit;
                                return;
                        }
                }
-- 
1.7.11.4


From 5d5945648a633b24cc83e83d9058699f02b4bd84 Mon Sep 17 00:00:00 2001
From: Lennart Sorensen <[email protected]>
Date: Fri, 7 Sep 2012 12:14:02 +0000
Subject: [PATCH 17/34] sierra_net: Endianess bug fix.

[ Upstream commit 2120c52da6fe741454a60644018ad2a6abd957ac ]

I discovered I couldn't get sierra_net to work on a powerpc.  Turns out
the firmware attribute check assumes the system is little endian and
hence fails because the attributes is a 16 bit value.

Signed-off-by: Len Sorensen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/net/usb/sierra_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index d75d1f5..e43f1cf 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -678,7 +678,7 @@ static int sierra_net_get_fw_attr(struct usbnet *dev, u16 
*datap)
                return -EIO;
        }
 
-       *datap = *attrdata;
+       *datap = le16_to_cpu(*attrdata);
 
        kfree(attrdata);
        return result;
-- 
1.7.11.4


From a067f819d51c0e909860788a9e955e83b60c27f9 Mon Sep 17 00:00:00 2001
From: Antonio Quartulli <[email protected]>
Date: Tue, 2 Oct 2012 06:14:17 +0000
Subject: [PATCH 18/34] 8021q: fix mac_len recomputation in vlan_untag()

[ Upstream commit 5316cf9a5197eb80b2800e1acadde287924ca975 ]

skb_reset_mac_len() relies on the value of the skb->network_header pointer,
therefore we must wait for such pointer to be recalculated before computing
the new mac_len value.

Signed-off-by: Antonio Quartulli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/8021q/vlan_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 8ca533c..830059d 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -105,7 +105,6 @@ static struct sk_buff *vlan_reorder_header(struct sk_buff 
*skb)
                return NULL;
        memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 2 * ETH_ALEN);
        skb->mac_header += VLAN_HLEN;
-       skb_reset_mac_len(skb);
        return skb;
 }
 
@@ -139,6 +138,8 @@ struct sk_buff *vlan_untag(struct sk_buff *skb)
 
        skb_reset_network_header(skb);
        skb_reset_transport_header(skb);
+       skb_reset_mac_len(skb);
+
        return skb;
 
 err_free:
-- 
1.7.11.4


From 592550871eec84a19a4955be9cc0d08e124d4a7c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Linus=20L=C3=BCssing?= <[email protected]>
Date: Fri, 14 Sep 2012 00:40:54 +0000
Subject: [PATCH 19/34] batman-adv: make batadv_test_bit() return 0 or 1 only
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

[ Upstream commit dbd6b11e15a2f96030da17dbeda943a8a98ee990 ]

On some architectures test_bit() can return other values than 0 or 1:

With a generic x86 OpenWrt image in a kvm setup (batadv_)test_bit()
frequently returns -1 for me, leading to batadv_iv_ogm_update_seqnos()
wrongly signaling a protected seqno window.

This patch tries to fix this issue by making batadv_test_bit() return 0
or 1 only.

Signed-off-by: Linus Lüssing <[email protected]>
Acked-by: Sven Eckelmann <[email protected]>
Signed-off-by: Antonio Quartulli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/batman-adv/bitarray.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/batman-adv/bitarray.h b/net/batman-adv/bitarray.h
index 1835c15..7d1840b 100644
--- a/net/batman-adv/bitarray.h
+++ b/net/batman-adv/bitarray.h
@@ -22,8 +22,9 @@
 #ifndef _NET_BATMAN_ADV_BITARRAY_H_
 #define _NET_BATMAN_ADV_BITARRAY_H_
 
-/* returns true if the corresponding bit in the given seq_bits indicates true
- * and curr_seqno is within range of last_seqno */
+/* Returns 1 if the corresponding bit in the given seq_bits indicates true
+ * and curr_seqno is within range of last_seqno. Otherwise returns 0.
+ */
 static inline int bat_test_bit(const unsigned long *seq_bits,
                               uint32_t last_seqno, uint32_t curr_seqno)
 {
@@ -33,7 +34,7 @@ static inline int bat_test_bit(const unsigned long *seq_bits,
        if (diff < 0 || diff >= TQ_LOCAL_WINDOW_SIZE)
                return 0;
        else
-               return  test_bit(diff, seq_bits);
+               return test_bit(diff, seq_bits) != 0;
 }
 
 /* turn corresponding bit on, so we can remember that we got the packet */
-- 
1.7.11.4


From 077ca44c68581d619f430d326bb57d39fe120787 Mon Sep 17 00:00:00 2001
From: Gao feng <[email protected]>
Date: Wed, 19 Sep 2012 19:25:34 +0000
Subject: [PATCH 20/34] ipv6: release reference of ip6_null_entry's dst entry
 in __ip6_del_rt

[ Upstream commit 6825a26c2dc21eb4f8df9c06d3786ddec97cf53b ]

as we hold dst_entry before we call __ip6_del_rt,
so we should alse call dst_release not only return
-ENOENT when the rt6_info is ip6_null_entry.

and we already hold the dst entry, so I think it's
safe to call dst_release out of the write-read lock.

Signed-off-by: Gao feng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv6/route.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index becb048..a3ef2ad 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1485,17 +1485,18 @@ static int __ip6_del_rt(struct rt6_info *rt, struct 
nl_info *info)
        struct fib6_table *table;
        struct net *net = dev_net(rt->dst.dev);
 
-       if (rt == net->ipv6.ip6_null_entry)
-               return -ENOENT;
+       if (rt == net->ipv6.ip6_null_entry) {
+               err = -ENOENT;
+               goto out;
+       }
 
        table = rt->rt6i_table;
        write_lock_bh(&table->tb6_lock);
-
        err = fib6_del(rt, info);
-       dst_release(&rt->dst);
-
        write_unlock_bh(&table->tb6_lock);
 
+out:
+       dst_release(&rt->dst);
        return err;
 }
 
-- 
1.7.11.4


From 820c6de05d0bd6d77eea64fa26316185db1ea87a Mon Sep 17 00:00:00 2001
From: Nicolas Dichtel <[email protected]>
Date: Wed, 26 Sep 2012 00:04:55 +0000
Subject: [PATCH 21/34] ipv6: del unreachable route when an addr is deleted on
 lo

[ Upstream commit 64c6d08e6490fb18cea09bb03686c149946bd818 ]

When an address is added on loopback (ip -6 a a 2002::1/128 dev lo), two routes
are added:
 - one in the local table:
    local 2002::1 via :: dev lo  proto none  metric 0
 - one the in main table (for the prefix):
    unreachable 2002::1 dev lo  proto kernel  metric 256  error -101

When the address is deleted, the route inserted in the main table remains
because we use rt6_lookup(), which returns NULL when dst->error is set, which
is the case here! Thus, it is better to use ip6_route_lookup() to avoid this
kind of filter.

Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv6/addrconf.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 938ba6d..0344f8e 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -793,10 +793,16 @@ static void ipv6_del_addr(struct inet6_ifaddr *ifp)
                struct in6_addr prefix;
                struct rt6_info *rt;
                struct net *net = dev_net(ifp->idev->dev);
+               struct flowi6 fl6 = {};
+
                ipv6_addr_prefix(&prefix, &ifp->addr, ifp->prefix_len);
-               rt = rt6_lookup(net, &prefix, NULL, ifp->idev->dev->ifindex, 1);
+               fl6.flowi6_oif = ifp->idev->dev->ifindex;
+               fl6.daddr = prefix;
+               rt = (struct rt6_info *)ip6_route_lookup(net, &fl6,
+                                                        RT6_LOOKUP_F_IFACE);
 
-               if (rt && addrconf_is_prefix_route(rt)) {
+               if (rt != net->ipv6.ip6_null_entry &&
+                   addrconf_is_prefix_route(rt)) {
                        if (onlink == 0) {
                                ip6_del_rt(rt);
                                rt = NULL;
-- 
1.7.11.4


From 0b0b4e031aa43dff7a0de9d35d9efe20cc60a32f Mon Sep 17 00:00:00 2001
From: Wei Yongjun <[email protected]>
Date: Thu, 20 Sep 2012 18:29:56 +0000
Subject: [PATCH 22/34] ipv6: fix return value check in fib6_add()

[ Upstream commit f950c0ecc78f745e490d615280e031de4dbb1306 ]

In case of error, the function fib6_add_1() returns ERR_PTR()
or NULL pointer. The ERR_PTR() case check is missing in fib6_add().

dpatch engine is used to generated this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv6/ip6_fib.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 6083276..0907191 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -818,6 +818,10 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, 
struct nl_info *info)
                                        offsetof(struct rt6_info, rt6i_src),
                                        allow_create, replace_required);
 
+                       if (IS_ERR(sn)) {
+                               err = PTR_ERR(sn);
+                               sn = NULL;
+                       }
                        if (!sn) {
                                /* If it is failed, discard just allocated
                                   root, and then (in st_failure) stale node
-- 
1.7.11.4


From 614cf46bd6e320f786593179b5a36898f9f3c58b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Kube=C4=8Dek?= <[email protected]>
Date: Fri, 14 Sep 2012 04:59:52 +0000
Subject: [PATCH 23/34] tcp: flush DMA queue before sk_wait_data if rcv_wnd is
 zero

[ Upstream commit 15c041759bfcd9ab0a4e43f1c16e2644977d0467 ]

If recv() syscall is called for a TCP socket so that
  - IOAT DMA is used
  - MSG_WAITALL flag is used
  - requested length is bigger than sk_rcvbuf
  - enough data has already arrived to bring rcv_wnd to zero
then when tcp_recvmsg() gets to calling sk_wait_data(), receive
window can be still zero while sk_async_wait_queue exhausts
enough space to keep it zero. As this queue isn't cleaned until
the tcp_service_net_dma() call, sk_wait_data() cannot receive
any data and blocks forever.

If zero receive window and non-empty sk_async_wait_queue is
detected before calling sk_wait_data(), process the queue first.

Signed-off-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv4/tcp.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7ac0ce6..56e9fa7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1706,8 +1706,14 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, 
struct msghdr *msg,
                }
 
 #ifdef CONFIG_NET_DMA
-               if (tp->ucopy.dma_chan)
-                       dma_async_memcpy_issue_pending(tp->ucopy.dma_chan);
+               if (tp->ucopy.dma_chan) {
+                       if (tp->rcv_wnd == 0 &&
+                           !skb_queue_empty(&sk->sk_async_wait_queue)) {
+                               tcp_service_net_dma(sk, true);
+                               tcp_cleanup_rbuf(sk, copied);
+                       } else
+                               
dma_async_memcpy_issue_pending(tp->ucopy.dma_chan);
+               }
 #endif
                if (copied >= target) {
                        /* Do not sleep, just process backlog. */
-- 
1.7.11.4


From 1743e04ab007ade760e637ddd400f88006aae296 Mon Sep 17 00:00:00 2001
From: Thomas Graf <[email protected]>
Date: Mon, 3 Sep 2012 04:27:42 +0000
Subject: [PATCH 24/34] sctp: Don't charge for data in sndbuf again when
 transmitting packet

[ Upstream commit 4c3a5bdae293f75cdf729c6c00124e8489af2276 ]

SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up
by __sctp_write_space().

Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
which calls __sctp_write_space() directly.

Buffer space charged via skb_set_owner_w() is released via sock_wfree()
which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.

Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
interrupted by a signal.

This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...

Charging for the data twice does not make sense in the first place, it
leads to overcharging sndbuf by a factor 2. Therefore this patch only
charges a single byte in wmem_alloc when transmitting an SCTP packet to
ensure that the socket stays alive until the packet has been released.

This means that control chunks are no longer accounted for in wmem_alloc
which I believe is not a problem as skb->truesize will typically lead
to overcharging anyway and thus compensates for any control overhead.

Signed-off-by: Thomas Graf <[email protected]>
CC: Vlad Yasevich <[email protected]>
CC: Neil Horman <[email protected]>
CC: David Miller <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/sctp/output.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 6ae47ac..83dddb9 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -339,6 +339,25 @@ finish:
        return retval;
 }
 
+static void sctp_packet_release_owner(struct sk_buff *skb)
+{
+       sk_free(skb->sk);
+}
+
+static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
+{
+       skb_orphan(skb);
+       skb->sk = sk;
+       skb->destructor = sctp_packet_release_owner;
+
+       /*
+        * The data chunks have already been accounted for in sctp_sendmsg(),
+        * therefore only reserve a single byte to keep socket around until
+        * the packet has been transmitted.
+        */
+       atomic_inc(&sk->sk_wmem_alloc);
+}
+
 /* All packets are sent to the network through this function from
  * sctp_outq_tail().
  *
@@ -380,7 +399,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
        /* Set the owning socket so that we know where to get the
         * destination IP address.
         */
-       skb_set_owner_w(nskb, sk);
+       sctp_packet_set_owner_w(nskb, sk);
 
        if (!sctp_transport_dst_check(tp)) {
                sctp_transport_route(tp, NULL, sctp_sk(sk));
-- 
1.7.11.4


From 66215ac2e4fb92b51c2f49eaf326c45bdf06088f Mon Sep 17 00:00:00 2001
From: Xiaodong Xu <[email protected]>
Date: Sat, 22 Sep 2012 00:09:32 +0000
Subject: [PATCH 25/34] pppoe: drop PPPOX_ZOMBIEs in pppoe_release

[ Upstream commit 2b018d57ff18e5405823e5cb59651a5b4d946d7b ]

When PPPOE is running over a virtual ethernet interface (e.g., a
bonding interface) and the user tries to delete the interface in case
the PPPOE state is ZOMBIE, the kernel will loop forever while
unregistering net_device for the reference count is not decreased to
zero which should have been done with dev_put().

Signed-off-by: Xiaodong Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/net/ppp/pppoe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index cbf7047..20f31d0 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -570,7 +570,7 @@ static int pppoe_release(struct socket *sock)
 
        po = pppox_sk(sk);
 
-       if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND)) {
+       if (sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) {
                dev_put(po->pppoe_dev);
                po->pppoe_dev = NULL;
        }
-- 
1.7.11.4


From fc6777bd29dc4c480ce9533f48e7a921c6e72ed2 Mon Sep 17 00:00:00 2001
From: Chema Gonzalez <[email protected]>
Date: Fri, 7 Sep 2012 13:40:50 +0000
Subject: [PATCH 26/34] net: small bug on rxhash calculation

[ Upstream commit 6862234238e84648c305526af2edd98badcad1e0 ]

In the current rxhash calculation function, while the
sorting of the ports/addrs is coherent (you get the
same rxhash for packets sharing the same 4-tuple, in
both directions), ports and addrs are sorted
independently. This implies packets from a connection
between the same addresses but crossed ports hash to
the same rxhash.

For example, traffic between A=S:l and B=L:s is hashed
(in both directions) from {L, S, {s, l}}. The same
rxhash is obtained for packets between C=S:s and D=L:l.

This patch ensures that you either swap both addrs and ports,
or you swap none. Traffic between A and B, and traffic
between C and D, get their rxhash from different sources
({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D)

The patch is co-written with Eric Dumazet <[email protected]>

Signed-off-by: Chema Gonzalez <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/core/dev.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e2215ee..a3bc7bf 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2615,15 +2615,16 @@ void __skb_get_rxhash(struct sk_buff *skb)
        if (!skb_flow_dissect(skb, &keys))
                return;
 
-       if (keys.ports) {
-               if ((__force u16)keys.port16[1] < (__force u16)keys.port16[0])
-                       swap(keys.port16[0], keys.port16[1]);
+       if (keys.ports)
                skb->l4_rxhash = 1;
-       }
 
        /* get a consistent hash (same value on both flow directions) */
-       if ((__force u32)keys.dst < (__force u32)keys.src)
+       if (((__force u32)keys.dst < (__force u32)keys.src) ||
+           (((__force u32)keys.dst == (__force u32)keys.src) &&
+            ((__force u16)keys.port16[1] < (__force u16)keys.port16[0]))) {
                swap(keys.dst, keys.src);
+               swap(keys.port16[0], keys.port16[1]);
+       }
 
        hash = jhash_3words((__force u32)keys.dst,
                            (__force u32)keys.src,
-- 
1.7.11.4


From 694e6baad3988f75a8b859c8f39fb7dcdbfffe3a Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Mon, 24 Sep 2012 07:00:11 +0000
Subject: [PATCH 27/34] net: guard tcp_set_keepalive() to tcp sockets

[ Upstream commit 3e10986d1d698140747fcfc2761ec9cb64c1d582 ]

Its possible to use RAW sockets to get a crash in
tcp_set_keepalive() / sk_reset_timer()

Fix is to make sure socket is a SOCK_STREAM one.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/core/sock.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index fcf8fdf..c5f765c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -636,7 +636,8 @@ set_rcvbuf:
 
        case SO_KEEPALIVE:
 #ifdef CONFIG_INET
-               if (sk->sk_protocol == IPPROTO_TCP)
+               if (sk->sk_protocol == IPPROTO_TCP &&
+                   sk->sk_type == SOCK_STREAM)
                        tcp_set_keepalive(sk, valbool);
 #endif
                sock_valbool_flag(sk, SOCK_KEEPOPEN, valbool);
-- 
1.7.11.4


From 553b8ef6d807518c47345a8b1296fa0d1df2e945 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Sat, 22 Sep 2012 00:08:29 +0000
Subject: [PATCH 28/34] ipv4: raw: fix icmp_filter()

[ Upstream commit ab43ed8b7490cb387782423ecf74aeee7237e591 ]

icmp_filter() should not modify its input, or else its caller
would need to recompute ip_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv4/raw.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 4032b81..d53aa77 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -131,18 +131,20 @@ found:
  *     0 - deliver
  *     1 - block
  */
-static __inline__ int icmp_filter(struct sock *sk, struct sk_buff *skb)
+static int icmp_filter(const struct sock *sk, const struct sk_buff *skb)
 {
-       int type;
+       struct icmphdr _hdr;
+       const struct icmphdr *hdr;
 
-       if (!pskb_may_pull(skb, sizeof(struct icmphdr)))
+       hdr = skb_header_pointer(skb, skb_transport_offset(skb),
+                                sizeof(_hdr), &_hdr);
+       if (!hdr)
                return 1;
 
-       type = icmp_hdr(skb)->type;
-       if (type < 32) {
+       if (hdr->type < 32) {
                __u32 data = raw_sk(sk)->filter.data;
 
-               return ((1 << type) & data) != 0;
+               return ((1U << hdr->type) & data) != 0;
        }
 
        /* Do not block unknown ICMP types */
-- 
1.7.11.4


From 5f5ab6141d12f4c2ee079597fee39c4dc4cb44f4 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Tue, 25 Sep 2012 07:03:40 +0000
Subject: [PATCH 29/34] ipv6: raw: fix icmpv6_filter()

[ Upstream commit 1b05c4b50edbddbdde715c4a7350629819f6655e ]

icmpv6_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull() and
change the prototype to make clear both sk and skb are const.

Also, if icmpv6 header cannot be found, do not deliver the packet,
as we do in IPv4.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv6/raw.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 93d6983..9dca4a8 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -107,21 +107,20 @@ found:
  *     0 - deliver
  *     1 - block
  */
-static __inline__ int icmpv6_filter(struct sock *sk, struct sk_buff *skb)
+static int icmpv6_filter(const struct sock *sk, const struct sk_buff *skb)
 {
-       struct icmp6hdr *icmph;
-       struct raw6_sock *rp = raw6_sk(sk);
-
-       if (pskb_may_pull(skb, sizeof(struct icmp6hdr))) {
-               __u32 *data = &rp->filter.data[0];
-               int bit_nr;
+       struct icmp6hdr *_hdr;
+       const struct icmp6hdr *hdr;
 
-               icmph = (struct icmp6hdr *) skb->data;
-               bit_nr = icmph->icmp6_type;
+       hdr = skb_header_pointer(skb, skb_transport_offset(skb),
+                                sizeof(_hdr), &_hdr);
+       if (hdr) {
+               const __u32 *data = &raw6_sk(sk)->filter.data[0];
+               unsigned int type = hdr->icmp6_type;
 
-               return (data[bit_nr >> 5] & (1 << (bit_nr & 31))) != 0;
+               return (data[type >> 5] & (1U << (type & 31))) != 0;
        }
-       return 0;
+       return 1;
 }
 
 #if defined(CONFIG_IPV6_MIP6) || defined(CONFIG_IPV6_MIP6_MODULE)
-- 
1.7.11.4


From bb98b26d7dc8c73f9367d184dba429624ea9e77d Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Tue, 25 Sep 2012 22:01:28 +0200
Subject: [PATCH 30/34] ipv6: mip6: fix mip6_mh_filter()

[ Upstream commit 96af69ea2a83d292238bdba20e4508ee967cf8cb ]

mip6_mh_filter() should not modify its input, or else its caller
would need to recompute ipv6_hdr() if skb->head is reallocated.

Use skb_header_pointer() instead of pskb_may_pull()

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/ipv6/mip6.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index 5b087c3..0f9bdc5 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -86,28 +86,30 @@ static int mip6_mh_len(int type)
 
 static int mip6_mh_filter(struct sock *sk, struct sk_buff *skb)
 {
-       struct ip6_mh *mh;
+       struct ip6_mh _hdr;
+       const struct ip6_mh *mh;
 
-       if (!pskb_may_pull(skb, (skb_transport_offset(skb)) + 8) ||
-           !pskb_may_pull(skb, (skb_transport_offset(skb) +
-                                ((skb_transport_header(skb)[1] + 1) << 3))))
+       mh = skb_header_pointer(skb, skb_transport_offset(skb),
+                               sizeof(_hdr), &_hdr);
+       if (!mh)
                return -1;
 
-       mh = (struct ip6_mh *)skb_transport_header(skb);
+       if (((mh->ip6mh_hdrlen + 1) << 3) > skb->len)
+               return -1;
 
        if (mh->ip6mh_hdrlen < mip6_mh_len(mh->ip6mh_type)) {
                LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH message too short: %d vs 
>=%d\n",
                               mh->ip6mh_hdrlen, mip6_mh_len(mh->ip6mh_type));
-               mip6_param_prob(skb, 0, ((&mh->ip6mh_hdrlen) -
-                                        skb_network_header(skb)));
+               mip6_param_prob(skb, 0, offsetof(struct ip6_mh, ip6mh_hdrlen) +
+                               skb_network_header_len(skb));
                return -1;
        }
 
        if (mh->ip6mh_proto != IPPROTO_NONE) {
                LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH invalid payload proto = 
%d\n",
                               mh->ip6mh_proto);
-               mip6_param_prob(skb, 0, ((&mh->ip6mh_proto) -
-                                        skb_network_header(skb)));
+               mip6_param_prob(skb, 0, offsetof(struct ip6_mh, ip6mh_proto) +
+                               skb_network_header_len(skb));
                return -1;
        }
 
-- 
1.7.11.4


From 6bda609d7b295492bdd8744c57488efc46d4c7d8 Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Tue, 4 Sep 2012 15:54:55 -0400
Subject: [PATCH 31/34] l2tp: fix a typo in l2tp_eth_dev_recv()

[ Upstream commit c0cc88a7627c333de50b07b7c60b1d49d9d2e6cc ]

While investigating l2tp bug, I hit a bug in eth_type_trans(),
because not enough bytes were pulled in skb head.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/l2tp/l2tp_eth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
index 47b259f..94840d8 100644
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -148,7 +148,7 @@ static void l2tp_eth_dev_recv(struct l2tp_session *session, 
struct sk_buff *skb,
                print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, skb->data, length);
        }
 
-       if (!pskb_may_pull(skb, sizeof(ETH_HLEN)))
+       if (!pskb_may_pull(skb, ETH_HLEN))
                goto error;
 
        secpath_reset(skb);
-- 
1.7.11.4


From a94739bd80cc022ab059ebc39c24540e550ccdf8 Mon Sep 17 00:00:00 2001
From: Alan Cox <[email protected]>
Date: Tue, 4 Sep 2012 04:13:18 +0000
Subject: [PATCH 32/34] netrom: copy_datagram_iovec can fail

[ Upstream commit 6cf5c951175abcec4da470c50565cc0afe6cd11d ]

Check for an error from this and if so bail properly.

Signed-off-by: Alan Cox <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/netrom/af_netrom.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 06592d8..1b9024e 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1169,7 +1169,12 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket 
*sock,
                msg->msg_flags |= MSG_TRUNC;
        }
 
-       skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+       er = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+       if (er < 0) {
+               skb_free_datagram(sk, skb);
+               release_sock(sk);
+               return er;
+       }
 
        if (sax != NULL) {
                sax->sax25_family = AF_NETROM;
-- 
1.7.11.4


From 6163dbb8cf79a4ce2f9526a03c948ccdec4d1af1 Mon Sep 17 00:00:00 2001
From: Ed Cashin <[email protected]>
Date: Wed, 19 Sep 2012 15:49:00 +0000
Subject: [PATCH 33/34] net: do not disable sg for packets requiring no
 checksum

[ Upstream commit c0d680e577ff171e7b37dbdb1b1bf5451e851f04 ]

A change in a series of VLAN-related changes appears to have
inadvertently disabled the use of the scatter gather feature of
network cards for transmission of non-IP ethernet protocols like ATA
over Ethernet (AoE).  Below is a reference to the commit that
introduces a "harmonize_features" function that turns off scatter
gather when the NIC does not support hardware checksumming for the
ethernet protocol of an sk buff.

  commit f01a5236bd4b140198fbcc550f085e8361fd73fa
  Author: Jesse Gross <[email protected]>
  Date:   Sun Jan 9 06:23:31 2011 +0000

      net offloading: Generalize netif_get_vlan_features().

The can_checksum_protocol function is not equipped to consider a
protocol that does not require checksumming.  Calling it for a
protocol that requires no checksum is inappropriate.

The patch below has harmonize_features call can_checksum_protocol when
the protocol needs a checksum, so that the network layer is not forced
to perform unnecessary skb linearization on the transmission of AoE
packets.  Unnecessary linearization results in decreased performance
and increased memory pressure, as reported here:

  http://www.spinics.net/lists/linux-mm/msg15184.html

The problem has probably not been widely experienced yet, because
only recently has the kernel.org-distributed aoe driver acquired the
ability to use payloads of over a page in size, with the patchset
recently included in the mm tree:

  https://lkml.org/lkml/2012/8/28/140

The coraid.com-distributed aoe driver already could use payloads of
greater than a page in size, but its users generally do not use the
newest kernels.

Signed-off-by: Ed Cashin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 net/core/dev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index a3bc7bf..8b3dee5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2119,7 +2119,8 @@ static bool can_checksum_protocol(netdev_features_t 
features, __be16 protocol)
 static netdev_features_t harmonize_features(struct sk_buff *skb,
        __be16 protocol, netdev_features_t features)
 {
-       if (!can_checksum_protocol(features, protocol)) {
+       if (skb->ip_summed != CHECKSUM_NONE &&
+           !can_checksum_protocol(features, protocol)) {
                features &= ~NETIF_F_ALL_CSUM;
                features &= ~NETIF_F_SG;
        } else if (illegal_highdma(skb->dev, skb)) {
-- 
1.7.11.4


From 45919b80a06a1e9371eb7d092cb1807ccc21b424 Mon Sep 17 00:00:00 2001
From: Ed Cashin <[email protected]>
Date: Wed, 19 Sep 2012 15:46:39 +0000
Subject: [PATCH 34/34] aoe: assert AoE packets marked as requiring no
 checksum

[ Upstream commit 8babe8cc6570ed896b7b596337eb8fe730c3ff45 ]

In order for the network layer to see that AoE requires
no checksumming in a generic way, the packets must be
marked as requiring no checksum, so we make this requirement
explicit with the assertion.

Signed-off-by: Ed Cashin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
 drivers/block/aoe/aoecmd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index de0435e..887f68f 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -35,6 +35,7 @@ new_skb(ulong len)
                skb_reset_mac_header(skb);
                skb_reset_network_header(skb);
                skb->protocol = __constant_htons(ETH_P_AOE);
+               skb_checksum_none_assert(skb);
        }
        return skb;
 }
-- 
1.7.11.4

[PATCHES] Networking -stable for 3.5.x

Reply via email to