From: Eric Dumazet <eduma...@google.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit f88649721268999bdff09777847080a52004f691 ]

When IP route cache had been removed in linux-3.6, we broke assumption
that dst entries were all freed after rcu grace period. DST_NOCACHE
dst were supposed to be freed from dst_release(). But it appears
we want to keep such dst around, either in UDP sockets or tunnels.

In sk_dst_get() we need to make sure dst refcount is not 0
before incrementing it, or else we might end up freeing a dst
twice.

DST_NOCACHE set on a dst does not mean this dst can not be attached
to a socket or a tunnel.

Then, before actual freeing, we need to observe a rcu grace period
to make sure all other cpus can catch the fact the dst is no longer
usable.

Signed-off-by: Eric Dumazet <eduma...@google.com>
Reported-by: Dormando <dorma...@rydia.net>
Signed-off-by: David S. Miller <da...@davemloft.net>
Signed-off-by: Jiri Slaby <jsl...@suse.cz>
---
 include/net/sock.h |  4 ++--
 net/core/dst.c     | 16 +++++++++++-----
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 4aa873a6267f..dc84ec518ff5 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1749,8 +1749,8 @@ sk_dst_get(struct sock *sk)
 
        rcu_read_lock();
        dst = rcu_dereference(sk->sk_dst_cache);
-       if (dst)
-               dst_hold(dst);
+       if (dst && !atomic_inc_not_zero(&dst->__refcnt))
+               dst = NULL;
        rcu_read_unlock();
        return dst;
 }
diff --git a/net/core/dst.c b/net/core/dst.c
index ca4231ec7347..15b6792e6ebb 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -267,6 +267,15 @@ again:
 }
 EXPORT_SYMBOL(dst_destroy);
 
+static void dst_destroy_rcu(struct rcu_head *head)
+{
+       struct dst_entry *dst = container_of(head, struct dst_entry, rcu_head);
+
+       dst = dst_destroy(dst);
+       if (dst)
+               __dst_free(dst);
+}
+
 void dst_release(struct dst_entry *dst)
 {
        if (dst) {
@@ -274,11 +283,8 @@ void dst_release(struct dst_entry *dst)
 
                newrefcnt = atomic_dec_return(&dst->__refcnt);
                WARN_ON(newrefcnt < 0);
-               if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt) {
-                       dst = dst_destroy(dst);
-                       if (dst)
-                               __dst_free(dst);
-               }
+               if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt)
+                       call_rcu(&dst->rcu_head, dst_destroy_rcu);
        }
 }
 EXPORT_SYMBOL(dst_release);
-- 
2.0.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to