On 1/28/2021 4:19 AM, Kuniyuki Iwashima wrote:
The commit 41b14fb8724d ("net: Do not clear the sock TX queue in
sk_set_socket()") removes sk_tx_queue_clear() from sk_set_socket() and adds
it instead in sk_alloc() and sk_clone_lock() to fix an issue introduced in
the commit e022f0b4a03f ("net: Introduce sk_tx_queue_mapping"). On the
other hand, the original commit had already put sk_tx_queue_clear() in
sk_prot_alloc(): the callee of sk_alloc() and sk_clone_lock(). Thus
sk_tx_queue_clear() is called twice in each path.

If we remove sk_tx_queue_clear() in sk_alloc() and sk_clone_lock(), it
currently works well because (i) sk_tx_queue_mapping is defined between
sk_dontcopy_begin and sk_dontcopy_end, and (ii) sock_copy() called after
sk_prot_alloc() in sk_clone_lock() does not overwrite sk_tx_queue_mapping.
However, if we move sk_tx_queue_mapping out of the no copy area, it
introduces a bug unintentionally.

Therefore, this patch adds a runtime

compile-time

check to take care of the order of
sock_copy() and sk_tx_queue_clear() and removes sk_tx_queue_clear() from
sk_prot_alloc() so that it does the only allocation and its callers
initialize fields.

v3:
* Remove Fixes: tag
* Add BUILD_BUG_ON
* Remove sk_tx_queue_clear() from sk_prot_alloc()
   instead of sk_alloc() and sk_clone_lock()

v2: https://lore.kernel.org/netdev/20210127132215.10842-1-kun...@amazon.co.jp/
* Remove Reviewed-by: tag

v1: https://lore.kernel.org/netdev/20210127125018.7059-1-kun...@amazon.co.jp/

CC: Tariq Toukan <tar...@mellanox.com>
CC: Boris Pismenny <bor...@mellanox.com>
Signed-off-by: Kuniyuki Iwashima <kun...@amazon.co.jp>
---
  net/core/sock.c | 11 ++++++++++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index bbcd4b97eddd..cfbd62a5e079 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1657,6 +1657,16 @@ static void sock_copy(struct sock *nsk, const struct 
sock *osk)
  #ifdef CONFIG_SECURITY_NETWORK
        void *sptr = nsk->sk_security;
  #endif
+
+       /* If we move sk_tx_queue_mapping out of the private section,
+        * we must check if sk_tx_queue_clear() is called after
+        * sock_copy() in sk_clone_lock().
+        */
+       BUILD_BUG_ON(offsetof(struct sock, sk_tx_queue_mapping) <
+                    offsetof(struct sock, sk_dontcopy_begin) ||
+                    offsetof(struct sock, sk_tx_queue_mapping) >=
+                    offsetof(struct sock, sk_dontcopy_end));
+
        memcpy(nsk, osk, offsetof(struct sock, sk_dontcopy_begin));
memcpy(&nsk->sk_dontcopy_end, &osk->sk_dontcopy_end,
@@ -1690,7 +1700,6 @@ static struct sock *sk_prot_alloc(struct proto *prot, 
gfp_t priority,
if (!try_module_get(prot->owner))
                        goto out_free_sec;
-               sk_tx_queue_clear(sk);
        }
return sk;

Reply via email to