From: Waldemar Kozaczuk <jwkozac...@gmail.com>
Committer: Nadav Har'El <n...@scylladb.com>
Branch: master

Remove soisconnected() from syncache_socket()

This patch fixes an intermittent issue happening in TCP stack code
during session 3-way handshake protocol due to inconsistent state
of transmission control block (TCB) data structure. Specifically
right after server sends SYN+ACK to client the function syncache_socket()
would prematurely invoke soisconnected() to move child socket from so_incomp
queue to so_comp queue and release ACCEPT lock. This results in premature
accepting of new connection that leads to sending data using inconsistent
variables in TCB structure (snd_nxt, snd_una). Finally it is manifested
by client missing to receive 1st octet when server thinks it has sent all
data (see failing tst-tcp-nbwrite test).

This bug was inadvertently introduced when backporting a patch from FreeBSD
- https://github.com/cloudius-systems/osv/commit/330d85c2e858fd0f32f7377ce40ea21cd8a87041#diff-689dfa92b697b6e1327d6c0bad5fad26. Two years later part of the patch was reversed in FreeBSD codebase - https://github.com/freebsd/freebsd/commit/0b9cdc127b85d70aaf5f72c14f72e6f7bc43647f#diff-f3fd9f911dd76ffd9fbd0c8bfb458a61. For more details please see - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183659 and https://reviews.freebsd.org/D8072.

So in essence this patch removes invocation of soisconnected() along
with SOCK_LOCK from syncache_socket() and adds SOCK_LOCK right
before soisconnected() invocation in tcp_do_segment().

Unfortunately it is not clear why soisconnected() was added to syncache_socket() in first place in original freebsd patch backported in 2014 to OSv neither why
it was no longer necessary per patch in 2016. The former patch also removed
locking logic around V_tcbinfo in tcp_usr_accept() to alleviate lock contention
but it is not clear what the consequences of removing soisconnected() from
syncache_socket() might be. However based on over 1000 (1K) successful runs
of tst-tcp-nbwrite test and many executions of concurrent tests of rust-httpserver using apache benchmark tool with 100 concurrent clients, TCP stack seems to be
functioning fine and does seem to perform in similar way
(around 10-15K request per second with 4 cpus setup).

Fixes #859

Signed-off-by: Waldemar Kozaczuk <jwkozac...@gmail.com>
Message-Id: <1524721407-10659-1-git-send-email-jwkozac...@gmail.com>

---
diff --git a/bsd/sys/netinet/tcp_input.cc b/bsd/sys/netinet/tcp_input.cc
--- a/bsd/sys/netinet/tcp_input.cc
+++ b/bsd/sys/netinet/tcp_input.cc
@@ -1914,6 +1914,7 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so,
        case TCPS_SYN_RECEIVED:

                TCPSTAT_INC(tcps_connects);
+               SOCK_LOCK(so);
                soisconnected(so);
                /* Do window scaling? */
                if ((tp->t_flags & (TF_RCVD_SCALE|TF_REQ_SCALE)) ==
diff --git a/bsd/sys/netinet/tcp_syncache.cc b/bsd/sys/netinet/tcp_syncache.cc
--- a/bsd/sys/netinet/tcp_syncache.cc
+++ b/bsd/sys/netinet/tcp_syncache.cc
@@ -810,9 +810,6 @@ syncache_socket(struct syncache *sc, struct socket *lso, struct mbuf *m)

        INP_UNLOCK(inp);

-       SOCK_LOCK(so);
-       soisconnected(so);
-
        TCPSTAT_INC(tcps_accepts);
        return (so);

--
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to