 Hi -net,

On 23/09/15 16:47, Julien Charbon wrote:
>  Thanks to Palle, I got access to the kernel dump.  And the result is
> more interesting than expected:  Somehow the kernel reaches a state
> in tcp_detach() where:
>
>  INP_TIMEWAIT && INP_DROPPED && tp != NULL
>
>  In detail:
>
>  - inp is in TIMEWAIT state
>  - inp has been dropped by in_pcbdrop()
>  - inp->inp_ppcb (a struct tcptw) is not NULL
>
>  All the related structures look good in the coredump: socket, inp,
> and tcptw, thus no sign of any memory corruption (so far).
>
>  And for the kernel, this state is _not_ ok.  Fortunately, there are
> only two functions that set the INP_DROPPED flag:
>
>  - tcp_twclose() and,
>  - tcp_close()
>
>  If tcp_twclose() is called, inp->inp_ppcb is set to NULL and the struct
> tcptw is freed (all good, no assertion)
>
>  If tcp_close() is called, inp->inp_ppcb is left untouched (less ok,
> potential assertion)
>
>  Almost all tcp_close() calls (or calls to tcp_close() parents) use a
> pattern like:
>
>  if (inp->inp_flags & INP_TIMEWAIT) {
>      /* Don't call tcp_close() just return */
>      return;
>  }
>
>  /* Call tcp_close() */
>  tcp_close();
>
>  But not _all_ tcp_close() calls.
>
>  Thus the most important point here is:  Either this assertion is wrong,
> or tcp_close() in INP_TIMEWAIT state should not happen.
>
>  This assert and the current tcp_close() behavior have been here for a
> long time, thus I would like old beards^W^W^W more experienced TCP stack
> developers to give an opinion/refresh their memories on this very
> specific case.

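
 To make the state quoted above concrete, here is a minimal userland
sketch, not kernel code.  Everything prefixed toy_/TOY_ is hypothetical
and only mirrors the behavior described in the quoted message: one
close path clears inp_ppcb, the other leaves it untouched, and the
guard pattern skips tcp_close() in TIMEWAIT.  Flag values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy flag values (assumption: arbitrary, not the kernel's). */
#define TOY_INP_TIMEWAIT 0x1
#define TOY_INP_DROPPED  0x2

struct toy_tcptw { int unused; };

struct toy_inpcb {
	int   inp_flags;
	void *inp_ppcb;		/* stands in for the tcptw pointer */
};

/* Build a toy inp that is already in TIMEWAIT with a tcptw attached. */
struct toy_inpcb
toy_timewait_inp(void)
{
	struct toy_inpcb inp = { TOY_INP_TIMEWAIT,
	    malloc(sizeof(struct toy_tcptw)) };
	return (inp);
}

/* Models the tcp_twclose() path: tcptw released, inp_ppcb cleared. */
void
toy_twclose(struct toy_inpcb *inp)
{
	free(inp->inp_ppcb);
	inp->inp_ppcb = NULL;
	inp->inp_flags |= TOY_INP_DROPPED;
}

/* Models the tcp_close() path: DROPPED set, inp_ppcb left untouched. */
void
toy_close(struct toy_inpcb *inp)
{
	inp->inp_flags |= TOY_INP_DROPPED;
}

/* The guard pattern used by most callers: no tcp_close() in TIMEWAIT. */
void
toy_guarded_close(struct toy_inpcb *inp)
{
	if (inp->inp_flags & TOY_INP_TIMEWAIT)
		return;
	toy_close(inp);
}

/* True for the state seen in the coredump. */
bool
toy_bad_state(const struct toy_inpcb *inp)
{
	return ((inp->inp_flags & TOY_INP_TIMEWAIT) != 0 &&
	    (inp->inp_flags & TOY_INP_DROPPED) != 0 &&
	    inp->inp_ppcb != NULL);
}
```

 In this model only an unguarded toy_close() on a TIMEWAIT inp makes
toy_bad_state() return true; toy_twclose() and toy_guarded_close()
never do.
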

 So the issue is:

 - tcp_close() is called for some reason with an inp in INP_TIMEWAIT
state and sets the INP_DROPPED flag,

 - tcp_detach() is called when the last reference on the socket is
dropped,

 and then in_pcbfree() can be called twice instead of once:

 1. First in tcp_detach():

static void
tcp_detach(struct socket *so, struct inpcb *inp)
{
        struct tcpcb *tp;

        tp = intotcpcb(inp);

        if (inp->inp_flags & INP_TIMEWAIT) {
                if (inp->inp_flags & INP_DROPPED) {
                        in_pcbdetach(inp);
                        in_pcbfree(inp);    <--
                }

 2. Second, when the tcptw expires, here:

void
tcp_twclose(struct tcptw *tw, int reuse)
{
        struct socket *so;
        struct inpcb *inp;

        inp = tw->tw_inpcb;
        tcp_tw_2msl_stop(tw, reuse);
        inp->inp_ppcb = NULL;
        in_pcbdrop(inp);

        so = inp->inp_socket;
        if (so != NULL) {
                ...
        } else {
                in_pcbfree(inp);    <--
        }

 This behavior is backed by Palle's kernel panic backtraces and
coredumps.

 o Solutions:

 Long:  Forbid calling tcp_close() when the inp is in INP_TIMEWAIT state,
the TCP stack rule being:

 - if !INP_TIMEWAIT: Call tcp_close()
 - if INP_TIMEWAIT: Call tcp_twclose() (or call nothing, the tcptw will
expire/be recycled anyway)

 Short:  if INP_TIMEWAIT & INP_DROPPED: Do not call in_pcbfree(inp) in
tcp_detach() unless the tcptw is already discarded.

 The long solution seems cleaner, and is backed by the old tcp_detach()
comments and by most of the current tcp_close() calls, but I won't take
that longer path without -net approval first.

 Thanks.

--
Julien

"For every complex problem there is an answer that is clear, simple,
and wrong" -- H. L. Mencken