Michal Mertl wrote:
> I'm getting panics on SMP -CURRENT while running apachebench (binary ab
> from apache distribution, not the Perl one) against httpd on the machine.
> 
> The panics don't occur when I have WITNESS and INVARIANTS turned on.

[ ... ]

> #10 0xc01bd46f in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:503
> #11 0xc01f7e1e in sofree (so=0xc58f05d0) at
> /usr/src/sys/kern/uipc_socket.c:312
> #12 0xc01fa508 in sonewconn (head=0xc43874d8, connstatus=2)
>     at /usr/src/sys/kern/uipc_socket2.c:208
> #13 0xc023f42f in syncache_socket (sc=0x2, lso=0xc43874d8, m=0xc1662200)
>     at /usr/src/sys/netinet/tcp_syncache.c:564
> #14 0xc023f748 in syncache_expand (inc=0xd6a62b3c, th=0xc1f6c834,
>     sop=0xd6a62b10, m=0xc1662200)
> /usr/src/sys/netinet/tcp_syncache.c:783
> #15 0xc0239978 in tcp_input (m=0xc1f6c834, off0=20)
>     at /usr/src/sys/netinet/tcp_input.c:713


soreserve() is called to reserve mbufs for the new socket; it in
turn calls sbreserve(), which fails because you have too few
mbufs in your system for the number of connections you have
configured.

This is a problem because the sotryfree() in sonewconn() (see
the definition in sys/socketvar.h) sees a so_count of zero, and
calls sofree() directly.

The sofree() fails because the socket is not enqueued as being
an incomplete connection, and not enqueued as being a complete
connection (not on a queue, and so_state does not have SS_INCOMP
or SS_COMP flags set).
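
The failure path above can be sketched as a small user-space model.
This is not the kernel source: the struct, the flag values, and the
model_* names are made up for illustration, and the model assumes the
new socket already points at its listening head when the free is
attempted. Every other check that the real sofree() performs is left
out.

```c
#include <stddef.h>

/* Flag values are illustrative, not the kernel's actual constants. */
#define SS_INCOMP 0x1   /* on the listener's incomplete-connection queue */
#define SS_COMP   0x2   /* on the listener's completed-connection queue */

/* Hypothetical stand-in for the few struct socket fields discussed. */
struct model_socket {
    int so_count;                 /* reference count */
    int so_state;                 /* SS_* flags */
    struct model_socket *so_head; /* listening socket, if any */
};

enum tryfree_result { TRYFREE_SKIPPED, TRYFREE_PROCEEDS, TRYFREE_PANICS };

/*
 * Models only the queue check described above: a socket that has a
 * listening head but is on neither of its queues is the case where
 * the kernel panics.
 */
enum tryfree_result model_sofree(const struct model_socket *so)
{
    if (so->so_head != NULL &&
        (so->so_state & (SS_INCOMP | SS_COMP)) == 0)
        return TRYFREE_PANICS;
    return TRYFREE_PROCEEDS;
}

/* Models sotryfree(): with a zero reference count it goes straight to sofree(). */
enum tryfree_result model_sotryfree(const struct model_socket *so)
{
    if (so->so_count == 0)
        return model_sofree(so);
    return TRYFREE_SKIPPED;
}
```

In this model, a socket with so_count of zero, a non-NULL so_head and
neither flag set reaches the panic branch, while the same socket with
a reference held is skipped.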

Basically, this code does not expect to be called in this case,
and the call occurs because the SYN cache code runs at NETISR.

Personally, I do not understand why a prereservation for mbufs
is necessary in this particular case: if you are out of mbufs,
the packets should end up dropped, in any case, so it should not
matter.  I guess it's an attempt to "protect you" from massive
connection attempts acting as a denial of service attack.

One "fix" would be to reference the socket before making the
call, in syncache_socket().  The basically correct way to do
this would be to invert the order of the "if" test in sonewconn()
(see attached patch).
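
What inverting the test changes is which step runs first, because
"||" short-circuits. A minimal sketch of just that ordering effect
follows; the step_log struct and the model_*/setup_* names are
hypothetical, and the sketch does not model why having the attach
done first avoids the panic.

```c
#include <stdbool.h>

/* Records which of the two setup steps were actually executed. */
struct step_log {
    bool attach_ran;
    bool reserve_ran;
};

/* Each step returns nonzero on failure, as the kernel calls do. */
int model_attach(struct step_log *log, int fail)
{
    log->attach_ran = true;
    return fail;
}

int model_reserve(struct step_log *log, int fail)
{
    log->reserve_ran = true;
    return fail;
}

/* Original order: reserve first; if it fails, attach is never called. */
int setup_original(struct step_log *log, int reserve_fails, int attach_fails)
{
    return model_reserve(log, reserve_fails) ||
        model_attach(log, attach_fails);
}

/* Patched order: attach first; reserve runs only if attach succeeded. */
int setup_patched(struct step_log *log, int reserve_fails, int attach_fails)
{
    return model_attach(log, attach_fails) ||
        model_reserve(log, reserve_fails);
}
```

With the reservation failing, setup_original() returns nonzero
without ever calling the attach step, while setup_patched() has
already attached by the time the reservation fails.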

This can also fail, though: if the protocol attach fails, then
it will still panic.  Also, if the protocol attach doesn't fail,
and there's an soabort(), if the protocol detach fails, it will
still call sotryfree() in the abort... and, once again, panic.

My suggestion:

1)      Try the attached patch; it will probably cover up the
        problem for you.

2)      Make sure the number of connections you allow is not
        larger than the number of mbufs, divided by 2, divided
        by the number of mbufs it takes to cover the value of
        net.inet.tcp.recvspace (i.e.: Do Not Overcommit
        Mbufs).

3)      Disable the use of "SYN cookies", e.g.:

                sysctl net.inet.tcp.syncookies=0

        SYN cookies are incredibly evil, and will put pressure
        on your resources by drastically increasing pool retention
        time, if they end up being invoked.
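
The ceiling in point 2 can be worked out as plain arithmetic. The
sketch below is only that; the function names are hypothetical, and
it assumes that "mbufs for recvspace" means the number of fixed-size
buffers needed to hold that many bytes, rounded up. The buffer size
is a parameter, not a claim about your kernel's configuration.

```c
#include <stddef.h>

/* Buffers needed to hold `bytes` when each buffer stores `buf_size` bytes. */
size_t mbufs_for_bytes(size_t bytes, size_t buf_size)
{
    if (buf_size == 0)
        return 0;
    return (bytes + buf_size - 1) / buf_size;   /* round up */
}

/*
 * Connection ceiling from point 2:
 *   total mbufs / 2 / (mbufs needed per connection's recvspace)
 */
size_t max_connections(size_t total_mbufs, size_t recvspace_bytes,
    size_t buf_size)
{
    size_t per_conn = mbufs_for_bytes(recvspace_bytes, buf_size);

    if (per_conn == 0)
        return 0;
    return total_mbufs / 2 / per_conn;
}
```

As a made-up example, 65536 buffers of 2048 bytes each with a
recvspace of 65536 bytes gives 32 buffers per connection, and
65536 / 2 / 32 = 1024 connections.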

-- Terry
Index: uipc_socket2.c
===================================================================
RCS file: /cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.104
diff -c -r1.104 uipc_socket2.c
*** uipc_socket2.c      18 Sep 2002 19:44:11 -0000      1.104
--- uipc_socket2.c      1 Nov 2002 17:16:39 -0000
***************
*** 203,210 ****
  #ifdef MAC
        mac_create_socket_from_socket(head, so);
  #endif
!       if (soreserve(so, head->so_snd.sb_hiwat, head->so_rcv.sb_hiwat) ||
!           (*so->so_proto->pr_usrreqs->pru_attach)(so, 0, NULL)) {
                sotryfree(so);
                return ((struct socket *)0);
        }
--- 203,210 ----
  #ifdef MAC
        mac_create_socket_from_socket(head, so);
  #endif
!       if ((*so->so_proto->pr_usrreqs->pru_attach)(so, 0, NULL) ||
!           soreserve(so, head->so_snd.sb_hiwat, head->so_rcv.sb_hiwat)) {
                sotryfree(so);
                return ((struct socket *)0);
        }
