Hi Slawa,

On 9/26/16 7:22 PM, Slawa Olhovchenkov wrote:
> On Mon, Sep 26, 2016 at 11:33:12AM +0200, Julien Charbon wrote:
>> On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote:
>>> On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote:
>>>>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
>>>>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
>>>>>>>  You can also use Dtrace and lockstat (especially with the lockstat -s
>>>>>>> option):
>>>>>>>
>>>>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
>>>>>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
>>>>>>>
>>>>>>>  But I am less familiar with Dtrace/lockstat tools.
>>>>>>
>>>>>> I am still use old kernel and got lockdown again.
>>>>>> Try using lockstat (I am save more output), interesting may be next:
>>>>>>
>>>>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 
>>>>>> events/sec)
>>>>>>
>>>>>> -------------------------------------------------------------------------------
>>>>>> Count indv cuml rcnt     nsec Lock                   Caller              
>>>>>>     
>>>>>> 140839  74%  74% 0.00    24659 tcpinp                 
>>>>>> tcp_tw_2msl_scan+0xc6   
>>>>>>
>>>>>>       nsec ------ Time Distribution ------ count     Stack               
>>>>>>     
>>>>>>       4096 |                               913       tcp_twstart+0xa3    
>>>>>>     
>>>>>>       8192 |@@@@@@@@@@@@                   58191     
>>>>>> tcp_do_segment+0x201f   
>>>>>>      16384 |@@@@@@                         29594     tcp_input+0xe1c     
>>>>>>     
>>>>>>      32768 |@@@@                           23447     ip_input+0x15f      
>>>>>>     
>>>>>>      65536 |@@@                            16197     
>>>>>>     131072 |@                              8674      
>>>>>>     262144 |                               3358      
>>>>>>     524288 |                               456       
>>>>>>    1048576 |                               9         
>>>>>> -------------------------------------------------------------------------------
>>>>>> Count indv cuml rcnt     nsec Lock                   Caller              
>>>>>>     
>>>>>> 49180  26% 100% 0.00    15929 tcpinp                 
>>>>>> tcp_tw_2msl_scan+0xc6   
>>>>>>
>>>>>>       nsec ------ Time Distribution ------ count     Stack               
>>>>>>     
>>>>>>       4096 |                               157       pfslowtimo+0x54     
>>>>>>     
>>>>>>       8192 |@@@@@@@@@@@@@@@                24796     
>>>>>> softclock_call_cc+0x179 
>>>>>>      16384 |@@@@@@                         11223     softclock+0x44      
>>>>>>     
>>>>>>      32768 |@@@@                           7426      
>>>>>> intr_event_execute_handlers+0x95
>>>>>>      65536 |@@                             3918      
>>>>>>     131072 |                               1363      
>>>>>>     262144 |                               278       
>>>>>>     524288 |                               19        
>>>>>> -------------------------------------------------------------------------------
>>>>>
>>>>>  This is interesting, it seems that you have two call paths competing
>>>>> for INP locks here:
>>>>>
>>>>>  - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
>>>>>
>>>>>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
>>>>
>>>> My current hypothesis:
>>>>
>>>> nginx do write() (or may be close()?) to socket, kernel lock
>>>> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same
>>>> CPU core and tcp_tw_2msl_scan infinity locked on same inp.
>>>>
>>>> In this case you modification can't help, before next try we need some
>>>> like yeld().
>>>
>>> Or may be locks leaks.
>>> Or both.
>>
>>  You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=0) is
>> infinitely blocked on INP_WLOCK() by "something" (that could be related
>> to write()).
>>
>>  As I reached my limit of debugging without WITNESS, could you share
>> your /etc/sysctl.conf, /boot/loader.conf files?  And any specific
>> configuration you have (like having a Nginx workers affinity, Nginx
>> special options, etc.).  Like that I can try to reproduce it on releng/11.0.
> 
> Some more traces from ddb:
> 
> Tracing command intr pid 12 tid 100103 td 0xfffff8012508ea00
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe2020ea8330
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020ea8360
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe2020ea83a0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe2020ea8420
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020ea8460
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020ea8490
> tcp_do_segment() at 0xffffffff80610226 = tcp_do_segment+0x1666/frame 
> 0xfffffe2020ea8590
> tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020ea86e0
> ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020ea8740
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020ea87a0
> ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame 
> 0xfffffe2020ea87d0
> ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame 
> 0xfffffe2020ea8830
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020ea8890
> ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame 
> 0xfffffe2020ea88b0
> ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame 
> 0xfffffe2020ea8990
> ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame 
> 0xfffffe2020ea89e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe2020ea8a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe2020ea8a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020ea8ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe2020ea8ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100105 td 0xfffff8012508e000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe2020eb2330
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020eb2360
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe2020eb23a0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe2020eb2420
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020eb2460
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020eb2490
> tcp_do_segment() at 0xffffffff80610226 = tcp_do_segment+0x1666/frame 
> 0xfffffe2020eb2590
> tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020eb26e0
> ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020eb2740
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020eb27a0
> ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame 
> 0xfffffe2020eb27d0
> ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame 
> 0xfffffe2020eb2830
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020eb2890
> ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame 
> 0xfffffe2020eb28b0
> ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame 
> 0xfffffe2020eb2990
> ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame 
> 0xfffffe2020eb29e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe2020eb2a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe2020eb2a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020eb2ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe2020eb2ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100107 td 0xfffff8012508d500
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe2020ebc2b0
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020ebc2e0
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe2020ebc320
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe2020ebc3a0
> soalloc() at 0xffffffff8051b914 = soalloc+0x1b4/frame 0xfffffe2020ebc3f0
> sonewconn() at 0xffffffff8051bb9f = sonewconn+0xbf/frame 0xfffffe2020ebc430
> syncache_expand() at 0xffffffff8061b85b = syncache_expand+0x78b/frame 
> 0xfffffe2020ebc590
> tcp_input() at 0xffffffff8060e10e = tcp_input+0xdae/frame 0xfffffe2020ebc6e0
> ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020ebc740
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020ebc7a0
> ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame 
> 0xfffffe2020ebc7d0
> ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame 
> 0xfffffe2020ebc830
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020ebc890
> ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame 
> 0xfffffe2020ebc8b0
> ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame 
> 0xfffffe2020ebc990
> ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame 
> 0xfffffe2020ebc9e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe2020ebca20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe2020ebca70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020ebcab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe2020ebcab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100111 td 0xfffff801250a2000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe2020f302f0
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020f30320
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe2020f30360
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe2020f303e0
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020f30420
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020f30450
> tcp_twstart() at 0xffffffff8061f0e7 = tcp_twstart+0x2b7/frame 
> 0xfffffe2020f30490
> tcp_do_segment() at 0xffffffff80610bdf = tcp_do_segment+0x201f/frame 
> 0xfffffe2020f30590
> tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020f306e0
> ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020f30740
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020f307a0
> ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame 
> 0xfffffe2020f307d0
> ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame 
> 0xfffffe2020f30830
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020f30890
> ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame 
> 0xfffffe2020f308b0
> ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame 
> 0xfffffe2020f30990
> ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame 
> 0xfffffe2020f309e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe2020f30a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe2020f30a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020f30ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe2020f30ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100113 td 0xfffff801250a1500
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe2020f3a2f0
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020f3a320
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe2020f3a360
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe2020f3a3e0
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020f3a420
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020f3a450
> tcp_twstart() at 0xffffffff8061f0e7 = tcp_twstart+0x2b7/frame 
> 0xfffffe2020f3a490
> tcp_do_segment() at 0xffffffff80610bdf = tcp_do_segment+0x201f/frame 
> 0xfffffe2020f3a590
> tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020f3a6e0
> ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020f3a740
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020f3a7a0
> ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame 
> 0xfffffe2020f3a7d0
> ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame 
> 0xfffffe2020f3a830
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020f3a890
> ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame 
> 0xfffffe2020f3a8b0
> ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame 
> 0xfffffe2020f3a990
> ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame 
> 0xfffffe2020f3a9e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe2020f3aa20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe2020f3aa70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020f3aab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe2020f3aab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100115 td 0xfffff801250a0a00
> cpustop_handler() at 0xffffffff80775998 = cpustop_handler+0x28/frame 
> 0xfffffe1f9e182cf0
> ipi_nmi_handler() at 0xffffffff8077595a = ipi_nmi_handler+0x4a/frame 
> 0xfffffe1f9e182d10
> trap() at 0xffffffff806e2e4a = trap+0x3a/frame 0xfffffe1f9e182f20
> nmi_calltrap() at 0xffffffff806cb413 = nmi_calltrap+0x8/frame 
> 0xfffffe1f9e182f20
> --- trap 0x13, rip = 0xffffffff8059b9f9, rsp = 0xfffffe2020f44420, rbp = 
> 0xfffffe2020f44420 ---
> in_pcbref() at 0xffffffff8059b9f9 = in_pcbref+0x9/frame 0xfffffe2020f44420
> tcp_tw_2msl_scan() at 0xffffffff8061f1a3 = tcp_tw_2msl_scan+0x73/frame 
> 0xfffffe2020f44450
> tcp_twstart() at 0xffffffff8061eed3 = tcp_twstart+0xa3/frame 
> 0xfffffe2020f44490
> tcp_do_segment() at 0xffffffff80610bdf = tcp_do_segment+0x201f/frame 
> 0xfffffe2020f44590
> tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020f446e0
> ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020f44740
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020f447a0
> ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame 
> 0xfffffe2020f447d0
> ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame 
> 0xfffffe2020f44830
> netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame 
> 0xfffffe2020f44890
> ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame 
> 0xfffffe2020f448b0
> ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame 
> 0xfffffe2020f44990
> ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame 
> 0xfffffe2020f449e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe2020f44a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe2020f44a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020f44ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe2020f44ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command ps pid 37011 tid 101992 td 0xfffff80144378000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe20224367b0
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe20224367e0
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe2022436820
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe20224368a0
> soalloc() at 0xffffffff8051b914 = soalloc+0x1b4/frame 0xfffffe20224368f0
> socreate() at 0xffffffff8051b617 = socreate+0xa7/frame 0xfffffe2022436940
> sys_socket() at 0xffffffff8052144d = sys_socket+0xed/frame 0xfffffe20224369a0
> amd64_syscall() at 0xffffffff806e4051 = amd64_syscall+0x2c1/frame 
> 0xfffffe2022436ab0
> Xfast_syscall() at 0xffffffff806cb2bb = Xfast_syscall+0xfb/frame 
> 0xfffffe2022436ab0
> --- syscall (97, FreeBSD ELF64, sys_socket), rip = 0x8011c413a, rsp = 
> 0x7fffffffc748, rbp = 0x7fffffffc770 ---
> 
> Tracing command cron pid 37008 tid 102228 td 0xfffff801a4090000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe20228d67b0
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe20228d67e0
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe20228d6820
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe20228d68a0
> soalloc() at 0xffffffff8051b914 = soalloc+0x1b4/frame 0xfffffe20228d68f0
> socreate() at 0xffffffff8051b617 = socreate+0xa7/frame 0xfffffe20228d6940
> sys_socket() at 0xffffffff8052144d = sys_socket+0xed/frame 0xfffffe20228d69a0
> amd64_syscall() at 0xffffffff806e4051 = amd64_syscall+0x2c1/frame 
> 0xfffffe20228d6ab0
> Xfast_syscall() at 0xffffffff806cb2bb = Xfast_syscall+0xfb/frame 
> 0xfffffe20228d6ab0
> --- syscall (97, FreeBSD ELF64, sys_socket), rip = 0x800d8c13a, rsp = 
> 0x7fffffffd658, rbp = 0x7fffffffd6f0 ---
> 
> [many likes]
> 
> Tracing command intr pid 12 tid 100015 td 0xfffff8011422b000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1cf760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1cf790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1cf7d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1cf850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1cf890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1cf8c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1cf8f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1cf9c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1cf9e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1cfa20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1cfa70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1cfab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1cfab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100016 td 0xfffff8011422aa00
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1d4760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1d4790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1d47d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1d4850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1d4890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1d48c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1d48f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1d49c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1d49e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1d4a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1d4a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1d4ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1d4ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100017 td 0xfffff8011422a500
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1d9760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1d9790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1d97d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1d9850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1d9890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1d98c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1d98f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1d99c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1d99e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1d9a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1d9a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1d9ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1d9ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100018 td 0xfffff8011422a000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1de760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1de790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1de7d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1de850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1de890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1de8c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1de8f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1de9c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1de9e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1dea20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1dea70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1deab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1deab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100019 td 0xfffff8011424da00
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1e3760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1e3790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1e37d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1e3850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1e3890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1e38c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1e38f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1e39c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1e39e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = intr_event_execute-- 
> 7:zsh -- time-stamp -- Sep/26/16 20:00:13 --
> -- 7:zsh -- time-stamp -- Sep/26/16 20:00:13 --
> _handlers+0x95/frame 0xfffffe1f9e1e3a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1e3a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1e3ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1e3ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100020 td 0xfffff8011424d500
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1e8760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1e8790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1e87d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1e8850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1e8890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1e88c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1e88f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1e89c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1e89e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1e8a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1e8a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1e8ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1e8ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100021 td 0xfffff8011424d000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1ed760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1ed790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1ed7d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1ed850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1ed890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1ed8c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1ed8f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1ed9c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1ed9e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1eda20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1eda70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1edab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1edab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100022 td 0xfffff8011424ca00
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1f2760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1f2790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1f27d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1f2850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1f2890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1f28c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1f28f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1f29c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1f29e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1f2a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1f2a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1f2ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1f2ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100023 td 0xfffff8011424c500
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1f7760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1f7790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1f77d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1f7850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1f7890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1f78c0
> tcp_timer_rexmt() at 0xffffffff8061e454 = tcp_timer_rexmt+0x114/frame 
> 0xfffffe1f9e1f78f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1f79c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1f79e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1f7a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1f7a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1f7ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1f7ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100024 td 0xfffff8011424c000
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe1f9e1fc760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1fc790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe1f9e1fc7d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe1f9e1fc850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1fc890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1fc8c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe1f9e1fc8f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe1f9e1fc9c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1fc9e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1fca20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe1f9e1fca70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1fcab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe1f9e1fcab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100025 td 0xfffff8011424ba00
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe0000382760
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe0000382790
> turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame 
> 0xfffffe00003827d0
> __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame 
> 0xfffffe0000382850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe0000382890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe00003828c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe00003828f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe00003829c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe00003829e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe0000382a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe0000382a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe0000382ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe0000382ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> Tracing command intr pid 12 tid 100026 td 0xfffff8011424b500
> sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame 
> 0xfffffe00003876f0
> mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe0000387720
> critical_exit() at 0xffffffff804a6bee = critical_exit+0x7e/frame 
> 0xfffffe0000387740
> ipi_bitmap_handler() at 0xffffffff80775629 = ipi_bitmap_handler+0x79/frame 
> 0xfffffe0000387780
> Xipi_intr_bitmap_handler() at 0xffffffff806cc15e = 
> Xipi_intr_bitmap_handler+0x8e/frame 0xfffffe0000387780
> --- interrupt, rip = 0xffffffff80484c1f, rsp = 0xfffffe0000387850, rbp = 
> 0xfffffe0000387850 ---
> __mtx_lock_flags() at 0xffffffff80484c1f = __mtx_lock_flags+0x2f/frame 
> 0xfffffe0000387850
> sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe0000387890
> tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe00003878c0
> tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame 
> 0xfffffe00003878f0
> softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame 
> 0xfffffe00003879c0
> softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe00003879e0
> intr_event_execute_handlers() at 0xffffffff8046c605 = 
> intr_event_execute_handlers+0x95/frame 0xfffffe0000387a20
> ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame 
> 0xfffffe0000387a70
> fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe0000387ab0
> fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 
> 0xfffffe0000387ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---

 Nice stack traces, all threads are blocked in sodealloc() or soalloc()
and if you look at how mtx_lock(&so_global_mtx) and
mtx_unlock(&so_global_mtx) are used, it is hard to think about a
scenario that can lead to this state.

 I am still trying to reproduce your issue, without success so far.

--
Julien

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to