Re: Panic with r346530 [Re: svn commit: r346530 - in head/sys: netinet netinet6]

2019-04-22 Thread Hans Petter Selasky

On 4/22/19 3:28 PM, Kristof Provost wrote:

On 22 Apr 2019, at 12:25, Enji Cooper wrote:
Either the sys/netinet/ or sys/netipsec/ tests triggered the panic. 
Not sure which right now.


That looks to be happening during a vnet jail teardown, so it’s likely 
the sys/netipsec or sys/netpfil/pf tests.


I’ve done a quick test with the pf tests, and they provoke this panic:

 panic: mtx_lock() of destroyed mutex @ /usr/src/sys/netinet/ip_reass.c:628
 cpuid = 0
 time = 1555939645
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0091d68530
 vpanic() at vpanic+0x19d/frame 0xfe0091d68580
 panic() at panic+0x43/frame 0xfe0091d685e0
 __mtx_lock_flags() at __mtx_lock_flags+0x12e/frame 0xfe0091d68630
 ipreass_cleanup() at ipreass_cleanup+0x86/frame 0xfe0091d68670
 if_detach_internal() at if_detach_internal+0x786/frame 0xfe0091d686f0
 if_detach() at if_detach+0x3d/frame 0xfe0091d68710
 lo_clone_destroy() at lo_clone_destroy+0x16/frame 0xfe0091d68730
 if_clone_destroyif() at if_clone_destroyif+0x21f/frame 0xfe0091d68780
 if_clone_detach() at if_clone_detach+0xb8/frame 0xfe0091d687b0
 vnet_loif_uninit() at vnet_loif_uninit+0x26/frame 0xfe0091d687d0
 vnet_destroy() at vnet_destroy+0x124/frame 0xfe0091d68800
 prison_deref() at prison_deref+0x29d/frame 0xfe0091d68840
 sys_jail_remove() at sys_jail_remove+0x28f/frame 0xfe0091d68890
 amd64_syscall() at amd64_syscall+0x276/frame 0xfe0091d689b0
 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe0091d689b0
 --- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x80031e12a, rsp = 0x7fffe848, rbp = 0x7fffe8d0 ---

 KDB: enter: panic
 [ thread pid 1223 tid 100501 ]
 Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
 db>

To reproduce:

     kldload pfsync
     cd /usr/tests/sys/netpfil/pf
     sudo kyua test



I'll revert r346530 until further testing has taken place.

Thank you!

--HPS

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic with r346530 [Re: svn commit: r346530 - in head/sys: netinet netinet6]

2019-04-22 Thread Kristof Provost

On 22 Apr 2019, at 12:25, Enji Cooper wrote:
Either the sys/netinet/ or sys/netipsec/ tests triggered the panic. 
Not sure which right now.


That looks to be happening during a vnet jail teardown, so it’s likely 
the sys/netipsec or sys/netpfil/pf tests.


I’ve done a quick test with the pf tests, and they provoke this panic:

panic: mtx_lock() of destroyed mutex @ /usr/src/sys/netinet/ip_reass.c:628
cpuid = 0
time = 1555939645
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0091d68530
vpanic() at vpanic+0x19d/frame 0xfe0091d68580
panic() at panic+0x43/frame 0xfe0091d685e0
__mtx_lock_flags() at __mtx_lock_flags+0x12e/frame 0xfe0091d68630
ipreass_cleanup() at ipreass_cleanup+0x86/frame 0xfe0091d68670
if_detach_internal() at if_detach_internal+0x786/frame 0xfe0091d686f0
if_detach() at if_detach+0x3d/frame 0xfe0091d68710
lo_clone_destroy() at lo_clone_destroy+0x16/frame 0xfe0091d68730
if_clone_destroyif() at if_clone_destroyif+0x21f/frame 0xfe0091d68780
if_clone_detach() at if_clone_detach+0xb8/frame 0xfe0091d687b0
vnet_loif_uninit() at vnet_loif_uninit+0x26/frame 0xfe0091d687d0
vnet_destroy() at vnet_destroy+0x124/frame 0xfe0091d68800
prison_deref() at prison_deref+0x29d/frame 0xfe0091d68840
sys_jail_remove() at sys_jail_remove+0x28f/frame 0xfe0091d68890
amd64_syscall() at amd64_syscall+0x276/frame 0xfe0091d689b0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe0091d689b0
--- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x80031e12a, rsp = 0x7fffe848, rbp = 0x7fffe8d0 ---

KDB: enter: panic
[ thread pid 1223 tid 100501 ]
Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
db>

To reproduce:

kldload pfsync
cd /usr/tests/sys/netpfil/pf
sudo kyua test

Regards,
Kristof


Panic with r346530 [Re: svn commit: r346530 - in head/sys: netinet netinet6]

2019-04-22 Thread Enji Cooper
Hi Hans,

> On Apr 22, 2019, at 1:32 AM, Hans Petter Selasky  wrote:
> 
> On 4/22/19 10:10 AM, Hans Petter Selasky wrote:
>> On 4/22/19 9:52 AM, Enji Cooper wrote:
>>> 
 On Apr 22, 2019, at 12:27 AM, Hans Petter Selasky wrote:
 
 Author: hselasky
 Date: Mon Apr 22 07:27:24 2019
 New Revision: 346530
 URL: https://svnweb.freebsd.org/changeset/base/346530
 
 Log:
   Fix panic in network stack due to memory use after free in relation to
   fragmented packets.
 
   When sending IPv4 and IPv6 fragmented packets and a fragment is lost,
   the mbuf making up the fragment will remain in the temporary hashed
   fragment list for a while. If the network interface departs before the
   so-called slow timeout clears the packet, the fragment causes a panic
   when the timeout kicks in due to accessing a freed network interface
   structure.
 
   Make sure that when a network device is departing, all hashed IPv4 and
   IPv6 fragments belonging to it, get freed.
 
   Backtrace:
   panic()
   icmp6_reflect()
 
   hlim = ND_IFINFO(m->m_pkthdr.rcvif)->chlim;
    rcvif->if_afdata[AF_INET6] is NULL.
 
   icmp6_error()
   frag6_freef()
   frag6_slowtimo()
   pfslowtimo()
   softclock_call_cc()
   softclock()
   ithread_loop()
 
   Differential Revision: https://reviews.freebsd.org/D19622
   Reviewed by: bz (network), adrian
   MFC after: 1 week
   Sponsored by: Mellanox Technologies
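
The cleanup step described in the log above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the code from r346530: the names `ifnet_model`, `frag`, `frag_table`, `frag_insert`, `frag_purge_ifp`, `frag_count`, `frag_model_selfcheck` and the bucket count are all hypothetical, and locking is left out entirely. It shows a hashed fragment list where every queued fragment remembers its receiving interface, and a purge routine that frees every fragment belonging to a departing interface so that no later timeout can dereference a stale interface pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NBUCKETS 4	/* hypothetical bucket count, for illustration only */

/* Stand-in for the kernel's interface structure (hypothetical). */
struct ifnet_model {
	int index;
};

/* One queued fragment; rcvif is the interface it arrived on. */
struct frag {
	struct frag *next;
	struct ifnet_model *rcvif;
};

/* Hashed fragment list: one singly linked list per bucket. */
struct frag_table {
	struct frag *bucket[NBUCKETS];
};

/* Queue a fragment received on ifp. Returns 0 on success, -1 if malloc fails. */
static int
frag_insert(struct frag_table *t, unsigned hash, struct ifnet_model *ifp)
{
	struct frag *f = malloc(sizeof(*f));

	if (f == NULL)
		return (-1);
	f->rcvif = ifp;
	f->next = t->bucket[hash % NBUCKETS];
	t->bucket[hash % NBUCKETS] = f;
	return (0);
}

/* Free every fragment that arrived on ifp; returns how many were freed. */
static int
frag_purge_ifp(struct frag_table *t, const struct ifnet_model *ifp)
{
	int freed = 0;

	for (size_t i = 0; i < NBUCKETS; i++) {
		struct frag **pp = &t->bucket[i];

		while (*pp != NULL) {
			struct frag *f = *pp;

			if (f->rcvif == ifp) {
				*pp = f->next;	/* unlink before freeing */
				free(f);
				freed++;
			} else {
				pp = &f->next;
			}
		}
	}
	return (freed);
}

/* Count the fragments still queued in the table. */
static int
frag_count(const struct frag_table *t)
{
	int n = 0;

	for (size_t i = 0; i < NBUCKETS; i++)
		for (const struct frag *f = t->bucket[i]; f != NULL; f = f->next)
			n++;
	return (n);
}

/* Self-check: purge one interface, the other's fragment must survive. */
static int
frag_model_selfcheck(void)
{
	struct frag_table t = { { NULL } };
	struct ifnet_model lo0 = { 1 }, em0 = { 2 };
	int ok;

	ok = frag_insert(&t, 7, &lo0) == 0 && frag_insert(&t, 7, &em0) == 0 &&
	    frag_insert(&t, 2, &lo0) == 0;
	if (ok) {
		assert(frag_purge_ifp(&t, &lo0) == 2);
		assert(frag_count(&t) == 1);
	}
	/* Release whatever is left so the model does not leak. */
	(void)frag_purge_ifp(&t, &lo0);
	(void)frag_purge_ifp(&t, &em0);
	return (ok && frag_count(&t) == 0 ? 0 : -1);
}
```

In this model, a detach path would call `frag_purge_ifp()` while the table and whatever protects it are still valid. The model leaves out locking and teardown ordering, which is where the panics reported in this thread occur.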
> 
> Should be fixed by
> 
> r346535
> 
> Else I'll revert.


...

The code compiles, but unfortunately panics when running the test suite. From 
https://ci.freebsd.org/job/FreeBSD-head-amd64-test/10926/console:

03:05:01  1st 0x820967f0 allprison (allprison) @ /usr/src/sys/kern/kern_jail.c:966
03:05:01  2nd 0x820c47f0 vnet_sysinit_sxlock (vnet_sysinit_sxlock) @ /usr/src/sys/net/vnet.c:575
03:05:01 stack backtrace:
03:05:01 #0 0x80c477f3 at witness_debugger+0x73
03:05:01 #1 0x80c4753d at witness_checkorder+0xa7d
03:05:01 #2 0x80be9088 at _sx_slock_int+0x68
03:05:01 #3 0x80d0ef97 at vnet_alloc+0x117
03:05:01 #4 0x80ba4111 at kern_jail_set+0x1bb1
03:05:01 #5 0x80ba5b70 at sys_jail_set+0x40
03:05:01 #6 0x810b2e16 at amd64_syscall+0x276
03:05:01 #7 0x8108b44d at fast_syscall_common+0x101
03:05:01 panic: mtx_lock() of destroyed mutex @ /usr/src/sys/netinet/ip_reass.c:628
03:05:01 cpuid = 1
03:05:01 time = 1555927501
03:05:01 KDB: stack backtrace:
03:05:01 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0030eec630
03:05:01 vpanic() at vpanic+0x19d/frame 0xfe0030eec680
03:05:01 panic() at panic+0x43/frame 0xfe0030eec6e0
03:05:02 __mtx_lock_flags() at __mtx_lock_flags+0x12e/frame 0xfe0030eec730
03:05:02 ipreass_cleanup() at ipreass_cleanup+0x86/frame 0xfe0030eec770
03:05:02 if_detach_internal() at if_detach_internal+0x786/frame 0xfe0030eec7f0
03:05:02 if_detach() at if_detach+0x3d/frame 0xfe0030eec810
03:05:02 lo_clone_destroy() at lo_clone_destroy+0x16/frame 0xfe0030eec830
03:05:02 if_clone_destroyif() at if_clone_destroyif+0x21f/frame 0xfe0030eec880
03:05:02 if_clone_detach() at if_clone_detach+0xb8/frame 0xfe0030eec8b0
03:05:02 vnet_loif_uninit() at vnet_loif_uninit+0x26/frame 0xfe0030eec8d0
03:05:02 vnet_destroy() at vnet_destroy+0x124/frame 0xfe0030eec900
03:05:02 prison_deref() at prison_deref+0x29d/frame 0xfe0030eec940
03:05:02 sys_jail_remove() at sys_jail_remove+0x28f/frame 0xfe0030eec990
03:05:02 amd64_syscall() at amd64_syscall+0x276/frame 0xfe0030eecab0
03:05:02 fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe0030eecab0
03:05:02 --- syscall (508, FreeBSD ELF64, sys_jail_remove), rip = 0x80031e12a, rsp = 0x7fffe998, rbp = 0x7fffea20 ---
03:05:02 KDB: enter: panic
03:05:02 [ thread pid 13109 tid 100150 ]
03:05:02 Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
03:05:02 db:0:kdb.enter.panic> show pcpu
03:05:02 cpuid        = 1
03:05:02 dynamic pcpu = 0xfe0080191800
03:05:02 curthread    = 0xf80005c1f000: pid 13109 tid 100150 "jail"
03:05:02 curpcb       = 0xfe0030eecb80
03:05:02 fpcurthread  = 0xf80005c1f000: pid 13109 "jail"
03:05:02 idlethread   = 0xf800032765a0: tid 14 "idle: cpu1"
03:05:02 curpmap      = 0xf8013d837130
03:05:02 tssp         = 0x821cd388
03:05:02 commontssp   = 0x821cd388
03:05:02 rsp0         = 0xfe0030eecb80
03:05:02 gs32p        = 0x821d3fc0
03:05:02 ldt          = 0x821d4000
03:05:02 tss          = 0x821d3ff0
03:05:02 tlb gen      = 314416
03:05:02 curvnet      = 0xf80139320200
03:05:02 spin locks held:
03:05:02 db:0:kdb.enter.panic> alltrace

Either the sys/netinet/ or sys/netipsec/ tests triggered the panic. Not
sure which right now.

Cheers,
-Enji