On Wed, Jul 24, 2019 at 10:48:25PM +0200, Alexander Bluhm wrote:
> On Wed, Jul 24, 2019 at 08:59:44PM +0200, Alexander Bluhm wrote:
> > The reaper on CPU 0 does a NULL dereference when removing the page.
> > On CPU 1 zerothread is waiting for kernel lock.  CPU 2 and 3 are
> > idle.
> >
> > uvm_fault(0xfffffd8240760cc8, 0x7f827ea48908, 0, 2) -> e
> > kernel: page fault trap, code=0
> > Stopped at      pmap_page_remove+0x210: xchgq   %rax,0(%rcx,%rdx,1)
> 
> Forgot to mention, that was C source line pmap.c:1878
> 
>                 opte = pmap_pte_set(&PTE_BASE[pl1_i(pve->pv_va)], 0);
> 
> > I will update the kernel and see if the panic is reproducible.
> 
> It is reproducible.
> 
> ddb{3}> x/s version
> version:        OpenBSD 6.5-current (GENERIC.MP) #139: Wed Jul 24 05:11:28 MDT 2019\012    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP\012
> 
> ddb{3}> show panic
> kernel page fault
> uvm_fault(0xfffffd823efc7998, 0x7f8444c11f08, 0, 1) -> e
> pmap_enter(fffffd823e1ce3f8,889823e1000,5f3c2000,3,22) at pmap_enter+0x1d6
> end trace frame: 0xffff80002210ed30, count: 0
> 
> Now it happens in pmap.c:2624
> 
>         opte = PTE_BASE[pl1_i(va)];             /* old PTE */
> 
> Something in PTE_BASE array is not mapped.
> 

I wrote a quick program to calculate what address this would be (thinking
maybe we had some overflow or something) but it does indeed match the
faulting address above (0x7f8444c11f08) for the VA 0x889823e1000.
This address (0x7f8444c11f08) is in the PTE range, so it looks like the
page table page backing it was never allocated, or was possibly
double-freed. A double free would also fit the comment in the previous
email.
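
For reference, the calculation can be sketched roughly as below. This is
not the program I actually ran, just a minimal reconstruction. The macro
and function names are made up for illustration, and the constants (4K
pages, recursive PTE mapping in L4 slot 255) are assumptions modelled on
the amd64 pmap.h definitions, so check them against the tree you are
running.

```c
#include <stdint.h>

/*
 * Assumed amd64 layout: 4K pages, 8-byte PTEs, and the recursive
 * page table mapping living in L4 slot 255.  These names are
 * illustrative and are not copied from the kernel headers.
 */
#define L1_SHIFT	12
#define L4_SHIFT	39
#define L4_SLOT_PTE	255ULL
#define PTE_BASE_VA	(L4_SLOT_PTE << L4_SHIFT)	/* 0x7f8000000000 */
#define L1_FRAME	0x0000fffffffff000ULL		/* 48-bit VA, page aligned */
#define PG_OFFSET_MASK	0xfffULL

/* Index of the level 1 PTE for va, i.e. what pl1_i(va) is assumed to compute. */
uint64_t
pl1_index(uint64_t va)
{
	return (va & L1_FRAME) >> L1_SHIFT;
}

/* Virtual address of &PTE_BASE[pl1_i(va)] under the assumptions above. */
uint64_t
pte_vaddr(uint64_t va)
{
	return PTE_BASE_VA + pl1_index(va) * sizeof(uint64_t);
}

/* Start of the 4K page table page that holds the PTE for va. */
uint64_t
pte_page(uint64_t va)
{
	return pte_vaddr(va) & ~PG_OFFSET_MASK;
}
```

With va = 0x889823e1000 this gives pte_vaddr() = 0x7f8444c11f08, the
faulting address, and pte_page() = 0x7f8444c11000, so the neighbouring
page table pages start at 0x7f8444c10000 and 0x7f8444c12000.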

If this happens again, it might be interesting to see what pages around
that are mapped. For example, for this particular instance, to see if
0x7f8444c10000 is mapped, or 0x7f8444c12000. ddb's 'x' command can do that
(see if you get another fault or if you get some data). Maybe the data in
those pages around it might provide a hint (although that's a long shot).

-ml

> ddb{3}> trace
> pmap_enter(fffffd823e1ce3f8,889823e1000,5f3c2000,3,22) at pmap_enter+0x1d6
> uvm_fault(fffffd823efc7998,889823e1000,0,2) at uvm_fault+0xa2a
> pageflttrap() at pageflttrap+0x145
> usertrap(ffff80002210ee20) at usertrap+0x1e3
> recall_trap(6,dfdfdfdfdfdfdfdf,0,6,1000,8890b6fc7c0) at recall_trap+0x8
> end of kernel
> end trace frame: 0x888fdfc9330, count: -5
> 
> Note that on June 11th I reported a similar trace in pmap to bugs@
> when ld caused a crash.
> 
> ddb{3}> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
>  76368  342680   5059      0  2         0x2                malloc_duel
>  76368  101339   5059      0  7   0x4000002                malloc_duel
>  76368  514296   5059      0  3   0x4000082  fsleep        malloc_duel
> *76368  384915   5059      0  7   0x4000002                malloc_duel
>  76368  221830   5059      0  7   0x4000002                malloc_duel
>  76368  361827   5059      0  7   0x4000002                malloc_duel
>  76368  480274   5059      0  3   0x4000082  fsleep        malloc_duel
>  76368  468117   5059      0  3   0x4000082  fsleep        malloc_duel
>  76368  461971   5059      0  3   0x4000082  fsleep        malloc_duel
>  76368  266728   5059      0  2   0x4000002                malloc_duel
>  76368   82327   5059      0  2   0x4000002                malloc_duel
>   5059  194815   4702      0  3    0x10008a  pause         make
>   4702  434789  57398      0  3    0x10008a  pause         sh
>  57398  272052  80135      0  3    0x10008a  pause         make
>  80135   83438  74843      0  3    0x10008a  pause         sh
>  74843  269959  24644      0  3    0x10008a  pause         make
>  71213   91038  31378      0  3    0x100082  piperd        gzip
>  31378  297755  24644      0  3    0x100082  piperd        pax
>  24644  139228  73204      0  3        0x82  piperd        perl
>  73204  241400   3907      0  3    0x10008a  pause         ksh
>   3907  427314  77842      0  3        0x92  select        sshd
>  49732  259852      1      0  3    0x100083  ttyin         getty
>  58444  180559      1      0  3    0x100083  ttyin         getty
>  30659  289121      1      0  3    0x100083  ttyin         getty
>   9656  108850      1      0  3    0x100083  ttyin         getty
>  24203   10241      1      0  3    0x100083  ttyin         getty
>  65063  251469      1      0  3    0x100083  ttyin         getty
>  16142  523320      1      0  3    0x100098  poll          cron
>  90805    3316      0      0  3     0x14280  nfsidl        nfsio
>  11202  322177      0      0  3     0x14280  nfsidl        nfsio
>  73491  331359      0      0  3     0x14280  nfsidl        nfsio
>  37841  249018      0      0  3     0x14280  nfsidl        nfsio
>   4136  428500      1     99  3    0x100090  poll          sndiod
>  12112  519438      1    110  3    0x100090  poll          sndiod
>  49306   97767    137     95  3    0x100092  kqread        smtpd
>  70869  189393    137    103  3    0x100092  kqread        smtpd
>  79867  131344    137     95  3    0x100092  kqread        smtpd
>  66859  375509    137     95  3    0x100092  kqread        smtpd
>  22396   48018    137     95  3    0x100092  kqread        smtpd
>  16604   93317    137     95  3    0x100092  kqread        smtpd
>    137  452544      1      0  3    0x100080  kqread        smtpd
>  77842  219221      1      0  3        0x80  select        sshd
>  88298  318549      0      0  3     0x14200  acct          acct
>   7436  211089      1      0  3    0x100080  poll          ntpd
>  15596  214430  72873     83  3    0x100092  poll          ntpd
>  72873  423080      1     83  3    0x100092  poll          ntpd
>    639  455748   5843     74  3    0x100092  bpf           pflogd
>   5843  152563      1      0  3        0x80  netio         pflogd
>  49089   65344  96782     73  3    0x100090  kqread        syslogd
>  96782  134250      1      0  3    0x100082  netio         syslogd
>  15309   57931      1     77  3    0x100090  poll          dhclient
>  92131  300080      1      0  3        0x80  poll          dhclient
>    440  434925  45137    115  3    0x100092  kqread        slaacd
>  23230  157398  45137    115  3    0x100092  kqread        slaacd
>  45137  283018      1      0  3    0x100080  kqread        slaacd
>  11751  424885      0      0  3     0x14200  pgzero        zerothread
>  94669  233757      0      0  3     0x14200  aiodoned      aiodoned
>  39044  189625      0      0  3     0x14200  syncer        update
>  11265  246421      0      0  3     0x14200  cleaner       cleaner
>  86967  386950      0      0  3     0x14200  reaper        reaper
>  48511  221734      0      0  3     0x14200  pgdaemon      pagedaemon
>  27362  255648      0      0  3     0x14200  bored         crynlk
>  58949  107875      0      0  3     0x14200  bored         crypto
>  88305  317139      0      0  3     0x14200  bored         sensors
>  62804  248570      0      0  3     0x14200  usbtsk        usbtask
>    717  253829      0      0  3     0x14200  usbatsk       usbatsk
>  48070  263826      0      0  3  0x40014200  acpi0         acpi0
>  65386  442770      0      0  3  0x40014200                idle3
>  33089  148765      0      0  3  0x40014200                idle2
>  65055  498669      0      0  3  0x40014200                idle1
>  10578  506553      0      0  3     0x14200  bored         softnet
>  70559   53653      0      0  3     0x14200  bored         systqmp
>   6788     104      0      0  3     0x14200  bored         systq
>  23919  173929      0      0  3  0x40014200  bored         softclock
>  87424  241507      0      0  3  0x40014200                idle0
>  44349  256295      0      0  3     0x14200  bored         smr
>      1  488173      0      0  3        0x82  wait          init
>      0       0     -1      0  3     0x10200  scheduler     swapper
> 
> bluhm
> 
