Re: server freezes under heavy CPU usage

K R Tue, 01 Jul 2025 10:30:32 -0700

On Tue, Jul 1, 2025 at 9:07 AM Claudio Jeker <[email protected]> wrote:
>
> On Tue, Jul 01, 2025 at 04:57:14AM -0300, K R wrote:
> > On Mon, Jun 30, 2025 at 2:39 AM K R <[email protected]> wrote:
> > >
> > > On Fri, Jun 27, 2025 at 3:34 AM Martin Pieuchot <[email protected]> 
> > > wrote:
> > > >
> > > > On 26/06/25(Thu) 11:02, K R wrote:
> > > > > On Wed, Jun 25, 2025 at 1:30 PM K R <[email protected]> wrote:
> > > > > >
> > > > > > [...]
> > > > > > > Hi Alexander,
> > > > > > >
> > > > > > > The good news: I can consistently reproduce the hang problem.  
> > > > > > > But the
> > > > > > > bad news is that even with a WITNESS kernel and 
> > > > > > > kern.witness.watch=2
> > > > > > > (or even 3) I don't see any message or kernel panic.
> > > >
> > > > Do you mind sharing your recipe to reproduce the hang?
> >
> > Another one.  This time I've included output from 'show witness /b', below:
> >
> > pmap_tlb_shootwait: spun outStopped at  db_enter+0x14:  popq    %rbp
>
> I think the number one question here is why did pmap_tlb_shootwait spin
> out?
> This is waiting for other CPUs to finish their work.
> So first of all what value has tlb_shoot_wait (x /qx tlb_shoot_wait)
> The probably also the values of tlb_shoot_first_pcid, tlb_shoot_addr1 and
> tlb_shoot_addr2 are of interest. (use x /qx for the addrs and x /x for
> first_pcid).


Hi Claudio, please see below the output from the commands you
requested.  If you need something else, please let me know.

Thanks,
--Kor

pmap_tlb_shootwait: spun outStopped at  db_enter+0x14:  popq    %rbp
ddb{1}> trace
db_enter() at db_enter+0x14
pmap_enter(fffffd9ca7a56300,b411ac3f000,1cd64e7000,3,22) at pmap_enter+0x725
uvm_fault_lower(ffff800055e1c5b8,ffff800055e1c5f0,ffff800055e1c530) at uvm_faul
t_lower+0x27f
uvm_fault(fffffd9ca826c3c8,b411ac3f000,0,2) at uvm_fault+0x1c5
upageflttrap(ffff800055e1c720,b411ac3f000) at upageflttrap+0x6c
usertrap(ffff800055e1c720) at usertrap+0x1f5
recall_trap() at recall_trap+0x8
end of kernel
end trace frame: 0xb4079c21950, count: -7
ddb{1}> x /qx tlb_shoot_wait
tlb_shoot_wait:                    1
ddb{1}> x /qx tlb_shoot_addr1
tlb_shoot_addr1:                 66b92b45000
ddb{1}> x /qx tlb_shoot_addr2
tlb_shoot_addr2:                 66b92b50000
ddb{1}> x /x tlb_shoot_first_pcid
tlb_shoot_first_pcid:              1
ddb{1}> show all locks
Process 93180 (nfdump) thread 0xffff800055d267e0 (454561)
exclusive rwlock amaplk r = 0 (0xfffffd9bd3d855c8)
exclusive rwlock vmmaplk r = 0 (0xfffffd9ca7719a70)
Process 68612 (nfdump) thread 0xffff800055d26030 (18900)
exclusive rwlock amaplk r = 0 (0xfffffd9be2563698)
exclusive rwlock vmmaplk r = 0 (0xfffffd9ca51fb308)
Process 68612 (nfdump) thread 0xffff8000ffff07f8 (504595)
exclusive kernel_lock &kernel_lock r = 0 (0xffffffff82b2a910)
exclusive rrwlock inode r = 0 (0xfffffd9c20986d18)
Process 90287 (nfdump) thread 0xffff800055d96038 (392807)
exclusive rwlock amaplk r = 0 (0xfffffd9be5663930)
shared rwlock vmmaplk r = 0 (0xfffffd9ca826c4c8)
Process 90287 (nfdump) thread 0xffff8000ffff1238 (503279)
exclusive rwlock amaplk r = 0 (0xfffffd9be5663a08)
shared rwlock vmmaplk r = 0 (0xfffffd9ca826c4c8)
ddb{1}> show witness /b
Number of known direct relationships is 360

Lock order reversal between "&mp->mnt_lock"(rwlock) and "&ip->i_lock"(rrwlock)!

witness: incomplete path, depth 4
ddb{1}> ps /o
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
 454561  93180      0         0x3          0    2  nfdump
  18900  68612      0         0x3          0    8  nfdump
 246146  68612      0         0x3  0x4000000    7  nfdump
 504595  68612      0         0x3  0x4000000    6K nfdump
*392807  90287      0         0x3  0x4000000    1  nfdump
 503279  90287      0         0x3  0x4000000    0  nfdump

>
> The question to answer is why does tlb_shoot_wait not reach 0.
> The above information may shed some light (maybe).
>
> > ddb{3}> show locks
> > exclusive rwlock amaplk r = 0 (0xfffffd9619a06de0)
> > shared rwlock vmmaplk r = 0 (0xfffffd9ca1d22e38)
>
> Please also include 'show all locks'.
>
> > ddb{3}> show witness /b
> > Number of known direct relationships is 400
> >
> > Lock order reversal between "&mp->mnt_lock"(rwlock) and 
> > "&ip->i_lock"(rrwlock)!
> >
> > witness: incomplete path, depth 4
> > ddb{3}> trace
> > db_enter() at db_enter+0x14
> > pmap_enter(fffffda07ffdfc00,9f5a20cd000,160950d000,3,22) at pmap_enter+0x725
> > uvm_fault_lower(ffff800055e422b8,ffff800055e422f0,ffff800055e42230) at 
> > uvm_faul
> > t_lower+0x27f
> > uvm_fault(fffffd9ca1d22d38,9f5a20cd000,0,2) at uvm_fault+0x1c5
> > upageflttrap(ffff800055e42420,9f5a20cd000) at upageflttrap+0x6c
> > usertrap(ffff800055e42420) at usertrap+0x1f5
> > recall_trap() at recall_trap+0x8
> > end of kernel
> > end trace frame: 0x9f54e22cfd0, count: -7
>
> Looking at pmap_enter() I see this:
>
>         if (! pmap_valid_entry(opte)) {
>                 PTE_BASE[pl1_i(va)] = npte;
>         } else if ((opte | (npte ^ PG_RW)) & PG_RW) {
>                 /* previously writable or not making writable */
>                 PTE_BASE[pl1_i(va)] = npte;
>                 if (nocache && (opte & PG_N) == 0)
>                         wbinvd_on_all_cpus();
>                 pmap_tlb_shootpage(pmap, va, shootself);
>         } else {
>                 PTE_BASE[pl1_i(va)] = npte ^ PG_RW;
>                 if (nocache && (opte & PG_N) == 0) /* XXX impossible? */
>                         wbinvd_on_all_cpus();
>                 pmap_tlb_shootpage(pmap, va, shootself);
>                 pmap_tlb_shootwait();
>                 PTE_BASE[pl1_i(va)] = npte;
>         }
>
>         pmap_unmap_ptes(pmap, scr3);
>         pmap_tlb_shootwait();
>
>         error = 0;
>
> The else case of the above if block does pmap_tlb_shootwait() but then
> after that a 2nd pmap_tlb_shootwait() is issued. Not sure if this is safe
> in all cases. Especially the first pmap_tlb_shootwait() is done with the
> pm_mtx locked.
>
> --
> :wq Claudio

Re: server freezes under heavy CPU usage

Reply via email to