On 16/08/25(Sat) 21:56, Kirill A. Korinsky wrote:
> On Fri, 15 Aug 2025 14:44:51 +0200,
> Martin Pieuchot <[email protected]> wrote:
> > 
> > On 07/08/25(Thu) 17:17, Kirill A. Korinsky wrote:
> > > On Wed, 06 Aug 2025 14:41:51 +0200,
> > > Kirill A. Korinsky <[email protected]> wrote:
> > > > 
> > > > On Wed, 06 Aug 2025 11:34:45 +0200,
> > > > Kirill A. Korinsky <[email protected]> wrote:
> > > > > 
> > > > > >How-To-Repeat:
> > > > >       Not sure, can guess that the next build will crash as well.
> > > > 
> > > > Indeed it does. Several attempts lead to the same state.
> > 
> > What is /etc/fstab for this machine?  Do you have a swap partition on a
> > local disk?
> 
> I have the following lines in fstab:
> 
> /swap.local none swap sw
> 
> #172.31.2.23:/volume1/octeon/swap none swap sw,nfsmntpt=/swap
> 
> #172.31.2.23:/volume1/octeon /mnt nfs rw,nodev,soft,intr,tcp,-x=2 0 0
> #/mnt/swap none swap sw
> 
> So there are three different ways to use swap on this device, with different
> (or similar?) issues.

That's a setup to swap over NFS...

> I mount ports as:
> 
> 172.31.2.23:/volume1/octeon/ports /usr/ports nfs rw,nodev,soft,intr,tcp,-x=2 0 0
> 
> 
> > > > 
> > > > Each attempt to build it leads to a crash or a frozen device.
> > > > 
> > > > If I mount the swap via /etc/fstab:
> > > > 
> > > > 172.31.2.23:/volume1/octeon/swap none swap sw,nfsmntpt=/swap
> > > > 
> > > > it freezes the device. When I use the following mount
> > > > 
> > > > 172.31.2.23:/volume1/octeon /mnt nfs rw,nodev,soft,intr,tcp,-x=2 0 0
> > > > 
> > > > and add swap via swapon /mnt/swap, it crashes on ISSET.
> > 
> > Could you reproduce this crash on an amd64?  octeon's ddb is too limited
> > and we don't even have a usable stack trace.
> >
> 
> Reproducing it on octeon is quite simple: I'm building lang/gcc/15 from
> ports with a very small amount of RAM.

Well, the machine is out of "low" pages and all I/O is blocked.

I added some workarounds to ensure swap on a disk makes progress in such
a case.  Unfortunately, swapping on a file requires even more low pages
and is completely broken in such a situation.
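
To illustrate why the file case is worse, here is a toy user-space model.
It is not kernel code; every name in it is hypothetical, and the page
costs are assumptions chosen only to mirror the ufs_bmap -> getblk ->
buf_get -> buf_alloc_pages chain visible in the quoted trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT OpenBSD kernel code; all names are hypothetical. */

enum swap_kind { SWAP_ON_DISK, SWAP_ON_FILE };

struct low_pool {
	int free_pages;		/* DMA-reachable ("low") pages still free */
};

/* Take n pages from the pool, or fail without sleeping. */
static bool
low_alloc(struct low_pool *p, int n)
{
	assert(n >= 0);
	if (p->free_pages < n)
		return false;
	p->free_pages -= n;
	return true;
}

/*
 * Low pages needed to start one pageout.  Assumption for this sketch:
 * both paths need one page for the I/O itself, and the file path needs
 * one more for the block-map buffer.
 */
static int
pageout_cost(enum swap_kind kind)
{
	return (kind == SWAP_ON_FILE) ? 2 : 1;
}

/* True if the pageout can start; false models the pagedaemon stalling. */
static bool
pageout_start(struct low_pool *p, enum swap_kind kind)
{
	return low_alloc(p, pageout_cost(kind));
}
```

With a single low page left, pageout_start() succeeds for SWAP_ON_DISK
and fails for SWAP_ON_FILE.  In the real kernel the failing caller is the
pagedaemon itself, so nothing is left to free the missing page.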

> So, I'll try, as the next step.
> 
> > > The next attempt, with swap on the local eMMC, leads to a somewhat frozen device.

Which driver attaches to this eMMC?  I saw only a USB disk in your
dmesg.

What did your fstab look like in that case?

> > This might be a completely different bug.
> > 
> > > I can't connect to it via SSH, but it replies to ping. Input on the
> > > serial console has no effect either, but it prints a message that
> > > repeats every few seconds (about 5-10):
> > > pagedaemon: wait_pla deadlock detected!
> > 
> > That means the pagedaemon needs memory to write to swap.  Could you turn
> > this printf into a panic and report:
> > 
> > show panic
> > trace
> > ps
> > show uvmexp
> > show bcstats
> > 
> > 
> 
> It already has a panic a few lines below; I just removed the ifdef.

I'm sorry, there's no easy fix for this problem.  I have been advocating
moving away from flipping buffers and using bounce buffers instead in OOM
situations.
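
For readers unfamiliar with the term, a bounce buffer can be sketched as
follows.  This is a user-space illustration only, not a proposed kernel
patch: the buffer, the fake device and all function names are
hypothetical.  The point it shows is that the DMA-reachable memory is
reserved once up front, so the pageout path copies into it and never
has to allocate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ	16384	/* matches pagesize in the quoted uvmexp output */
#define NSLOTS	4

/* Reserved once at startup, so the I/O path never allocates (hypothetical). */
static unsigned char bounce[PAGE_SZ];

/* Stand-in for a device that can only DMA from the bounce buffer. */
static unsigned char fake_disk[NSLOTS][PAGE_SZ];

static int
dev_write(const unsigned char *dma_buf, size_t slot)
{
	if (dma_buf != bounce || slot >= NSLOTS)
		return -1;
	memcpy(fake_disk[slot], dma_buf, PAGE_SZ);
	return 0;
}

/* Copy a "high" page into the reserved buffer, then write it out. */
static int
pageout_bounce(const unsigned char *high_page, size_t slot)
{
	assert(high_page != NULL);
	memcpy(bounce, high_page, PAGE_SZ);
	return dev_write(bounce, slot);
}
```

The cost is one extra copy per page; in exchange, the write no longer
depends on finding a free low page at the worst possible moment.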

This kind of bug is a result of the current design.  I'm sorry, but I
don't believe such bugs can be fixed.

> I re-ran the build from the console; here is the output of the requested
> commands. I haven't touched the device and it is still in this state, in
> case you need something else.
> 
> echo timestamp > s-opinit
> build/genattrtab /usr/ports/pobj/gcc-15.2.0/gcc-15.2.0/gcc/common.md /usr/ports/pobj/gcc-15.2.0/gcc-15.2.0/gcc/config/mips/mips.md insn-conditions.md \
>       -Atmp-attrtab.cc -Dtmp-dfatab.cc -Ltmp-latencytab.cc
> build/genautomata /usr/ports/pobj/gcc-15.2.0/gcc-15.2.0/gcc/common.md /usr/ports/pobj/gcc-15.2.0/gcc-15.2.0/gcc/config/mips/mips.md \
>   insn-conditions.md > tmp-automata.cc
> pagedaemon: wait_pla deadlock detected!
> panic: wait_pla pagedaemon deadlock
> Stopped at      db_enter+0x4:   jr      ra
> db_enter+0x8:    nop
>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *381199  95320      0     0x14000      0x200    1K pagedaemon
> db_enter+0x4 (c000000000008870,22942a29c8982d64,d2415c168fffd2f9,6eb60ac99bdd8705)  ra 0xffffffff8130a8a4 sp 0x980000000ffb31f0, sz 0
> panic+0x194 (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff81224320 sp 0x980000000ffb31f0, sz 96
> uvm_wait_pla+0x310 (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff81222e9c sp 0x980000000ffb3250, sz 144
> uvm_pmr_getpages+0x56c (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff8126396c sp 0x980000000ffb32e0, sz 352
> uvm_pagealloc_multi+0x134 (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff814ee5a4 sp 0x980000000ffb3440, sz 96
> buf_alloc_pages+0x12c (c000000000008870,bfddef12ada96607,81df87cc80217482,0)  ra 0xffffffff81562dc4 sp 0x980000000ffb34a0, sz 48
> buf_get+0x4fc (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0xffffffff81561348 sp 0x980000000ffb34d0, sz 352
> getblk+0xc8 (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0xffffffff81438930 sp 0x980000000ffb3630, sz 384
> ufs_bmaparray+0x208 (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0xffffffff814386e8 sp 0x980000000ffb37b0, sz 192
> ufs_bmap+0x68 (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0x0 sp 0x980000000ffb3870, sz 0
> User-level: pid 95320
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{1}> show panic
> *cpu1: wait_pla pagedaemon deadlock
> ddb{1}> trace
> db_enter+0x4 (c000000000008870,22942a29c8982d64,d2415c168fffd2f9,6eb60ac99bdd8705)  ra 0xffffffff8130a8a4 sp 0x980000000ffb31f0, sz 0
> panic+0x194 (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff81224320 sp 0x980000000ffb31f0, sz 96
> uvm_wait_pla+0x310 (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff81222e9c sp 0x980000000ffb3250, sz 144
> uvm_pmr_getpages+0x56c (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff8126396c sp 0x980000000ffb32e0, sz 352
> uvm_pagealloc_multi+0x134 (c000000000008870,bfddef12ada96607,b1de6cc78de63561,0)  ra 0xffffffff814ee5a4 sp 0x980000000ffb3440, sz 96
> buf_alloc_pages+0x12c (c000000000008870,bfddef12ada96607,81df87cc80217482,0)  ra 0xffffffff81562dc4 sp 0x980000000ffb34a0, sz 48
> buf_get+0x4fc (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0xffffffff81561348 sp 0x980000000ffb34d0, sz 352
> getblk+0xc8 (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0xffffffff81438930 sp 0x980000000ffb3630, sz 384
> ufs_bmaparray+0x208 (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0xffffffff814386e8 sp 0x980000000ffb37b0, sz 192
> ufs_bmap+0x68 (c000000000008870,bfddef12ada96607,81df87cc80217482,e827665b293575b7)  ra 0x0 sp 0x980000000ffb3870, sz 0
> User-level: pid 95320
> ddb{1}> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
>  55543   37715  57278   1000  3         0x3  pmrwait       genautomata
>  57278   38881  50037   1000  3    0x10008b  sigsusp       sh
>  56768  111679  50037   1000  3         0x3  pmrwait       genattrtab
>  50037  378525  44588   1000  3        0x8b  kqread        gmake
>  44588   10245  20729   1000  3    0x10008b  sigsusp       sh
>  20729  159518  70980   1000  3        0x83  wait          gmake
>  70980  231145   5676   1000  3    0x10008b  sigsusp       sh
>   5676   27602  24829   1000  3        0x83  wait          gmake
>  24829  294862  55988   1000  3    0x10008b  sigsusp       sh
>  55988  438510  11929   1000  3        0x83  wait          gmake
>  11929  425593  16673   1000  3    0x10008b  sigsusp       make
>  16673  424059  45449   1000  3    0x10008b  sigsusp       sh
>  45449  109491  18381   1000  3    0x10008b  sigsusp       make
>  18381  132643      1   1000  3    0x10008b  sigsusp       ksh
>  73769  183159      1      0  3    0x100098  kqread        cron
>  46434  243672      1    847  3    0x100080  kqread        iperf3
>  49955  319276      1     99  3   0x1100090  kqread        sndiod
>  94303  148911      1    110  3    0x100090  kqread        sndiod
>   1276   16048  31686     95  3   0x1100092  kqread        smtpd
>  25602  209225  31686    103  3   0x1100092  kqread        smtpd
>     78   41284  31686     95  3   0x1100092  kqread        smtpd
>  88691  122563  31686     95  3    0x100092  kqread        smtpd
>  35356  123053  31686     95  3   0x1100092  kqread        smtpd
>  29721  105092  31686     95  3   0x1100092  kqread        smtpd
>  31686  192741      1      0  3    0x100080  kqread        smtpd
>  68601  310558      1      0  3        0x88  kqread        sshd
>  63006   77877      0      0  3     0x14280  nfsidl        nfsio
>  43902   24965      0      0  3     0x14280  nfsidl        nfsio
>  88783  136950      0      0  3     0x14280  nfsidl        nfsio
>  21497   27761      0      0  3     0x14280  nfsidl        nfsio
>  30897  106440      1      0  3    0x100080  kqread        ntpd
>  53396  312302  51659     83  3    0x100092  kqread        ntpd
>  51659  192618      1     83  3   0x1100092  kqread        ntpd
>  84850  233922  19945     73  3   0x1100010  pmrwait       syslogd
>  19945  359570      1      0  3    0x100082  sbwait        syslogd
>  54186  440515      1      0  3    0x100080  kqread        resolvd
>  42742  348592  63384     77  3    0x100092  kqread        dhcpleased
>   3985   76355  63384     77  3    0x100092  kqread        dhcpleased
>  63384  517761      1      0  3        0x80  kqread        dhcpleased
>  97704  285042      0      0  3     0x14200  bored         smr
>  30501  491686      0      0  3  0x40014200                idle1
>  81509   83644      0      0  3     0x14200  pgzero        zerothread
>  43434  473479      0      0  3     0x14200  aiodoned      aiodoned
>  50903  205844      0      0  3     0x14200  syncer        update
>  94516  155396      0      0  3     0x14200  cleaner       cleaner
>  33174  344303      0      0  3     0x14200  reaper        reaper
> *95320  381199      0      0  7     0x14200                pagedaemon
>  32803   53802      0      0  3     0x14200  usbtsk        usbtask
>  18959  425530      0      0  3     0x14200  usbatsk       usbatsk
>  31514  160865      0      0  3     0x14200  bored         dwc2
>  17042  426903      0      0  3     0x14200  bored         softnet7
>  18591  447173      0      0  3     0x14200  bored         softnet6
>  18048  499232      0      0  3     0x14200  bored         softnet5
>  58402  155762      0      0  3     0x14200  bored         softnet4
>  73930  231238      0      0  3     0x14200  bored         softnet3
>   5664  417500      0      0  3     0x14200  bored         softnet2
>  12598  387718      0      0  3     0x14200  bored         softnet1
>  98089  271582      0      0  3     0x14200  bored         softnet0
>  25981  350671      0      0  3     0x14200  bored         systqmp
>  82545  248825      0      0  3     0x14200  bored         systq
>  12243  325588      0      0  3     0x14200  tmoslp        softclockmp
>  40138  464926      0      0  3  0x40014200  tmoslp        softclock
>  33528  490024      0      0  7  0x40014200                idle0
>      1  328639      0      0  3        0x82  wait          init
>      0       0     -1      0  3     0x10200  scheduler     swapper
> ddb{1}> show uvmexp
> Current UVM status:
>   pagesize=16384 (0x4000), pagemask=0x3fff, pageshift=14
>   31803 VM pages: 19986 active, 8177 inactive, 1 wired, 502 free (135 zero)
>   freemin=1060, free-target=1413, inactive-target=9269, wired-max=10601
>   faults=48501898, traps=-1467734103, intrs=62567691, ctxswitch=16729591 fpuswitch=0
>   softint=5250196, syscalls=6222497, kmapent=8
>   fault counts:
>     noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
>     relocks=40938(101), upgrades=0(0) anget(retries)=42423212(120), amapcopy=1294364
>     neighbor anon/obj pg=844620/1349833, gets(lock/unlock)=2631017/40922
>     cases: anon=41700070, anoncow=723139, obj=2372360, prcopy=258553, przero=3447773
>   daemon and swap counts:
>     woke=1499, revs=1286, scans=31678, obscans=8506, anscans=23172
>     busy=0, freed=17325, reactivate=0, deactivate=31921
>     pageouts=3595, pending=3594, nswget=124
>     nswapdev=1
>     swpages=65536, swpginuse=14309, swpgonly=8724 paging=0
>   kernel pointers:
>     objs(kern)=0xffffffff8172f4b8
> ddb{1}> show bcstats
> Current Buffer Cache status:
> numbufs 843 busymapped 0, delwri 0
> kvaslots 789 avail kva slots 789
> bufpages 842, dmapages 773, dirtypages 0
> pendingreads 0, pendingwrites -1654
> highflips 138747, highflops 0, dmaflips 215
> ddb{1}> 
> 
> 
> 
> -- 
> wbr, Kirill
> 
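
For scale, the page counts in the quoted "show uvmexp" output can be
converted to sizes with a couple of lines of C.  This is only unit
arithmetic on the numbers printed above (pagesize=16384, i.e. 16 KiB per
page); the helper names are made up for the example.

```c
#include <assert.h>

#define PAGE_KIB	16UL	/* pagesize=16384 bytes in the quoted output */

/* Convert a page count to KiB. */
static unsigned long
pages_to_kib(unsigned long pages)
{
	return pages * PAGE_KIB;
}

/* Non-zero when the free page count is under the freemin threshold. */
static int
below_freemin(unsigned long free_pages, unsigned long freemin)
{
	return free_pages < freemin;
}
```

With the values from the output, 502 free pages are 8032 KiB against a
freemin of 1060 pages (16960 KiB), so below_freemin(502, 1060) is true.
The swap device holds 65536 pages (1 GiB), of which 14309 pages
(228944 KiB) are in use.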

