> On 20 Nov 2015, at 09:48, Steven Chamberlain <ste...@pyro.eu.org> wrote:
>
> Hi,
>
> Fred wrote:
>> I've just updated to:
>> OpenBSD 5.8-current (GENERIC) #801: Wed Nov 18 16:37:51 MST 2015
>
> I'd used:
> OpenBSD 5.8-current (GENERIC) #799: Wed Nov 18 01:34:20 MST 2015
>
>> but I still had the following panic:
>>
>> panic: psycho0: uncorrectable DMA error AFAR 663f8450 (pa=0 tte=0/6218a012)
>> AFSR 410000ff40800000
>
> Damn! although this was stable for a couple of hours yesterday
> (and that's quite an improvement to how it was) - today I booted
> it up and it crashed on my very first SSH login attempt.
>
> Thank you for re-testing anyway. I notice your machine crashed in
> process sshd also. My backtrace looks slightly different to yours so I
> share it here:

my guess is that this has something to do with the handling of long mbuf
chains for transmit. i say that because ssh is very good at generating
these long chains, and that has caused problems on various other chips.
either that, or the chip has alignment or minimum transfer requirements
that such packets don't respect.

if you're inclined to play with the code, try making the code coalesce
(dc_coal) every packet and see if the problem still occurs.

dlg

>
> panic: psycho0: uncorrectable DMA error AFAR 6e868448 (pa=0 tte=0/69a12012)
> AFSR 4100ff0020800000
> Stopped at Debugger+0x8: nop
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *13074 13074 0 0x12 0 0 sshd
> psycho_ue(400008a3200, 0, 4000fb13bc8, 0, 0, 0) at psycho_ue+0x7c
> intr_handler(e0017ec8, 400008a3300, 58c86, 4000fb13cd8, 1, 0) at intr_handler+0xc
> sparc_interrupt(0, 4000fb13db0, 4000fb13df0, 0, 0, 14b) at sparc_interrupt+0x298
> syscall(4000fb13ed0, 404, 1cd56442e8, 1cd56442ec, 0, 0) at syscall+0x34c
> softtrap(3, 1c49412ee4, 54, 0, 0, 0) at softtrap+0x19c
> http://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
> ddb> trace
> psycho_ue(400008a3200, 0, 4000fb13bc8, 0, 0, 0) at psycho_ue+0x7c
> intr_handler(e0017ec8, 400008a3300, 58c86, 4000fb13cd8, 1, 0) at intr_handler+0xc
> sparc_interrupt(0, 4000fb13db0, 4000fb13df0, 0, 0, 14b) at sparc_interrupt+0x298
> syscall(4000fb13ed0, 404, 1cd56442e8, 1cd56442ec, 0, 0) at syscall+0x34c
> softtrap(3, 1c49412ee4, 54, 0, 0, 0) at softtrap+0x19c
> ddb> ps
> TID PPID PGRP UID S FLAGS WAIT COMMAND
> 4621 13074 4621 0 2 0x11 sshd
> *13074 5310 13074 0 7 0x12 sshd
> 19768 1 19768 0 3 0x83 ttyin getty
> 9712 1 9712 0 3 0x80 poll cron
> 30895 1 30895 99 3 0x90 poll sndiod
> 24430 1 24430 79 3 0x90 kqread tftpd
> 15649 18969 18969 95 3 0x90 kqread smtpd
> 9170 18969 18969 95 3 0x90 kqread smtpd
> 31583 18969 18969 95 3 0x90 kqread smtpd
> 46 18969 18969 95 3 0x90 kqread smtpd
> 32085 18969 18969 95 3 0x90 kqread smtpd
> 14545 18969 18969 103 3 0x90 kqread smtpd
> 18969 1 18969 0 3 0x80 kqread smtpd
> 17656 10194 10194 0 3 0x82 piperd cat
> 10194 17726 10194 0 3 0x8a pause ksh
> 24201 1 24201 77 3 0x90 poll dhcpd
> 17726 5310 17726 0 2 0x12 sshd
> 5310 1 5310 0 3 0x80 select sshd
> 11444 22135 26256 83 3 0x90 poll ntpd
> 22135 26256 26256 83 3 0x90 poll ntpd
> 26256 1 26256 0 3 0x80 poll ntpd
> 20693 19390 19390 74 3 0x90 bpf pflogd
> 19390 1 19390 0 3 0x80 netio pflogd
> 18490 15555 15555 73 3 0x90 kqread syslogd
> 15555 1 15555 0 3 0x80 netio syslogd
> 19396 0 0 0 2 0x14200 zerothread
> 27403 0 0 0 3 0x14200 aiodoned aiodoned
> 30791 0 0 0 3 0x14200 syncer update
> 5885 0 0 0 3 0x14200 cleaner cleaner
> 17162 0 0 0 3 0x14200 reaper reaper
> 6922 0 0 0 3 0x14200 pgdaemon pagedaemon
> 10422 0 0 0 3 0x14200 bored crypto
> 7995 0 0 0 3 0x14200 pftm pfpurge
> 28357 0 0 0 3 0x14200 usbtsk usbtask
> 16205 0 0 0 3 0x14200 usbatsk usbatsk
> 9316 0 0 0 3 0x14200 bored sensors
> 14821 0 0 0 3 0x14200 bored softnet
> 25203 0 0 0 3 0x14200 bored systqmp
> 20530 0 0 0 3 0x14200 bored systq
> 25274 0 0 0 3 0x40014200 idle0
> 22093 0 0 0 3 0x14200 kmalloc kmthread
> 1 0 1 0 3 0x82 wait init
> 0 -1 0 0 3 0x10200 scheduler swapper
>
> Out of curiosity I tried 'show pool' but that did not work so well:
>
> ddb> show pool
> POOLpanic: kernel data fault: pc=1344444 addr=83286000
> Stopped at Debugger+0x8: nop
> data_access_fault(e0017028, 30, 1344444, 83286000, 83287017, 801009) at data_access_fault+0x2ec
> trapbase_sun4v(80a0a00083287017, 80a0a00083287018, 1, e00172cc, 0, e00173c8) at trapbase_sun4v+0x8790
> kprintf(5, 0, 0, 0, 0, 0) at kprintf+0xda4
> db_printf(1693788, 80a0a00083287017, 82102000, 9c23bf30, 3, e00175f0) at db_printf+0x40
> pool_print1(15628e8, e00174d8, 11b3de0, e00174d8, 0, 1889498) at pool_print1+0x54
> db_command(180b6b8, 0, 0, 0, 15628e8, 180b000) at db_command+0x134
> db_command_loop(90, fffffffffffef691, e, 5, 11b3de0, 17b1500) at db_command_loop+0x10c
> db_trap(1834000, 0, 0, 0, 0, 5) at db_trap+0x234
> kdb_trap(101, e0017998, 1, 800, 0, e0017c20) at kdb_trap+0x168
> trap(e0017998, 101, 15628e4, 1d0006, 0, 10) at trap+0x2cc
> slowtrap(1, e0017cf0, 1833670, 180e000, 54, 34) at slowtrap+0x1d8
> panic(17aaa68, 400008a3224, 6e868448, e0017cf0, 1895000, 100) at panic+0xb8
> psycho_ue(400008a3200, 0, 4000fb13bc8, 0, 0, 0) at psycho_ue+0x7c
> intr_handler(e0017ec8, 400008a3300, 58c86, 4000fb13cd8, 1, 0) at intr_handler+0xc
>
> Regards,
> --
> Steven Chamberlain
> ste...@pyro.eu.org
>
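
for anyone unsure what "coalesce every packet" means in the suggestion about dc_coal: the idea is to copy a chain of small fragments into one contiguous buffer before handing it to the chip, so the chip only ever sees a single DMA segment per packet. the following is only a userland sketch of that idea, not the driver code. `struct seg`, `chain_len` and `chain_coalesce` are hypothetical names made up for this illustration and stand in for mbufs and the driver's own routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: one fragment of a packet. */
struct seg {
	struct seg *next;	/* next fragment in the chain, or NULL */
	const void *data;	/* fragment payload */
	size_t len;		/* payload length in bytes */
};

/* Sum the payload lengths of every fragment in the chain. */
static size_t
chain_len(const struct seg *s)
{
	size_t total = 0;

	for (; s != NULL; s = s->next)
		total += s->len;
	return total;
}

/*
 * Copy all fragments into one newly allocated contiguous buffer.
 * Returns the buffer (the caller frees it) and stores its length in
 * *lenp. Returns NULL if the chain is empty or allocation fails.
 */
static void *
chain_coalesce(const struct seg *s, size_t *lenp)
{
	size_t total = chain_len(s), off = 0;
	unsigned char *buf;

	if (total == 0 || (buf = malloc(total)) == NULL)
		return NULL;
	for (; s != NULL; s = s->next) {
		if (s->len == 0)
			continue;	/* skip empty fragments */
		memcpy(buf + off, s->data, s->len);
		off += s->len;
	}
	assert(off == total);	/* every byte was copied exactly once */
	*lenp = total;
	return buf;
}
```

in a real driver the copy costs cpu time on every transmit, which is why drivers normally coalesce only when a chain is too long to map. forcing it for every packet is purely a diagnostic: if the panics stop, the long chains are the likely trigger.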