On Wed, Jan 27, 2021 at 04:16:50PM +0000, Stuart Henderson wrote:
> On 2021/01/27 09:03, Bryan Steele wrote:
> > On Wed, Jan 27, 2021 at 07:11:49AM +0100, alf wrote:
> > > Hello,
> > > 
> > > while trying to upgrade one of our machines to 6.8 we experienced a
> > > repeatable crash while booting (bsd.rd + install went fine).
> > > 
> > > The machine in question is a:
> > > ...
> > > hw.vendor=HP
> > > hw.product=ProLiant DL360 G7
> > > hw.serialno=CZ3451KJW6
> > > hw.uuid=36333337-3738-435a-3334-35314b4a5736
> > > hw.physmem=8562860032
> > > hw.usermem=8562847744
> > > hw.ncpufound=12
> > > hw.allowpowerdown=1
> > > hw.perfpolicy=manual
> > > hw.smt=0
> > > hw.ncpuonline=6
> > > ...
> > > 
> > > Since this is a production machine we downgraded to 6.7 (upgrade from
> > > 6.6 which it was running before went flawlessly).
> > > 
> > > Find below the dmesg of the 6.8 kernel, 6.8-current and finally the
> > > 6.7 kernel. For the 6.8* I also provided 'trace' and 'show registers'
> > > output.
> > > 
> > > I hope this is enough info to get an idea of what was going on.
> > > I'll happily provide additional info if needed.
> > > 
> > > Alf
> > > 
> > > [SNIP]
> > > ... 
> > > radeondrm0: RV100
> > > NMI ... going to debugger
> > 
> > The machine did not panic; instead it received an NMI (non-maskable
> > interrupt), which could be a sign of hardware failure.
> 
> radeondrm was updated in the 6.7 -> 6.8 window too. I wonder if it still
> occurs with radeondrm disabled. alf, if you have time to test, you can
> at least see if a kernel boots without updating the rest of the OS; fetch
> a 6.8 bsd.mp on the existing system (e.g. as /bsd.mp.68), reboot, at
> the boot loader prompt "boot -c bsd.mp.68", "disable radeondrm", "quit".

Since this is a production machine, I'll need to find a time window
for this test. I did actually think about disabling radeondrm, but since
I haven't used config(8) in years I misremembered the flag and typed -s
(single-user mode) instead of -c, which of course didn't help :)

Alf

> 
> > -Bryan.
> > 
> > > Stopped at      tsc_delay+0x66: rdtsc
> > > ddb{0}> trace
> > > tsc_delay(1) at tsc_delay+0x66
> > > r100_ring_test(ffff8000001a5000,ffff8000001a6938) at r100_ring_test+0x228
> > > r100_cp_init(ffff8000001a5000,100000) at r100_cp_init+0x499
> > > r100_startup(ffff8000001a5000) at r100_startup+0x457
> > > r100_init(ffff8000001a5000) at r100_init+0x3f8
> > > radeon_device_init(ffff8000001a5000,ffff800000198800,ffff800000198850,840001) at radeon_device_init+0x963
> > > radeondrm_attachhook(ffff8000001a5000) at radeondrm_attachhook+0x36
> > > config_process_deferred_mountroot() at config_process_deferred_mountroot+0x6b
> > > main(0) at main+0x733
> > > end trace frame: 0x0, count: -9
> > > ddb{0}> show registers
> > > rdi                              0x1
> > > rsi                     0x45d5418924
> > > rbp               0xffffffff82520cd0    end+0x120cd0
> > > rbx                       0xc8000400
> > > rdx                     0x4500000000
> > > rcx                            0xa6a
> > > rax                            0x1fe
> > > r8                               0x5
> > > r9                    0x7f7fffffc000
> > > r10               0xda85623203c11f01
> > > r11               0xd6f674e06cf62a5b
> > > r12                       0xcafedead
> > > r13               0xffff8000001a54c0
> > > r14               0xffff8000001a5000
> > > r15                              0x1
> > > rip               0xffffffff81131ec6    tsc_delay+0x66
> > > cs                               0x8
> > > rflags                         0x283
> > > rsp               0xffffffff82520cc0    end+0x120cc0
> > > ss                              0x10
> > > tsc_delay+0x66: rdtsc
> > > ddb{0}> boot reboot
> > > rebooting...
> > 
> 
> 
