On 2021/01/28 12:21, Jonathan Gray wrote:
> > NMI ... going to debugger
> > Stopped at      tsc_delay+0x63: lfence
> > ddb{0}> trace
> > tsc_delay(1) at tsc_delay+0x63
> > r100_ring_test(ffff8000001a4000,ffff8000001a5858) at r100_ring_test+0x277
> > r100_cp_init(ffff8000001a4000,100000) at r100_cp_init+0x5a1
> > r100_startup(ffff8000001a4000) at r100_startup+0x535
> > r100_init(ffff8000001a4000) at r100_init+0x4ac
> > radeon_device_init(ffff8000001a4000,ffff800000196800,ffff800000196840,840001) at radeon_device_init+0x944
> > radeondrm_attachhook(ffff8000001a4000) at radeondrm_attachhook+0x36
> > config_process_deferred_mountroot() at config_process_deferred_mountroot+0x6b
> > main(0) at main+0x723
> > end trace frame: 0x0, count: -9
> 
> I don't understand why an lfence would cause an nmi.

I was thinking that it might not be the lfence triggering it, but
something that happened just before in connection with the video init,
and the tsc_delay/lfence is simply what was running when the NMI hit.

> Does it still occur with the below diff to change lfence;rdtsc to rdtscp?
> This requires RDTSCP which your machine has but bluhm's machine does not.
> 
> Perhaps it is related to some kind of watchdog timer?  Can you check if
> the ilo event log has any relevant information?
> 
> Index: sys/arch/amd64/include/cpufunc.h
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/include/cpufunc.h,v
> retrieving revision 1.36
> diff -u -p -r1.36 cpufunc.h
> --- sys/arch/amd64/include/cpufunc.h  13 Sep 2020 11:53:16 -0000      1.36
> +++ sys/arch/amd64/include/cpufunc.h  28 Jan 2021 00:47:16 -0000
> @@ -307,7 +307,8 @@ rdtsc_lfence(void)
>  {
>       uint32_t hi, lo;
>  
> -     __asm volatile("lfence; rdtsc" : "=d" (hi), "=a" (lo));
> +//   __asm volatile("lfence; rdtsc" : "=d" (hi), "=a" (lo));
> +     __asm volatile("rdtscp" : "=d" (hi), "=a" (lo) :: "ecx");
>       return (((uint64_t)hi << 32) | (uint64_t) lo);
>  }
>  
> 
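
For anyone wanting to check from userland whether a machine can run the
rdtscp variant at all, here is a rough sketch. It is not kernel code and
not part of the diff above; the function names (has_rdtscp, tsc_lfence,
tsc_rdtscp) are made up for illustration, and it assumes an x86 machine
with a GCC- or Clang-compatible compiler that ships <cpuid.h>. RDTSCP
support is reported in CPUID leaf 0x80000001, EDX bit 27.

```c
#include <stdint.h>
#include <cpuid.h>

/* Hypothetical helper: returns 1 if CPUID reports RDTSCP, else 0. */
static int
has_rdtscp(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the leaf is not supported. */
	if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
		return (0);
	return ((edx >> 27) & 1);	/* EDX bit 27 = RDTSCP */
}

/* Same instruction sequence as the existing rdtsc_lfence(). */
static inline uint64_t
tsc_lfence(void)
{
	uint32_t hi, lo;

	__asm volatile("lfence; rdtsc" : "=d" (hi), "=a" (lo));
	return (((uint64_t)hi << 32) | (uint64_t)lo);
}

/*
 * Variant from the test diff.  rdtscp also writes IA32_TSC_AUX to
 * %ecx, hence the clobber.  Only call this if has_rdtscp() is true,
 * otherwise the instruction raises an invalid-opcode exception.
 */
static inline uint64_t
tsc_rdtscp(void)
{
	uint32_t hi, lo;

	__asm volatile("rdtscp" : "=d" (hi), "=a" (lo) :: "ecx");
	return (((uint64_t)hi << 32) | (uint64_t)lo);
}
```

Calling has_rdtscp() first and only then tsc_rdtscp() avoids an illegal
instruction trap on CPUs like the one in bluhm's machine.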
