> Date: Wed, 14 Jul 2021 15:50:46 +1000
> From: Jonathan Gray <j...@jsg.id.au>
>
> On Tue, Jul 13, 2021 at 10:11:29PM +0100, Tom Murphy wrote:
> > Hi Jonathan,
> >
> > On Tue, Jul 13, 2021 at 01:13:03PM +1000, Jonathan Gray wrote:
> > > On Mon, Jul 12, 2021 at 06:22:36PM +0000, Tom Murphy wrote:
> > > > I had firefox open (various tabs/windows) and was playing a 3D game
> > > > (games/quakespasm) and after a random amount of time I got a hard
> > > > lock up, but the second time it happened I was able to get into a
> > > > ddb prompt. I've added the panic message and trace and dmesg.
> > > >
> > > > I don't have a serial console on this laptop so had to transcribe
> > > > this by hand from a photo I took on my phone. (Is there an easier
> > > > way to save these?)
> > >
> > > It is possible to get a trace out of a crash dump, see crash(8).
> > > But yes serial or amt sol is easier.
> >
> > Thanks! I'll have a closer look at crash(8).
> >
> > > > panic: kernel diagnostic assertion "to_ticks >= 0" failed: file "/usr/src/sys/kern/kern_timeout.c", line 299
> > > > Stopped at      db_enter+0x10:  popq    %rbp
> > > >     TID    PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
> > > > *395451  18070    0  0x14000   0x200   0K  drmtskl
> > > >   61931  46160    0  0x14000   0x200    2  drmwq
> > > >  185485  53694    0  0x14000   0x200    1  drmwq
> > > >  284820  77991    0  0x14000   0x200    3  drmwq
> > > > db_enter() at db_enter+0x10
> > > > panic(ffffffff81e5f243) at panic+0xbf
> > > > __assert(ffffffff81ec940a,ffffffff81eb2a3e,12b,ffffffff81ec1795) at __assert+0x2b
> > > > timeout_add(ffff800000bf0410,ffffffff) at timeout_add+0x1cc
> > > > process_csb(ffff800000bf0000) at process_csb+0x38b
> > > > execlists_submission_tasklet(ffff800000bf0000) at execlists_submission_tasklet+0x48
> > > > tasklet_run(ffff800000bf03c0) at tasklet_run+0x44
> > > > taskq_thread(ffff800000220f00) at taskq_thread+0x81
> > > > end trace frame: 0x0, count: 7
> > >
> > > The timeout_add() call comes from i915_utils.c
> > > set_timer_ms()
> > > 	mod_timer(t, jiffies + timeout ?: 1);
> > >
> > > can you try this patch?
> > >
> > > Index: sys/dev/pci/drm/include/linux/timer.h
> > > ===================================================================
> > > RCS file: /cvs/src/sys/dev/pci/drm/include/linux/timer.h,v
> > > retrieving revision 1.6
> > > diff -u -p -r1.6 timer.h
> > > --- sys/dev/pci/drm/include/linux/timer.h	7 Jul 2021 02:38:36 -0000	1.6
> > > +++ sys/dev/pci/drm/include/linux/timer.h	13 Jul 2021 02:54:26 -0000
> > > @@ -24,10 +24,20 @@
> > >  #include <sys/kernel.h>
> > >  #include <linux/ktime.h>
> > >  
> > > -#define mod_timer(x, y)		timeout_add((x), ((y) - jiffies))
> > >  #define del_timer_sync(x)	timeout_del_barrier((x))
> > >  #define del_timer(x)		timeout_del((x))
> > >  #define timer_pending(x)	timeout_pending((x))
> > > +
> > > +static inline int
> > > +mod_timer(struct timeout *to, unsigned long j)
> > > +{
> > > +	int ticks = j - jiffies;
> > > +	if (ticks <= 0) {
> > > +		timeout_del(to);
> > > +		return timeout_add(to, 1);
> > > +	}
> > > +	return timeout_add(to, ticks);
> > > +}
> > >  
> > >  static inline unsigned long
> > >  round_jiffies_up(unsigned long j)
> >
> > This patch seems to work for me. I did some pretty rigorous testing
> > with it and attempted to recreate the same conditions that made it
> > crash however I wasn't able to get the kernel panic so that is a good
> > sign!
>
> Thanks, I committed a slightly different version which does the test
> without a type conversion.
>
> Index: sys/dev/pci/drm/include/linux/timer.h
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/include/linux/timer.h,v
> retrieving revision 1.6
> diff -u -p -r1.6 timer.h
> --- sys/dev/pci/drm/include/linux/timer.h	7 Jul 2021 02:38:36 -0000	1.6
> +++ sys/dev/pci/drm/include/linux/timer.h	14 Jul 2021 04:49:03 -0000
> @@ -24,10 +24,19 @@
>  #include <sys/kernel.h>
>  #include <linux/ktime.h>
>  
> -#define mod_timer(x, y)		timeout_add((x), ((y) - jiffies))
>  #define del_timer_sync(x)	timeout_del_barrier((x))
>  #define del_timer(x)		timeout_del((x))
>  #define timer_pending(x)	timeout_pending((x))
> +
> +static inline int
> +mod_timer(struct timeout *to, unsigned long j)
> +{
> +	if (j <= jiffies) {
> +		timeout_del(to);

Any reason why you do a timeout_del() here?

> +		return timeout_add(to, 1);
> +	}
> +	return timeout_add(to, j - jiffies);
> +}
>  
>  static inline unsigned long
>  round_jiffies_up(unsigned long j)
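
For anyone following along, the difference between the old macro and the committed version can be sketched in userland. This is only an illustration under stated assumptions, not the kernel code: fake_jiffies, ticks_old() and ticks_new() are made-up names, and the result of converting an out-of-range unsigned long to int is implementation-defined (on the usual two's-complement targets it comes out as -1, which would match the ffffffff argument to timeout_add() in the trace).

```c
/*
 * Userland sketch of the tick computation only. The names below are
 * hypothetical stand-ins, not kernel symbols.
 */

/* Stand-in for the kernel's jiffies counter. */
static unsigned long fake_jiffies = 1000;

/*
 * Roughly what the old macro handed to timeout_add(): the raw unsigned
 * difference, narrowed to int. If j is already in the past the
 * subtraction wraps and the narrowed value is negative on common
 * platforms (implementation-defined conversion).
 */
static int
ticks_old(unsigned long j)
{
	return (int)(j - fake_jiffies);
}

/*
 * The committed approach: compare the unsigned values first, so no
 * signed conversion is needed to detect "now or in the past", and
 * fall back to a single tick in that case.
 */
static int
ticks_new(unsigned long j)
{
	if (j <= fake_jiffies)
		return 1;
	return (int)(j - fake_jiffies);
}
```

With fake_jiffies at 1000, ticks_old(999) goes negative while ticks_new(999) returns 1; both agree for a target in the future such as 1005.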