On Sat, May 03, 2025 at 05:34:17PM GMT, Philip Guenther wrote:
> On Sat, 3 May 2025, Marcus Glocker wrote:
> > On Sat, May 03, 2025 at 10:08:15PM GMT, Marcus Glocker wrote:
> >
> > > On Sat, May 03, 2025 at 09:53:11PM GMT, Marcus Glocker wrote:
> > >
> > > > On Sat, May 03, 2025 at 02:42:09PM GMT, George Koehler wrote:
> > > > > On Sat, 3 May 2025 08:02:29 +0200
> > > > > Marcus Glocker <[email protected]> wrote:
> ...
> > > > > I don't see a panic message. I guess that you entered ddb from the
> > > > > db_ktrap call at /sys/arch/amd64/amd64/trap.c:323 (below we_re_toast:
> > > > > in kerntrap), but I don't know the kind of trap. It might help to
> > > > > move the trap_print call above the db_ktrap call, then build a kernel
> > > > > (without your workaround patch) and reproduce.
> > > >
> > > > Yes. That's the output when I move trap_print(), and panic() (converted
> > > > to an printf) before "if (db_ktrap(type, frame->tf_err, frame))":
> > > >
> > > > trashcan# halt -p
> > > > syncing disks... done
> > > > fatal trace trap in supervisor mode
> > > > trap type 5 code 0 rip ffffffff8217727d cs 8 rflags 2 cr2
> > > > ffff80003c16fa38 cpl d rsp ffff80003c083430
> > >
> > > "rflags 2" means that the Trap Flag (TF) is set I guess.
>
> Hmm? The trap flag aka PSL_T is 0x100. 0x2 is a must-be-one bit.
>
> Yes, trap 5 is the debugging trap, but the way to see what caused
> it is to examine %dr6. Perhaps try this on top of your diff moving
> up the trap_print() (but without your ignore-T_TRCTRAP-in-kernel
> diff), to clear %dr6 during boot and show it if its a trace-trap:

Thanks for the diff! I've attached the complete diff that I applied for
testing. This is the result:

trashcan# halt -p
syncing disks... done
fatal trace trap in supervisor mode
trap type 5 code 0 rip ffffffff813cb7cd cs 8 rflags 6 cr2 ffff80003c088ec8 cpl
d rsp ffff80003c17aed0
gsbase 0xffffffff829cdff0 kgsbase 0x0
dr6 ffff0ff8
Stopped at x86_bus_space_io_write_4+0x1d: leave
ddb{0}>

If I interpret the DR6 value correctly according to the documentation,
bit 3, B3 (Breakpoint #3 Condition Detected), is set. Bit 11, BLD (Bus
Lock Detection), would be cleared if a bus lock had been detected, which
doesn't seem to be the case here since it reads as 1.
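
For reference, here is a minimal sketch of the decoding above as plain
userland C. The macro and function names are made up for illustration
and are not taken from the kernel headers. The bit positions are the
architectural ones as I read the documentation: B0-B3 in bits 0-3, BLD
in bit 11 (active low), BS in bit 14.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the architectural DR6 layout. */
#define DR6_B_MASK	0x0000000fULL	/* B0-B3: breakpoint condition detected */
#define DR6_BLD		(1ULL << 11)	/* bus lock detected, active low */
#define DR6_BS		(1ULL << 14)	/* single step (rflags TF) */

/* Return the B0-B3 bits; bit n set means breakpoint #n triggered. */
int
dr6_breakpoints(uint64_t dr6)
{
	return (int)(dr6 & DR6_B_MASK);
}

/*
 * Return 1 if a bus lock was detected.  The bit is inverted:
 * it reads as 1 when nothing was detected, 0 when it was.
 */
int
dr6_bus_lock(uint64_t dr6)
{
	return (dr6 & DR6_BLD) == 0;
}

/* Return 1 if the trap was caused by single-stepping. */
int
dr6_single_step(uint64_t dr6)
{
	return (dr6 & DR6_BS) != 0;
}
```

With the value from the output above, dr6_breakpoints(0xffff0ff8)
gives 0x8, i.e. only B3, while dr6_bus_lock() and dr6_single_step()
both return 0.
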
Index: sys/arch/amd64/amd64/locore0.S
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/locore0.S,v
diff -u -p -u -p -r1.26 locore0.S
--- sys/arch/amd64/amd64/locore0.S 4 Oct 2024 21:15:52 -0000 1.26
+++ sys/arch/amd64/amd64/locore0.S 4 May 2025 06:21:50 -0000
@@ -193,6 +193,8 @@ bi_size_ok:
pushl $PSL_MBO
popfl
+ movl $0x80,%eax
+ movl %eax,%dr6
xorl %eax,%eax
cpuid
movl %eax,RELOC(cpuid_level)
Index: sys/arch/amd64/amd64/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
diff -u -p -u -p -r1.106 trap.c
--- sys/arch/amd64/amd64/trap.c 4 Sep 2024 07:54:51 -0000 1.106
+++ sys/arch/amd64/amd64/trap.c 4 May 2025 06:21:50 -0000
@@ -320,6 +320,7 @@ kerntrap(struct trapframe *frame)
default:
we_re_toast:
#ifdef DDB
+ trap_print(frame, type);
if (db_ktrap(type, frame->tf_err, frame))
return;
#endif
@@ -465,6 +466,11 @@ trap_print(struct trapframe *frame, int
frame->tf_rflags, rcr2(), curcpu()->ci_ilevel, frame->tf_rsp);
printf("gsbase %p kgsbase %p\n",
(void *)rdmsr(MSR_GSBASE), (void *)rdmsr(MSR_KERNELGSBASE));
+ if (type == T_TRCTRAP) {
+ u_int64_t dr6;
+ __asm volatile("movq %%dr6,%0" : "=r" (dr6));
+ printf("dr6 %llx\n", dr6);
+ }
}