On Sun, May 04, 2025 at 09:09:37AM GMT, Marcus Glocker wrote:

> On Sun, May 04, 2025 at 08:28:31AM GMT, Marcus Glocker wrote:
> 
> > On Sat, May 03, 2025 at 05:34:17PM GMT, Philip Guenther wrote:
> > 
> > > On Sat, 3 May 2025, Marcus Glocker wrote:
> > > > On Sat, May 03, 2025 at 10:08:15PM GMT, Marcus Glocker wrote:
> > > > 
> > > > > On Sat, May 03, 2025 at 09:53:11PM GMT, Marcus Glocker wrote:
> > > > > 
> > > > > > On Sat, May 03, 2025 at 02:42:09PM GMT, George Koehler wrote:
> > > > > > > On Sat, 3 May 2025 08:02:29 +0200
> > > > > > > Marcus Glocker <[email protected]> wrote:
> > > ...
> > > > > > > I don't see a panic message.  I guess that you entered ddb from 
> > > > > > > the
> > > > > > > db_ktrap call at /sys/arch/amd64/amd64/trap.c:323 (below 
> > > > > > > we_re_toast:
> > > > > > > in kerntrap), but I don't know the kind of trap.  It might help to
> > > > > > > move the trap_print call above the db_ktrap call, then build a 
> > > > > > > kernel
> > > > > > > (without your workaround patch) and reproduce.
> > > > > > 
> > > > > > Yes.  That's the output when I move trap_print(), and panic() 
> > > > > > (converted
> > > > > > to an printf) before "if (db_ktrap(type, frame->tf_err, frame))":
> > > > > > 
> > > > > > trashcan# halt -p
> > > > > > syncing disks... done
> > > > > > fatal trace trap in supervisor mode
> > > > > > trap type 5 code 0 rip ffffffff8217727d cs 8 rflags 2 cr2 
> > > > > > ffff80003c16fa38 cpl d rsp ffff80003c083430
> > > > > 
> > > > > "rflags 2" means that the Trap Flag (TF) is set I guess.
> > > 
> > > Hmm?  The trap flag aka PSL_T is 0x100.  0x2 is a must-be-one bit.
> > > 
> > > Yes, trap 5 is the debugging trap, but the way to see what caused
> > > it is to examine %dr6.  Perhaps try this on top of your diff moving
> > > up the trap_print() (but without your ignore-T_TRCTRAP-in-kernel
> > > diff), to clear %dr6 during boot and show it if its a trace-trap:
> > 
> > Thanks for the diff!  Attached the complete diff which I've applied to
> > test.  And this is the result:
> > 
> > trashcan# halt -p
> > syncing disks... done
> > fatal trace trap in supervisor mode
> > trap type 5 code 0 rip ffffffff813cb7cd cs 8 rflags 6 cr2 ffff80003c088ec8 
> > cpl d rsp ffff80003c17aed0
> > gsbase 0xffffffff829cdff0  kgsbase 0x0
> > dr6 ffff0ff8
> > Stopped at  x86_bus_space_io_write_4+0x1d:  leave
> > ddb{0}>
> > 
> > When I try to interpret the DR6 register value correctly, according to
> > the documentation, it would mean that Bit 3 B3 (Breakpoint #3 Condition
> > Detected) was set.  Bit 11 BLD (Bus Lock Detection) would be cleared if
> > detected, which doesn't seem to be the case here since it's set to 1.
> 
> And that's the content of the DR7 register (Debug Control Register)
> during the trap:
> 
> dr7 200004c0
> 
> Which means that local and global breakpoints #3 are enabled (bit 6 and
> 7).  When I clear the DR7 register as well during boot, the trap doesn't
> happen anymore during "halt -p".

And the breakpoint #3 condition (bit 29:28) is set to 10b which means:

+----------------------------------+
| Value Break on                   |
+----------------------------------+
| 00b   Instruction execution only |
| 01b   Data writes only           |
| 10b   I/O reads and writes       | <-
|       (only defined if CR4.DE=1) |
| 11b   Data reads and writes      |
+----------------------------------+

But then it should happen for any read/write I/O, which we should have
plenty before the I/O write to port 0x80.  So I'm not sure why it only
happens at that point ...
 
> Index: sys/arch/amd64/amd64/locore0.S
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/locore0.S,v
> diff -u -p -u -p -r1.26 locore0.S
> --- sys/arch/amd64/amd64/locore0.S    4 Oct 2024 21:15:52 -0000       1.26
> +++ sys/arch/amd64/amd64/locore0.S    4 May 2025 07:07:33 -0000
> @@ -193,6 +193,10 @@ bi_size_ok:
>       pushl   $PSL_MBO
>       popfl
>  
> +     movl    $0x80,%eax
> +     movl    %eax,%dr6
> +     movl    $0x00,%eax
> +     movl    %eax,%dr7
>       xorl    %eax,%eax
>       cpuid
>       movl    %eax,RELOC(cpuid_level)
> Index: sys/arch/amd64/amd64/trap.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
> diff -u -p -u -p -r1.106 trap.c
> --- sys/arch/amd64/amd64/trap.c       4 Sep 2024 07:54:51 -0000       1.106
> +++ sys/arch/amd64/amd64/trap.c       4 May 2025 07:07:33 -0000
> @@ -320,6 +320,7 @@ kerntrap(struct trapframe *frame)
>       default:
>       we_re_toast:
>  #ifdef DDB
> +             trap_print(frame, type);
>               if (db_ktrap(type, frame->tf_err, frame))
>                       return;
>  #endif
> @@ -465,6 +466,14 @@ trap_print(struct trapframe *frame, int 
>           frame->tf_rflags, rcr2(), curcpu()->ci_ilevel, frame->tf_rsp);
>       printf("gsbase %p  kgsbase %p\n",
>           (void *)rdmsr(MSR_GSBASE), (void *)rdmsr(MSR_KERNELGSBASE));
> +     if (type == T_TRCTRAP) {
> +             u_int64_t dr6;
> +             u_int64_t dr7;
> +             __asm volatile("movq %%dr6,%0" : "=r" (dr6));
> +             __asm volatile("movq %%dr7,%0" : "=r" (dr7));
> +             printf("dr6 %llx\n", dr6);
> +             printf("dr7 %llx\n", dr7);
> +     }
>  }

Reply via email to