Hi,

not sure if this is already known; at least I failed to find any related
report: the way faults are handled on x86 during debugger memory access
breaks the preemption counter of the interrupted task. Issue a
"print *(int *)0" on a CONFIG_PREEMPT kernel and then continue execution,
and you will get endless "scheduling while atomic" warnings.

Reason:

Before kgdb touches any memory on the target, kgdb_fault_setjmp is invoked
and stores the caller state in kgdb_fault_jmp_regs. If a fault occurs later
on, kgdb_notify detects that this is a fixable fault (kgdb_may_fault) and
restores the previous state immediately. But because
atomic_notifier_call_chain wraps the invocation of the kgdb_notify handler
in rcu_read_lock/unlock, the premature fixup-return leaves no chance to
release this lock, i.e. to decrement the preemption counter again.

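The mechanism can be reproduced outside the kernel. The following is only a
user-space sketch under stated assumptions, not kernel code: preempt_count,
fault_jmp, call_chain and notify_longjmp are hypothetical stand-ins for the
preemption counter, kgdb_fault_jmp_regs, atomic_notifier_call_chain and
kgdb_notify. It shows how jumping out of a handler that is bracketed by
lock/unlock skips the unlock.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-in for the task's preemption counter. */
static int preempt_count;

/* Hypothetical stand-in for kgdb_fault_jmp_regs. */
static jmp_buf fault_jmp;

/* Stand-in for kgdb_notify(): leaves via longjmp instead of returning. */
static int notify_longjmp(void)
{
	longjmp(fault_jmp, 1);
	return 0; /* not reached */
}

/* Stand-in for atomic_notifier_call_chain(): brackets the handler. */
static int call_chain(int (*handler)(void))
{
	int ret;

	preempt_count++;	/* models rcu_read_lock() */
	ret = handler();
	preempt_count--;	/* models rcu_read_unlock(); skipped on longjmp */
	return ret;
}

/* Simulates one faulting debugger access and returns the counter value. */
static int faulting_access(void)
{
	assert(preempt_count >= 0);
	if (setjmp(fault_jmp) == 0)
		call_chain(notify_longjmp);	/* the "fault" is taken here */
	return preempt_count;
}
```

In this model, every call to faulting_access() leaves the counter one higher
than before, which corresponds to the task appearing permanently atomic to
the scheduler.
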
Looking at this issue from an outsider's perspective, I wondered why the
fault return context isn't patched instead of jumping back directly. That
looks cleaner and more robust to me. So I hacked up the proof-of-concept
below, and it actually solved my problem with CONFIG_PREEMPT without obvious
regressions (so far).

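The idea of patching the return context can be sketched in user space as
well. Again, this is only an illustration under assumptions: struct
fake_regs is a hypothetical miniature of pt_regs that models just the
instruction pointer and the return-value register (the real patch also
restores ebx, esi, edi, ebp and esp), and saved, call_chain and
notify_patch_regs are made-up stand-ins. The point is that the handler
returns normally, so the wrapper still gets to drop its lock.

```c
#include <assert.h>

/* Hypothetical miniature of struct pt_regs. */
struct fake_regs {
	long ip;	/* where the fault handler will resume execution */
	long ax;	/* return-value register */
};

/* Hypothetical stand-in for the task's preemption counter. */
static int preempt_count;

/* State recorded earlier by the setjmp-like call (assumed values). */
static struct fake_regs saved = { .ip = 0x1000, .ax = 0 };

/* Stand-in for kgdb_notify(): patches the fault frame, then returns. */
static int notify_patch_regs(struct fake_regs *regs)
{
	regs->ip = saved.ip;	/* resume right after the setjmp-like call */
	regs->ax = 1;		/* make that call appear to return 1 */
	return 1;		/* models NOTIFY_STOP */
}

/* Stand-in for atomic_notifier_call_chain(): brackets the handler. */
static int call_chain(int (*handler)(struct fake_regs *),
		      struct fake_regs *regs)
{
	int ret;

	preempt_count++;	/* models rcu_read_lock() */
	ret = handler(regs);
	preempt_count--;	/* models rcu_read_unlock(); always reached */
	return ret;
}

/* Simulates one faulting debugger access and returns the counter value. */
static int faulting_access(struct fake_regs *regs)
{
	call_chain(notify_patch_regs, regs);
	return preempt_count;
}
```

Here the counter is back at its old value once the chain has returned, and
the redirection only takes effect when the (modelled) fault handler resumes
from the patched frame.
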
I saw that e.g. MIPS jumps back from fixup_exception. While this seems to be
immune to the issue I found, it still does not conform to the way fixable
faults are normally handled in the kernel and may break on future changes as
well.

So, if there are no hidden pitfalls, I would suggest refactoring this part
for all archs. At least for x86 I could offer to work out a patch (as time
permits).

Jan

--- linux-2.6.17.13.orig/arch/i386/kernel/kgdb.c
+++ linux-2.6.17.13/arch/i386/kernel/kgdb.c
@@ -311,7 +311,14 @@ static int kgdb_notify(struct notifier_b
 	/* Bad memory access? */
 	if (cmd == DIE_PAGE_FAULT_NO_CONTEXT && atomic_read(&debugger_active)
 		&& kgdb_may_fault) {
-		kgdb_fault_longjmp(kgdb_fault_jmp_regs);
+		//kgdb_fault_longjmp(kgdb_fault_jmp_regs);
+		regs->ebx = kgdb_fault_jmp_regs[0];
+		regs->esi = kgdb_fault_jmp_regs[1];
+		regs->edi = kgdb_fault_jmp_regs[2];
+		regs->ebp = kgdb_fault_jmp_regs[3];
+		regs->esp = kgdb_fault_jmp_regs[4];
+		regs->eip = kgdb_fault_jmp_regs[5];
+		regs->eax = 1;
 		return NOTIFY_STOP;
 	} else if (cmd == DIE_PAGE_FAULT)
 		/* A normal page fault, ignore. */
