Fault Injection issues: stacktrace x86_64 and failslab NUMA

Scott Porter Fri, 20 Apr 2007 12:25:01 -0700

I'm attempting to use Fault Injection stacktrace filtering on an x86_64
platform (see config details below) and finding problems:


(1) Apparently stacktrace on x86_64 isn't always reliable but the fault
injection code path to save a stack trace looks *completely* unreliable

(2) CONFIG_NUMA and failslab don't mix -- the NUMA code successfully
allocates after the fault injection code thinks it has failed!

Details:

(1) When the fault injection code is saving a stack trace to compare
require-start/require-end addresses, it employs the following code in
arch/x86_64/kernel/traps.c where the task is the currently running
thread:

        if (!stack) {
                unsigned long dummy;
                stack = &dummy;
                if (tsk && tsk != current)
                        stack = (unsigned long *)tsk->thread.rsp;
        }

Can anyone explain how this code could possibly get a valid stack
pointer using "&dummy" on an automatic variable defined in the middle of
a routine that has arguments and other auto variables?  When I look at
the stack buffer that is saved it contains only entries for
"should_fail()" and so fail_stacktrace() NEVER matches on the start/end
addresses.

Prior to this patch:

  2006-09-26 Andi Kleen [PATCH] Merge stacktrace and show_trace

the save_stack_trace() routine would get the current thread stack
pointer this way:

-       if (!task || task == current) {
-               /* Grab rbp right from our regs: */
-               asm ("mov %%rbp, %0" : "=r" (stack));

Any pointers you can give to help me understand x86_64 stack frames
and/or how this code should work is appreciated!

I'd like to take advantage of the Fault Injection capabilities to debug
error paths in a driver but without reliable stacktrace it looks like
I'm left to invent my own kmalloc() error injection methods.  Anybody
know how to interpose (a la LD_PRELOAD) on exported kernel interfaces
like kmalloc() for a loadable kernel module?!  I'm thinking that Fault
Injection for failslab and fail-page-alloc provided as a separate
loadable module would be more appealing than having to add debug code
into the kernel anyways?

(2) The fault injection code in mm/slab.c within ____cache_alloc()
returns NULL when should_failslab() returns true.  HOWEVER, later code
in __cache_alloc() compiled in with NUMA_BUILD (via CONFIG_NUMA)
successfully allocates memory from another node and prevents the error
injection from really failing:

    if (!objp)
            objp = ____cache_alloc(cachep, flags);
    /*
     * We may just have run out of memory on the local node.
     * ____cache_alloc_node() knows how to locate memory on other nodes
     */
    if (NUMA_BUILD && !objp)
            objp = ____cache_alloc_node(cachep, flags, numa_node_id());

Not sure what the best way is to deal with this (other than turning off
CONFIG_NUMA!).

Config details:

2.6.20.1 x86_64
CONFIG_DEBUG_KERNEL=y
CONFIG_STACKTRACE=y
CONFIG_FRAME_POINTER=y
CONFIG_FAULT_INJECTION=y
CONFIG_FAILSLAB=y
CONFIG_FAIL_PAGE_ALLOC=y
# CONFIG_FAIL_MAKE_REQUEST is not set
CONFIG_FAULT_INJECTION_DEBUG_FS=y
CONFIG_NUMA=y



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Fault Injection issues: stacktrace x86_64 and failslab NUMA

Reply via email to