On 7/13/2012 6:07 AM, Akinobu Mita wrote: > 2012/7/13 Andrew Morton <[email protected]>: >> On Thu, 12 Jul 2012 20:04:53 +0900 >> Akinobu Mita <[email protected]> wrote: >> >>> The function dup_task() may fail at the following function calls in >>> the following order. >>> >>> 0) alloc_task_struct_node() >>> 1) alloc_thread_info_node() >>> 2) arch_dup_task_struct() >>> >>> Error by 0) is not a matter, it can just return. But error by 1) >>> requires releasing task_struct allocated by 0) before it returns. >>> Likewise, error by 2) requires releasing task_struct and thread_info >>> allocated by 0) and 1). >>> >>> The existing error handling calls free_task_struct() and >>> free_thread_info() which do not only release task_struct and >>> thread_info, but also call architecture specific >>> arch_release_task_struct() and arch_release_thread_info(). >>> >>> The problem is that task_struct and thread_info are not fully >>> initialized yet at this point, but arch_release_task_struct() and >>> arch_release_thread_info() are called with them. >>> >>> For example, x86 defines its own arch_release_task_struct() that >>> releases a task_xstate. If alloc_thread_info_node() fails in >>> dup_task(), arch_release_task_struct() is called with task_struct >>> which is just allocated and filled with garbage in this error handling. >>> >>> This actually happened with tools/testing/fault-injection/failcmd.sh >>> >>> # env FAILCMD_TYPE=fail_page_alloc \ >>> ./tools/testing/fault-injection/failcmd.sh --times=100 \ >>> --min-order=0 --ignore-gfp-wait=0 \ >>> -- make -C tools/testing/selftests/ run_tests >>> >>> In order to fix this issue, make free_{task_struct,thread_info}() not >>> to call arch_release_{task_struct,thread_info}() and call >>> arch_release_{task_struct,thread_info}() implicitly where needed. >>> >>> Default arch_release_task_struct() and arch_release_thread_info() are >>> defined as empty by default. So this change only affects the >>> architectures which implement their own arch_release_task_struct() or >>> arch_release_thread_info() as listed below. >> This conflicts with Salman's fix (below) which is in linux-next via >> Ingo's tree. >> >> It appears that we should drop Salman's patch altogether and use yours? > Yes. Salman's patch fixes error handling for x86, sh, and mn10300. > But it doesn't fix for tile. If tile's arch_release_thread_info() is > called after setup_thread_stack(tsk, orig), it may release original > task's step_state. (tsk->step_state will be cleared by tile's > copy_thread() after dup_task_struct()).
Yes, I think Akinobu's patch is better. Thanks. -- Chris Metcalf, Tilera Corp. http://www.tilera.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

