On 2017/10/28 16:43, Masami Hiramatsu wrote:
On Fri, 27 Oct 2017 21:30:24 +0800
zhouchengming<zhouchengmi...@huawei.com>  wrote:

On 2017/10/27 20:33, Borislav Petkov wrote:
On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
This is a real bug that happened on one of our machines; the call trace is below.
We can see the fault is triggered at alternatives_text_reserved+0x20/0x80, which
encountered a deleted (poisoned) list_head.
Looks like some out-of-tree, old kernel thing. We don't have
mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
it could be that register_jprobe() is not ready yet.
Yes, it's an out-of-tree module, loaded while the kernel boots. register_kprobe()
may not be ready yet, but that is obviously not what causes the bug.

Looking at the Code, though:

    20:   74 59                   je     0x7b
    22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
    29:   00 00
    2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi<-- trapping instruction
    2f:   72 3a                   jb     0x6b
    31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
    35:   77 34                   ja     0x6b

%rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
that looks more like smp_alt_modules is not initialized yet but I
could very well be wrong because this is an old kernel. So trigger that
with the upstream kernel without out-of-tree modules.
smp_alt_modules is defined with LIST_HEAD(), so it is initialized from the start.

A deleted list_head has next = LIST_POISON1 = 0xdead000000000000 + 0x100; container_of()
then subtracts 0x30 to get the struct smp_alt_module, which gives 0xdead0000000000d0.
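
The arithmetic above can be checked with a small userspace sketch. This is not
kernel code: the struct layout is an assumption inferred from the 0x20/0x28
loads in the disassembly and the -0x30 adjustment, the fields before text are
placeholders, and the _sketch names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86_64 values; POISON_POINTER_DELTA comes from CONFIG_ILLEGAL_POINTER_VALUE. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1         (POISON_POINTER_DELTA + 0x100)

/* Fixed-width stand-ins for 64-bit pointers so the offsets hold on any host. */
struct list_head_sketch {
    uint64_t next;
    uint64_t prev;
};

/* ASSUMED layout, inferred from the disassembly; not copied from the kernel. */
struct smp_alt_module_sketch {
    uint64_t before_text[4];      /* placeholder for the fields at 0x00..0x1f */
    uint64_t text;                /* offset 0x20: cmp 0x20(%rcx),%rsi */
    uint64_t text_end;            /* offset 0x28: cmp 0x28(%rcx),%rdi */
    struct list_head_sketch next; /* offset 0x30: the list linkage */
};

/* What container_of() yields when it is handed a poisoned ->next pointer. */
uint64_t entry_from_poisoned_next(void)
{
    assert(offsetof(struct smp_alt_module_sketch, next) == 0x30);
    return LIST_POISON1 - offsetof(struct smp_alt_module_sketch, next);
}

/* Address the trapping cmp would load from: entry + offsetof(text). */
uint64_t trapping_load_address(void)
{
    return entry_from_poisoned_next() +
           offsetof(struct smp_alt_module_sketch, text);
}
```

Under these assumptions entry_from_poisoned_next() evaluates to
0xdead0000000000d0, which matches the %rcx value in the oops, and the trapping
load would target 0xdead0000000000f0.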

Obviously, it's a deleted list_head, and I have explained clearly how it happens
in the patch comment.
Ah, I see. At a glance it looks like an alternatives_text_reserved() bug.
But simply taking the smp_alt mutex in alternatives_text_reserved() causes
an ABBA deadlock in the kprobes path.
So your solution is to replace smp_alt with text_mutex, since
alternatives_text_reserved() is an x86-specific function.
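
To make the ABBA concern concrete, here is a minimal userspace sketch that
models each code path as the order in which it takes locks and then looks for
an inversion. It only illustrates the ordering argument: the two paths are
simplified from the description in this thread, and every type and function
name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lock_id { TEXT_MUTEX, SMP_ALT, NR_LOCKS };

/* before[a][b] is true when some path acquires b while already holding a. */
struct lock_order {
    bool before[NR_LOCKS][NR_LOCKS];
};

/* Record every "held earlier -> taken later" pair of one code path. */
static void record_path(struct lock_order *o, const enum lock_id *path, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(path[i] < NR_LOCKS);
        for (size_t j = i + 1; j < n; j++)
            o->before[path[i]][path[j]] = true;
    }
}

/* ABBA: two locks that are each taken while the other one is held. */
static bool has_abba(const struct lock_order *o)
{
    for (int a = 0; a < NR_LOCKS; a++)
        for (int b = a + 1; b < NR_LOCKS; b++)
            if (o->before[a][b] && o->before[b][a])
                return true;
    return false;
}

/* Naive fix: alternatives_text_reserved() takes smp_alt under text_mutex. */
bool naive_fix_has_abba(void)
{
    struct lock_order o = { 0 };
    const enum lock_id kprobe_path[] = { TEXT_MUTEX, SMP_ALT };
    const enum lock_id alt_path[]    = { SMP_ALT, TEXT_MUTEX };

    record_path(&o, kprobe_path, 2);
    record_path(&o, alt_path, 2);
    return has_abba(&o);
}

/* Proposed fix: both paths serialize on text_mutex only. */
bool text_mutex_only_has_abba(void)
{
    struct lock_order o = { 0 };
    const enum lock_id kprobe_path[] = { TEXT_MUTEX };
    const enum lock_id alt_path[]    = { TEXT_MUTEX };

    record_path(&o, kprobe_path, 1);
    record_path(&o, alt_path, 1);
    return has_abba(&o);
}
```

In this model naive_fix_has_abba() reports an inversion and
text_mutex_only_has_abba() does not, which is the trade-off being discussed:
a single lock removes the ordering problem but then guards two resources.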

Hmm, let me see... I agree that will be a simple way to solve it, but
it also means we have 2 resources protected by text_mutex.

Yes, the smp_alt mutex must be held outside the text_mutex. This is the simpler
way to solve it, because we would need another x86-specific interface if we
wanted to hold the smp_alt mutex.

But like you said, it's not good to use one text_mutex to protect 2 resources...
I hope there is a better way.

Thanks.

Thank you,



