On 2017/10/27 22:15, Peter Zijlstra wrote:
On Fri, Oct 27, 2017 at 02:33:48PM +0200, Borislav Petkov wrote:
On Fri, Oct 27, 2017 at 07:42:45PM +0800, zhouchengming wrote:
This is a real bug that happened on one of our machines; the call trace is below.
We can see that the trigger is at alternatives_text_reserved+0x20/0x80, which
encounters a deleted (poisoned) list_head.
Looks like some out-of-tree, old kernel thing. We don't have
mlx4_stats_sysfs_create() upstream and looking at the boot timestamps,
it could be that register_jprobe() is not ready yet.

Looking at the Code, though:

   20:   74 59                   je     0x7b
   22:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
   29:   00 00
   2b:*  48 3b 71 20             cmp    0x20(%rcx),%rsi         <-- trapping instruction
   2f:   72 3a                   jb     0x6b
   31:   48 3b 79 28             cmp    0x28(%rcx),%rdi
   35:   77 34                   ja     0x6b

%rcx is 0xdead0000000000d0 and that is POISON_POINTER_DELTA + 0xd0 so
that looks more like smp_alt_modules is not initialized yet, but I
could very well be wrong because this is an old kernel. So trigger that
with the upstream kernel without out-of-tree modules.
Not to mention that we're about to yank (or just have yanked) jprobes out of
the kernel entirely.

Well... but this is a bug in alternatives_text_reserved(): it traverses the
list without holding the smp_alt mutex. So all users of it, like kprobes,
will still have this problem. Maybe I could think of a way to get rid of
the mutex entirely.

Thanks.
