* Ingo Molnar <[email protected]> wrote:

> Subject: [PATCH] atomic: Fix bugs in 'fetch_or()' and rename it to 'xchg_or()'
> 
> Linus noticed a couple of problems with the fetch_or() implementation 
> introduced 
> by 5fd7a09cfb8c ("atomic: Export fetch_or()"):
> 
>  - Sloppy macro implementation: 'mask' and 'ptr' is evaluated multiple times, 
>    which will break if arguments have side effects.

So this shiny new patch manages to crash the x86 kernel with a NULL pointer 
dereference:

  [    0.143027] BUG: unable to handle kernel NULL pointer dereference at       
    (null)
  [    0.144000] IP: [<ffffffff8107c64c>] resched_curr+0x3c/0xc0

GCC manages to turn this:

static bool set_nr_and_not_polling(struct task_struct *p)
{
        struct thread_info *ti = task_thread_info(p);
        return !(xchg_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
}

and this:

>  /**
> + * xchg_or - perform *ptr |= mask atomically and return old value of *ptr
> + * @ptr: pointer to value (cmpxchg() compatible integer pointer type)
>   * @mask: mask to OR on the value
>   *
> + * cmpxchg() based, it's a macro so it works for different integer types.
>   */
> +#ifndef xchg_or
> +# define xchg_or(ptr, mask)                                          \
> +({                                                                   \
> +     typeof(ptr)  __ptr  = (ptr);                                    \
> +     typeof(mask) __mask = (mask);                                   \
> +                                                                     \
> +     typeof(*(__ptr)) __old, __val = *__ptr;                         \
> +                                                                     \
>       for (;;) {                                                      \
> +             __old = cmpxchg(__ptr, __val, __val | __mask);          \
>               if (__old == __val)                                     \
>                       break;                                          \
>               __val = __old;                                          \
>       }                                                               \
> +                                                                     \
>       __old;                                                          \
>  })

into:

    41c1:       89 c2                   mov    %eax,%edx
    41c3:       89 d6                   mov    %edx,%esi
    41c5:       31 c9                   xor    %ecx,%ecx
    41c7:       89 d0                   mov    %edx,%eax
    41c9:       83 ce 08                or     $0x8,%esi
    41cc:       f0 0f b1 31             lock cmpxchg %esi,(%rcx)

note the RCX zeroing via XOR...

The original, working sequence is:

    41c4:       89 c2                   mov    %eax,%edx
    41c6:       89 d6                   mov    %edx,%esi
    41c8:       89 d0                   mov    %edx,%eax
    41ca:       83 ce 08                or     $0x8,%esi
    41cd:       f0 0f b1 31             lock cmpxchg %esi,(%rcx)

The change that makes the difference is the 'ptr' part of:

> +             __old = cmpxchg(__ptr, __val, __val | __mask);          \

This variant works:

> +             __old = cmpxchg((ptr), __val, __val | __mask);          \

After a lot of staring PeterZ realized that __ptr aliases with the x86 
cmpxchg() 
macro-jungle's __ptr name!!

So if I do a s/__ptr/_ptr it all works...

But IMHO this really highlights a fundamental weakness of all this macro magic, 
it's all way too fragile.

Why don't we introduce a boring family of APIs:

        cmpxchg_8()
        cmpxchg_16()
        cmpxchg_32()
        cmpxchg_64()

        xchg_or_32()
        xchg_or_64()
        ...

... with none of this pesky auto-typing property and none of the 
macro-inside-a-macro crap? We could do clean types and would write them all in 
proper C, not fragile CPP.

It's not like we migrate between the types all that frequently - and even if we 
do, it's trivial.

hm?

Thanks,

        Ingo

Reply via email to