On Sun, Mar 24, 2019 at 10:17:49PM +0100, Rasmus Villemoes wrote:
> gcc already knows the semantics of these functions and can optimize
> accordingly. E.g. for strcpy() of a literal to a buffer, gcc readily
> compiles

The example you gave appears to get optimized accordingly, but there are
numerous sane counterexamples that don't get optimized.

On arm64, with GCC 8.3.0, let's look at the simple strcmp usage in
kernel/trace/preemptirq_delay_test.c:

static int preemptirq_delay_run(void *data)
{
        unsigned long flags;

        if (!strcmp(test_mode, "irq")) {
                local_irq_save(flags);
                busy_wait(delay);
                local_irq_restore(flags);
        } else if (!strcmp(test_mode, "preempt")) {
                preempt_disable();
                busy_wait(delay);
                preempt_enable();
        }

        return 0;
}

This is what happens without my patch:

preemptirq_delay_run:
.LFB2517:
        .cfi_startproc
        stp     x29, x30, [sp, -32]!
        .cfi_def_cfa_offset 32
        .cfi_offset 29, -32
        .cfi_offset 30, -24
        adrp    x1, .LC0
        add     x1, x1, :lo12:.LC0
        mov     x29, sp
        stp     x19, x20, [sp, 16]
        .cfi_offset 19, -16
        .cfi_offset 20, -8
        adrp    x19, .LANCHOR0
        add     x19, x19, :lo12:.LANCHOR0
        mov     x0, x19
>       bl      strcmp
        cbz     w0, .L22
        adrp    x1, .LC1
        mov     x0, x19
        add     x1, x1, :lo12:.LC1
>       bl      strcmp
        cbz     w0, .L23

The calls to strcmp are not optimized out. Here is what this snippet looks like
after my patch:

preemptirq_delay_run:
.LFB2517:
        .cfi_startproc
        stp     x29, x30, [sp, -32]!
        .cfi_def_cfa_offset 32
        .cfi_offset 29, -32
        .cfi_offset 30, -24
        adrp    x0, .LANCHOR0
        mov     w1, 29289
        mov     x29, sp
        ldr     w2, [x0, #:lo12:.LANCHOR0]
        movk    w1, 0x71, lsl 16
        add     x3, x0, :lo12:.LANCHOR0
        cmp     w2, w1
        beq     .L23
        ldr     x1, [x0, #:lo12:.LANCHOR0]
        mov     x0, 29296
        movk    x0, 0x6565, lsl 16
        movk    x0, 0x706d, lsl 32
        movk    x0, 0x74, lsl 48
        cmp     x1, x0
        beq     .L24

The calls to strcmp were optimized out completely, not even replaced by a call
to memcmp.

I can produce lots of kernel examples that don't seem to follow basic str*
optimization, which is what prompted me to write this in the first place.

> Does this even compile? It's a well-known (or perhaps
> not-so-well-known?) pitfall that __builtin_constant_p() is not
> guaranteed to be usable in __builtin_choose_expr() - the latter only
> accepts bona fide integer constant expressions, while evaluation of
> __builtin_constant_p can be delayed until various optimization phases.

It not only compiles, but also seems to work correctly from what I've observed.

> This seems to be buggy - you don't know that src is at least as long as
> dest. And arguing "if it's shorter, there's a nul byte, which will
> differ from dest at that point, so memcmp will/should stop" means that
> the whole word-at-a-time argument would be out.

You mean reading the last word of a string could read out of bounds of the
string when using memcmp? There are lots of memcmp instances using a literal
string in the kernel; I don't think it would be hard to find one that violates
what you've pointed out, so I'm not really sure why it's a problem.

After a few minutes of grepping, it seems like the memcmp usage in
drivers/md/dm-integrity.c can read out of bounds on its arguments:
} else if (!memcmp(opt_string, "internal_hash:", strlen("internal_hash:"))) {
        r = get_alg_and_key(opt_string, &ic->internal_hash_alg, &ti->error,
                            "Invalid internal_hash argument");
        if (r)
                goto bad;
} else if (!memcmp(opt_string, "journal_crypt:", strlen("journal_crypt:"))) {
        r = get_alg_and_key(opt_string, &ic->journal_crypt_alg, &ti->error,
                            "Invalid journal_crypt argument");
        if (r)
                goto bad;
} else if (!memcmp(opt_string, "journal_mac:", strlen("journal_mac:"))) {

Where opt_string is a dynamically-set argument with no specified minimum length.

Sultan

Reply via email to