On Thu, May 12, 2016 at 04:12:21PM +0200, Martin Liška wrote: > --- a/gcc/asan.c > +++ b/gcc/asan.c > @@ -243,6 +243,11 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value; > static bool asan_shadow_offset_computed; > static vec<char *> sanitized_sections; > > +/* Set of variable declarations that are going to be guarded by > + use-after-scope sanitizer. */ > + > +static hash_set<tree> asan_handled_variables (13);
Doesn't this introduce yet another global ctor? If yes (and we unfortunately have already way too many), I'd strongly prefer to avoid that, use pointer to hash_set<tree> or something similar. > +/* Depending on POISON flag, emit a call to poison (or unpoison) stack memory > + allocated for local variables, localted in OFFSETS. LENGTH is number > + of OFFSETS, BASE is the register holding the stack base, > + against which OFFSETS array offsets are relative to. BASE_OFFSET > represents > + an offset requested by alignment and similar stuff. */ > + > +static void > +asan_poison_stack_variables (rtx shadow_base, rtx base, > + HOST_WIDE_INT base_offset, > + HOST_WIDE_INT *offsets, int length, > + tree *decls, bool poison) > +{ > + if (asan_sanitize_use_after_scope ()) > + for (int l = length - 2; l > 0; l -= 2) > + { I think this is unfortunate, it leads to: movl $-235802127, 2147450880(%rax) movl $-185335552, 2147450884(%rax) movl $-202116109, 2147450888(%rax) movb $-8, 2147450884(%rax) movb $-8, 2147450885(%rax) (e.g. on use-after-scope-1.c). The asan_emit_stack_protection function already walks all the entries in the offsets array in both of the for (l = length; l; l -= 2) loops, so please handle the initial poisoning and final unpoisoning there as well. The goal is that for variables that you want poison-after-scope at the start of the function (btw, I've noticed that current SVN LLVM doesn't bother with it and thus doesn't track "use before scope" (before the scope is entered for the first time, maybe we shouldn't either, that would catch only compiler bugs rather than user code bugs, right?)) have 0xf8 on all corresponding bytes including the one that would otherwise have 0x01 through 0x07. When unpoisoning at the end of the function, again you should combine that with unpoisoning of the red zone and partial zone bytes plus the last 0x01 through 0x07, etc. Plus, as I've mentioned before, it would be nice to optimize - for ASAN_MARK unpoison appearing strictly before (i.e. dominating) the first (non-shadow) memory read or write in the function (explicit or possible through function calls etc.) you really don't need to unpoison (depending on whether we follow LLVM as mentioned above then it can be removed without anything, or the decl needs to be somehow marked and tell asan_emit_stack_protection it shouldn't poison it at the start), and for ASAN_MARK poisoning appearing after the last load/store in the function (post dominating those, you don't care about noreturn though) you can combine that (remove the ASAN_MARK) with letting asan_emit_stack_protection know it doesn't need to unpoison. > + char c = poison ? ASAN_STACK_MAGIC_USE_AFTER_SCOPE : 0; > + for (unsigned i = 0; i < shadow_size; ++i) > + { > + emit_move_insn (var_mem, gen_int_mode (c, QImode)); > + var_mem = adjust_address (var_mem, QImode, 1); When you combine it with the loop, you can also use the infrastructure to handle it 4 bytes at a time. Another thing I've noticed is that the inline expansion of __asan_unpoison_stack_memory you emit looks buggy. In use-after-scope-1.c I see: _9 = (unsigned long) &my_char; _10 = _9 >> 3; _11 = _10 + 2147450880; _12 = (signed char *) _11; MEM[(short int *)_12] = 0; That would be fine only for 16 byte long my_char, but we have instead 9 byte one. So I believe in that case we need to store 0x00, 0x01 bytes, for little endian thus 0x0100. You could use for it a function similarly to asan_shadow_cst, just build INTEGER_CST rather than CONST_INT out of it. In general, poisioning is storing 0xf8 to all affected shadow bytes, unpoisioning should restore the state what we would emit without use-after-scope sanitization, which is all but the last byte 0, and the last byte 0 only if the var size is a multiple of 8, otherwise number of valid bytes (1-7). As for the option, it seems clang uses now -fsanitize-address-use-after-scope option, while I don't like that much, if they have already released some version with that option or if they are unwilling to change, I'd go with their option. > + if (flag_stack_reuse != SR_NONE > + && flag_openacc > + && oacc_declare_returns != NULL) This actually looks like preexisting OpenACC bug, I doubt the OpenACC behavior should depend on -fstack-reuse= setting. + bool unpoison_var = asan_poisoned_variables.contains (t); + if (asan_sanitize_use_after_scope () + && unpoison_var) + asan_poisoned_variables.remove (t); Similarly to asan_handled_variables, I'd prefer it to be a pointer to hash_set or something similar, so that it costs as few as possible for the general case (no sanitization). Similarly, querying the hash_set even for no use-after-scope sanitization looks wrong. + if ((asan_sanitize_stack_p () || asan_sanitize_use_after_scope ()) I would say if asan_sanitize_stack_p () is false, then we should not be doing use-after-scope sanitization (error if user requested that explicitly). Don't remember if I've mentioned it earlier, but for vars that are TREE_ADDRESSABLE only because of ASAN_MARK calls, we should probably turn them non-addressable and remove those ASAN_MARK calls, those shouldn't leak. You can have a look at the r237814 change for how similarly compare and exchange is special cased for the addressables discovery (though, the ASAN_MARK case would be easier, just drop it rather than turn it into something different). I think I miss some testsuite coverage for what has been discussed, stuff like: switch (x) { int a; case 1: int b; foo (&a); foo (&b); /* FALLTHRU */ case 2: int c; foo (&a); foo (&b); foo (&c); ... } On use-after-scope-goto-1.c it seems LLVM does a better job, there the goto doesn't cross the declaration of any of the a/b/c/d/e/f vars, so doesn't need unpoisioning of them; and another testcase is: int main(int argc, char **argv) { if (argc == 0) { int a; int b; int c; int d; int e; int f; int *ptr; int *ptr2; int *ptr3; int *ptr4; int *ptr5; int *ptr6; label: { a = 123; b = 123; c = 123; d = 123; e = 123; f = 123; ptr = &a; *ptr = 1; ptr2 = &b; *ptr2 = 1; ptr3 = &c; *ptr3 = 1; ptr4 = &d; *ptr4 = 1; ptr5 = &e; *ptr5 = 1; ptr6 = &f; *ptr6 = 1; return 0; } } else goto label; return 0; } I believe the C/C++ FEs have warnings/errors for crossing initialization of a variable with goto, can't that infrastructure be used also for discovery of what variables needs unpoisoning when entering a scope through a goto? Of course for something like setjmp (if we need to do anything about it at all, maybe longjmp just clears it), or non-local or computed goto we probably have to be conservative, but for the common case of local goto can't we figure out what needs to be unpoisoned either in the FEs, or during gimplification, and emit the unpoisoning not at the label, but on the edge between the goto and the label? Jakub