Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/23/13 07:05, Richard Biener wrote: Btw, -ftree-isolate-paths sounds a bit generic for isolating paths leading to undefined behavior, maybe -fisolate-errorneous-paths? (please drop 'tree-' from new options, 'tree' isn't meaningful to GCC users) Seems reasonable, particularly since I think this can and should be easily extended for things like out-of-bounds array accesses and perhaps other stuff. Oh yea, it'll be -fisolate-erroneous-paths. Done. The file should be named gimple-ssa-isolate-paths.c, too. Done. + for (si = gsi_start_nondebug_bb (bb), si2 = gsi_start_nondebug_bb (duplicate); +!gsi_end_p (si) && !gsi_end_p (si2); +gsi_next_nondebug (&si), gsi_next_nondebug (&si2)) + { + while (gimple_code (gsi_stmt (si)) == GIMPLE_LABEL) + gsi_next_nondebug (&si); gsi_after_labels (bb); if (is_gimple_debug (gsi_stmt (gsi))) gsi_next_nondebug (&gsi); warrants a new gsi_nondebug_after_labels () I assume you meant gsi_start_nondebug_after_labels_bb. Once we hit the first real statement in a BB we shouldn't see any more labels in that block, so we just need to use gsi_next_nondebug after that. + /* SI2 is the iterator in the duplicate block and it now points +to our victim statement. */ + gimple_seq seq = NULL; + tree t = build_call_expr_loc (0, + builtin_decl_explicit (BUILT_IN_TRAP), 0); + gimplify_and_add (t, &seq); + gsi_insert_before (&si2, seq, GSI_SAME_STMT); please build a gimple call stmt directly instead of going through the gimplifier. I can't recall where I got that gem (wherever it was it should be fixed too), but yea, going through the gimplifier is dumb. Fixed. + if (operand_equal_p (op, integer_zero_node, 0)) integer_zerop () Fixed. Note this instance goes away, but there was another in the stanza which detected explicit NULL pointer dereferences in the IL. + /* We've got a NULL PHI argument. Now see if the +PHI's result is dereferenced within BB. */ + FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs) + { + unsigned int j; + bool removed_edge = false; + + /* We only care about uses in BB which are assignements +with memory operands. + +We could see if any uses are as function arguments +when the callee has marked the argument as being +non-null. */ + if (gimple_bb (use_stmt) != bb + || !gimple_has_mem_ops (use_stmt) + || gimple_code (use_stmt) != GIMPLE_ASSIGN) !is_gimple_assign (), you can drop the has_mem_ops check, but ... + continue; + + /* Look at each operand for a memory reference using +LHS. */ ... you want to use walk_stmt_load_store_ops () which handles all kinds of stmts fine. Actually for this code Marc Glisse recommended infer_nonnull_range, and I'm in agreement. + for (j = 0; j < gimple_num_ops (use_stmt); j++) + { + tree op = gimple_op (use_stmt, j); + + if (op == NULL) + continue; + + while (handled_component_p (op)) + op = TREE_OPERAND (op, 0); + + if ((TREE_CODE (op) == MEM_REF + || TREE_CODE (op) == INDIRECT_REF) no more INDIRECT_REFs are in the IL. Oh. I learn something every day. I vaguely recall discussing MEM_REF at a summit years ago, but I was a bit pre-occupied with personal matters. I guess it totally subsumed INDIRECT_REF at this stage in the compiler. Can't complain about that :-) I'll need to rework the second stanza which detects NULL dereferences explicitly appearing in the IL a bit, but it shouldn't be too difficult. Thanks for the feedback, Jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/23/13 01:58, Florian Weimer wrote: Could you keep in mind that there is considerable interest in a check_nonnull attribute which marks values (parameters, return values, maybe even struct fields) that can be NULL and need to be checked explictly prior to dereference? GCC would then warn if there is a path on which the check is missing. I don't have time at the moment to work on this, but it's on my ever-growing TODO list. :) There's a lot of overlap between this and the change which originally set me down this path -- namely a pass which would warn about potential null pointer dereference (with suitable knobs which allowed the user to control things like the non-nullness of a load of a pointer from memory). I'd like to get back to that work, but it's definitely not a 4.9 thing. Jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On Mon, Oct 21, 2013 at 7:27 PM, Jeff Law wrote: > On 10/21/13 06:19, Richard Biener wrote: > >> >> I wonder why this isn't part of the regular jump-threading code - after >> all the opportunity is to thead to __builtin_unreachable () ;) Otherwise >> the question is always where you'd place this pass and whether it >> enables jump threading or CSE opportunities (that at least it does). > > In theory one could do this as part of jump threading. Doing it in jump > threading would have the advantage that we could use a NULL generated in a > different block than the dereference. Of course, we'd have a full path to > duplicate at that point. > > The downside is complexity. This pass is amazingly simple and effective. > The jump threading code is considerably more complex and bolting this on the > side would just make it worse. > > From a pipeline location, if kept as a separate pass, it probably needs to > be between DOM1 & phi-only cprop. DOM1 does a reasonable job at propagating > the NULL values into the PHI nodes to expose the optimization. And the pass > tends to expose precisely the kinds of PHI nodes that phi-only cprop can > optimize. VRP2/PRE/DOM2 do a good job at finding the newly exposed CSEs and > jump threading opportunities. > > I haven't looked to see if there's anything left to optimize after DOM2, but > I can certainly speculate that there would be cases exposed by the various > passes that currently exist between DOM1 & DOM2. That's certainly worth > exploring. > > As you know, phase ordering will always be a concern. We already have > acknowledged that we're punting some of those issues once we stopped > iterating DOM. True ... Btw, -ftree-isolate-paths sounds a bit generic for isolating paths leading to undefined behavior, maybe -fisolate-errorneous-paths? (please drop 'tree-' from new options, 'tree' isn't meaningful to GCC users) The file should be named gimple-ssa-isolate-paths.c, too. + for (si = gsi_start_nondebug_bb (bb), si2 = gsi_start_nondebug_bb (duplicate); +!gsi_end_p (si) && !gsi_end_p (si2); +gsi_next_nondebug (&si), gsi_next_nondebug (&si2)) + { + while (gimple_code (gsi_stmt (si)) == GIMPLE_LABEL) + gsi_next_nondebug (&si); gsi_after_labels (bb); if (is_gimple_debug (gsi_stmt (gsi))) gsi_next_nondebug (&gsi); warrants a new gsi_nondebug_after_labels () + /* SI2 is the iterator in the duplicate block and it now points +to our victim statement. */ + gimple_seq seq = NULL; + tree t = build_call_expr_loc (0, + builtin_decl_explicit (BUILT_IN_TRAP), 0); + gimplify_and_add (t, &seq); + gsi_insert_before (&si2, seq, GSI_SAME_STMT); please build a gimple call stmt directly instead of going through the gimplifier. + if (operand_equal_p (op, integer_zero_node, 0)) integer_zerop () + /* We've got a NULL PHI argument. Now see if the +PHI's result is dereferenced within BB. */ + FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs) + { + unsigned int j; + bool removed_edge = false; + + /* We only care about uses in BB which are assignements +with memory operands. + +We could see if any uses are as function arguments +when the callee has marked the argument as being +non-null. */ + if (gimple_bb (use_stmt) != bb + || !gimple_has_mem_ops (use_stmt) + || gimple_code (use_stmt) != GIMPLE_ASSIGN) !is_gimple_assign (), you can drop the has_mem_ops check, but ... + continue; + + /* Look at each operand for a memory reference using +LHS. */ ... you want to use walk_stmt_load_store_ops () which handles all kinds of stmts fine. + for (j = 0; j < gimple_num_ops (use_stmt); j++) + { + tree op = gimple_op (use_stmt, j); + + if (op == NULL) + continue; + + while (handled_component_p (op)) + op = TREE_OPERAND (op, 0); + + if ((TREE_CODE (op) == MEM_REF + || TREE_CODE (op) == INDIRECT_REF) no more INDIRECT_REFs are in the IL. Richard. + && TREE_OPERAND (op, 0) == lhs) + { + edge e = gimple_phi_arg_edge (phi, i); + duplicate = isolate_path (bb, duplicate, + e, use_stmt); + + /* When we remove an incoming edge, we need to +reprocess the Ith element. */ +
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/22/2013 09:00 PM, Jeff Law wrote: So I was poking at this a bit. It's trival to use infer_nonnull_range and to teach infer_nonnull_range to use the returns_nonnull attribute to pick up that return x in an appropriately decorated function implies that x is non-null. We'll need a better place to shove infer_nonnull_range so that it's available to both users. Could you keep in mind that there is considerable interest in a check_nonnull attribute which marks values (parameters, return values, maybe even struct fields) that can be NULL and need to be checked explictly prior to dereference? GCC would then warn if there is a path on which the check is missing. I don't have time at the moment to work on this, but it's on my ever-growing TODO list. :) -- Florian Weimer / Red Hat Product Security Team
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/18/13 14:31, Marc Glisse wrote: On Fri, 18 Oct 2013, Jeff Law wrote: On 10/18/13 12:47, Marc Glisse wrote: * tree-vrp has a function infer_nonnull_range, do you think we could share it? We now store the VRP ranges for integers, but not for pointers. If we did (or maybe just a non-null bit), the pass could just test that bit on the variable found in the PHI. I'm not sure what can really be shared here -- this patch searches for PHIs where one or more of the args is a NULL pointer. The NULL pointer will be explicit in the PHI arg. But once you have that pointer defined by a PHI containing a zero, you look at all its uses, trying to find one that proves the pointer is non-zero (only dereferences for now, but you have a comment about the non-null attribute). And infer_nonnull_range precisely says whether a statement proves that a pointer is non-zero (well, there may be a few subtle differences, and some tests might need to move between infer_value_range and infer_nonnull_range). I am just talking of replacing 20 lines of code with a function call, not a huge sharing I agree... Storing the VRP info might not actually help, since you need to know starting from which statement the pointer is non-zero. So I was poking at this a bit. It's trival to use infer_nonnull_range and to teach infer_nonnull_range to use the returns_nonnull attribute to pick up that return x in an appropriately decorated function implies that x is non-null. We'll need a better place to shove infer_nonnull_range so that it's available to both users. I also looked at having VRP use the returns_nonnull to tighten ranges in the current function (when it's appropriately decorated). However, that's a bit more problematical. By the time we process the return statement, we've likely already taken the return value's range to VARYING. Going back down the lattice (to ~[0, 0]) is generally considered bad. Given how rarely I expect this to help, I'm dropping this part of the floor. The hack I had to avoid processing a block multiple times if a single SSA_NAME has multiple uses in the block was totally unnecessary. All the right things happen if that hack gets removed and isolate_path is slightly adjusted. So if we have X that we've determined as a NULL value on some path, given a block like x->y = blah blah x->z = fu bar If we find the x->z use first in the immediate use iterator, we'll first transform into x->y = blah blah trap Then we see the x->y use and transform into trap Which is exactly what we want. jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/21/13 06:19, Richard Biener wrote: I wonder why this isn't part of the regular jump-threading code - after all the opportunity is to thead to __builtin_unreachable () ;) Otherwise the question is always where you'd place this pass and whether it enables jump threading or CSE opportunities (that at least it does). In theory one could do this as part of jump threading. Doing it in jump threading would have the advantage that we could use a NULL generated in a different block than the dereference. Of course, we'd have a full path to duplicate at that point. The downside is complexity. This pass is amazingly simple and effective. The jump threading code is considerably more complex and bolting this on the side would just make it worse. From a pipeline location, if kept as a separate pass, it probably needs to be between DOM1 & phi-only cprop. DOM1 does a reasonable job at propagating the NULL values into the PHI nodes to expose the optimization. And the pass tends to expose precisely the kinds of PHI nodes that phi-only cprop can optimize. VRP2/PRE/DOM2 do a good job at finding the newly exposed CSEs and jump threading opportunities. I haven't looked to see if there's anything left to optimize after DOM2, but I can certainly speculate that there would be cases exposed by the various passes that currently exist between DOM1 & DOM2. That's certainly worth exploring. As you know, phase ordering will always be a concern. We already have acknowledged that we're punting some of those issues once we stopped iterating DOM. Jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On Fri, Oct 18, 2013 at 7:12 PM, Jeff Law wrote: > > > Back in 2011 I wrote code to detect cases when traversing a particular path > could be proven to trigger undefined behaviour (such as a null pointer > dereference). That original patch would find the control dependent edges > leading to the dereference and eliminate those edges. The result being those > paths with undefined behaviour simply vanished. > > The problem with that implementation is there could have been observable > side effects on those paths prior to triggering the undefined behaviour. > > At that time I speculated we could isolate the path (via block copying) with > undefined behaviour. On the duplicate path we'd replace the undefined > behaviour with a trap and remove only the outgoing edges. That would still > preserve any visible side effects on the undefined path up to the undefined > behaviour and we still often get to simplify the main path, though not as > much as the original version from 2011. > > That's precisely what this patch does. When we have a PHI which feeds a > memory dereference in the same block as the PHI and one (or more) of the PHI > args is NULL we duplicate the block, redirect incoming edges with the NULL > PHI args to the duplicate and change the statement with the memory > dereference to a trap. > > You can certainly ask is this actually helpful. In my experience, yes. I'm > regularly seeing stuff like > > x2 = PHI (x1, x1, 0, 0); > *x2 > > When we isolate the paths with NULL incoming values, we're left with > x2 = PHI (x1, x1) > > And we can propagate x1 into uses of x2. The block now has just one pred > and the current block can sometimes be merged into that single pred. I've > seen the combination of those two then lead to identification additional > common subexpressions. > > You might also ask how often such paths show up. In a bootstrap about a > week ago, this triggered ~3500 times. > > The code tries to be reasonably smart and re-use an existing duplicate when > there's multiple NULL values in a PHI node. That triggers often enough to > be worth the marginal amount of book keeping. > > The code doesn't yet handle the case where there's multiple dereferences in > the block. There's no guarantee the first dereference will be the first one > we find on the immediate use list. It's on my TODO list. > > > There's no reason this same framework couldn't be used to do the same thing > for an out of bounds array access. Similarly, it wouldn't be hard to issue > a warning when these transformations are applied. > > Posting at this time to get some feedback. Planning a formal RFA prior to > close of 4.9 stage1. > > And, of course, this bootstrapps and regression tests. 20030711-3 is > clearly optimized further with this patch, but I'll cobble together some > more testcases next week. > > > I'm not familiar with the options stuff, so if someone could look at that > more closely, I'd appreciate it. Similarly for the bits to create a new > pass structure. > > > Thoughts, comments, flames? I wonder why this isn't part of the regular jump-threading code - after all the opportunity is to thead to __builtin_unreachable () ;) Otherwise the question is always where you'd place this pass and whether it enables jump threading or CSE opportunities (that at least it does). Richard. > > > > > > * Makefile.in (OBJS): Add tree-ssa-isolate-paths. > * common.opt (-ftree-isolate-paths): Add option and documentation. > * opts.c (default_options_table): Add OPT_ftree_isolate_paths. > * passes.def: Add path isolation pass. > * timevar.def (TV_TREE_SSA_ISOLATE_PATHS): New timevar. > * tree-pass.h (make_isolate_paths): Declare. > * tree-ssa-isolate-paths.c: New file. > > * gcc.dg/pr38984.c: Add -fno-tree-isolate-paths. > * gcc.dg/tree-ssa/20030711-3.c: Update expected output. > > * testsuite/libmudflap.c/fail20-frag.c: Add -fno-tree-isolate-paths. > > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index ba39ac9..e7c18bc 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1412,6 +1412,7 @@ OBJS = \ > tree-ssa-dse.o \ > tree-ssa-forwprop.o \ > tree-ssa-ifcombine.o \ > + tree-ssa-isolate-paths.o \ > tree-ssa-live.o \ > tree-ssa-loop-ch.o \ > tree-ssa-loop-im.o \ > diff --git a/gcc/common.opt b/gcc/common.opt > index 1f11fcd..1fd1bcc 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2104,6 +2104,12 @@ foptimize-strlen > Common Report Var(flag_optimize_strlen) Optimization > Enable string length optimizations on trees > > +ftree-isolate-paths > +Common Report Var(flag_tree_isolate_paths) Init(1) Optimization > +Detect paths which trigger undefined behaviour. Isolate those paths from > +the main control flow and turn the statement with undefined behaviour into > +a trap. > + > ftree-loop-distribution > Common Report Var(flag_tree_loop_distribution) Optimization >
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/18/13 15:15, Marc Glisse wrote: On Fri, 18 Oct 2013, Jeff Law wrote: On 10/18/13 14:31, Marc Glisse wrote: But once you have that pointer defined by a PHI containing a zero, you look at all its uses, trying to find one that proves the pointer is non-zero What are you going to do with that information? Uh? I must have been very unclear in my post, this sentence was supposed to describe what *your* patch is doing... Ah, no worries. Sometimes email is a lousy medium for communication. Yes, precisely. All I was talking about was separating the predicate: "this statement results in UB if that SSA_NAME pointer is NULL" into its own function, and we happen to already have a function in VRP which does that. Yes, good point. Sharing that routine would be advisable for various reasons. I don't offhand see anything VRP specific in there, so it ought to be relatively easy. Jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On Fri, 18 Oct 2013, Jeff Law wrote: On 10/18/13 14:31, Marc Glisse wrote: But once you have that pointer defined by a PHI containing a zero, you look at all its uses, trying to find one that proves the pointer is non-zero What are you going to do with that information? Uh? I must have been very unclear in my post, this sentence was supposed to describe what *your* patch is doing... The only use I can see for this pass would be discovering more statements that, if a NULL value flows in, result in undefined behaviour. Yes, precisely. All I was talking about was separating the predicate: "this statement results in UB if that SSA_NAME pointer is NULL" into its own function, and we happen to already have a function in VRP which does that. The comment about the non-null attribute refers to cases where we've decorated an argument as "must be non-null". If we see a NULL flowing into such a call, then we've crossed the line into undefined behaviour. We'd want to treat the call just like *0 -- isolate the path with NULL flowing to that call and turn the call into a trap. Yes, and I think you would get that for free if you called the VRP function instead of manually checking if the statement is an indirection. Similarly if we find a NULL flowing to a return statement in a function decorated as never returning NULL. Good idea. Sorry for the misunderstandings... -- Marc Glisse
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/18/13 14:31, Marc Glisse wrote: But once you have that pointer defined by a PHI containing a zero, you look at all its uses, trying to find one that proves the pointer is non-zero What are you going to do with that information? The only use I can see for this pass would be discovering more statements that, if a NULL value flows in, result in undefined behaviour. (only dereferences for now, but you have a comment about the non-null attribute). The comment about the non-null attribute refers to cases where we've decorated an argument as "must be non-null". If we see a NULL flowing into such a call, then we've crossed the line into undefined behaviour. We'd want to treat the call just like *0 -- isolate the path with NULL flowing to that call and turn the call into a trap. Similarly if we find a NULL flowing to a return statement in a function decorated as never returning NULL. Jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On Fri, 18 Oct 2013, Jeff Law wrote: On 10/18/13 12:47, Marc Glisse wrote: * tree-vrp has a function infer_nonnull_range, do you think we could share it? We now store the VRP ranges for integers, but not for pointers. If we did (or maybe just a non-null bit), the pass could just test that bit on the variable found in the PHI. I'm not sure what can really be shared here -- this patch searches for PHIs where one or more of the args is a NULL pointer. The NULL pointer will be explicit in the PHI arg. But once you have that pointer defined by a PHI containing a zero, you look at all its uses, trying to find one that proves the pointer is non-zero (only dereferences for now, but you have a comment about the non-null attribute). And infer_nonnull_range precisely says whether a statement proves that a pointer is non-zero (well, there may be a few subtle differences, and some tests might need to move between infer_value_range and infer_nonnull_range). I am just talking of replacing 20 lines of code with a function call, not a huge sharing I agree... Storing the VRP info might not actually help, since you need to know starting from which statement the pointer is non-zero. -- Marc Glisse
Re: [RFC] Isolate & simplify paths with undefined behaviour
On 10/18/13 12:47, Marc Glisse wrote: Maybe a new -fretroactive-undefined-behavior? (for later, obviously) Something like that -- haven't thought much about the name. * should cfg_altered be static (or a member of the pass class)? At the minimum it should be static. Into the pass class would be better. I'll take care of that. * tree-vrp has a function infer_nonnull_range, do you think we could share it? We now store the VRP ranges for integers, but not for pointers. If we did (or maybe just a non-null bit), the pass could just test that bit on the variable found in the PHI. I'm not sure what can really be shared here -- this patch searches for PHIs where one or more of the args is a NULL pointer. The NULL pointer will be explicit in the PHI arg. So a real world example (alias.i from gcc-4.6.0): : # iftmp.163_8 = PHI # _241 = PHI <_242(66), _109(65), _246(93)> _112 = iftmp.163_8->u.hwint[0]; _113 = _241 + _112; _114 = _113 * 8; _115 = ref_28(D)->size; if (_114 == _115) goto ; else goto ; So when BB67 is reached via BB65 or BB93 iftmp.163_8 will have the value 0 via the PHI node at the start of BB67 and it'll be derefereced in BB67. We'll isolate the 65->67 path and the 93->67 path. What you're suggesting would be more useful for a warning pass which detected potential null pointer dereferences. If we continue with the same example and assume we isolated the two paths noted above we have this remaining as BB67: : # iftmp.163_8 = PHI # _241 = PHI <_242(66)> _112 = iftmp.163_8->u.hwint[0]; _113 = _241 + _112; _114 = _113 * 8; _115 = ref_28(D)->size; if (_114 == _115) goto ; else goto ; If we've determined via VRP that iftmp.163_111 has a non-null property, then we would not warn about a potential null pointer dereference. That's an example of where preserving the range information for pointers out of VRP would be useful. I actually have another patch which does precisely these kinds of warnings. It needs more TLC before we could even consider integrating it. BTW, if we continue with that example, immediately after isolation we have (I'm including our block's predecessor because it's going to merge). BB63 is the pred, BB64 is the block where we isolated out two paths: ;; basic block 63, loop depth 0, count 0, freq 529, maybe hot ;;prev block 62, next block 64, flags: (NEW, REACHABLE) ;;pred: 62 [27.0%] (TRUE_VALUE,EXECUTABLE) iftmp.161_102 = _99->size; _186 = iftmp.161_102->u.hwint[0]; _69 = mem_25(D)->u.fld[1].rt_mem; iftmp.163_111 = _69->offset; ;;succ: 64 [100.0%] (FALLTHRU,EXECUTABLE) ;; basic block 64, loop depth 0, count 0, freq 623, maybe hot ;;prev block 63, next block 65, flags: (NEW, REACHABLE) ;;pred: 63 [100.0%] (FALLTHRU,EXECUTABLE) # iftmp.163_8 = PHI # _241 = PHI <_186(63)> _112 = iftmp.163_8->u.hwint[0]; _113 = _241 + _112; _114 = _113 * 8; _115 = ref_28(D)->size; if (_114 == _115) goto ; else goto ; We soon realize that we can cprop away the two PHI nodes at the start of BB64 and merge BB64 and BB63 together resulting in: ;; basic block 63, loop depth 0, count 0, freq 529, maybe hot ;;prev block 62, next block 64, flags: (NEW, REACHABLE) ;;pred: 62 [27.0%] (TRUE_VALUE,EXECUTABLE) iftmp.161_102 = _99->size; _186 = iftmp.161_102->u.hwint[0]; _69 = mem_25(D)->u.fld[1].rt_mem; iftmp.163_111 = _69->offset; _112 = iftmp.163_111->u.hwint[0]; _113 = _112 + _186; _114 = _113 * 8; _115 = ref_28(D)->size; if (_114 == _115) goto ; else goto ; Later we discover some CSE opportunities and remove dead code resulting in: ;; basic block 66, loop depth 0, count 0, freq 529, maybe hot ;;prev block 65, next block 67, flags: (NEW, REACHABLE) ;;pred: 65 [27.0%] (TRUE_VALUE,EXECUTABLE) _186 = _91->u.hwint[0]; _113 = _166 + _186; _114 = _113 * 8; _115 = ref_28(D)->size; if (_114 == _115) goto ; else goto ; Needless to say, this is goodness. Jeff
Re: [RFC] Isolate & simplify paths with undefined behaviour
On Fri, 18 Oct 2013, Jeff Law wrote: Back in 2011 I wrote code to detect cases when traversing a particular path could be proven to trigger undefined behaviour (such as a null pointer dereference). That original patch would find the control dependent edges leading to the dereference and eliminate those edges. The result being those paths with undefined behaviour simply vanished. The problem with that implementation is there could have been observable side effects on those paths prior to triggering the undefined behaviour. Maybe a new -fretroactive-undefined-behavior? (for later, obviously) At that time I speculated we could isolate the path (via block copying) with undefined behaviour. On the duplicate path we'd replace the undefined behaviour with a trap and remove only the outgoing edges. That would still preserve any visible side effects on the undefined path up to the undefined behaviour and we still often get to simplify the main path, though not as much as the original version from 2011. That's precisely what this patch does. When we have a PHI which feeds a memory dereference in the same block as the PHI and one (or more) of the PHI args is NULL we duplicate the block, redirect incoming edges with the NULL PHI args to the duplicate and change the statement with the memory dereference to a trap. Cool. As a new pass doing something quite clear, I am trying to read it to learn, and have a couple minor comments: * should cfg_altered be static (or a member of the pass class)? * tree-vrp has a function infer_nonnull_range, do you think we could share it? We now store the VRP ranges for integers, but not for pointers. If we did (or maybe just a non-null bit), the pass could just test that bit on the variable found in the PHI. -- Marc Glisse
[RFC] Isolate & simplify paths with undefined behaviour
Back in 2011 I wrote code to detect cases when traversing a particular path could be proven to trigger undefined behaviour (such as a null pointer dereference). That original patch would find the control dependent edges leading to the dereference and eliminate those edges. The result being those paths with undefined behaviour simply vanished. The problem with that implementation is there could have been observable side effects on those paths prior to triggering the undefined behaviour. At that time I speculated we could isolate the path (via block copying) with undefined behaviour. On the duplicate path we'd replace the undefined behaviour with a trap and remove only the outgoing edges. That would still preserve any visible side effects on the undefined path up to the undefined behaviour and we still often get to simplify the main path, though not as much as the original version from 2011. That's precisely what this patch does. When we have a PHI which feeds a memory dereference in the same block as the PHI and one (or more) of the PHI args is NULL we duplicate the block, redirect incoming edges with the NULL PHI args to the duplicate and change the statement with the memory dereference to a trap. You can certainly ask is this actually helpful. In my experience, yes. I'm regularly seeing stuff like x2 = PHI (x1, x1, 0, 0); *x2 When we isolate the paths with NULL incoming values, we're left with x2 = PHI (x1, x1) And we can propagate x1 into uses of x2. The block now has just one pred and the current block can sometimes be merged into that single pred. I've seen the combination of those two then lead to identification additional common subexpressions. You might also ask how often such paths show up. In a bootstrap about a week ago, this triggered ~3500 times. The code tries to be reasonably smart and re-use an existing duplicate when there's multiple NULL values in a PHI node. That triggers often enough to be worth the marginal amount of book keeping. The code doesn't yet handle the case where there's multiple dereferences in the block. There's no guarantee the first dereference will be the first one we find on the immediate use list. It's on my TODO list. There's no reason this same framework couldn't be used to do the same thing for an out of bounds array access. Similarly, it wouldn't be hard to issue a warning when these transformations are applied. Posting at this time to get some feedback. Planning a formal RFA prior to close of 4.9 stage1. And, of course, this bootstrapps and regression tests. 20030711-3 is clearly optimized further with this patch, but I'll cobble together some more testcases next week. I'm not familiar with the options stuff, so if someone could look at that more closely, I'd appreciate it. Similarly for the bits to create a new pass structure. Thoughts, comments, flames? * Makefile.in (OBJS): Add tree-ssa-isolate-paths. * common.opt (-ftree-isolate-paths): Add option and documentation. * opts.c (default_options_table): Add OPT_ftree_isolate_paths. * passes.def: Add path isolation pass. * timevar.def (TV_TREE_SSA_ISOLATE_PATHS): New timevar. * tree-pass.h (make_isolate_paths): Declare. * tree-ssa-isolate-paths.c: New file. * gcc.dg/pr38984.c: Add -fno-tree-isolate-paths. * gcc.dg/tree-ssa/20030711-3.c: Update expected output. * testsuite/libmudflap.c/fail20-frag.c: Add -fno-tree-isolate-paths. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index ba39ac9..e7c18bc 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1412,6 +1412,7 @@ OBJS = \ tree-ssa-dse.o \ tree-ssa-forwprop.o \ tree-ssa-ifcombine.o \ + tree-ssa-isolate-paths.o \ tree-ssa-live.o \ tree-ssa-loop-ch.o \ tree-ssa-loop-im.o \ diff --git a/gcc/common.opt b/gcc/common.opt index 1f11fcd..1fd1bcc 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2104,6 +2104,12 @@ foptimize-strlen Common Report Var(flag_optimize_strlen) Optimization Enable string length optimizations on trees +ftree-isolate-paths +Common Report Var(flag_tree_isolate_paths) Init(1) Optimization +Detect paths which trigger undefined behaviour. Isolate those paths from +the main control flow and turn the statement with undefined behaviour into +a trap. + ftree-loop-distribution Common Report Var(flag_tree_loop_distribution) Optimization Enable loop distribution on trees diff --git a/gcc/opts.c b/gcc/opts.c index 728d36d..b6ed0c2 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -453,6 +453,7 @@ static const struct default_options default_options_table[] = { OPT_LEVELS_1_PLUS, OPT_ftree_ch, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_fcombine_stack_adjustments, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_fcompare_elim, NULL, 1 }, +{ OPT_LEVELS_1_PLUS, OPT_ftree_isolate_paths, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_slsr, NU