On 08/29/2017 03:13 AM, Christophe Lyon wrote:
> Hi Jeff,
>
>
> On 29 August 2017 at 07:07, Jeff Law <l...@redhat.com> wrote:
>> This is a two part patchkit to improve DOM's ability to derive constant
>> equivalences that arise as a result of traversing a particular edge in
>> the CFG.
>>
>> Until now we only allowed a single NAME = NAME|CONST equivalence to be
>> associated with an edge in the CFG. Patch #1 generalizes that code so
>> that we can record multiple simple equivalences on an edge. Much like
>> expression equivalences, we just shove them into a vec and iterate on
>> the vec in the appropriate places.
>>
>> Patch #2 has the interesting bits.
>>
>> Back in the gcc-7 effort I added the ability to look at the operands of
>> a BIT_IOR_EXPR that had a zero result. In that case each operand of the
>> BIT_IOR must have a zero value. This was to address a missed
>> optimization regression bug during stage4.
>>
>> The plan was always to add analogous BIT_AND support, but I didn't feel
>> like handling BIT_AND was appropriate at the time (no bz entry and no
>> regressions related to that capability).
>>
>> I'd also had the sense that further improvements could be made here. For
>> example, it is common for the BIT_IOR or BIT_AND to be fed by a
>> comparison and we ought to be able to record the result of the
>> comparison. If the comparison happened to be an equality test, then we
>> may ultimately derive a constant for one operand of the equality test as
>> well.
>>
>> It also seemed like the NOP/CONVERT_EXPR handling could be incorporated
>> into such a change.
>>
>> So I pulled together some instrumentation. Lots of things generate
>> equivalences -- but a much smaller subset of those equivalences are
>> ultimately useful.
>>
>> Probably the most surprising was BIT_XOR, which allows us to generate
>> all kinds of equivalences, but none that were useful for ultimate
>> simplification in any of the tests I looked at.
>>
>>
>> The most subtle was COND_EXPRs.
>> We might have something like
>>
>>   res = (a != 5) ? x : 1;
>>
>>
>> We can't actually derive anything useful for "a" here, even if we know
>> the result is one. That's because "x" could have the value 1. So you
>> end up only being able to derive equivalences for COND_EXPRs when both
>> arms have a constant value. That restriction dramatically reduces the
>> utility of handling COND_EXPR -- to the point where I'm not including it.
>>
>> So what we end up with is BIT_AND/BIT_IOR, conversions, plus/minus,
>> comparisons and neg/not.
>>
>> So when we determine that a particular SSA_NAME has a constant value, we
>> look at the defining statement and essentially try to derive a value for
>> the input operand(s) based on knowing the result value. If we can
>> derive a constant value for an input operand, we record that value and
>> recurse.
>>
>> In cases where we walk backwards to a condition, we will record the
>> condition into the available expression table.
>>
>>
>> The code is written such that if we find cases where the equivalences
>> for other nodes are useful, they're easy to add.
>>
>>
>> These equivalences are most useful to the threader, but I've seen them
>> help in other cases as well. There's a half-dozen or so new tests
>> reduced from GCC itself.
>>
>> Bootstrapped and regression tested on x86_64, lightly tested on ppc64le
>> via bootstrapping and running the new tests to verify they do the right
>> thing on a !logical_op_short_circuit target.
>>
>> Installing on the trunk.
>>
>> Jeff
>>
>>
>> commit 506ac60cacbc4c4e5e166513ea83c1d2e14eaf3b
>> Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4>
>> Date: Tue Aug 29 05:03:22 2017 +0000
>>
>>     * tree-ssa-dom.c (class edge_info): Changed from a struct
>>     to a class. Add ctor/dtor, methods and data members.
>>     (edge_info::edge_info): Renamed from allocate_edge_info.
>>     Initialize additional members.
>>     (edge_info::~edge_info): New.
>>     (free_dom_edge_info): Delete the edge info.
>>     (record_edge_info): Use new class & associated member functions.
>>     Tighten forms for testing for edge equivalences.
>>     (record_temporary_equivalences): Iterate over the simple
>>     equivalences rather than assuming there's only one per edge.
>>     (cprop_into_successor_phis): Iterate over the simple
>>     equivalences rather than assuming there's only one per edge.
>>     (optimize_stmt): Use operand_equal_p rather than pointer
>>     equality for mini-DSE code.
>> [ snip ]
>> commit a370df2c52074abbb044d1921a0c7df235758050
>> Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4>
>> Date: Tue Aug 29 05:03:36 2017 +0000
>>
>>     * tree-ssa-dom.c (edge_info::record_simple_equiv): Call
>>     derive_equivalences.
>>     (derive_equivalences_from_bit_ior,
>>     record_temporary_equivalences):
>>     Code moved into....
>>     (edge_info::derive_equivalences): New private member function
>>
>>     * gcc.dg/torture/pr57214.c: Fix type of loop counter.
>>     * gcc.dg/tree-ssa/ssa-sink-16.c: Disable DOM.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-11.c: New test.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-12.c: New test.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-13.c: New test.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-14.c: New test.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-15.c: New test.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-16.c: New test.
>>     * gcc.dg/tree-ssa/ssa-dom-thread-17.c: New test.
>>
>>     git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@251397
>>     138bc75d-0d04-0410-961f-82ee72b054a4
>>
>
> 3 of the new tests fail on arm-none-linux-gnueabihf
> --with-cpu=cortex-a15 --with-fpu=vfpv3-d16-fp16
>
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-11.c scan-tree-dump-times dom2 "Threaded" 1
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-14.c scan-tree-dump-times dom2 "Threaded" 1
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-16.c scan-tree-dump-times dom2 "Threaded" 1
>
> they do pass when configuring for cpu cortex-a9/a15 and fpu
> neon-fp16/neon-vfpv4
>
> I do not have the dumps since it's automated testing; let me know if
> you need me to reproduce it manually and extract the dumps.

Thanks. -11 and -16 are fairly sensitive to branch costing so I'm not
terribly surprised to find out we're going to need to adjust the target
selectors a bit more. I'll look into what's going on with -14 as well.

Thanks,
jeff
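
For readers following the thread, the backward derivation described in the quoted patch notes (start from a name with a known constant, look at its defining operation, derive operand constants where the result pins them down, and recurse) can be sketched as a toy in C. This is only an illustration under stated assumptions: `toy_node`, `toy_code` and `toy_derive` are hypothetical names, not GCC's tree or gimple types and not the code in tree-ssa-dom.c, and the sketch ignores overflow, conversions and comparisons.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node.  Hypothetical: these are not GCC's tree or
   gimple types, and toy_derive is not edge_info::derive_equivalences.  */
enum toy_code { TOY_LEAF, TOY_BIT_IOR, TOY_BIT_AND, TOY_PLUS_CST, TOY_NEGATE };

struct toy_node
{
  enum toy_code code;
  struct toy_node *op0;  /* first operand, NULL for a leaf */
  struct toy_node *op1;  /* second operand, NULL unless binary */
  long cst;              /* constant addend for TOY_PLUS_CST */
  bool is_bool;          /* value is known to be 0 or 1 */
  bool known;            /* a constant has been derived */
  long value;            /* that constant, valid only if known */
};

/* Record that N has the constant VALUE, then walk backwards through
   the operation defining N and derive constants for its operands
   whenever the result determines them.  Overflow is ignored.  */
void
toy_derive (struct toy_node *n, long value)
{
  if (n->known)
    return;
  n->known = true;
  n->value = value;

  switch (n->code)
    {
    case TOY_BIT_IOR:
      assert (n->op0 && n->op1);
      /* (a | b) == 0 forces a == 0 and b == 0.  A nonzero result
         says nothing about the individual operands.  */
      if (value == 0)
        {
          toy_derive (n->op0, 0);
          toy_derive (n->op1, 0);
        }
      break;

    case TOY_BIT_AND:
      assert (n->op0 && n->op1);
      /* (a & b) == 1 with both operands boolean forces a == b == 1.  */
      if (value == 1 && n->op0->is_bool && n->op1->is_bool)
        {
          toy_derive (n->op0, 1);
          toy_derive (n->op1, 1);
        }
      break;

    case TOY_PLUS_CST:
      assert (n->op0);
      /* (a + C) == V gives a == V - C.  */
      toy_derive (n->op0, value - n->cst);
      break;

    case TOY_NEGATE:
      assert (n->op0);
      /* (-a) == V gives a == -V.  */
      toy_derive (n->op0, -value);
      break;

    case TOY_LEAF:
      break;
    }
}
```

With r = (a + 3) | c and r known to be 0, calling toy_derive (&r, 0) marks c as 0 and a as -3. A COND_EXPR such as res = (a != 5) ? x : 1 has no case here for the reason given in the quoted text: knowing res is 1 does not determine "a" unless both arms are constants.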