On Sat, Nov 12, 2016 at 8:36 AM, Evgeny Kudryashov <kudryas...@ispras.ru> wrote:
> On 2016-11-10 13:30, Bin.Cheng wrote:
>>
>> Hi,
>> I see the cost problem with your test now. When computing an address
>> type iv_use with a candidate, the computation consists of two parts:
>> the part that can be represented by the addressing mode is done in
>> the memory reference; the part that cannot is done outside of the
>> memory reference. The final cost is the sum of the two parts.
>> For address iv_use:
>>     MEM[base + biv << scale + offset]
>> when it is computed with the candidate below on a target that only
>> supports the [base + biv << scale] addressing mode:
>>     biv
>> The computations would be like:
>>     base' = base + offset
>>     MEM[base' + biv << scale]
>> Each computation has its own cost: the first is a normal RTX cost,
>> the second is an addressing mode cost. The final cost is the sum of
>> both parts.
>>
>> Normally, all these costs should be added up in the cost model, but
>> there is one exception, found in your test: if the iv_uses of a group
>> have exactly the same iv ({base, step}), the first part of the
>> computation (RTX) can be shared among all iv_uses, so its cost should
>> be counted only once. That is, we should be able to model such CSE
>> opportunities. Apparently, we can't CSE the second part of the
>> computation; of course there won't be CSE opportunities in an address
>> expression anyway.
>
>
> Hi Bin,
> Yes, that is exactly what happens. And this computation might be cheaper
> than initialization and increment of a new iv, so it would be preferable.
>
>> That said, this patch should distinguish between the cost of RTX
>> computation and the cost of the address expression, and only add up
>> the RTX cost once if it can be CSEed. Well, it might not be trivial
>> to check CSE opportunities of RTX computation; for example, some
>> iv_uses of the group are the same, others are not.
>>
>> Thanks,
>> bin
>
>
> Since uses in a given group have the same base and step, they can only
> differ by offsets. Among those, equivalent offsets can be CSE'd. Then,
> perhaps it's possible to use a hash set of unique offsets in this group cost
> estimation loop, and count RTX computation cost only when adding a new entry
> to the set. What do you think about this approach?

We can start by handling groups with exactly the same uses. When
constructing groups, record a flag indicating that the group only has
identical uses; when computing cost, accumulate the RTX computation cost
once for flagged groups. The rationale is: use/cand cost computation in
IVOPT is complicated and inaccurate, so it doesn't make much sense to
try fine-tuning based on such costs. It often results in Brownian
movement (I am not sure if that's the word). Moreover, we may want to
further restrict this to single basic block iv_uses when flagging
groups.

BTW, it may be non-trivial to compute the costs of RTX computation and
address expression separately. Such costs are computed and accumulated
together in get_address_cost.
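
For illustration only, here is a rough standalone sketch of the two
accounting schemes discussed above. It is not GCC code: use_cost,
group_cost_unique_offsets, group_cost_flagged and the all_uses_same flag
are hypothetical names, and the sketch simply assumes the RTX part and
the addressing-mode part of the cost are available separately, which is
not the case today since get_address_cost accumulates them together.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one address-type use in a group.  These are
// NOT the real IVOPTs data structures.
struct use_cost
{
  std::int64_t offset;  // constant offset of this use from the group base
  unsigned rtx_cost;    // cost of "base' = base + offset" (outside the MEM)
  unsigned addr_cost;   // cost of the addressing mode MEM[base' + biv << scale]
};

// Hash-set scheme: charge the RTX part once per distinct offset.
unsigned
group_cost_unique_offsets (const std::vector<use_cost> &uses)
{
  std::unordered_set<std::int64_t> seen;
  unsigned cost = 0;
  for (const use_cost &u : uses)
    {
      cost += u.addr_cost;                // never shared between uses
      if (seen.insert (u.offset).second)  // true only for a new offset
        cost += u.rtx_cost;
    }
  return cost;
}

// Flag scheme: the flag is assumed to be recorded when the group is
// built, and means "every use in this group is identical".
unsigned
group_cost_flagged (const std::vector<use_cost> &uses, bool all_uses_same)
{
  unsigned cost = 0;
  bool rtx_counted = false;
  for (const use_cost &u : uses)
    {
      // Sanity check: a flagged group must really have a single offset.
      assert (!all_uses_same || u.offset == uses.front ().offset);
      cost += u.addr_cost;
      if (!all_uses_same || !rtx_counted)
        {
          cost += u.rtx_cost;
          rtx_counted = true;
        }
    }
  return cost;
}
```

With three uses at the same offset, an RTX cost of 4 and an address cost
of 1 each, both functions return 7 rather than the 15 obtained by summing
everything. The flag scheme is coarser: a group with mixed offsets is
simply not flagged and falls back to the full sum.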
>
> While working on this issue, I've found another problem: that costs may
> become negative. That looks unintended, so I have filed a new bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78332

It could be improved, but it's not a functional bug IMHO. I will comment
on the PR.

Thanks,
bin