On Fri, Sep 11, 2015 at 6:29 PM, Ramana Radhakrishnan
<ramana.radhakrish...@foss.arm.com> wrote:
>
>>>> Saying that all reductions have equivalent performance is unlikely to be
>>>> true for many platforms.  On PowerPC, for example, a PLUS reduction has
>>>> very different cost from a MAX reduction.  If the model isn't
>>>> fine-grained enough, let's please be aggressive about fixing it.  I'm
>>>> fine if it's a separate patch, but in my mind this shouldn't be allowed
>>>> to languish.
>>>
>>> ...I agree that the general vectoriser cost model could probably be
>>> improved, but it seems fairer for that improvement to be done by whoever
>>> adds the patterns that need it.
>>
>> All right.  But in response to Ramana's comment, are all relevant
>> reductions of similar cost on each ARM platform?  Obviously they don't
>> have the same cost on different platforms, but the question is whether a
>> reduc_plus, reduc_max, etc., has identical cost on each individual
>> platform.  If not, ARM may have a concern as well.  I don't know the
>> situation for x86 either.
>
> From cauldron I have a note that we need to look at the vectorizer cost model
> for both the ARM and AArch64 backends and move away from
> the set of magic constants that it currently returns.

Indeed.

> On AArch32, all the reduc_ patterns are emulated with pair-wise operations
> while on AArch64 they aren't. Thus they aren't likely to be the same cost as a
> standard vector arithmetic instruction. What difference this makes in practice
> remains to be seen, however the first step is moving towards the newer
> vectorizer cost model interface.
>
> I'll put this on a list of things for us to look at but I'm not sure who/when
> will get around to looking at this.

Note that the target should be able to "see" the cond reduction via the
add_stmt_cost hook calls.  It should see 'where' as the epilogue and
'kind' as a hint - recording stmt_info which should have sufficient info
for a good guess.  At finish_cost () time the target can compute a proper
overall cost.

Yes, the vectorizer "IL" (the stmt-infos) isn't very powerful and esp.
for code not corresponding to existing gimple stmts it doesn't even exist.
We need to improve in that area to better represent the desired transform.

Richard.

> regards
> Ramana
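
The accumulate-then-finish flow described above (add_stmt_cost calls carrying
'where' and 'kind', then a final finish_cost) can be sketched as a small
standalone C++ model. This is only an illustration under stated assumptions:
the types CostData, StmtHint, ReducOp, Kind and Where, the function
signatures, and every cost number are made up for the sketch. They are not
GCC's actual hook interface and not any target's real costs.

```cpp
#include <array>
#include <cstddef>

// Hypothetical, simplified stand-ins for the vectorizer's notions of
// "where" a statement lands and what "kind" of statement it is.
enum class Where { Prologue = 0, Body = 1, Epilogue = 2 };
enum class Kind { VectorStmt, Reduction };

// Hypothetical per-statement hint, standing in for what a target could
// recover from stmt_info (here: just which reduction is being done).
enum class ReducOp { None, Plus, Max };
struct StmtHint { ReducOp op; };

// Per-location running totals kept by the target between hook calls.
struct CostData {
  std::array<unsigned, 3> cost{};  // zero-initialised: prologue, body, epilogue
};

// Made-up unit costs: the point is only that PLUS and MAX reductions
// need not cost the same, and neither need match a plain vector stmt.
unsigned stmt_cost(Kind kind, const StmtHint &hint) {
  if (kind == Kind::Reduction) {
    switch (hint.op) {
      case ReducOp::Plus: return 2;
      case ReducOp::Max:  return 5;
      case ReducOp::None: break;
    }
  }
  return 1;
}

// Record 'count' statements of the given kind at the given location and
// return the cost that was added.
unsigned add_stmt_cost(CostData &data, unsigned count, Kind kind,
                       const StmtHint &hint, Where where) {
  unsigned c = count * stmt_cost(kind, hint);
  data.cost[static_cast<std::size_t>(where)] += c;
  return c;
}

struct Totals { unsigned prologue, body, epilogue; };

// Once everything is recorded, report the overall per-location cost.
// A real target could apply whole-loop adjustments at this point.
Totals finish_cost(const CostData &data) {
  return {data.cost[0], data.cost[1], data.cost[2]};
}
```

With these made-up numbers, recording four plain vector statements in the
body and one MAX reduction in the epilogue yields totals of 0, 4 and 5 for
prologue, body and epilogue respectively.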