Hi!

> -----Original Message-----
> From: Richard Biener [mailto:[email protected]]
> Sent: Tuesday, March 24, 2020 10:14 PM
> To: Yangfei (Felix) <[email protected]>
> Cc: [email protected]
> Subject: Re: [RFC] Should widening_mul should consider block frequency?
>
> > > As written in the PR I'd follow fma generation and restrict defs to the
> > > same BB.
> >
> > Thanks for the suggestion. That should be more consistent.
> > Attached please find the adapted patch.
> > Bootstrap and tested on both x86_64 and aarch64 Linux platform.
>
> OK with moving the BB check before the is_widening_mult_p call since it's way
> cheaper.

That's a good point. I have attached the v2 patch.
I also ran SPEC2017 on aarch64 and saw no obvious impact from this change.
Could you sponsor this patch, please? My network connection is unreliable and
I am having trouble pushing it :-(

git commit msg:

widening_mul: restrict ops to be defined in the same basic-block when
converting plusminus to widen

In the testcase for PR94269, widening_mul moves two multiply instructions
from outside the loop to inside the loop, merging each with a separate add
instruction. This increases the cost of the loop. Like FMA detection in
the same pass, simply restrict ops to be defined in the same basic-block
to avoid possibly moving a multiply to a different block with a higher
execution frequency.

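
For readers without the PR at hand, the shape of code described above can be sketched roughly as follows. This is a hypothetical illustration, not the gcc.dg/pr94269.c testcase: the function name, parameters and loop body are invented here, and whether the pass actually combines these statements depends on the target and the options used.

```c
#include <stdint.h>

/* Hypothetical example, not the testcase from the PR.  */
int64_t
sum_with_offsets (const int64_t *x, int n,
                  int32_t a, int32_t b, int32_t c, int32_t d)
{
  /* Two widening multiplies, each executed once before the loop.  */
  int64_t m1 = (int64_t) a * b;
  int64_t m2 = (int64_t) c * d;
  int64_t acc = 0;

  for (int i = 0; i < n; i++)
    {
      /* Each plus/minus uses a multiply result defined in another
         basic block.  Fusing them into widening multiply-add and
         multiply-subtract operations would re-execute the multiply
         on every iteration.  */
      int64_t t1 = x[i] + m1;
      int64_t t2 = x[i] - m2;
      acc ^= t1 & t2;
    }

  return acc;
}
```

With the same-basic-block restriction, the plus/minus statements in the loop body would not be candidates for the conversion, because m1 and m2 are defined in the block before the loop.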
2020-03-26 Felix Yang <[email protected]>
gcc/
PR tree-optimization/94269
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Restrict this
operation to single basic block.
gcc/testsuite/
PR tree-optimization/94269
* gcc.dg/pr94269.c: New test.
change log:
gcc:
+2020-03-26 Felix Yang <[email protected]>
+
+ PR tree-optimization/94269
+ * tree-ssa-math-opts.c (convert_plusminus_to_widen): Restrict this
+ operation to single basic block.
gcc/testsuite:
+2020-03-26 Felix Yang <[email protected]>
+
+ PR tree-optimization/94269
+ * gcc.dg/pr94269.c: New test.
+
Attachment: pr94269-v2.patch
