On 11 November 2015 at 16:03, Richard Biener <rguent...@suse.de> wrote:
> On Wed, 11 Nov 2015, Prathamesh Kulkarni wrote:
>
>> On 10 November 2015 at 20:11, Richard Biener <rguent...@suse.de> wrote:
>> > On Mon, 9 Nov 2015, Prathamesh Kulkarni wrote:
>> >
>> >> On 4 November 2015 at 20:35, Richard Biener <rguent...@suse.de> wrote:
>> >> >
>> >> > Btw, did you investigate code gen differences on x86_64/i586?  That
>> >> > target expands all divisions/modulo ops via divmod, relying on CSE
>> >> > solely as the HW always computes both div and mod (IIRC).
>> >> x86_64 has optab_handler for divmod defined, so the transform won't
>> >> take place on x86.
>> >
>> > Ok.
>> >
>> >> > +
>> >> > +        gassign *assign_stmt = gimple_build_assign (gimple_assign_lhs
>> >> > (use_stmt), rhs);
>> >> > +        gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
>> >> >
>> >> > Ick.  Please use
>> >> >
>> >> >     gimple_set_rhs_from_tree (use_stmt, res);
>> >> Um there doesn't seem to be gimple_set_rhs_from_tree.
>> >> I used gimple_assign_set_rhs_from_tree which requires gsi for use_stmt.
>> >> Is that OK ?
>> >
>> > Yes.
>> >
>> >> >     update_stmt (use_stmt);
>> >> >     if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
>> >> >       cfg_changed = true;
>> >> >
>> >> > +  free_dominance_info (CDI_DOMINATORS);
>> >> >
>> >> > do not free dominators.
>> >>
>> >> I have done the suggested changes in the attached patch.
>> >> I have a few questions:
>> >>
>> >> a) Does the change to insert DIVMOD call before topmost div or mod
>> >> stmt with matching operands
>> >> look correct ?
>> >
>> > +  /* Insert call-stmt just before the topmost div/mod stmt.
>> > +     top_bb dominates all other basic blocks containing div/mod stms
>> > +     so, the topmost stmt would be the first div/mod stmt with matching
>> > operands
>> > +     in top_bb.  */
>> > +
>> > +  gcc_assert (top_bb != 0);
>> > +  gimple_stmt_iterator gsi;
>> > +  for (gsi = gsi_after_labels (top_bb); !gsi_end_p (gsi); gsi_next
>> > (&gsi))
>> > +    {
>> > +      gimple *g = gsi_stmt (gsi);
>> > +      if (is_gimple_assign (g)
>> > +         && (gimple_assign_rhs_code (g) == TRUNC_DIV_EXPR
>> > +            || gimple_assign_rhs_code (g) == TRUNC_MOD_EXPR)
>> > +         && operand_equal_p (op1, gimple_assign_rhs1 (g), 0)
>> > +         && operand_equal_p (op2, gimple_assign_rhs2 (g), 0))
>> > +       break;
>> >
>> > Looks overly complicated to me.  Just remember "topmost" use_stmt
>> > alongside top_bb (looks like you'll no longer need top_bb if you
>> > retail top_stmt).  And then do
>> >
>> >    gsi = gsi_for_stmt (top_stmt);
>> >
>> > and insert before that.
>> Thanks, done in this patch. Does it look OK ?
>> IIUC gimple_uid (stmt1) < gimple_uid (stmt2) can be used to check if
>> stmt1 occurs before stmt2
>> only if stmt1 and stmt2 are in the same basic block ?
>> >
>> >> b) Handling constants - I dropped handling constants in the attached
>> >> patch. IIUC we don't want
>> >> to enable this transform if there's a specialized expansion for some
>> >> constants for div or mod ?
>> >
>> > See expand_divmod which has lots of special cases for constant operands
>> > not requiring target support for div or mod.
>> Thanks, would it be OK if I do this in follow up patch ?
>
> Well, just not handle them like in your patch is fine.
>
>> >
>> >> I suppose this would also be target dependent and require a target hook ?
>> >> For instance arm defines modsi3 pattern to expand mod when 2nd operand
>> >> is constant and <= 0 or power of 2,
>> >> while for other cases goes the expand_divmod() route to generate call
>> >> to __aeabi_idivmod libcall.
>> >
>> > Ok, so it lacks a signed mod instruction.
>> >
>> >> c) Gating the divmod transform -
>> >> I tried gating it on checks for optab_handlers on div and mod, however
>> >> this doesn't enable transform for arm cortex-a9
>> >> anymore (cortex-a9 doesn't have hardware instructions for integer div and 
>> >> mod).
>> >> IIUC for cortex-a9,
>> >> optab_handler (sdivmod_optab, SImode) returns CODE_FOR_nothing because
>> >> HAVE_divsi3 is 0.
>> >> However optab_handler (smod_optab, SImode) matches since optab_handler
>> >> only checks for existence of pattern
>> >> (and not whether the pattern gets matched).
>> >> I suppose we should enable the transform only if the divmod, div, and
>> >> mod pattern do not match rather than checking
>> >> if the patterns exist via optab_handler ? For a general x % y, modsi3
>> >> would fail to match but optab_handler(smod_optab, SImode ) still
>> >> says it's matched.
>> >
>> > Ah, of course.  Querying for an optab handler is just a cheap
>> > guesstimate...  Not sure how to circumvent this best (sub-target
>> > enablement of patterns).  RTL expansion just goes ahead (of course)
>> > and sees if expansion eventually fails.  Richard?
>> >
>> >> Should we define a new target hook combine_divmod, which returns true
>> >> if transforming to divmod is desirable for that
>> >> target ?
>> >> The default definition could be:
>> >> bool default_combine_divmod (enum machine_mode mode, tree op1, tree op2)
>> >> {
>> >>   // check for optab_handlers for div/mod/divmod and libfunc for divmod
>> >> }
>> >>
>> >> And for arm, it could be over-ridden to return false if op2 is
>> >> constant and <= 0 or power of 2.
>> >> I am not really sure if this is a good idea since I am replicating
>> >> information from modsi3 pattern.
>> >> Any change to the pattern may require corresponding change to the hook :/
>> >
>> > Yeah, I don't think that is desirable.  Ideally we'd have a way
>> > to query HAVE_* for CODE_FOR_* which would mean target-insns.def
>> > support for all div/mod/divmod patterns(?) and queries...
>> >
>> > Not sure if what would be enough though.
>> >
>> > Note that the divmod check is equally flawed.
>> >
>> > I think with the above I'd enable the transform when
>> >
>> > +  if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
>> > +      || (optab_libfunc (divmod_optab, mode) != NULL_RTX
>> >            && optab_handler ([su]div_optab, mode) == CODE_FOR_nothing))
>> > +    return false;
>> Um this fails for the arm backend (for cortex-a9) because
>> optab_handler (divmod_optab, mode) != CODE_FOR_nothing is false
>> optab_libfunc (divmod_optab, mode) != NULL_RTX is true.
>> optab_handler (div_optab, mode) == CODE_FOR_nothing is true.
>> which comes down to false || (true && true) which is true and we hit
>> return false.
>
> Oh, sorry to mess up the test - it was supposed to be inverted.
>
>> AFAIU, we want the transform to be disabled if:
>> a) optab_handler exists for divmod.
>> b) optab_handler exists for div.
>> c) optab_libfunc does not exist for divmod.  */
>>
>> +  if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
>> +      || optab_handler (div_optab, mode) != CODE_FOR_nothing
>> +      || optab_libfunc (divmod_optab, mode) == NULL_RTX)
>> +    return false;
>> Does that look correct ?
>
> No.  That will disable if we have a divmod optab.  Instead try
>
>  if (! (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
>         || (optab_libfunc (divmod_optab, mode) != NULL_RTX
>             && optab_handler ([su]div_optab, mode) == CODE_FOR_nothing)))
>    return false;
>
> which is what I intended.  If we have a divmod optab go ahead.
> If we have a libfunc and not a div optab then as well.
Oops, I assumed that we only wanted this transform if optab_libfunc existed.
Modified the test in the attached patch.
Well this does affect x86_64 and i?86 -;)

I added the following hunk back to expand_DIVMOD, since if optab_handler exists
we want to use it for expansion. Does it look OK ?

+  /* Check if optab handler exists for udivmod/sdivmod.  */
+  if (optab_handler (tab, mode) != CODE_FOR_nothing)
+    {
+      rtx quotient = gen_reg_rtx (mode);
+      rtx remainder = gen_reg_rtx (mode);
+      expand_twoval_binop (tab, op0, op1, quotient, remainder,
TYPE_UNSIGNED (type));
+
+      /* Wrap the return value (quotient, remaineder) within COMPLEX_EXPR */
+      expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs),
+                  make_tree (TREE_TYPE (arg0), quotient),
+                  make_tree (TREE_TYPE (arg1), remainder)),
+                  target, VOIDmode, EXPAND_NORMAL);
+
+      return;
+    }

I verified the code generated for x86_64 and i?86 and it's same for my
test-cases.
However during a clean build of gcc for x86_64, I am getting segfault
in bid64_div.c:
In file included from
/home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/libgcc/config/libbid/bid_internal.h:27:0,
                 from
/home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/libgcc/config/libbid/bid64_div.c:56:
/home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/libgcc/config/libbid/bid64_div.c:
In function ‘__bid64_div’:
/home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/libgcc/config/libbid/bid_conf.h:36:19:
internal compiler error: in expand_DIVMOD, at internal-fn.c:2099
 #define bid64_div __bid64_div
                   ^
/home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/libgcc/config/libbid/bid64_div.c:80:1:
note: in expansion of macro ‘bid64_div’
 bid64_div (UINT64 x,
 ^
0x8e101f expand_DIVMOD
        /home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/gcc/internal-fn.c:2099
0x705927 expand_call_stmt
        /home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/gcc/cfgexpand.c:2549
0x705927 expand_gimple_stmt_1
        /home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/gcc/cfgexpand.c:3509
0x705927 expand_gimple_stmt
        /home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/gcc/cfgexpand.c:3672
0x708d65 expand_gimple_basic_block
        /home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/gcc/cfgexpand.c:5676
0x70e696 execute
        /home/bilbo/gnu-toolchain/src/gcc.git~tcwg-72/gcc/cfgexpand.c:6288

It looks like in the following code in expand_DIVMOD:

+  rtx quotient = simplify_gen_subreg (mode, libval, libval_mode, 0);
+  rtx remainder = simplify_gen_subreg (mode, libval, libval_mode,
+                                      GET_MODE_SIZE (mode));

remainder is (nil) and hence the segfault in make_tree (I added the
asserts for quotient and remainder later).
I am not sure why it's happening  though, investigating it.

Thanks,
Prathamesh
>
>> >
>> > so we either will have a divmod instruction (hopefully not sub-target
>> > disabled for us) or a libfunc for divmod and for sure no HW divide
>> > instruction (HW mod can be emulated by HW divide but not the other
>> > way around).
>> >
>> >> d) Adding effective-target-check for divmod: I just enabled it for
>> >> arm*-*-* for now. I could additionally append more targets,
>> >> not sure if this is the right approach.
>> >
>> > Looks good to me.
>> Is this version OK if bootstrap/testing passes ?
>
> Ok with adjusting the optab check like above.
>
> Thanks,
> Richard.
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index a7da373..bc0b410 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2045,6 +2045,66 @@ expand_GOACC_LOOP (gcall *stmt ATTRIBUTE_UNUSED)
   gcc_unreachable ();
 }
 
+/* Expand DIVMOD() using:
+ a) optab handler for udivmod/sdivmod if it is available.
+ b) If optab_handler doesn't exist, Generate call to
+    optab_libfunc for udivmod/sdivmod.  */
+
+static void
+expand_DIVMOD (gcall *stmt)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree arg0 = gimple_call_arg (stmt, 0);
+  tree arg1 = gimple_call_arg (stmt, 1);
+  
+  gcc_assert (TREE_CODE (TREE_TYPE (lhs)) == COMPLEX_TYPE); 
+  tree type = TREE_TYPE (TREE_TYPE (lhs));
+  machine_mode mode = TYPE_MODE (type);
+  optab tab = (TYPE_UNSIGNED (type)) ? udivmod_optab : sdivmod_optab;
+
+  rtx op0 = expand_normal (arg0);
+  rtx op1 = expand_normal (arg1);
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+  /* Check if optab handler exists for udivmod/sdivmod.  */
+  if (optab_handler (tab, mode) != CODE_FOR_nothing)
+    {
+      rtx quotient = gen_reg_rtx (mode);
+      rtx remainder = gen_reg_rtx (mode);
+      expand_twoval_binop (tab, op0, op1, quotient, remainder, TYPE_UNSIGNED 
(type));
+  
+      /* Wrap the return value (quotient, remaineder) within COMPLEX_EXPR */
+      expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs),
+                  make_tree (TREE_TYPE (arg0), quotient),
+                  make_tree (TREE_TYPE (arg1), remainder)),
+                  target, VOIDmode, EXPAND_NORMAL);
+
+      return;
+    }
+
+  rtx libfunc = optab_libfunc (tab, mode);
+  gcc_assert (libfunc != NULL_RTX);
+  
+  machine_mode libval_mode =
+    smallest_mode_for_size (2 * GET_MODE_BITSIZE (mode), MODE_INT);
+
+  rtx libval = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
+                                       libval_mode, 2, op0, GET_MODE (op0), 
op1, GET_MODE (op1)); 
+
+  rtx quotient = simplify_gen_subreg (mode, libval, libval_mode, 0);
+  rtx remainder = simplify_gen_subreg (mode, libval, libval_mode,
+                                      GET_MODE_SIZE (mode));
+
+  gcc_assert (quotient != NULL_RTX);
+  gcc_assert (remainder != NULL_RTX);
+  
+  /* Wrap the return value (quotient, remaineder) within COMPLEX_EXPR */
+  expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs),
+                      make_tree (TREE_TYPE (arg0), quotient),
+                      make_tree (TREE_TYPE (arg1), remainder)),
+              target, VOIDmode, EXPAND_NORMAL);
+}
+
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 78266d9..f28579b 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -83,3 +83,5 @@ DEF_INTERNAL_FN (GOACC_DIM_POS, ECF_PURE | ECF_NOTHROW | 
ECF_LEAF, ".")
 
 /* OpenACC looping abstraction.  See internal-fn.h for usage.  */
 DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL)
+
+DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL)
diff --git a/gcc/testsuite/gcc.dg/pr43721-1.c b/gcc/testsuite/gcc.dg/pr43721-1.c
new file mode 100644
index 0000000..2ec1118
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr43721-1.c
@@ -0,0 +1,11 @@
+/* { dg-options "-O2 -fdump-tree-widening_mul" } */
+/* { dg-require-effective-target divmod } */
+
+int f(int x, int y)
+{
+  int quotient = x / y;
+  int remainder = x % y;
+  return quotient + remainder;
+}
+
+/* { dg-final { scan-tree-dump-times "DIVMOD" 1 "widening_mul" } } */
diff --git a/gcc/testsuite/gcc.dg/pr43721-2.c b/gcc/testsuite/gcc.dg/pr43721-2.c
new file mode 100644
index 0000000..974099c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr43721-2.c
@@ -0,0 +1,17 @@
+/* { dg-options "-O2 -fdump-tree-widening_mul" } */
+/* { dg-require-effective-target divmod } */
+
+int f(int x, int y)
+{
+  extern int early_exit;
+
+  int quot = x / y;
+
+  if (early_exit)
+    return 0;
+
+  int rem = x % y;
+  return quot + rem;
+}
+
+/* { dg-final { scan-tree-dump-times "DIVMOD" 1 "widening_mul" } } */
diff --git a/gcc/testsuite/gcc.dg/pr43721-3.c b/gcc/testsuite/gcc.dg/pr43721-3.c
new file mode 100644
index 0000000..41cda2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr43721-3.c
@@ -0,0 +1,18 @@
+/* { dg-options "-O2 -fdump-tree-widening_mul" } */
+/* { dg-require-effective-target divmod } */
+
+int f(int x, int y)
+{
+  extern int flag;
+  int quot;
+
+  if (flag)
+    quot = x / y;
+  else
+    quot = 0;
+
+  int rem = x % y;
+  return quot + rem; 
+}
+
+/* { dg-final { scan-tree-dump-times "DIVMOD" 0 "widening_mul" } } */
diff --git a/gcc/testsuite/gcc.dg/pr43721-4.c b/gcc/testsuite/gcc.dg/pr43721-4.c
new file mode 100644
index 0000000..b4f69b0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr43721-4.c
@@ -0,0 +1,19 @@
+/* { dg-options "-O2 -fdump-tree-widening_mul" } */
+/* { dg-require-effective-target divmod } */
+
+int f(int x, int y)
+{
+  int quot = 0;
+  int rem = 0;
+
+  extern int flag;
+
+  if (flag)
+    quot = x / y;
+  else
+    rem = x % y;
+
+  return quot + rem;
+}
+
+/* { dg-final { scan-tree-dump-times "DIVMOD" 0 "widening_mul" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index b543519..ffb7ca5 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6494,3 +6494,14 @@ proc check_effective_target_vect_max_reduc { } {
     }
     return 0
 }
+
+# Return 1 if the target supports divmod
+
+proc check_effective_target_divmod { } {
+    if { [istarget arm*-*-*]
+        || [istarget x86_64-*-*] 
+        || [istarget i?86-*-*] } {
+       return 1
+    }
+    return 0
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 41fcabf..c7ccf3d 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -110,6 +110,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa.h"
 #include "builtins.h"
 #include "params.h"
+#include "optabs-libfuncs.h"
+#include "tree-eh.h"
 
 /* This structure represents one basic block that either computes a
    division, or is a common dominator for basic block that compute a
@@ -182,6 +184,9 @@ static struct
 
   /* Number of fp fused multiply-add ops inserted.  */
   int fmas_inserted;
+
+  /* Number of divmod calls inserted.  */
+  int divmod_calls_inserted;
 } widen_mul_stats;
 
 /* The instance of "struct occurrence" representing the highest
@@ -3493,6 +3498,160 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree 
op2)
   return true;
 }
 
+/* Set top_stmt to topmost stmt between top_stmt and use_stmt, and add
+   use_stmt to vector use_stmts provided basic block containing top_stmt
+   dominates use_stmt or vice versa.  */
+   
+static void 
+maybe_record_divmod (vec<gimple *>& stmts, gimple *&top_stmt, gimple *use_stmt)
+{
+  basic_block bb = gimple_bb (use_stmt);
+  basic_block top_bb = gimple_bb (top_stmt);
+
+  if (dominated_by_p (CDI_DOMINATORS, top_bb, bb)) 
+    {
+      if (bb != top_bb
+         || gimple_uid (use_stmt) < gimple_uid (top_stmt))
+       top_stmt = use_stmt;
+    }  
+  else if (!dominated_by_p (CDI_DOMINATORS, bb, top_bb)) 
+    return;
+
+  stmts.safe_push (use_stmt);
+}  
+
+/* Check if the stmt is a candidate for divmod transform.  */
+
+static bool
+divmod_candidate_p (gassign *stmt)
+{
+  enum machine_mode mode = TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt)));
+  const_tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  optab divmod_optab, div_optab;
+
+  if (TYPE_UNSIGNED (type))
+    {
+      divmod_optab = udivmod_optab;
+      div_optab = udiv_optab;
+    }
+  else
+    {
+      divmod_optab = sdivmod_optab;
+      div_optab = sdiv_optab;
+    }
+
+  /* Enable the transform if:
+     a) optab_handler exists for divmod or
+     b) optab_libfunc (divmod) exists and optab_handler(div) doesn't.  */
+       
+  if (! (optab_handler (divmod_optab, mode) != CODE_FOR_nothing
+        || (optab_libfunc (divmod_optab, mode) != NULL_RTX
+            && optab_handler (div_optab, mode) == CODE_FOR_nothing)))
+   return false;
+
+  tree op1 = gimple_assign_rhs1 (stmt);
+  tree op2 = gimple_assign_rhs2 (stmt);
+
+  if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2))
+    return false;
+
+  if (TYPE_OVERFLOW_TRAPS (type))
+    return false;
+
+  return true;
+} 
+
+/* This function looks for:
+   t1 = a TRUNC_DIV_EXPR b;
+   t2 = a TRUNC_MOD_EXPR b;
+   and transforms it to the following sequence:
+   complex_tmp = DIVMOD (a, b);
+   t1 = REALPART_EXPR(a);
+   t2 = IMAGPART_EXPR(b);
+   This change is done only if the target has support for divmod.
+
+   The pass works in two phases:
+   1) Walk through all immediate uses of stmt's operand and find a
+      TRUNC_DIV_EXPR with matching operands and if such a stmt is found add
+      it to stmts vector.
+   2) Insert DIVMOD call before first div/mod stmt in top_bb (basic block that
+      dominates other div/mod stmts with same operands) and update entries in
+      stmts vector to use return value of DIMOVD (REALEXPR_PART for div,
+      IMAGPART_EXPR for mod).  */
+
+static bool
+convert_to_divmod (gassign *stmt)
+{
+  if (!divmod_candidate_p (stmt))
+    return false;
+
+  tree op1 = gimple_assign_rhs1 (stmt);
+  tree op2 = gimple_assign_rhs2 (stmt);
+
+  vec<gimple *> stmts = vNULL;
+  stmts.safe_push (stmt);
+
+  imm_use_iterator use_iter;
+  gimple *use_stmt; 
+  gimple *top_stmt = stmt;
+ 
+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1)
+    {
+      if (is_gimple_assign (use_stmt)
+          && gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR
+         && operand_equal_p (op1, gimple_assign_rhs1 (use_stmt), 0)
+         && operand_equal_p (op2, gimple_assign_rhs2 (use_stmt), 0))
+       maybe_record_divmod (stmts, top_stmt, use_stmt);
+    }
+  
+  if (stmts.length () == 1)
+    return false;
+
+  /* Create the library call and insert the call stmt before top_stmt.  */
+  gcall *call_stmt = gimple_build_call_internal (IFN_DIVMOD, 2, op1, op2);
+  tree res = make_temp_ssa_name (
+               build_complex_type (TREE_TYPE (gimple_assign_lhs (stmt))),
+               call_stmt, "divmod_tmp");
+
+  gimple_call_set_lhs (call_stmt, res);
+  gimple_stmt_iterator top_stmt_gsi = gsi_for_stmt (top_stmt);
+  gsi_insert_before (&top_stmt_gsi, call_stmt, GSI_SAME_STMT);
+
+  widen_mul_stats.divmod_calls_inserted++; 
+
+  /* Update stmts. */
+  bool cfg_changed = false;
+  for (unsigned i = 0; i < stmts.length (); ++i) 
+    {
+      tree rhs;
+      use_stmt = stmts[i];
+ 
+      switch (gimple_assign_rhs_code (use_stmt))
+       {
+         case TRUNC_DIV_EXPR:
+           rhs = fold_build1 (REALPART_EXPR, TREE_TYPE (op1), res);
+           break;
+
+         case TRUNC_MOD_EXPR:
+           rhs = fold_build1 (IMAGPART_EXPR, TREE_TYPE (op1), res);
+           break;
+
+         default:
+           gcc_unreachable (); 
+       }
+
+      gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
+      gimple_assign_set_rhs_from_tree (&gsi, rhs);
+      update_stmt (use_stmt);
+
+      if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))
+       cfg_changed = true;
+    }
+
+  stmts.release ();
+  return cfg_changed;
+}
+
 /* Find integer multiplications where the operands are extended from
    smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
    where appropriate.  */
@@ -3535,7 +3694,10 @@ pass_optimize_widening_mul::execute (function *fun)
   basic_block bb;
   bool cfg_changed = false;
 
+  calculate_dominance_info (CDI_DOMINATORS);
+
   memset (&widen_mul_stats, 0, sizeof (widen_mul_stats));
+  renumber_gimple_stmt_uids ();
 
   FOR_EACH_BB_FN (bb, fun)
     {
@@ -3563,6 +3725,10 @@ pass_optimize_widening_mul::execute (function *fun)
                    }
                  break;
 
+               case TRUNC_MOD_EXPR:
+                 convert_to_divmod (as_a<gassign *> (stmt));
+                 break;
+
                case PLUS_EXPR:
                case MINUS_EXPR:
                  convert_plusminus_to_widen (&gsi, stmt, code);
@@ -3614,6 +3780,8 @@ pass_optimize_widening_mul::execute (function *fun)
                            widen_mul_stats.maccs_inserted);
   statistics_counter_event (fun, "fused multiply-adds inserted",
                            widen_mul_stats.fmas_inserted);
+  statistics_counter_event (fun, "divmod calls inserted",
+                           widen_mul_stats.divmod_calls_inserted);
 
   return cfg_changed ? TODO_cleanup_cfg : 0;
 }

Reply via email to