Re: [PATCH] tree-optimization/106365 - DSE of LEN_STORE and MASK_STORE

2022-07-30 Thread Jeff Law via Gcc-patches




On 7/21/2022 3:05 AM, Richard Biener via Gcc-patches wrote:

> The following enhances DSE to handle LEN_STORE (optimally) and
> MASK_STORE (conservatively).
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> Kewen is testing on powerpc.  Handling MASK_STORE_LANES in
> a similar way to MASK_STORE is probably possible but I couldn't
> figure a way to generate one for testing.  STORE_LANES is
> probably handled already since it's ECF_CONST.
>
> PR tree-optimization/106365
> * tree-ssa-dse.cc (initialize_ao_ref_for_dse): Handle
> LEN_STORE, add mode to initialize a may-def and handle
> MASK_STORE that way.
> (dse_optimize_stmt): Query may-defs.  Handle internal
> functions LEN_STORE and MASK_STORE similar to how
> we handle memory builtins but without byte tracking.

LGTM.  Obviously if you can massage the PR into a testcase, it's even 
better :-)


jeff



[PATCH] tree-optimization/106365 - DSE of LEN_STORE and MASK_STORE

2022-07-21 Thread Richard Biener via Gcc-patches
The following enhances DSE to handle LEN_STORE (optimally) and
MASK_STORE (conservatively).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
Kewen is testing on powerpc.  Handling MASK_STORE_LANES in
a similar way to MASK_STORE is probably possible but I couldn't
figure a way to generate one for testing.  STORE_LANES is
probably handled already since it's ECF_CONST.
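
To illustrate the "optimally" versus "conservatively" distinction above, here
is a minimal standalone sketch. It is not GCC code: Extent, Store,
make_len_store, make_mask_store, covers and killed_by are hypothetical names
made up for the illustration. It assumes a LEN_STORE extent is exactly
len minus bias bytes (as the patch computes it) and that a MASK_STORE can only
be bounded by the full vector size, so it has a may-def extent but no must-def
extent.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical toy model of a byte range [begin, begin + size).
struct Extent {
  std::int64_t begin;
  std::int64_t size;
};

// A store described by what it may write and, if known, what it must write.
struct Store {
  Extent may_def;                  // upper bound on the bytes written
  std::optional<Extent> must_def;  // bytes certainly written, if known
};

// Assumption: a LEN_STORE writes exactly (len - bias) bytes, so the
// may-def and must-def extents coincide.
Store make_len_store(std::int64_t begin, std::int64_t len, std::int64_t bias) {
  Extent e{begin, len - bias};
  return {e, e};
}

// Assumption: a MASK_STORE writes some mask-dependent subset of the vector,
// so only the may-def extent (the whole vector) is known.
Store make_mask_store(std::int64_t begin, std::int64_t vector_bytes) {
  return {Extent{begin, vector_bytes}, std::nullopt};
}

// True if OUTER contains every byte of INNER.
bool covers(const Extent& outer, const Extent& inner) {
  return outer.begin <= inner.begin &&
         inner.begin + inner.size <= outer.begin + outer.size;
}

// EARLIER is dead with respect to LATER if LATER certainly overwrites
// everything EARLIER might have written (intervening reads are ignored here).
bool killed_by(const Store& earlier, const Store& later) {
  return later.must_def.has_value() && covers(*later.must_def, earlier.may_def);
}
```

In this model a MASK_STORE can be removed when a later full-width store
follows it, but a MASK_STORE never kills an earlier store, which is the
conservative direction.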

PR tree-optimization/106365
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Handle
LEN_STORE, add mode to initialize a may-def and handle
MASK_STORE that way.
(dse_optimize_stmt): Query may-defs.  Handle internal
functions LEN_STORE and MASK_STORE similar to how
we handle memory builtins but without byte tracking.
---
 gcc/tree-ssa-dse.cc | 55 +
 1 file changed, 51 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
index 8d1739a4510..34cfd1a8802 100644
--- a/gcc/tree-ssa-dse.cc
+++ b/gcc/tree-ssa-dse.cc
@@ -93,7 +93,9 @@ static bitmap need_eh_cleanup;
 static bitmap need_ab_cleanup;
 
 /* STMT is a statement that may write into memory.  Analyze it and
-   initialize WRITE to describe how STMT affects memory.
+   initialize WRITE to describe how STMT affects memory.  When
+   MAY_DEF_OK is true then the function initializes WRITE to what
+   the stmt may define.
 
    Return TRUE if the statement was analyzed, FALSE otherwise.
 
@@ -101,7 +103,7 @@ static bitmap need_ab_cleanup;
    can be achieved by analyzing more statements.  */
 
 static bool
-initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
+initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write, bool may_def_ok = false)
 {
   /* It's advantageous to handle certain mem* functions.  */
   if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
@@ -146,6 +148,32 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
           break;
         }
     }
+  else if (is_gimple_call (stmt)
+           && gimple_call_internal_p (stmt))
+    {
+      switch (gimple_call_internal_fn (stmt))
+        {
+        case IFN_LEN_STORE:
+          ao_ref_init_from_ptr_and_size
+              (write, gimple_call_arg (stmt, 0),
+               int_const_binop (MINUS_EXPR,
+                                gimple_call_arg (stmt, 2),
+                                gimple_call_arg (stmt, 4)));
+          return true;
+        case IFN_MASK_STORE:
+          /* We cannot initialize a must-def ao_ref (in all cases) but we
+             can provide a may-def variant.  */
+          if (may_def_ok)
+            {
+              ao_ref_init_from_ptr_and_size
+                  (write, gimple_call_arg (stmt, 0),
+                   TYPE_SIZE_UNIT (TREE_TYPE (gimple_call_arg (stmt, 2))));
+              return true;
+            }
+          break;
+        default:;
+        }
+    }
   else if (tree lhs = gimple_get_lhs (stmt))
     {
       if (TREE_CODE (lhs) != SSA_NAME)
@@ -1328,8 +1356,10 @@ dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes)
 
   ao_ref ref;
   /* If this is not a store we can still remove dead call using
-     modref summary.  */
-  if (!initialize_ao_ref_for_dse (stmt, &ref))
+     modref summary.  Note we specifically allow ref to be initialized
+     to a conservative may-def since we are looking for followup stores
+     to kill all of it.  */
+  if (!initialize_ao_ref_for_dse (stmt, &ref, true))
     {
       dse_optimize_call (gsi, live_bytes);
       return;
@@ -1398,6 +1428,23 @@ dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes)
           return;
         }
     }
+  else if (is_gimple_call (stmt)
+           && gimple_call_internal_p (stmt))
+    {
+      switch (gimple_call_internal_fn (stmt))
+        {
+        case IFN_LEN_STORE:
+        case IFN_MASK_STORE:
+          {
+            enum dse_store_status store_status;
+            store_status = dse_classify_store (&ref, stmt, false, live_bytes);
+            if (store_status == DSE_STORE_DEAD)
+              delete_dead_or_redundant_call (gsi, "dead");
+            return;
+          }
+        default:;
+        }
+    }
 
   bool by_clobber_p = false;
 
-- 
2.35.3