On 12/16/2016 06:57 AM, Richard Biener wrote:

Apart from what Trevor says about using sbitmaps (try to avoid the initial
zeroing please) and the missed freeing (you can use auto_[s]bitmap?)
some comments below.
In progress. We'll need a few routines for sbitmaps that don't currently exist, but nothing that should be hard to implement correctly.

+
+static void
+trim_complex_store (bitmap orig, bitmap live, gimple *stmt)
+{
+  bitmap dead = BITMAP_ALLOC (NULL);
+  bitmap_and_compl (dead, orig, live);
+
+  /* So we have to verify that either the real or imag part as a whole
+     is dead.  They are always the same size.  Thus for one to be dead
+     the number of live bytes would have to be the same as the number of
+     dead bytes.  */
+  if (bitmap_count_bits (live) == bitmap_count_bits (dead))

popcount is expensive, so is this really a short-cut?
It's less about being a short cut and more about realizing both halves are the same size. Thus to be trimmable the number of live bytes must exactly equal the number of dead bytes. This works regardless of the size of the underlying components of the complex type.



+    {
+      /* Hopefully we have live bits that look like 0xff00 or 0xff
(assuming
+        8 bytes for the underlying real/imag part).  If so we can optimize
+        this case.  */
+      if (bitmap_first_set_bit (live) == 0
+         && bitmap_first_set_bit (dead) == bitmap_count_bits (live))

Hmm, but does that properly handle byte-wise reads from it?  Like
reading the realpart plus the last byte from the imagpart?
It should. We've already verified that exactly half of the object is live and half is dead. So we just need to make sure that the live part is contiguous at the start or end of the object.

Again, written this way we don't have to care about the size of the components of the COMPLEX_EXPR.

You're right in that if we keep the test in this form we should avoid the redundant popcount calls.

+
+/* STMT is a memory write where one or more bytes written are dead
+   stores.  ORIG is the bitmap of bytes stored by STMT.  LIVE is the
+   bitmap of stores that are actually live.
+
+   Attempt to rewrite STMT so that it writes fewer memory locations.  Right
+   now we only support trimming at the start or end of the memory region.
+   It's not clear how much there is to be gained by trimming from the
middle
+   of the region.  */
+
+static void
+trim_partially_dead_store (bitmap orig, bitmap live, gimple *stmt)
+{
+  if (is_gimple_assign (stmt))
+    {
+      switch (gimple_assign_rhs_code (stmt))
+       {
+       case COMPLEX_CST:
+         trim_complex_store (orig, live, stmt);

so patch #1 only handles stores from COMPLEX_CST?  Ok...
Patch #1 handles detecting of any aggregate store that is made fully dead by subsequent partial stores. It also handles trimming of a partially dead complex constant initialization.

+  if (valid_ao_ref_for_dse (ref)
+      && (ref->max_size / BITS_PER_UNIT
+         <= PARAM_VALUE (PARAM_DSE_MAX_OBJECT_SIZE)))
+    {
+      live_bytes = BITMAP_ALLOC (NULL);
+      orig_live_bytes = BITMAP_ALLOC (NULL);
+      bitmap_set_range (live_bytes,
+                       ref->offset / BITS_PER_UNIT,
+                       ref->max_size / BITS_PER_UNIT);
+      bitmap_copy (orig_live_bytes, live_bytes);

So I'd use a once-per-pass allocated sbitmap here.  I don't see why you need
the orig_live_bytes bitmap though (just keep that implicitely by the known
range?)
That would require resizing the bitmap or biasing. While we have a consistent maximum size, ref->offset varies. And that's probably the one case that sparse bitmaps handle better than simple bitmaps.

So I guess we'll have to bias everything. It's not particularly hard, but it can be error prone.


+    }
+
   /* Find the first dominated statement that clobbers (part of) the
      memory stmt stores to with no intermediate statement that may use
      part of the memory stmt stores.  That is, find a store that may
@@ -164,7 +341,15 @@ dse_possible_dead_store_p (ao_ref *ref, gimple *stmt,
gimple **use_stmt)
        }

       if (fail)
-       return false;
+       {
+         /* STMT might be partially dead and we may be able to reduce
+            how many memory locations it stores into.  */
+         if (live_bytes
+             && !bitmap_equal_p (live_bytes, orig_live_bytes)
+             && !gimple_clobber_p (stmt))
+           trim_partially_dead_store (orig_live_bytes, live_bytes, stmt);

The actual transform in dse_possible_dead_store_p looks a bit misplaced.
I see it's somehow convenient but then maybe returning a enum from this
function might be cleaner.  Well, I'm not too torn about this, so maybe
just rename the function a bit (no good suggestion either).
I pondered that as well. I don't have a strong opinion either way. I'll look into returning a tri-state (live, partially dead, dead).


Jeff

Reply via email to