Issue 86966
Summary [DSE] Missed optimization: eliminate `memset` of `alloca` if either `alloca` doesn't escape or `memset` is dead
Labels llvm:optimizations, missed-optimization
Assignees
Reporter XChy
Alive2 proof: https://alive2.llvm.org/ce/z/Dq-Cih

### Motivating example 

```llvm
declare void @use(ptr)
declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

define void @src(i1 %c, i1 %c1, i1 %c2, i64 %n) {
entry:
  %p = alloca [2 x i64], align 8
  call void @llvm.memset.p0.i64(ptr %p, i8 0, i64 16, i1 false)
  br i1 %c, label %then, label %else

then: ; preds = %entry
  br i1 %c2, label %common.ret, label %thread

thread:                                          ; preds = %then
  store i64 %n, ptr %p, align 8
  %8 = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %8, align 8
  br label %escape

else: ; preds = %entry
  store i64 %n, ptr %p, align 8
  %11 = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %11, align 8
  br i1 %c1, label %common.ret, label %escape

common.ret:                                       ; preds = %then, %else, %escape
  ret void

escape: ; preds = %thread, %else
  call void @use(ptr %p)
  br label %common.ret
}
```

`%p` reaches one of two final destinations.
The first is `common.ret`: on every path that reaches `common.ret` without passing through `escape`, `%p` does not escape the function, so the `memset` is dead on those paths.
The second is `escape`: `%p` does escape there through the call to `use`, but the `memset` is still dead because two consecutive `i64` stores overwrite all 16 bytes on every path to `escape`.
Thus the `memset` in the entry block is dead on all paths.
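
For illustration, the result one would expect from DSE is sketched below. This is an assumption based on the reasoning above, not a copy of the target function in the Alive2 link: the name `@tgt` and the value names `%hi.thread` and `%hi.else` are hypothetical, and the only intended difference from `@src` is that the entry-block `memset` is gone.

```llvm
; Hypothetical @tgt: @src with the entry-block memset removed.
declare void @use(ptr)

define void @tgt(i1 %c, i1 %c1, i1 %c2, i64 %n) {
entry:
  %p = alloca [2 x i64], align 8
  br i1 %c, label %then, label %else

then:                                             ; preds = %entry
  ; Falls through to common.ret without %p ever being read or escaping.
  br i1 %c2, label %common.ret, label %thread

thread:                                           ; preds = %then
  ; Both 8-byte halves of %p are written before the call to @use.
  store i64 %n, ptr %p, align 8
  %hi.thread = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %hi.thread, align 8
  br label %escape

else:                                             ; preds = %entry
  ; Same two stores, again covering all 16 bytes before any escape.
  store i64 %n, ptr %p, align 8
  %hi.else = getelementptr i8, ptr %p, i64 8
  store i64 %n, ptr %hi.else, align 8
  br i1 %c1, label %common.ret, label %escape

common.ret:                                       ; preds = %then, %else, %escape
  ret void

escape:                                           ; preds = %thread, %else
  call void @use(ptr %p)
  br label %common.ret
}
```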

I'm sorry that I cannot generalize this further, but one key point is that DSE does eliminate the `memset` when the two consecutive stores are replaced with a single store: https://godbolt.org/z/T45aexqM5
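
The exact IR behind that godbolt link is not reproduced here. As a rough sketch of what a single-store variant could look like, the following assumes one 16-byte `i128` store in place of each pair of `i64` stores. The function name `@src_single_store` and the `i128 %v` parameter are hypothetical, and this particular form has not been checked against the linked example.

```llvm
; Hypothetical single-store variant of @src (an assumption, not the linked IR).
declare void @use(ptr)
declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

define void @src_single_store(i1 %c, i1 %c1, i1 %c2, i128 %v) {
entry:
  %p = alloca [2 x i64], align 8
  call void @llvm.memset.p0.i64(ptr %p, i8 0, i64 16, i1 false)
  br i1 %c, label %then, label %else

then:                                             ; preds = %entry
  br i1 %c2, label %common.ret, label %thread

thread:                                           ; preds = %then
  ; One store overwrites all 16 bytes written by the memset.
  store i128 %v, ptr %p, align 8
  br label %escape

else:                                             ; preds = %entry
  store i128 %v, ptr %p, align 8
  br i1 %c1, label %common.ret, label %escape

common.ret:                                       ; preds = %then, %else, %escape
  ret void

escape:                                           ; preds = %thread, %else
  call void @use(ptr %p)
  br label %common.ret
}
```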

### Real-world motivation

This IR snippet is derived from [linux/drivers/gpu/drm/i915/i915_utils.c@__i915_printk](https://github.com/torvalds/linux/blob/8d025e2092e29bfd13e56c78e22af25fac83c8ec/drivers/gpu/drm/i915/i915_utils.c#L17) (after the O3 pipeline). It appears to come from the handling of C's va_args.
The example above is a reduced version. If you're interested in the original suboptimal IR and the optimal IR, see: https://godbolt.org/z/ebx1Kv87z

**Please let me know if you can confirm that this is an optimization opportunity. Thanks.**