https://bugs.llvm.org/show_bug.cgi?id=43828
Bug ID: 43828
Summary: nowrap flags are not always correct after vectorization
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Loop Optimizer
Assignee: unassignedb...@nondot.org
Reporter: dantrus...@gmail.com
CC: llvm-bugs@lists.llvm.org
Created attachment 22738
--> https://bugs.llvm.org/attachment.cgi?id=22738&action=edit
Test to demonstrate wrong vectorizer behavior
When widening instructions, the loop vectorizer always copies IR flags
(including nowrap flags) from the scalar instruction to the new vector
instruction. This is not always correct. Consider a subtract reduction loop
that is vectorized and interleaved:
outer_loop:
%local_4 = phi i32 [ 2, %entry ], [ %4, %outer_tail]
br label %inner_loop
inner_loop:
%local_2 = phi i32 [ 0, %outer_loop ], [ %1, %inner_loop ]
%local_3 = phi i32 [ -104, %outer_loop ], [ %0, %inner_loop ]
%0 = sub nuw nsw i32 %local_3, %local_4
%1 = add nuw nsw i32 %local_2, 1
%2 = icmp ugt i32 %local_2, 126
br i1 %2, label %outer_tail, label %inner_loop
outer_tail:
%3 = phi i32 [ %0, %inner_loop ]
%4 = add i32 %local_4, 1
%5 = icmp slt i32 %4, 6
br i1 %5, label %outer_loop, label %exit
Note the nuw/nsw flags on the sub instruction: they are correct for the scalar
code. After vectorization it becomes:
vector.ph: ; preds = %outer_loop
%broadcast.splatinsert3 = insertelement <4 x i32> undef, i32 %local_4, i32 0
%broadcast.splat4 = shufflevector <4 x i32> %broadcast.splatinsert3, <4 x i32> undef, <4 x i32> zeroinitializer
br label %vector.body
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
%vec.phi = phi <4 x i32> [ <i32 -104, i32 0, i32 0, i32 0>, %vector.ph ], [ %2, %vector.body ]
%vec.phi2 = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %3, %vector.body ]
%0 = sub nuw nsw <4 x i32> %vec.phi, %broadcast.splat4
%1 = sub nuw nsw <4 x i32> %vec.phi2, %broadcast.splat4
%index.next = add i32 %index, 8
%2 = icmp eq i32 %index.next, 128
br i1 %2, label %middle.block, label %vector.body, !llvm.loop !0
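Note that the %1 sub still has the nuw flag set, but the flag is now incorrect:
%vec.phi2 starts at zero, so 0 - %local_4 wraps in the unsigned sense for any
nonzero %local_4. Because of this flag, later optimizations remove the second
sub instruction [ (0 - x)<nuw> -> 0 ], which results in incorrect code.
A simple testcase is attached (unrolling the vectorized loop makes the problem
clearly visible).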
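
To illustrate why the copied nuw flag becomes invalid, here is a minimal Python sketch that models the i32 arithmetic of the inner loop. It is not LLVM code and not the attached testcase. The helper names (sub_u32, scalar_reduction, vector_reduction) are hypothetical, and two things are assumptions not shown in the IR above: that the partial accumulators are combined by addition after the loop, and that the fold turns the whole second accumulator into zero.

```python
MASK = 0xFFFFFFFF  # i32 arithmetic is modulo 2**32


def sub_u32(a, b):
    """Return ((a - b) mod 2**32, wrapped), where wrapped means unsigned wrap."""
    a &= MASK
    b &= MASK
    return (a - b) & MASK, a < b


def scalar_reduction(start, x, trip_count=128):
    """Model the scalar inner loop: acc = acc - x, repeated trip_count times."""
    acc, wrapped = start & MASK, False
    for _ in range(trip_count):
        acc, w = sub_u32(acc, x)
        wrapped |= w
    return acc, wrapped


def vector_reduction(start, x, vf=4, uf=2, trip_count=128, fold_second_part=False):
    """Model the vectorized, interleaved loop with vf * uf partial accumulators.

    Only lane 0 of the first part holds the start value; every other lane
    starts at 0. With fold_second_part=True the second part stays zero,
    modeling the effect of folding (0 - x)<nuw> to 0 (an assumption about
    how the miscompile manifests, not actual pass output).
    """
    lanes = [start & MASK] + [0] * (vf * uf - 1)
    wrapped = [False] * len(lanes)
    for _ in range(trip_count // (vf * uf)):
        for i in range(len(lanes)):
            if fold_second_part and i >= vf:
                continue  # the second sub was "optimized" away
            lanes[i], w = sub_u32(lanes[i], x)
            wrapped[i] |= w
    # Assumed final step: partial results are summed after the loop.
    return sum(lanes) & MASK, wrapped


if __name__ == "__main__":
    print(scalar_reduction(-104, 2))
    print(vector_reduction(-104, 2))
    print(vector_reduction(-104, 2, fold_second_part=True))
```

In this model the scalar chain starting at -104 never wraps unsigned, so nuw holds there. In the vector form every lane that starts at 0 wraps on its first subtraction, so nuw no longer holds for those lanes, even though the summed result is still correct modulo 2**32. Treating the second accumulator as zero changes the final value.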