[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

I'm not sure about that - we could tie instcombine to -O0 or some similar proxy 
for debugging ve performance - but I'm practice it's fairly likely that most 
traps are compiler inserted so it probably works out the same.

Conditional instcombine would let us remove much of the current logic for 
conditionally inserting traps which seems a win for implementation complexity.

Doesn't matter much for this patch, if D106299 
 lands then sure, let's switch it on for 
openmp GPU. If it goes the instcombine route then we don't need to toggle a 
switch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert abandoned this revision.
jdoerfert added a comment.

Replaced by D106308 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

In D106301#2888203 , @JonChesterfield 
wrote:

> In D106301#2888170 , @jdoerfert 
> wrote:
>
>> llvm.trap is preserved, thus branches to an llvm.trap are preserved.
>
> That's interesting. Consistent with IR in general,
>
>   template  int test(int x) {
> if (x < 42) {
>   return x;
> } else {
>   if (Trap)
> __builtin_trap();
>   __builtin_unreachable();
> }
>   }
>   
>   extern "C" {
>   int trap(int x) { return test(x); }
>   int none(int x) { return test(x); }
>   }
>
> `=>`
>
>   define i32 @trap(i32 returned %0) {
> %2 = icmp slt i32 %0, 42
> br i1 %2, label %4, label %3
>   
>   3:; preds = %1
> tail call void @llvm.trap() #3
> unreachable
>   
>   4:; preds = %1
> ret i32 %0
>   }
>   
>   define i32 @none(i32 returned %0)  {
> %2 = icmp slt i32 %0, 42
> tail call void @llvm.assume(i1 %2) #3
> ret i32 %0
>   }
>
> So yes, we'll get faster codegen if we are willing to throw away traps 
> followed by unreachable code.
>
> If that's a legitimate transform to do, it seems like something we should do 
> in instcombine, instead of a separate pass. I.e. fold `trap, unreachable` to 
> `unreachable`.
>
> Can we do that instead?

We could, but we should not. A trap inserted by assert or the user should stay 
(IMHO).
What we do here is to avoid new traps that were arbitrarily inserted with 
unreachables. There is no particular reason why some "reasons" for an 
unreachable insert a trap
and others do not, it's just "to help debugging".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

In D106301#2888170 , @jdoerfert wrote:

> llvm.trap is preserved, thus branches to an llvm.trap are preserved.

That's interesting. Consistent with IR in general,

  template  int test(int x) {
if (x < 42) {
  return x;
} else {
  if (Trap)
__builtin_trap();
  __builtin_unreachable();
}
  }
  
  extern "C" {
  int trap(int x) { return test(x); }
  int none(int x) { return test(x); }
  }

`=>`

  define i32 @trap(i32 returned %0) {
%2 = icmp slt i32 %0, 42
br i1 %2, label %4, label %3
  
  3:; preds = %1
tail call void @llvm.trap() #3
unreachable
  
  4:; preds = %1
ret i32 %0
  }
  
  define i32 @none(i32 returned %0)  {
%2 = icmp slt i32 %0, 42
tail call void @llvm.assume(i1 %2) #3
ret i32 %0
  }

So yes, we'll get faster codegen if we are willing to throw away traps followed 
by unreachable code.

If that's a legitimate transform to do, it seems like something we should do in 
instcombine, instead of a separate pass. I.e. fold `trap, unreachable` to 
`unreachable`.

Can we do that instead?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

As an example, often end up with code like this right now:

%26 = load i32, i32* addrspacecast (i32 addrspace(3)* @execution_param to 
i32*), align 4, !dbg !39, !tbaa !27
%and.i13.i.i = and i32 %26, 4, !dbg !39
%cmp.i14.not.i.i = icmp eq i32 %and.i13.i.i, 0, !dbg !39
br i1 %cmp.i14.not.i.i, label %if.end.i129.i.i, label 
%__kmpc_parallel_51.exit.i, !dbg !39
  
  if.end.i129.i.i:  ; preds = 
%_Z16DecParallelLevelbj.exit.i.i
tail call void @llvm.trap() #10, !dbg !39
unreachable, !dbg !39

which could be:

  br label %__kmpc_parallel_51.exit.i, !dbg !39


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.

In D106301#2888162 , @JonChesterfield 
wrote:

> What's the problem with emitting llvm.trap in various unreachable places?

llvm.trap is preserved, thus branches to an llvm.trap are preserved.

> Wondering if it also affects translating assert to an llvm.trap

no.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert updated this revision to Diff 359882.
jdoerfert added a comment.

Fix copy error


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -214,11 +214,13 @@
 // DEBUG_DIRECTIVES-NOT: warning: debug
 // NO_DEBUG-NOT: warning: debug
 // NO_DEBUG: "-fopenmp-is-device"
+// NO_DEBUG-SAME: "-trap-before-unreachable=never"
 // NO_DEBUG-NOT: "-debug-info-kind=
 // NO_DEBUG: ptxas
 // DEBUG_DIRECTIVES: "-triple" "nvptx64-nvidia-cuda"
 // DEBUG_DIRECTIVES-SAME: "-debug-info-kind=line-directives-only"
 // DEBUG_DIRECTIVES-SAME: "-fopenmp-is-device"
+// DEBUG-DIRECTIVES-SAME: "-trap-before-unreachable=never"
 // DEBUG_DIRECTIVES: ptxas
 // DEBUG_DIRECTIVES: "-lineinfo"
 // NO_DEBUG-NOT: "-g"
@@ -251,6 +253,7 @@
 // HAS_DEBUG-SAME: "-debug-info-kind={{limited|line-tables-only}}"
 // HAS_DEBUG-SAME: "-dwarf-version=2"
 // HAS_DEBUG-SAME: "-fopenmp-is-device"
+// HAS_DEBUG-SAME: "-trap-before-unreachable=never"
 // HAS_DEBUG: ptxas
 // HAS_DEBUG-SAME: "-g"
 // HAS_DEBUG-SAME: "--dont-merge-basicblocks"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -6621,6 +6621,10 @@
   CmdArgs.push_back("-fopenmp-host-ir-file-path");
   CmdArgs.push_back(Args.MakeArgString(OpenMPDeviceInput->getFilename()));
 }
+
+// We disable `llvm.trap` generation in OpenMP offload code generation.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-trap-before-unreachable=never");
   }
 
   if (Triple.isAMDGPU()) {


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -214,11 +214,13 @@
 // DEBUG_DIRECTIVES-NOT: warning: debug
 // NO_DEBUG-NOT: warning: debug
 // NO_DEBUG: "-fopenmp-is-device"
+// NO_DEBUG-SAME: "-trap-before-unreachable=never"
 // NO_DEBUG-NOT: "-debug-info-kind=
 // NO_DEBUG: ptxas
 // DEBUG_DIRECTIVES: "-triple" "nvptx64-nvidia-cuda"
 // DEBUG_DIRECTIVES-SAME: "-debug-info-kind=line-directives-only"
 // DEBUG_DIRECTIVES-SAME: "-fopenmp-is-device"
+// DEBUG-DIRECTIVES-SAME: "-trap-before-unreachable=never"
 // DEBUG_DIRECTIVES: ptxas
 // DEBUG_DIRECTIVES: "-lineinfo"
 // NO_DEBUG-NOT: "-g"
@@ -251,6 +253,7 @@
 // HAS_DEBUG-SAME: "-debug-info-kind={{limited|line-tables-only}}"
 // HAS_DEBUG-SAME: "-dwarf-version=2"
 // HAS_DEBUG-SAME: "-fopenmp-is-device"
+// HAS_DEBUG-SAME: "-trap-before-unreachable=never"
 // HAS_DEBUG: ptxas
 // HAS_DEBUG-SAME: "-g"
 // HAS_DEBUG-SAME: "--dont-merge-basicblocks"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -6621,6 +6621,10 @@
   CmdArgs.push_back("-fopenmp-host-ir-file-path");
   CmdArgs.push_back(Args.MakeArgString(OpenMPDeviceInput->getFilename()));
 }
+
+// We disable `llvm.trap` generation in OpenMP offload code generation.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-trap-before-unreachable=never");
   }
 
   if (Triple.isAMDGPU()) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment.

What's the problem with emitting llvm.trap in various unreachable places? 
Wondering if it also affects translating assert to an llvm.trap


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106301/new/

https://reviews.llvm.org/D106301

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert created this revision.
jdoerfert added reviewers: jhuber6, tianshilei1992, JonChesterfield.
Herald added subscribers: guansong, bollu, yaxunl.
jdoerfert requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

We want to fold more aggressively on the GPU and we therefore disable
the generation of `llvm.trap` before an unreachable  by default.

Depends on D106299 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106301

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -214,11 +214,13 @@
 // DEBUG_DIRECTIVES-NOT: warning: debug
 // NO_DEBUG-NOT: warning: debug
 // NO_DEBUG: "-fopenmp-is-device"
+// NO_DEBUG-SAME: "-trap-before-unreachable=false"
 // NO_DEBUG-NOT: "-debug-info-kind=
 // NO_DEBUG: ptxas
 // DEBUG_DIRECTIVES: "-triple" "nvptx64-nvidia-cuda"
 // DEBUG_DIRECTIVES-SAME: "-debug-info-kind=line-directives-only"
 // DEBUG_DIRECTIVES-SAME: "-fopenmp-is-device"
+// DEBUG-DIRECTIVES-SAME: "-trap-before-unreachable=false"
 // DEBUG_DIRECTIVES: ptxas
 // DEBUG_DIRECTIVES: "-lineinfo"
 // NO_DEBUG-NOT: "-g"
@@ -251,6 +253,7 @@
 // HAS_DEBUG-SAME: "-debug-info-kind={{limited|line-tables-only}}"
 // HAS_DEBUG-SAME: "-dwarf-version=2"
 // HAS_DEBUG-SAME: "-fopenmp-is-device"
+// HAS_DEBUG-SAME: "-trap-before-unreachable=false"
 // HAS_DEBUG: ptxas
 // HAS_DEBUG-SAME: "-g"
 // HAS_DEBUG-SAME: "--dont-merge-basicblocks"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -6621,6 +6621,10 @@
   CmdArgs.push_back("-fopenmp-host-ir-file-path");
   CmdArgs.push_back(Args.MakeArgString(OpenMPDeviceInput->getFilename()));
 }
+
+// We disable `llvm.trap` generation in OpenMP offload code generation.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-trap-before-unreachable=never");
   }
 
   if (Triple.isAMDGPU()) {


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -214,11 +214,13 @@
 // DEBUG_DIRECTIVES-NOT: warning: debug
 // NO_DEBUG-NOT: warning: debug
 // NO_DEBUG: "-fopenmp-is-device"
+// NO_DEBUG-SAME: "-trap-before-unreachable=false"
 // NO_DEBUG-NOT: "-debug-info-kind=
 // NO_DEBUG: ptxas
 // DEBUG_DIRECTIVES: "-triple" "nvptx64-nvidia-cuda"
 // DEBUG_DIRECTIVES-SAME: "-debug-info-kind=line-directives-only"
 // DEBUG_DIRECTIVES-SAME: "-fopenmp-is-device"
+// DEBUG-DIRECTIVES-SAME: "-trap-before-unreachable=false"
 // DEBUG_DIRECTIVES: ptxas
 // DEBUG_DIRECTIVES: "-lineinfo"
 // NO_DEBUG-NOT: "-g"
@@ -251,6 +253,7 @@
 // HAS_DEBUG-SAME: "-debug-info-kind={{limited|line-tables-only}}"
 // HAS_DEBUG-SAME: "-dwarf-version=2"
 // HAS_DEBUG-SAME: "-fopenmp-is-device"
+// HAS_DEBUG-SAME: "-trap-before-unreachable=false"
 // HAS_DEBUG: ptxas
 // HAS_DEBUG-SAME: "-g"
 // HAS_DEBUG-SAME: "--dont-merge-basicblocks"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -6621,6 +6621,10 @@
   CmdArgs.push_back("-fopenmp-host-ir-file-path");
   CmdArgs.push_back(Args.MakeArgString(OpenMPDeviceInput->getFilename()));
 }
+
+// We disable `llvm.trap` generation in OpenMP offload code generation.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-trap-before-unreachable=never");
   }
 
   if (Triple.isAMDGPU()) {
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits