[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-07-28 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak commented:

Thank you Kareem for your comments, I got pulled into other work and wasn't 
able to reply properly until now.

https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-07-28 Thread Sergio Afonso via llvm-branch-commits


@@ -3173,19 +3173,14 @@ convertOmpThreadprivate(Operation &opInst, 
llvm::IRBuilderBase &builder,
   LLVM::GlobalOp global =
   addressOfOp.getGlobal(moduleTranslation.symbolTable());
   llvm::GlobalValue *globalValue = moduleTranslation.lookupGlobal(global);
-
-  if (!ompBuilder->Config.isTargetDevice()) {
-llvm::Type *type = globalValue->getValueType();
-llvm::TypeSize typeSize =
-
builder.GetInsertBlock()->getModule()->getDataLayout().getTypeStoreSize(
-type);
-llvm::ConstantInt *size = builder.getInt64(typeSize.getFixedValue());
-llvm::Value *callInst = ompBuilder->createCachedThreadPrivate(
-ompLoc, globalValue, size, global.getSymName() + ".cache");
-moduleTranslation.mapValue(opInst.getResult(0), callInst);
-  } else {
-moduleTranslation.mapValue(opInst.getResult(0), globalValue);

skatrak wrote:

To the best of my knowledge, the reason a case was added for target device here 
initially was that `omp.threadprivate` operations in a function containing a 
target region were being translated, even though they weren't located inside of 
a target region. If we didn't map their result to anything in LLVM here, then 
its uses in e.g. map clauses would crash the compiler. It's close to the same 
reason why you implemented the "scope" translation of some OpenMP ops as well.

Since the previous PR in the stack would remove these host `omp.threadprivate` 
ops and replace them with placeholders, the consequence is that we don't have 
to do this anymore. If the resulting value originally was mapped into the 
target region then, indeed, it would come in as a kernel argument.

I'm not sure if this somehow breaks translation of `threadprivate` ops residing 
inside of target regions and device functions, though. Perhaps @agozillon can 
comment if this change might inadvertently break actual lowering of 
`threadprivate` for the device, as he originally added this logic here. Or 
maybe `threadprivate` didn't work before in the target device and still 
doesn't, which is also possible.

https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-07-28 Thread Sergio Afonso via llvm-branch-commits


@@ -5488,40 +5483,172 @@ convertDeclareTargetAttr(Operation *op, 
mlir::omp::DeclareTargetAttr attribute,
   return success();
 }
 
-// Returns true if the operation is inside a TargetOp or
-// is part of a declare target function.
-static bool isTargetDeviceOp(Operation *op) {
+namespace {
+
+/// Implementation of the dialect interface that converts operations belonging
+/// to the OpenMP dialect to LLVM IR.
+class OpenMPDialectLLVMIRTranslationInterface
+: public LLVMTranslationDialectInterface {
+public:
+  using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface;
+
+  /// Translates the given operation to LLVM IR using the provided IR builder
+  /// and saving the state in `moduleTranslation`.
+  LogicalResult
+  convertOperation(Operation *op, llvm::IRBuilderBase &builder,
+   LLVM::ModuleTranslation &moduleTranslation) const final;
+
+  /// Given an OpenMP MLIR attribute, create the corresponding LLVM-IR,
+  /// runtime calls, or operation amendments
+  LogicalResult
+  amendOperation(Operation *op, ArrayRef instructions,
+ NamedAttribute attribute,
+ LLVM::ModuleTranslation &moduleTranslation) const final;
+};
+
+} // namespace
+
+LogicalResult OpenMPDialectLLVMIRTranslationInterface::amendOperation(
+Operation *op, ArrayRef instructions,
+NamedAttribute attribute,
+LLVM::ModuleTranslation &moduleTranslation) const {
+  return llvm::StringSwitch>(
+ attribute.getName())
+  .Case("omp.is_target_device",
+[&](Attribute attr) {
+  if (auto deviceAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilderConfig &config =
+moduleTranslation.getOpenMPBuilder()->Config;
+config.setIsTargetDevice(deviceAttr.getValue());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.is_gpu",
+[&](Attribute attr) {
+  if (auto gpuAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilderConfig &config =
+moduleTranslation.getOpenMPBuilder()->Config;
+config.setIsGPU(gpuAttr.getValue());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.host_ir_filepath",
+[&](Attribute attr) {
+  if (auto filepathAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilder *ompBuilder =
+moduleTranslation.getOpenMPBuilder();
+ompBuilder->loadOffloadInfoMetadata(filepathAttr.getValue());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.flags",
+[&](Attribute attr) {
+  if (auto rtlAttr = dyn_cast(attr))
+return convertFlagsAttr(op, rtlAttr, moduleTranslation);
+  return failure();
+})
+  .Case("omp.version",
+[&](Attribute attr) {
+  if (auto versionAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilder *ompBuilder =
+moduleTranslation.getOpenMPBuilder();
+ompBuilder->M.addModuleFlag(llvm::Module::Max, "openmp",
+versionAttr.getVersion());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.declare_target",
+[&](Attribute attr) {
+  if (auto declareTargetAttr =
+  dyn_cast(attr))
+return convertDeclareTargetAttr(op, declareTargetAttr,
+moduleTranslation);
+  return failure();
+})
+  .Case("omp.requires",
+[&](Attribute attr) {
+  if (auto requiresAttr = dyn_cast(attr)) 
{
+using Requires = omp::ClauseRequires;
+Requires flags = requiresAttr.getValue();
+llvm::OpenMPIRBuilderConfig &config =
+moduleTranslation.getOpenMPBuilder()->Config;
+config.setHasRequiresReverseOffload(
+bitEnumContainsAll(flags, Requires::reverse_offload));
+config.setHasRequiresUnifiedAddress(
+bitEnumContainsAll(flags, Requires::unified_address));
+config.setHasRequiresUnifiedSharedMemory(
+bitEnumContainsAll(flags, 
Requires::unified_shared_memory));
+config.setHasRequiresDynamicAllocators(
+bitEnumContainsAll(flags, Requires::dynamic_allocators));
+return success();
+  }
+  return failure();
+})
+  .Case("omp.target_triples",
+[&](Attribute attr) {
+  if (auto triplesAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilderConfig &config =
+moduleTra

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-07-28 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak edited 
https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-06-16 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy approved this pull request.

LGTM! Thanks Sergio and sorry for the very late review. Just 2 questions.

https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-06-16 Thread Kareem Ergawy via llvm-branch-commits


@@ -3173,19 +3173,14 @@ convertOmpThreadprivate(Operation &opInst, 
llvm::IRBuilderBase &builder,
   LLVM::GlobalOp global =
   addressOfOp.getGlobal(moduleTranslation.symbolTable());
   llvm::GlobalValue *globalValue = moduleTranslation.lookupGlobal(global);
-
-  if (!ompBuilder->Config.isTargetDevice()) {
-llvm::Type *type = globalValue->getValueType();
-llvm::TypeSize typeSize =
-
builder.GetInsertBlock()->getModule()->getDataLayout().getTypeStoreSize(
-type);
-llvm::ConstantInt *size = builder.getInt64(typeSize.getFixedValue());
-llvm::Value *callInst = ompBuilder->createCachedThreadPrivate(
-ompLoc, globalValue, size, global.getSymName() + ".cache");
-moduleTranslation.mapValue(opInst.getResult(0), callInst);
-  } else {
-moduleTranslation.mapValue(opInst.getResult(0), globalValue);

ergawy wrote:

Just to verify my understanding, we do not need the `else` here since for 
`target` regions, we pass `threadprivate` values as kernel arguments, right?

https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-06-16 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy edited 
https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-06-16 Thread Kareem Ergawy via llvm-branch-commits


@@ -5488,40 +5483,172 @@ convertDeclareTargetAttr(Operation *op, 
mlir::omp::DeclareTargetAttr attribute,
   return success();
 }
 
-// Returns true if the operation is inside a TargetOp or
-// is part of a declare target function.
-static bool isTargetDeviceOp(Operation *op) {
+namespace {
+
+/// Implementation of the dialect interface that converts operations belonging
+/// to the OpenMP dialect to LLVM IR.
+class OpenMPDialectLLVMIRTranslationInterface
+: public LLVMTranslationDialectInterface {
+public:
+  using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface;
+
+  /// Translates the given operation to LLVM IR using the provided IR builder
+  /// and saving the state in `moduleTranslation`.
+  LogicalResult
+  convertOperation(Operation *op, llvm::IRBuilderBase &builder,
+   LLVM::ModuleTranslation &moduleTranslation) const final;
+
+  /// Given an OpenMP MLIR attribute, create the corresponding LLVM-IR,
+  /// runtime calls, or operation amendments
+  LogicalResult
+  amendOperation(Operation *op, ArrayRef instructions,
+ NamedAttribute attribute,
+ LLVM::ModuleTranslation &moduleTranslation) const final;
+};
+
+} // namespace
+
+LogicalResult OpenMPDialectLLVMIRTranslationInterface::amendOperation(
+Operation *op, ArrayRef instructions,
+NamedAttribute attribute,
+LLVM::ModuleTranslation &moduleTranslation) const {
+  return llvm::StringSwitch>(
+ attribute.getName())
+  .Case("omp.is_target_device",
+[&](Attribute attr) {
+  if (auto deviceAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilderConfig &config =
+moduleTranslation.getOpenMPBuilder()->Config;
+config.setIsTargetDevice(deviceAttr.getValue());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.is_gpu",
+[&](Attribute attr) {
+  if (auto gpuAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilderConfig &config =
+moduleTranslation.getOpenMPBuilder()->Config;
+config.setIsGPU(gpuAttr.getValue());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.host_ir_filepath",
+[&](Attribute attr) {
+  if (auto filepathAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilder *ompBuilder =
+moduleTranslation.getOpenMPBuilder();
+ompBuilder->loadOffloadInfoMetadata(filepathAttr.getValue());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.flags",
+[&](Attribute attr) {
+  if (auto rtlAttr = dyn_cast(attr))
+return convertFlagsAttr(op, rtlAttr, moduleTranslation);
+  return failure();
+})
+  .Case("omp.version",
+[&](Attribute attr) {
+  if (auto versionAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilder *ompBuilder =
+moduleTranslation.getOpenMPBuilder();
+ompBuilder->M.addModuleFlag(llvm::Module::Max, "openmp",
+versionAttr.getVersion());
+return success();
+  }
+  return failure();
+})
+  .Case("omp.declare_target",
+[&](Attribute attr) {
+  if (auto declareTargetAttr =
+  dyn_cast(attr))
+return convertDeclareTargetAttr(op, declareTargetAttr,
+moduleTranslation);
+  return failure();
+})
+  .Case("omp.requires",
+[&](Attribute attr) {
+  if (auto requiresAttr = dyn_cast(attr)) 
{
+using Requires = omp::ClauseRequires;
+Requires flags = requiresAttr.getValue();
+llvm::OpenMPIRBuilderConfig &config =
+moduleTranslation.getOpenMPBuilder()->Config;
+config.setHasRequiresReverseOffload(
+bitEnumContainsAll(flags, Requires::reverse_offload));
+config.setHasRequiresUnifiedAddress(
+bitEnumContainsAll(flags, Requires::unified_address));
+config.setHasRequiresUnifiedSharedMemory(
+bitEnumContainsAll(flags, 
Requires::unified_shared_memory));
+config.setHasRequiresDynamicAllocators(
+bitEnumContainsAll(flags, Requires::dynamic_allocators));
+return success();
+  }
+  return failure();
+})
+  .Case("omp.target_triples",
+[&](Attribute attr) {
+  if (auto triplesAttr = dyn_cast(attr)) {
+llvm::OpenMPIRBuilderConfig &config =
+moduleTra

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-05-15 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/137201



  



Rate limit · GitHub


  body {
background-color: #f6f8fa;
color: #24292e;
font-family: -apple-system,BlinkMacSystemFont,Segoe 
UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;
font-size: 14px;
line-height: 1.5;
margin: 0;
  }

  .container { margin: 50px auto; max-width: 600px; text-align: center; 
padding: 0 24px; }

  a { color: #0366d6; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; 
text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  



  Whoa there!
  You have exceeded a secondary rate limit.
Please wait a few minutes before you try again;
in some cases this may take up to an hour.
  
  
https://support.github.com/contact";>Contact Support —
https://githubstatus.com";>GitHub Status —
https://twitter.com/githubstatus";>@githubstatus
  

  

  

  

  

  


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-05-05 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/137201

>From 22f22aa0ca2c98dfcc48a70f2f7e0a5b68d7b1d9 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Tue, 22 Apr 2025 12:04:45 +0100
Subject: [PATCH] [MLIR][OpenMP] Simplify OpenMP device codegen

After removing host operations from the device MLIR module, it is no longer
necessary to provide special codegen logic to prevent these operations from
causing compiler crashes or miscompilations.

This patch removes these now unnecessary code paths to simplify codegen logic.
Some MLIR tests are now replaced with Flang tests, since the responsibility of
dealing with host operations has been moved earlier in the compilation flow.

MLIR tests holding target device modules are updated to no longer include now
unsupported host operations.
---
 .../OpenMP/target-nesting-in-host-ops.f90 |  87 
 .../Integration/OpenMP/task-target-device.f90 |  37 ++
 .../OpenMP/threadprivate-target-device.f90|  40 ++
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 423 +++---
 ...arget-constant-indexing-device-region.mlir |  25 +-
 .../Target/LLVMIR/omptarget-debug-var-1.mlir  |  19 +-
 .../omptarget-memcpy-align-metadata.mlir  |  61 +--
 .../LLVMIR/omptarget-target-inside-task.mlir  |  43 --
 ...ptarget-threadprivate-device-lowering.mlir |  31 --
 .../Target/LLVMIR/openmp-llvm-invalid.mlir|  45 ++
 .../openmp-target-nesting-in-host-ops.mlir| 160 ---
 .../LLVMIR/openmp-task-target-device.mlir |  26 --
 12 files changed, 409 insertions(+), 588 deletions(-)
 create mode 100644 flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
 create mode 100644 flang/test/Integration/OpenMP/task-target-device.f90
 create mode 100644 
flang/test/Integration/OpenMP/threadprivate-target-device.f90
 delete mode 100644 mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir
 delete mode 100644 
mlir/test/Target/LLVMIR/omptarget-threadprivate-device-lowering.mlir
 delete mode 100644 
mlir/test/Target/LLVMIR/openmp-target-nesting-in-host-ops.mlir
 delete mode 100644 mlir/test/Target/LLVMIR/openmp-task-target-device.mlir

diff --git a/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90 
b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
new file mode 100644
index 0..8c85a3c1784ed
--- /dev/null
+++ b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
@@ -0,0 +1,87 @@
+!===--===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===--===!
+
+!REQUIRES: amdgpu-registered-target
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-version=50 -fopenmp-is-target-device %s -o - | FileCheck %s
+
+! CHECK-NOT: define void @nested_target_in_parallel
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_parallel(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+
+  !$omp parallel
+!$omp target map(tofrom: v)
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_wsloop
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_wsloop_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_wsloop(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: i
+
+  !$omp do
+  do i=1, 10
+!$omp target map(tofrom: v)
+!$omp end target
+  end do
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_parallel_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_with_private_{{.*}}(ptr 
%{{.*}}, ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_parallel_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp parallel firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_task_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_task_with_private_{{.*}}(ptr %{{.*}}, 
ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_task_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp task firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end task
+end subroutine
+
+! CHECK-NOT: define void @target_and_atomic_update
+! CHECK: define weak_odr protec

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-04-29 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/137201

>From c86caed6b80bdd4ed4edeb77f65116584e61b83b Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Tue, 22 Apr 2025 12:04:45 +0100
Subject: [PATCH] [MLIR][OpenMP] Simplify OpenMP device codegen

After removing host operations from the device MLIR module, it is no longer
necessary to provide special codegen logic to prevent these operations from
causing compiler crashes or miscompilations.

This patch removes these now unnecessary code paths to simplify codegen logic.
Some MLIR tests are now replaced with Flang tests, since the responsibility of
dealing with host operations has been moved earlier in the compilation flow.

MLIR tests holding target device modules are updated to no longer include now
unsupported host operations.
---
 .../OpenMP/target-nesting-in-host-ops.f90 |  87 
 .../Integration/OpenMP/task-target-device.f90 |  37 ++
 .../OpenMP/threadprivate-target-device.f90|  40 ++
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 423 +++---
 ...arget-constant-indexing-device-region.mlir |  25 +-
 .../Target/LLVMIR/omptarget-debug-var-1.mlir  |  19 +-
 .../omptarget-memcpy-align-metadata.mlir  |  61 +--
 .../LLVMIR/omptarget-target-inside-task.mlir  |  43 --
 ...ptarget-threadprivate-device-lowering.mlir |  31 --
 .../Target/LLVMIR/openmp-llvm-invalid.mlir|  45 ++
 .../openmp-target-nesting-in-host-ops.mlir| 160 ---
 .../LLVMIR/openmp-task-target-device.mlir |  26 --
 12 files changed, 409 insertions(+), 588 deletions(-)
 create mode 100644 flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
 create mode 100644 flang/test/Integration/OpenMP/task-target-device.f90
 create mode 100644 
flang/test/Integration/OpenMP/threadprivate-target-device.f90
 delete mode 100644 mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir
 delete mode 100644 
mlir/test/Target/LLVMIR/omptarget-threadprivate-device-lowering.mlir
 delete mode 100644 
mlir/test/Target/LLVMIR/openmp-target-nesting-in-host-ops.mlir
 delete mode 100644 mlir/test/Target/LLVMIR/openmp-task-target-device.mlir

diff --git a/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90 
b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
new file mode 100644
index 0..8c85a3c1784ed
--- /dev/null
+++ b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
@@ -0,0 +1,87 @@
+!===--===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===--===!
+
+!REQUIRES: amdgpu-registered-target
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-version=50 -fopenmp-is-target-device %s -o - | FileCheck %s
+
+! CHECK-NOT: define void @nested_target_in_parallel
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_parallel(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+
+  !$omp parallel
+!$omp target map(tofrom: v)
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_wsloop
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_wsloop_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_wsloop(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: i
+
+  !$omp do
+  do i=1, 10
+!$omp target map(tofrom: v)
+!$omp end target
+  end do
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_parallel_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_with_private_{{.*}}(ptr 
%{{.*}}, ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_parallel_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp parallel firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_task_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_task_with_private_{{.*}}(ptr %{{.*}}, 
ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_task_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp task firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end task
+end subroutine
+
+! CHECK-NOT: define void @target_and_atomic_update
+! CHECK: define weak_odr protec

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-04-24 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

PR stack:
- #137198
- #137199
- #137200
- #137201

https://github.com/llvm/llvm-project/pull/137201
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-04-24 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-flang-openmp
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)


Changes

After removing host operations from the device MLIR module, it is no longer 
necessary to provide special codegen logic to prevent these operations from 
causing compiler crashes or miscompilations.

This patch removes these now unnecessary code paths to simplify codegen logic. 
Some MLIR tests are now replaced with Flang tests, since the responsibility of 
dealing with host operations has been moved earlier in the compilation flow.

MLIR tests holding target device modules are updated to no longer include now 
unsupported host operations.

---

Patch is 52.28 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/137201.diff


12 Files Affected:

- (added) flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90 (+87) 
- (added) flang/test/Integration/OpenMP/task-target-device.f90 (+37) 
- (added) flang/test/Integration/OpenMP/threadprivate-target-device.f90 (+40) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+159-264) 
- (modified) 
mlir/test/Target/LLVMIR/omptarget-constant-indexing-device-region.mlir (+10-15) 
- (modified) mlir/test/Target/LLVMIR/omptarget-debug-var-1.mlir (+6-11) 
- (modified) mlir/test/Target/LLVMIR/omptarget-memcpy-align-metadata.mlir 
(+24-37) 
- (removed) mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir (-40) 
- (removed) 
mlir/test/Target/LLVMIR/omptarget-threadprivate-device-lowering.mlir (-30) 
- (modified) mlir/test/Target/LLVMIR/openmp-llvm-invalid.mlir (+45) 
- (removed) mlir/test/Target/LLVMIR/openmp-target-nesting-in-host-ops.mlir 
(-156) 
- (removed) mlir/test/Target/LLVMIR/openmp-task-target-device.mlir (-26) 


``diff
diff --git a/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90 
b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
new file mode 100644
index 0..8c85a3c1784ed
--- /dev/null
+++ b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
@@ -0,0 +1,87 @@
+!===--===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===--===!
+
+!REQUIRES: amdgpu-registered-target
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-version=50 -fopenmp-is-target-device %s -o - | FileCheck %s
+
+! CHECK-NOT: define void @nested_target_in_parallel
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_parallel(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+
+  !$omp parallel
+!$omp target map(tofrom: v)
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_wsloop
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_wsloop_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_wsloop(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: i
+
+  !$omp do
+  do i=1, 10
+!$omp target map(tofrom: v)
+!$omp end target
+  end do
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_parallel_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_with_private_{{.*}}(ptr 
%{{.*}}, ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_parallel_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp parallel firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_task_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_task_with_private_{{.*}}(ptr %{{.*}}, 
ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_task_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp task firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end task
+end subroutine
+
+! CHECK-NOT: define void @target_and_atomic_update
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_target_and_atomic_update_{{.*}}(ptr %{{.*}})
+subroutine target_and_atomic_update(x, expr)
+  implicit none
+  integer, intent(inout) :: x, expr
+
+  !$omp target
+  !$omp end target
+
+  !$omp atomic update
+  x = x + expr
+end subroutine
+
+! CHECK-NOT: define

[llvm-branch-commits] [flang] [mlir] [MLIR][OpenMP] Simplify OpenMP device codegen (PR #137201)

2025-04-24 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/137201

After removing host operations from the device MLIR module, it is no longer 
necessary to provide special codegen logic to prevent these operations from 
causing compiler crashes or miscompilations.

This patch removes these now unnecessary code paths to simplify codegen logic. 
Some MLIR tests are now replaced with Flang tests, since the responsibility of 
dealing with host operations has been moved earlier in the compilation flow.

MLIR tests holding target device modules are updated to no longer include now 
unsupported host operations.

>From 174401a11e313c6b1ebe80552c59ea59a0c6a4de Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Tue, 22 Apr 2025 12:04:45 +0100
Subject: [PATCH] [MLIR][OpenMP] Simplify OpenMP device codegen

After removing host operations from the device MLIR module, it is no longer
necessary to provide special codegen logic to prevent these operations from
causing compiler crashes or miscompilations.

This patch removes these now unnecessary code paths to simplify codegen logic.
Some MLIR tests are now replaced with Flang tests, since the responsibility of
dealing with host operations has been moved earlier in the compilation flow.

MLIR tests holding target device modules are updated to no longer include now
unsupported host operations.
---
 .../OpenMP/target-nesting-in-host-ops.f90 |  87 
 .../Integration/OpenMP/task-target-device.f90 |  37 ++
 .../OpenMP/threadprivate-target-device.f90|  40 ++
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 423 +++---
 ...arget-constant-indexing-device-region.mlir |  25 +-
 .../Target/LLVMIR/omptarget-debug-var-1.mlir  |  17 +-
 .../omptarget-memcpy-align-metadata.mlir  |  61 +--
 .../LLVMIR/omptarget-target-inside-task.mlir  |  40 --
 ...ptarget-threadprivate-device-lowering.mlir |  30 --
 .../Target/LLVMIR/openmp-llvm-invalid.mlir|  45 ++
 .../openmp-target-nesting-in-host-ops.mlir| 156 ---
 .../LLVMIR/openmp-task-target-device.mlir |  26 --
 12 files changed, 408 insertions(+), 579 deletions(-)
 create mode 100644 flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
 create mode 100644 flang/test/Integration/OpenMP/task-target-device.f90
 create mode 100644 
flang/test/Integration/OpenMP/threadprivate-target-device.f90
 delete mode 100644 mlir/test/Target/LLVMIR/omptarget-target-inside-task.mlir
 delete mode 100644 
mlir/test/Target/LLVMIR/omptarget-threadprivate-device-lowering.mlir
 delete mode 100644 
mlir/test/Target/LLVMIR/openmp-target-nesting-in-host-ops.mlir
 delete mode 100644 mlir/test/Target/LLVMIR/openmp-task-target-device.mlir

diff --git a/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90 
b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
new file mode 100644
index 0..8c85a3c1784ed
--- /dev/null
+++ b/flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
@@ -0,0 +1,87 @@
+!===--===!
+! This directory can be used to add Integration tests involving multiple
+! stages of the compiler (for eg. from Fortran to LLVM IR). It should not
+! contain executable tests. We should only add tests here sparingly and only
+! if there is no other way to test. Repeat this message in each test that is
+! added to this directory and sub-directories.
+!===--===!
+
+!REQUIRES: amdgpu-registered-target
+!RUN: %flang_fc1 -triple amdgcn-amd-amdhsa -emit-llvm -fopenmp 
-fopenmp-version=50 -fopenmp-is-target-device %s -o - | FileCheck %s
+
+! CHECK-NOT: define void @nested_target_in_parallel
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_parallel(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+
+  !$omp parallel
+!$omp target map(tofrom: v)
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_wsloop
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_wsloop_{{.*}}(ptr %{{.*}}, ptr 
%{{.*}})
+subroutine nested_target_in_wsloop(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: i
+
+  !$omp do
+  do i=1, 10
+!$omp target map(tofrom: v)
+!$omp end target
+  end do
+end subroutine
+
+! CHECK-NOT: define void @nested_target_in_parallel_with_private
+! CHECK: define weak_odr protected amdgpu_kernel void 
@__omp_offloading_{{.*}}_nested_target_in_parallel_with_private_{{.*}}(ptr 
%{{.*}}, ptr %{{.*}}, ptr %{{.*}})
+subroutine nested_target_in_parallel_with_private(v)
+  implicit none
+  integer, intent(inout) :: v(10)
+  integer :: x
+  x = 10
+
+  !$omp parallel firstprivate(x)
+!$omp target map(tofrom: v(1:x))
+!$omp end target
+  !$omp end parallel
+end subroutine
+
+! CHECK-NOT: defi