[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/Leporacanthicus closed https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/ergawy approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
ergawy wrote: > @ergawy Could you take a look at this, given that you did something similar > [even if it was much smaller] recently? Sorry, this totally slipped my mind. I will take a look today. https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
Leporacanthicus wrote: > progress? I'm afraid I'm still waiting for review from some of the key people here. Apparently there was some OpenMP conference last week, which kept some people very busy, so I'm hoping by the time they've read emails and got over jet-lag, etc, they'll get round to this. https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
ye-luo wrote: progress? https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
Leporacanthicus wrote: @ergawy Could you take a look at this, given that you did something similar [even if it was much smaller] recently? https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/tblah approved this pull request. https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/nikic dismissed https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/Leporacanthicus updated https://github.com/llvm/llvm-project/pull/92430 >From 629f5785738fdc52d4dc8d193aa43b3d011b1039 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 19 Apr 2024 18:00:58 +0100 Subject: [PATCH 1/6] Fix for changed code at the end of AllocaIP. Some of the OpenMP code can change the instruction pointed at by the insertion point. This leads to an assert in the compiler about BB->getParent() and IP->getParent() not matching or something like that. The fix is to rebuild the insertionpoint from the block, rather than use builder.restoreIP. Also, move some of the alloca generation, rather than skipping back and forth between insert points (and ensure all the allocas are done before their users are created). A simple test, mainly to ensure the minimal reproducer doesn't fail to compile in the future is also added. --- .../OpenMP/parallel-reduction-allocate.f90| 23 +++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 11 ++--- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 22 +++--- 3 files changed, 45 insertions(+), 11 deletions(-) create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 diff --git a/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 new file mode 100644 index 0..fddce25ae22cc --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 @@ -0,0 +1,23 @@ +!! The main point of this test is to check that the code compiles at all, so the +!! checking is not very detailed. Not hitting an assert, crashing or otherwise failing +!! to compile is the key point. Also, emitting llvm is required for this to happen. +! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s 2>&1 | FileCheck %s +subroutine proc + implicit none + real(8),allocatable :: F(:) + real(8),allocatable :: A(:) + +!$omp parallel private(A) reduction(+:F) + allocate(A(10)) +!$omp end parallel +end subroutine proc + +!CHECK-LABEL: define void @proc_() +!CHECK: call void +!CHECK-SAME: @__kmpc_fork_call(ptr {{.*}}, i32 1, ptr @[[OMP_PAR:.*]], {{.*}}) + +!CHECK: define internal void @[[OMP_PAR]] +!CHECK: omp.par.region8: +!CHECK-NEXT: call ptr @malloc +!CHECK-SAME: i64 10 + diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 2c4b45255d059..d423b545a1faf 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -1391,7 +1391,8 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel( // Change the location to the outer alloca insertion point to create and // initialize the allocas we pass into the parallel region. - Builder.restoreIP(OuterAllocaIP); + InsertPointTy NewOuter(OuterAllocaBlock, OuterAllocaBlock->begin()); + Builder.restoreIP(NewOuter); AllocaInst *TIDAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "tid.addr"); AllocaInst *ZeroAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "zero.addr"); @@ -2155,7 +2156,8 @@ OpenMPIRBuilder::createReductions(const LocationDescription , // values. unsigned NumReductions = ReductionInfos.size(); Type *RedArrayTy = ArrayType::get(Builder.getPtrTy(), NumReductions); - Builder.restoreIP(AllocaIP); + Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator()); + // Builder.restoreIP(AllocaIP); Value *RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array"); Builder.SetInsertPoint(InsertBlock, InsertBlock->end()); @@ -2556,7 +2558,10 @@ OpenMPIRBuilder::applyStaticWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_for_static_fini); // Allocate space for computed loop bounds as expected by the "init" function. - Builder.restoreIP(AllocaIP); + + // Builder.restoreIP(AllocaIP); + Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator()); + Type *I32Type = Type::getInt32Ty(M.getContext()); Value *PLastIter = Builder.CreateAlloca(I32Type, nullptr, "p.lastiter"); Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 6ec4c120c11ea..47b07248ba84d 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -1154,6 +1154,17 @@ convertOmpParallel(omp::ParallelOp opInst, llvm::IRBuilderBase , MutableArrayRef reductionArgs = opInst.getRegion().getArguments().take_back( opInst.getNumReductionVars()); + +SmallVector byRefVars; +if (isByRef) { + for (unsigned i = 0; i < opInst.getNumReductionVars(); ++i) { +// Allocate reduction variable (which is a pointer to the real reduciton +// variable allocated in the inlined region) +
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
nikic wrote: > > Allocas should be placed at the start of the entry block. > > Would it be acceptable to make this change such that the alloca's are at the > very start of the block? I have tried that, and it seems to work. It will > still mean changes to the tests, and I can't guarantee ALL alloca's for all > Fortran (via MLIR) generated code will be at the start of the entry block. Yes, that should be fine. You could also use something like getFirstNonPHIOrDbgOrAlloca to compute the insert point after the current allocas. (Calling this many times could have quadratic runtime though -- I don't have enough context here to say whether that's a problem or not.) https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
Leporacanthicus wrote: > Allocas should be placed at the start of the entry block. Would it be acceptable to make this change such that the alloca's are at the very start of the block? I have tried that, and it seems to work. It will still mean changes to the tests, and I can't guarantee ALL alloca's for all Fortran (via MLIR) generated code will be at the start of the entry block. We have a further ticket to clean this up, but it's a bigger task, which we will do, but it will take a few weeks at least. https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/tblah approved this pull request. LGTM if @Meinersbur is happy. It's a shame about the fallout to the clang tests but I can see that would be very difficult to avoid. Thank you for persisting with this difficult patch. https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/Leporacanthicus updated https://github.com/llvm/llvm-project/pull/92430 >From 629f5785738fdc52d4dc8d193aa43b3d011b1039 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 19 Apr 2024 18:00:58 +0100 Subject: [PATCH 1/6] Fix for changed code at the end of AllocaIP. Some of the OpenMP code can change the instruction pointed at by the insertion point. This leads to an assert in the compiler about BB->getParent() and IP->getParent() not matching or something like that. The fix is to rebuild the insertionpoint from the block, rather than use builder.restoreIP. Also, move some of the alloca generation, rather than skipping back and forth between insert points (and ensure all the allocas are done before their users are created). A simple test, mainly to ensure the minimal reproducer doesn't fail to compile in the future is also added. --- .../OpenMP/parallel-reduction-allocate.f90| 23 +++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 11 ++--- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 22 +++--- 3 files changed, 45 insertions(+), 11 deletions(-) create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 diff --git a/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 new file mode 100644 index 00..fddce25ae22ccf --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 @@ -0,0 +1,23 @@ +!! The main point of this test is to check that the code compiles at all, so the +!! checking is not very detailed. Not hitting an assert, crashing or otherwise failing +!! to compile is the key point. Also, emitting llvm is required for this to happen. +! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s 2>&1 | FileCheck %s +subroutine proc + implicit none + real(8),allocatable :: F(:) + real(8),allocatable :: A(:) + +!$omp parallel private(A) reduction(+:F) + allocate(A(10)) +!$omp end parallel +end subroutine proc + +!CHECK-LABEL: define void @proc_() +!CHECK: call void +!CHECK-SAME: @__kmpc_fork_call(ptr {{.*}}, i32 1, ptr @[[OMP_PAR:.*]], {{.*}}) + +!CHECK: define internal void @[[OMP_PAR]] +!CHECK: omp.par.region8: +!CHECK-NEXT: call ptr @malloc +!CHECK-SAME: i64 10 + diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 2c4b45255d059c..d423b545a1faf3 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -1391,7 +1391,8 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel( // Change the location to the outer alloca insertion point to create and // initialize the allocas we pass into the parallel region. - Builder.restoreIP(OuterAllocaIP); + InsertPointTy NewOuter(OuterAllocaBlock, OuterAllocaBlock->begin()); + Builder.restoreIP(NewOuter); AllocaInst *TIDAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "tid.addr"); AllocaInst *ZeroAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "zero.addr"); @@ -2155,7 +2156,8 @@ OpenMPIRBuilder::createReductions(const LocationDescription , // values. unsigned NumReductions = ReductionInfos.size(); Type *RedArrayTy = ArrayType::get(Builder.getPtrTy(), NumReductions); - Builder.restoreIP(AllocaIP); + Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator()); + // Builder.restoreIP(AllocaIP); Value *RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array"); Builder.SetInsertPoint(InsertBlock, InsertBlock->end()); @@ -2556,7 +2558,10 @@ OpenMPIRBuilder::applyStaticWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_for_static_fini); // Allocate space for computed loop bounds as expected by the "init" function. - Builder.restoreIP(AllocaIP); + + // Builder.restoreIP(AllocaIP); + Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator()); + Type *I32Type = Type::getInt32Ty(M.getContext()); Value *PLastIter = Builder.CreateAlloca(I32Type, nullptr, "p.lastiter"); Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 6ec4c120c11ea5..47b07248ba84d6 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -1154,6 +1154,17 @@ convertOmpParallel(omp::ParallelOp opInst, llvm::IRBuilderBase , MutableArrayRef reductionArgs = opInst.getRegion().getArguments().take_back( opInst.getNumReductionVars()); + +SmallVector byRefVars; +if (isByRef) { + for (unsigned i = 0; i < opInst.getNumReductionVars(); ++i) { +// Allocate reduction variable (which is a pointer to the real reduciton +// variable allocated in the inlined region) +
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
@@ -0,0 +1,23 @@ +!! The main point of this test is to check that the code compiles at all, so the Leporacanthicus wrote: I have created an internal ticket to improve the code generation. It's just a much bigger task than this one started out as. I then found more problems, so this took much longer than I expected. https://github.com/llvm/llvm-project/pull/92430 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)
https://github.com/Leporacanthicus updated https://github.com/llvm/llvm-project/pull/92430 >From e08c19ced1b2bcdd4a83ffcd3c5f5656c2021d52 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 19 Apr 2024 18:00:58 +0100 Subject: [PATCH 1/5] Fix for changed code at the end of AllocaIP. Some of the OpenMP code can change the instruction pointed at by the insertion point. This leads to an assert in the compiler about BB->getParent() and IP->getParent() not matching or something like that. The fix is to rebuild the insertionpoint from the block, rather than use builder.restoreIP. Also, move some of the alloca generation, rather than skipping back and forth between insert points (and ensure all the allocas are done before their users are created). A simple test, mainly to ensure the minimal reproducer doesn't fail to compile in the future is also added. --- .../OpenMP/parallel-reduction-allocate.f90| 23 +++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 11 ++--- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 22 +++--- 3 files changed, 45 insertions(+), 11 deletions(-) create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 diff --git a/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 new file mode 100644 index 0..fddce25ae22cc --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 @@ -0,0 +1,23 @@ +!! The main point of this test is to check that the code compiles at all, so the +!! checking is not very detailed. Not hitting an assert, crashing or otherwise failing +!! to compile is the key point. Also, emitting llvm is required for this to happen. +! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s 2>&1 | FileCheck %s +subroutine proc + implicit none + real(8),allocatable :: F(:) + real(8),allocatable :: A(:) + +!$omp parallel private(A) reduction(+:F) + allocate(A(10)) +!$omp end parallel +end subroutine proc + +!CHECK-LABEL: define void @proc_() +!CHECK: call void +!CHECK-SAME: @__kmpc_fork_call(ptr {{.*}}, i32 1, ptr @[[OMP_PAR:.*]], {{.*}}) + +!CHECK: define internal void @[[OMP_PAR]] +!CHECK: omp.par.region8: +!CHECK-NEXT: call ptr @malloc +!CHECK-SAME: i64 10 + diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index cb4de9c8876dc..e76197ea6c9c7 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -1391,7 +1391,8 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel( // Change the location to the outer alloca insertion point to create and // initialize the allocas we pass into the parallel region. - Builder.restoreIP(OuterAllocaIP); + InsertPointTy NewOuter(OuterAllocaBlock, OuterAllocaBlock->begin()); + Builder.restoreIP(NewOuter); AllocaInst *TIDAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "tid.addr"); AllocaInst *ZeroAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "zero.addr"); @@ -2151,7 +2152,8 @@ OpenMPIRBuilder::createReductions(const LocationDescription , // values. unsigned NumReductions = ReductionInfos.size(); Type *RedArrayTy = ArrayType::get(Builder.getPtrTy(), NumReductions); - Builder.restoreIP(AllocaIP); + Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator()); + // Builder.restoreIP(AllocaIP); Value *RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array"); Builder.SetInsertPoint(InsertBlock, InsertBlock->end()); @@ -2552,7 +2554,10 @@ OpenMPIRBuilder::applyStaticWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_for_static_fini); // Allocate space for computed loop bounds as expected by the "init" function. - Builder.restoreIP(AllocaIP); + + // Builder.restoreIP(AllocaIP); + Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator()); + Type *I32Type = Type::getInt32Ty(M.getContext()); Value *PLastIter = Builder.CreateAlloca(I32Type, nullptr, "p.lastiter"); Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 6ec4c120c11ea..47b07248ba84d 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -1154,6 +1154,17 @@ convertOmpParallel(omp::ParallelOp opInst, llvm::IRBuilderBase , MutableArrayRef reductionArgs = opInst.getRegion().getArguments().take_back( opInst.getNumReductionVars()); + +SmallVector byRefVars; +if (isByRef) { + for (unsigned i = 0; i < opInst.getNumReductionVars(); ++i) { +// Allocate reduction variable (which is a pointer to the real reduciton +// variable allocated in the inlined region) +