[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-18 Thread Mats Petersson via cfe-commits

https://github.com/Leporacanthicus closed 
https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-18 Thread Kareem Ergawy via cfe-commits

https://github.com/ergawy approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-17 Thread Kareem Ergawy via cfe-commits

ergawy wrote:

> @ergawy Could you take a look at this, given that you did something similar 
> [even if it was much smaller] recently?

Sorry, this totally slipped my mind. I will take a look today.

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-17 Thread Mats Petersson via cfe-commits

Leporacanthicus wrote:

> progress?

I'm afraid I'm still waiting for review from some of the key people here. 
Apparently there was some OpenMP conference last week, which kept some people 
very busy, so I'm hoping by the time they've read emails and got over jet-lag, 
etc, they'll get round to this.

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-17 Thread Ye Luo via cfe-commits

ye-luo wrote:

progress?

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-07 Thread Mats Petersson via cfe-commits

Leporacanthicus wrote:

@ergawy Could you take a look at this, given that you did something similar 
[even if it was much smaller] recently?

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-07 Thread Tom Eccles via cfe-commits

https://github.com/tblah approved this pull request.


https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-07 Thread Nikita Popov via cfe-commits

https://github.com/nikic dismissed 
https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-06 Thread Mats Petersson via cfe-commits

https://github.com/Leporacanthicus updated 
https://github.com/llvm/llvm-project/pull/92430

>From 629f5785738fdc52d4dc8d193aa43b3d011b1039 Mon Sep 17 00:00:00 2001
From: Mats Petersson 
Date: Fri, 19 Apr 2024 18:00:58 +0100
Subject: [PATCH 1/6] Fix for changed code at the end of AllocaIP.

Some of the OpenMP code can change the instruction pointed at by
the insertion point. This leads to an assert in the compiler about
BB->getParent() and IP->getParent() not matching or something like
that.

The fix is to rebuild the insertionpoint from the block, rather
than use builder.restoreIP.

Also, move some of the alloca generation, rather than skipping back
and forth between insert points (and ensure all the allocas are done
before their users are created).

A simple test, mainly to ensure the minimal reproducer doesn't fail
to compile in the future is also added.
---
 .../OpenMP/parallel-reduction-allocate.f90| 23 +++
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 11 ++---
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 22 +++---
 3 files changed, 45 insertions(+), 11 deletions(-)
 create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-allocate.f90

diff --git a/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 
b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90
new file mode 100644
index 0..fddce25ae22cc
--- /dev/null
+++ b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90
@@ -0,0 +1,23 @@
+!! The main point of this test is to check that the code compiles at all, so 
the
+!! checking is not very detailed. Not hitting an assert, crashing or otherwise 
failing
+!! to compile is the key point. Also, emitting llvm is required for this to 
happen.
+! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine proc
+  implicit none
+  real(8),allocatable :: F(:)
+  real(8),allocatable :: A(:)
+
+!$omp parallel private(A) reduction(+:F)
+  allocate(A(10))
+!$omp end parallel
+end subroutine proc
+
+!CHECK-LABEL: define void @proc_()
+!CHECK: call void
+!CHECK-SAME: @__kmpc_fork_call(ptr {{.*}}, i32 1, ptr @[[OMP_PAR:.*]], {{.*}})
+
+!CHECK: define internal void @[[OMP_PAR]]
+!CHECK: omp.par.region8:
+!CHECK-NEXT: call ptr @malloc
+!CHECK-SAME: i64 10
+
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 2c4b45255d059..d423b545a1faf 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -1391,7 +1391,8 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
 
   // Change the location to the outer alloca insertion point to create and
   // initialize the allocas we pass into the parallel region.
-  Builder.restoreIP(OuterAllocaIP);
+  InsertPointTy NewOuter(OuterAllocaBlock, OuterAllocaBlock->begin());
+  Builder.restoreIP(NewOuter);
   AllocaInst *TIDAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "tid.addr");
   AllocaInst *ZeroAddrAlloca =
   Builder.CreateAlloca(Int32, nullptr, "zero.addr");
@@ -2155,7 +2156,8 @@ OpenMPIRBuilder::createReductions(const 
LocationDescription ,
   // values.
   unsigned NumReductions = ReductionInfos.size();
   Type *RedArrayTy = ArrayType::get(Builder.getPtrTy(), NumReductions);
-  Builder.restoreIP(AllocaIP);
+  Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator());
+  //  Builder.restoreIP(AllocaIP);
   Value *RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array");
 
   Builder.SetInsertPoint(InsertBlock, InsertBlock->end());
@@ -2556,7 +2558,10 @@ OpenMPIRBuilder::applyStaticWorkshareLoop(DebugLoc DL, 
CanonicalLoopInfo *CLI,
   getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_for_static_fini);
 
   // Allocate space for computed loop bounds as expected by the "init" 
function.
-  Builder.restoreIP(AllocaIP);
+
+  //  Builder.restoreIP(AllocaIP);
+  Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator());
+
   Type *I32Type = Type::getInt32Ty(M.getContext());
   Value *PLastIter = Builder.CreateAlloca(I32Type, nullptr, "p.lastiter");
   Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound");
diff --git 
a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp 
b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 6ec4c120c11ea..47b07248ba84d 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1154,6 +1154,17 @@ convertOmpParallel(omp::ParallelOp opInst, 
llvm::IRBuilderBase ,
 MutableArrayRef reductionArgs =
 opInst.getRegion().getArguments().take_back(
 opInst.getNumReductionVars());
+
+SmallVector byRefVars;
+if (isByRef) {
+  for (unsigned i = 0; i < opInst.getNumReductionVars(); ++i) {
+// Allocate reduction variable (which is a pointer to the real 
reduciton
+// variable allocated in the inlined region)
+

[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-06 Thread Nikita Popov via cfe-commits

nikic wrote:

> > Allocas should be placed at the start of the entry block.
> 
> Would it be acceptable to make this change such that the alloca's are at the 
> very start of the block? I have tried that, and it seems to work. It will 
> still mean changes to the tests, and I can't guarantee ALL alloca's for all 
> Fortran (via MLIR) generated code will be at the start of the entry block.

Yes, that should be fine.

You could also use something like getFirstNonPHIOrDbgOrAlloca to compute the 
insert point after the current allocas. (Calling this many times could have 
quadratic runtime though -- I don't have enough context here to say whether 
that's a problem or not.)

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-06 Thread Mats Petersson via cfe-commits

Leporacanthicus wrote:

> Allocas should be placed at the start of the entry block.

Would it be acceptable to make this change such that the alloca's are at the 
very start of the block? I have tried that, and it seems to work. It will still 
mean changes to the tests, and I can't guarantee ALL alloca's for all Fortran 
(via MLIR) generated code will be at the start of the entry block.

We have a further ticket to clean this up, but it's a bigger task, which we 
will do, but it will take a few weeks at least.

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-06 Thread Tom Eccles via cfe-commits

https://github.com/tblah approved this pull request.

LGTM if @Meinersbur is happy. It's a shame about the fallout to the clang tests 
but I can see that would be very difficult to avoid.

Thank you for persisting with this difficult patch.

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-05 Thread Mats Petersson via cfe-commits

https://github.com/Leporacanthicus updated 
https://github.com/llvm/llvm-project/pull/92430

>From 629f5785738fdc52d4dc8d193aa43b3d011b1039 Mon Sep 17 00:00:00 2001
From: Mats Petersson 
Date: Fri, 19 Apr 2024 18:00:58 +0100
Subject: [PATCH 1/6] Fix for changed code at the end of AllocaIP.

Some of the OpenMP code can change the instruction pointed at by
the insertion point. This leads to an assert in the compiler about
BB->getParent() and IP->getParent() not matching or something like
that.

The fix is to rebuild the insertionpoint from the block, rather
than use builder.restoreIP.

Also, move some of the alloca generation, rather than skipping back
and forth between insert points (and ensure all the allocas are done
before their users are created).

A simple test, mainly to ensure the minimal reproducer doesn't fail
to compile in the future is also added.
---
 .../OpenMP/parallel-reduction-allocate.f90| 23 +++
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 11 ++---
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 22 +++---
 3 files changed, 45 insertions(+), 11 deletions(-)
 create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-allocate.f90

diff --git a/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 
b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90
new file mode 100644
index 00..fddce25ae22ccf
--- /dev/null
+++ b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90
@@ -0,0 +1,23 @@
+!! The main point of this test is to check that the code compiles at all, so 
the
+!! checking is not very detailed. Not hitting an assert, crashing or otherwise 
failing
+!! to compile is the key point. Also, emitting llvm is required for this to 
happen.
+! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine proc
+  implicit none
+  real(8),allocatable :: F(:)
+  real(8),allocatable :: A(:)
+
+!$omp parallel private(A) reduction(+:F)
+  allocate(A(10))
+!$omp end parallel
+end subroutine proc
+
+!CHECK-LABEL: define void @proc_()
+!CHECK: call void
+!CHECK-SAME: @__kmpc_fork_call(ptr {{.*}}, i32 1, ptr @[[OMP_PAR:.*]], {{.*}})
+
+!CHECK: define internal void @[[OMP_PAR]]
+!CHECK: omp.par.region8:
+!CHECK-NEXT: call ptr @malloc
+!CHECK-SAME: i64 10
+
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 2c4b45255d059c..d423b545a1faf3 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -1391,7 +1391,8 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
 
   // Change the location to the outer alloca insertion point to create and
   // initialize the allocas we pass into the parallel region.
-  Builder.restoreIP(OuterAllocaIP);
+  InsertPointTy NewOuter(OuterAllocaBlock, OuterAllocaBlock->begin());
+  Builder.restoreIP(NewOuter);
   AllocaInst *TIDAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "tid.addr");
   AllocaInst *ZeroAddrAlloca =
   Builder.CreateAlloca(Int32, nullptr, "zero.addr");
@@ -2155,7 +2156,8 @@ OpenMPIRBuilder::createReductions(const 
LocationDescription ,
   // values.
   unsigned NumReductions = ReductionInfos.size();
   Type *RedArrayTy = ArrayType::get(Builder.getPtrTy(), NumReductions);
-  Builder.restoreIP(AllocaIP);
+  Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator());
+  //  Builder.restoreIP(AllocaIP);
   Value *RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array");
 
   Builder.SetInsertPoint(InsertBlock, InsertBlock->end());
@@ -2556,7 +2558,10 @@ OpenMPIRBuilder::applyStaticWorkshareLoop(DebugLoc DL, 
CanonicalLoopInfo *CLI,
   getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_for_static_fini);
 
   // Allocate space for computed loop bounds as expected by the "init" 
function.
-  Builder.restoreIP(AllocaIP);
+
+  //  Builder.restoreIP(AllocaIP);
+  Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator());
+
   Type *I32Type = Type::getInt32Ty(M.getContext());
   Value *PLastIter = Builder.CreateAlloca(I32Type, nullptr, "p.lastiter");
   Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound");
diff --git 
a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp 
b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 6ec4c120c11ea5..47b07248ba84d6 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1154,6 +1154,17 @@ convertOmpParallel(omp::ParallelOp opInst, 
llvm::IRBuilderBase ,
 MutableArrayRef reductionArgs =
 opInst.getRegion().getArguments().take_back(
 opInst.getNumReductionVars());
+
+SmallVector byRefVars;
+if (isByRef) {
+  for (unsigned i = 0; i < opInst.getNumReductionVars(); ++i) {
+// Allocate reduction variable (which is a pointer to the real 
reduciton
+// variable allocated in the inlined region)
+

[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-05 Thread Mats Petersson via cfe-commits


@@ -0,0 +1,23 @@
+!! The main point of this test is to check that the code compiles at all, so 
the

Leporacanthicus wrote:

I have created an internal ticket to improve the code generation. It's just a 
much bigger task than this one started out as. I then found more problems, so 
this took much longer than I expected.

https://github.com/llvm/llvm-project/pull/92430
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang] [llvm] [mlir] [Flang]Fix for changed code at the end of AllocaIP. (PR #92430)

2024-06-04 Thread Mats Petersson via cfe-commits

https://github.com/Leporacanthicus updated 
https://github.com/llvm/llvm-project/pull/92430

>From e08c19ced1b2bcdd4a83ffcd3c5f5656c2021d52 Mon Sep 17 00:00:00 2001
From: Mats Petersson 
Date: Fri, 19 Apr 2024 18:00:58 +0100
Subject: [PATCH 1/5] Fix for changed code at the end of AllocaIP.

Some of the OpenMP code can change the instruction pointed at by
the insertion point. This leads to an assert in the compiler about
BB->getParent() and IP->getParent() not matching or something like
that.

The fix is to rebuild the insertionpoint from the block, rather
than use builder.restoreIP.

Also, move some of the alloca generation, rather than skipping back
and forth between insert points (and ensure all the allocas are done
before their users are created).

A simple test, mainly to ensure the minimal reproducer doesn't fail
to compile in the future is also added.
---
 .../OpenMP/parallel-reduction-allocate.f90| 23 +++
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 11 ++---
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 22 +++---
 3 files changed, 45 insertions(+), 11 deletions(-)
 create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-allocate.f90

diff --git a/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90 
b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90
new file mode 100644
index 0..fddce25ae22cc
--- /dev/null
+++ b/flang/test/Lower/OpenMP/parallel-reduction-allocate.f90
@@ -0,0 +1,23 @@
+!! The main point of this test is to check that the code compiles at all, so 
the
+!! checking is not very detailed. Not hitting an assert, crashing or otherwise 
failing
+!! to compile is the key point. Also, emitting llvm is required for this to 
happen.
+! RUN: %flang_fc1 -emit-llvm -fopenmp -o - %s 2>&1 | FileCheck %s
+subroutine proc
+  implicit none
+  real(8),allocatable :: F(:)
+  real(8),allocatable :: A(:)
+
+!$omp parallel private(A) reduction(+:F)
+  allocate(A(10))
+!$omp end parallel
+end subroutine proc
+
+!CHECK-LABEL: define void @proc_()
+!CHECK: call void
+!CHECK-SAME: @__kmpc_fork_call(ptr {{.*}}, i32 1, ptr @[[OMP_PAR:.*]], {{.*}})
+
+!CHECK: define internal void @[[OMP_PAR]]
+!CHECK: omp.par.region8:
+!CHECK-NEXT: call ptr @malloc
+!CHECK-SAME: i64 10
+
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index cb4de9c8876dc..e76197ea6c9c7 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -1391,7 +1391,8 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
 
   // Change the location to the outer alloca insertion point to create and
   // initialize the allocas we pass into the parallel region.
-  Builder.restoreIP(OuterAllocaIP);
+  InsertPointTy NewOuter(OuterAllocaBlock, OuterAllocaBlock->begin());
+  Builder.restoreIP(NewOuter);
   AllocaInst *TIDAddrAlloca = Builder.CreateAlloca(Int32, nullptr, "tid.addr");
   AllocaInst *ZeroAddrAlloca =
   Builder.CreateAlloca(Int32, nullptr, "zero.addr");
@@ -2151,7 +2152,8 @@ OpenMPIRBuilder::createReductions(const 
LocationDescription ,
   // values.
   unsigned NumReductions = ReductionInfos.size();
   Type *RedArrayTy = ArrayType::get(Builder.getPtrTy(), NumReductions);
-  Builder.restoreIP(AllocaIP);
+  Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator());
+  //  Builder.restoreIP(AllocaIP);
   Value *RedArray = Builder.CreateAlloca(RedArrayTy, nullptr, "red.array");
 
   Builder.SetInsertPoint(InsertBlock, InsertBlock->end());
@@ -2552,7 +2554,10 @@ OpenMPIRBuilder::applyStaticWorkshareLoop(DebugLoc DL, 
CanonicalLoopInfo *CLI,
   getOrCreateRuntimeFunction(M, omp::OMPRTL___kmpc_for_static_fini);
 
   // Allocate space for computed loop bounds as expected by the "init" 
function.
-  Builder.restoreIP(AllocaIP);
+
+  //  Builder.restoreIP(AllocaIP);
+  Builder.SetInsertPoint(AllocaIP.getBlock()->getTerminator());
+
   Type *I32Type = Type::getInt32Ty(M.getContext());
   Value *PLastIter = Builder.CreateAlloca(I32Type, nullptr, "p.lastiter");
   Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound");
diff --git 
a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp 
b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 6ec4c120c11ea..47b07248ba84d 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1154,6 +1154,17 @@ convertOmpParallel(omp::ParallelOp opInst, 
llvm::IRBuilderBase ,
 MutableArrayRef reductionArgs =
 opInst.getRegion().getArguments().take_back(
 opInst.getNumReductionVars());
+
+SmallVector byRefVars;
+if (isByRef) {
+  for (unsigned i = 0; i < opInst.getNumReductionVars(); ++i) {
+// Allocate reduction variable (which is a pointer to the real 
reduciton
+// variable allocated in the inlined region)
+