[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `openmp-s390x-linux` running on `systemz-1` while building `clang,openmp` at step 6 "test-openmp". Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/12678 Here is the relevant piece of the build log for the reference ``` Step 6 (test-openmp) failure: test (failure) TEST 'libomp :: worksharing/for/omp_for_private_reduction.cpp' FAILED Exit Code: 1 Command Output (stdout): -- # RUN: at line 1 /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang++ -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt -std=c++17 /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/worksharing/for/Output/omp_for_private_reduction.cpp.tmp -lm -latomic -fopenmp-version=60 && /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/worksharing/for/Output/omp_for_private_reduction.cpp.tmp # executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang++ -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt -std=c++17 /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/worksharing/for/Output/omp_for_private_reduction.cpp.tmp -lm -latomic -fopenmp-version=60 # .---command stderr # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:78:42: error: use of undeclared identifier 'I' # |78 | double _Complex expected = 0.0 + 0.0 * I; # | | ^ # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:79:40: error: use of undeclared identifier 'I' # |79 | double _Complex result = 0.0 + 0.0 * I; # | |^ # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:84:22: error: use of undeclared identifier 'I' # |84 | arr[i] = i - i * I; # | | ^ # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:92:19: error: use of undeclared identifier 'creal' # |92 | real_sum += creal(arr[i]); # | | ^ # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:93:19: error: use of undeclared identifier 'cimag' # |93 | imag_sum += cimag(arr[i]); # | | ^ # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:96:36: error: use of undeclared identifier 'I' # |96 | result = real_sum + imag_sum * I; # | |^ # | /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:97:9: error: use of undeclared identifier 'cabs' # |97 | if (cabs(result - expected) > 1e-6) { # | | ^~~~ # | 7 errors generated. # `- # error: command failed with exit status: 1 -- ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `openmp-offload-amdgpu-runtime-2` running on `rocm-worker-hw-02` while building `clang,openmp` at step 6 "test-openmp". Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/7099 Here is the relevant piece of the build log for the reference ``` Step 6 (test-openmp) failure: test (failure) TEST 'libomp :: worksharing/for/omp_for_private_reduction.cpp' FAILED Exit Code: 1 Command Output (stdout): -- # RUN: at line 1 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang++ -fopenmp -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/ompt -std=c++17 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/worksharing/for/Output/omp_for_private_reduction.cpp.tmp -lm -latomic -fopenmp-version=60 && /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/worksharing/for/Output/omp_for_private_reduction.cpp.tmp # executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang++ -fopenmp -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/ompt -std=c++17 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/worksharing/for/Output/omp_for_private_reduction.cpp.tmp -lm -latomic -fopenmp-version=60 # .---command stderr # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:78:42: error: use of undeclared identifier 'I' # |78 | double _Complex expected = 0.0 + 0.0 * I; # | | ^ # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:79:40: error: use of undeclared identifier 'I' # |79 | double _Complex result = 0.0 + 0.0 * I; # | |^ # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:84:22: error: use of undeclared identifier 'I' # |84 | arr[i] = i - i * I; # | | ^ # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:92:19: error: use of undeclared identifier 'creal' # |92 | real_sum += creal(arr[i]); # | | ^ # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:93:19: error: use of undeclared identifier 'cimag' # |93 | imag_sum += cimag(arr[i]); # | | ^ # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:96:36: error: use of undeclared identifier 'I' # |96 | result = real_sum + imag_sum * I; # | |^ # | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp:97:9: error: use of undeclared identifier 'cabs' # |97 | if (cabs(result - expected) > 1e-6) { # | | ^~~~ # | 7 errors generated. # `- # error: command failed with exit status: 1 -- ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale closed https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/alexey-bataev approved this pull request. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
chandraghale wrote: @alexey-bataev Added few more changes on top, to be on accordance with spec. Shared copy is updated with values from private copies until all updates are complete, before combining into the original list item. Hope the changes are fine with you. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -3947,7 +3947,7 @@ static void emitScanBasedDirective( CGF.CGM.getOpenMPRuntime().emitReduction( CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, {/*WithNowait=*/true, /*SimpleReduction=*/true, - /*IsPrivateVarReduction*/{}, OMPD_unknown}); + /*IsPrivateVarReduction*/ {}, OMPD_unknown}); alexey-bataev wrote: ```suggestion /*IsPrivateVarReduction=*/{}, OMPD_unknown}); ``` Recommended way https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/alexey-bataev edited https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/alexey-bataev approved this pull request. LG with nits https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5753,7 +5753,7 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) { CGM.getOpenMPRuntime().emitReduction( *this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, {/*WithNowait=*/true, /*SimpleReduction=*/true, - /*IsPrivateVarReduction*/{}, OMPD_simd}); + /*IsPrivateVarReduction*/ {}, OMPD_simd}); alexey-bataev wrote: ```suggestion /*IsPrivateVarReduction=*/{}, OMPD_simd}); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5748,7 +5752,8 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) { } CGM.getOpenMPRuntime().emitReduction( *this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_simd}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, + /*IsPrivateVarReduction*/ {false}, OMPD_simd}); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -3943,7 +3946,8 @@ static void emitScanBasedDirective( PrivScope.Privatize(); CGF.CGM.getOpenMPRuntime().emitReduction( CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_unknown}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, + /*IsPrivateVarReduction*/ {false}, OMPD_unknown}); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -3943,7 +3946,8 @@ static void emitScanBasedDirective( PrivScope.Privatize(); CGF.CGM.getOpenMPRuntime().emitReduction( CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_unknown}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, + /*IsPrivateVarReduction*/ {false}, OMPD_unknown}); alexey-bataev wrote: ```suggestion /*IsPrivateVarReduction=*/{}, OMPD_unknown}); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5748,7 +5752,8 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) { } CGM.getOpenMPRuntime().emitReduction( *this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_simd}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, + /*IsPrivateVarReduction*/ {false}, OMPD_simd}); alexey-bataev wrote: ```suggestion /*IsPrivateVarReduction=*/{}, OMPD_simd}); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); chandraghale wrote: Done https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); +else if (InitValue.isFloat()) + InitVal = llvm::ConstantFP::get(LLVMType, InitValue.getFloat()); +else if (InitValue.isComplexInt()) { + // For complex int: create struct { real, imag } + llvm::Constant *Real = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexIntReal()); + llvm::Constant *Imag = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexIntImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} else if (InitValue.isComplexFloat()) { + llvm::Constant *Real = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexFloatReal()); + llvm::Constant *Imag = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexFloatImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cas
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +// EmitNullInitialization handles default construction for C++ classes +// and zeroing for scalars, which is a reasonable default. +CGF.EmitNullInitialization(SharedResult, PrivateType); + } + return; // UDR initialization handled +} +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +if (const Expr *InitExpr = VD->getInit()) { + CGF.EmitAnyExprToMem(InitExpr, SharedResult, + PrivateType.getQualifiers(), true); + return; +} + } +} +CGF.EmitNullInitialization(SharedResult, PrivateType); + }; + EmitSharedInit(); + CGF.Bui
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); chandraghale wrote: Done !!! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); alexey-bataev wrote: Use OMPBuilder.getOrCreateInternalVariable https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); alexey-bataev wrote: Use CGF.MakeNaturalAlignRawAddrLValue https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +// EmitNullInitialization handles default construction for C++ classes +// and zeroing for scalars, which is a reasonable default. +CGF.EmitNullInitialization(SharedResult, PrivateType); + } + return; // UDR initialization handled +} +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +if (const Expr *InitExpr = VD->getInit()) { + CGF.EmitAnyExprToMem(InitExpr, SharedResult, + PrivateType.getQualifiers(), true); + return; +} + } +} +CGF.EmitNullInitialization(SharedResult, PrivateType); + }; + EmitSharedInit(); + CGF.Bui
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); alexey-bataev wrote: ```suggestion PrivateType.getQualifiers(), /*IsInitializer=*/true); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); alexey-bataev wrote: ```suggestion PrivateType.getQualifiers(), /*IsInitializer=*/true); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,234 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + llvm::Constant::getNullValue(LLVMType), ".omp.reduction." + SharedName, + nullptr, llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); alexey-bataev wrote: ```suggestion PrivateType.getQualifiers(), /*IsInitializer=*/true); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -1481,6 +1482,8 @@ void CodeGenFunction::EmitOMPReductionClauseFinal( Privates.append(C->privates().begin(), C->privates().end()); LHSExprs.append(C->lhs_exprs().begin(), C->lhs_exprs().end()); RHSExprs.append(C->rhs_exprs().begin(), C->rhs_exprs().end()); +IsPrivate.append(C->private_var_reduction_flags().begin(), + C->private_var_reduction_flags().end()); chandraghale wrote: Thanks for pointing, I have fixed it now. Mix of private of non-private reduction is now correctly populating. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5748,7 +5754,7 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) { } CGM.getOpenMPRuntime().emitReduction( *this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_simd}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, false, OMPD_simd}); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -3943,7 +3948,8 @@ static void emitScanBasedDirective( PrivScope.Privatize(); CGF.CGM.getOpenMPRuntime().emitReduction( CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_unknown}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, + /*IsPrivateVarReduction */ false, OMPD_unknown}); chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5748,7 +5754,7 @@ void CodeGenFunction::EmitOMPScanDirective(const OMPScanDirective &S) { } CGM.getOpenMPRuntime().emitReduction( *this, ParentDir.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_simd}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, false, OMPD_simd}); alexey-bataev wrote: ```suggestion {/*WithNowait=*/true, /*SimpleReduction=*/true, /*IsPrivateVarReduction=*/false, OMPD_simd}); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -1481,6 +1482,8 @@ void CodeGenFunction::EmitOMPReductionClauseFinal( Privates.append(C->privates().begin(), C->privates().end()); LHSExprs.append(C->lhs_exprs().begin(), C->lhs_exprs().end()); RHSExprs.append(C->rhs_exprs().begin(), C->rhs_exprs().end()); +IsPrivate.append(C->private_var_reduction_flags().begin(), + C->private_var_reduction_flags().end()); alexey-bataev wrote: What if there is a mix of private and non-private reductions in a single construct? IsPrivateVarReduction flag is set for all reduced value, not matter if they are private or not https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -3943,7 +3948,8 @@ static void emitScanBasedDirective( PrivScope.Privatize(); CGF.CGM.getOpenMPRuntime().emitReduction( CGF, S.getEndLoc(), Privates, LHSs, RHSs, ReductionOps, - {/*WithNowait=*/true, /*SimpleReduction=*/true, OMPD_unknown}); + {/*WithNowait=*/true, /*SimpleReduction=*/true, + /*IsPrivateVarReduction */ false, OMPD_unknown}); alexey-bataev wrote: ```suggestion /*IsPrivateVarReduction=*/false, OMPD_unknown}); ``` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5200,6 +5428,26 @@ void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc, CGF.EmitBranch(DefaultBB); CGF.EmitBlock(DefaultBB, /*IsFinished=*/true); + if (Options.IsPrivateVarReduction) { +if (LHSExprs.empty() || Privates.empty() || ReductionOps.empty()) + return; +if (LHSExprs.size() != Privates.size() || +LHSExprs.size() != ReductionOps.size()) + return; alexey-bataev wrote: This code musy be removed and transformed into asserts https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
chandraghale wrote: ping @alexey-bataev https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,273 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); +else if (InitValue.isFloat()) + InitVal = llvm::ConstantFP::get(LLVMType, InitValue.getFloat()); +else if (InitValue.isComplexInt()) { + // For complex int: create struct { real, imag } + llvm::Constant *Real = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexIntReal()); + llvm::Constant *Imag = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexIntImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} else if (InitValue.isComplexFloat()) { + llvm::Constant *Real = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexFloatReal()); + llvm::Constant *Imag = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexFloatImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cas
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5200,6 +5460,18 @@ void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc, CGF.EmitBranch(DefaultBB); CGF.EmitBlock(DefaultBB, /*IsFinished=*/true); + if (Options.IsPrivateVarReduction) { +if (LHSExprs.empty() || Privates.empty() || ReductionOps.empty()) + return; +if (LHSExprs.size() != Privates.size() || +LHSExprs.size() != ReductionOps.size()) + return; alexey-bataev wrote: I mean better to have asserts here instead of early exits. These early exits must be deleted https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { alexey-bataev wrote: Why do you need such a complex analysis, why you cannot rely on a EmitSharedInit to handle the initialization for const values? https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,273 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); +else if (InitValue.isFloat()) + InitVal = llvm::ConstantFP::get(LLVMType, InitValue.getFloat()); +else if (InitValue.isComplexInt()) { + // For complex int: create struct { real, imag } + llvm::Constant *Real = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexIntReal()); + llvm::Constant *Imag = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexIntImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} else if (InitValue.isComplexFloat()) { + llvm::Constant *Real = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexFloatReal()); + llvm::Constant *Imag = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexFloatImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cas
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale updated https://github.com/llvm/llvm-project/pull/134709 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); +else if (InitValue.isFloat()) + InitVal = llvm::ConstantFP::get(LLVMType, InitValue.getFloat()); +else if (InitValue.isComplexInt()) { + // For complex int: create struct { real, imag } + llvm::Constant *Real = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexIntReal()); + llvm::Constant *Imag = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexIntImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} else if (InitValue.isComplexFloat()) { + llvm::Constant *Real = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexFloatReal()); + llvm::Constant *Imag = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexFloatImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cas
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); +else if (InitValue.isFloat()) + InitVal = llvm::ConstantFP::get(LLVMType, InitValue.getFloat()); +else if (InitValue.isComplexInt()) { + // For complex int: create struct { real, imag } + llvm::Constant *Real = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexIntReal()); + llvm::Constant *Imag = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexIntImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} else if (InitValue.isComplexFloat()) { + llvm::Constant *Real = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexFloatReal()); + llvm::Constant *Imag = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexFloatImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cas
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { chandraghale wrote: For non-const initalizer ,It is handled in the subsequent intialization phase within EmitSharedInit block. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); +else if (InitValue.isFloat()) + InitVal = llvm::ConstantFP::get(LLVMType, InitValue.getFloat()); +else if (InitValue.isComplexInt()) { + // For complex int: create struct { real, imag } + llvm::Constant *Real = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexIntReal()); + llvm::Constant *Imag = llvm::ConstantInt::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexIntImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} else if (InitValue.isComplexFloat()) { + llvm::Constant *Real = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(0), + InitValue.getComplexFloatReal()); + llvm::Constant *Imag = llvm::ConstantFP::get( + cast(LLVMType)->getElementType(1), + InitValue.getComplexFloatImag()); + InitVal = llvm::ConstantStruct::get( + cast(LLVMType), {Real, Imag}); +} + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + else +ReductionVarNameStr = "unnamed_priv_var"; + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cas
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,274 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { alexey-bataev wrote: Not sure this is correct. What if the init value is not a constant? You need to emit the initial value and then assign it. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
chandraghale wrote: Addressed release-note rebase problem. @alexey-bataev https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -937,15 +937,7 @@ OpenMP Support - Added support 'no_openmp_constructs' assumption clause. - Added support for 'self_maps' in map and requirement clause. - Added support for 'omp stripe' directive. -- Fixed a crashing bug with ``omp unroll partial`` if the argument to - ``partial`` was an invalid expression. (#GH139267) -- Fixed a crashing bug with ``omp tile sizes`` if the argument to ``sizes`` was - an invalid expression. (#GH139073) -- Fixed a crashing bug with ``omp simd collapse`` if the argument to - ``collapse`` was an invalid expression. (#GH138493) -- Fixed a crashing bug with a malformed ``cancel`` directive. (#GH139360) -- Fixed a crashing bug with ``omp distribute dist_schedule`` if the argument to - ``dist_schedule`` was not strictly positive. (#GH139266) +- Added support for private variable reduction. chandraghale wrote: Fixed !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -937,15 +937,7 @@ OpenMP Support - Added support 'no_openmp_constructs' assumption clause. - Added support for 'self_maps' in map and requirement clause. - Added support for 'omp stripe' directive. -- Fixed a crashing bug with ``omp unroll partial`` if the argument to - ``partial`` was an invalid expression. (#GH139267) -- Fixed a crashing bug with ``omp tile sizes`` if the argument to ``sizes`` was - an invalid expression. (#GH139073) -- Fixed a crashing bug with ``omp simd collapse`` if the argument to - ``collapse`` was an invalid expression. (#GH138493) -- Fixed a crashing bug with a malformed ``cancel`` directive. (#GH139360) -- Fixed a crashing bug with ``omp distribute dist_schedule`` if the argument to - ``dist_schedule`` was not strictly positive. (#GH139266) +- Added support for private variable reduction. alexey-bataev wrote: Again, the problem with the rebase. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
chandraghale wrote: @alexey-bataev updated few more test with complex. Any more feedback. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr && !PrivateType->isAggregateType() && +!PrivateType->isAnyComplexType()) { alexey-bataev wrote: Add at least a runtime test with complex types, if possible https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr && !PrivateType->isAggregateType() && +!PrivateType->isAnyComplexType()) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) { +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + } else { +ReductionVarNameStr = "unnamed_priv_var"; + } + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResu
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr && !PrivateType->isAggregateType() && +!PrivateType->isAnyComplexType()) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) { +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + } else { +ReductionVarNameStr = "unnamed_priv_var"; + } + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResu
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr && !PrivateType->isAggregateType() && +!PrivateType->isAnyComplexType()) { alexey-bataev wrote: Complex types should be supported, the compiler should not drop it silently https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr && !PrivateType->isAggregateType() && +!PrivateType->isAnyComplexType()) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) { +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + } else { +ReductionVarNameStr = "unnamed_priv_var"; + } + + // Create an internal shared variable + std::string SharedName = + CGM.getOpenMPRuntime().getName({"internal_pivate_", ReductionVarNameStr}); + llvm::GlobalVariable *SharedVar = new llvm::GlobalVariable( + CGM.getModule(), LLVMType, false, llvm::GlobalValue::InternalLinkage, + InitVal, ".omp.reduction." + SharedName, nullptr, + llvm::GlobalVariable::NotThreadLocal); + + SharedVar->setAlignment( + llvm::MaybeAlign(CGF.getContext().getTypeAlign(PrivateType) / 8)); + + Address SharedResult(SharedVar, SharedVar->getValueType(), + CGF.getContext().getTypeAlignInChars(PrivateType)); + + llvm::Value *ThreadId = getThreadID(CGF, Loc); + llvm::Value *BarrierLoc = emitUpdateLocation(CGF, Loc, OMP_ATOMIC_REDUCE); + llvm::Value *BarrierArgs[] = {BarrierLoc, ThreadId}; + + llvm::BasicBlock *InitBB = CGF.createBasicBlock("init"); + llvm::BasicBlock *InitEndBB = CGF.createBasicBlock("init.end"); + + llvm::Value *IsWorker = CGF.Builder.CreateICmpEQ( + ThreadId, llvm::ConstantInt::get(ThreadId->getType(), 0)); + CGF.Builder.CreateCondBr(IsWorker, InitBB, InitEndBB); + + CGF.EmitBlock(InitBB); + + auto EmitSharedInit = [&]() { +if (UDR) { // Check if it's a User-Defined Reduction + if (const Expr *UDRInitExpr = UDR->getInitializer()) { +std::pair FnPair = +getUserDefinedReduction(UDR); +llvm::Function *InitializerFn = FnPair.second; +if (InitializerFn) { + if (const auto *CE = + dyn_cast(UDRInitExpr->IgnoreParenImpCasts())) { +const auto *OutDRE = cast( +cast(CE->getArg(0)->IgnoreParenImpCasts()) +->getSubExpr()); +const VarDecl *OutVD = cast(OutDRE->getDecl()); + +CodeGenFunction::OMPPrivateScope LocalScope(CGF); +LocalScope.addPrivate(OutVD, SharedResult); + +(void)LocalScope.Privatize(); +if (const auto *OVE = dyn_cast( +CE->getCallee()->IgnoreParenImpCasts())) { + CodeGenFunction::OpaqueValueMapping OpaqueMap( + CGF, OVE, RValue::get(InitializerFn)); + CGF.EmitIgnoredExpr(CE); +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); +} + } else { +CGF.EmitAnyExprToMem(UDRInitExpr, SharedResult, + PrivateType.getQualifiers(), true); + } +} else { + CGF.EmitAnyExprToMem(UDRInitExpr, SharedResu
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -5200,6 +5460,18 @@ void CGOpenMPRuntime::emitReduction(CodeGenFunction &CGF, SourceLocation Loc, CGF.EmitBranch(DefaultBB); CGF.EmitBlock(DefaultBB, /*IsFinished=*/true); + if (Options.IsPrivateVarReduction) { +if (LHSExprs.empty() || Privates.empty() || ReductionOps.empty()) + return; +if (LHSExprs.size() != Privates.size() || +LHSExprs.size() != ReductionOps.size()) + return; alexey-bataev wrote: It is better to have these as asserts, not checks https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -4898,6 +4898,266 @@ void CGOpenMPRuntime::emitSingleReductionCombiner(CodeGenFunction &CGF, } } +void CGOpenMPRuntime::emitPrivateReduction( +CodeGenFunction &CGF, SourceLocation Loc, const Expr *Privates, +const Expr *LHSExprs, const Expr *RHSExprs, const Expr *ReductionOps) { + + // Create a shared global variable (__shared_reduction_var) to accumulate the + // final result. + // + // Call __kmpc_barrier to synchronize threads before initialization. + // + // The master thread (thread_id == 0) initializes __shared_reduction_var + //with the identity value or initializer. + // + // Call __kmpc_barrier to synchronize before combining. + // For each i: + //- Thread enters critical section. + //- Reads its private value from LHSExprs[i]. + //- Updates __shared_reduction_var[i] = RedOp_i(__shared_reduction_var[i], + //LHSExprs[i]). + //- Exits critical section. + // + // Call __kmpc_barrier after combining. + // + // Each thread copies __shared_reduction_var[i] back to LHSExprs[i]. + // + // Final __kmpc_barrier to synchronize after broadcasting + QualType PrivateType = Privates->getType(); + llvm::Type *LLVMType = CGF.ConvertTypeForMem(PrivateType); + + llvm::Constant *InitVal = nullptr; + const OMPDeclareReductionDecl *UDR = getReductionInit(ReductionOps); + // Determine the initial value for the shared reduction variable + if (!UDR) { +InitVal = llvm::Constant::getNullValue(LLVMType); +if (const auto *DRE = dyn_cast(Privates)) { + if (const auto *VD = dyn_cast(DRE->getDecl())) { +const Expr *InitExpr = VD->getInit(); +if (InitExpr && !PrivateType->isAggregateType() && +!PrivateType->isAnyComplexType()) { + Expr::EvalResult Result; + if (InitExpr->EvaluateAsRValue(Result, CGF.getContext())) { +APValue &InitValue = Result.Val; +if (InitValue.isInt()) + InitVal = llvm::ConstantInt::get(LLVMType, InitValue.getInt()); + } +} + } +} + } else { +InitVal = llvm::Constant::getNullValue(LLVMType); + } + std::string ReductionVarNameStr; + if (const auto *DRE = dyn_cast(Privates->IgnoreParenCasts())) { +ReductionVarNameStr = DRE->getDecl()->getNameAsString(); + } else { +ReductionVarNameStr = "unnamed_priv_var"; + } alexey-bataev wrote: Drop extra braces arounf substatements https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale updated https://github.com/llvm/llvm-project/pull/134709 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale updated https://github.com/llvm/llvm-project/pull/134709 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -530,6 +530,12 @@ OpenMP Support - Added support 'no_openmp_constructs' assumption clause. - Added support for 'self_maps' in map and requirement clause. - Added support for 'omp stripe' directive. +- Fixed a crashing bug with ``omp unroll partial`` if the argument to + ``partial`` was an invalid expression. (#GH139267) +- Fixed a crashing bug with ``omp tile sizes`` if the argument to ``sizes`` was + an invalid expression. (#GH139073) +- Fixed a crashing bug with ``omp distribute dist_schedule`` if the argument to + ``dist_schedule`` was not strictly positive. (#GH139266) chandraghale wrote: Yeah .. I was trying to resolve merge conflict with this file. Will resolve this. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -530,6 +530,12 @@ OpenMP Support - Added support 'no_openmp_constructs' assumption clause. - Added support for 'self_maps' in map and requirement clause. - Added support for 'omp stripe' directive. +- Fixed a crashing bug with ``omp unroll partial`` if the argument to + ``partial`` was an invalid expression. (#GH139267) +- Fixed a crashing bug with ``omp tile sizes`` if the argument to ``sizes`` was + an invalid expression. (#GH139073) +- Fixed a crashing bug with ``omp distribute dist_schedule`` if the argument to + ``dist_schedule`` was not strictly positive. (#GH139266) alexey-bataev wrote: These changes should not be part of your patch, looks like rebase issue? https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale updated https://github.com/llvm/llvm-project/pull/134709 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -0,0 +1,93 @@ +//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +#include +#include +#include "omp_testsuite.h" + +#define N 10 +class Sum { + int val; + +public: + Sum(int v = 0) : val(v) {} + Sum operator+(const Sum &rhs) const { return Sum(val + rhs.val); } + Sum &operator+=(const Sum &rhs) { +val += rhs.val; +return *this; + } + int getValue() const { return val; } +}; + +// Declare OpenMP reduction +#pragma omp declare reduction(sum_reduction:Sum : omp_out += omp_in) \ +initializer(omp_priv = Sum(0)) + +int checkUserDefinedReduction() { + Sum final_result_udr(0); + Sum array_sum[N]; + int error_flag = 0; + int expected_value = 0; + for (int i = 0; i < N; ++i) { +array_sum[i] = Sum(i); +expected_value += i; // Calculate expected sum: 0 + 1 + ... + (N-1) + } +#pragma omp parallel num_threads(4) + { +#pragma omp for reduction(sum_reduction : final_result_udr) chandraghale wrote: Added one reduction var. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -0,0 +1,93 @@ +//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +#include +#include +#include "omp_testsuite.h" + +#define N 10 +class Sum { + int val; + +public: + Sum(int v = 0) : val(v) {} + Sum operator+(const Sum &rhs) const { return Sum(val + rhs.val); } + Sum &operator+=(const Sum &rhs) { +val += rhs.val; +return *this; + } + int getValue() const { return val; } +}; + +// Declare OpenMP reduction +#pragma omp declare reduction(sum_reduction:Sum : omp_out += omp_in) \ +initializer(omp_priv = Sum(0)) + +int checkUserDefinedReduction() { + Sum final_result_udr(0); + Sum array_sum[N]; + int error_flag = 0; + int expected_value = 0; + for (int i = 0; i < N; ++i) { +array_sum[i] = Sum(i); +expected_value += i; // Calculate expected sum: 0 + 1 + ... + (N-1) + } +#pragma omp parallel num_threads(4) + { +#pragma omp for reduction(sum_reduction : final_result_udr) +for (int i = 0; i < N; ++i) { + final_result_udr += array_sum[i]; +} + +if (final_result_udr.getValue() != expected_value) + error_flag += 1; + } + return error_flag; +} + +void performReductions(int n_elements, const int *input_values, + int &sum_val_out, int &prod_val_out, + float &float_sum_val_out) { + // private variables for this thread's reduction. + sum_val_out = 0; + prod_val_out = 1; + float_sum_val_out = 0.0f; + + const float kPiValue = 3.14f; +#pragma omp for reduction(original(private), + : sum_val_out) \ chandraghale wrote: Is nt this already reducing over 2 more variable *:prod_val_out and +:float_sum_val_out. https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -0,0 +1,93 @@ +//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +#include +#include +#include "omp_testsuite.h" + +#define N 10 +class Sum { + int val; + +public: + Sum(int v = 0) : val(v) {} + Sum operator+(const Sum &rhs) const { return Sum(val + rhs.val); } + Sum &operator+=(const Sum &rhs) { +val += rhs.val; +return *this; + } + int getValue() const { return val; } +}; + +// Declare OpenMP reduction +#pragma omp declare reduction(sum_reduction:Sum : omp_out += omp_in) \ +initializer(omp_priv = Sum(0)) chandraghale wrote: Done !! https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale updated https://github.com/llvm/llvm-project/pull/134709 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -0,0 +1,93 @@ +//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +#include +#include +#include "omp_testsuite.h" + +#define N 10 +class Sum { + int val; + +public: + Sum(int v = 0) : val(v) {} + Sum operator+(const Sum &rhs) const { return Sum(val + rhs.val); } + Sum &operator+=(const Sum &rhs) { +val += rhs.val; +return *this; + } + int getValue() const { return val; } +}; + +// Declare OpenMP reduction +#pragma omp declare reduction(sum_reduction:Sum : omp_out += omp_in) \ +initializer(omp_priv = Sum(0)) + +int checkUserDefinedReduction() { + Sum final_result_udr(0); + Sum array_sum[N]; + int error_flag = 0; + int expected_value = 0; + for (int i = 0; i < N; ++i) { +array_sum[i] = Sum(i); +expected_value += i; // Calculate expected sum: 0 + 1 + ... + (N-1) + } +#pragma omp parallel num_threads(4) + { +#pragma omp for reduction(sum_reduction : final_result_udr) +for (int i = 0; i < N; ++i) { + final_result_udr += array_sum[i]; +} + +if (final_result_udr.getValue() != expected_value) + error_flag += 1; + } + return error_flag; +} + +void performReductions(int n_elements, const int *input_values, + int &sum_val_out, int &prod_val_out, + float &float_sum_val_out) { + // private variables for this thread's reduction. + sum_val_out = 0; + prod_val_out = 1; + float_sum_val_out = 0.0f; + + const float kPiValue = 3.14f; +#pragma omp for reduction(original(private), + : sum_val_out) \ alexey-bataev wrote: Same, use 2 variables https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -0,0 +1,93 @@ +//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +#include +#include +#include "omp_testsuite.h" + +#define N 10 +class Sum { + int val; + +public: + Sum(int v = 0) : val(v) {} + Sum operator+(const Sum &rhs) const { return Sum(val + rhs.val); } + Sum &operator+=(const Sum &rhs) { +val += rhs.val; +return *this; + } + int getValue() const { return val; } +}; + +// Declare OpenMP reduction +#pragma omp declare reduction(sum_reduction:Sum : omp_out += omp_in) \ +initializer(omp_priv = Sum(0)) alexey-bataev wrote: Better to have Sum(1) or something else to check that not a defulat constructor is called https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
@@ -0,0 +1,93 @@ +//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +#include +#include +#include "omp_testsuite.h" + +#define N 10 +class Sum { + int val; + +public: + Sum(int v = 0) : val(v) {} + Sum operator+(const Sum &rhs) const { return Sum(val + rhs.val); } + Sum &operator+=(const Sum &rhs) { +val += rhs.val; +return *this; + } + int getValue() const { return val; } +}; + +// Declare OpenMP reduction +#pragma omp declare reduction(sum_reduction:Sum : omp_out += omp_in) \ +initializer(omp_priv = Sum(0)) + +int checkUserDefinedReduction() { + Sum final_result_udr(0); + Sum array_sum[N]; + int error_flag = 0; + int expected_value = 0; + for (int i = 0; i < N; ++i) { +array_sum[i] = Sum(i); +expected_value += i; // Calculate expected sum: 0 + 1 + ... + (N-1) + } +#pragma omp parallel num_threads(4) + { +#pragma omp for reduction(sum_reduction : final_result_udr) alexey-bataev wrote: Make reductions over 2 variables https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- clang/test/OpenMP/for_private_reduction_codegen.cpp openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp clang/lib/CodeGen/CGOpenMPRuntime.cpp clang/lib/CodeGen/CGOpenMPRuntime.h clang/lib/CodeGen/CGStmtOpenMP.cpp `` View the diff from clang-format here. ``diff diff --git a/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp b/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp index 0a3bbafd9..7227b974c 100644 --- a/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp +++ b/openmp/runtime/test/worksharing/for/omp_for_private_reduction.cpp @@ -1,4 +1,4 @@ -//RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run +// RUN: %libomp-cxx-compile -fopenmp-version=60 && %libomp-run #include #include #include "omp_testsuite.h" @@ -65,8 +65,8 @@ int main(void) { int input_array[N]; int total_errors = 0; const float kPiVal = 3.14f; - const int kExpectedSum = 45;// Sum of 0..9 - const int kExpectedProd = 3628800; // 10! + const int kExpectedSum = 45; // Sum of 0..9 + const int kExpectedProd = 3628800; // 10! const float kExpectedFsum = kPiVal * N; // 3.14f * 10 for (int i = 0; i < N; i++) `` https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
chandraghale wrote: > Can you add a few runtime tests, some with UDR with classes, several > reduction items with the different types, to check that functional part works > correctly? You can add it to the libomp right in this patch. Just want to be > sure we do not miss anything here Sure !!! Added few runtime tests. @alexey-bataev https://github.com/llvm/llvm-project/pull/134709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [openmp] [OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (PR #134709)
https://github.com/chandraghale updated https://github.com/llvm/llvm-project/pull/134709 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits