[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-19 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

@kaz7, it seems that the thread_limit is being set properly, but 
`omp_get_thread_limit()` returns a wrong value when the test is built at any 
optimization level above `-O1`. I will fix it as soon as I can. Meanwhile, if 
you need the test case to pass right now, either remove the printf causing the 
issue or do not run that test case at an optimization level higher than `-O1`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-07 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:28
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel

mstorsjo wrote:
> mstorsjo wrote:
> > This test fails when running (on Windows) on GitHub Actions runners - see 
> > https://github.com/mstorsjo/llvm-mingw/actions/runs/6019088705/job/16342540379.
> > 
> > I believe that this bit of the test has got a hidden assumption that it is 
> > running in an environment with 4 or more cores. By setting `#pragma omp 
> > target thread_limit(tl)` (with `tl=4`) and running a line in parallel with 
> > `#pragma omp parallel`, it expects that we'll get 4 printouts - while in 
> > practice, we'll get anywhere between 1 and 4 printouts depending on the 
> > number of cores.
> > 
> > Is there something that can be done to make this test work in such an 
> > environment too?
> Can someone involved in this patch take on fixing it so that it works on 
> machines with fewer than 4 cores? I'm not sure what's the most appropriate 
> path forward here, as it breaks clearly in such configs (even if it might not 
> be hit by one of the official llvm buildbots, but it shows up as breakage in 
> my nightly builds every day now) - reverting seems a bit harsh. I guess I 
> could just rip out this part of the test?
@mstorsjo, I noticed that you committed this:
https://github.com/llvm/llvm-project/commit/c2019c416c8d7ec50aec6ac6b82c9aa4e99b0f6f

Does this solve your problem?




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553487.
sandeepkosuri added a comment.

Made the new LIT test cases target-specific to Linux.


Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+    printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+    { printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+    { printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+    { printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+    foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+    omp_set_num_threads(2);
+#pragma omp parallel
+    { printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+    printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+    printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+    { printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel num_threads(10)
+
+// check combined target directives which 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553405.
sandeepkosuri added a comment.

Made the LIT test cases more robust against check-line ordering problems.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553358.
sandeepkosuri added a comment.

Used `CHECK-DAG` directives to avoid LIT test failures on Windows systems.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553139.
sandeepkosuri added a comment.

Edited the LIT test cases to use more script-generated check lines.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri marked an inline comment as done.
sandeepkosuri added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

ABataev wrote:
> sandeepkosuri wrote:
> > ABataev wrote:
> > > Why were these checks removed?
> > I did not remove any check lines in this function.
> > However, I removed checks in the `omp_task_entry` function that were not 
> > related to my changes, to avoid failures. I only wanted to check whether 
> > `__kmpc_set_thread_limit()` is called.
> > 
> > The same applies to all the other test cases.
> Better to restore them, so that the script can be used in the future without 
> many changes.
But a few check lines fail on Windows while passing on Debian.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-23 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri marked an inline comment as done.
sandeepkosuri added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

ABataev wrote:
> Why were these checks removed?
I did not remove any check lines in this function.
However, I removed checks in the `omp_task_entry` function that were not 
related to my changes, to avoid failures. I only wanted to check whether 
`__kmpc_set_thread_limit()` is called.

The same applies to all the other test cases.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-23 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 552725.
sandeepkosuri added a comment.

Added PCH options to the RUN lines in LIT tests




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-09 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 548518.
sandeepkosuri added a comment.

Used the Python script `update_cc_test_checks.py` to generate the checks for 
the newly added tests.



Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+    printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+    { printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+    { printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+    { printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+    foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+    omp_set_num_threads(2);
+#pragma omp parallel
+    { printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+    printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-08 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 548082.
sandeepkosuri added a comment.

- Updated `SemaOpenMP.cpp` to support the `thread_limit` clause on the newly
allowed directives.

- This update fixes the failures of the newly added LIT tests (which occurred
only on debug builds).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-06 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

In D152054#4560353, @tianshilei1992 wrote:

> Is this patch to support `thread_limit` on `target` directive on the host?

Yes @tianshilei1992, it is for the host only.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-04 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9866
+  (CGM.getLangOpts().OpenMP >= 51 &&
+   needsTaskBasedThreadLimit(D.getDirectiveKind()) &&
+   D.hasClausesOfKind<OMPThreadLimitClause>());

ABataev wrote:
> I think you don't need needsTaskBasedThreadLimit call here, the 
> emitTargetCall function itself can be called only for target-based directives
`emitTargetCall()` is called for all target-based directives, including the
target teams-based ones, which already have a thread limit implementation in
place. So I need `needsTaskBasedThreadLimit` to select only the applicable
directives.
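
The selection described above can be sketched as a predicate over directive
kinds. This is only a minimal, self-contained illustration: the `Directive`
enum and both function names are hypothetical stand-ins (Clang's real type is
`OpenMPDirectiveKind`), and the set of accepted directives is assumed from the
names of the `*_tl_codegen.cpp` test files listed in this thread.

```cpp
// Hypothetical directive kinds, for illustration only.
enum class Directive {
  Target,
  TargetParallel,
  TargetParallelFor,
  TargetParallelForSimd,
  TargetParallelLoop,
  TargetSimd,
  TargetTeams,           // teams-based: thread_limit is already handled
  TargetTeamsDistribute, // teams-based: thread_limit is already handled
};

// Returns true only for target directives that have no teams region, i.e.
// the ones assumed to need the task-based thread_limit handling.
constexpr bool needsTaskBasedThreadLimitSketch(Directive D) {
  switch (D) {
  case Directive::Target:
  case Directive::TargetParallel:
  case Directive::TargetParallelFor:
  case Directive::TargetParallelForSimd:
  case Directive::TargetParallelLoop:
  case Directive::TargetSimd:
    return true;
  case Directive::TargetTeams:
  case Directive::TargetTeamsDistribute:
    return false;
  }
  return false;
}

// Mirrors the shape of the condition quoted above: language version, kind of
// directive, and presence of a thread_limit clause must all agree.
constexpr bool requiresOuterTaskForThreadLimit(int OpenMPVersion, Directive D,
                                               bool HasThreadLimitClause) {
  return OpenMPVersion >= 51 && needsTaskBasedThreadLimitSketch(D) &&
         HasThreadLimitClause;
}
```

Without the predicate, a `target teams` directive with `thread_limit` would
also take the outer-task path, even though it already has its own handling.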



Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143
+if (CGF.CGM.getLangOpts().OpenMP >= 51 &&
+needsTaskBasedThreadLimit(S.getDirectiveKind()) && TL) {
+  // Emit __kmpc_set_thread_limit() to set the thread_limit for the task

ABataev wrote:
> Same regarding needsTaskBasedThreadLimit(S.getDirectiveKind()), the function
> EmitOMPTargetTaskBasedDirective is called only for target-based directives
The same applies here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-07-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 544201.
sandeepkosuri added a comment.

Explicitly mentioned `-fopenmp-version=51` in LIT test cases


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel num_threads(10)
+
+// check combined target directives which support thread_limit
+// 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-07-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 543889.
sandeepkosuri added a comment.

- Added support for the `thread_limit` clause on relevant combined directives
which begin with `target`, as per @ABataev's comments.
- Added additional LIT test cases to check codegen of `thread_limit` on the
newly supported directives.
- Updated the runtime LIT as per @jdoerfert's comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-30 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9818
+  D.hasClausesOfKind() ||
+  (CGM.getLangOpts().OpenMP >= 51 && D.getDirectiveKind() == OMPD_target &&
+   D.hasClausesOfKind<OMPThreadLimitClause>());

ABataev wrote:
> What if D is combined target directive, i.e. D.getDirectiveKind() is 
> something like OMPD_target_teams, etc.?
I will fix that, thanks for noticing.



Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143-5148
+S.getSingleClause<OMPThreadLimitClause>()) {
+  // Emit __kmpc_set_thread_limit() to set the thread_limit for the task
+  // enclosing this target region. This will indirectly set the thread_limit
+  // for every applicable construct within target region.
+  CGF.CGM.getOpenMPRuntime().emitThreadLimitClause(
+  CGF, S.getSingleClause<OMPThreadLimitClause>()->getThreadLimit(),

ABataev wrote:
> Avoid double call of S.getSingleClause<OMPThreadLimitClause>(), store in
> local variable call result.
sure.
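
A minimal sketch of the suggested pattern, looking the clause up once and
reusing the pointer. The types `ThreadLimitClauseSketch` and `DirectiveSketch`
and the function `threadLimitOrZero` are hypothetical stand-ins, not Clang's
classes; the `Lookups` counter exists only to make the single lookup visible.

```cpp
// Hypothetical clause type carrying the thread_limit value.
struct ThreadLimitClauseSketch {
  int Limit;
};

// Hypothetical directive type; counts how often the clause is looked up.
struct DirectiveSketch {
  const ThreadLimitClauseSketch *Clause = nullptr;
  mutable int Lookups = 0;

  const ThreadLimitClauseSketch *getThreadLimitClause() const {
    ++Lookups;
    return Clause;
  }
};

// Stores the lookup result in a local variable and uses it for both the
// null check and the value access. Returns 0 when there is no clause.
int threadLimitOrZero(const DirectiveSketch &S) {
  if (const ThreadLimitClauseSketch *TL = S.getThreadLimitClause())
    return TL->Limit;
  return 0;
}
```

Declaring the variable in the `if` condition keeps its scope limited to the
branch that actually uses it.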



Comment at: clang/test/OpenMP/target_codegen.cpp:849
 // OMP51: [[CE:%.*]] = load {{.*}} [[CEA]]
-// OMP51: call i32 @__tgt_target_kernel({{.*}}, i64 -1, i32 -1, i32 [[CE]],
+// OMP51: call ptr @__kmpc_omp_task_alloc({{.*@.omp_task_entry.*}})
+// OMP51: call i32 [[OMP_TASK_ENTRY]]

ABataev wrote:
> It requires extra resource consumption, can you try to avoid creating outer 
> task, if possible?
I tried several ideas for making `thread_limit` work on `target`.

First, I tried to reuse the existing implementation by rewriting the directive
as `target teams(1) thread_limit(x)` at the parsing, sema and IR stages, but I
couldn't get any of these to work. Next, I tried adding `num_threads` to all
the parallel directives within `target`, but corner cases, such as parallel
directives in a function that is called from the target region, became tedious
to handle.

The outer-task method seems to capture the idea of a thread limit on `target`
well, and it works, so I proceeded with it.



Comment at: openmp/runtime/src/kmp_ftn_entry.h:809
+return thread_limit;
+  else
+return thread->th.th_current_task->td_icvs.thread_limit;

ABataev wrote:
> No need for else here
oops, I will fix that
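
For reference, the shape being asked for looks roughly like the sketch below.
The function and parameter names are hypothetical, and the condition (a
non-zero task-level limit takes precedence) is an assumption, since the quoted
snippet does not show it.

```cpp
// Hypothetical helper: prefer the task-level limit when one is set,
// otherwise fall back to the regular thread-limit ICV.
int effectiveThreadLimit(int task_thread_limit, int icv_thread_limit) {
  if (task_thread_limit > 0)
    return task_thread_limit;
  // No `else` needed: the branch above has already returned.
  return icv_thread_limit;
}
```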


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-12 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 530591.
sandeepkosuri added a comment.

Updated `target_codegen.cpp` test case to incorporate my changes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,81 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+  }
+
+// confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+// OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  return 0;
+}
Index: openmp/runtime/src/kmp_runtime.cpp
===
--- openmp/runtime/src/kmp_runtime.cpp
+++ openmp/runtime/src/kmp_runtime.cpp
@@ -1867,6 +1867,7 @@
   int nthreads;
   int master_active;
   int master_set_numthreads;
+  int task_thread_limit = 0;
   int level;
   int active_level;
   int teams_level;
@@ -1905,6 +1906,8 @@
 root = master_th->th.th_root;
 master_active = root->r.r_active;
 master_set_numthreads = master_th->th.th_set_nproc;
+task_thread_limit =
+master_th->th.th_current_task->td_icvs.task_thread_limit;
 
 #if OMPT_SUPPORT
 ompt_data_t ompt_parallel_data = ompt_data_none;
@@ -1995,6 +1998,11 @@
  ? master_set_numthreads
  // TODO: get nproc directly from current task
  : get__nproc_2(parent_team, master_tid);
+  // Use the thread_limit set for the current target task if exists, else go
+  // with the deduced nthreads
+  nthreads = task_thread_limit > 0 && task_thread_limit < nthreads
+ ? task_thread_limit
+ : nthreads;
   // Check if we need to take forkjoin lock? (no need for serialized
   // parallel out of teams 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-03 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri created this revision.
sandeepkosuri added reviewers: ABataev, soumitra, koops, RitanyaB, dreachem.
Herald added subscribers: sunshaoce, guansong, yaxunl.
Herald added a project: All.
sandeepkosuri requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: llvm-commits, openmp-commits, cfe-commits, jplehr, 
sstefan1.
Herald added projects: clang, OpenMP, LLVM.

- This patch adds support for the `thread_limit` clause on the `target`
directive, according to OpenMP 5.1 [2.14.5].
- The idea is to create an outer task for the target region when there is a
`thread_limit` clause, and to set the thread_limit of that task instead. This
way, thread_limit is applied to all the relevant constructs enclosed by the
target region.
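
The idea in the two points above can be modelled without any OpenMP runtime.
The sketch below is only an illustration under stated assumptions:
`TaskSketch`, `TargetTaskScope`, `teamSizeFor` and the global `CurrentTask`
pointer are all hypothetical names, not libomp's data structures. The
comparison in `teamSizeFor` follows the `task_thread_limit` check that appears
in the `kmp_runtime.cpp` hunk in this thread.

```cpp
// Hypothetical model of a task that carries its own thread limit.
// 0 means "no task-level limit", like the initial task_thread_limit value.
struct TaskSketch {
  int thread_limit = 0;
};

TaskSketch RootTask;                 // stands in for the implicit initial task
TaskSketch *CurrentTask = &RootTask; // task the encountering thread is running

// RAII guard modelling the outer task created for a target region that has
// a thread_limit clause. The previous task is restored on scope exit.
class TargetTaskScope {
  TaskSketch Task;
  TaskSketch *Saved;

public:
  explicit TargetTaskScope(int limit) : Saved(CurrentTask) {
    Task.thread_limit = limit;
    CurrentTask = &Task;
  }
  ~TargetTaskScope() { CurrentTask = Saved; }
  TargetTaskScope(const TargetTaskScope &) = delete;
  TargetTaskScope &operator=(const TargetTaskScope &) = delete;
};

// Team size a parallel region would get for a given request: the task-level
// limit applies only if it is set and smaller than the request.
int teamSizeFor(int requested) {
  int limit = CurrentTask->thread_limit;
  return (limit > 0 && limit < requested) ? limit : requested;
}

// A function that requests 10 threads, wherever it is called from.
int foo() { return teamSizeFor(10); }
```

While a `TargetTaskScope scope(4)` is alive, `foo()` is limited to 4 even
though it asks for 10, and a request for 2 still gets 2. Once the scope ends,
`foo()` gets its full request again, so the limit does not leak out of the
target region.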




Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D152054

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,81 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+  }
+
+// confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+// OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  return 0;
+}
Index: openmp/runtime/src/kmp_runtime.cpp
===
--- openmp/runtime/src/kmp_runtime.cpp
+++ openmp/runtime/src/kmp_runtime.cpp
@@ -1867,6 +1867,7 @@
   int nthreads;
   int master_active;
   int master_set_numthreads;
+  int task_thread_limit = 0;
   int level;
   int active_level;
   int teams_level;
@@ -1905,6 +1906,8 @@
 root = master_th->th.th_root;
 master_active = root->r.r_active;
 master_set_numthreads = master_th->th.th_set_nproc;
+task_thread_limit =
+master_th->th.th_current_task->td_icvs.task_thread_limit;
 
 #if OMPT_SUPPORT
 ompt_data_t ompt_parallel_data = ompt_data_none;
@@ -1995,6 +1998,11 @@
  

[PATCH] D127855: [OpenMP] Basic parse and sema support for modifiers in order clause

2023-02-08 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

In D127855#4048642, @jyu2 wrote:

> In D127855#3956014, @sandeepkosuri wrote:
>
>> As I do not have commit access, can someone commit this patch, now that it 
>> passes the pre-merge tests?
>
> I see some tests failed after this patch.  Failed only with -fopenmp-version=51
>
> https://www.godbolt.org/z/3oxWTcxn7

Hey @jyu2, this error is no longer reproducible. I think the issue was 
fixed by someone else's patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127855/new/

https://reviews.llvm.org/D127855



[PATCH] D127855: [OpenMP] Basic parse and sema support for modifiers in order clause

2023-01-17 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

In D127855#4059776, @jyu2 wrote:

> Hi @sandeepkosuri, do you plan to fix this?  Thanks.  Jennifer

Hi @jyu2, sorry for the late reply. Yes, I will fix it. Thanks for pointing 
this out.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127855/new/

https://reviews.llvm.org/D127855



[PATCH] D127855: [OpenMP] Basic parse and sema support for modifiers in order clause

2022-11-28 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

As I do not have commit access, can someone commit this patch, now that it 
passes the pre-merge tests?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127855/new/

https://reviews.llvm.org/D127855



[PATCH] D137765: [NFC] Fixing a comment and some indentations

2022-11-09 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri created this revision.
sandeepkosuri added reviewers: cchen, ABataev, RitanyaB, soumitra.
Herald added a project: All.
sandeepkosuri requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D137765

Files:
  clang/include/clang/AST/OpenMPClause.h
  clang/include/clang/Sema/Scope.h


Index: clang/include/clang/Sema/Scope.h
===
--- clang/include/clang/Sema/Scope.h
+++ clang/include/clang/Sema/Scope.h
@@ -44,11 +44,11 @@
   enum ScopeFlags {
     /// This indicates that the scope corresponds to a function, which
     /// means that labels are set here.
-    FnScope   = 0x01,
+    FnScope = 0x01,
 
     /// This is a while, do, switch, for, etc that can have break
     /// statements embedded into it.
-    BreakScope= 0x02,
+    BreakScope = 0x02,
 
     /// This is a while, do, for, which can have continue statements
     /// embedded into it.
Index: clang/include/clang/AST/OpenMPClause.h
===
--- clang/include/clang/AST/OpenMPClause.h
+++ clang/include/clang/AST/OpenMPClause.h
@@ -7640,7 +7640,7 @@
   /// Location of '('.
   SourceLocation LParenLoc;
 
-  /// A kind of the 'default' clause.
+  /// A kind of the 'order' clause.
   OpenMPOrderClauseKind Kind = OMPC_ORDER_unknown;
 
   /// Start location of the kind in source code.




[PATCH] D127855: [OpenMP] Basic parse and sema support for modifiers in order clause

2022-11-09 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:10634
   "OpenMP constructs may not be nested inside an atomic region">;
+def err_omp_prohibited_region_order
+: Error<"construct %0 not allowed in a region associated with a directive "

ABataev wrote:
> Do you have the test for this error message?
I forgot to add a test for this; I will add one. Thanks for noticing.



Comment at: clang/lib/Sema/SemaOpenMP.cpp:875
+  /// false - otherwise.
+  bool HasOrderConcurrent() const {
+if (const SharingMapTy *Top = getTopOfStackOrNull())

ABataev wrote:
> `isOrderConcurrent`
This function is never used, so I will remove it altogether. Thanks for 
pointing this out.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127855/new/

https://reviews.llvm.org/D127855



[PATCH] D127855: [OpenMP] Basic parse and sema support for modifiers in order clause

2022-10-13 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: clang/include/clang/Sema/Scope.h:483-488
+  /// Determine whether this scope is some OpenMP directive with
+  /// order clause which specifies concurrent scope.
+  bool isOpenMPOrderClauseScope() const{
+return getFlags() & Scope::OpenMPOrderClauseScope;
+  }
+

ABataev wrote:
> Why do we need a new scope?
I needed a new scope flag to keep track of all the scopes created inside a 
region that has an associated order clause, so I mark every scope nested 
within such a region with this flag. This is needed to implement the 
following restriction (OpenMP 5.1, Section 2.11.3):
```
A region that corresponds to a construct with an order clause that specifies 
concurrent may not contain calls to the OpenMP Runtime API.
```
The changes in this file, in SemaOpenMP.cpp (in Sema::ActOnOpenMPCall()), and 
in Scope.cpp together implement the above restriction.
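
To make the flag-propagation idea above concrete, here is a minimal, 
self-contained sketch. It is not the Clang implementation: MiniScope, 
OrderConcurrentFlag, isRuntimeApiName and callAllowed are hypothetical names 
invented for illustration, and treating every identifier that starts with 
"omp_" as a runtime API call is a simplifying assumption.

```cpp
#include <string>

// Hypothetical stand-in for a scope-flag bit; not Clang's actual value.
enum MiniScopeFlags : unsigned {
  NoFlags = 0x0,
  OrderConcurrentFlag = 0x1,
};

// Hypothetical, simplified scope: a set of flags plus a link to its parent.
struct MiniScope {
  unsigned Flags;
  const MiniScope *Parent;

  // A nested scope inherits the order-concurrent bit from its parent, so the
  // mark reaches every scope created inside the region.
  MiniScope(const MiniScope *P, unsigned OwnFlags)
      : Flags(OwnFlags), Parent(P) {
    if (P && (P->Flags & OrderConcurrentFlag))
      Flags |= OrderConcurrentFlag;
  }

  bool isOrderConcurrentScope() const {
    return (Flags & OrderConcurrentFlag) != 0;
  }
};

// Assumption for this sketch only: any name starting with "omp_" counts as a
// call to the OpenMP runtime API.
inline bool isRuntimeApiName(const std::string &Name) {
  return Name.rfind("omp_", 0) == 0;
}

// Returns false when the call would violate the restriction quoted above.
inline bool callAllowed(const MiniScope &S, const std::string &Callee) {
  return !(S.isOrderConcurrentScope() && isRuntimeApiName(Callee));
}
```

In this model, a scope created for a directive with order(concurrent) is 
constructed with OrderConcurrentFlag, every scope nested in it picks up the 
bit automatically, and a call-site check only has to look at the current 
scope instead of walking up the parent chain.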


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127855/new/

https://reviews.llvm.org/D127855
