================
@@ -0,0 +1,188 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fopenmp -fclangir -emit-llvm %s -o %t-cir.ll
+// RUN: FileCheck %s --input-file %t-cir.ll
+
+void before(int);
+void during(int);
+void after(int);
+
+// Test simple for loop with constant bounds: for (int i = 0; i < 10; i++)
+void emit_simple_for() {
+ int j = 5;
+ before(j);
+#pragma omp parallel
+ {
+#pragma omp for
+ for (int i = 0; i < 10; i++) {
+ during(j);
+ }
+ }
+ after(j);
+}
+
+// CHECK-LABEL: define dso_local void @emit_simple_for()
+// CHECK: call void @before(i32 %{{.*}})
+// CHECK: call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr @{{.*}}, i32 1, ptr @emit_simple_for..omp_par, ptr %{{.*}})
+// CHECK: call void @after(i32 %{{.*}})
+
+// CHECK-LABEL: define internal void @emit_simple_for..omp_par(
+// CHECK: store i32 0, ptr %p.lowerbound
+// CHECK: store i32 9, ptr %p.upperbound
+// CHECK: store i32 1, ptr %p.stride
+// CHECK: call void @__kmpc_for_static_init_4u(
+// CHECK: omp_loop.body:
+// CHECK: omp.loop_nest.region:
+// CHECK: store i32 %{{.*}}, ptr %{{.*}}, align 4
+// CHECK: call void @during(i32 %{{.*}})
+// CHECK: call void @__kmpc_for_static_fini(
+// CHECK: call void @__kmpc_barrier(
----------------
Parigi wrote:
Thank you very much for the review and the feedback.
I think there may have been a misunderstanding: I am testing the lowering pass,
following the same approach used in the other test files under
llvm-project/clang/test/CIR/Lowering/ (e.g. global-var-simple.cpp). The test
uses the -fclangir flag, which produces different output from the standard
pipeline. If I remove that flag, I get the same output as the one you linked.
I also wanted to point out that the compiler you shared is different from the
one I am using: the version you linked does not support CIR. Passing -fclangir
to it does not cause an error, but the output is unchanged. You can verify this
by trying the -emit-cir flag.
If you want to test a version closer to what I am using, here is a Godbolt
link: [https://godbolt.org/z/xfnYM4YT9](https://godbolt.org/z/xfnYM4YT9)
The linked example only tests the parallel construct because, if you uncomment
the #pragma omp for, compilation breaks, as that directive is not implemented
there. Even so, you will notice that the parallel output already differs from
the one you linked.
As an additional sanity check, I also built executables using a driver.c and
tested both the parallel and for cases. Both .ll files produce identical
runtime behavior; only the IR syntax differs. I have attached a zip with the
tests I ran.
[PR-testing.zip](https://github.com/user-attachments/files/25576796/PR-testing.zip)
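For reference, here is a minimal sketch of the kind of driver such a check relies on. It is not the driver.c from the zip: the counters and `check_emit_simple_for` are hypothetical names, and the copy of `emit_simple_for` only stands in for the function that would normally be linked in from the compiled .ll file.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Call counters; atomic because during() may run on several threads
   when the program is built with -fopenmp. */
static atomic_int before_calls, during_calls, after_calls, bad_args;

/* Record one call and flag any argument other than the expected j == 5. */
static void record(atomic_int *counter, int v) {
  atomic_fetch_add(counter, 1);
  if (v != 5)
    atomic_fetch_add(&bad_args, 1);
}

void before(int v) { record(&before_calls, v); }
void during(int v) { record(&during_calls, v); }
void after(int v) { record(&after_calls, v); }

/* Stand-in copy of the function from the test, so the sketch links on
   its own. In the real check this symbol would come from the .ll file. */
void emit_simple_for(void) {
  int j = 5;
  before(j);
#pragma omp parallel
  {
#pragma omp for
    for (int i = 0; i < 10; i++) {
      during(j);
    }
  }
  after(j);
}

/* Hypothetical helper: returns 0 if the observed calls match the source
   semantics (one before, ten during, one after, all with argument 5). */
int check_emit_simple_for(void) {
  atomic_store(&before_calls, 0);
  atomic_store(&during_calls, 0);
  atomic_store(&after_calls, 0);
  atomic_store(&bad_args, 0);

  emit_simple_for();

  int ok = atomic_load(&before_calls) == 1 &&
           atomic_load(&during_calls) == 10 &&
           atomic_load(&after_calls) == 1 &&
           atomic_load(&bad_args) == 0;
  printf("before=%d during=%d after=%d bad_args=%d\n",
         atomic_load(&before_calls), atomic_load(&during_calls),
         atomic_load(&after_calls), atomic_load(&bad_args));
  return ok ? 0 : 1;
}
```

A `main` that returns `check_emit_simple_for()` turns this into an executable. Counting calls rather than comparing printed output keeps the check independent of how the runtime schedules iterations across threads.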
Regarding the PR size: I agree with your concern and had already thought about
it. However, I am not sure how to split it cleanly. Emitting only wsloop or
only loop_nest in isolation would not make sense, since the construct requires
both together; otherwise the emission would break. This means emitOMPFor
(which emits wsloop) and emitForStmt (which emits loop_nest) have to land
together.
An option would be to remove support for variable loop bounds for now,
restricting the implementation to constant bounds only. This would reduce the
code size.
I would appreciate your thoughts on whether the split I proposed makes sense,
or if you have other suggestions — I am happy to follow your guidance.
https://github.com/llvm/llvm-project/pull/181841
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits