On Mon, 20 Apr 2026 at 09:57, Thomas Schwinge wrote:
> Hi!
>
> Christopher et al., first: Thanks for your work on this (see below) and
> other bug fixes!
>
> On 2026-04-13T12:38:51-0700, Jerry D <[email protected]> wrote:
>> On 4/13/26 12:20 PM, Harald Anlauf wrote:
>>> On 11.04.26 at 6:54 PM, Jerry D wrote:
>>>> The attached patch looks fairly simple.
>>>>
>>>> Regression tested on x86_64.
>>>>
>>>> I plan to commit this one in a little while.
>>>
>>> Does this need RM approval? It's a change below gcc/ .
>>
>> We were given a week at the start of last week.
>
> Not sure what that last comment actually relates to, but: this patch
> doesn't change GCC/Fortran code, but generic GCC middle end
> infrastructure, 'gcc/omp-expand.cc:expand_oacc_for' (OpenACC 'loop'
> construct handling).
>
> I note that this commit also resolves the ICEs reported in
> <https://gcc.gnu.org/PR95550>
> "[OpenACC] ICE in expand_oacc_for, at omp-expand.c:6075", and
> <https://gcc.gnu.org/PR107227>
> "Compiler bug in private allocatable array in OpenACC compute statement",
> so these PRs need to be verified if they're indeed completely resolved
> now (duplicates), or if other work remains, and/or whether their test
> cases should also be installed.
>
>> This one is pushed already as of April 11, 2026
>
> <https://gcc.gnu.org/g:010618b8dcb73220790f8f82cf76e8a2aacc2122>
> "fortran: Fix ICE in expand_oacc_for with private derived type [PR93554]".
>
>> No worries here.
>
> It's easy to make assertion failures go away by just defusing the
> 'assert's. ;-)
>
> (And also: removing 'assert's won't make any conforming code misbehave,
> so this commit doesn't introduce any regressions for conforming code.)
>
> However, are we really convinced that any follow-on middle end, back end
> compiler handling is correct for the "more loose" basic block layout that
> we now accept? From the discussion in the PR initially by Tobias (CC
> added), and later Christopher, I can see that some thought has been spent
> on this, but the test case that we currently have (per the commit)
> certainly isn't able to verify this: in other words, this isn't
> sufficient, and we need a thorough review of the code (changes) and their
> effect on later compiler passes, and proper execution test case(s) added,
> exercising this now-enabled functionality in context of OpenACC's
> (different levels of) parallelism, relevant different data types (see the
> other PRs I've mentioned, for example), etc.
>
> I shall try to help with review of the OpenACC side of things, but I'll
> appreciate help with the Fortran side of things (as producer of this
> previously-rejected code).
>
> Christopher, can you please elaborate to which extent you've verified (by
> inspection and/or testing) the code changes?
>
>
> Regards
> Thomas

Hi Thomas,

Thanks for the review.

Scope note: this follow-up adds only tests; no further source changes
beyond r16-8571. And since LLM use on gcc-patches has sparked some
controversy, disclosure: I use Claude and GPT models while preparing
patches and flag that with an "Assisted-by: Claude (Anthropic)"
trailer, which this patch carries alongside my Signed-off-by.

On your three asks:

(1) PR95550 / PR107227 duplicate status.

Both reduce to the same assertion (omp-expand.cc:7722) as PR93554 and
are resolved by r16-8571-g010618b8dcb. On the parent both reproducers
ICE; on trunk both compile cleanly and the post-fix ompexp CFG matches
PR93554's. Close as DUPLICATE. I've installed their test cases under
libgomp (attached) rather than gcc/testsuite because the originals are
runtime-behaviour reports, not compile-time ICEs. Duplicate-of
comments posted on the two bugs:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95550
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107227

and the follow-up is attached as patch 64265 on the parent bug:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93554

(2) Execution tests.

Attached patch adds five tests under
libgomp/testsuite/libgomp.oacc-fortran/:

  pr93554-1.f90   Canonical derived-type-with-allocatable-component
                  shape; five parallelism variants.
  pr93554-2.f90   Allocation inside the loop body, gang-only; forces
                  __nvptx_free at region exit. Sentinel check
                  detects leaked private storage.
  pr93554-3.f90   Whole allocatable private, num_gangs(4);
                  double-phase write+sum checks per-gang isolation.
  pr95550-1.f90   Burnus's create(A) + loop private(A); five variants.
  pr107227-1.f90  Bryngelson's whole-allocatable private; five
                  variants.

dg-prune-output suppresses the pre-existing
"using vector_length(32), ignoring 1" info diagnostic.

(3) Extent of verification.

  - Host fallback (trunk gfortran -foffload=disable): 5 scenarios x 6
    opt levels (-O0..-O3 -Os -Og), all PASS.
  - NVPTX offload (trunk, sm_89 / RTX 5060 Ti, CUDA 13.2): all five
    PASS at -O2 with ACC_DEVICE_TYPE=nvidia.
  - check-target-libgomp-fortran: +60 PASS, 0 FAIL/XPASS/UNRESOLVED
    over the 6374-pass baseline.
  - GOMP_DEBUG=1 PTX trace shows __nvptx_malloc/__nvptx_free call
    sites inside the .entry body for pr93554-2/3, not only in runtime
    helpers. pr93554-2's sentinel and pr93554-3's sum identity hold
    across all iterations.
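
For reference, the closed forms behind those two checks can be written
out as below. This only restates the expect(j) arithmetic already in
pr93554-2.f90 and pr93554-3.f90 (with m = 16 as in the test); it is not
an additional check.

```latex
% pr93554-2.f90: res(j) = x%b(1) + x%b(5)
\[
  \mathrm{res}(j) = j + (2j + 7) = 3j + 7
\]
% pr93554-3.f90: buf(k) holds 2(jm + k) after the second phase,
% and res(j) sums it over k = 1..m
\[
  \mathrm{res}(j) = \sum_{k=1}^{m} 2\,(jm + k)
                  = 2m^{2}j + m(m+1)
                  = 512\,j + 272 \quad (m = 16)
\]
```

Any cross-iteration or cross-gang leak of the private buffer changes at
least one term of the second sum, which is why a per-element compare
against expect(j) is enough there.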

Bounds worth noting:

  - AMD GCN untested (NVPTX-only hardware).
  - Fully-partitioned gang+worker+vector with an aggregate private
    exposes a separate codegen bug on NVPTX that r16-8571 only
    uncovers (wrong results for whole-allocatable private;
    static-array variant is fine). Filed as PR124964, cross-linked
    with PR95397 which may share the same root cause:
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124964
    pr93554-2/3 therefore pin to gang-only.
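
A minimal sketch of the excluded shape, assuming it resembles
pr93554-3.f90 with the loop additionally partitioned over worker and
vector. This is an illustration only and not the PR124964 reproducer:
the program name and the mismatch counter are hypothetical, and no
particular output is claimed for any target.

```fortran
! Hypothetical sketch, NOT the PR124964 reproducer: same body as
! pr93554-3.f90, but with gang+worker+vector on the loop.
program fully_partitioned_sketch
  implicit none
  integer, parameter :: n = 128, m = 16
  integer :: res(n), j, k, nbad
  integer, allocatable :: buf(:)

  allocate(buf(m))
  res = -1

  ! Whole allocatable 'buf' is private on a fully partitioned loop.
  !$acc parallel loop gang worker vector private(buf) copy(res)
  do j = 1, n
    do k = 1, m
      buf(k) = 2*(j*m + k)
    end do
    res(j) = 0
    do k = 1, m
      res(j) = res(j) + buf(k)
    end do
  end do
  !$acc end parallel loop

  ! Count elements that differ from the closed form 2*m*m*j + m*(m+1).
  nbad = 0
  do j = 1, n
    if (res(j) /= 2*m*m*j + m*(m + 1)) nbad = nbad + 1
  end do
  print *, "mismatches:", nbad

  deallocate(buf)
end program fully_partitioned_sketch
```

Built without -fopenacc the directives are plain comments, so the same
source doubles as a host-only reference for the arithmetic.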

Thanks,
Chris

From e4272e6f37ddf022152f90212dd89764761c9d3f Mon Sep 17 00:00:00 2001
From: Christopher Albert <[email protected]>
Date: Tue, 21 Apr 2026 09:00:45 +0200
Subject: [PATCH] libgomp, testsuite: OpenACC Fortran private-allocatable
 execution tests [PR93554, PR95550, PR107227]

The compile-only regression for PR93554 in
gcc/testsuite/gfortran.dg/goacc/pr93554.f90 covers the ICE in
expand_oacc_for when an OpenACC loop's private clause carries a
derived type with an allocatable component. It does not exercise
the finalisation/free CFG edge that r16-8571-g010618b8dcb relaxed,
nor the closely related whole-allocatable shapes reported as PR95550
and PR107227.

Add five libgomp execution tests so that a runtime check across
gang/worker/vector/seq/kernels parallelism variants is available for
each of the three reported forms, plus two targeted stress cases:

  pr93554-1.f90   Derived type with allocatable component; five
                  parallelism variants (gang/worker/vector/seq/kernels).
                  Canonical regression shape; mirrors the merged
                  gfortran.dg/goacc/pr93554.f90 compile test.
  pr93554-2.f90   Derived type with allocatable component allocated
                  inside the loop body. Pinned to gang-only; exits the
                  region with x%b still allocated, so the per-thread
                  finaliser __nvptx_free must run at region exit. A
                  sentinel check at iteration start detects leaked
                  private storage.
  pr93554-3.f90   Whole allocatable array marked private under
                  num_gangs(4) gang partitioning. Double-phase
                  write+sum exercises per-gang isolation; the OpenACC
                  runtime materialises the per-gang copy on entry and
                  the finaliser frees each copy on exit.
  pr95550-1.f90   Burnus's acc parallel create(A) + acc loop private(A)
                  shape; five parallelism variants. Verifies that the
                  host copy of A is left untouched.
  pr107227-1.f90  Bryngelson's whole-allocatable acc parallel loop
                  private(arr) shape; five parallelism variants.

All five compile cleanly on trunk and PASS across the DejaGnu matrix
for host fallback and on NVPTX sm_89.

Per-PR outcome:

  - PR93554: fixed by r16-8571-g010618b8dcb; covered by pr93554-1.f90
    (canonical), pr93554-2.f90 (runtime free edge), and pr93554-3.f90
    (per-gang independence).
  - PR95550: duplicate of PR93554 -- same ICE site, resolved by the
    same commit. pr95550-1.f90 installs Burnus's shape as the
    execution regression.
  - PR107227: duplicate of PR93554 -- same ICE site, resolved by the
    same commit. pr107227-1.f90 installs Bryngelson's shape.

Scope note: worker- and vector-level private for whole allocatables
under NVPTX offload exposes a pre-existing codegen issue distinct
from r16-8571 (see PR95397 and the separately filed PR124964).
pr93554-2.f90 and pr93554-3.f90 are therefore pinned to gang-only
partitioning.

Assisted-by: Claude (Anthropic)

	PR fortran/93554
	PR middle-end/95550
	PR libgomp/107227

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-fortran/pr93554-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr93554-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr93554-3.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr95550-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr107227-1.f90: New test.

Signed-off-by: Christopher Albert <[email protected]>
---
.../libgomp.oacc-fortran/pr107227-1.f90 | 119 ++++++++++++++++
.../libgomp.oacc-fortran/pr93554-1.f90 | 134 ++++++++++++++++++
.../libgomp.oacc-fortran/pr93554-2.f90 | 55 +++++++
.../libgomp.oacc-fortran/pr93554-3.f90 | 55 +++++++
.../libgomp.oacc-fortran/pr95550-1.f90 | 84 +++++++++++
5 files changed, 447 insertions(+)
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90
create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90
new file mode 100644
index 00000000000..42cf20d0f78
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90
@@ -0,0 +1,119 @@
+! PR libgomp/107227 -- execution test.
+!
+! Whole allocatable array used with private clause on an OpenACC loop.
+! Same ICE site as PR93554; expected to be resolved by
+! r16-8571-g010618b8dcb. Covers gang/worker/vector/seq and
+! 'parallel loop' vs 'kernels loop'.
+
+! { dg-do run }
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+program pr107227
+ implicit none
+ integer, parameter :: n = 32
+ integer :: res(n), i, expect(n)
+
+ do i = 1, n
+ expect(i) = 2 * i
+ end do
+
+ call gang_parallel()
+ call worker_parallel()
+ call vector_parallel()
+ call seq_parallel()
+ call gang_kernels()
+
+contains
+
+ subroutine check(tag, res_)
+ integer, intent(in) :: tag
+ integer, intent(in) :: res_(n)
+ integer :: j
+ do j = 1, n
+ if (res_(j) /= expect(j)) then
+ write(0,*) "tag=", tag, " j=", j, " got=", res_(j), " want=", expect(j)
+ stop tag
+ end if
+ end do
+ end subroutine check
+
+ subroutine gang_parallel()
+ integer :: j
+ real, allocatable :: arr(:)
+ allocate(arr(n))
+ res = -1
+ !$acc parallel loop gang private(arr) copy(res)
+ do j = 1, n
+ arr(j) = 2.0 * real(j)
+ res(j) = int(arr(j))
+ end do
+ !$acc end parallel loop
+ call check(11, res)
+ deallocate(arr)
+ end subroutine gang_parallel
+
+ subroutine worker_parallel()
+ integer :: j
+ real, allocatable :: arr(:)
+ allocate(arr(n))
+ res = -1
+ !$acc parallel num_gangs(4) num_workers(4) vector_length(1) copy(res)
+ !$acc loop worker private(arr)
+ do j = 1, n
+ arr(j) = 2.0 * real(j)
+ res(j) = int(arr(j))
+ end do
+ !$acc end parallel
+ call check(12, res)
+ deallocate(arr)
+ end subroutine worker_parallel
+
+ subroutine vector_parallel()
+ integer :: j
+ real, allocatable :: arr(:)
+ allocate(arr(n))
+ res = -1
+ !$acc parallel num_gangs(1) num_workers(1) vector_length(32) copy(res)
+ !$acc loop vector private(arr)
+ do j = 1, n
+ arr(j) = 2.0 * real(j)
+ res(j) = int(arr(j))
+ end do
+ !$acc end parallel
+ call check(13, res)
+ deallocate(arr)
+ end subroutine vector_parallel
+
+ subroutine seq_parallel()
+ integer :: j
+ real, allocatable :: arr(:)
+ allocate(arr(n))
+ res = -1
+ !$acc parallel num_gangs(1) num_workers(1) vector_length(1) copy(res)
+ !$acc loop seq private(arr)
+ do j = 1, n
+ arr(j) = 2.0 * real(j)
+ res(j) = int(arr(j))
+ end do
+ !$acc end parallel
+ call check(14, res)
+ deallocate(arr)
+ end subroutine seq_parallel
+
+ subroutine gang_kernels()
+ integer :: j
+ real, allocatable :: arr(:)
+ allocate(arr(n))
+ res = -1
+ !$acc kernels copy(res)
+ !$acc loop private(arr)
+ do j = 1, n
+ arr(j) = 2.0 * real(j)
+ res(j) = int(arr(j))
+ end do
+ !$acc end kernels
+ call check(15, res)
+ deallocate(arr)
+ end subroutine gang_kernels
+
+end program pr107227
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90
new file mode 100644
index 00000000000..9f741a9e96d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90
@@ -0,0 +1,134 @@
+! PR fortran/93554 -- execution test.
+!
+! Exercises "private(x)" on an OpenACC loop where x is a derived type
+! with an allocatable component.  Four subroutines cover the gang,
+! worker, vector and seq partitioning levels under 'acc parallel';
+! a fifth covers an unannotated 'acc loop' under 'acc kernels'.
+!
+! Before r16-8571-g010618b8dcb this ICE'd in expand_oacc_for at
+! omp-expand.cc; afterwards it compiles and is expected to execute
+! correctly: the private copy of 'x' must not leak results into the
+! loop-body arithmetic performed through the shared output array.
+
+! { dg-do run }
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+program pr93554
+ implicit none
+ integer, parameter :: n = 32
+ integer :: res(n), i, expect(n)
+
+ do i = 1, n
+ expect(i) = 10 + i
+ end do
+
+ call gang_parallel()
+ call worker_parallel()
+ call vector_parallel()
+ call seq_parallel()
+ call gang_kernels()
+
+contains
+
+ subroutine check(tag, res_)
+ integer, intent(in) :: tag
+ integer, intent(in) :: res_(n)
+ integer :: j
+ do j = 1, n
+ if (res_(j) /= expect(j)) then
+ write(0,*) "tag=", tag, " j=", j, " got=", res_(j), " want=", expect(j)
+ stop tag
+ end if
+ end do
+ end subroutine check
+
+ subroutine gang_parallel()
+ type :: t
+ integer :: a
+ integer, allocatable :: b(:)
+ end type
+ type(t) :: x
+ integer :: j
+ res = -1
+ !$acc parallel loop gang private(x) copy(res)
+ do j = 1, n
+ x%a = 10 + j
+ res(j) = x%a
+ end do
+ !$acc end parallel loop
+ call check(1, res)
+ end subroutine gang_parallel
+
+ subroutine worker_parallel()
+ type :: t
+ integer :: a
+ integer, allocatable :: b(:)
+ end type
+ type(t) :: x
+ integer :: j
+ res = -1
+ !$acc parallel num_gangs(4) num_workers(4) vector_length(1) copy(res)
+ !$acc loop worker private(x)
+ do j = 1, n
+ x%a = 10 + j
+ res(j) = x%a
+ end do
+ !$acc end parallel
+ call check(2, res)
+ end subroutine worker_parallel
+
+ subroutine vector_parallel()
+ type :: t
+ integer :: a
+ integer, allocatable :: b(:)
+ end type
+ type(t) :: x
+ integer :: j
+ res = -1
+ !$acc parallel num_gangs(1) num_workers(1) vector_length(32) copy(res)
+ !$acc loop vector private(x)
+ do j = 1, n
+ x%a = 10 + j
+ res(j) = x%a
+ end do
+ !$acc end parallel
+ call check(3, res)
+ end subroutine vector_parallel
+
+ subroutine seq_parallel()
+ type :: t
+ integer :: a
+ integer, allocatable :: b(:)
+ end type
+ type(t) :: x
+ integer :: j
+ res = -1
+ !$acc parallel num_gangs(1) num_workers(1) vector_length(1) copy(res)
+ !$acc loop seq private(x)
+ do j = 1, n
+ x%a = 10 + j
+ res(j) = x%a
+ end do
+ !$acc end parallel
+ call check(4, res)
+ end subroutine seq_parallel
+
+ subroutine gang_kernels()
+ type :: t
+ integer :: a
+ integer, allocatable :: b(:)
+ end type
+ type(t) :: x
+ integer :: j
+ res = -1
+ !$acc kernels copy(res)
+ !$acc loop private(x)
+ do j = 1, n
+ x%a = 10 + j
+ res(j) = x%a
+ end do
+ !$acc end kernels
+ call check(5, res)
+ end subroutine gang_kernels
+
+end program pr93554
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90
new file mode 100644
index 00000000000..d45edffdbbe
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90
@@ -0,0 +1,55 @@
+! PR fortran/93554 -- runtime free-edge stress.
+!
+! The companion compile-time regression test (pr93554.f90 under
+! gcc/testsuite/gfortran.dg/goacc/) only exercises the CFG edge on
+! the pass dump. This test exits the OpenACC region with the
+! allocatable component still allocated, which forces the per-thread
+! finalisation inserted by the Fortran front end to run at region
+! exit. Before r16-8571-g010618b8dcb this ICE'd in expand_oacc_for;
+! afterwards the finalisation-free edge must execute for every
+! thread and produce correct results. On NVPTX the call surfaces in
+! GOMP_DEBUG=1 as __nvptx_free.
+!
+! Partitioning is pinned to gang-only: worker/vector-level private
+! for aggregate types on NVPTX is a separate, pre-existing question
+! (see PR95397) and is not what this test is trying to exercise.
+
+! { dg-do run }
+
+program pr93554_alloc_in_body
+ implicit none
+ integer, parameter :: n = 128
+ integer, parameter :: sentinel = -999
+ integer :: res(n), expect(n), j
+ type :: t
+ integer :: a
+ integer, allocatable :: b(:)
+ end type
+ type(t) :: x
+
+ do j = 1, n
+ expect(j) = 3*j + 7
+ end do
+
+ res = -1
+
+ !$acc parallel loop gang private(x) copy(res) num_gangs(4)
+ do j = 1, n
+ if (.not. allocated(x%b)) then
+ allocate(x%b(8))
+ x%b = sentinel
+ end if
+ ! If a prior iteration's write leaked through, we will see
+ ! non-sentinel, non-zero values in slots we don't write below.
+ if (x%b(2) /= sentinel .and. x%b(2) /= 0) stop 3
+ x%b = 0
+ x%b(1) = j
+ x%b(5) = 2*j + 7
+ res(j) = x%b(1) + x%b(5)
+ end do
+ !$acc end parallel loop
+
+ do j = 1, n
+ if (res(j) /= expect(j)) stop 1
+ end do
+end program pr93554_alloc_in_body
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90
new file mode 100644
index 00000000000..768d84a252d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90
@@ -0,0 +1,55 @@
+! PR fortran/93554 -- per-gang independence of whole-allocatable privates.
+!
+! Whole allocatable array marked private on a gang-partitioned
+! !$acc parallel loop with num_gangs(4). Each iteration populates
+! its per-gang private buf, overwrites it, and reads the final
+! contents back. A miscompile that lets the relaxed CFG leak
+! private storage between iterations on the same gang -- or across
+! gangs -- scrambles the checksum. The test also forces the
+! per-gang finaliser to free buf at region exit (same code path as
+! PR107227 / PR95550 at runtime).
+!
+! Kept gang-only deliberately: worker- and vector-level private for
+! whole allocatables is a separate GCC/NVPTX implementation question
+! unrelated to r16-8571.
+
+! { dg-do run }
+
+program pr93554_private_independence
+ implicit none
+ integer, parameter :: n = 128, m = 16
+ integer :: res(n), expect(n), j, k
+ integer, allocatable :: buf(:)
+
+ allocate(buf(m))
+
+ ! Expected: sum_{k=1..m} 2*(j*m + k)
+ ! = 2*m*j*m + 2*m*(m+1)/2
+ ! = 2*(m*m*j + m*(m+1)/2).
+ do j = 1, n
+ expect(j) = 2*(m*m*j + m*(m + 1)/2)
+ end do
+
+ res = -1
+
+ !$acc parallel loop gang private(buf) copy(res) num_gangs(4)
+ do j = 1, n
+ do k = 1, m
+ buf(k) = j*m + k
+ end do
+ do k = 1, m
+ buf(k) = 2*buf(k)
+ end do
+ res(j) = 0
+ do k = 1, m
+ res(j) = res(j) + buf(k)
+ end do
+ end do
+ !$acc end parallel loop
+
+ do j = 1, n
+ if (res(j) /= expect(j)) stop 2
+ end do
+
+ deallocate(buf)
+end program pr93554_private_independence
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90
new file mode 100644
index 00000000000..1ee4efd0872
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90
@@ -0,0 +1,84 @@
+! PR middle-end/95550 -- execution test.
+!
+! Shared allocatable A created on the device with 'acc parallel create(A)',
+! then used privately inside an 'acc loop private(A)'. Same ICE site as
+! PR93554; expected to be resolved by r16-8571-g010618b8dcb.
+!
+! Post-region the host-side A must be unchanged (create does not copy
+! back). We also exercise an analogous 'acc kernels' variant.
+
+! { dg-do run }
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+program pr95550
+ implicit none
+ integer, parameter :: n = 32
+ integer, allocatable :: a(:)
+ integer :: i, j
+
+ allocate(a(n))
+ do j = 1, n
+ a(j) = 3 * j
+ end do
+
+ !$acc parallel create(a) num_gangs(1) num_workers(1) vector_length(1)
+ !$acc loop seq private(a)
+ do i = 1, n
+ a(i) = 9 * i
+ end do
+ !$acc end parallel
+
+ ! Host copy of 'a' must be untouched: create(...) does not copy back.
+ do j = 1, n
+ if (a(j) /= 3 * j) then
+ write(0,*) "host a corrupted after parallel-create-private: j=", j, " got=", a(j)
+ stop 21
+ end if
+ end do
+
+ !$acc parallel create(a) num_gangs(4) num_workers(4) vector_length(1)
+ !$acc loop worker private(a)
+ do i = 1, n
+ a(i) = 9 * i
+ end do
+ !$acc end parallel
+
+ do j = 1, n
+ if (a(j) /= 3 * j) stop 22
+ end do
+
+ !$acc parallel create(a) num_gangs(1) num_workers(1) vector_length(32)
+ !$acc loop vector private(a)
+ do i = 1, n
+ a(i) = 9 * i
+ end do
+ !$acc end parallel
+
+ do j = 1, n
+ if (a(j) /= 3 * j) stop 23
+ end do
+
+ !$acc parallel create(a)
+ !$acc loop gang private(a)
+ do i = 1, n
+ a(i) = 9 * i
+ end do
+ !$acc end parallel
+
+ do j = 1, n
+ if (a(j) /= 3 * j) stop 24
+ end do
+
+ !$acc kernels create(a)
+ !$acc loop private(a)
+ do i = 1, n
+ a(i) = 9 * i
+ end do
+ !$acc end kernels
+
+ do j = 1, n
+ if (a(j) /= 3 * j) stop 25
+ end do
+
+ deallocate(a)
+end program pr95550
--
2.53.0