Re: [PATCH,fortran] Fix Bug 93554 - [13/14/15/16 Regression] ICE in expand_oacc_for ...

Albert, Christopher Tue, 21 Apr 2026 01:54:12 -0700

On Mon, 20 Apr 2026 at 09:57, Thomas Schwinge wrote:
> Hi!
>
> Christopher et al., first: Thanks for your work on this (see below) and
> other bug fixes!
>
> On 2026-04-13T12:38:51-0700, Jerry D <[email protected]> wrote:
>> On 4/13/26 12:20 PM, Harald Anlauf wrote:
>>> On 11.04.26 at 6:54 PM, Jerry D wrote:
>>>> The attached patch looks fairly simple.
>>>>
>>>> Regression tested on x86_64.
>>>>
>>>> I plan to commit this one in a little while.
>>> 
>>> Does this need RM approval?  It's a change below gcc/ .
>>
>> We were given a week at the start of last week.
>
> Not sure what that last comment actually relates to, but: this patch
> doesn't change GCC/Fortran code, but generic GCC middle end
> infrastructure, 'gcc/omp-expand.cc:expand_oacc_for' (OpenACC 'loop'
> construct handling).
>
> I note that this commit also resolves the ICEs reported in
> <https://gcc.gnu.org/PR95550>
> "[OpenACC] ICE in expand_oacc_for, at omp-expand.c:6075", and
> <https://gcc.gnu.org/PR107227>
> "Compiler bug in private allocatable array in OpenACC compute statement",
> so these PRs need to be verified if they're indeed completely resolved
> now (duplicates), or if other work remains, and/or whether their test
> cases should also be installed.
>
>> This one is pushed already as of April 11, 2026
>
> <https://gcc.gnu.org/g:010618b8dcb73220790f8f82cf76e8a2aacc2122>
> "fortran: Fix ICE in expand_oacc_for with private derived type [PR93554]".
>
>> No worries here.
>
> It's easy to make assertion failures go away by just defusing the
> 'assert's.  ;-)
>
> (And also: removing 'assert's won't make any conforming code misbehave,
> so this commit doesn't introduce any regressions for conforming code.)
>
> However, are we really convinced that any follow-on middle end, back end
> compiler handling is correct for the "more loose" basic block layout that
> we now accept?  From the discussion in the PR initially by Tobias (CC
> added), and later Christopher, I can see that some thought has been spent
> on this, but the test case that we currently have (per the commit)
> certainly isn't able to verify this: in other words, this isn't
> sufficient, and we need a thorough review of the code (changes) and their
> effect on later compiler passes, and proper execution test case(s) added,
> exercising this now-enabled functionality in context of OpenACC's
> (different levels of) parallelism, relevant different data types (see the
> other PRs I've mentioned, for example), etc.
>
> I shall try to help with review of the OpenACC side of things, but I'll
> appreciate help with the Fortran side of things (as producer of this
> previously-rejected code).
>
> Christopher, can you please elaborate to which extent you've verified (by
> inspection and/or testing) the code changes?
>
>
> Regards
>  Thomas


Hi Thomas,

Thanks for the review.

Scope note: this follow-up adds only tests; no further source changes
beyond r16-8571.  And since LLM use on gcc-patches has sparked some
controversy, disclosure: I use Claude and GPT models while preparing
patches and flag that with an "Assisted-by: Claude (Anthropic)"
trailer, which this patch carries alongside my Signed-off-by.

On your three asks:

(1) PR95550 / PR107227 duplicate status.

Both reduce to the same assertion (omp-expand.cc:7722) as PR93554 and
are resolved by r16-8571-g010618b8dcb.  On the parent both reproducers
ICE; on trunk both compile cleanly and the post-fix ompexp CFG matches
PR93554's.  Close as DUPLICATE.  I've installed their test cases under
libgomp (attached) rather than gcc/testsuite so that they run as
execution tests: compiling them covers the former ICE, and running
them checks the resulting code's behaviour.  Duplicate-of
comments posted on the two bugs:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95550
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107227
and the follow-up is attached as patch 64265 on the parent bug:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93554

(2) Execution tests.

Attached patch adds five tests under
libgomp/testsuite/libgomp.oacc-fortran/:

  pr93554-1.f90   Canonical derived-type-with-allocatable-component
                  shape; five parallelism variants.
  pr93554-2.f90   Allocation inside the loop body, gang-only; forces
                  __nvptx_free at region exit.  Sentinel check
                  detects leaked private storage.
  pr93554-3.f90   Whole allocatable private, num_gangs(4); double-
                  phase write+sum checks per-gang isolation.
  pr95550-1.f90   Burnus's create(A) + loop private(A); five variants.
  pr107227-1.f90  Bryngelson's whole-allocatable private; five
                  variants.
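
For readers who want the sentinel logic of pr93554-2.f90 in isolation,
here is a minimal host-only sketch.  It assumes a single sequential
copy of the array (in the real test each gang has its own private
copy), carries no OpenACC directives, and the program name
sentinel_sketch is made up for illustration:

```fortran
! Host-only sketch of the sentinel check; hypothetical program name.
! It says nothing about device behaviour.
program sentinel_sketch
  implicit none
  integer, parameter :: n = 8
  integer, parameter :: sentinel = -999
  integer, allocatable :: b(:)
  integer :: j

  do j = 1, n
     ! First use of this copy: allocate and mark every slot.
     if (.not. allocated(b)) then
        allocate(b(8))
        b = sentinel
     end if
     ! Slot 2 never receives a payload below, so it may only hold the
     ! sentinel (first iteration) or 0 (cleared by an earlier iteration).
     if (b(2) /= sentinel .and. b(2) /= 0) stop 3
     b    = 0
     b(1) = j
     b(5) = 2*j + 7
     ! Same identity the test checks through res(j): j + (2*j + 7).
     if (b(1) + b(5) /= 3*j + 7) stop 1
  end do

  deallocate(b)
  print *, "sentinel check passed"
end program sentinel_sketch
```

Any other value in slot 2 would mean the storage was shared with, or
left over from, something that wrote a payload there, which is the
leak the test is meant to catch.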

dg-prune-output suppresses the pre-existing
"using vector_length(32), ignoring 1" info diagnostic.

(3) Extent of verification.

- Host fallback (trunk gfortran -foffload=disable): 5 scenarios x 6
  opt levels (-O0..-O3 -Os -Og), all PASS.
- NVPTX offload (trunk, sm_89 / RTX 5060 Ti, CUDA 13.2): all five
  PASS at -O2 with ACC_DEVICE_TYPE=nvidia.
- check-target-libgomp-fortran: +60 PASS, 0 FAIL/XPASS/UNRESOLVED
  over the 6374-pass baseline.
- GOMP_DEBUG=1 PTX trace shows __nvptx_malloc/__nvptx_free call
  sites inside the .entry body for pr93554-2/3, not only in runtime
  helpers.  pr93554-2's sentinel and pr93554-3's sum identity hold
  across all iterations.
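
The sum identity mentioned for pr93554-3 can be checked on the host
without any offloading.  The sketch below is an illustration only: it
reuses n and m from the test, carries no OpenACC directives, and the
program name sum_identity_sketch is made up:

```fortran
! Host-only sketch: compares the explicit write/double/sum sequence
! with the closed form used for expect(j) in pr93554-3.f90.
! Hypothetical program name; no statement about device behaviour.
program sum_identity_sketch
  implicit none
  integer, parameter :: n = 128, m = 16
  integer :: buf(m), j, k, total

  do j = 1, n
     do k = 1, m
        buf(k) = j*m + k            ! phase 1: write
     end do
     buf = 2*buf                    ! phase 2: double in place
     total = sum(buf)               ! read back
     ! sum_{k=1..m} 2*(j*m + k) = 2*(m*m*j + m*(m+1)/2)
     if (total /= 2*(m*m*j + m*(m + 1)/2)) stop 2
  end do

  print *, "sum identity holds for all j"
end program sum_identity_sketch
```

Because every term depends on j, a buffer shared between iterations
that run concurrently would very likely break the identity for at
least one j.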

Bounds worth noting:

- AMD GCN untested (NVPTX-only hardware).
- Fully-partitioned gang+worker+vector with an aggregate private
  exposes a separate codegen bug on NVPTX that r16-8571 only
  uncovers (wrong results for whole-allocatable private; static-
  array variant is fine).  Filed as PR124964, cross-linked with
  PR95397 which may share the same root cause:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124964
  pr93554-2/3 therefore pin to gang-only.


Thanks,

Chris

From e4272e6f37ddf022152f90212dd89764761c9d3f Mon Sep 17 00:00:00 2001
From: Christopher Albert <[email protected]>
Date: Tue, 21 Apr 2026 09:00:45 +0200
Subject: [PATCH] libgomp, testsuite: OpenACC Fortran private-allocatable
 execution tests [PR93554, PR95550, PR107227]

The compile-only regression for PR93554 in
gcc/testsuite/gfortran.dg/goacc/pr93554.f90 covers the ICE in
expand_oacc_for when an OpenACC loop's private clause carries a
derived type with an allocatable component.  It does not exercise
the finalisation/free CFG edge that r16-8571-g010618b8dcb relaxed,
nor the closely related whole-allocatable shapes reported as PR95550
and PR107227.

Add five libgomp execution tests so that a runtime check across
gang/worker/vector/seq/kernels parallelism variants is available for
each of the three reported forms, plus two targeted stress cases:

pr93554-1.f90  Derived type with allocatable component; five
               parallelism variants (gang/worker/vector/seq/kernels).
               Canonical regression shape; mirrors the merged
               gfortran.dg/goacc/pr93554.f90 compile test.
pr93554-2.f90  Derived type with allocatable component allocated
               inside the loop body.  Pinned to gang-only; exits the
               region with the private x%b still allocated, so the
               per-thread finaliser __nvptx_free must run at region
               exit.  A sentinel check at iteration start detects
               leaked private storage.
pr93554-3.f90  Whole allocatable array marked private under
               num_gangs(4) gang partitioning.  Double-phase
               write+sum exercises per-gang isolation; the OpenACC
               runtime materialises the per-gang copy on entry and
               the finaliser frees each copy on exit.
pr95550-1.f90  Burnus's acc parallel create(A) + acc loop private(A)
               shape; five parallelism variants.  Verifies that the
               host copy of A is left untouched.
pr107227-1.f90 Bryngelson's whole-allocatable acc parallel loop
               private(arr) shape; five parallelism variants.

All five compile cleanly on trunk, PASS across the DejaGnu matrix
for host fallback and on NVPTX sm_89.

Per-PR outcome:

 - PR93554: fixed by r16-8571-g010618b8dcb; covered by pr93554-1.f90
   (canonical), pr93554-2.f90 (runtime free edge), and pr93554-3.f90
   (per-gang independence).
 - PR95550: duplicate of PR93554 -- same ICE site, resolved by the
   same commit.  pr95550-1.f90 installs Burnus's shape as the
   execution regression.
 - PR107227: duplicate of PR93554 -- same ICE site, resolved by the
   same commit.  pr107227-1.f90 installs Bryngelson's shape.

Scope note: worker- and vector-level private for whole allocatables
under NVPTX offload exposes a pre-existing codegen issue distinct
from r16-8571 (see PR95397 and PR124964, the latter filed separately
for the allocatable worker/vector case).  pr93554-2.f90 and
pr93554-3.f90 are therefore pinned to gang-only partitioning.

Assisted-by: Claude (Anthropic)

	PR fortran/93554
	PR middle-end/95550
	PR libgomp/107227

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-fortran/pr93554-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr93554-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr93554-3.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr95550-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr107227-1.f90: New test.

Signed-off-by: Christopher Albert <[email protected]>
---
 .../libgomp.oacc-fortran/pr107227-1.f90       | 119 ++++++++++++++++
 .../libgomp.oacc-fortran/pr93554-1.f90        | 134 ++++++++++++++++++
 .../libgomp.oacc-fortran/pr93554-2.f90        |  55 +++++++
 .../libgomp.oacc-fortran/pr93554-3.f90        |  55 +++++++
 .../libgomp.oacc-fortran/pr95550-1.f90        |  84 +++++++++++
 5 files changed, 447 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90
new file mode 100644
index 00000000000..42cf20d0f78
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr107227-1.f90
@@ -0,0 +1,119 @@
+! PR libgomp/107227 -- execution test.
+!
+! Whole allocatable array used with private clause on an OpenACC loop.
+! Same ICE site as PR93554; expected to be resolved by
+! r16-8571-g010618b8dcb.  Covers gang/worker/vector/seq and
+! 'parallel loop' vs 'kernels loop'.
+
+! { dg-do run }
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+program pr107227
+  implicit none
+  integer, parameter :: n = 32
+  integer :: res(n), i, expect(n)
+
+  do i = 1, n
+     expect(i) = 2 * i
+  end do
+
+  call gang_parallel()
+  call worker_parallel()
+  call vector_parallel()
+  call seq_parallel()
+  call gang_kernels()
+
+contains
+
+  subroutine check(tag, res_)
+    integer, intent(in) :: tag
+    integer, intent(in) :: res_(n)
+    integer :: j
+    do j = 1, n
+       if (res_(j) /= expect(j)) then
+          write(0,*) "tag=", tag, " j=", j, " got=", res_(j), " want=", expect(j)
+          stop tag
+       end if
+    end do
+  end subroutine check
+
+  subroutine gang_parallel()
+    integer :: j
+    real, allocatable :: arr(:)
+    allocate(arr(n))
+    res = -1
+    !$acc parallel loop gang private(arr) copy(res)
+    do j = 1, n
+       arr(j) = 2.0 * real(j)
+       res(j) = int(arr(j))
+    end do
+    !$acc end parallel loop
+    call check(11, res)
+    deallocate(arr)
+  end subroutine gang_parallel
+
+  subroutine worker_parallel()
+    integer :: j
+    real, allocatable :: arr(:)
+    allocate(arr(n))
+    res = -1
+    !$acc parallel num_gangs(4) num_workers(4) vector_length(1) copy(res)
+    !$acc loop worker private(arr)
+    do j = 1, n
+       arr(j) = 2.0 * real(j)
+       res(j) = int(arr(j))
+    end do
+    !$acc end parallel
+    call check(12, res)
+    deallocate(arr)
+  end subroutine worker_parallel
+
+  subroutine vector_parallel()
+    integer :: j
+    real, allocatable :: arr(:)
+    allocate(arr(n))
+    res = -1
+    !$acc parallel num_gangs(1) num_workers(1) vector_length(32) copy(res)
+    !$acc loop vector private(arr)
+    do j = 1, n
+       arr(j) = 2.0 * real(j)
+       res(j) = int(arr(j))
+    end do
+    !$acc end parallel
+    call check(13, res)
+    deallocate(arr)
+  end subroutine vector_parallel
+
+  subroutine seq_parallel()
+    integer :: j
+    real, allocatable :: arr(:)
+    allocate(arr(n))
+    res = -1
+    !$acc parallel num_gangs(1) num_workers(1) vector_length(1) copy(res)
+    !$acc loop seq private(arr)
+    do j = 1, n
+       arr(j) = 2.0 * real(j)
+       res(j) = int(arr(j))
+    end do
+    !$acc end parallel
+    call check(14, res)
+    deallocate(arr)
+  end subroutine seq_parallel
+
+  subroutine gang_kernels()
+    integer :: j
+    real, allocatable :: arr(:)
+    allocate(arr(n))
+    res = -1
+    !$acc kernels copy(res)
+    !$acc loop private(arr)
+    do j = 1, n
+       arr(j) = 2.0 * real(j)
+       res(j) = int(arr(j))
+    end do
+    !$acc end kernels
+    call check(15, res)
+    deallocate(arr)
+  end subroutine gang_kernels
+
+end program pr107227
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90
new file mode 100644
index 00000000000..9f741a9e96d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-1.f90
@@ -0,0 +1,134 @@
+! PR fortran/93554 -- execution test.
+!
+! Exercises "private(x)" on an OpenACC loop where x is a derived type
+! with an allocatable component.  Five subroutines: four cover the
+! partitioning levels (gang, worker, vector, seq) inside a 'parallel'
+! region, and one covers an 'acc loop' inside a 'kernels' region.
+!
+! Before r16-8571-g010618b8dcb this ICE'd in expand_oacc_for at
+! omp-expand.cc; afterwards it compiles and is expected to execute
+! correctly: the private copy of 'x' must not leak results into the
+! loop-body arithmetic performed through the shared output array.
+
+! { dg-do run }
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+program pr93554
+  implicit none
+  integer, parameter :: n = 32
+  integer :: res(n), i, expect(n)
+
+  do i = 1, n
+     expect(i) = 10 + i
+  end do
+
+  call gang_parallel()
+  call worker_parallel()
+  call vector_parallel()
+  call seq_parallel()
+  call gang_kernels()
+
+contains
+
+  subroutine check(tag, res_)
+    integer, intent(in) :: tag
+    integer, intent(in) :: res_(n)
+    integer :: j
+    do j = 1, n
+       if (res_(j) /= expect(j)) then
+          write(0,*) "tag=", tag, " j=", j, " got=", res_(j), " want=", expect(j)
+          stop tag
+       end if
+    end do
+  end subroutine check
+
+  subroutine gang_parallel()
+    type :: t
+       integer :: a
+       integer, allocatable :: b(:)
+    end type
+    type(t) :: x
+    integer :: j
+    res = -1
+    !$acc parallel loop gang private(x) copy(res)
+    do j = 1, n
+       x%a = 10 + j
+       res(j) = x%a
+    end do
+    !$acc end parallel loop
+    call check(1, res)
+  end subroutine gang_parallel
+
+  subroutine worker_parallel()
+    type :: t
+       integer :: a
+       integer, allocatable :: b(:)
+    end type
+    type(t) :: x
+    integer :: j
+    res = -1
+    !$acc parallel num_gangs(4) num_workers(4) vector_length(1) copy(res)
+    !$acc loop worker private(x)
+    do j = 1, n
+       x%a = 10 + j
+       res(j) = x%a
+    end do
+    !$acc end parallel
+    call check(2, res)
+  end subroutine worker_parallel
+
+  subroutine vector_parallel()
+    type :: t
+       integer :: a
+       integer, allocatable :: b(:)
+    end type
+    type(t) :: x
+    integer :: j
+    res = -1
+    !$acc parallel num_gangs(1) num_workers(1) vector_length(32) copy(res)
+    !$acc loop vector private(x)
+    do j = 1, n
+       x%a = 10 + j
+       res(j) = x%a
+    end do
+    !$acc end parallel
+    call check(3, res)
+  end subroutine vector_parallel
+
+  subroutine seq_parallel()
+    type :: t
+       integer :: a
+       integer, allocatable :: b(:)
+    end type
+    type(t) :: x
+    integer :: j
+    res = -1
+    !$acc parallel num_gangs(1) num_workers(1) vector_length(1) copy(res)
+    !$acc loop seq private(x)
+    do j = 1, n
+       x%a = 10 + j
+       res(j) = x%a
+    end do
+    !$acc end parallel
+    call check(4, res)
+  end subroutine seq_parallel
+
+  subroutine gang_kernels()
+    type :: t
+       integer :: a
+       integer, allocatable :: b(:)
+    end type
+    type(t) :: x
+    integer :: j
+    res = -1
+    !$acc kernels copy(res)
+    !$acc loop private(x)
+    do j = 1, n
+       x%a = 10 + j
+       res(j) = x%a
+    end do
+    !$acc end kernels
+    call check(5, res)
+  end subroutine gang_kernels
+
+end program pr93554
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90
new file mode 100644
index 00000000000..d45edffdbbe
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-2.f90
@@ -0,0 +1,55 @@
+! PR fortran/93554 -- runtime free-edge stress.
+!
+! The companion compile-time regression test (pr93554.f90 under
+! gcc/testsuite/gfortran.dg/goacc/) only exercises the CFG edge on
+! the pass dump.  This test exits the OpenACC region with the
+! allocatable component still allocated, which forces the per-thread
+! finalisation inserted by the Fortran front end to run at region
+! exit.  Before r16-8571-g010618b8dcb this ICE'd in expand_oacc_for;
+! afterwards the finalisation-free edge must execute for every
+! thread and produce correct results.  On NVPTX the call surfaces in
+! GOMP_DEBUG=1 as __nvptx_free.
+!
+! Partitioning is pinned to gang-only: worker/vector-level private
+! for aggregate types on NVPTX is a separate, pre-existing question
+! (see PR95397) and is not what this test is trying to exercise.
+
+! { dg-do run }
+
+program pr93554_alloc_in_body
+  implicit none
+  integer, parameter :: n = 128
+  integer, parameter :: sentinel = -999
+  integer :: res(n), expect(n), j
+  type :: t
+     integer :: a
+     integer, allocatable :: b(:)
+  end type
+  type(t) :: x
+
+  do j = 1, n
+     expect(j) = 3*j + 7
+  end do
+
+  res = -1
+
+  !$acc parallel loop gang private(x) copy(res) num_gangs(4)
+  do j = 1, n
+     if (.not. allocated(x%b)) then
+        allocate(x%b(8))
+        x%b = sentinel
+     end if
+     ! If a prior iteration's write leaked through, we will see
+     ! non-sentinel, non-zero values in slots we don't write below.
+     if (x%b(2) /= sentinel .and. x%b(2) /= 0) stop 3
+     x%b    = 0
+     x%b(1) = j
+     x%b(5) = 2*j + 7
+     res(j) = x%b(1) + x%b(5)
+  end do
+  !$acc end parallel loop
+
+  do j = 1, n
+     if (res(j) /= expect(j)) stop 1
+  end do
+end program pr93554_alloc_in_body
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90
new file mode 100644
index 00000000000..768d84a252d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr93554-3.f90
@@ -0,0 +1,55 @@
+! PR fortran/93554 -- per-gang independence of whole-allocatable privates.
+!
+! Whole allocatable array marked private on a gang-partitioned
+! !$acc parallel loop with num_gangs(4).  Each iteration populates
+! its per-gang private buf, overwrites it, and reads the final
+! contents back.  A miscompile that lets the relaxed CFG leak
+! private storage between iterations on the same gang -- or across
+! gangs -- scrambles the checksum.  The test also forces the
+! per-gang finaliser to free buf at region exit (same code path as
+! PR107227 / PR95550 at runtime).
+!
+! Kept gang-only deliberately: worker- and vector-level private for
+! whole allocatables is a separate GCC/NVPTX implementation question
+! unrelated to r16-8571.
+
+! { dg-do run }
+
+program pr93554_private_independence
+  implicit none
+  integer, parameter :: n = 128, m = 16
+  integer :: res(n), expect(n), j, k
+  integer, allocatable :: buf(:)
+
+  allocate(buf(m))
+
+  ! Expected: sum_{k=1..m} 2*(j*m + k)
+  !         = 2*m*j*m + 2*m*(m+1)/2
+  !         = 2*(m*m*j + m*(m+1)/2).
+  do j = 1, n
+     expect(j) = 2*(m*m*j + m*(m + 1)/2)
+  end do
+
+  res = -1
+
+  !$acc parallel loop gang private(buf) copy(res) num_gangs(4)
+  do j = 1, n
+     do k = 1, m
+        buf(k) = j*m + k
+     end do
+     do k = 1, m
+        buf(k) = 2*buf(k)
+     end do
+     res(j) = 0
+     do k = 1, m
+        res(j) = res(j) + buf(k)
+     end do
+  end do
+  !$acc end parallel loop
+
+  do j = 1, n
+     if (res(j) /= expect(j)) stop 2
+  end do
+
+  deallocate(buf)
+end program pr93554_private_independence
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90
new file mode 100644
index 00000000000..1ee4efd0872
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr95550-1.f90
@@ -0,0 +1,84 @@
+! PR middle-end/95550 -- execution test.
+!
+! Shared allocatable A created on the device with 'acc parallel create(A)',
+! then used privately inside an 'acc loop private(A)'.  Same ICE site as
+! PR93554; expected to be resolved by r16-8571-g010618b8dcb.
+!
+! Post-region the host-side A must be unchanged (create does not copy
+! back).  We also exercise an analogous 'acc kernels' variant.
+
+! { dg-do run }
+!TODO { dg-prune-output {using 'vector_length \(32\)', ignoring 1} }
+
+program pr95550
+  implicit none
+  integer, parameter :: n = 32
+  integer, allocatable :: a(:)
+  integer :: i, j
+
+  allocate(a(n))
+  do j = 1, n
+     a(j) = 3 * j
+  end do
+
+  !$acc parallel create(a) num_gangs(1) num_workers(1) vector_length(1)
+  !$acc loop seq private(a)
+  do i = 1, n
+     a(i) = 9 * i
+  end do
+  !$acc end parallel
+
+  ! Host copy of 'a' must be untouched: create(...) does not copy back.
+  do j = 1, n
+     if (a(j) /= 3 * j) then
+        write(0,*) "host a corrupted after parallel-create-private: j=", j, " got=", a(j)
+        stop 21
+     end if
+  end do
+
+  !$acc parallel create(a) num_gangs(4) num_workers(4) vector_length(1)
+  !$acc loop worker private(a)
+  do i = 1, n
+     a(i) = 9 * i
+  end do
+  !$acc end parallel
+
+  do j = 1, n
+     if (a(j) /= 3 * j) stop 22
+  end do
+
+  !$acc parallel create(a) num_gangs(1) num_workers(1) vector_length(32)
+  !$acc loop vector private(a)
+  do i = 1, n
+     a(i) = 9 * i
+  end do
+  !$acc end parallel
+
+  do j = 1, n
+     if (a(j) /= 3 * j) stop 23
+  end do
+
+  !$acc parallel create(a)
+  !$acc loop gang private(a)
+  do i = 1, n
+     a(i) = 9 * i
+  end do
+  !$acc end parallel
+
+  do j = 1, n
+     if (a(j) /= 3 * j) stop 24
+  end do
+
+  !$acc kernels create(a)
+  !$acc loop private(a)
+  do i = 1, n
+     a(i) = 9 * i
+  end do
+  !$acc end kernels
+
+  do j = 1, n
+     if (a(j) /= 3 * j) stop 25
+  end do
+
+  deallocate(a)
+end program pr95550
-- 
2.53.0

Re: [PATCH,fortran] Fix Bug 93554 - [13/14/15/16 Regression] ICE in expand_oacc_for ...
