https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848
Bug ID: 94848
Summary: [Offloading][LTO] partial var elimination errors /
-ftree-pre causes link errors |
libgomp.fortran/use_device_ptr-optional-3.f90 failures
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
Target Milestone: ---
Compiling
gfortran -fopenmp libgomp.fortran/use_device_ptr-optional-3.f90 \
-O1 -foffload=-lgfortran -ftree-pre
with actual offloading (amdgcn, nvidia) fails with:
/tmp/ccUEo7uX.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.5'
It works with -fno-tree-pre (or when compiling without actual offloading).
The problematic optimization seems to happen on the host side, as
-foffload="-O0 -lgfortran" does not solve the issue.
In the Fortran code, this array (A.12) appears in a device function ("omp
declare target") as:
if (any (c_z /= [1,2,3])) stop 37
The other array shown in the dump below (A.9) appears in:
if (any (x /= [3,4,6,2])) stop 44
In the dump, both arrays appear as:
static integer(kind=4) A.12[3] = {1, 2, 3};
static integer(kind=4) A.9[4] = {3, 4, 6, 2};
…
_20 = A.9[S.10];
…
_26 = A.12[S.13_67];
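For orientation, the source pattern behind these dump lines can be sketched
roughly as follows. This is a hypothetical reduction, not the testsuite file;
the names m, check and z are made up for illustration, and it has not been
confirmed to reproduce the link error.

```fortran
! Hypothetical sketch of the pattern; NOT the actual
! libgomp.fortran/use_device_ptr-optional-3.f90 and not a confirmed reproducer.
module m
  implicit none
contains
  subroutine check (c_z)
    !$omp declare target
    integer, intent(in) :: c_z(3)
    ! The array constructor [1,2,3] becomes a compiler-generated static
    ! array (shown as A.12 in the dumps of the real test).
    if (any (c_z /= [1,2,3])) stop 37
  end subroutine check
end module m

program main
  use m
  implicit none
  integer :: z(3)
  z = [1,2,3]
  ! check() is called inside a target region, so it is also compiled
  ! for the offload device.
  !$omp target map(to: z)
  call check (z)
  !$omp end target
end program main
```

Such a file would be built with the same options as in the report
(-fopenmp -O1 -foffload=-lgfortran -ftree-pre) using an offload-enabled
compiler.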
In the optimized dump (-fno-tree-pre):
ivtmp.333_78 = (unsigned long) &A.9;
…
ivtmp.325_89 = (unsigned long) &A.12;
With -ftree-pre, the last assignment (the one taking &A.12) is gone, but the block
<bb 43> [local count: 428295]:
_gfortran_stop_numeric (37, 0);
still exists. Here, the comparison against the array has been "unrolled", i.e.:
if (_61 != 1)
goto <bb 43>; [5.50%]
(Followed by the conditions for "2" and "3".)
That's perfectly fine and optimizes "A.12" away.
* * *
If I look at the dumps (-fdump-tree-all) on the device side, those (still)
contain:
pretmp_157 = A.12[_15];
…
if (_134 != pretmp_157)
goto <bb 45>; [5.50%]
My impression is that the local static variable "A.12" is removed, based on
the -ftree-pre analysis, before the LTO data is written, while the LTO
expressions that use it are written before that removal. At least that
would explain why it fails on the device side.