https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118348
Bug ID: 118348
Summary: [SVE] HACCKernels seems to miscompile with VLS SVE
after 0c5c0c959c2e592b84739f19ca771fa69eb8dfee
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: prathamesh3492 at gcc dot gnu.org
Target Milestone: ---
Hi,
HACCKernels (https://git.cels.anl.gov/hacc/HACCKernels) appears to be
miscompiled, and the resulting binary fails with a "bus error", after the
following commit:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=0c5c0c959c2e592b84739f19ca771fa69eb8dfee
when built with the following options: -O3 -ffast-math -fopenmp
-mcpu=neoverse-v2 -msve-vector-bits=128, and run with OMP_NUM_THREADS=1.
Running under gdb shows:
#0 0x003974873e78c382 in ?? ()
#1 0x0000000000401398 in _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void)
() at main.cpp:159
#2 0x3ebecda63dc7e8f3 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
which likely indicates stack corruption. Compiling with
-fstack-protector-strong and running the binary shows:
Maximum OpenMP Threads: 1
Iterations: 2000
*** stack smashing detected ***: terminated
Aborted
The OpenMP clone of the run function begins with the following instructions:
Dump of assembler code for function
_Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void):
=> 0x0000000000401020 <+0>: stp x29, x30, [sp, #-224]!
0x0000000000401024 <+4>: mov x29, sp
After the stp instruction, sp is 0xffffffffed00: x29 is saved at sp and x30 at
sp + 8.
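The slot addresses above follow from the pre-indexed form of stp: sp is first decremented by 224, then x29 and x30 are stored at the new sp and at sp + 8. A minimal sketch of that arithmetic follows. The struct and function names are hypothetical, and the entry sp of 0xffffffffede0 is not stated in the report; it is inferred as 0xffffffffed00 + 224.

```cpp
#include <cassert>
#include <cstdint>

// Models only the addresses touched by `stp x29, x30, [sp, #-224]!`.
// Names here are illustrative and do not come from GCC or HACCKernels.
struct FrameRecord {
    std::uint64_t new_sp;    // sp after the pre-indexed writeback
    std::uint64_t x29_slot;  // address holding the saved frame pointer
    std::uint64_t x30_slot;  // address holding the saved return address
};

constexpr FrameRecord push_frame(std::uint64_t entry_sp,
                                 std::uint64_t frame_size) {
    // Pre-index: the base register is updated first, then used for the store.
    const std::uint64_t new_sp = entry_sp - frame_size;
    return {new_sp, new_sp, new_sp + 8};
}

// Assumption: entry sp inferred from the reported sp (0xffffffffed00 + 224).
constexpr FrameRecord kOmpFnFrame = push_frame(0xffffffffede0ULL, 224);

static_assert(kOmpFnFrame.x29_slot == 0xffffffffed00ULL, "x29 saved at sp");
static_assert(kOmpFnFrame.x30_slot == 0xffffffffed08ULL, "x30 saved at sp + 8");
```

Under that assumption the 224-byte frame spans 0xffffffffed00 up to 0xffffffffede0, so any 16-byte write starting at 0xffffffffed00 covers exactly the saved x29 and x30.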
Setting a watchpoint on 0xffffffffed00 shows that the saved values of x29 and
x30 get overwritten in
_Z19GravityForceKernel4iPfS_S_S_fffffRfS0_S0_ at the following st1w instruction:
0x00000000004018dc <+316>: st1w {z31.s}, p5, [x9, #-1, mul vl]
=> 0x00000000004018e0 <+320>: whilelo p6.s, w8, w0
z31: {0x3e7da4b63f191827, 0x3dbd38643de2b7de}
with 0xffffffffed00 overwritten by the lower half of z31 (0x3e7da4b63f191827)
and 0xffffffffed08 overwritten by the upper half (0x3dbd38643de2b7de).
The backtrace after the st1w thus shows:
#0 0x00000000004018e0 in GravityForceKernel<4, PolyCoefficients4> (n=619,
x=0x433580, y=0x433f40, z=0x434900,
mass=0x4352c0, x0=<optimized out>, y0=<optimized out>, z0=<optimized out>,
MaxSepSqrd=<optimized out>,
SofteningLenSqrd=<optimized out>, ax=@0xffffffffedcc: 0,
ay=@0xffffffffedc8: 0, az=@0xffffffffedc4: 0)
at GravityForceKernel.cpp:118
#1 GravityForceKernel4 (n=n@entry=619, x=x@entry=0x433580, y=y@entry=0x433f40,
z=z@entry=0x434900,
mass=mass@entry=0x4352c0, x0=<optimized out>, y0=<optimized out>,
z0=<optimized out>, MaxSepSqrd=<optimized out>,
SofteningLenSqrd=<optimized out>, ax=@0xffffffffedcc: 0,
ay=@0xffffffffedc8: 0, az=@0xffffffffedc4: 0)
at GravityForceKernel.cpp:132
#2 0x0000000000401398 in _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void)
() at main.cpp:159
#3 0x3dbd38643de2b7de in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
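The clobbering store can be sketched as plain address arithmetic. With -msve-vector-bits=128 the vector length is 16 bytes, so `[x9, #-1, mul vl]` addresses x9 - 16, and a store with all lanes active writes 16 bytes from there. The value of x9 is not given in this report; the sketch below only derives what it would have to be (0xffffffffed10) for the store to start at 0xffffffffed00. It also splits the two 64-bit halves of z31 into 32-bit .s lanes, assuming gdb printed them as little-endian 64-bit values. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Vector length in bytes for -msve-vector-bits=128.
constexpr std::uint64_t kVlBytes = 128 / 8;

// Byte offset of `#imm, mul vl`, kept in unsigned (wrapping) arithmetic.
constexpr std::uint64_t mul_vl_offset(std::int64_t imm) {
    return static_cast<std::uint64_t>(imm) * kVlBytes;
}

// Effective address of `[base, #imm, mul vl]`.
constexpr std::uint64_t sve_address(std::uint64_t base, std::int64_t imm) {
    return base + mul_vl_offset(imm);
}

// Base register value needed for the access to start at `target`.
constexpr std::uint64_t base_for(std::uint64_t target, std::int64_t imm) {
    return target - mul_vl_offset(imm);
}

// 32-bit lane `lane` (0..3) of a 128-bit register given as two 64-bit
// halves; assumes lane 0 is the low 32 bits of the lower half.
constexpr std::uint32_t lane_s(std::uint64_t lo, std::uint64_t hi,
                               unsigned lane) {
    return static_cast<std::uint32_t>(((lane < 2) ? lo : hi)
                                      >> (32 * (lane & 1)));
}

// Inferred, not reported: x9 such that the st1w begins at 0xffffffffed00.
constexpr std::uint64_t kInferredX9 = base_for(0xffffffffed00ULL, -1);
static_assert(kInferredX9 == 0xffffffffed10ULL, "x9 = target + 16");
```

Note that frame #3 in the backtrace above reports 0x3dbd38643de2b7de, which is the upper half of z31: the unwinder is reading the clobbered x30 slot at 0xffffffffed08 as a return address.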
Thanks,
Prathamesh