https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162
Bug ID: 102162
Summary: Byte-wise access optimized away at -O1 and above
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: danglin at gcc dot gnu.org
CC: helge.deller at sap dot com
Target Milestone: ---
Host: hppa*-*-linux*
Target: hppa*-*-linux*
Build: hppa*-*-linux*
Created attachment 51394
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51394&action=edit
Test case
The packed attribute is used in Linux v5.14 to request byte-wise access
to unaligned data. This is important on hppa as loads and stores require
strict alignment.
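To make "byte-wise access" concrete, here is a minimal sketch of the kind of load I would expect for a packed 32-bit field on a big-endian target such as hppa. The helper name load_u32_bytewise_be is hypothetical and is not part of the kernel or of the attached test case; it only illustrates four single-byte loads combined with shifts, which is valid at any address.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Reads a 32-bit value one byte at a time, most significant byte first.
   On a big-endian target this yields the same value as a native word
   load, but every memory access is a single byte, so the address does
   not need to be word aligned. */
static uint32_t load_u32_bytewise_be(const void *p)
{
    const unsigned char *b = p;

    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}
```

Calling it on an odd address, for example a char buffer plus one, must not trap on a strict-alignment machine.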
The attached test program is miscompiled at -O1 and above. The byte-wise
accesses are optimized to a single ldw instruction during RTL expansion:
.LEVEL 2.0w
.text
.align 8
.globl test
.type test, @function
test:
.PROC
.CALLINFO FRAME=0,NO_CALLS
.ENTRY
addil LT'output_len,%r27
ldd RT'output_len(%r1),%r28
ldw 0(%r28),%r28
bve (%r2)
extrd,s %r28,63,32,%r28
.EXIT
.PROCEND
.size test, .-test
.globl output_len
.section .bss
.type output_len, @object
.size output_len, 4
.align 1
output_len:
.block 4
.ident "GCC: (GNU) 10.3.0"
This faults when output_len is not aligned on a word boundary.
I'm not sure, but the problem may originate in the einline pass (dump file test-unaligned.c.027t.einline):
;; Function get_unaligned_le32 (get_unaligned_le32, funcdef_no=0, decl_uid=1506, cgraph_uid=1, symbol_order=1)
Iterations: 0
get_unaligned_le32 (const void * p)
{
const struct
{
u32 x;
} * __pptr;
u32 _4;
<bb 2> :
__pptr_2 = p_1(D);
_4 = __pptr_2->x;
return _4;
}
;; Function test (test, funcdef_no=1, decl_uid=1512, cgraph_uid=2, symbol_order=2)
Iterations: 1
Symbols to be put in SSA form
{ D.1520 D.1524 }
Incremental SSA update started at block: 0
Number of blocks in CFG: 5
Number of blocks to update: 4 ( 80%)
Merging blocks 2 and 4
Merging blocks 2 and 3
test ()
{
u32 D.1524;
unsigned int _1;
unsigned int _3;
int _4;
<bb 2> :
_3 = MEM[(const struct *)&output_len].x;
_5 = _3;
_1 = _5;
_4 = (int) _1;
return _4;
}
Ultimately, this MEM is expanded to the single ldw shown in the assembly above.
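For readers without access to the attachment, the dumps above are consistent with a test case shaped roughly like the sketch below. This is a reconstruction, not the attached file: the identifiers (u32, get_unaligned_le32, __pptr, x, test, output_len) come from the dumps, while the packed attribute on the anonymous struct and the aligned(1) declaration of output_len are assumptions based on the description and on the ".align 1" directive in the assembly.

```c
typedef unsigned int u32;

/* Assumed declaration: the assembly shows a 4-byte object in .bss
   emitted with ".align 1", i.e. no word alignment is guaranteed. */
u32 output_len __attribute__((aligned(1)));

/* Assumed shape of the helper: the pointer is cast to a packed
   anonymous struct so that the compiler may not assume natural
   alignment for the member x. */
static inline u32 get_unaligned_le32(const void *p)
{
    const struct { u32 x; } __attribute__((packed)) *__pptr = p;

    return __pptr->x;
}

/* After inlining, this corresponds to the
   MEM[(const struct *)&output_len].x load in the einline dump. */
int test(void)
{
    return get_unaligned_le32(&output_len);
}
```

On a target that tolerates unaligned word loads the function simply returns the stored value; the fault described here is only expected on strict-alignment targets when the access is expanded to a word load.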