On August 15, 2019 4:52:24 PM GMT+02:00, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote: >On 8/15/19 2:54 PM, Richard Biener wrote: >> On Thu, 15 Aug 2019, Bernd Edlinger wrote: >> >>>>>> >>>>>> Hmm. So your patch overrides user-alignment here. Woudln't it >>>>>> be better to do that more conciously by >>>>>> >>>>>> if (! DECL_USER_ALIGN (decl) >>>>>> || (align < GET_MODE_ALIGNMENT (DECL_MODE (decl)) >>>>>> && targetm.slow_unaligned_access (DECL_MODE (decl), >align))) >>>>>> >>> >>> ? I don't know why that would be better? >>> If the value is underaligned no matter why, pretend it was declared >as >>> naturally aligned if that causes wrong code otherwise. >>> That was the idea here. >> >> It would be better because then we ignore it and use what we'd use >> by default rather than inventing sth new. And your patch suggests >> it might be needed to up align even w/o DECL_USER_ALIGN. >> > >Hmmm, you mean the constant 1.0i should not have DECL_USER_ALIGN set? >But it inherits the alignment from the destination variable, >apparently.
Yes. I think it shouldn't inherit the alignment unless we are assembling a static initializer. > >did you mean >if (! DECL_USER_ALIGN (decl) > && align < GET_MODE_ALIGNMENT (DECL_MODE (decl)) > && ... >? > >I can give it a try. No, I meant || thus ignore DECL_USER_ALIGN if it is sth we have to satisfy with unaligned loads. > >>>>>> IMHO whatever code later fails to properly use unaligned loads >>>>>> should be fixed instead rather than ignoring user requested >alignment. >>>>>> >>>>>> Can you quote a short testcase that explains what exactly goes >wrong? >>>>>> The struct-layout ones are awkward to look at... >>>>>> >>>>> >>>>> Sure, >>>>> >>>>> $ cat test.c >>>>> _Complex float __attribute__((aligned(1))) cf; >>>>> >>>>> void foo (void) >>>>> { >>>>> cf = 1.0i; >>>>> } >>>>> >>>>> $ arm-linux-gnueabihf-gcc -S test.c >>>>> during RTL pass: expand >>>>> test.c: In function 'foo': >>>>> test.c:5:6: internal compiler error: in gen_movsf, at >config/arm/arm.md:7003 >>>>> 5 | cf = 1.0i; >>>>> | ~~~^~~~~~ >>>>> 0x7ba475 gen_movsf(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/config/arm/arm.md:7003 >>>>> 0xa49587 insn_gen_fn::operator()(rtx_def*, rtx_def*) const >>>>> ../../gcc-trunk/gcc/recog.h:318 >>>>> 0xa49587 emit_move_insn_1(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3695 >>>>> 0xa49914 emit_move_insn(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3791 >>>>> 0xa494f7 emit_move_complex_parts(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3490 >>>>> 0xa49914 emit_move_insn(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3791 >>>>> 0xa5106f store_expr(tree_node*, rtx_def*, int, bool, bool) >>>>> ../../gcc-trunk/gcc/expr.c:5855 >>>>> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool) >>>>> ../../gcc-trunk/gcc/expr.c:5441 >>>> >>>> Huh, so why didn't it trigger >>>> >>>> /* Handle misaligned stores. */ >>>> mode = TYPE_MODE (TREE_TYPE (to)); >>>> if ((TREE_CODE (to) == MEM_REF >>>> || TREE_CODE (to) == TARGET_MEM_REF) >>>> && mode != BLKmode >>>> && !mem_ref_refers_to_non_mem_p (to) >>>> && ((align = get_object_alignment (to)) >>>> < GET_MODE_ALIGNMENT (mode)) >>>> && (((icode = optab_handler (movmisalign_optab, mode)) >>>> != CODE_FOR_nothing) >>>> || targetm.slow_unaligned_access (mode, align))) >>>> { >>>> >>>> ? (_Complex float is 32bit aligned it seems, the DECL_RTL for the >>>> var is (mem/c:SC (symbol_ref:SI ("cf") [flags 0x2] <var_decl >>>> 0x2aaaaaad1240 cf>) [1 cf+0 S8 A8]), SCmode is 32bit aligned. >>>> >>>> Ah, 'to' is a plain DECL here so the above handling is incomplete. >>>> IIRC component refs like __real cf = 0.f should be handled fine >>>> again(?). So, does adding || DECL_P (to) fix the case as well? >>>> >>> >>> So I tried this instead of the varasm.c change: >>> >>> Index: expr.c >>> =================================================================== >>> --- expr.c (revision 274487) >>> +++ expr.c (working copy) >>> @@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool >nontem >>> /* Handle misaligned stores. */ >>> mode = TYPE_MODE (TREE_TYPE (to)); >>> if ((TREE_CODE (to) == MEM_REF >>> - || TREE_CODE (to) == TARGET_MEM_REF) >>> + || TREE_CODE (to) == TARGET_MEM_REF >>> + || DECL_P (to)) >>> && mode != BLKmode >>> - && !mem_ref_refers_to_non_mem_p (to) >>> + && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to)) >>> && ((align = get_object_alignment (to)) >>> < GET_MODE_ALIGNMENT (mode)) >>> && (((icode = optab_handler (movmisalign_optab, mode)) >>> >>> Result, yes, it fixes this test case >>> but then I run all struct-layout-1.exp there are sill cases. where >we have problems: >>> >>> In file included from >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_x.c:8:^M >>> >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h: >In function 'test2112':^M >>> >/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:23:10: >internal compiler error: in gen_movdf, at config/arm/arm.md:7107^M >>> >/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:62:3: >note: in definition of macro 'TX'^M >>> >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h:113:1: >note: in expansion of macro 'TCI'^M >>> >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h:113:294: >note: in expansion of macro 'F'^M >>> 0x7ba377 gen_movdf(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/config/arm/arm.md:7107^M >>> 0xa494c7 insn_gen_fn::operator()(rtx_def*, rtx_def*) const^M >>> ../../gcc-trunk/gcc/recog.h:318^M >>> 0xa494c7 emit_move_insn_1(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3695^M >>> 0xa49854 emit_move_insn(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3791^M >>> 0xa49437 emit_move_complex_parts(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3490^M >>> 0xa49854 emit_move_insn(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3791^M >>> 0xa50faf store_expr(tree_node*, rtx_def*, int, bool, bool)^M >>> ../../gcc-trunk/gcc/expr.c:5856^M >>> 0xa51f34 expand_assignment(tree_node*, tree_node*, bool)^M >>> ../../gcc-trunk/gcc/expr.c:5302^M >>> 0xa51f34 expand_assignment(tree_node*, tree_node*, bool)^M >>> ../../gcc-trunk/gcc/expr.c:4983^M >>> 0x9338af expand_gimple_stmt_1^M >>> ../../gcc-trunk/gcc/cfgexpand.c:3777^M >>> 0x9338af expand_gimple_stmt^M >>> ../../gcc-trunk/gcc/cfgexpand.c:3875^M >>> 0x939221 expand_gimple_basic_block^M >>> ../../gcc-trunk/gcc/cfgexpand.c:5915^M >>> 0x93af86 execute^M >>> ../../gcc-trunk/gcc/cfgexpand.c:6538^M >>> Please submit a full bug report,^M >>> >>> My personal gut feeling this will be more fragile than over-aligning >the >>> constants. >> >> As said the constant shouldn't end up under-aligned, the user cannot >> specify alignment of literal constants. Not sure what you mean >> with "over"-aligning. >> > > >Hmm wait a moment, I actually wanted _only_ to change the >DECL_ARTIFICIAL >that is built by build_constant_desc. It uses align_variable of >course, >but I totally missed that this also controls the alignment of normal >variables, sorry about the confusion here. > >I mean we should align the constant for the unaligned complex with >the natural alignment of the type-mode. Agreed. That wrong fix made >the variables ignore the alignment, which was of course not intended, >and instead I would need: > >Index: expr.c >=================================================================== >--- expr.c (revision 274531) >+++ expr.c (working copy) >@@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool >nontem > /* Handle misaligned stores. */ > mode = TYPE_MODE (TREE_TYPE (to)); > if ((TREE_CODE (to) == MEM_REF >- || TREE_CODE (to) == TARGET_MEM_REF) >+ || TREE_CODE (to) == TARGET_MEM_REF >+ || DECL_P (to)) > && mode != BLKmode >- && !mem_ref_refers_to_non_mem_p (to) >+ && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to)) > && ((align = get_object_alignment (to)) > < GET_MODE_ALIGNMENT (mode)) > && (((icode = optab_handler (movmisalign_optab, mode)) > >Index: varasm.c >=================================================================== >--- varasm.c (revision 274531) >+++ varasm.c (working copy) >@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3. If not see > #include "stmt.h" > #include "expr.h" > #include "expmed.h" >+#include "optabs.h" > #include "output.h" > #include "langhooks.h" > #include "debug.h" >@@ -3386,7 +3387,15 @@ build_constant_desc (tree exp) > if (TREE_CODE (exp) == STRING_CST) >SET_DECL_ALIGN (decl, targetm.constant_alignment (exp, DECL_ALIGN >(decl))); > else >- align_variable (decl, 0); >+ { >+ align_variable (decl, 0); >+ if (DECL_ALIGN (decl) < GET_MODE_ALIGNMENT (DECL_MODE (decl)) >+ && ((optab_handler (movmisalign_optab, DECL_MODE (decl)) >+ != CODE_FOR_nothing) >+ || targetm.slow_unaligned_access (DECL_MODE (decl), >+ DECL_ALIGN (decl)))) >+ SET_DECL_ALIGN (decl, GET_MODE_ALIGNMENT (DECL_MODE (decl))); >+ } > > /* Now construct the SYMBOL_REF and the MEM. */ > if (use_object_blocks_p ()) > >>> >>> >>>>> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool) >>>>> ../../gcc-trunk/gcc/expr.c:4983 >>>>> 0x93396f expand_gimple_stmt_1 >>>>> ../../gcc-trunk/gcc/cfgexpand.c:3777 >>>>> 0x93396f expand_gimple_stmt >>>>> ../../gcc-trunk/gcc/cfgexpand.c:3875 >>>>> 0x9392e1 expand_gimple_basic_block >>>>> ../../gcc-trunk/gcc/cfgexpand.c:5915 >>>>> 0x93b046 execute >>>>> ../../gcc-trunk/gcc/cfgexpand.c:6538 >>>>> Please submit a full bug report, >>>>> with preprocessed source if appropriate. >>>>> Please include the complete backtrace with any bug report. >>>>> See <https://gcc.gnu.org/bugs/> for instructions. >>>>> >>>>> Without the hunk in varasm.c of course. >>>>> >>>>> What happens is that expand_expr_real_2 returns a unaligned >mem_ref here: >>>>> >>>>> case COMPLEX_CST: >>>>> /* Handle evaluating a complex constant in a CONCAT target. >*/ >>>>> if (original_target && GET_CODE (original_target) == CONCAT) >>>>> { >>>>> [... this path not taken ...] >>> >>> BTW: this code block executes when the other ICE happens. >>> >>>>> } >>>>> >>>>> /* fall through */ >>>>> >>>>> case STRING_CST: >>>>> temp = expand_expr_constant (exp, 1, modifier); >>>>> >>>>> /* temp contains a constant address. >>>>> On RISC machines where a constant address isn't valid, >>>>> make some insns to get that address into a register. */ >>>>> if (modifier != EXPAND_CONST_ADDRESS >>>>> && modifier != EXPAND_INITIALIZER >>>>> && modifier != EXPAND_SUM >>>>> && ! memory_address_addr_space_p (mode, XEXP (temp, 0), >>>>> MEM_ADDR_SPACE >(temp))) >>>>> return replace_equiv_address (temp, >>>>> copy_rtx (XEXP (temp, 0))); >>>>> return temp; >>>>> >>>>> The result of expand_expr_real(..., EXPAND_NORMAL) ought to be >usable >>>>> by emit_move_insn, that is expected just *everywhere* and can't be >changed. >>>>> >>>>> This could probably be fixed in an ugly way in the COMPLEX_CST, >handler >>>>> but OTOH, I don't see any reason why this constant has to be >misaligned >>>>> when it can be easily aligned, which avoids the need for a >misaligned access. >>>> >>>> If the COMPLEX_CST happends to end up in unaligned memory then >that's >>>> of course a bug (unless the target requests that for all >COMPLEX_CSTs). >>>> That is, if the unalignment is triggered because the store is to an >>>> unaligned decl. >>>> >>>> But I think the issue is the above one? >>>> >>> >>> yes initially the constant seems to be unaligned. then it is >expanded, >>> and there is no special handling for unaligned constants in >expand_expr_real, >>> and then probably expand_assignment or store_expr seem not fully >prepared for >>> this either. >> >> With a cross I see the constant has regular aligned _Complex type >> so not sure how it can end up unaligned. >> > >Maybe a target configuration issue. >Not sure, I have configured mine this way: > >../gcc-trunk/configure >--prefix=/home/ed/gnu/arm-linux-gnueabihf-linux64 >--target=arm-linux-gnueabihf --enable-languages=all --with-arch=armv7-a >--with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard > >However it appears now there are two different errors, one is in >expand_assignment >which you found (I start to wonder if I should add you to the authors >section >of this patch), and a different one, which I have not yet simplified, >but you can easily try that for yourself: > >make check-gcc-c RUNTESTFLAGS="struct-layout-1.exp=*" > >it is okay when the test fails to execute but there should no internal >compiler errors. > > >>>>> >>>>> The problem is that the code that handles this misaligned access >>>>> is skipped because the mem_rtx has initially no MEM_ATTRS and >therefore >>>>> MEM_ALIGN == 32, and therefore the code that handles the unaligned >>>>> access is not taken. BUT before the mem_rtx is returned it is >>>>> set to MEM_ALIGN = 8 by set_mem_attributes, and we have an >assertion, >>>>> because the result from expand_expr_real(..., EXPAND_NORMAL) ought >to be >>>>> usable with emit_move_insn. >>>> >>>> yes, as said the _access_ determines the address should be aligned >>>> so we shouldn't end up setting MEM_ALIGN to 8 but to 32 according >>>> to the access type/mode. But we can't trust DECL_ALIGN of >>>> FUNCTION_DECLs but we _can_ trust users writing *(int *)fn >>>> (maybe for actual accesses we _can_ trust DECL_ALIGN, it's just >>>> we may not compute nonzero bits for the actual address because >>>> of function pointer mangling) >>>> (for accessing function code I'd say this would be premature >>>> optimization, but ...) >>>> >>> >>> Not a very nice solution, but it is not worth to spend much effort >>> in optimizing undefined behavior, I just want to avoid the ICE >>> at this time and would not trust the DECL_ALIGN either. >> >> So I meant >> >> Index: gcc/builtins.c >> =================================================================== >> --- gcc/builtins.c (revision 274534) >> +++ gcc/builtins.c (working copy) >> @@ -255,7 +255,8 @@ get_object_alignment_2 (tree exp, unsign >> >> /* Extract alignment information from the innermost object and >> possibly adjust bitpos and offset. */ >> - if (TREE_CODE (exp) == FUNCTION_DECL) >> + if (TREE_CODE (exp) == FUNCTION_DECL >> + && addr_p) >> { >> /* Function addresses can encode extra information besides >their >> alignment. However, if TARGET_PTRMEMFUNC_VBIT_LOCATION >> >> so we get at DECL_ALIGN of the FUNCTION_DECL (not sure if we >> can trust it). >> >>>> >>>> Still I think you can't simply override STACK_SLOT_ALIGNMENT just >because >>>> of the mode of an entry param, can you? If you can assume a bigger >>>> alignment then STACK_SLOT_ALIGNMENT should return it. >>>> >>> >>> I don't see a real problem here. All target except i386 and gcn >(whatever that is) >>> use the default for STACK_SLOT_ALIGNMENT which simply allows any >(large) align value >>> to rule the effective STACK_SLOT_ALIGNMENT. The user could have >simply declared >>> the local variable with the alignment that results in better code >FWIW. >>> >>> If the stack alignment is too high that is capped in >assign_stack_local: >>> >>> /* Ignore alignment if it exceeds MAX_SUPPORTED_STACK_ALIGNMENT. >*/ >>> if (alignment_in_bits > MAX_SUPPORTED_STACK_ALIGNMENT) >>> { >>> alignment_in_bits = MAX_SUPPORTED_STACK_ALIGNMENT; >>> alignment = MAX_SUPPORTED_STACK_ALIGNMENT / BITS_PER_UNIT; >>> } >>> >>> I for one, would just assume that MAX_SUPPORTED_STACK_ALIGNMENT >should >>> be sufficient for all modes that need movmisalign_optab and friends. >>> If it is not, an ICE would be just fine. >> >> Hmm. In some way we could better communicate with the user then >> and do not allow under-aligning automatic vars? But the you >> still have packed structs with BLKmode where the actual field >> accesses will carry SImode even when not aligned(?) >> > >Yes, that works also when unaligned. > >> >> Please split it into the parts for the PR and parts making the >> asserts not trigger. >> > >Yes, will do. > >> The PR is already fixed, right? The assign_parm_find_stack_rtl hunk >> is merely an optimization? >> > >Hmmmm... You are right, I should have added that to the commit >message... > >Of course the test cases try to verify the optimization. > > >Thanks >Bernd.