Re: [PATCH] Re-work get_object_alignment (again)
On Thu, 19 Jul 2012, Markus Trippelsdorf wrote: On 2012.07.17 at 15:10 +0200, Richard Guenther wrote: Comments welcome, of course. This patch apparently miscompiles the Linux kernel, which just hangs during early boot: ... SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 Hierarchical RCU implementation. NR_IRQS:4352 nr_irqs:712 16 Extended CMOS year: 2000 Console: colour VGA+ 80x25 console [tty0] enabled (hang) See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54031 The following fixes one issue (but reportedly does not fix the above issue). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-07-20 Richard Guenther rguent...@suse.de * builtins.c (get_object_alignment_2): Correct offset handling when using type alignment of a MEM_REF kind base. Index: gcc/builtins.c === --- gcc/builtins.c (revision 189656) +++ gcc/builtins.c (working copy) @@ -346,12 +346,10 @@ get_object_alignment_2 (tree exp, unsign known_alignment = get_pointer_alignment_1 (addr, ptr_align, ptr_bitpos); - bitpos += ptr_bitpos; align = MAX (ptr_align, align); - if (TREE_CODE (exp) == MEM_REF - || TREE_CODE (exp) == TARGET_MEM_REF) - bitpos += mem_ref_offset (exp).low * BITS_PER_UNIT; + /* The alignment of the pointer operand in a TARGET_MEM_REF +has to take the variable offset parts into account. */ if (TREE_CODE (exp) == TARGET_MEM_REF) { if (TMR_INDEX (exp)) @@ -369,9 +367,19 @@ get_object_alignment_2 (tree exp, unsign /* When EXP is an actual memory reference then we can use TYPE_ALIGN of a pointer indirection to derive alignment. Do so only if get_pointer_alignment_1 did not reveal absolute -alignment knowledge. */ - if (!addr_p !known_alignment) - align = MAX (TYPE_ALIGN (TREE_TYPE (exp)), align); +alignment knowledge and if using that alignment would +improve the situation. */ + if (!addr_p !known_alignment + TYPE_ALIGN (TREE_TYPE (exp)) align) + align = TYPE_ALIGN (TREE_TYPE (exp)); + else + { + /* Else adjust bitpos accordingly. */ + bitpos += ptr_bitpos; + if (TREE_CODE (exp) == MEM_REF + || TREE_CODE (exp) == TARGET_MEM_REF) + bitpos += mem_ref_offset (exp).low * BITS_PER_UNIT; + } } else if (TREE_CODE (exp) == STRING_CST) {
Re: [PATCH] Re-work get_object_alignment (again)
On 2012.07.17 at 15:10 +0200, Richard Guenther wrote: Comments welcome, of course. This patch apparently miscompiles the Linux kernel, which just hangs during early boot: ... SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 Hierarchical RCU implementation. NR_IRQS:4352 nr_irqs:712 16 Extended CMOS year: 2000 Console: colour VGA+ 80x25 console [tty0] enabled (hang) See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54031 -- Markus
Re: [PATCH] Re-work get_object_alignment (again)
Now, back to PR53970, where #pragma pack() is used to pack a struct. With #pragma pack() no part of the type or field-decls have a hint that packing took place (well, their align information tell you), which means the vectorizers use of contains_packed_reference is not conservative enough, nor is expand_exprs use: case BIT_FIELD_REF: case ARRAY_RANGE_REF: normal_inner_ref: { ... if (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (exp, 0))) || (TREE_CODE (TREE_OPERAND (exp, 1)) == FIELD_DECL DECL_PACKED (TREE_OPERAND (exp, 1 packedp = true; I'm not sure if this flag is required for correctness - it's only passed to extract_bit_field - but if it is the above code does not work for #pragma packed structs. I suppose what should be checked is (a few lines below the above test, after we expanded tem) if (MEM_P (op0) GET_MODE (op0) != BLKmode MEM_ALIGN (op0) GET_MODE_ALIGNMENT (GET_MODE (op0))) packedp = true; ? I suppose packedp was computed for STRICT_ALIGNMENT targets only. I'm not changing the above, but Eric, you should be able to produce a #pragma packed testcase that fails on a STRICT_ALIGNMENT target? This is the -fstrict-volatile-bitfields business though, and its documentation explicitly refers to the packed attribute: If the target requires strict alignment, and honoring the field type would require violating this alignment, a warning is issued. If the field has `packed' attribute, the access is done without honoring the field type. If the field doesn't have `packed' attribute, the access is done honoring the field type. In both cases, GCC assumes that the user knows something about the target hardware that it is unaware of. so I'm a little reluctant to touch that. But, yes, generally speaking, testing TYPE_PACKED or DECL_PACKED to drive code generation is wrong. Oh, and this does not yet fix PR53970 - but I hope that I can remove contains_packed_reference ;) Right, it should definitely go away. -- Eric Botcazou
[PATCH] Re-work get_object_alignment (again)
I've arrived at get_object_{or_type,}alignment again looking at PR53970. And I finally concluded we should unconditionally relying on type-alignment on INDIRECT/MEM/TARGET_MEM_REF when we ask for the alignment of an access (as opposed to when asking for the alignment of an address). So the following patch removes get_object_or_type_alignment again and folds its doings into get_object_alignment_2 which now takes an argument to indicate whether we compute address or object alignment. I changed the meaning of the return value of the functions to indicate whether the information returned is correct. That's used to disable the use of type-alignment knowledge when we know for example the underlying decl and see it is not appropriately aligned. This should mask some of the bugs in user code and in the C frontend (at least) with regarding to packed structs. I have bootstrapped and tested this patch on x86_64-unknown-linux-gnu and also a variant that asserts that the alignment returned by get_object_alignment is at least as large as that computed by get_object_or_type_alignment (so we should not regress). Now, back to PR53970, where #pragma pack() is used to pack a struct. With #pragma pack() no part of the type or field-decls have a hint that packing took place (well, their align information tell you), which means the vectorizers use of contains_packed_reference is not conservative enough, nor is expand_exprs use: case BIT_FIELD_REF: case ARRAY_RANGE_REF: normal_inner_ref: { ... if (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (exp, 0))) || (TREE_CODE (TREE_OPERAND (exp, 1)) == FIELD_DECL DECL_PACKED (TREE_OPERAND (exp, 1 packedp = true; I'm not sure if this flag is required for correctness - it's only passed to extract_bit_field - but if it is the above code does not work for #pragma packed structs. I suppose what should be checked is (a few lines below the above test, after we expanded tem) if (MEM_P (op0) GET_MODE (op0) != BLKmode MEM_ALIGN (op0) GET_MODE_ALIGNMENT (GET_MODE (op0))) packedp = true; ? I suppose packedp was computed for STRICT_ALIGNMENT targets only. I'm not changing the above, but Eric, you should be able to produce a #pragma packed testcase that fails on a STRICT_ALIGNMENT target? Comments welcome, of course. Oh, and this does not yet fix PR53970 - but I hope that I can remove contains_packed_reference ;) Thanks, Richard. 2012-07-17 Richard Guenther rguent...@suse.de * tree.h (get_object_or_type_alignment): Remove. * builtins.c (get_object_alignment_2): New function copied from get_object_alignment_1. Take extra argument to indicate whether we take the address of EXP. Rework to use type alignment information if not, and return whether the result is an approximation or not. (get_object_alignment_1): Wrap around get_object_alignment_2. (get_pointer_alignment_1): Call get_object_alignment_2 indicating we take the address. (get_object_or_type_alignment): Remove. * expr.c (expand_assignment): Call get_object_alignment. (expand_expr_real_1): Likewise. Index: gcc/builtins.c === *** gcc/builtins.c.orig 2012-07-17 12:26:37.0 +0200 --- gcc/builtins.c 2012-07-17 13:15:19.018757760 +0200 *** called_as_built_in (tree node) *** 273,283 on the address at which an object is actually located. These two addresses are not always the same. For example, on ARM targets, the address foo of a Thumb function foo() has the lowest bit set, !whereas foo() itself starts on an even address. */ ! bool ! get_object_alignment_1 (tree exp, unsigned int *alignp, ! unsigned HOST_WIDE_INT *bitposp) { HOST_WIDE_INT bitsize, bitpos; tree offset; --- 273,286 on the address at which an object is actually located. These two addresses are not always the same. For example, on ARM targets, the address foo of a Thumb function foo() has the lowest bit set, !whereas foo() itself starts on an even address. !If ADDR_P is true we are taking the address of the memory reference EXP !and thus cannot rely on the access taking place. */ ! ! static bool ! get_object_alignment_2 (tree exp, unsigned int *alignp, ! unsigned HOST_WIDE_INT *bitposp, bool addr_p) { HOST_WIDE_INT bitsize, bitpos; tree offset; *** get_object_alignment_1 (tree exp, unsign *** 293,340 /* Extract alignment information from the innermost object and possibly adjust bitpos and offset. */ ! if (TREE_CODE (exp) == CONST_DECL) ! exp = DECL_INITIAL (exp); ! if (DECL_P (exp) !TREE_CODE (exp) != LABEL_DECL) { ! if (TREE_CODE (exp) == FUNCTION_DECL) ! { ! /* Function addresses can