Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
Julian,

This patch works for register ld/st, but doesn't work for immediate, as shown in the example below. Would you further improve it for imm?

Thanks - Joey

$ arm-none-eabi-gcc -v 2>&1 | grep version
gcc version 4.7.0 20110922 (experimental) [trunk revision 179074] (GCC)

$ cat u.c
struct packed_str
{
  char f0;
  int f1;
} __attribute__((packed)) u;

void foo(int a)
{
  u.f1 = a; // works fine
}

void bar()
{
  u.f1 = 1; // doesn't work
}

$ arm-none-eabi-gcc -mthumb -march=armv7 -S -O2 u.c
$ cat u.s
	.syntax unified
	.arch armv7
	.fpu softvfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 2
	.eabi_attribute 34, 1
	.eabi_attribute 18, 4
	.thumb
	.file	"u.c"
	.text
	.align	2
	.global	foo
	.thumb
	.thumb_func
	.type	foo, %function
foo:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	movw	r3, #:lower16:u
	movt	r3, #:upper16:u
	str	r0, [r3, #1]	@ unaligned
	bx	lr
	.size	foo, .-foo
	.align	2
	.global	bar
	.thumb
	.thumb_func
	.type	bar, %function
bar:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	movw	r3, #:lower16:u
	movs	r1, #0
	movt	r3, #:upper16:u
	@ Still uses aligned str from here
	ldrb	r2, [r3, #0]	@ zero_extendqisi2
	strb	r1, [r3, #4]
	orr	r2, r2, #256
	str	r2, [r3, #0]
	bx	lr
	.size	bar, .-bar
	.comm	u,5,4
	.ident	"GCC: (GNU) 4.7.0 20110922 (experimental) [trunk revision 179074]"
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On 26/08/11 17:39, Julian Brown wrote:
> On Thu, 25 Aug 2011 18:31:21 +0100
> Julian Brown wrote:
>
>> On Thu, 25 Aug 2011 16:46:50 +0100
>> Julian Brown wrote:
>>
>>> So, OK to apply this version, assuming testing comes out OK? (And
>>> the followup patch [2/2], which remains unchanged?)
>>
>> FWIW, all tests pass, apart from
>> gcc.target/arm/volatile-bitfields-3.c, which regresses. The output
>> contains:
>>
>> ldrh	r0, [r3, #2]	@ unaligned
>>
>> I believe that, to conform to the ARM EABI, GCC must use an
>> (aligned) ldr in this case. Is that correct? If so, it looks like the
>> middle-end bitfield code does not take the setting of
>> -fstrict-volatile-bitfields into account.
>
> This version fixes the last issue, by adding additional checks for
> volatile accesses/-fstrict-volatile-bitfields. Tests now show no
> regressions.
>
> OK to apply?

> +  /* On big-endian machines, we count bits from the most significant.
> +     If the bit field insn does not, we must invert.  */

It sounds to me like this comment is somewhat out of date now that we
have BITS_BIG_ENDIAN; it would be better re-worded to reflect the code
as it stands now.

Other than that, OK.

R.

> Thanks,
>
> Julian
>
> ChangeLog
>
> gcc/
> 	* config/arm/arm.c (arm_override_options): Add unaligned_access
> 	support.
> 	(arm_file_start): Emit attribute for unaligned access as
> 	appropriate.
> 	* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
> 	(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
> 	(insv, extzv): Add unaligned-access support.
> 	(extv): Change to expander. Likewise.
> 	(extzv_t1, extv_regsi): Add helpers.
> 	(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
> 	(unaligned_storesi, unaligned_storehi): New.
> 	(*extv_reg): New (previous extv implementation).
> 	* config/arm/arm.opt (munaligned_access): Add option.
> 	* config/arm/constraints.md (Uw): New constraint.
> 	* expmed.c (store_bit_field_1): Adjust bitfield numbering according
> 	to size of access, not size of unit, when BITS_BIG_ENDIAN !=
> 	BYTES_BIG_ENDIAN. Don't use bitfield accesses for
> 	volatile accesses when -fstrict-volatile-bitfields is in effect.
> 	(extract_bit_field_1): Likewise.
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
Julian Brown wrote:
> The problem is, if we're using little-endian bit numbering for memory
> locations in big-endian-bytes mode, we need to define an origin from
> which to count "backwards" from. For the current implementation, this
> will now be (I believe) one word beyond the base address of the access
> in question, which is IMO slightly bizarre, to say the least.

It doesn't seem all that bizarre to me. Conceptually, the extraction
(etc.) operation works on an operand of integer type (usually, word
sized) in two steps:

- read the operand from its storage location, resulting in a plain
  (conceptual) integer value

- extract certain numbered bits from that integer value

The first step, reading the operand from the storage location, depends
on BYTES_BIG_ENDIAN (as all memory reads do). It does not depend on
BITS_BIG_ENDIAN at all.

The second step, extracting numbered bits, depends on BITS_BIG_ENDIAN,
which provides the link between a bit number and its value. This step,
however, does not depend on BYTES_BIG_ENDIAN at all. [However, if
BITS_BIG_ENDIAN is true, you need to consider the total size of the
operand in order to perform the conversion, since bit 0 then refers to
the most-significant bit, and which bit that is depends on the size ...
But this still does not depend on *BYTES_BIG_ENDIAN*.]

Thus, the two flags can be considered truly independent, and any
combination is well-defined.

When actually implementing the extraction, you will of course deviate
from that conceptual sequence, e.g. by avoiding reading certain bytes
if you know none of the bits in them will end up in the final result.
But while this might result in some computations that may not
immediately look obvious, in the end this is just an optimization step
and doesn't change the fact that the endian flags remain well-defined ...

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Thu, 25 Aug 2011 18:31:21 +0100
Julian Brown wrote:

> On Thu, 25 Aug 2011 16:46:50 +0100
> Julian Brown wrote:
>
> > So, OK to apply this version, assuming testing comes out OK? (And
> > the followup patch [2/2], which remains unchanged?)
>
> FWIW, all tests pass, apart from
> gcc.target/arm/volatile-bitfields-3.c, which regresses. The output
> contains:
>
> ldrh	r0, [r3, #2]	@ unaligned
>
> I believe that, to conform to the ARM EABI, GCC must use an
> (aligned) ldr in this case. Is that correct? If so, it looks like the
> middle-end bitfield code does not take the setting of
> -fstrict-volatile-bitfields into account.

This version fixes the last issue, by adding additional checks for
volatile accesses/-fstrict-volatile-bitfields. Tests now show no
regressions.

OK to apply?

Thanks,

Julian

ChangeLog

gcc/
	* config/arm/arm.c (arm_override_options): Add unaligned_access
	support.
	(arm_file_start): Emit attribute for unaligned access as
	appropriate.
	* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
	(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
	(insv, extzv): Add unaligned-access support.
	(extv): Change to expander. Likewise.
	(extzv_t1, extv_regsi): Add helpers.
	(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
	(unaligned_storesi, unaligned_storehi): New.
	(*extv_reg): New (previous extv implementation).
	* config/arm/arm.opt (munaligned_access): Add option.
	* config/arm/constraints.md (Uw): New constraint.
	* expmed.c (store_bit_field_1): Adjust bitfield numbering according
	to size of access, not size of unit, when BITS_BIG_ENDIAN !=
	BYTES_BIG_ENDIAN. Don't use bitfield accesses for
	volatile accesses when -fstrict-volatile-bitfields is in effect.
	(extract_bit_field_1): Likewise.
commit 645a7c99ff91ea2841c8101fb3c76e3b1fddb2c7
Author: Julian Brown
Date:   Tue Aug 23 05:46:22 2011 -0700

    Unaligned support for packed structs

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3162b30..cc1eb80 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1905,6 +1905,28 @@ arm_option_override (void)
       fix_cm3_ldrd = 0;
     }
 
+  /* Enable -munaligned-access by default for
+     - all ARMv6 architecture-based processors
+     - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+     Disable -munaligned-access by default for
+     - all pre-ARMv6 architecture-based processors
+     - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+      else
+	unaligned_access = 0;
+    }
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+    {
+      warning (0, "target CPU does not support unaligned accesses");
+      unaligned_access = 0;
+    }
+
   if (TARGET_THUMB1 && flag_schedule_insns)
     {
       /* Don't warn since it's on by default in -O2.  */
@@ -22145,6 +22167,10 @@ arm_file_start (void)
 	    val = 6;
 	  asm_fprintf (asm_out_file, "\t.eabi_attribute 30, %d\n", val);
 
+	  /* Tag_CPU_unaligned_access.  */
+	  asm_fprintf (asm_out_file, "\t.eabi_attribute 34, %d\n",
+		       unaligned_access);
+
 	  /* Tag_ABI_FP_16bit_format.  */
 	  if (arm_fp16_format)
 	    asm_fprintf (asm_out_file, "\t.eabi_attribute 38, %d\n",
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 0f23400..0ea0f7f 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -103,6 +103,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
                         ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])

 ;; UNSPEC_VOLATILE Usage:
@@ -2468,10 +2472,10 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.

 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
-                         (match_operand:SI 1 "general_operand" "")
-                         (match_operand:SI 2 "general_operand" ""))
-        (match_operand:SI 3 "reg_or_int_operand" ""))]
+  [(set (zero_extract (match_operand 0 "nonimmediate_operand" "")
+                      (match_operand 1 "general_operand" "")
+                      (match_operand 2 "general_operand" ""))
+        (match_operand 3 "reg_or_int_operand" ""))]
   "TARGET_ARM || arm_arch_thumb2"
   "
   {
@@ -2482,35 +2486,70 @@
     if (arm_arch_thumb2)
       {
-	bool use_bfi = TRUE;
-
-	if (GET_CODE (operands[3]) == CONST_INT)
+	if (unaligned_access && MEM_P (operands[0])
+	    && s_register_operand (operands[3], GET_MODE (operands[3]))
+	    && (width == 16 || width == 32) && (start_bit % BITS_PER_UNIT) == 0)
 	  {
-	    HOST_WIDE_INT val = INTVAL (operands[3]) & mask;
+
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Thu, 25 Aug 2011, Ramana Radhakrishnan wrote:

> On 25 August 2011 18:31, Julian Brown wrote:
> > On Thu, 25 Aug 2011 16:46:50 +0100
> > Julian Brown wrote:
> >
> >> So, OK to apply this version, assuming testing comes out OK? (And the
> >> followup patch [2/2], which remains unchanged?)
> >
> > FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c,
> > which regresses. The output contains:
> >
> > ldrh	r0, [r3, #2]	@ unaligned
> >
> > I believe that, to conform to the ARM EABI, GCC must use an
> > (aligned) ldr in this case. Is that correct?
>
> That is correct by my reading of the ABI spec.
>
> The relevant section is 7.1.7.5, where it states:
>
> "When a volatile bitfield is read its container must be read exactly
> once using the access width appropriate to the type of the container."
>
> Here the type of the container is a long and hence the access should
> be with an ldr instruction followed by a shift, as it is today,
> irrespective of whether we support unaligned accesses in this case.

Except for packed structures, anyway (and there aren't any packed
structures involved in this particular testcase). The ABI doesn't cover
packed structures, and I maintain that there the target-independent GNU C
semantics take precedence - meaning that if the compiler doesn't know
that a single read with the relevant access width is going to be safe,
it must use an instruction sequence it does know to be safe.

--
Joseph S. Myers
jos...@codesourcery.com
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On 25 August 2011 18:31, Julian Brown wrote:
> On Thu, 25 Aug 2011 16:46:50 +0100
> Julian Brown wrote:
>
>> So, OK to apply this version, assuming testing comes out OK? (And the
>> followup patch [2/2], which remains unchanged?)
>
> FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c,
> which regresses. The output contains:
>
> ldrh	r0, [r3, #2]	@ unaligned
>
> I believe that, to conform to the ARM EABI, GCC must use an
> (aligned) ldr in this case. Is that correct?

That is correct by my reading of the ABI spec.

The relevant section is 7.1.7.5, where it states:

"When a volatile bitfield is read its container must be read exactly
once using the access width appropriate to the type of the container."

Here the type of the container is a long and hence the access should
be with an ldr instruction followed by a shift, as it is today,
irrespective of whether we support unaligned accesses in this case.

cheers
Ramana
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Thu, 25 Aug 2011 16:46:50 +0100
Julian Brown wrote:

> So, OK to apply this version, assuming testing comes out OK? (And the
> followup patch [2/2], which remains unchanged?)

FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c,
which regresses. The output contains:

ldrh	r0, [r3, #2]	@ unaligned

I believe that, to conform to the ARM EABI, GCC must use an (aligned)
ldr in this case. Is that correct? If so, it looks like the middle-end
bitfield code does not take the setting of -fstrict-volatile-bitfields
into account.

Julian
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Mon, 4 Jul 2011 13:43:31 +0200 (CEST)
"Ulrich Weigand" wrote:

> Julian Brown wrote:
>
> > The most awkward change in the patch is to generic code (expmed.c,
> > {store,extract}_bit_field_1): in big-endian mode, the existing
> > behaviour (when inserting/extracting a bitfield to a memory
> > location) is definitely bogus: "unit" is set to BITS_PER_UNIT for
> > memory locations, and if bitsize (the size of the field to
> > insert/extract) is greater than BITS_PER_UNIT (which isn't unusual
> > at all), xbitpos becomes negative. That can't possibly be
> > intentional; I can only assume that this code path is not exercised
> > for machines which have memory alternatives for bitfield
> > insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN mode.
>
> I agree that the current code cannot possibly be correct. However,
> just disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering
> *completely* seems wrong to me as well.
>
> According to the docs, the meaning of the bit position passed to the
> extv/insv expanders is determined by BITS_BIG_ENDIAN, both in the
> cases of register and memory operands. Therefore, if BITS_BIG_ENDIAN
> differs from BYTES_BIG_ENDIAN, we should need a correction for memory
> operands as well. However, this correction needs to be relative to
> the size of the access (i.e. the operand to the extv/insv), not just
> BITS_PER_UNIT.
>
> Note that with that change, the new code your patch introduces to the
> ARM back-end will also need to change. You currently handle bitpos
> like this:
>
>   base_addr = adjust_address (operands[1], HImode,
>                               bitpos / BITS_PER_UNIT);
>
> This implicitly assumes that bitpos counts according to
> BYTES_BIG_ENDIAN, not BITS_BIG_ENDIAN -- which exactly cancels out
> the common code behaviour introduced by your patch ...

I've updated the patch to work with current mainline, and implemented
your suggestion along with the change of the interpretation of bitpos
in the insv/extv/extzv expanders in arm.md.

It seems to work fine (testing still in progress), but I'm a bit
concerned that the semantics of bit-positioning for memory operands
when BYTES_BIG_ENDIAN && !BITS_BIG_ENDIAN are now strange to the point
of perversity. The problem is, if we're using little-endian bit
numbering for memory locations in big-endian-bytes mode, we need to
define an origin from which to count "backwards" from. For the current
implementation, this will now be (I believe) one word beyond the base
address of the access in question, which is IMO slightly bizarre, to
say the least.

But I can't think of any other easy ways forward than either this
patch, or the previous one which disabled bit-endian switching entirely
for memory operands in this case.

So, OK to apply this version, assuming testing comes out OK? (And the
followup patch [2/2], which remains unchanged?)

Thanks,

Julian

ChangeLog

gcc/
	* config/arm/arm.c (arm_override_options): Add unaligned_access
	support.
	(arm_file_start): Emit attribute for unaligned access as
	appropriate.
	* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
	(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
	(insv, extzv): Add unaligned-access support.
	(extv): Change to expander. Likewise.
	(extzv_t1, extv_regsi): Add helpers.
	(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
	(unaligned_storesi, unaligned_storehi): New.
	(*extv_reg): New (previous extv implementation).
	* config/arm/arm.opt (munaligned_access): Add option.
	* config/arm/constraints.md (Uw): New constraint.
	* expmed.c (store_bit_field_1): Adjust bitfield numbering according
	to size of access, not size of unit, when BITS_BIG_ENDIAN !=
	BYTES_BIG_ENDIAN.
	(extract_bit_field_1): Likewise.
commit 7bf9c1c0806ad1ae75e96635cda55fff4c40e7ae
Author: Julian Brown
Date:   Tue Aug 23 05:46:22 2011 -0700

    Unaligned support for packed structs

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 2353704..dda2718 100644
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3162b30..cc1eb80 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1905,6 +1905,28 @@ arm_option_override (void)
       fix_cm3_ldrd = 0;
     }
 
+  /* Enable -munaligned-access by default for
+     - all ARMv6 architecture-based processors
+     - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+     Disable -munaligned-access by default for
+     - all pre-ARMv6 architecture-based processors
+     - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+      else
+	unaligned_access = 0;
+    }
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+    {
+      warning (0, "target CPU does not support unaligned accesses");
+      unaligned_access = 0;
+    }
+
   if (TARGET_THUMB1 && flag_schedule_insns)
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
Julian Brown wrote:

> The most awkward change in the patch is to generic code (expmed.c,
> {store,extract}_bit_field_1): in big-endian mode, the existing behaviour
> (when inserting/extracting a bitfield to a memory location) is
> definitely bogus: "unit" is set to BITS_PER_UNIT for memory locations,
> and if bitsize (the size of the field to insert/extract) is greater than
> BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative.
> That can't possibly be intentional; I can only assume that this code
> path is not exercised for machines which have memory alternatives for
> bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
> mode.

[snip]

> @@ -648,7 +648,7 @@ store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
>    /* On big-endian machines, we count bits from the most significant.
>       If the bit field insn does not, we must invert.  */
>
> -  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
> +  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN && !MEM_P (xop0))
>      xbitpos = unit - bitsize - xbitpos;

I agree that the current code cannot possibly be correct. However,
just disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering
*completely* seems wrong to me as well.

According to the docs, the meaning of the bit position passed to the
extv/insv expanders is determined by BITS_BIG_ENDIAN, both in the
cases of register and memory operands. Therefore, if BITS_BIG_ENDIAN
differs from BYTES_BIG_ENDIAN, we should need a correction for memory
operands as well. However, this correction needs to be relative to
the size of the access (i.e. the operand to the extv/insv), not just
BITS_PER_UNIT.

From looking at the sources, the simplest way to implement that might
be to swap the order of the two corrections, that is, change this:

  /* On big-endian machines, we count bits from the most significant.
     If the bit field insn does not, we must invert.  */

  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
    xbitpos = unit - bitsize - xbitpos;

  /* We have been counting XBITPOS within UNIT.
     Count instead within the size of the register.  */
  if (BITS_BIG_ENDIAN && !MEM_P (xop0))
    xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

  unit = GET_MODE_BITSIZE (op_mode);

to look instead like:

  /* We have been counting XBITPOS within UNIT.
     Count instead within the size of the register.  */
  if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
    xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

  unit = GET_MODE_BITSIZE (op_mode);

  /* On big-endian machines, we count bits from the most significant.
     If the bit field insn does not, we must invert.  */

  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
    xbitpos = unit - bitsize - xbitpos;

(Note that the condition in the first if must then check
BYTES_BIG_ENDIAN instead of BITS_BIG_ENDIAN.)

This change results in unchanged behaviour for register operands in all
cases, and for memory operands if BITS_BIG_ENDIAN == BYTES_BIG_ENDIAN.
For the problematic case of memory operands with BITS_BIG_ENDIAN !=
BYTES_BIG_ENDIAN, it should result in the appropriate correction.

Note that with that change, the new code your patch introduces to the
ARM back-end will also need to change. You currently handle bitpos
like this:

  base_addr = adjust_address (operands[1], HImode,
                              bitpos / BITS_PER_UNIT);

This implicitly assumes that bitpos counts according to
BYTES_BIG_ENDIAN, not BITS_BIG_ENDIAN -- which exactly cancels out
the common code behaviour introduced by your patch ...

Thoughts? Am I overlooking something here?

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Mon, 9 May 2011 18:01:12 +0100
Julian Brown wrote:

> How does this look now? (Re-testing in progress.)

The previously-posted version contained a bug in the "extv" expander,
which became apparent only when testing together with my fixed-point
support patch. This is a corrected version.

Re-tested (together with the followup unaligned-memcpy patch and
fixed-point patch series), with no regressions in both big & little-endian
mode, cross to ARM EABI.

OK to apply this version?

Thanks,

Julian

ChangeLog

gcc/
	* config/arm/arm.c (arm_override_options): Add unaligned_access
	support.
	(arm_file_start): Emit attribute for unaligned access as
	appropriate.
	* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
	(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
	(insv, extzv): Add unaligned-access support.
	(extv): Change to expander. Likewise.
	(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
	(unaligned_storesi, unaligned_storehi): New.
	(*extv_reg): New (previous extv implementation).
	* config/arm/arm.opt (munaligned_access): Add option.
	* config/arm/constraints.md (Uw): New constraint.
	* expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
	memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
	(extract_bit_field_1): Likewise.

commit d66c777aa23e6df0d68265767d4ae9f370ffb761
Author: Julian Brown
Date:   Tue May 24 10:01:48 2011 -0700

    Unaligned support for packed structs

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 47c7a3a..177c9bc 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1674,6 +1674,28 @@ arm_option_override (void)
       fix_cm3_ldrd = 0;
     }
 
+  /* Enable -munaligned-access by default for
+     - all ARMv6 architecture-based processors
+     - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+     Disable -munaligned-access by default for
+     - all pre-ARMv6 architecture-based processors
+     - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+      else
+	unaligned_access = 0;
+    }
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+    {
+      warning (0, "target CPU does not support unaligned accesses");
+      unaligned_access = 0;
+    }
+
   if (TARGET_THUMB1 && flag_schedule_insns)
     {
       /* Don't warn since it's on by default in -O2.  */
@@ -21555,6 +21577,10 @@ arm_file_start (void)
 	    val = 6;
 	  asm_fprintf (asm_out_file, "\t.eabi_attribute 30, %d\n", val);
 
+	  /* Tag_CPU_unaligned_access.  */
+	  asm_fprintf (asm_out_file, "\t.eabi_attribute 34, %d\n",
+		       unaligned_access);
+
 	  /* Tag_ABI_FP_16bit_format.  */
 	  if (arm_fp16_format)
 	    asm_fprintf (asm_out_file, "\t.eabi_attribute 38, %d\n",
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index fad82f0..12fd7ca 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -104,6 +104,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
                         ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])

 ;; UNSPEC_VOLATILE Usage:
@@ -2393,7 +2397,7 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.

 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
+  [(set (zero_extract:SI (match_operand:SI 0 "nonimmediate_operand" "")
                          (match_operand:SI 1 "general_operand" "")
                          (match_operand:SI 2 "general_operand" ""))
         (match_operand:SI 3 "reg_or_int_operand" ""))]
@@ -2407,35 +2411,66 @@
     if (arm_arch_thumb2)
       {
-	bool use_bfi = TRUE;
-
-	if (GET_CODE (operands[3]) == CONST_INT)
+	if (unaligned_access && MEM_P (operands[0])
+	    && s_register_operand (operands[3], GET_MODE (operands[3]))
+	    && (width == 16 || width == 32) && (start_bit % BITS_PER_UNIT) == 0)
 	  {
-	    HOST_WIDE_INT val = INTVAL (operands[3]) & mask;
-
-	    if (val == 0)
+	    rtx base_addr;
+
+	    if (width == 32)
 	      {
-		emit_insn (gen_insv_zero (operands[0], operands[1],
-					  operands[2]));
-		DONE;
+		base_addr = adjust_address (operands[0], SImode,
+					    start_bit / BITS_PER_UNIT);
+		emit_insn (gen_unaligned_storesi (base_addr, operands[3]));
 	      }
+	    else
+	      {
+		rtx tmp = gen_reg_rtx (HImode);
 
-	    /* See if the set can be done with a single orr instruction.  */
-	    if (val == mask && const_ok_for_arm (val << start_bit))
-	      use_bfi = FALSE;
+		base_addr = adjust_address (operands[0], HImode,
+					    start_bit / BITS_PER_UNIT);
+		emit_move_insn (tmp, gen_lowpart (HImode, operands[3]));
+
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Mon, 9 May 2011 18:01:12 +0100
Julian Brown wrote:

> How does this look now? (Re-testing in progress.)

Results from re-testing are fine, btw.

Cheers,

Julian
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Fri, 06 May 2011 14:04:16 +0100
Richard Earnshaw wrote:

> On Fri, 2011-05-06 at 13:34 +0100, Julian Brown wrote:
>> This is the first of two patches to add unaligned-access support to
>> the ARM backend. [...]

> The compiler should fault -munaligned-access on cores that don't
> support it.

I've implemented this as a warning (which automatically disables the
option).

> +(define_insn "unaligned_loadsi"
> +  [(set (match_operand:SI 0 "s_register_operand" "=r")
> +	(unspec:SI [(match_operand:SI 1 "memory_operand" "m")]
> +		   UNSPEC_UNALIGNED_LOAD))]
> +  "unaligned_access"
> +  "ldr%?\t%0, %1\t@ unaligned"
> +  [(set_attr "predicable" "yes")
> +   (set_attr "type" "load1")])
>
> I think the final condition should also include TARGET_32BIT, as these
> patterns aren't expected to work with Thumb-1.

Added.

> Secondly, they should be structured to get the length information
> right when a 16-bit encoding can be used in Thumb-2 (you'll keep
> Carrot happy that way :-) : just add an alternative that's only
> enabled for Thumb mode and which matches the requirements for a
> 16-bit instruction (beware however of the 16-bit write-back case).

I've done this, assuming that by "16-bit write-back case" you meant
singleton-register-list ldmia/stmia instructions masquerading as
post-increment ldr/str? I've disallowed that by adding a new
constraint. It's not entirely clear to me whether Thumb-2 (vs. Thumb-1)
will actually ever use 16-bit ldmia/stmia instead of 32-bit writeback
ldr/str, though.

> Finally, I don't see anything to put out the correct build attribute
> information for unaligned access enabled (Tag_CPU_unaligned_access).
> Have I just missed it?

No, you didn't miss it; it wasn't there :-). Added.

How does this look now? (Re-testing in progress.)

Thanks,

Julian

ChangeLog

gcc/
	* config/arm/arm.c (arm_override_options): Add unaligned_access
	support.
	(arm_file_start): Emit attribute for unaligned access as
	appropriate.
	* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
	(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
	(insv, extzv): Add unaligned-access support.
	(extv): Change to expander. Likewise.
	(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
	(unaligned_storesi, unaligned_storehi): New.
	(*extv_reg): New (previous extv implementation).
	* config/arm/arm.opt (munaligned_access): Add option.
	* config/arm/constraints.md (Uw): New constraint.
	* expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
	memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
	(extract_bit_field_1): Likewise.

commit bd55df2538dd90e576dd0f69bc9c9d570c8eee08
Author: Julian Brown
Date:   Wed May 4 10:06:25 2011 -0700

    Permit regular ldr/str/ldrh/strh for packed-structure accesses etc.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4f9c2aa..f0f1a73 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1833,6 +1833,28 @@ arm_option_override (void)
       fix_cm3_ldrd = 0;
     }
 
+  /* Enable -munaligned-access by default for
+     - all ARMv6 architecture-based processors
+     - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+     Disable -munaligned-access by default for
+     - all pre-ARMv6 architecture-based processors
+     - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+      else
+	unaligned_access = 0;
+    }
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+    {
+      warning (0, "target CPU does not support unaligned accesses");
+      unaligned_access = 0;
+    }
+
   if (TARGET_THUMB1 && flag_schedule_insns)
     {
       /* Don't warn since it's on by default in -O2.  */
@@ -21714,6 +21736,10 @@ arm_file_start (void)
 	    val = 6;
 	  asm_fprintf (asm_out_file, "\t.eabi_attribute 30, %d\n", val);
 
+	  /* Tag_CPU_unaligned_access.  */
+	  asm_fprintf (asm_out_file, "\t.eabi_attribute 34, %d\n",
+		       unaligned_access);
+
 	  /* Tag_ABI_FP_16bit_format.  */
 	  if (arm_fp16_format)
 	    asm_fprintf (asm_out_file, "\t.eabi_attribute 38, %d\n",
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 40ebf35..59b9ffb 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -104,6 +104,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
                         ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])

 ;; UNSPEC_VOLATILE Usage:
@@ -2393,7 +2397,7 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.

 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
+  [(set (zero_extrac
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
On Fri, 2011-05-06 at 13:34 +0100, Julian Brown wrote: > Hi, > > This is the first of two patches to add unaligned-access support to the > ARM backend. This is done somewhat differently to Jie Zhang's earlier > patch: > > http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01890.html > > In that with Jie's patch, *any* pointer dereference would be allowed to > access unaligned data. This has the undesirable side-effect of > disallowing instructions which don't support unaligned accesses (LDRD, > LDM etc.) when unaligned accesses are enabled. > > Instead, this patch enables only packed-structure accesses to use > ldr/str/ldrh/strh, by taking a hint from the MIPS ldl/ldr > implementation. I figured the unaligned-access ARM case is kind of > similar to those, except that normal loads/stores are used, and the > shifting/merging happens in hardware. > > The standard names extv/extzv/insv can take a memory > operand for the source/destination of the extract/insert operation, so > we just expand to unspec'ed versions of the load and store operations > when unaligned-access support is enabled: the benefit of doing that > rather than, say, expanding using the regular movsi pattern is that we > bypass any smartness in the compiler which might replace operations > which work for unaligned accesses (ldr/str/ldrh/strh) with operations > which don't work (ldrd/strd/ldm/stm/vldr/...). The downside is we might > potentially miss out on optimization opportunities (since these things > no longer look like plain memory accesses). > > Doing things this way allows us to leave the settings for > STRICT_ALIGNMENT/SLOW_BYTE_ACCESS alone, avoiding the disruption that > changing them might cause. 
> The most awkward change in the patch is to generic code (expmed.c,
> {store,extract}_bit_field_1): in big-endian mode, the existing
> behaviour (when inserting/extracting a bitfield to a memory location)
> is definitely bogus: "unit" is set to BITS_PER_UNIT for memory
> locations, and if bitsize (the size of the field to insert/extract)
> is greater than BITS_PER_UNIT (which isn't unusual at all), xbitpos
> becomes negative. That can't possibly be intentional; I can only
> assume that this code path is not exercised for machines which have
> memory alternatives for bitfield insert/extract, and BITS_BIG_ENDIAN
> of 0 in BYTES_BIG_ENDIAN mode.
>
> The logic for choosing when to enable the unaligned-access support
> (and the name of the option to override the default behaviour) is
> lifted from Jie's patch.
>
> Tested with cross to ARM Linux, and (on a branch) in both little &
> big-endian mode cross to ARM EABI, with no regressions. OK to apply?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
> gcc/
> * config/arm/arm.c (arm_override_options): Add unaligned_access
> support.
> * config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
> (UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
> (insv, extzv): Add unaligned-access support.
> (extv): Change to expander. Likewise.
> (unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
> (unaligned_storesi, unaligned_storehi): New.
> (*extv_reg): New (previous extv implementation).
> * config/arm/arm.opt (munaligned_access): Add option.
> * expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
> memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
> (extract_bit_field_1): Likewise.

+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+      else
+	unaligned_access = 0;
+    }
+

The compiler should fault -munaligned-access on cores that don't
support it.
+(define_insn "unaligned_loadsi"
+  [(set (match_operand:SI 0 "s_register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "memory_operand" "m")]
+		   UNSPEC_UNALIGNED_LOAD))]
+  "unaligned_access"
+  "ldr%?\t%0, %1\t@ unaligned"
+  [(set_attr "predicable" "yes")
+   (set_attr "type" "load1")])

I think the final condition should also include TARGET_32BIT, as these
patterns aren't expected to work with Thumb-1.

Secondly, they should be structured to get the length information
right when a 16-bit encoding can be used in Thumb-2 (you'll keep
Carrot happy that way :-) ): just add an alternative that's only
enabled for Thumb mode and which matches the requirements for a 16-bit
instruction (beware, however, of the 16-bit write-back case).

Finally, I don't see anything to put out the correct build attribute
information for unaligned access enabled (Tag_CPU_unaligned_access).
Have I just missed it?

R.
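For reference, this is the kind of source-level access the patterns under review are meant to serve. The struct and function names below are illustrative only, not taken from the patch or its testsuite:

```c
/* Hypothetical example: 'payload' lives at byte offset 1 inside a
   packed struct, so a plain word access to it is misaligned.  With
   -munaligned-access the patch lets GCC expand such accesses through
   the unaligned_loadsi/unaligned_storesi patterns (a single ldr/str
   marked "@ unaligned") instead of four byte-wise accesses.  */
struct __attribute__ ((packed)) pkt
{
  char tag;     /* offset 0 */
  int payload;  /* offset 1: misaligned for a 4-byte access */
};

int
read_payload (const struct pkt *p)
{
  return p->payload;    /* bitfield extract -> unaligned load */
}

void
write_payload (struct pkt *p, int v)
{
  p->payload = v;       /* bitfield insert -> unaligned store */
}
```

Compiled with an ARM cross-compiler at -O2 with unaligned access enabled, the member accesses above should each become a single ldr/str annotated "@ unaligned"; without the feature, GCC falls back to ldrb/strb sequences.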
[PATCH, ARM] Unaligned accesses for packed structures [1/2]
Hi,

This is the first of two patches to add unaligned-access support to
the ARM backend. This is done somewhat differently to Jie Zhang's
earlier patch:

http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01890.html

The difference is that with Jie's patch, *any* pointer dereference
would be allowed to access unaligned data. This has the undesirable
side-effect of disallowing instructions which don't support unaligned
accesses (LDRD, LDM etc.) when unaligned accesses are enabled.

Instead, this patch enables only packed-structure accesses to use
ldr/str/ldrh/strh, by taking a hint from the MIPS ldl/ldr
implementation. I figured the unaligned-access ARM case is kind of
similar to those, except that normal loads/stores are used, and the
shifting/merging happens in hardware.

The standard names extv/extzv/insv can take a memory operand for the
source/destination of the extract/insert operation, so we just expand
to unspec'ed versions of the load and store operations when
unaligned-access support is enabled: the benefit of doing that rather
than, say, expanding using the regular movsi pattern is that we bypass
any smartness in the compiler which might replace operations which
work for unaligned accesses (ldr/str/ldrh/strh) with operations which
don't work (ldrd/strd/ldm/stm/vldr/...). The downside is that we might
miss out on optimization opportunities, since these things no longer
look like plain memory accesses.

Doing things this way allows us to leave the settings for
STRICT_ALIGNMENT/SLOW_BYTE_ACCESS alone, avoiding the disruption that
changing them might cause.

The most awkward change in the patch is to generic code (expmed.c,
{store,extract}_bit_field_1): in big-endian mode, the existing
behaviour (when inserting/extracting a bitfield to a memory location)
is definitely bogus: "unit" is set to BITS_PER_UNIT for memory
locations, and if bitsize (the size of the field to insert/extract) is
greater than BITS_PER_UNIT (which isn't unusual at all), xbitpos
becomes negative.
That can't possibly be intentional; I can only assume that this code
path is not exercised for machines which have memory alternatives for
bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
mode.

The logic for choosing when to enable the unaligned-access support
(and the name of the option to override the default behaviour) is
lifted from Jie's patch.

Tested with cross to ARM Linux, and (on a branch) in both little &
big-endian mode cross to ARM EABI, with no regressions. OK to apply?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
(extract_bit_field_1): Likewise.

commit e76508ff702406fd63bc59465d9c7ab70dcb3266
Author: Julian Brown
Date:   Wed May 4 10:06:25 2011 -0700

    Permit regular ldr/str/ldrh/strh for packed-structure accesses etc.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4f9c2aa..a18aea6 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1833,6 +1833,22 @@ arm_option_override (void)
       fix_cm3_ldrd = 0;
     }

+  /* Enable -munaligned-access by default for
+       - all ARMv6 architecture-based processors
+       - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+     Disable -munaligned-access by default for
+       - all pre-ARMv6 architecture-based processors
+       - ARMv6-M architecture-based processors.
   */
+
+  if (unaligned_access == 2)
+    {
+      if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+      else
+	unaligned_access = 0;
+    }
+
   if (TARGET_THUMB1 && flag_schedule_insns)
     {
       /* Don't warn since it's on by default in -O2. */

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 40ebf35..7d37445 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -104,6 +104,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
                         ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD ; Used to represent ldr/ldrh instructions that access
+                        ; unaligned locations, on architectures which support
+                        ; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])

 ;; UNSPEC_VOLATILE Usage:

@@ -2393,7 +2397,7 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.

 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
+  [(set (zero_extract:SI (match_oper
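Tying the option handling together with the review feedback: the value 2 of unaligned_access above is a sentinel meaning "neither -munaligned-access nor -mno-unaligned-access was given", and the reviewer asks that an explicit -munaligned-access be rejected on cores without hardware support. The standalone sketch below combines both behaviours; all names and the diagnostic text are illustrative stand-ins, not GCC's actual internals:

```c
#include <stdio.h>

/* Illustrative stand-ins for GCC's ARM target flags.  */
static int arm_arch6, arm_arch_notm, arm_arch7;

/* 2 = option not given (the sentinel used in the patch),
   1 = -munaligned-access, 0 = -mno-unaligned-access.
   Returns the resolved 0/1 setting.  */
static int
resolve_unaligned_access (int unaligned_access)
{
  /* Same architecture test as the patch: ARMv6 and later, except
     ARMv6-M (no-Thumb-only cores before v7 lack the support).  */
  int supported = arm_arch6 && (arm_arch_notm || arm_arch7);

  if (unaligned_access == 2)
    return supported;           /* pick the architectural default */

  if (unaligned_access == 1 && !supported)
    {
      /* What the reviewer requests: fault an explicit
         -munaligned-access instead of silently honouring it.  */
      fprintf (stderr,
               "error: target CPU does not support unaligned accesses\n");
      return 0;
    }

  return unaligned_access;      /* explicit setting is usable as-is */
}
```

An explicit -mno-unaligned-access (0) always stands, matching the usual GCC convention that disabling a feature is always safe.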