Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-09-21 Thread Ye Joey
Julian,

This patch works for register loads/stores, but doesn't work for
immediate stores, as shown in the example below.

Would you improve it further to handle immediates?

Thanks - Joey

$ arm-none-eabi-gcc -v 2>&1 | grep version
gcc version 4.7.0 20110922 (experimental) [trunk revision 179074] (GCC)
$ cat u.c
struct packed_str
{
char f0;
int  f1;
} __attribute__((packed)) u;

void foo(int a)
{
  u.f1 = a; // works fine
}

void bar()
{
  u.f1 = 1; // doesn't work
}

$ arm-none-eabi-gcc -mthumb -march=armv7 -S -O2 u.c
$ cat u.s
.syntax unified
.arch armv7
.fpu softvfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 1
.eabi_attribute 30, 2
.eabi_attribute 34, 1
.eabi_attribute 18, 4
.thumb
.file   "u.c"
.text
.align  2
.global foo
.thumb
.thumb_func
.type   foo, %function
foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
movw    r3, #:lower16:u
movt    r3, #:upper16:u
str     r0, [r3, #1]    @ unaligned
bx  lr
.size   foo, .-foo
.align  2
.global bar
.thumb
.thumb_func
.type   bar, %function
bar:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
movw    r3, #:lower16:u
movs    r1, #0
movt    r3, #:upper16:u

@ Still uses aligned str from here
ldrb    r2, [r3, #0]    @ zero_extendqisi2
strb    r1, [r3, #4]
orr     r2, r2, #256
str     r2, [r3, #0]
bx  lr
.size   bar, .-bar
.comm   u,5,4
.ident  "GCC: (GNU) 4.7.0 20110922 (experimental) [trunk revision 179074]"
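
The immediate case most likely falls foul of the s_register_operand
check on operands[3] in the insv expander (quoted later in this
thread): a constant store takes the aligned read-modify-write path
instead. With immediates handled, the expected code for bar would
presumably be along these lines (a sketch, not actual compiler
output):

movw    r3, #:lower16:u
movt    r3, #:upper16:u
movs    r2, #1
str     r2, [r3, #1]    @ unaligned
bx      lr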


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-09-08 Thread Richard Earnshaw
On 26/08/11 17:39, Julian Brown wrote:
> On Thu, 25 Aug 2011 18:31:21 +0100
> Julian Brown  wrote:
> 
>> On Thu, 25 Aug 2011 16:46:50 +0100
>> Julian Brown  wrote:
>>
>>> So, OK to apply this version, assuming testing comes out OK? (And
>>> the followup patch [2/2], which remains unchanged?)
>>
>> FWIW, all tests pass, apart from
>> gcc.target/arm/volatile-bitfields-3.c, which regresses. The output
>> contains:
>>
>> ldrh    r0, [r3, #2]    @ unaligned
>>
>> I believe that, to conform to the ARM EABI, GCC must use an
>> (aligned) ldr in this case. Is that correct? If so, it looks like the
>> middle-end bitfield code does not take the setting of
>> -fstrict-volatile-bitfields into account.
> 
> This version fixes the last issue, by adding additional checks for
> volatile accesses/-fstrict-volatile-bitfields. Tests now show no
> regressions.
> 
> OK to apply?
> 

+  /* On big-endian machines, we count bits from the most significant.
+     If the bit field insn does not, we must invert.  */


It sounds to me like this comment is somewhat out of date now that we
have BITS_BIG_ENDIAN; it would be better re-worded to reflect the code
as it stands now.

Other than that, OK.

R.

> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * config/arm/arm.c (arm_override_options): Add unaligned_access
> support.
> (arm_file_start): Emit attribute for unaligned access as
> appropriate.
> * config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
> (UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
> (insv, extzv): Add unaligned-access support.
> (extv): Change to expander. Likewise.
> (extzv_t1, extv_regsi): Add helpers.
> (unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
> (unaligned_storesi, unaligned_storehi): New.
> (*extv_reg): New (previous extv implementation).
> * config/arm/arm.opt (munaligned_access): Add option.
> * config/arm/constraints.md (Uw): New constraint.
> * expmed.c (store_bit_field_1): Adjust bitfield numbering according
> to size of access, not size of unit, when BITS_BIG_ENDIAN !=
> BYTES_BIG_ENDIAN. Don't use bitfield accesses for
> volatile accesses when -fstrict-volatile-bitfields is in effect.
> (extract_bit_field_1): Likewise.
> 


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-09-01 Thread Ulrich Weigand
Julian Brown wrote:

> The problem is, if we're using little-endian bit numbering for memory
> locations in big-endian-bytes mode, we need to define an origin to
> count "backwards" from. For the current implementation, this
> will now be (I believe) one word beyond the base address of the access
> in question, which is IMO slightly bizarre, to say the least.

It doesn't seem all that bizarre to me.  Conceptually, the extraction
(etc.) operation works on an operand of integer type (usually, word sized)
in two steps:
- read the operand from its storage location, resulting in a plain
  (conceptual) integer value
- extract certain numbered bits from that integer value

The first step, reading the operand from the storage location,
depends on BYTES_BIG_ENDIAN (as all memory reads do).  It does not
depend on BITS_BIG_ENDIAN at all.

The second step, extracting numbered bits, depends on BITS_BIG_ENDIAN,
which provides the link between bit number and its value.  This step
however does not depend on BYTES_BIG_ENDIAN at all.  [However, if
BITS_BIG_ENDIAN is true, you need to consider the total size of the
operand in order to perform the conversion, since bit 0 then refers
to the most-significant bit, and which bit that is depends on the
size ...  But this still does not depend on *BYTES_BIG_ENDIAN*.]

Thus, the two flags can be considered really independent, and any
combination is quite well-defined.
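
To make the second step concrete, here is a minimal C sketch
(illustrative only; the function name and signature are made up, and
it assumes 0 < bitsize < 32):

unsigned int
extract_bits (unsigned int op, int bitpos, int bitsize,
              int bits_big_endian)
{
  /* Step 1 (reading OP from storage, governed by BYTES_BIG_ENDIAN)
     is assumed to have happened already; this is step 2 only.
     Convert the bit number into a shift count.  With big-endian bit
     numbering, bit 0 names the most significant bit, so the total
     operand size (32 here) enters the conversion.  */
  int shift = bits_big_endian ? 32 - bitpos - bitsize : bitpos;
  return (op >> shift) & ((1u << bitsize) - 1u);
}

Note how BYTES_BIG_ENDIAN does not appear at all, matching the
independence of the two flags claimed above.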


When actually implementing the extraction, you will of course deviate
from that conceptual sequence, e.g. by avoiding reading certain bytes
if you know none of the bits in them will end up in the final result.
But while this might result in some computations that do not
immediately look obvious, in the end this is just an optimization
step and doesn't change the fact that the endian flags remain
well-defined ...


Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-08-26 Thread Julian Brown
On Thu, 25 Aug 2011 18:31:21 +0100
Julian Brown  wrote:

> On Thu, 25 Aug 2011 16:46:50 +0100
> Julian Brown  wrote:
> 
> > So, OK to apply this version, assuming testing comes out OK? (And
> > the followup patch [2/2], which remains unchanged?)
> 
> FWIW, all tests pass, apart from
> gcc.target/arm/volatile-bitfields-3.c, which regresses. The output
> contains:
> 
> ldrh    r0, [r3, #2]    @ unaligned
> 
> I believe that, to conform to the ARM EABI, GCC must use an
> (aligned) ldr in this case. Is that correct? If so, it looks like the
> middle-end bitfield code does not take the setting of
> -fstrict-volatile-bitfields into account.

This version fixes the last issue, by adding additional checks for
volatile accesses/-fstrict-volatile-bitfields. Tests now show no
regressions.

OK to apply?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
(arm_file_start): Emit attribute for unaligned access as
appropriate.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(extzv_t1, extv_regsi): Add helpers.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* config/arm/constraints.md (Uw): New constraint.
* expmed.c (store_bit_field_1): Adjust bitfield numbering according
to size of access, not size of unit, when BITS_BIG_ENDIAN !=
BYTES_BIG_ENDIAN. Don't use bitfield accesses for
volatile accesses when -fstrict-volatile-bitfields is in effect.
(extract_bit_field_1): Likewise.
commit 645a7c99ff91ea2841c8101fb3c76e3b1fddb2c7
Author: Julian Brown 
Date:   Tue Aug 23 05:46:22 2011 -0700

Unaligned support for packed structs

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3162b30..cc1eb80 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1905,6 +1905,28 @@ arm_option_override (void)
 	fix_cm3_ldrd = 0;
 }
 
+  /* Enable -munaligned-access by default for
+ - all ARMv6 architecture-based processors
+ - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+ Disable -munaligned-access by default for
+ - all pre-ARMv6 architecture-based processors
+ - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+{
+  if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+  else
+	unaligned_access = 0;
+}
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+{
+  warning (0, "target CPU does not support unaligned accesses");
+  unaligned_access = 0;
+}
+
   if (TARGET_THUMB1 && flag_schedule_insns)
 {
   /* Don't warn since it's on by default in -O2.  */
@@ -22145,6 +22167,10 @@ arm_file_start (void)
 	val = 6;
   asm_fprintf (asm_out_file, "\t.eabi_attribute 30, %d\n", val);
 
+  /* Tag_CPU_unaligned_access.  */
+  asm_fprintf (asm_out_file, "\t.eabi_attribute 34, %d\n",
+		   unaligned_access);
+
   /* Tag_ABI_FP_16bit_format.  */
   if (arm_fp16_format)
 	asm_fprintf (asm_out_file, "\t.eabi_attribute 38, %d\n",
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 0f23400..0ea0f7f 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -103,6 +103,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
 ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])
 
 ;; UNSPEC_VOLATILE Usage:
@@ -2468,10 +2472,10 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.
 
 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
- (match_operand:SI 1 "general_operand" "")
- (match_operand:SI 2 "general_operand" ""))
-(match_operand:SI 3 "reg_or_int_operand" ""))]
+  [(set (zero_extract (match_operand 0 "nonimmediate_operand" "")
+  (match_operand 1 "general_operand" "")
+  (match_operand 2 "general_operand" ""))
+(match_operand 3 "reg_or_int_operand" ""))]
   "TARGET_ARM || arm_arch_thumb2"
   "
   {
@@ -2482,35 +2486,70 @@
 
 if (arm_arch_thumb2)
   {
-	bool use_bfi = TRUE;
-
-	if (GET_CODE (operands[3]) == CONST_INT)
+if (unaligned_access && MEM_P (operands[0])
+	&& s_register_operand (operands[3], GET_MODE (operands[3]))
+	&& (width == 16 || width == 32) && (start_bit % BITS_PER_UNIT) == 0)
 	  {
-	HOST_WIDE_INT val = INTVAL (operands[3]) & mask;
+	

Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-08-25 Thread Joseph S. Myers
On Thu, 25 Aug 2011, Ramana Radhakrishnan wrote:

> On 25 August 2011 18:31, Julian Brown  wrote:
> > On Thu, 25 Aug 2011 16:46:50 +0100
> > Julian Brown  wrote:
> >
> >> So, OK to apply this version, assuming testing comes out OK? (And the
> >> followup patch [2/2], which remains unchanged?)
> >
> > FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c,
> > which regresses. The output contains:
> >
> >        ldrh    r0, [r3, #2]    @ unaligned
> >
> > I believe that, to conform to the ARM EABI, GCC must use an
> > (aligned) ldr in this case. Is that correct?
> 
> That is correct by my reading of the ABI Spec.
> 
> The relevant section is 7.1.7.5 where it states that :
> 
> "When a volatile bitfield is read it's container must be read exactly
> once using the access width appropriate to the type of the container.
> "
> 
> Here the type of the container is a long and hence the access should
> be with an ldr instruction followed by a shift, as it is today,
> irrespective of whether we support unaligned accesses in this case.

Except for packed structures, anyway (and there aren't any packed 
structures involved in this particular testcase).  The ABI doesn't cover 
packed structures and I maintain that there the target-independent GNU C 
semantics take precedence - meaning that if the compiler doesn't know that 
a single read with the relevant access width is going to be safe, it must 
use an instruction sequence it does know to be safe.
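
A hypothetical example of the case being described (made up for
illustration, not from the testsuite):

struct __attribute__((packed)) s
{
  char c;              /* pushes the bitfield container off its natural alignment */
  volatile int f : 12;
} x;

/* Reading x.f: the ABI rule would want one aligned word read of the
   container, but here the compiler cannot know such a read is safe,
   so it must use a sequence it does know to be safe.  */
int get_f (void) { return x.f; }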

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-08-25 Thread Ramana Radhakrishnan
On 25 August 2011 18:31, Julian Brown  wrote:
> On Thu, 25 Aug 2011 16:46:50 +0100
> Julian Brown  wrote:
>
>> So, OK to apply this version, assuming testing comes out OK? (And the
>> followup patch [2/2], which remains unchanged?)
>
> FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c,
> which regresses. The output contains:
>
>        ldrh    r0, [r3, #2]    @ unaligned
>
> I believe that, to conform to the ARM EABI, GCC must use an
> (aligned) ldr in this case. Is that correct?

That is correct by my reading of the ABI Spec.

The relevant section is 7.1.7.5 where it states that :

"When a volatile bitfield is read it's container must be read exactly
once using the access width appropriate to the type of the container.
"

Here the type of the container is a long and hence the access should
be with an ldr instruction followed by a shift, as it is today,
irrespective of whether we support unaligned accesses in this case.
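
A sketch of such a conforming sequence (registers, offset, and field
position all illustrative):

ldr     r0, [r3, #0]    @ read the whole container exactly once
lsr     r0, r0, #16     @ then extract the field by shifting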

cheers
Ramana


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-08-25 Thread Julian Brown
On Thu, 25 Aug 2011 16:46:50 +0100
Julian Brown  wrote:

> So, OK to apply this version, assuming testing comes out OK? (And the
> followup patch [2/2], which remains unchanged?)

FWIW, all tests pass, apart from gcc.target/arm/volatile-bitfields-3.c,
which regresses. The output contains:

ldrh    r0, [r3, #2]    @ unaligned

I believe that, to conform to the ARM EABI, GCC must use an
(aligned) ldr in this case. Is that correct? If so, it looks like the
middle-end bitfield code does not take the setting of
-fstrict-volatile-bitfields into account.

Julian


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-08-25 Thread Julian Brown
On Mon, 4 Jul 2011 13:43:31 +0200 (CEST)
"Ulrich Weigand"  wrote:

> Julian Brown wrote:
> 
> > The most awkward change in the patch is to generic code (expmed.c,
> > {store,extract}_bit_field_1): in big-endian mode, the existing
> > behaviour (when inserting/extracting a bitfield to a memory
> > location) is definitely bogus: "unit" is set to BITS_PER_UNIT for
> > memory locations, and if bitsize (the size of the field to
> > insert/extract) is greater than BITS_PER_UNIT (which isn't unusual
> > at all), xbitpos becomes negative. That can't possibly be
> > intentional; I can only assume that this code path is not exercised
> > for machines which have memory alternatives for bitfield
> > insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN mode.

> I agree that the current code cannot possibly be correct.  However,
> just disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering
> *completely* seems wrong to me as well.  
> 
> According to the docs, the meaning of the bit position passed to the
> extv/insv expanders is determined by BITS_BIG_ENDIAN, both in the
> cases of register and memory operands.  Therefore, if BITS_BIG_ENDIAN
> differs from BYTES_BIG_ENDIAN, we should need a correction for memory
> operands as well.  However, this correction needs to be relative to
> the size of the access (i.e. the operand to the extv/insv), not just
> BITS_PER_UNIT.

> Note that with that change, the new code your patch introduces to the
> ARM back-end will also need to change.  You currently handle bitpos
> like this:
> 
>   base_addr = adjust_address (operands[1], HImode,
>   bitpos / BITS_PER_UNIT);
> 
> This implicitly assumes that bitpos counts according to
> BYTES_BIG_ENDIAN, not BITS_BIG_ENDIAN -- which exactly cancels out
> the common code behaviour introduced by your patch ...

I've updated the patch to work with current mainline, and implemented
your suggestion along with the change of the interpretation of bitpos in
the insv/extv/extzv expanders in arm.md. It seems to work fine
(testing still in progress), but I'm a bit concerned that the semantics
of bit-positioning for memory operands when BYTES_BIG_ENDIAN
&& !BITS_BIG_ENDIAN are now strange to the point of perversity.

The problem is, if we're using little-endian bit numbering for memory
locations in big-endian-bytes mode, we need to define an origin to
count "backwards" from. For the current implementation, this
will now be (I believe) one word beyond the base address of the access
in question, which is IMO slightly bizarre, to say the least. But, I
can't think of any other easy ways forward than either this patch, or
the previous one which disabled bit-endian switching entirely for
memory operands in this case.
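
(Concretely: for an SImode access at address A with BYTES_BIG_ENDIAN,
the least significant byte lives at A+3, and little-endian bit
numbering puts bit 0 there, so bit numbers effectively count
backwards from the word boundary at A+4, which is the "one word
beyond the base address" origin described above.)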

So, OK to apply this version, assuming testing comes out OK? (And the
followup patch [2/2], which remains unchanged?)

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
(arm_file_start): Emit attribute for unaligned access as
appropriate.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(extzv_t1, extv_regsi): Add helpers.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* config/arm/constraints.md (Uw): New constraint.
* expmed.c (store_bit_field_1): Adjust bitfield numbering according
to size of access, not size of unit, when BITS_BIG_ENDIAN !=
BYTES_BIG_ENDIAN.
(extract_bit_field_1): Likewise.
commit 7bf9c1c0806ad1ae75e96635cda55fff4c40e7ae
Author: Julian Brown 
Date:   Tue Aug 23 05:46:22 2011 -0700

Unaligned support for packed structs

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 2353704..dda2718 100644
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3162b30..cc1eb80 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1905,6 +1905,28 @@ arm_option_override (void)
 	fix_cm3_ldrd = 0;
 }
 
+  /* Enable -munaligned-access by default for
+ - all ARMv6 architecture-based processors
+ - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+ Disable -munaligned-access by default for
+ - all pre-ARMv6 architecture-based processors
+ - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+{
+  if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+  else
+	unaligned_access = 0;
+}
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+{
+  warning (0, "target CPU does not support unaligned accesses");
+  unaligned_access = 0;
+}
+
   if (TARGET_THUMB1 && flag_schedule_insns)

Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-07-04 Thread Ulrich Weigand
Julian Brown wrote:

> The most awkward change in the patch is to generic code (expmed.c,
> {store,extract}_bit_field_1): in big-endian mode, the existing behaviour
> (when inserting/extracting a bitfield to a memory location) is
> definitely bogus: "unit" is set to BITS_PER_UNIT for memory locations,
> and if bitsize (the size of the field to insert/extract) is greater than
> BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative.
> That can't possibly be intentional; I can only assume that this code
> path is not exercised for machines which have memory alternatives for
> bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
> mode.
[snip]
> @@ -648,7 +648,7 @@ store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT 
> bitsize,
>/* On big-endian machines, we count bits from the most significant.
>If the bit field insn does not, we must invert.  */
>  
> -  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
> +  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN && !MEM_P (xop0))
>   xbitpos = unit - bitsize - xbitpos;

I agree that the current code cannot possibly be correct.  However, just
disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering *completely*
seems wrong to me as well.  

According to the docs, the meaning of the bit position passed to the extv/insv
expanders is determined by BITS_BIG_ENDIAN, both in the cases of register
and memory operands.  Therefore, if BITS_BIG_ENDIAN differs from
BYTES_BIG_ENDIAN, we should need a correction for memory operands as
well.  However, this correction needs to be relative to the size of
the access (i.e. the operand to the extv/insv), not just BITS_PER_UNIT.

From looking at the sources, the simplest way to implement that might
be to swap the order of the two corrections, that is, change this:


  /* On big-endian machines, we count bits from the most significant.
 If the bit field insn does not, we must invert.  */

  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
xbitpos = unit - bitsize - xbitpos;

  /* We have been counting XBITPOS within UNIT.
 Count instead within the size of the register.  */
  if (BITS_BIG_ENDIAN && !MEM_P (xop0))
xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

  unit = GET_MODE_BITSIZE (op_mode);


to look instead like:

  /* We have been counting XBITPOS within UNIT.
 Count instead within the size of the register.  */
  if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

  unit = GET_MODE_BITSIZE (op_mode);

  /* On big-endian machines, we count bits from the most significant.
 If the bit field insn does not, we must invert.  */

  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
xbitpos = unit - bitsize - xbitpos;


(Note that the condition in the first if must then check BYTES_BIG_ENDIAN
instead of BITS_BIG_ENDIAN.)   This change results in unchanged behaviour
for register operands in all cases, and memory operands if BITS_BIG_ENDIAN
== BYTES_BIG_ENDIAN.  For the problematic case of memory operands with
BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN, it should result in the appropriate
correction.
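
(A worked example of that problematic case: for a memory operand with
bitsize = 16 and xbitpos = 0, the current code uses unit =
BITS_PER_UNIT = 8 and computes xbitpos = 8 - 16 - 0 = -8, the bogus
negative value noted earlier.  With the corrections swapped and
op_mode = SImode, unit becomes 32 first, and the inversion then
yields xbitpos = 32 - 16 - 0 = 16, a sensible bit offset within the
32-bit access.)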

Note that with that change, the new code your patch introduces to the
ARM back-end will also need to change.  You currently handle bitpos
like this:

  base_addr = adjust_address (operands[1], HImode,
  bitpos / BITS_PER_UNIT);

This implicitly assumes that bitpos counts according to BYTES_BIG_ENDIAN,
not BITS_BIG_ENDIAN -- which exactly cancels out the common code behaviour
introduced by your patch ...

Thoughts?  Am I overlooking something here?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-05-26 Thread Julian Brown
On Mon, 9 May 2011 18:01:12 +0100
Julian Brown  wrote:

> How does this look now? (Re-testing in progress.)

The previously-posted version contained a bug in the "extv" expander,
which became apparent only when testing together with my fixed-point
support patch. This is a corrected version.

Re-tested (together with the followup unaligned-memcpy patch and
fixed-point patch series), with no regressions in both big &
little-endian mode, cross to ARM EABI.

OK to apply this version?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
(arm_file_start): Emit attribute for unaligned access as
appropriate.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* config/arm/constraints.md (Uw): New constraint.
* expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
(extract_bit_field_1): Likewise.
commit d66c777aa23e6df0d68265767d4ae9f370ffb761
Author: Julian Brown 
Date:   Tue May 24 10:01:48 2011 -0700

Unaligned support for packed structs

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 47c7a3a..177c9bc 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1674,6 +1674,28 @@ arm_option_override (void)
 	fix_cm3_ldrd = 0;
 }
 
+  /* Enable -munaligned-access by default for
+ - all ARMv6 architecture-based processors
+ - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+ Disable -munaligned-access by default for
+ - all pre-ARMv6 architecture-based processors
+ - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+{
+  if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+  else
+	unaligned_access = 0;
+}
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+{
+  warning (0, "target CPU does not support unaligned accesses");
+  unaligned_access = 0;
+}
+
   if (TARGET_THUMB1 && flag_schedule_insns)
 {
   /* Don't warn since it's on by default in -O2.  */
@@ -21555,6 +21577,10 @@ arm_file_start (void)
 	val = 6;
   asm_fprintf (asm_out_file, "\t.eabi_attribute 30, %d\n", val);
 
+  /* Tag_CPU_unaligned_access.  */
+  asm_fprintf (asm_out_file, "\t.eabi_attribute 34, %d\n",
+		   unaligned_access);
+
   /* Tag_ABI_FP_16bit_format.  */
   if (arm_fp16_format)
 	asm_fprintf (asm_out_file, "\t.eabi_attribute 38, %d\n",
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index fad82f0..12fd7ca 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -104,6 +104,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
 ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])
 
 ;; UNSPEC_VOLATILE Usage:
@@ -2393,7 +2397,7 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.
 
 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
+  [(set (zero_extract:SI (match_operand:SI 0 "nonimmediate_operand" "")
  (match_operand:SI 1 "general_operand" "")
  (match_operand:SI 2 "general_operand" ""))
 (match_operand:SI 3 "reg_or_int_operand" ""))]
@@ -2407,35 +2411,66 @@
 
 if (arm_arch_thumb2)
   {
-	bool use_bfi = TRUE;
-
-	if (GET_CODE (operands[3]) == CONST_INT)
+if (unaligned_access && MEM_P (operands[0])
+	&& s_register_operand (operands[3], GET_MODE (operands[3]))
+	&& (width == 16 || width == 32) && (start_bit % BITS_PER_UNIT) == 0)
 	  {
-	HOST_WIDE_INT val = INTVAL (operands[3]) & mask;
-
-	if (val == 0)
+	rtx base_addr;
+	
+	if (width == 32)
 	  {
-		emit_insn (gen_insv_zero (operands[0], operands[1],
-	  operands[2]));
-		DONE;
+	base_addr = adjust_address (operands[0], SImode,
+	start_bit / BITS_PER_UNIT);
+		emit_insn (gen_unaligned_storesi (base_addr, operands[3]));
 	  }
+	else
+	  {
+	rtx tmp = gen_reg_rtx (HImode);
 
-	/* See if the set can be done with a single orr instruction.  */
-	if (val == mask && const_ok_for_arm (val << start_bit))
-	  use_bfi = FALSE;
+	base_addr = adjust_address (operands[0], HImode,
+	start_bit / BITS_PER_UNIT);
+		emit_move_insn (tmp, gen_lowpart (HImode, operands[3]));
+

Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-05-10 Thread Julian Brown
On Mon, 9 May 2011 18:01:12 +0100
Julian Brown  wrote:

> How does this look now? (Re-testing in progress.)

Results from re-testing are fine, btw.

Cheers,

Julian


Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-05-09 Thread Julian Brown
On Fri, 06 May 2011 14:04:16 +0100
Richard Earnshaw  wrote:

> On Fri, 2011-05-06 at 13:34 +0100, Julian Brown wrote:
> > This is the first of two patches to add unaligned-access support to
> > the ARM backend. [...]

> The compiler should fault -munaligned-access on cores that don't
> support it.

I've implemented this as a warning (which automatically disables the
option).

> +(define_insn "unaligned_loadsi"
> +  [(set (match_operand:SI 0 "s_register_operand" "=r")
> +   (unspec:SI [(match_operand:SI 1 "memory_operand" "m")]
> +  UNSPEC_UNALIGNED_LOAD))]
> +  "unaligned_access"
> +  "ldr%?\t%0, %1\t@ unaligned"
> +  [(set_attr "predicable" "yes")
> +   (set_attr "type" "load1")])
> 
> I think the final condition should also include TARGET_32BIT, as these
> patterns aren't expected to work with Thumb-1.

Added.

> Secondly, they should be structured to get the length information
> right when a 16-bit encoding can be used in Thumb-2 (you'll keep
> Carrot happy that way :-) : just add an alternative that's only
> enabled for Thumb mode and which matches the requirements for a
> 16-bit instruction (beware however of the 16-bit write-back case).

I've done this, assuming that by "16-bit write-back case" you meant
singleton-register-list ldmia/stmia instructions masquerading as
post-increment ldr/str? I've disallowed that by adding a new
constraint. It's not entirely clear to me whether Thumb-2 (vs. Thumb-1)
will actually ever use 16-bit ldmia/stmia instead of 32-bit writeback
ldr/str though.

> Finally, I don't see anything to put out the correct build attribute
> information for unaligned access enabled (Tag_CPU_unaligned_access).
> Have I just missed it?

No you didn't miss it, it wasn't there :-). Added.

How does this look now? (Re-testing in progress.)

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
(arm_file_start): Emit attribute for unaligned access as
appropriate.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* config/arm/constraints.md (Uw): New constraint.
* expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
(extract_bit_field_1): Likewise.
commit bd55df2538dd90e576dd0f69bc9c9d570c8eee08
Author: Julian Brown 
Date:   Wed May 4 10:06:25 2011 -0700

Permit regular ldr/str/ldrh/strh for packed-structure accesses etc.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4f9c2aa..f0f1a73 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1833,6 +1833,28 @@ arm_option_override (void)
 	fix_cm3_ldrd = 0;
 }
 
+  /* Enable -munaligned-access by default for
+ - all ARMv6 architecture-based processors
+ - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+ Disable -munaligned-access by default for
+ - all pre-ARMv6 architecture-based processors
+ - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+{
+  if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+  else
+	unaligned_access = 0;
+}
+  else if (unaligned_access == 1
+	   && !(arm_arch6 && (arm_arch_notm || arm_arch7)))
+{
+  warning (0, "target CPU does not support unaligned accesses");
+  unaligned_access = 0;
+}
+
   if (TARGET_THUMB1 && flag_schedule_insns)
 {
   /* Don't warn since it's on by default in -O2.  */
@@ -21714,6 +21736,10 @@ arm_file_start (void)
 	val = 6;
   asm_fprintf (asm_out_file, "\t.eabi_attribute 30, %d\n", val);
 
+  /* Tag_CPU_unaligned_access.  */
+  asm_fprintf (asm_out_file, "\t.eabi_attribute 34, %d\n",
+		   unaligned_access);
+
   /* Tag_ABI_FP_16bit_format.  */
   if (arm_fp16_format)
 	asm_fprintf (asm_out_file, "\t.eabi_attribute 38, %d\n",
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 40ebf35..59b9ffb 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -104,6 +104,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
 ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])
 
 ;; UNSPEC_VOLATILE Usage:
@@ -2393,7 +2397,7 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.
 
 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
+  [(set (zero_extrac

Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-05-06 Thread Richard Earnshaw

On Fri, 2011-05-06 at 13:34 +0100, Julian Brown wrote:
> Hi,
> 
> This is the first of two patches to add unaligned-access support to the
> ARM backend. This is done somewhat differently to Jie Zhang's earlier
> patch:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01890.html
> 
> With Jie's patch, *any* pointer dereference would be allowed to
> access unaligned data. This has the undesirable side-effect of
> disallowing instructions which don't support unaligned accesses (LDRD,
> LDM etc.) when unaligned accesses are enabled.
> 
> Instead, this patch enables only packed-structure accesses to use
> ldr/str/ldrh/strh, by taking a hint from the MIPS ldl/ldr
> implementation. I figured the unaligned-access ARM case is kind of
> similar to those, except that normal loads/stores are used, and the
> shifting/merging happens in hardware.
> 
> The standard names extv/extzv/insv can take a memory
> operand for the source/destination of the extract/insert operation, so
> we just expand to unspec'ed versions of the load and store operations
> when unaligned-access support is enabled: the benefit of doing that
> rather than, say, expanding using the regular movsi pattern is that we
> bypass any smartness in the compiler which might replace operations
> which work for unaligned accesses (ldr/str/ldrh/strh) with operations
> which don't work (ldrd/strd/ldm/stm/vldr/...). The downside is we might
> potentially miss out on optimization opportunities (since these things
> no longer look like plain memory accesses).
> 
> Doing things this way allows us to leave the settings for
> STRICT_ALIGNMENT/SLOW_BYTE_ACCESS alone, avoiding the disruption that
> changing them might cause.
> 
> The most awkward change in the patch is to generic code (expmed.c,
> {store,extract}_bit_field_1): in big-endian mode, the existing behaviour
> (when inserting/extracting a bitfield to a memory location) is
> definitely bogus: "unit" is set to BITS_PER_UNIT for memory locations,
> and if bitsize (the size of the field to insert/extract) is greater than
> BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative.
> That can't possibly be intentional; I can only assume that this code
> path is not exercised for machines which have memory alternatives for
> bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
> mode.
> 
> The logic for choosing when to enable the unaligned-access support (and
> the name of the option to override the default behaviour) is lifted from
> Jie's patch.
> 
> Tested with cross to ARM Linux, and (on a branch) in both little &
> big-endian mode cross to ARM EABI, with no regressions. OK to apply?
> 
> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * config/arm/arm.c (arm_override_options): Add unaligned_access
> support.
> * config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
> (UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
> (insv, extzv): Add unaligned-access support.
> (extv): Change to expander. Likewise.
> (unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
> (unaligned_storesi, unaligned_storehi): New.
> (*extv_reg): New (previous extv implementation).
> * config/arm/arm.opt (munaligned_access): Add option.
> * expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
> memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
> (extract_bit_field_1): Likewise.

+  if (unaligned_access == 2)
+{
+  if (arm_arch6 && (arm_arch_notm || arm_arch7))
+   unaligned_access = 1;
+  else
+   unaligned_access = 0;
+}
+

The compiler should fault -munaligned-access on cores that don't support
it.
+(define_insn "unaligned_loadsi"
+  [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (unspec:SI [(match_operand:SI 1 "memory_operand" "m")]
+  UNSPEC_UNALIGNED_LOAD))]
+  "unaligned_access"
+  "ldr%?\t%0, %1\t@ unaligned"
+  [(set_attr "predicable" "yes")
+   (set_attr "type" "load1")])

I think the final condition should also include TARGET_32BIT, as these
patterns aren't expected to work with Thumb-1.

Secondly, they should be structured to get the length information right
when a 16-bit encoding can be used in Thumb-2 (you'll keep Carrot happy
that way :-) : just add an alternative that's only enabled for Thumb
mode and which matches the requirements for a 16-bit instruction (beware
however of the 16-bit write-back case).

Finally, I don't see anything to put out the correct build attribute
information for unaligned access enabled (Tag_CPU_unaligned_access).
Have I just missed it?

R.





[PATCH, ARM] Unaligned accesses for packed structures [1/2]

2011-05-06 Thread Julian Brown
Hi,

This is the first of two patches to add unaligned-access support to the
ARM backend. This is done somewhat differently to Jie Zhang's earlier
patch:

  http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01890.html

With Jie's patch, *any* pointer dereference would be allowed to
access unaligned data. This has the undesirable side-effect of
disallowing instructions which don't support unaligned accesses (LDRD,
LDM etc.) when unaligned accesses are enabled.

Instead, this patch enables only packed-structure accesses to use
ldr/str/ldrh/strh, by taking a hint from the MIPS ldl/ldr
implementation. I figured the unaligned-access ARM case is kind of
similar to those, except that normal loads/stores are used, and the
shifting/merging happens in hardware.

The standard names extv/extzv/insv can take a memory
operand for the source/destination of the extract/insert operation, so
we just expand to unspec'ed versions of the load and store operations
when unaligned-access support is enabled: the benefit of doing that
rather than, say, expanding using the regular movsi pattern is that we
bypass any smartness in the compiler which might replace operations
which work for unaligned accesses (ldr/str/ldrh/strh) with operations
which don't work (ldrd/strd/ldm/stm/vldr/...). The downside is we might
potentially miss out on optimization opportunities (since these things
no longer look like plain memory accesses).
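
As a rough illustration (hand-written RTL, not dumped from the
compiler), a 32-bit load of a packed field at byte offset 1 expands
to a single unspec'ed access that only the new unaligned-capable
patterns can match:

  (set (reg:SI 0)
       (unspec:SI [(mem:SI (plus:SI (reg:SI 3) (const_int 1)))]
                  UNSPEC_UNALIGNED_LOAD))

which the unaligned_loadsi pattern then emits as
"ldr r0, [r3, #1] @ unaligned".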

Doing things this way allows us to leave the settings for
STRICT_ALIGNMENT/SLOW_BYTE_ACCESS alone, avoiding the disruption that
changing them might cause.

The most awkward change in the patch is to generic code (expmed.c,
{store,extract}_bit_field_1): in big-endian mode, the existing behaviour
(when inserting/extracting a bitfield to a memory location) is
definitely bogus: "unit" is set to BITS_PER_UNIT for memory locations,
and if bitsize (the size of the field to insert/extract) is greater than
BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative.
That can't possibly be intentional; I can only assume that this code
path is not exercised for machines which have memory alternatives for
bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
mode.

The logic for choosing when to enable the unaligned-access support (and
the name of the option to override the default behaviour) is lifted from
Jie's patch.

Tested with cross to ARM Linux, and (on a branch) in both little &
big-endian mode cross to ARM EABI, with no regressions. OK to apply?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* expmed.c (store_bit_field_1): Don't tweak bitfield numbering for
memory locations if BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
(extract_bit_field_1): Likewise.
commit e76508ff702406fd63bc59465d9c7ab70dcb3266
Author: Julian Brown 
Date:   Wed May 4 10:06:25 2011 -0700

Permit regular ldr/str/ldrh/strh for packed-structure accesses etc.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4f9c2aa..a18aea6 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1833,6 +1833,22 @@ arm_option_override (void)
 	fix_cm3_ldrd = 0;
 }
 
+  /* Enable -munaligned-access by default for
+ - all ARMv6 architecture-based processors
+ - ARMv7-A, ARMv7-R, and ARMv7-M architecture-based processors.
+
+ Disable -munaligned-access by default for
+ - all pre-ARMv6 architecture-based processors
+ - ARMv6-M architecture-based processors.  */
+
+  if (unaligned_access == 2)
+{
+  if (arm_arch6 && (arm_arch_notm || arm_arch7))
+	unaligned_access = 1;
+  else
+	unaligned_access = 0;
+}
+
   if (TARGET_THUMB1 && flag_schedule_insns)
 {
   /* Don't warn since it's on by default in -O2.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 40ebf35..7d37445 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -104,6 +104,10 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
 ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_UNALIGNED_LOAD	; Used to represent ldr/ldrh instructions that access
+			; unaligned locations, on architectures which support
+			; that.
+  UNSPEC_UNALIGNED_STORE ; Same for str/strh.
 ])
 
 ;; UNSPEC_VOLATILE Usage:
@@ -2393,7 +2397,7 @@
 ;;; this insv pattern, so this pattern needs to be reevalutated.
 
 (define_expand "insv"
-  [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "")
+  [(set (zero_extract:SI (match_oper