Re: bswap PRs 69714, 67781
Hi Bernd, First of all, my apologize for the late reply. I was in holidays the past week to celebrate Chinese new year. On Friday, February 12, 2016 05:28:43 PM Bernd Schmidt wrote: > PR69714 is an issue where the bswap pass makes an incorrect > transformation on big-endian targets. The source has a 32-bit bswap, but > PA doesn't have a pattern for that. Still, we recognize that there is a > 16-bit bswap involved, and generate code for that - loading the halfword > at offset 2 from the original memory, as per the proper big-endian > correction. > > The problem is that we recognized the rotation of the _high_ part, which > is at offset 0 on big-endian. The symbolic number is 0x0304, rather than > 0x0102 as it should be. Only the latter form should ever be matched. Which is exactly what the patch for PR67781 was set out to do (see the if (BYTES_BIG_ENDIAN) block in find_bswap_or_nop. The reason why the offset is wrong is due to another if (BYTES_BIG_ENDIAN) block in bswap_replace. I will check the testcase added with that latter block, my guess is that the change was trying to fix a similar issue to PR67781 and PR69714. When removing it the load in avcrc is done without an offset. I should have run the full testsuite also on a big endian system instead of a few selected testcases and a bootstrap in addition to the little endian bootstrap+testsuite. Lesson learned. > The > problem is caused by the patch for PR67781, which was intended to solve > a different big-endian problem. Unfortunately, I think it is based on an > incorrect analysis. > > The real issue with the PR67781 testcase is in fact the masking loop, > identified by Thomas in comment #7 for 67781. > > for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++) >; > n->range = rsize; > > If we have a value of 0x01020304, but a range of 5, it means that > there's an "invisible" high-order byte that we don't care about. On > little-endian, we can just ignore it. On big-endian, this implies that > the data we're interested in is located at an offset. The code that does > the replacements does not use the offset or bytepos fields, it assumes > that the bytepos always matches that of the load instruction. Yes, but the change in find_bswap_or_nop aims at checking that we have 0x05040302 or 0x02030405 for big endian targets and 0x04030201 or 0x01020304 for little endian targets. Before the "if (rsize < n->range)" block, cmpnop and cmpxchg are respectively 0x0504030201 and 0x0102030405. Then for big endian it will only keep the 4 least significant symbolic bytes of cmpxchg (if performs a bitwise and) and the 4 most significant symbolic bytes of cmpnop (it performs a right shift) so you'd get 0x05040302 for cmpnop and 0x02030405 for cmpxchg. Both would translate to a load at offset 0, and then a byteswap for the latter. As said earlier, the problem is in bswap_replace which tries to adjust the address of the load for big endian targets by adding a load offset. With the change in find_bswap_or_nop, an offset is never needed because only pattern that correspond to a load at offset 0 are recognized. I kept for GCC 7 to change that to allow offset and recognize all sub-load and sub-bswap. > The only > offset we can introduce is the big-endian correction, but that assumes > we're always dealing with lowparts. 
> > So, I think the correct/conservative fix for both bugs is to revert the > earlier change for PR67781, and then apply the following on top: > > --- revert.tree-ssa-math-opts.c 2016-02-12 15:22:57.098895058 +0100 > +++ tree-ssa-math-opts.c 2016-02-12 15:23:08.482228474 +0100 > @@ -2473,10 +2473,14 @@ find_bswap_or_nop (gimple *stmt, struct > /* Find real size of result (highest non-zero byte). */ > if (n->base_addr) > { > - int rsize; > + unsigned HOST_WIDE_INT rsize; > uint64_t tmpn; > > for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > rsize++); > + if (BYTES_BIG_ENDIAN && n->range != rsize) > + /* This implies an offset, which is currently not handled by > +bswap_replace. */ > + return NULL; > n->range = rsize; > } This works too yes with less optimizations for big endian. I'm fine with either solutions. This one is indeed a bit more conservative so I see the appeal to use it for GCC 5 and 6. Best regards, Thomas
Re: [PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
Ping? On Monday, January 18, 2016 11:33:47 AM Thomas Preud'homme wrote: > On Wednesday, January 13, 2016 06:39:20 PM Bernd Schmidt wrote: > > On 01/12/2016 08:55 AM, Thomas Preud'homme wrote: > > > On Monday, January 11, 2016 04:57:18 PM Bernd Schmidt wrote: > > >> On 01/08/2016 10:33 AM, Thomas Preud'homme wrote: > > >>> 2016-01-08 Thomas Preud'homme > > >>> > > >>> * g++.dg/pr67989.C: Remove ARM-specific option. > > >>> * gcc.target/arm/pr67989.C: New file. > > >> > > >> I checked some other arm tests and they have things like > > >> > > >> /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { > > >> "-march=*" } { "-march=armv4t" } } */ > > >> /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { > > >> "-mthumb" } { "" } } */ > > >> > > >> Do you need the same in your testcase? > > > > > > That was the first approach I took but Kyrill suggested me to use > > > arm_arch_v4t and arm_arch_v4t_ok machinery instead. It should take care > > > about whether the architecture can be selected. > > > > Hmm, the ones I looked at did use dg-add-options, but not the > > corresponding _ok requirement. So I think this is OK. > > Just to make sure: ok as in OK to commit as is? > > Best regards, > > Thomas
Re: [PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets
On Thursday, January 21, 2016 09:21:52 AM Richard Biener wrote: > On Thu, 21 Jan 2016, Thomas Preud'homme wrote: > > On Friday, January 08, 2016 10:05:25 AM Richard Biener wrote: > > > On Tue, 5 Jan 2016, Thomas Preud'homme wrote: > > > > Hi, > > > > > > > > bswap optimization pass generate wrong code on big endian targets when > > > > the > > > > result of a bit operation it analyzed is a partial load of the range > > > > of > > > > memory accessed by the original expression (when one or more bytes at > > > > lowest address were lost in the computation). This is due to the way > > > > cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being > > > > compared to the result of the symbolic expression. Part of the > > > > adjustment > > > > is endian independent: it's to ignore the bytes that were not accessed > > > > by > > > > the original gimple expression. However, when the result has less byte > > > > than that original expression, some more byte need to be ignored and > > > > this > > > > is endian dependent. > > > > > > > > The current code only support loss of bytes at the highest addresses > > > > because there is no code to adjust the address of the load. However, > > > > for > > > > little and big endian targets the bytes at highest address translate > > > > into > > > > different byte significance in the result. This patch first separate > > > > cmpxchg and cmpnop adjustement into 2 steps and then deal with > > > > endianness > > > > correctly for the second step. > > > > > > > > ChangeLog entries are as follow: > > > > > > > > > > > > *** gcc/ChangeLog *** > > > > > > > > 2015-12-16 Thomas Preud'homme > > > > > > > > PR tree-optimization/67781 > > > > * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in > > > > cmpxchg > > > > and cmpnop in two steps: first the ones not accessed in > > > > original > > > > gimple expression in a endian independent way and then the > > > > ones > > > > not > > > > accessed in the final result in an endian-specific way. > > > > > > > > *** gcc/testsuite/ChangeLog *** > > > > > > > > 2015-12-16 Thomas Preud'homme > > > > > > > > PR tree-optimization/67781 > > > > * gcc.c-torture/execute/pr67781.c: New file. 
> > > > > > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > > new file mode 100644 > > > > index 000..bf50aa2 > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > > > @@ -0,0 +1,34 @@ > > > > +#ifdef __UINT32_TYPE__ > > > > +typedef __UINT32_TYPE__ uint32_t; > > > > +#else > > > > +typedef unsigned uint32_t; > > > > +#endif > > > > + > > > > +#ifdef __UINT8_TYPE__ > > > > +typedef __UINT8_TYPE__ uint8_t; > > > > +#else > > > > +typedef unsigned char uint8_t; > > > > +#endif > > > > + > > > > +struct > > > > +{ > > > > + uint32_t a; > > > > + uint8_t b; > > > > +} s = { 0x123456, 0x78 }; > > > > + > > > > +int pr67781() > > > > +{ > > > > + uint32_t c = (s.a << 8) | s.b; > > > > + return c; > > > > +} > > > > + > > > > +int > > > > +main () > > > > +{ > > > > + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) > > > > +return 0; > > > > + > > > > + if (pr67781 () != 0x12345678) > > > > +__builtin_abort (); > > > > + return 0; > > > > +} > > > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > > > > index b00f046..e5a185f 100644 > > > > --- a/gcc/tree-ssa-math-opts.c > > > > +++ b/gcc/tree-ssa-math-opts.c > > > > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct > > > > symbolic_number *n, int limit) > > > > > > > > static gimple * > &
Re: [PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets
On Friday, January 08, 2016 10:05:25 AM Richard Biener wrote: > On Tue, 5 Jan 2016, Thomas Preud'homme wrote: > > Hi, > > > > bswap optimization pass generate wrong code on big endian targets when the > > result of a bit operation it analyzed is a partial load of the range of > > memory accessed by the original expression (when one or more bytes at > > lowest address were lost in the computation). This is due to the way > > cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being > > compared to the result of the symbolic expression. Part of the adjustment > > is endian independent: it's to ignore the bytes that were not accessed by > > the original gimple expression. However, when the result has less byte > > than that original expression, some more byte need to be ignored and this > > is endian dependent. > > > > The current code only support loss of bytes at the highest addresses > > because there is no code to adjust the address of the load. However, for > > little and big endian targets the bytes at highest address translate into > > different byte significance in the result. This patch first separate > > cmpxchg and cmpnop adjustement into 2 steps and then deal with endianness > > correctly for the second step. > > > > ChangeLog entries are as follow: > > > > > > *** gcc/ChangeLog *** > > > > 2015-12-16 Thomas Preud'homme > > > > PR tree-optimization/67781 > > * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in > > cmpxchg > > and cmpnop in two steps: first the ones not accessed in original > > gimple expression in a endian independent way and then the ones > > not > > accessed in the final result in an endian-specific way. > > > > *** gcc/testsuite/ChangeLog *** > > > > 2015-12-16 Thomas Preud'homme > > > > PR tree-optimization/67781 > > * gcc.c-torture/execute/pr67781.c: New file. > > > > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > new file mode 100644 > > index 000..bf50aa2 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c > > @@ -0,0 +1,34 @@ > > +#ifdef __UINT32_TYPE__ > > +typedef __UINT32_TYPE__ uint32_t; > > +#else > > +typedef unsigned uint32_t; > > +#endif > > + > > +#ifdef __UINT8_TYPE__ > > +typedef __UINT8_TYPE__ uint8_t; > > +#else > > +typedef unsigned char uint8_t; > > +#endif > > + > > +struct > > +{ > > + uint32_t a; > > + uint8_t b; > > +} s = { 0x123456, 0x78 }; > > + > > +int pr67781() > > +{ > > + uint32_t c = (s.a << 8) | s.b; > > + return c; > > +} > > + > > +int > > +main () > > +{ > > + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) > > +return 0; > > + > > + if (pr67781 () != 0x12345678) > > +__builtin_abort (); > > + return 0; > > +} > > diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c > > index b00f046..e5a185f 100644 > > --- a/gcc/tree-ssa-math-opts.c > > +++ b/gcc/tree-ssa-math-opts.c > > @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct > > symbolic_number *n, int limit) > > > > static gimple * > > find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) > > { > > > > + unsigned rsize; > > + uint64_t tmpn, mask; > > > > /* The number which the find_bswap_or_nop_1 result should match in order > > > > to have a full byte swap. The number is shifted to the right > > according to the size of the symbolic number before using it. */ > > > > @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct > > symbolic_number *n, bool *bswap) > > > >/* Find real size of result (highest non-zero byte). 
*/ > >if (n->base_addr) > > > > -{ > > - int rsize; > > - uint64_t tmpn; > > - > > - for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > rsize++); - n->range = rsize; > > -} > > +for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, > > rsize++); > > + else > > +rsize = n->range; > > > > - /* Zero out the extra bits of N and CMP*. */ > > + /* Zero out the bits corresponding to untouched bytes in original > > gimple > > + expression. */
RE: [PATCH, testsuite] Fix PR68632: gcc.target/arm/lto/pr65837 failure on M profile ARM targets
It's indeed fixed now. Thanks. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Christian Bruel > Sent: Wednesday, December 09, 2015 6:13 PM > To: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH, testsuite] Fix PR68632: gcc.target/arm/lto/pr65837 > failure on M profile ARM targets > > Hi Thomas, > > > On 12/09/2015 10:57 AM, Thomas Preud'homme wrote: > > gcc.target/arm/lto/pr65837 fails on M profile ARM targets because of > lack of neon instructions. This patch adds the necessary arm_neon_ok > effective target requirement to avoid running this test for such targets. > > > > This case also fails for all configs that don't have neon by default. > This is being fixed with > > https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00865.html > > > > ChangeLog entry is as follows: > > > > > > * gcc/testsuite/ChangeLog *** > > > > 2015-12-08 Thomas Preud'homme > > > > PR testsuite/68632 > > * gcc.target/arm/lto/pr65837_0.c: Require arm_neon_ok effective > > target. > > > > > > diff --git a/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c > b/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c > > index 000fc2a..fcc26a1 100644 > > --- a/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c > > +++ b/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c > > @@ -1,4 +1,5 @@ > > /* { dg-lto-do run } */ > > +/* { dg-require-effective-target arm_neon_ok } */ > > /* { dg-lto-options {{-flto -mfpu=neon}} } */ > > /* { dg-suppress-ld-options {-mfpu=neon} } */ > > > > > > > > Testcase fails without the patch and succeeds with. > > > > Is this ok for trunk? > > > > Best regards, > > > > Thomas > > > >
Re: [PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
On Wednesday, January 13, 2016 06:39:20 PM Bernd Schmidt wrote: > On 01/12/2016 08:55 AM, Thomas Preud'homme wrote: > > On Monday, January 11, 2016 04:57:18 PM Bernd Schmidt wrote: > >> On 01/08/2016 10:33 AM, Thomas Preud'homme wrote: > >>> 2016-01-08 Thomas Preud'homme > >>> > >>> * g++.dg/pr67989.C: Remove ARM-specific option. > >>> * gcc.target/arm/pr67989.C: New file. > >> > >> I checked some other arm tests and they have things like > >> > >> /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { > >> "-march=*" } { "-march=armv4t" } } */ > >> /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { > >> "-mthumb" } { "" } } */ > >> > >> Do you need the same in your testcase? > > > > That was the first approach I took but Kyrill suggested me to use > > arm_arch_v4t and arm_arch_v4t_ok machinery instead. It should take care > > about whether the architecture can be selected. > > Hmm, the ones I looked at did use dg-add-options, but not the > corresponding _ok requirement. So I think this is OK. Just to make sure: ok as in OK to commit as is? Best regards, Thomas
RE: [PATCH, libgcc/ARM 1/6] Fix Thumb-1 only == ARMv6-M & Thumb-2 only == ARMv7-M assumptions
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 1:58 PM > > Hi, > > We decided to apply the following patch to the ARM embedded 5 branch. > This is *not* intended for trunk for now. We will send a separate email > for trunk. And now a rebased patch on top of trunk. > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch fixes some assumptions related to M profile > architectures. Currently GCC (mostly libgcc) contains several assumptions > that the only ARM architecture with Thumb-1 only instructions is ARMv6- > M and the only one with Thumb-2 only instructions is ARMv7-M. ARMv8- > M [1] make this wrong since ARMv8-M baseline is also (mostly) Thumb-1 > only and ARMv8-M mainline is also Thumb-2 only. This patch replace > checks for __ARM_ARCH_*__ for checks against > __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM instead. For > instance, Thumb-1 only can be checked with > #if !defined(__ARM_ARCH_ISA_ARM) && (__ARM_ARCH_ISA_THUMB > == 1). It also fixes the guard for DIV code to not apply to ARMv8-M > Baseline since it uses Thumb-2 instructions. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entries are as follow: > > *** gcc/ChangeLog *** 2015-11-13 Thomas Preud'homme * config/arm/elf.h: Use __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM to decide whether to prevent some libgcc routines being included for some multilibs rather than __ARM_ARCH_6M__ and add comment to indicate the link between this condition and the one in libgcc/config/arm/lib1func.S. * config/arm/arm.h (TARGET_ARM_V6M): Add check to TARGET_ARM_ARCH. (TARGET_ARM_V7M): Likewise. *** gcc/testsuite/ChangeLog *** 2015-11-10 Thomas Preud'homme * lib/target-supports.exp (check_effective_target_arm_cortex_m): Use __ARM_ARCH_ISA_ARM to test for Cortex-M devices. *** libgcc/ChangeLog *** 2015-12-17 Thomas Preud'homme * config/arm/bpabi-v6m.S: Fix header comment to mention Thumb-1 rather than ARMv6-M. * config/arm/lib1funcs.S (__prefer_thumb__): Define among other cases for all Thumb-1 only targets. (__only_thumb1__): Define for all Thumb-1 only targets. (THUMB_LDIV0): Test for __only_thumb1__ rather than __ARM_ARCH_6M__. (EQUIV): Likewise. (ARM_FUNC_ALIAS): Likewise. (umodsi3): Add check to __only_thumb1__ to guard the idiv version. (modsi3): Likewise. (HAVE_ARM_CLZ): Remove block defining it. (clzsi2): Test for __only_thumb1__ rather than __ARM_ARCH_6M__ and check __ARM_FEATURE_CLZ instead of HAVE_ARM_CLZ. (clzdi2): Likewise. (ctzsi2): Likewise. (L_interwork_call_via_rX): Test for __ARM_ARCH_ISA_ARM rather than __ARM_ARCH_6M__ in guard for checking whether it is defined. (final includes): Test for __only_thumb1__ rather than __ARM_ARCH_6M__ and add comment to indicate the connection between this condition and the one in gcc/config/arm/elf.h. * config/arm/libunwind.S: Test for __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM rather than __ARM_ARCH_6M__. * config/arm/t-softfp: Likewise. 
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index fd999dd..0d23f39 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2182,8 +2182,10 @@ extern int making_const_table; #define TARGET_ARM_ARCH\ (arm_base_arch) \ -#define TARGET_ARM_V6M (!arm_arch_notm && !arm_arch_thumb2) -#define TARGET_ARM_V7M (!arm_arch_notm && arm_arch_thumb2) +#define TARGET_ARM_V6M (TARGET_ARM_ARCH == BASE_ARCH_6M && !arm_arch_notm \ + && !arm_arch_thumb2) +#define TARGET_ARM_V7M (TARGET_ARM_ARCH == BASE_ARCH_7M && !arm_arch_notm \ + && arm_arch_thumb2) /* The highest Thumb instruction set version supported by the chip. */ #define TARGET_ARM_ARCH_ISA_THUMB \ diff --git a/gcc/config/arm/elf.h b/gcc/config/arm/elf.h index 3795728..579a580 100644 --- a/gcc/config/arm/elf.h +++ b/gcc/config/arm/elf.h @@ -148,8 +148,9 @@ while (0) /* Horrible hack: We want to prevent some libgcc routines being included - for some multilibs. */ -#ifndef __ARM_ARCH_6M__ + for some multilibs. The condition should match the one in + libgcc/config/arm/lib1funcs.S. */ +#if __ARM_ARCH_ISA_ARM || __ARM_ARCH_ISA_THUMB != 1 #undef L_fixdfsi #undef L_fixunsdfsi #undef L_truncdfsf2 diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 4e349e9..3f96826 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-su
[PATCH, testsuite] Stabilize test result output of dump-noaddr
Hi, Everytime the static pass number of passes change, testsuite output for dump- noaddr will change, leading to a series of noise lines like the following under dg-cmp-results: PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -O1 comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -O2 comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -O2 -flto -fno-use- linker-plugin -flto-partition=none comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -O2 -flto -fuse- linker-plugin -fno-fat-lto-objects comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -O3 -fomit-frame- pointer -funroll-loops -fpeel-loops -ftracer -finline-functions comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -O3 -g comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -Og -g comparison PASS->NA: gcc.c-torture/unsorted/dump-noaddr.c.036t.fre1, -Os comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -O1 comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -O2 comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -O2 -flto -fno-use- linker-plugin -flto-partition=none comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -O2 -flto -fuse- linker-plugin -fno-fat-lto-objects comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -O3 -fomit-frame- pointer -funroll-loops -fpeel-loops -ftracer -finline-functions comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -O3 -g comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -Og -g comparison NA->PASS: gcc.c-torture/unsorted/dump-noaddr.c.034t.fre1, -Os comparison This patch solve this problem by replacing the static pass number in the output by a star, allowing for a stable output while retaining easy copy/ pasting in shell. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2015-12-30 Thomas Preud'homme * gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Replace static pass number in output by a star. diff --git a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x b/gcc/ testsuite/gcc.c-torture/unsorted/dump-noaddr.x index a8174e0..001dd6b 100644 --- a/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x +++ b/gcc/testsuite/gcc.c-torture/unsorted/dump-noaddr.x @@ -18,6 +18,7 @@ proc dump_compare { src options } { foreach dump1 [lsort [glob -nocomplain dump1/*]] { regsub dump1/ $dump1 dump2/ dump2 set dumptail "gcc.c-torture/unsorted/[file tail $dump1]" + regsub {\.\d+((t|r|i)\.[^.]+)$} $dumptail {.*\1} dumptail #puts "$option $dump1" set tmp [ diff "$dump1" "$dump2" ] if { $tmp == 0 } { Is this ok for stage3? Best regards, Thomas
Re: [PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
On Monday, January 11, 2016 04:57:18 PM Bernd Schmidt wrote: > On 01/08/2016 10:33 AM, Thomas Preud'homme wrote: > > 2016-01-08 Thomas Preud'homme > > > > * g++.dg/pr67989.C: Remove ARM-specific option. > > * gcc.target/arm/pr67989.C: New file. > > I checked some other arm tests and they have things like > > /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { > "-march=*" } { "-march=armv4t" } } */ > /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { > "-mthumb" } { "" } } */ > > Do you need the same in your testcase? That was the first approach I took but Kyrill suggested me to use arm_arch_v4t and arm_arch_v4t_ok machinery instead. It should take care about whether the architecture can be selected. Best regards, Thomas
Re: [PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
On Thursday, January 07, 2016 10:26:28 AM Richard Earnshaw wrote: > On 07/01/16 09:15, Kyrill Tkachov wrote: > > In this case perhaps we should go the route of just removing the > > target-specific option > > altogether. > > > > Richard, that's the approach you recommended, right? > > Yes. > > I think if you really need to test a specific set of target flags, then > it might be acceptable to have a duplicate of the test in dg.target/arm > (but please put a comment in the (arm version of the) test to explain > why it has been duplicated. What about the following: *** gcc/testsuite/ChangeLog *** 2016-01-08 Thomas Preud'homme * g++.dg/pr67989.C: Remove ARM-specific option. * gcc.target/arm/pr67989.C: New file. diff --git a/gcc/testsuite/g++.dg/pr67989.C b/gcc/testsuite/g++.dg/pr67989.C index 90261c450b4b9429fb989f7df62f3743017c7363..c3023557d31a21aead717fd58483c82e3e74da95 100644 --- a/gcc/testsuite/g++.dg/pr67989.C +++ b/gcc/testsuite/g++.dg/pr67989.C @@ -1,6 +1,5 @@ /* { dg-do compile } */ /* { dg-options "-std=c++11 -O2" } */ -/* { dg-additional-options "-marm -march=armv4t" { target arm*-*-* } } */ __extension__ typedef unsigned long long int uint64_t; namespace std __attribute__ ((__visibility__ ("default"))) diff --git a/gcc/testsuite/gcc.target/arm/pr67989.C b/gcc/testsuite/ gcc.target/arm/pr67989.C new file mode 100644 index ..0006924e24f698711e1e501d09b5098049522ad6 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr67989.C @@ -0,0 +1,82 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c++11 -O2" } */ +/* { dg-require-effective-target arm_arch_v4t_ok } */ +/* { dg-add-options arm_arch_v4t } */ +/* { dg-additional-options "-marm" } */ + +/* Duplicate version of the test in g++.dg to be able to run this test only if + ARMv4t in ARM execution state can be targetted. Newer architecture don't + expose the bug this testcase was written for. */ + + +__extension__ typedef unsigned long long int uint64_t; +namespace std __attribute__ ((__visibility__ ("default"))) +{ + typedef enum memory_order + { +memory_order_seq_cst + } memory_order; +} + +namespace std __attribute__ ((__visibility__ ("default"))) +{ + template < typename _Tp > struct atomic + { +static constexpr int _S_min_alignment + = (sizeof (_Tp) & (sizeof (_Tp) - 1)) || sizeof (_Tp) > 16 + ? 0 : sizeof (_Tp); +static constexpr int _S_alignment + = _S_min_alignment > alignof (_Tp) ? _S_min_alignment : alignof (_Tp); + alignas (_S_alignment) _Tp _M_i; +operator _Tp () const noexcept +{ + return load (); +} +_Tp load (memory_order __m = memory_order_seq_cst) const noexcept +{ + _Tp tmp; +__atomic_load (&_M_i, &tmp, __m); +} + }; +} + +namespace lldb_private +{ + namespace imp + { + } + class Address; +} +namespace lldb +{ + typedef uint64_t addr_t; + class SBSection + { + }; + class SBAddress + { +void SetAddress (lldb::SBSection section, lldb::addr_t offset); + lldb_private::Address & ref (); + }; +} +namespace lldb_private +{ + class Address + { + public: +const Address & SetOffset (lldb::addr_t offset) +{ + bool changed = m_offset != offset; +} +std::atomic < lldb::addr_t > m_offset; + }; +} + +using namespace lldb; +using namespace lldb_private; +void +SBAddress::SetAddress (lldb::SBSection section, lldb::addr_t offset) +{ + Address & addr = ref (); + addr.SetOffset (offset); +} Is this ok for stage3? Best regards, Thomas
RE: [PATCH, ARM, ping1] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0
Ping? > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Wednesday, December 16, 2015 5:11 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution > failure on cortex-m0 > > During reorg pass, thumb1_reorg () is tasked with rewriting mov rd, rn to > subs rd, rn, 0 to avoid a comparison against 0 instruction before doing a > conditional branch based on it. The actual avoiding of cmp is done in > cbranchsi4_insn instruction C output template. When the condition is > met, the source register (rn) is also propagated into the comparison in > place the destination register (rd). > > However, right now thumb1_reorg () only look for a mov followed by a > cbranchsi but does not check whether the comparison in cbranchsi is > against the constant 0. This is not safe because a non clobbering > instruction could exist between the mov and the comparison that > modifies the source register. This is what happens here with a post > increment of the source register after the mov, which skip the &a[i] == > &a[1] comparison for iteration i == 1. > > This patch fixes the issue by checking that the comparison is against > constant 0. > > ChangeLog entry is as follow: > > > *** gcc/ChangeLog *** > > 2015-12-07 Thomas Preud'homme > > * config/arm/arm.c (thumb1_reorg): Check that the comparison is > against the constant 0. > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 42bf272..49c0a06 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -17195,7 +17195,7 @@ thumb1_reorg (void) >FOR_EACH_BB_FN (bb, cfun) > { >rtx dest, src; > - rtx pat, op0, set = NULL; > + rtx cmp, op0, op1, set = NULL; >rtx_insn *prev, *insn = BB_END (bb); >bool insn_clobbered = false; > > @@ -17208,8 +17208,13 @@ thumb1_reorg (void) > continue; > >/* Get the register with which we are comparing. */ > - pat = PATTERN (insn); > - op0 = XEXP (XEXP (SET_SRC (pat), 0), 0); > + cmp = XEXP (SET_SRC (PATTERN (insn)), 0); > + op0 = XEXP (cmp, 0); > + op1 = XEXP (cmp, 1); > + > + /* Check that comparison is against ZERO. */ > + if (!CONST_INT_P (op1) || INTVAL (op1) != 0) > + continue; > >/* Find the first flag setting insn before INSN in basic block BB. */ >gcc_assert (insn != BB_HEAD (bb)); > @@ -17249,7 +17254,7 @@ thumb1_reorg (void) > PATTERN (prev) = gen_rtx_SET (dest, src); > INSN_CODE (prev) = -1; > /* Set test register in INSN to dest. */ > - XEXP (XEXP (SET_SRC (pat), 0), 0) = copy_rtx (dest); > + XEXP (cmp, 0) = copy_rtx (dest); > INSN_CODE (insn) = -1; > } > } > > > Testsuite shows no regression when run for arm-none-eabi with - > mcpu=cortex-m0 -mthumb > > Is this ok for trunk? > > Best regards, > > Thomas
Re: [PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
On Tuesday, January 05, 2016 10:47:38 AM Kyrill Tkachov wrote: > Hi Thomas, Hi Kyrill, > > > > diff --git a/gcc/testsuite/g++.dg/pr67989.C > > b/gcc/testsuite/g++.dg/pr67989.C index > > 90261c450b4b9429fb989f7df62f3743017c7363..61be8e172a96df5bb76f7ecd8543dadf > > 825e7dc7 100644 > > --- a/gcc/testsuite/g++.dg/pr67989.C > > +++ b/gcc/testsuite/g++.dg/pr67989.C > > @@ -1,5 +1,6 @@ > > > > /* { dg-do compile } */ > > /* { dg-options "-std=c++11 -O2" } */ > > > > +/* { dg-skip-if "do not override -mcpu" { arm*-*-* } { "-march=*" > > "-mcpu=*" } { "-march=armv4t" } } */ > > > > /* { dg-additional-options "-marm -march=armv4t" { target arm*-*-* } } > > */ > > How about we try to do it using the add_options_for_arm_arch_v4t machinery > and the arm_arch_v4t_ok check? I don't quite understand. dg-add-options doesn't take a selector according to GCC internals documentation and dg-additional-options doesn't take feature. If I use dg-add-options with a require-effective-target that will limit this test to ARM. Did I misunderstand your point? Best regards, Thomas
[PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu
Hi, g++.dg/pr67989.C passes -march=armv4t to gcc when compiling which fails if RUNTESTFLAGS passes -mcpu or -march with a different value. This patch adds a dg-skip-if directive to skip the test when such a thing happens. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2015-12-31 Thomas Preud'homme * g++.dg/pr67989.C: Skip test if already running it with -mcpu or -march with different value. diff --git a/gcc/testsuite/g++.dg/pr67989.C b/gcc/testsuite/g++.dg/pr67989.C index 90261c450b4b9429fb989f7df62f3743017c7363..61be8e172a96df5bb76f7ecd8543dadf825e7dc7 100644 --- a/gcc/testsuite/g++.dg/pr67989.C +++ b/gcc/testsuite/g++.dg/pr67989.C @@ -1,5 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-std=c++11 -O2" } */ +/* { dg-skip-if "do not override -mcpu" { arm*-*-* } { "-march=*" "-mcpu=*" } { "-march=armv4t" } } */ /* { dg-additional-options "-marm -march=armv4t" { target arm*-*-* } } */ __extension__ typedef unsigned long long int uint64_t; Is this ok for stage3? Best regards, Thomas
Re: [PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets
On Tuesday, January 05, 2016 01:53:37 PM you wrote: > > Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC > and on an arm-none-eabi GCC cross-compiler without any regression. I'm > waiting for a slot on gcc110 to do a big endian bootstrap but at least the > testcase works on mips-linux. I'll send an update once bootstrap is > complete. Bootstrap went fine on gcc110 with the following language enabled: c,c++,objc,obj-c++,java,fortran,ada,go,lto. Best regards, Thomas
[PATCH, GCC] Fix PR67781: wrong code generation for partial load on big endian targets
Hi, bswap optimization pass generate wrong code on big endian targets when the result of a bit operation it analyzed is a partial load of the range of memory accessed by the original expression (when one or more bytes at lowest address were lost in the computation). This is due to the way cmpxchg and cmpnop are adjusted in find_bswap_or_nop before being compared to the result of the symbolic expression. Part of the adjustment is endian independent: it's to ignore the bytes that were not accessed by the original gimple expression. However, when the result has less byte than that original expression, some more byte need to be ignored and this is endian dependent. The current code only support loss of bytes at the highest addresses because there is no code to adjust the address of the load. However, for little and big endian targets the bytes at highest address translate into different byte significance in the result. This patch first separate cmpxchg and cmpnop adjustement into 2 steps and then deal with endianness correctly for the second step. ChangeLog entries are as follow: *** gcc/ChangeLog *** 2015-12-16 Thomas Preud'homme PR tree-optimization/67781 * tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg and cmpnop in two steps: first the ones not accessed in original gimple expression in a endian independent way and then the ones not accessed in the final result in an endian-specific way. *** gcc/testsuite/ChangeLog *** 2015-12-16 Thomas Preud'homme PR tree-optimization/67781 * gcc.c-torture/execute/pr67781.c: New file. diff --git a/gcc/testsuite/gcc.c-torture/execute/pr67781.c b/gcc/testsuite/gcc.c-torture/execute/pr67781.c new file mode 100644 index 000..bf50aa2 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr67781.c @@ -0,0 +1,34 @@ +#ifdef __UINT32_TYPE__ +typedef __UINT32_TYPE__ uint32_t; +#else +typedef unsigned uint32_t; +#endif + +#ifdef __UINT8_TYPE__ +typedef __UINT8_TYPE__ uint8_t; +#else +typedef unsigned char uint8_t; +#endif + +struct +{ + uint32_t a; + uint8_t b; +} s = { 0x123456, 0x78 }; + +int pr67781() +{ + uint32_t c = (s.a << 8) | s.b; + return c; +} + +int +main () +{ + if (sizeof (uint32_t) * __CHAR_BIT__ != 32) +return 0; + + if (pr67781 () != 0x12345678) +__builtin_abort (); + return 0; +} diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index b00f046..e5a185f 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2441,6 +2441,8 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number *n, int limit) static gimple * find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) { + unsigned rsize; + uint64_t tmpn, mask; /* The number which the find_bswap_or_nop_1 result should match in order to have a full byte swap. The number is shifted to the right according to the size of the symbolic number before using it. */ @@ -2464,24 +2466,38 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number *n, bool *bswap) /* Find real size of result (highest non-zero byte). */ if (n->base_addr) -{ - int rsize; - uint64_t tmpn; - - for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++); - n->range = rsize; -} +for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++); + else +rsize = n->range; - /* Zero out the extra bits of N and CMP*. */ + /* Zero out the bits corresponding to untouched bytes in original gimple + expression. 
*/ if (n->range < (int) sizeof (int64_t)) { - uint64_t mask; - mask = ((uint64_t) 1 << (n->range * BITS_PER_MARKER)) - 1; cmpxchg >>= (64 / BITS_PER_MARKER - n->range) * BITS_PER_MARKER; cmpnop &= mask; } + /* Zero out the bits corresponding to unused bytes in the result of the + gimple expression. */ + if (rsize < n->range) +{ + if (BYTES_BIG_ENDIAN) + { + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; + cmpxchg &= mask; + cmpnop >>= (n->range - rsize) * BITS_PER_MARKER; + } + else + { + mask = ((uint64_t) 1 << (rsize * BITS_PER_MARKER)) - 1; + cmpxchg >>= (n->range - rsize) * BITS_PER_MARKER; + cmpnop &= mask; + } + n->range = rsize; +} + /* A complete byte swap should make the symbolic number to start with the largest digit in the highest order byte. Unchanged symbolic number indicates a read with same endianness as target architecture. */ Regression testsuite was run on a bootstrapped native x86_64-linux-gnu GCC and on an arm-none-eabi GCC cross-compiler without any regression. I'm waiting for a slot on gcc110 to do a big endian bootstrap but at least the testcase works on mips-linux. I
RE: [PATCH, ARM, 1/3] Document --with-multilib-list for arm*-*-* targets
> From: Gerald Pfeifer [mailto:ger...@pfeifer.com] > Sent: Sunday, January 03, 2016 6:49 AM > > On Wed, 16 Dec 2015, Thomas Preud'homme wrote: > > Currently, the documentation for --with-multilib-list in > > gcc/doc/install.texi only mentions sh*-*-* and x86-64-*-linux* targets. > > However, arm*-*-* targets also support this option. This patch adds > > documention for the meaning of this option for arm*-*-* targets. > > > > 2015-12-09 Thomas Preud'homme > > > > * doc/install.texi (--with-multilib-list): Describe the meaning of > > the > > option for arm*-*-* targets. > > Ok (since I don't think I saw a response from the ARM maintainers). > > (The list of options for -mfpu= is a bit inconsistent with the other > cases, in that it has -mfpu= only in the first case, but I guess this > is fine if you want to keep it that way.) Oh indeed right. How could I miss that? I fixed it and committed the following: 2016-01-04 Thomas Preud'homme * doc/install.texi (--with-multilib-list): Describe the meaning of the option for arm*-*-* targets. diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 81cadb5ed59c3ce8de2b8f8a474fcb3d9de10f32..f3052c07b06bca2e1fc77f0133b723563467a94d 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1102,9 +1102,19 @@ sysv, aix. @item --with-multilib-list=@var{list} @itemx --without-multilib-list Specify what multilibs to build. -Currently only implemented for sh*-*-* and x86-64-*-linux*. +Currently only implemented for arm*-*-*, sh*-*-* and x86-64-*-linux*. @table @code +@item arm*-*-* +@var{list} is either @code{default} or @code{aprofile}. Specifying +@code{default} is equivalent to omitting this option while specifying +@code{aprofile} builds multilibs for each combination of ISA (@code{-marm} or +@code{-mthumb}), architecture (@code{-march=armv7-a}, @code{-march=armv7ve}, +or @code{-march=armv8-a}), FPU available (none, @code{-mfpu=vfpv3-d16}, +@code{-mfpu=neon}, @code{-mfpu=vfpv4-d16}, @code{-mfpu=neon-vfpv4} or +@code{-mfpu=neon-fp-armv8} depending on architecture) and floating-point ABI +(@code{-mfloat-abi=softfp} or @code{-mfloat-abi=hard}). + @item sh*-*-* @var{list} is a comma separated list of CPU names. These must be of the form @code{sh*} or @code{m*} (in which case they match the compiler option Best regards, Thomas
[RFC][PATCH, ARM 8/8] Added support for ARMV8-M Security Extension cmse_nonsecure_caller intrinsic
[Sending on behalf of Andre Vieira] Hello, This patch adds support ARMv8-M's Security Extension's cmse_nonsecure_caller intrinsic. This intrinsic is used to check whether an entry function was called from a non-secure state. See Section 5.4.3 of ARM®v8-M Security Extensions: Requirements on Development Tools (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html) for further details. *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm-builtins.c (arm_builtins): Define ARM_BUILTIN_CMSE_NONSECURE_CALLER. (bdesc_2arg): Add line for cmse_nonsecure_caller. (arm_init_builtins): Init for cmse_nonsecure_caller. (arm_expand_builtin): Handle cmse_nonsecure_caller. * gcc/config/arm/arm_cmse.h (cmse_nonsecure_caller): New. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-1.c: Added test for cmse_nonsecure_caller. diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 11cd17d0b8f3c29ccbe16cb463a17d55ba0fa1e3..7934cf1d4d96c40255d3e93dc9902b4568014984 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -515,6 +515,8 @@ enum arm_builtins ARM_BUILTIN_GET_FPSCR, ARM_BUILTIN_SET_FPSCR, + ARM_BUILTIN_CMSE_NONSECURE_CALLER, + #undef CRYPTO1 #undef CRYPTO2 #undef CRYPTO3 @@ -1263,6 +1265,10 @@ static const struct builtin_description bdesc_2arg[] = FP_BUILTIN (set_fpscr, SET_FPSCR) #undef FP_BUILTIN + {ARM_FSET_MAKE_CPU2 (FL2_CMSE), CODE_FOR_andsi3, + "__builtin_arm_cmse_nonsecure_caller", ARM_BUILTIN_CMSE_NONSECURE_CALLER, + UNKNOWN, 0}, + #define CRC32_BUILTIN(L, U) \ {ARM_FSET_EMPTY, CODE_FOR_##L, "__builtin_arm_"#L, \ ARM_BUILTIN_##U, UNKNOWN, 0}, @@ -1797,6 +1803,17 @@ arm_init_builtins (void) = add_builtin_function ("__builtin_arm_stfscr", ftype_set_fpscr, ARM_BUILTIN_SET_FPSCR, BUILT_IN_MD, NULL, NULL_TREE); } + + if (arm_arch_cmse) +{ + tree ftype_cmse_nonsecure_caller + = build_function_type_list (unsigned_type_node, NULL); + arm_builtin_decls[ARM_BUILTIN_CMSE_NONSECURE_CALLER] + = add_builtin_function ("__builtin_arm_cmse_nonsecure_caller", + ftype_cmse_nonsecure_caller, + ARM_BUILTIN_CMSE_NONSECURE_CALLER, BUILT_IN_MD, + NULL, NULL_TREE); +} } /* Return the ARM builtin for CODE. 
*/ @@ -2356,6 +2373,14 @@ arm_expand_builtin (tree exp, emit_insn (pat); return target; +case ARM_BUILTIN_CMSE_NONSECURE_CALLER: + icode = CODE_FOR_andsi3; + target = gen_reg_rtx (SImode); + op0 = arm_return_addr (0, NULL_RTX); + pat = GEN_FCN (icode) (target, op0, const1_rtx); + emit_insn (pat); + return target; + case ARM_BUILTIN_TEXTRMSB: case ARM_BUILTIN_TEXTRMUB: case ARM_BUILTIN_TEXTRMSH: diff --git a/gcc/config/arm/arm_cmse.h b/gcc/config/arm/arm_cmse.h index ab20a3ec46025f268a1e9bed895d27da9af7aab6..0bdff668d03d54e1acf2bdd3b5ff1bfb2b463bd8 100644 --- a/gcc/config/arm/arm_cmse.h +++ b/gcc/config/arm/arm_cmse.h @@ -163,6 +163,13 @@ __attribute__ ((__always_inline__)) cmse_TTAT (void *p) CMSE_TT_ASM (at) +//TODO: diagnose use outside cmse_nonsecure_entry functions +__extension__ static __inline int __attribute__ ((__always_inline__)) +cmse_nonsecure_caller (void) +{ + return __builtin_arm_cmse_nonsecure_caller (); +} + #define CMSE_AU_NONSECURE 2 #define CMSE_MPU_NONSECURE 16 #define CMSE_NONSECURE 18 diff --git a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c index 1c3d4e9e934f4b1166d4d98383cf4ae8c3515117..ccecf396d3cda76536537b4d146bbb5f70589fd5 100644 --- a/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c +++ b/gcc/testsuite/gcc.target/arm/cmse/cmse-1.c @@ -66,3 +66,32 @@ int foo (char * p) /* { dg-final { scan-assembler-times "ttat " 2 } } */ /* { dg-final { scan-assembler-times "bl.cmse_check_address_range" 7 } } */ /* { dg-final { scan-assembler-not "cmse_check_pointed_object" } } */ + +typedef int (*int_ret_funcptr_t) (void); +typedef int __attribute__ ((cmse_nonsecure_call)) (*int_ret_nsfuncptr_t) (void); + +int __attribute__ ((cmse_nonsecure_entry)) +baz (void) +{ + return cmse_nonsecure_caller (); +} + +int __attribute__ ((cmse_nonsecure_entry)) +qux (int_ret_funcptr_t int_ret_funcptr) +{ + int_ret_nsfuncptr_t int_ret_nsfunc_ptr; + + if (cmse_is_nsfptr (int_ret_funcptr)) +{ + int_ret_nsfunc_ptr = cmse_nsfptr_create (int_ret_funcptr); + return int_ret_nsfunc_ptr (); +} + return 0; +} +/* { dg-final { scan-assembler "baz:" } } */ +/* { dg-final { scan-assembler "__acle_se_baz:" }
[RFC][PATCH, ARM 7/8] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call
[Sending on behalf of Andre Vieira] Hello, This patch extends support for the ARMv8-M Security Extensions 'cmse_nonsecure_call' to use a new library function '__gnu_cmse_nonsecure_call'. This library function is responsible for (without using r0-r3 or d0-d7): 1) saving and clearing all callee-saved registers using the secure stack 2) clearing the LSB of the address passed in r4 and using blxns to 'jump' to it 3) clearing ASPR, including the 'ge bits' if DSP is enabled 4) clearing FPSCR if using non-soft float-abi 5) restoring callee-saved registers. The decisions whether to include DSP 'ge bits' clearing and floating point registers (single/double precision) all depends on the multilib used. See Section 5.5 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). *** gcc/ChangeLog *** *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm.c (detect_cmse_nonsecure_call): New. (cmse_nonsecure_call_clear_caller_saved): New. * gcc/config/arm/arm-protos.h (detect_cmse_nonsecure_call): New. * gcc/config/arm/arm.md (call): Handle cmse_nonsecure_entry. (call_value): Likewise. (nonsecure_call_internal): New. (nonsecure_call_value_internal): New. * gcc/config/arm/thumb1.md (*nonsecure_call_reg_thumb1_v5): New. (*nonsecure_call_value_reg_thumb1_v5): New. * gcc/config/arm/thumb2.md (*nonsecure_call_reg_thumb2): New. (*nonsecure_call_value_reg_thumb2): New. * gcc/config/arm/unspecs.md (UNSPEC_NONSECURE_MEM): New. * libgcc/config/arm/cmse_nonsecure_call.S: New. * libgcc/config/arm/t-arm: Compile cmse_nonsecure_call.S *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-11.c: New. * gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/baseline/cmse-6.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/hard/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/soft/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-13.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-7.c: New. * gcc/testsuite/gcc.target/arm/cmse/mainline/softfp/cmse-8.c: New. 
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 9ee8c333046d9a5bb0487f7b710a5aff42d2..694ee02f534019a5fc9377757f3269dfe6ccfbc0 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -132,6 +132,7 @@ extern int arm_const_double_inline_cost (rtx); extern bool arm_const_double_by_parts (rtx); extern bool arm_const_double_by_immediates (rtx); extern void arm_emit_call_insn (rtx, rtx, bool); +bool detect_cmse_nonsecure_call (tree); extern const char *output_call (rtx *); void arm_emit_movpair (rtx, rtx); extern const char *output_mov_long_double_arm_from_arm (rtx *); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4b4eea88cbec8e04d5b92210f0af2440ce6fb6e4..320f7b447501047a59ceef4f7ded2dadc2088664 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -17403,6 +17403,129 @@ note_invalid_constants (rtx_insn *insn, HOST_WIDE_INT address, int do_pushes) return; } +/* Saves callee saved registers, clears callee saved registers and caller saved + registers not used to pass arguments before a cmse_nonsecure_call. And + restores the callee saved registers after. */ + +static void +cmse_nonsecure_call_clear_caller_saved (void) +{ + basic_block bb; + + FOR_EACH_BB_FN (bb, cfun) +{ + rtx_insn *insn; + + FOR_BB_INSNS (bb, insn) + { + uint64_t to_clear_mask, float_mask; + rtx_insn *seq; + rtx pat, call, unspec, link, reg, cleared_reg, tmp; + unsigned int regno, maxregno; + rtx address; + + if (!NONDEBUG_INSN_P (insn)) + continue; + + if (!CALL_P (insn)) + continue; + + pat = PATTERN (insn); + gcc_assert (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) > 0); + call
[RFC][PATCH, ARM 6/8] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
[Sending on behalf of Andre Vieira] Hello, This patch adds support for the ARMv8-M Security Extensions 'cmse_nonsecure_call' attribute. This attribute may only be used for function types and when used in combination with the '-mcmse' compilation flag. See Section 5.5 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). We currently do not support cmse_nonsecure_call functions that pass arguments or return variables on the stack and we diagnose this. *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm.c (gimplify.h): New include. (arm_handle_cmse_nonsecure_call): New. (arm_attribute_table): Added cmse_nonsecure_call. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-3.c: Add tests. * gcc.target/arm/cmse/cmse-4.c: Add tests. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0700478ca38307f35d0cb01f83ea182802ba28fa..4b4eea88cbec8e04d5b92210f0af2440ce6fb6e4 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -61,6 +61,7 @@ #include "builtins.h" #include "tm-constrs.h" #include "rtl-iter.h" +#include "gimplify.h" /* This file should be included last. */ #include "target-def.h" @@ -136,6 +137,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, int, bool *); static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *); #endif static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *); +static tree arm_handle_cmse_nonsecure_call (tree *, tree, tree, int, bool *); static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT); static void arm_output_function_prologue (FILE *, HOST_WIDE_INT); static int arm_comp_type_attributes (const_tree, const_tree); @@ -347,6 +349,8 @@ static const struct attribute_spec arm_attribute_table[] = /* ARMv8-M Security Extensions support. */ { "cmse_nonsecure_entry", 0, 0, true, false, false, arm_handle_cmse_nonsecure_entry, false }, + { "cmse_nonsecure_call", 0, 0, true, false, false, +arm_handle_cmse_nonsecure_call, false }, { NULL, 0, 0, false, false, false, NULL, false } }; @@ -6667,6 +6671,76 @@ arm_handle_cmse_nonsecure_entry (tree *node, tree name, return NULL_TREE; } + +/* Called upon detection of the use of the cmse_nonsecure_call attribute, this + function will check whether the attribute is allowed here and will add the + attribute to the function type tree or otherwise issue a diagnose. The + reason we check this at declaration time is to only allow the use of the + attribute with declartions of function pointers and not function + declartions. */ + +static tree +arm_handle_cmse_nonsecure_call (tree *node, tree name, +tree /* args */, +int /* flags */, +bool *no_add_attrs) +{ + tree decl = NULL_TREE; + tree type, fntype, main_variant; + + if (!use_cmse) +{ + *no_add_attrs = true; + return NULL_TREE; +} + + if (TREE_CODE (*node) == VAR_DECL || TREE_CODE (*node) == TYPE_DECL) +{ + decl = *node; + type = TREE_TYPE (decl); +} + + if (!decl + || (!(TREE_CODE (type) == POINTER_TYPE + && TREE_CODE (TREE_TYPE (type)) == FUNCTION_TYPE) + && TREE_CODE (type) != FUNCTION_TYPE)) +{ + warning (OPT_Wattributes, "%qE attribute only applies to base type of a " +"function pointer", name); + *no_add_attrs = true; + return NULL_TREE; +} + + /* type is either a function pointer, when the attribute is used on a function + * pointer, or a function type when used in a typedef. 
*/ + if (TREE_CODE (type) == FUNCTION_TYPE) +fntype = type; + else +fntype = TREE_TYPE (type); + + *no_add_attrs |= cmse_func_args_or_return_in_stack (NULL, name, fntype); + + if (*no_add_attrs) +return NULL_TREE; + + /* Prevent tree's being shared among function types with and without + cmse_nonsecure_call attribute. Do however make sure they keep the same + main_variant, this is required for correct DIE output. */ + main_variant = TYPE_MAIN_VARIANT (fntype); + fntype = build_distinct_type_copy (fntype); + TYPE_MAIN_VARIANT (fntype) = main_variant; + if (TREE_CODE (type) == FUNCTION_TYPE) +TREE_TYPE (decl) = fntype; + else +TREE_TYPE (type) = fntype; + + /* Construct a type attribute and add it to the function type. */ + tree attrs = tree_cons (get_identifier ("cmse_nonsecure_call"), NULL_TREE, + TYPE_ATTRIBUTES (fntype)); + TYPE_ATTRIBUTES (fntype) = attrs; + return NULL_TREE; +} + /* Return 0 if the attributes for two t
[RFC][PATCH, ARM 5/8] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
[Sending on behalf of Andre Vieira] Hello, This patch extends support for the ARMv8-M Security Extensions 'cmse_nonsecure_entry' attribute to safeguard against leak of information through unbanked registers. When returning from a nonsecure entry function we clear all caller-saved registers that are not used to pass return values, by writing either the LR, in case of general purpose registers, or the value 0, in case of FP registers. We use the LR to write to APSR and FPSCR too. We currently only support 32 FP registers as in we only clear D0-D7. We currently do not support entry functions that pass arguments or return variables on the stack and we diagnose this. This patch relies on the existing code to make sure callee-saved registers used in cmse_nonsecure_entry functions are saved and restored thus retaining their nonsecure mode value, this should be happening already as it is required by AAPCS. *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm.c (output_return_instruction): Clear registers. (thumb2_expand_return): Likewise. (thumb1_expand_epilogue): Likewise. (arm_expand_epilogue): Likewise. (cmse_nonsecure_entry_clear_before_return): New. * gcc/config/arm/arm.h (TARGET_DSP_ADD): New macro define. * gcc/config/arm/thumb1.md (*epilogue_insns): Change length attribute. * gcc/config/arm/thumb2.md (*thumb2_return): Likewise. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse.exp: Test different multilibs separate. * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers are cleared. * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New. * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New. * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New. * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New. * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index f12e3c93bbe24b10ed8eee6687161826773ef649..b06e0586a3da50f57645bda13629bc4dbd3d53b7 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -230,6 +230,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void); /* Integer SIMD instructions, and extend-accumulate instructions. */ #define TARGET_INT_SIMD \ (TARGET_32BIT && arm_arch6 && (arm_arch_notm || arm_arch7em)) +/* Parallel addition and subtraction instructions. */ +#define TARGET_DSP_ADD \ + (TARGET_ARM_ARCH >= 6 && (arm_arch_notm || arm_arch7em)) /* Should MOVW/MOVT be used in preference to a constant pool. */ #define TARGET_USE_MOVT \ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e530b772e3cc053c16421a2a2861d815d53ebb01..0700478ca38307f35d0cb01f83ea182802ba28fa 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -19755,6 +19755,24 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, default: if (IS_CMSE_ENTRY (func_type)) { + char flags[12] = "APSR_nzcvq"; + /* Check if we have to clear the 'GE bits' which is only used if +parallel add and subtraction instructions are available. */ + if (TARGET_DSP_ADD) + { + /* If so also clear the ge flags. */ + flags[10] = 'g'; + flags[11] = '\0'; + } + snprintf (instr, sizeof (instr), "msr%s\t%s, %%|lr", conditional, + flags); + output_asm_insn (instr, & operand); + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + snprintf (instr, sizeof (instr), "vmsr%s\tfpscr, %%|lr", + conditional); + output_asm_insn (instr, & operand); + } snprintf (instr, sizeof (instr), "bxns%s\t%%|lr", conditional); } /* Use bx if it's available. 
*/ @@ -23999,6 +24017,17 @@ thumb_pop (FILE *f, unsigned long mask) static void thumb1_cmse_nonsecure_entry_return (FILE *f, int reg_containing_return_addr) { + char flags[12] = "APSR_nzcvq"; + /* Check if we have to clear the 'GE bits' which is only used if + parallel add and subtraction instructions are available. */ + if (TARGET_DSP_ADD) +{ + flags[10] = 'g'; + flags[11] = '\0'; +} + asm_fprintf (f, "\tmsr\t%s, %r\n", flags, reg_containing_return_addr); + if (TARGET_HARD_FLOAT && TARGET_VFP) +asm_fprintf (f, "\tvmsr\tfpscr, %r\n", reg_containing_return_addr); asm_fprintf (f, "\tbxns\t%r\n", reg_containing_return_addr); }
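To make the behaviour described in this patch concrete, here is a minimal sketch (not taken from the patch; the function name is invented) of a non-secure entry function. With -mcmse (patch 1/8), the epilogue generated for it is expected to clear the caller-saved registers that do not hold the return value, write LR to APSR (and to FPSCR on hard-float targets) and return with bxns:

/* Hypothetical example; compile with -mcmse for an ARMv8-M target.  */
int __attribute__ ((cmse_nonsecure_entry))
read_status (int channel)
{
  /* Secure-side computation may leave secret data in r1-r3, r12 and, for
     hard-float targets, in d0-d7; the epilogue added by this patch clears
     them before the bxns back to the non-secure caller.  */
  return channel > 0;
}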
[RFC][PATCH, ARM 4/8] ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and bxns return
[Sending on behalf of Andre Vieira] Hello, This patch extends support for the ARMv8-M Security Extensions 'cmse_nonsecure_entry' attribute in two ways: 1) Generate two labels for the function, the regular function name and one with the function's name appended to '__acle_se_', this will trigger the linker to create a secure gateway veneer for this entry function. 2) Return from cmse_nonsecure_entry marked functions using bxns. See Section 5.4 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm.c (use_return_insn): Change to return with bxns when cmse_nonsecure_entry. (output_return_instruction): Likewise. (arm_output_function_prologue): Likewise. (thumb_pop): Likewise. (thumb_exit): Likewise. (arm_function_ok_for_sibcall): Disable sibcall for entry functions. (arm_asm_declare_function_name): New. (thumb1_cmse_nonsecure_entry_return): New. * gcc/config/arm/arm-protos.h (arm_asm_declare_function_name): New. * gcc/config/arm/elf.h (ASM_DECLARE_FUNCTION_NAME): Redefine to use arm_asm_declare_function_name. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-2.c: New. * gcc.target/arm/cmse/cmse-4.c: New. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 85dca057d63544c672188db39b05a33b1be10915..9ee8c333046d9a5bb0487f7b710a5aff42d2 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -31,6 +31,7 @@ extern int arm_volatile_func (void); extern void arm_expand_prologue (void); extern void arm_expand_epilogue (bool); extern void arm_declare_function_name (FILE *, const char *, tree); +extern void arm_asm_declare_function_name (FILE *, const char *, tree); extern void thumb2_expand_return (bool); extern const char *arm_strip_name_encoding (const char *); extern void arm_asm_output_labelref (FILE *, const char *); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 5b9e51b10e91eee64e3383c1ed50269c3e6cf24c..e530b772e3cc053c16421a2a2861d815d53ebb01 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -3795,6 +3795,11 @@ use_return_insn (int iscond, rtx sibling) return 0; } + /* ARMv8-M nonsecure entry function need to use bxns to return and thus need + several instructions if anything needs to be popped. */ + if (saved_int_regs && IS_CMSE_ENTRY (func_type)) +return 0; + /* If there are saved registers but the LR isn't saved, then we need two instructions for the return. */ if (saved_int_regs && !(saved_int_regs & (1 << LR_REGNUM))) @@ -6820,6 +6825,11 @@ arm_function_ok_for_sibcall (tree decl, tree exp) if (IS_INTERRUPT (func_type)) return false; + /* ARMv8-M non-secure entry functions need to return with bxns which is only + generated for entry functions themselves. */ + if (IS_CMSE_ENTRY (arm_current_func_type ())) +return false; + if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (cfun->decl { /* Check that the return value locations are the same. For @@ -19607,6 +19617,7 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, (e.g. interworking) then we can load the return address directly into the PC. Otherwise we must load it into LR. 
*/ if (really_return + && !IS_CMSE_ENTRY (func_type) && (IS_INTERRUPT (func_type) || !TARGET_INTERWORK)) return_reg = reg_names[PC_REGNUM]; else @@ -19742,8 +19753,12 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, break; default: + if (IS_CMSE_ENTRY (func_type)) + { + snprintf (instr, sizeof (instr), "bxns%s\t%%|lr", conditional); + } /* Use bx if it's available. */ - if (arm_arch5 || arm_arch4t) + else if (arm_arch5 || arm_arch4t) sprintf (instr, "bx%s\t%%|lr", conditional); else sprintf (instr, "mov%s\t%%|pc, %%|lr", conditional); @@ -19756,6 +19771,42 @@ output_return_instruction (rtx operand, bool really_return, bool reverse, return ""; } +/* Output in FILE asm statements needed to declare the NAME of the function + defined by its DECL node. */ + +void +arm_asm_declare_function_name (FILE *file, const char *name, tree decl) +{ + size_t cmse_name_len; + char *cmse_name = 0; + char cmse_prefix[] = "__acle_se_"; + + if (use_cmse && lookup_attribute ("cmse_nonsecure_entry", + DECL_ATTRIBUTES (decl))) +{ +
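As an illustration of the two points above (example invented, not actual compiler output), an entry function such as:

int __attribute__ ((cmse_nonsecure_entry))
get_version (void)
{
  return 3;
}

/* is expected to be emitted with two labels and a bxns return, roughly:
     get_version:
     __acle_se_get_version:
             movs    r0, #3
             bxns    lr
   The extra '__acle_se_' label is what triggers the linker to create the
   secure gateway veneer for this entry function (sketch only; any
   prologue/epilogue code is omitted).  */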
[RFC][PATCH, ARM 3/8] Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute
[Sending on behalf of Andre Vieira] Hello, This patch adds support for the ARMv8-M Security Extensions 'cmse_nonsecure_entry' attribute. In this patch we implement the attribute handling and diagnosis around the attribute. See Section 5.4 of ARM®v8-M Security Extensions (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html). *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm.c (arm_handle_cmse_nonsecure_entry): New. (arm_attribute_table): Added cmse_nonsecure_entry (arm_compute_func_type): Handle cmse_nonsecure_entry. (cmse_func_args_or_return_in_stack): New. (arm_handle_cmse_nonsecure_entry): New. * gcc/config/arm/arm.h (ARM_FT_CMSE_ENTRY): New macro define. (IS_CMSE_ENTRY): Likewise. *** gcc/testsuite/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc.target/arm/cmse/cmse-3.c: New. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index cf6d9466fb79e4f8a2dbfe725c52d5be8ea24fd2..f12e3c93bbe24b10ed8eee6687161826773ef649 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -1375,6 +1375,7 @@ enum reg_class #define ARM_FT_VOLATILE(1 << 4) /* Does not return. */ #define ARM_FT_NESTED (1 << 5) /* Embedded inside another func. */ #define ARM_FT_STACKALIGN (1 << 6) /* Called with misaligned stack. */ +#define ARM_FT_CMSE_ENTRY (1 << 7) /* ARMv8-M non-secure entry function. */ /* Some macros to test these flags. */ #define ARM_FUNC_TYPE(t) (t & ARM_FT_TYPE_MASK) @@ -1383,6 +1384,7 @@ enum reg_class #define IS_NAKED(t)(t & ARM_FT_NAKED) #define IS_NESTED(t) (t & ARM_FT_NESTED) #define IS_STACKALIGN(t) (t & ARM_FT_STACKALIGN) +#define IS_CMSE_ENTRY(t) (t & ARM_FT_CMSE_ENTRY) /* Structure used to hold the function stack frame layout. Offsets are diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 2223101fbf96bceb4beb3a7d6cb04162481dc3bf..5b9e51b10e91eee64e3383c1ed50269c3e6cf24c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -135,6 +135,7 @@ static tree arm_handle_isr_attribute (tree *, tree, tree, int, bool *); #if TARGET_DLLIMPORT_DECL_ATTRIBUTES static tree arm_handle_notshared_attribute (tree *, tree, tree, int, bool *); #endif +static tree arm_handle_cmse_nonsecure_entry (tree *, tree, tree, int, bool *); static void arm_output_function_epilogue (FILE *, HOST_WIDE_INT); static void arm_output_function_prologue (FILE *, HOST_WIDE_INT); static int arm_comp_type_attributes (const_tree, const_tree); @@ -343,6 +344,9 @@ static const struct attribute_spec arm_attribute_table[] = { "notshared",0, 0, false, true, false, arm_handle_notshared_attribute, false }, #endif + /* ARMv8-M Security Extensions support. */ + { "cmse_nonsecure_entry", 0, 0, true, false, false, +arm_handle_cmse_nonsecure_entry, false }, { NULL, 0, 0, false, false, false, NULL, false } }; @@ -3562,6 +3566,9 @@ arm_compute_func_type (void) else type |= arm_isr_value (TREE_VALUE (a)); + if (lookup_attribute ("cmse_nonsecure_entry", attr)) +type |= ARM_FT_CMSE_ENTRY; + return type; } @@ -6552,6 +6559,109 @@ arm_handle_notshared_attribute (tree *node, } #endif +/* This function is used to check whether functions with attributes + cmse_nonsecure_call or cmse_nonsecure_entry use the stack to pass arguments + or return variables. If the function does indeed use the stack this + function returns true and diagnoses this, otherwise it returns false. 
*/ + +static bool +cmse_func_args_or_return_in_stack (tree fndecl, tree name, tree fntype) +{ + function_args_iterator args_iter; + CUMULATIVE_ARGS args_so_far_v; + cumulative_args_t args_so_far; + bool first_param = true; + tree arg_type, prev_arg_type = NULL_TREE, ret_type; + + /* Error out if any argument is passed on the stack. */ + arm_init_cumulative_args (&args_so_far_v, fntype, NULL_RTX, fndecl); + args_so_far = pack_cumulative_args (&args_so_far_v); + FOREACH_FUNCTION_ARGS (fntype, arg_type, args_iter) +{ + rtx arg_rtx; + machine_mode arg_mode = TYPE_MODE (arg_type); + + prev_arg_type = arg_type; + if (VOID_TYPE_P (arg_type)) + continue; + + if (!first_param) + arm_function_arg_advance (args_so_far, arg_mode, arg_type, true); + arg_rtx = arm_function_arg (args_so_far, arg_mode, arg_type, true); + if (!arg_rtx + || arm_arg_partial_bytes (args_so_far, arg_mode, arg_type, true)) + { + error ("%qE attribute not available to functions with arguments " +"passed on the stack", name); + return true; + } + first_param = false; +} + + /* Error out for
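A sketch of the diagnosis described above (declarations invented): under the AAPCS only four core registers are available for integer arguments, so a prototype whose arguments spill to the stack is rejected for this attribute, while one that fits in registers is accepted.

/* Accepted: all four arguments are passed in r0-r3.  */
int __attribute__ ((cmse_nonsecure_entry))
entry_ok (int a, int b, int c, int d);

/* Rejected: the fifth argument would be passed on the stack, so the
   compiler is expected to emit the "attribute not available to functions
   with arguments passed on the stack" error added in this patch.  */
int __attribute__ ((cmse_nonsecure_entry))
entry_bad (int a, int b, int c, int d, int e);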
[RFC][PATCH, ARM 2/8] Add RTL patterns for thumb1 push/pop
[Sending on behalf of Andre Vieira] Hello, This patch adds RTL patterns for the push and pop instructions for thumb1. These are needed by subsequent patches in the series. *** gcc/ChangeLog *** 2015-10-27 Andre Vieira Thomas Preud'homme * gcc/config/arm/arm-ldmstm.nl (constr thumb): Enabled stackpointer to be written/read. * gcc/config/arm/ldmstm.md: Regenerated. * gcc/config/arm/thumb1.md (*thumb1_pop_single): New. (*thumb1_load_multiple_operation): New. * gcc/config/arm/arm.c (thumb_pop): Fix of comment. diff --git a/gcc/config/arm/arm-ldmstm.ml b/gcc/config/arm/arm-ldmstm.ml index 62982df594d5d4a1407df359e927c66986a9788c..f3ee741e93927d8d44a9eccec8970b46a8984216 100644 --- a/gcc/config/arm/arm-ldmstm.ml +++ b/gcc/config/arm/arm-ldmstm.ml @@ -63,7 +63,7 @@ let rec final_offset addrmode nregs = | DB -> -4 * nregs let constr thumb = - if thumb then "l" else "rk" + if thumb then "lk" else "rk" let inout_constr op_type = match op_type with diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 06a6184ee0c4ed1a7cec1de4c1786e297cc57872..2223101fbf96bceb4beb3a7d6cb04162481dc3bf 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -23773,8 +23773,8 @@ thumb1_emit_multi_reg_push (unsigned long mask, unsigned long real_regs) return insn; } -/* Emit code to push or pop registers to or from the stack. F is the - assembly file. MASK is the registers to pop. */ +/* Emit code to pop registers from the stack. F is the assembly file. + MASK is the registers to pop. */ static void thumb_pop (FILE *f, unsigned long mask) { diff --git a/gcc/config/arm/ldmstm.md b/gcc/config/arm/ldmstm.md index ebb09ab86e799f3606e0988980edf3cd0189272b..8c0472e07799bd9d08759e35b6b98f3536d3d013 100644 --- a/gcc/config/arm/ldmstm.md +++ b/gcc/config/arm/ldmstm.md @@ -43,7 +43,7 @@ (define_insn "*thumb_ldm4_ia" [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 "low_register_operand" "") - (mem:SI (match_operand:SI 5 "s_register_operand" "l"))) + (mem:SI (match_operand:SI 5 "s_register_operand" "lk"))) (set (match_operand:SI 2 "low_register_operand" "") (mem:SI (plus:SI (match_dup 5) (const_int 4 @@ -80,7 +80,7 @@ (define_insn "*thumb_ldm4_ia_update" [(match_parallel 0 "load_multiple_operation" -[(set (match_operand:SI 5 "s_register_operand" "+&l") +[(set (match_operand:SI 5 "s_register_operand" "+&lk") (plus:SI (match_dup 5) (const_int 16))) (set (match_operand:SI 1 "low_register_operand" "") (mem:SI (match_dup 5))) @@ -133,7 +133,7 @@ (define_insn "*thumb_stm4_ia_update" [(match_parallel 0 "store_multiple_operation" -[(set (match_operand:SI 5 "s_register_operand" "+&l") +[(set (match_operand:SI 5 "s_register_operand" "+&lk") (plus:SI (match_dup 5) (const_int 16))) (set (mem:SI (match_dup 5)) (match_operand:SI 1 "low_register_operand" "")) @@ -491,7 +491,7 @@ (define_insn "*thumb_ldm3_ia" [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 "low_register_operand" "") - (mem:SI (match_operand:SI 4 "s_register_operand" "l"))) + (mem:SI (match_operand:SI 4 "s_register_operand" "lk"))) (set (match_operand:SI 2 "low_register_operand" "") (mem:SI (plus:SI (match_dup 4) (const_int 4 @@ -522,7 +522,7 @@ (define_insn "*thumb_ldm3_ia_update" [(match_parallel 0 "load_multiple_operation" -[(set (match_operand:SI 4 "s_register_operand" "+&l") +[(set (match_operand:SI 4 "s_register_operand" "+&lk") (plus:SI (match_dup 4) (const_int 12))) (set (match_operand:SI 1 "low_register_operand" "") (mem:SI (match_dup 4))) @@ -568,7 +568,7 @@ (define_insn "*thumb_stm3_ia_update" [(match_parallel 
0 "store_multiple_operation" -[(set (match_operand:SI 4 "s_register_operand" "+&l") +[(set (match_operand:SI 4 "s_register_operand" "+&lk") (plus:SI (match_dup 4) (const_int 12))) (set (mem:SI (match_dup 4)) (match_operand:SI 1 "low_register_operand" "")) @@ -877,7 +877,7 @@ (define_insn "*thumb_ldm2_ia" [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 "low_
RE: [RFC][PATCH, ARM 1/8] Add support for ARMv8-M's Security Extensions flag and intrinsics
And even better, with the patch (see below ChangeLog entries)! Sigh... > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Saturday, December 26, 2015 9:41 AM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [RFC][PATCH, ARM 1/8] Add support for ARMv8-M's Security > Extensions flag and intrinsics > > [Sending on behalf of Andre Vieira] > > Hello, > > This patch adds the support of the '-mcmse' option to enable ARMv8-M's > Security Extensions and supports the following intrinsics: > cmse_TT > cmse_TT_fptr > cmse_TTT > cmse_TTT_fptr > cmse_TTA > cmse_TTA_fptr > cmse_TTAT > cmse_TTAT_fptr > cmse_check_address_range > cmse_check_pointed_object > cmse_is_nsfptr > cmse_nsfptr_create > > It also defines the mandatory cmse_address_info struct and the > __ARM_FEATURE_CMSE macro. > See Chapter 4, Sections 5.2, 5.3 and 5.6 of ARM®v8-M Security > Extensions: Requirements on Development Tools > (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index > .html). > > *** gcc/ChangeLog *** > 2015-10-27 Andre Vieira > Thomas Preud'homme > > * gcc/config.gcc (extra_headers): Added arm_cmse.h. > * gcc/config/arm/arm-arches.def (ARM_ARCH): > (armv8-m): Add FL2_CMSE. > (armv8-m.main): Likewise. > (armv8-m.main+dsp): Likewise. > * gcc/config/arm/arm-c.c > (arm_cpu_builtins): Added __ARM_FEATURE_CMSE macro. > * gcc/config/arm/arm-protos.h > (arm_is_constant_pool_ref): Define FL2_CMSE. > * gcc/config/arm.c (arm_arch_cmse): New. > (arm_option_override): New error for unsupported cmse target. > * gcc/config/arm/arm.h (arm_arch_cmse): New. > * gcc/config/arm/arm.opt (mcmse): New. > * gcc/doc/invoke.texi (ARM Options): Add -mcmse. > * gcc/config/arm/arm_cmse.h: New file. > * libgcc/config/arm/cmse.c: Likewise. > * libgcc/config/arm/t-arm (HAVE_CMSE): New. > > > *** gcc/testsuite/ChangeLog *** > 2015-10-27 Andre Vieira > Thomas Preud'homme > > * gcc.target/arm/cmse/cmse.exp: New. > * gcc.target/arm/cmse/cmse-1.c: New. > * gcc.target/arm/cmse/cmse-12.c: New. > * lib/target-supports.exp > (check_effective_target_arm_cmse_ok): New. 
diff --git a/gcc/config.gcc b/gcc/config.gcc index 882e4134b4c883a5fe0f19996e54ac63769bada1..701082e82ee3da6c5bf00da799293c92af8624ff 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -321,7 +321,7 @@ arc*-*-*) arm*-*-*) cpu_type=arm extra_objs="arm-builtins.o aarch-common.o" - extra_headers="mmintrin.h arm_neon.h arm_acle.h" + extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_cmse.h" target_type_format_char='%' c_target_objs="arm-c.o" cxx_target_objs="arm-c.o" diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def index 1d0301a3b9414127d387834584f3e42c225b6d3f..52518e64e07a1b085ae5ed1932b598e064258971 100644 --- a/gcc/config/arm/arm-arches.def +++ b/gcc/config/arm/arm-arches.def @@ -58,11 +58,11 @@ ARM_ARCH("armv7e-m", cortexm4, 7EM, ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_F ARM_ARCH("armv8-a", cortexa53, 8A,ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ARCH8A)) ARM_ARCH("armv8-a+crc",cortexa53, 8A, ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A)) ARM_ARCH("armv8-m.base", cortexm0, 8M_BASE, -ARM_FSET_MAKE_CPU1 ( FL_FOR_ARCH8M_BASE)) +ARM_FSET_MAKE ( FL_FOR_ARCH8M_BASE, FL2_CMSE)) ARM_ARCH("armv8-m.main", cortexm7, 8M_MAIN, -ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_FOR_ARCH8M_MAIN)) +ARM_FSET_MAKE (FL_CO_PROC | FL_FOR_ARCH8M_MAIN, FL2_CMSE)) ARM_ARCH("armv8-m.main+dsp", cortexm7, 8M_MAIN, -ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN)) +ARM_FSET_MAKE (FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN, FL2_CMSE)) ARM_ARCH("iwmmxt", iwmmxt, 5TE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)) ARM_ARCH("iwmmxt2", iwmmxt2,5TE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2)) diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c index 7dee28ec52df68f8c7a60fe66e1b049fed39c1c0..459ddbb1f41947cbeeb1a291ab7395843528e562 100644 --- a/gcc/config/arm/arm-c.c +++ b/gcc/config/arm/arm-c.c @@ -73
[RFC][PATCH, ARM 1/8] Add support for ARMv8-M's Security Extensions flag and intrinsics
[Sending on behalf of Andre Vieira]

Hello,

This patch adds support for the '-mcmse' option to enable ARMv8-M's Security Extensions and supports the following intrinsics:
cmse_TT
cmse_TT_fptr
cmse_TTT
cmse_TTT_fptr
cmse_TTA
cmse_TTA_fptr
cmse_TTAT
cmse_TTAT_fptr
cmse_check_address_range
cmse_check_pointed_object
cmse_is_nsfptr
cmse_nsfptr_create

It also defines the mandatory cmse_address_info struct and the __ARM_FEATURE_CMSE macro. See Chapter 4, Sections 5.2, 5.3 and 5.6 of ARM®v8-M Security Extensions: Requirements on Development Tools (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

*** gcc/ChangeLog ***

2015-10-27  Andre Vieira
            Thomas Preud'homme

        * gcc/config.gcc (extra_headers): Added arm_cmse.h.
        * gcc/config/arm/arm-arches.def (ARM_ARCH):
        (armv8-m): Add FL2_CMSE.
        (armv8-m.main): Likewise.
        (armv8-m.main+dsp): Likewise.
        * gcc/config/arm/arm-c.c (arm_cpu_builtins): Added __ARM_FEATURE_CMSE macro.
        * gcc/config/arm/arm-protos.h (arm_is_constant_pool_ref): Define FL2_CMSE.
        * gcc/config/arm.c (arm_arch_cmse): New.
        (arm_option_override): New error for unsupported cmse target.
        * gcc/config/arm/arm.h (arm_arch_cmse): New.
        * gcc/config/arm/arm.opt (mcmse): New.
        * gcc/doc/invoke.texi (ARM Options): Add -mcmse.
        * gcc/config/arm/arm_cmse.h: New file.
        * libgcc/config/arm/cmse.c: Likewise.
        * libgcc/config/arm/t-arm (HAVE_CMSE): New.

*** gcc/testsuite/ChangeLog ***

2015-10-27  Andre Vieira
            Thomas Preud'homme

        * gcc.target/arm/cmse/cmse.exp: New.
        * gcc.target/arm/cmse/cmse-1.c: New.
        * gcc.target/arm/cmse/cmse-12.c: New.
        * lib/target-supports.exp (check_effective_target_arm_cmse_ok): New.

We welcome any comments.

Cheers,

Andre
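As a usage sketch for the new option and one of the intrinsics above (the buffer-checking function is invented; the exact prototype and flag name follow the Requirements on Development Tools document referenced above):

#include <arm_cmse.h>

/* Verify that a buffer handed in by the non-secure world really lies in
   non-secure memory before the secure side dereferences it.  Returns
   non-zero when the whole range passes the check.  Compile with -mcmse.  */
int buffer_is_nonsecure (void *buf, unsigned len)
{
  return cmse_check_address_range (buf, len, CMSE_NONSECURE) != 0;
}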
[RFC][PATCH, ARM 0/8] ARMv8-M Security Extensions
[Sending on behalf of Andre Vieira]

Hello,

This patch series aims at implementing alpha-status support for ARMv8-M's Security Extensions. It is only posted as RFC at this stage. You can find the specification of ARMv8-M Security Extensions in: ARM®v8-M Security Extensions: Requirements on Development Tools (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

We currently:
- do not support passing arguments or returning on the stack for cmse_nonsecure_{call,entry} functions,
- do not guarantee padding bits are cleared for arguments or return variables of cmse_nonsecure_{call,entry} functions,
- only test Security Extensions for -mfpu=fpv5-d16 and fpv5-sp-d16 and only support single and double precision FPUs with d16.

Andre Vieira (8):
  Add support for ARMv8-M's Security Extensions flag and intrinsics
  Add RTL patterns for thumb1 push/pop
  Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute
  ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and bxns return
  ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
  Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
  ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call
  Added support for ARMV8-M Security Extension cmse_nonsecure_caller intrinsic

Cheers,

Andre
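To give an idea of the user-visible side of the two attributes named in the limitations above, here is an invented sketch of a call through a non-secure function pointer; the helper name comes from the title of patch 7/8 and the attribute placement on the function type is an assumption based on the referenced specification.

/* Non-secure function type: calls through pointers of this type transition
   from secure to non-secure state (sketch).  */
typedef int __attribute__ ((cmse_nonsecure_call)) ns_callback_t (int);

int run_callback (ns_callback_t *cb, int arg)
{
  /* Expected to be routed through the __gnu_cmse_nonsecure_call helper
     added in patch 7/8 of this series (presumably so that secure register
     state is not leaked across the transition).  */
  return cb (arg);
}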
[PATCH, ARM 7/6] Enable atomics for ARMv8-M Mainline
Hi,

This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch enables atomics for ARMv8-M Mainline. No change is needed to the existing patterns since the Thumb-2 backend can already handle them fine.

[1] For a quick overview of ARMv8-M please refer to the initial cover letter.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2015-12-17  Thomas Preud'homme

        * config/arm/arm.h (TARGET_HAVE_LDACQ): Enable for ARMv8-M Mainline.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 1f79c37b5c36a410a2d500ba92c62a5ba4ca1178..fa2a6fb03ffd2ca53bfb7e7c8f03022b626880e0 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -258,7 +258,7 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
        || arm_arch7) && arm_arch_notm)

 /* Nonzero if this chip supports load-acquire and store-release.  */
-#define TARGET_HAVE_LDACQ	(TARGET_ARM_ARCH >= 8 && arm_arch_notm)
+#define TARGET_HAVE_LDACQ	(TARGET_ARM_ARCH >= 8 && TARGET_32BIT)

 /* Nonzero if this chip provides the movw and movt instructions.  */
 #define TARGET_HAVE_MOVT	(arm_arch_thumb2 || arm_arch8)

Testing:
* Toolchain was built successfully with and without the ARMv8-M support patches with the following multilib list: armv6-m,armv7-m,armv7e-m,cortex-m7. The code generation for crtbegin.o, crtend.o, crti.o, crtn.o, libgcc.a, libgcov.a, libc.a, libg.a, libgloss-linux.a, libm.a, libnosys.a, librdimon.a, librdpmon.a, libstdc++.a and libsupc++.a is unchanged for all these targets.
* GCC also showed no testsuite regression when targeting ARMv8-M Baseline compared to ARMv6-M on ARM Fast Models and when targeting ARMv6-M and ARMv7-M (compared to without the patch).
* GCC was bootstrapped successfully targeting Thumb-1 and targeting Thumb-2.

Is this ok for stage3?

Best regards,

Thomas
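A sketch of the kind of source affected (functions invented): with TARGET_HAVE_LDACQ now true for ARMv8-M Mainline, acquire loads and release stores like the ones below are expected to map onto the load-acquire/store-release instructions through the existing Thumb-2 patterns rather than plain accesses plus barriers.

#include <stdatomic.h>

/* Acquire load: candidate for a load-acquire instruction.  */
int read_flag (atomic_int *flag)
{
  return atomic_load_explicit (flag, memory_order_acquire);
}

/* Release store: candidate for a store-release instruction.  */
void publish_flag (atomic_int *flag, int value)
{
  atomic_store_explicit (flag, value, memory_order_release);
}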
[arm-embedded][PATCH, ARM, 3/3] Add multilib support for bare-metal ARM architectures
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Wednesday, December 16, 2015 8:04 PM > To: 'Ramana Radhakrishnan'; Richard Earnshaw; Kyrylo Tkachov; gcc- > patches > Cc: Jasmin J. > Subject: [PATCH, ARM, 3/3] Add multilib support for bare-metal ARM > architectures > > Hi Ramana, > > As suggested in your initial answer to this thread, we updated the > multilib patch provided in ARM's embedded branch to be up-to-date > with regards to supported CPUs in GCC. As to the need to modify > Makefile.in and configure.ac, this is because the patch aims to let control > to the user as to what multilib should be built. To this effect, it takes a > list > of architecture at configure time and that list needs to be passed down > to t-baremetal Makefile to set the multilib variables appropriately. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-12-15 Thomas Preud'homme > > * Makefile.in (with_multilib_list): New variables substituted by > configure. > * config.gcc: Handle bare-metal multilibs in --with-multilib-list > option. > * config/arm/t-baremetal: New file. > * configure.ac (with_multilib_list): New AC_SUBST. > * configure: Regenerate. > * doc/install.texi (--with-multilib-list): Update description for > arm*-*-* targets to mention bare-metal multilibs. > > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index > 1f698798aa2df3f44d6b3a478bb4bf48e9fa7372..18b790afa114aa7580be06 > 62d3ac9ffbc94e919d 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -546,6 +546,7 @@ lang_opt_files=@lang_opt_files@ $(srcdir)/c- > family/c.opt $(srcdir)/common.opt > lang_specs_files=@lang_specs_files@ > lang_tree_files=@lang_tree_files@ > target_cpu_default=@target_cpu_default@ > +with_multilib_list=@with_multilib_list@ > OBJC_BOEHM_GC=@objc_boehm_gc@ > extra_modes_file=@extra_modes_file@ > extra_opt_files=@extra_opt_files@ > diff --git a/gcc/config.gcc b/gcc/config.gcc > index > af948b5e203f6b4f53dfca38e9d02d060d00c97b..d8098ed3cefacd00cb1059 > 0db1ec86d48e9fcdbc 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -3787,15 +3787,25 @@ case "${target}" in > default) > ;; > *) > - echo "Error: --with-multilib- > list=${with_multilib_list} not supported." 1>&2 > - exit 1 > + for arm_multilib in ${arm_multilibs}; do > + case ${arm_multilib} in > + armv6-m | armv7-m | armv7e-m > | armv7-r | armv8-m.base | armv8-m.main) > + > tmake_profile_file="arm/t-baremetal" > + ;; > + *) > + echo "Error: --with- > multilib-list=${with_multilib_list} not supported." 1>&2 > + exit 1 > + ;; > + esac > + done > ;; > esac > > if test "x${tmake_profile_file}" != x ; then > - # arm/t-aprofile is only designed to work > - # without any with-cpu, with-arch, with- > mode, > - # with-fpu or with-float options. > + # arm/t-aprofile and arm/t-baremetal are > only > + # designed to work without any with-cpu, > + # with-arch, with-mode, with-fpu or > with-float > + # options. > if test "x$with_arch" != x \ > || test "x$with_cpu" != x \ > || test "x$with_float" != x \ > diff --git a/gcc/config/arm/t-baremetal b/gcc/config/arm/t-baremetal > new file mode 100644 > index > ..ffd29815e6ec22c747e777 > 47ed9b69e0ae21b63a > --- /dev/null > +++ b/gcc/config/arm/t-baremetal > @@ -0,0 +1,130 @@ > +# A set of predefined MULTILIB which can be used for different ARM > targets. > +# Via the configure option --
[arm-embedded][PATCH, GCC/ARM, 2/3] Error out for incompatible ARM multilibs
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Wednesday, December 16, 2015 7:59 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, GCC/ARM, 2/3] Error out for incompatible ARM > multilibs > > Currently in config.gcc, only the first multilib in a multilib list is > checked for > validity and the following elements are ignored due to the break which > only breaks out of loop in shell. A loop is also done over the multilib list > elements despite no combination being legal. This patch rework the code > to address both issues. > > ChangeLog entry is as follows: > > > 2015-11-24 Thomas Preud'homme > > * config.gcc: Error out when conflicting multilib is detected. Do not > loop over multilibs since no combination is legal. > > > diff --git a/gcc/config.gcc b/gcc/config.gcc > index 59aee2c..be3c720 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -3772,38 +3772,40 @@ case "${target}" in > # Add extra multilibs > if test "x$with_multilib_list" != x; then > arm_multilibs=`echo $with_multilib_list | sed -e > 's/,/ /g'` > - for arm_multilib in ${arm_multilibs}; do > - case ${arm_multilib} in > - aprofile) > + case ${arm_multilibs} in > + aprofile) > # Note that arm/t-aprofile is a > # stand-alone make file fragment to be > # used only with itself. We do not > # specifically use the > # TM_MULTILIB_OPTION framework > because > # this shorthand is more > - # pragmatic. Additionally it is only > - # designed to work without any > - # with-cpu, with-arch with-mode > + # pragmatic. > + tmake_profile_file="arm/t-aprofile" > + ;; > + default) > + ;; > + *) > + echo "Error: --with-multilib- > list=${with_multilib_list} not supported." 1>&2 > + exit 1 > + ;; > + esac > + > + if test "x${tmake_profile_file}" != x ; then > + # arm/t-aprofile is only designed to work > + # without any with-cpu, with-arch, with- > mode, > # with-fpu or with-float options. > - if test "x$with_arch" != x \ > - || test "x$with_cpu" != x \ > - || test "x$with_float" != x \ > - || test "x$with_fpu" != x \ > - || test "x$with_mode" != x ; > then > - echo "Error: You cannot use > any of --with-arch/cpu/fpu/float/mode with --with-multilib-list=aprofile" > 1>&2 > - exit 1 > - fi > - tmake_file="${tmake_file} > arm/t-aprofile" > - break > - ;; > - default) > - ;; > - *) > - echo "Error: --with-multilib- > list=${with_multilib_list} not supported." 1>&2 > - exit 1 > - ;; > - esac > - done > + if test "x$with_arch" != x \ > + || test "x$with_cpu" != x \ > + || test "x$with_float" != x \ > + || test "x$with_fpu" != x \ > + || test "x$with_mode" != x ; then > +
[arm-embedded][PATCH, ARM, 1/3] Document --with-multilib-list for arm*-*-* targets
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Wednesday, December 16, 2015 7:56 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, ARM, 1/3] Document --with-multilib-list for arm*-*-* > targets > > Currently, the documentation for --with-multilib-list in > gcc/doc/install.texi only mentions sh*-*-* and x86-64-*-linux* targets. > However, arm*-*-* targets also support this option. This patch adds > documention for the meaning of this option for arm*-*-* targets. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-12-09 Thomas Preud'homme > > * doc/install.texi (--with-multilib-list): Describe the meaning of the > option for arm*-*-* targets. > > > diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi > index 57399ed..2c93eb0 100644 > --- a/gcc/doc/install.texi > +++ b/gcc/doc/install.texi > @@ -1102,9 +1102,19 @@ sysv, aix. > @item --with-multilib-list=@var{list} > @itemx --without-multilib-list > Specify what multilibs to build. > -Currently only implemented for sh*-*-* and x86-64-*-linux*. > +Currently only implemented for arm*-*-*, sh*-*-* and x86-64-*-linux*. > > @table @code > +@item arm*-*-* > +@var{list} is either @code{default} or @code{aprofile}. Specifying > +@code{default} is equivalent to omitting this option while specifying > +@code{aprofile} builds multilibs for each combination of ISA (@code{- > marm} or > +@code{-mthumb}), architecture (@code{-march=armv7-a}, @code{- > march=armv7ve}, > +or @code{-march=armv8-a}), FPU available (none, @code{- > mfpu=vfpv3-d16}, > +@code{neon}, @code{vfpv4-d16}, @code{neon-vfpv4} or > @code{neon-fp-armv8} > +depending on architecture) and floating-point ABI (@code{-mfloat- > abi=softfp} > +or @code{-mfloat-abi=hard}). > + > @item sh*-*-* > @var{list} is a comma separated list of CPU names. These must be of > the > form @code{sh*} or @code{m*} (in which case they match the compiler > option > > > PDF builds fine out of the updated file and look as expected. > > Is this ok for trunk? > > Best regards, > > Thomas
RE: [PATCH, testsuite] Fix PR68629: attr-simd-3.c failure on arm-none-eabi targets
Reverted now. > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Wednesday, December 09, 2015 5:56 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH, testsuite] Fix PR68629: attr-simd-3.c failure on arm- > none-eabi targets > > c-c++-common/attr-simd-3.c fails to compile on arm-none-eabi targets > due to -fcilkplus needing -pthread which is not available for those targets. > This patch solves this issue by adding a condition to the cilkplus effective > target that compiling with -fcilkplus succeeds and requires cilkplus as an > effective target for attr-simd-3.c testcase. > > ChangeLog entry is as follows: > > > *** gcc/testsuite/ChangeLog *** > > 2015-12-08 Thomas Preud'homme > > PR testsuite/68629 > * lib/target-supports.exp (check_effective_target_cilkplus): Also > check that compiling with -fcilkplus does not give an error. > * c-c++-common/attr-simd-3.c: Require cilkplus effective target. > > > diff --git a/gcc/testsuite/c-c++-common/attr-simd-3.c b/gcc/testsuite/c- > c++-common/attr-simd-3.c > index d61ba82..1970c67 100644 > --- a/gcc/testsuite/c-c++-common/attr-simd-3.c > +++ b/gcc/testsuite/c-c++-common/attr-simd-3.c > @@ -1,4 +1,5 @@ > /* { dg-do compile } */ > +/* { dg-require-effective-target "cilkplus" } */ > /* { dg-options "-fcilkplus" } */ > /* { dg-prune-output "undeclared here \\(not in a > function\\)|\[^\n\r\]* was not declared in this scope" } */ > > diff --git a/gcc/testsuite/lib/target-supports.exp > b/gcc/testsuite/lib/target-supports.exp > index 4e349e9..95b903c 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -1432,7 +1432,12 @@ proc check_effective_target_cilkplus { } { > if { [istarget avr-*-*] } { > return 0; > } > -return 1 > +return [ check_no_compiler_messages_nocache fcilkplus_available > executable { > + #ifdef __cplusplus > + extern "C" > + #endif > + int dummy; > + } "-fcilkplus" ] > } > > proc check_linker_plugin_available { } { > > > Testsuite shows no regression when run with > + an arm-none-eabi GCC cross-compiler targeting Cortex-M3 > + a bootstrapped x86_64-linux-gnu GCC native compiler > > Is this ok for trunk? > > Best regards, > > Thomas >
RE: [PATCH, testsuite] Fix PR68629: attr-simd-3.c failure on arm-none-eabi targets
Hi, > From: Jakub Jelinek [mailto:ja...@redhat.com] > Sent: Thursday, December 17, 2015 4:26 PM > > > > > > > --- a/gcc/testsuite/lib/target-supports.exp > > > > +++ b/gcc/testsuite/lib/target-supports.exp > > > > @@ -1432,7 +1432,12 @@ proc check_effective_target_cilkplus { } { > > > > if { [istarget avr-*-*] } { > > > > return 0; > > > > } > > > > -return 1 > > > > +return [ check_no_compiler_messages_nocache > fcilkplus_available executable { > > > > + #ifdef __cplusplus > > > > + extern "C" > > > > + #endif > > > > + int dummy; > > > > + } "-fcilkplus" ] > > > > } > > That change has been obviously bad. If anything, you want to make it > compile time only, i.e. check_no_compiler_messages_nocache > fcilkplus_available assembly Indeed, I failed to parse the space and didn't realize the kind of testing could be selected. > Just look at cilk-plus.exp: > It checks check_effective_target_cilkplus, and performs lots of tests if it > it returns true, and then checks check_libcilkrts_available and performs > further tests. > So, if any use of -fcilkplus fails on your target, then putting it > into check_effective_target_cilkplus is fine, you won't lose any Cilk+ > testing that way. Otherwise, if it is conditional say only some constructs, > say array notation is fine, but _Cilk_for is not, then even that is wrong. Ok. When I saw the very small list of target for which the condition returned true, I thought the goal was only to check if the target *could* support cilkplus and that actual support was tested by cilk-plus.exp. I'll revert this commit and prepare a patch to add arm in that list. > > In any case, IMHO the attr-simd-3.c test just should be moved into > c-c++-common/cilk-plus/SE/ directory. That was my thought initially but then I changed my mind, thinking that the test was placed there for a reason. I'll prepare a third patch to do that. My apologize for the breakage. Best regards, Thomas
[arm-embedded][PATCH, ARM 6/6] Add support for CB(N)Z and (U|S)DIV to ARMv8-M Baseline
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 4:18 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, ARM 6/6] Add support for CB(N)Z and (U|S)DIV to > ARMv8-M Baseline > > Hi, > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch makes the compiler start generating code with the > new CB(N)Z and (U|S)DIV instructions for ARMv8-M Baseline. > > Sharing of instruction patterns for div insn template with ARM or Thumb- > 2 was done by allowing %? punctuation character for Thumb-1. This is > safe to do since the compiler would fault in arm_print_condition if a > condition code is not handled by a branch in Thumb1. > > Unfortunately, cbz cannot be shared with cbranchsi4 because it would > lead to worse code for Thumb-1. Indeed, choosing cb(n)z over the other > alternatives for cbranchsi4 depends on the distance between target and > pc which lead insn-attrtab to evaluate the minimum length of this > pattern to be 2 as it cannot computer the distance statically. It would be > possible to determine that this alternative is not available for non > ARMv8-M Thumb-1 target statically but genattrtab is not currently > capable to do it, so this is for a later patch. > > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-11-13 Thomas Preud'homme > > * config/arm/arm.c (arm_print_operand_punct_valid_p): Make %? > valid > for Thumb-1. > * config/arm/arm.h (TARGET_HAVE_CBZ): Define. > (TARGET_IDIV): Set for all Thumb targets provided they have > hardware > divide feature. > * config/arm/thumb1.md (thumb1_cbz): New insn. > > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index 015df50..247f144 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -263,9 +263,12 @@ extern void > (*arm_lang_output_object_attributes_hook)(void); > /* Nonzero if this chip provides the movw and movt instructions. */ > #define TARGET_HAVE_MOVT (arm_arch_thumb2 || arm_arch8) > > +/* Nonzero if this chip provides the cb{n}z instruction. */ > +#define TARGET_HAVE_CBZ (arm_arch_thumb2 || arm_arch8) > + > /* Nonzero if integer division instructions supported. */ > #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \ > - || (TARGET_THUMB2 && > arm_arch_thumb_hwdiv)) > + || (TARGET_THUMB && > arm_arch_thumb_hwdiv)) > > /* Nonzero if disallow volatile memory access in IT block. */ > #define TARGET_NO_VOLATILE_CE > (arm_arch_no_volatile_ce) > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index d832309..5ef3a1d 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -22568,7 +22568,7 @@ arm_print_operand_punct_valid_p > (unsigned char code) > { >return (code == '@' || code == '|' || code == '.' > || code == '(' || code == ')' || code == '#' > - || (TARGET_32BIT && (code == '?')) > + || code == '?' 
> || (TARGET_THUMB2 && (code == '!')) > || (TARGET_THUMB && (code == '_'))); > } > diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md > index 7e3bcb4..074b267 100644 > --- a/gcc/config/arm/thumb1.md > +++ b/gcc/config/arm/thumb1.md > @@ -973,6 +973,92 @@ >DONE; > }) > > +;; A pattern for the cb(n)z instruction added in ARMv8-M baseline > profile, > +;; adapted from cbranchsi4_insn. Modifying cbranchsi4_insn instead > leads to > +;; code generation difference for ARMv6-M because the minimum > length of the > +;; instruction becomes 2 even for it due to a limitation in genattrtab's > +;; handling of pc in the length condition. > +(define_insn "thumb1_cbz" > + [(set (pc) (if_then_else > + (match_operator 0 "equality_operator" > +[(match_operand:SI 1 "s_register_operand" "l") > + (const_int 0)]) > + (label_ref (match_operand 2 "" "")) > + (pc)))] > + "TARGET_THUMB1 && TARGET_HAVE_MOVT" > +{ > + if (get_attr_length (insn) == 2) > +{
[PATCH, ARM 6/6] Add support for CB(N)Z and (U|S)DIV to ARMv8-M Baseline
Hi, This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch makes the compiler start generating code with the new CB(N)Z and (U|S)DIV instructions for ARMv8-M Baseline. Sharing of instruction patterns for div insn template with ARM or Thumb-2 was done by allowing %? punctuation character for Thumb-1. This is safe to do since the compiler would fault in arm_print_condition if a condition code is not handled by a branch in Thumb1. Unfortunately, cbz cannot be shared with cbranchsi4 because it would lead to worse code for Thumb-1. Indeed, choosing cb(n)z over the other alternatives for cbranchsi4 depends on the distance between target and pc which lead insn-attrtab to evaluate the minimum length of this pattern to be 2 as it cannot computer the distance statically. It would be possible to determine that this alternative is not available for non ARMv8-M Thumb-1 target statically but genattrtab is not currently capable to do it, so this is for a later patch. [1] For a quick overview of ARMv8-M please refer to the initial cover letter. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-11-13 Thomas Preud'homme * config/arm/arm.c (arm_print_operand_punct_valid_p): Make %? valid for Thumb-1. * config/arm/arm.h (TARGET_HAVE_CBZ): Define. (TARGET_IDIV): Set for all Thumb targets provided they have hardware divide feature. * config/arm/thumb1.md (thumb1_cbz): New insn. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 015df50..247f144 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -263,9 +263,12 @@ extern void (*arm_lang_output_object_attributes_hook)(void); /* Nonzero if this chip provides the movw and movt instructions. */ #define TARGET_HAVE_MOVT (arm_arch_thumb2 || arm_arch8) +/* Nonzero if this chip provides the cb{n}z instruction. */ +#define TARGET_HAVE_CBZ(arm_arch_thumb2 || arm_arch8) + /* Nonzero if integer division instructions supported. */ #define TARGET_IDIV((TARGET_ARM && arm_arch_arm_hwdiv) \ -|| (TARGET_THUMB2 && arm_arch_thumb_hwdiv)) +|| (TARGET_THUMB && arm_arch_thumb_hwdiv)) /* Nonzero if disallow volatile memory access in IT block. */ #define TARGET_NO_VOLATILE_CE (arm_arch_no_volatile_ce) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index d832309..5ef3a1d 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -22568,7 +22568,7 @@ arm_print_operand_punct_valid_p (unsigned char code) { return (code == '@' || code == '|' || code == '.' || code == '(' || code == ')' || code == '#' - || (TARGET_32BIT && (code == '?')) + || code == '?' || (TARGET_THUMB2 && (code == '!')) || (TARGET_THUMB && (code == '_'))); } diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md index 7e3bcb4..074b267 100644 --- a/gcc/config/arm/thumb1.md +++ b/gcc/config/arm/thumb1.md @@ -973,6 +973,92 @@ DONE; }) +;; A pattern for the cb(n)z instruction added in ARMv8-M baseline profile, +;; adapted from cbranchsi4_insn. Modifying cbranchsi4_insn instead leads to +;; code generation difference for ARMv6-M because the minimum length of the +;; instruction becomes 2 even for it due to a limitation in genattrtab's +;; handling of pc in the length condition. 
+(define_insn "thumb1_cbz" + [(set (pc) (if_then_else + (match_operator 0 "equality_operator" + [(match_operand:SI 1 "s_register_operand" "l") + (const_int 0)]) + (label_ref (match_operand 2 "" "")) + (pc)))] + "TARGET_THUMB1 && TARGET_HAVE_MOVT" +{ + if (get_attr_length (insn) == 2) +{ + if (GET_CODE (operands[0]) == EQ) + return "cbz\t%1, %l2"; + else + return "cbnz\t%1, %l2"; +} + else +{ + rtx t = cfun->machine->thumb1_cc_insn; + if (t != NULL_RTX) + { + if (!rtx_equal_p (cfun->machine->thumb1_cc_op0, operands[1]) + || !rtx_equal_p (cfun->machine->thumb1_cc_op1, operands[2])) + t = NULL_RTX; + if (cfun->machine->thumb1_cc_mode == CC_NOOVmode) + { + if (!noov_comparison_operator (operands[0], VOIDmode)) + t = NULL_RTX; + } + else if (cfun->machine->thumb1_cc_mode != CCmode) + t = NULL_RTX; + } + if (t == NULL_RTX) + { + output_asm_insn ("cmp\t%1, #0", operands); + cfun->machine->thumb1_cc_insn = insn; + cfun->machine->thumb1_cc_op0 = operands[1]; + cfun->machi
[arm-embedded][PATCH, ARM 5/6] Add support for MOVT/MOVW to ARMv8-M Baseline
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 4:08 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, ARM 5/6] Add support for MOVT/MOVW to ARMv8-M > Baseline > > Hi, > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch makes the compiler start generating code with the > new MOVT/MOVW instructions for ARMv8-M Baseline. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-11-13 Thomas Preud'homme > > * config/arm/arm.h (TARGET_HAVE_MOVT): Include ARMv8-M as > having MOVT. > * config/arm/arm.c (arm_arch_name): (const_ok_for_op): Check > MOVT/MOVW > availability with TARGET_HAVE_MOVT. > (thumb_legitimate_constant_p): Legalize high part of a label_ref as a > constant. > (thumb1_rtx_costs): Also return 0 if setting a half word constant and > movw is available. > (thumb1_size_rtx_costs): Make set of half word constant also cost 1 > extra instruction if MOVW is available. Make constant with bottom > half > word zero cost 2 instruction if MOVW is available. > * config/arm/arm.md (define_attr "arch"): Add v8mb. > (define_attr "arch_enabled"): Set to yes if arch value is v8mb and > target is ARMv8-M Baseline. > * config/arm/thumb1.md (thumb1_movdi_insn): Add ARMv8-M > Baseline only > alternative for constants satisfying j constraint. > (thumb1_movsi_insn): Likewise. > (movsi splitter for K alternative): Tighten condition to not trigger > if movt is available and j constraint is satisfied. > (Pe immediate splitter): Likewise. > (thumb1_movhi_insn): Add ARMv8-M Baseline only alternative for > constant fitting in an halfword to use movw. > * doc/sourcebuild.texi (arm_thumb1_movt_ko): Document new > ARM > effective target. > > > *** gcc/testsuite/ChangeLog *** > > 2015-11-13 Thomas Preud'homme > > * lib/target-supports.exp > (check_effective_target_arm_thumb1_movt_ko): > Define effective target. > * gcc.target/arm/pr42574.c: Require arm_thumb1_movt_ko instead > of > arm_thumb1_ok as effective target to exclude ARMv8-M Baseline. > > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index ff3cfcd..015df50 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -261,7 +261,7 @@ extern void > (*arm_lang_output_object_attributes_hook)(void); > #define TARGET_HAVE_LDACQ(TARGET_ARM_ARCH >= 8 && > arm_arch_notm) > > /* Nonzero if this chip provides the movw and movt instructions. */ > -#define TARGET_HAVE_MOVT (arm_arch_thumb2) > +#define TARGET_HAVE_MOVT (arm_arch_thumb2 || arm_arch8) > > /* Nonzero if integer division instructions supported. */ > #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \ > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 51d501e..d832309 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -8158,6 +8158,12 @@ arm_legitimate_constant_p_1 > (machine_mode, rtx x) > static bool > thumb_legitimate_constant_p (machine_mode mode > ATTRIBUTE_UNUSED, rtx x) > { > + /* Splitters for TARGET_USE_MOVT call arm_emit_movpair which > creates high > + RTX. These RTX must therefore be allowed for Thumb-1 so that > when run > + for ARMv8-M baseline or later the result is valid. 
*/ > + if (TARGET_HAVE_MOVT && GET_CODE (x) == HIGH) > +x = XEXP (x, 0); > + >return (CONST_INT_P (x) > || CONST_DOUBLE_P (x) > || CONSTANT_ADDRESS_P (x) > @@ -8244,7 +8250,8 @@ thumb1_rtx_costs (rtx x, enum rtx_code code, > enum rtx_code outer) > case CONST_INT: >if (outer == SET) > { > - if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256) > + if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256 > + || (TARGET_HAVE_MOVT && !(INTVAL (x) & 0x))) > return 0; > if (thumb_shiftable_const (INTVAL (x))) > return COSTS_N_INSNS (2); > @@ -8994,16 +9001,24 @@ thumb1_size_rtx_costs (rtx x, enum > rtx_code code, enum rtx_code outer) >the mode. */ >
[PATCH, ARM 5/6] Add support for MOVT/MOVW to ARMv8-M Baseline
Hi, This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch makes the compiler start generating code with the new MOVT/MOVW instructions for ARMv8-M Baseline. [1] For a quick overview of ARMv8-M please refer to the initial cover letter. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-11-13 Thomas Preud'homme * config/arm/arm.h (TARGET_HAVE_MOVT): Include ARMv8-M as having MOVT. * config/arm/arm.c (arm_arch_name): (const_ok_for_op): Check MOVT/MOVW availability with TARGET_HAVE_MOVT. (thumb_legitimate_constant_p): Legalize high part of a label_ref as a constant. (thumb1_rtx_costs): Also return 0 if setting a half word constant and movw is available. (thumb1_size_rtx_costs): Make set of half word constant also cost 1 extra instruction if MOVW is available. Make constant with bottom half word zero cost 2 instruction if MOVW is available. * config/arm/arm.md (define_attr "arch"): Add v8mb. (define_attr "arch_enabled"): Set to yes if arch value is v8mb and target is ARMv8-M Baseline. * config/arm/thumb1.md (thumb1_movdi_insn): Add ARMv8-M Baseline only alternative for constants satisfying j constraint. (thumb1_movsi_insn): Likewise. (movsi splitter for K alternative): Tighten condition to not trigger if movt is available and j constraint is satisfied. (Pe immediate splitter): Likewise. (thumb1_movhi_insn): Add ARMv8-M Baseline only alternative for constant fitting in an halfword to use movw. * doc/sourcebuild.texi (arm_thumb1_movt_ko): Document new ARM effective target. *** gcc/testsuite/ChangeLog *** 2015-11-13 Thomas Preud'homme * lib/target-supports.exp (check_effective_target_arm_thumb1_movt_ko): Define effective target. * gcc.target/arm/pr42574.c: Require arm_thumb1_movt_ko instead of arm_thumb1_ok as effective target to exclude ARMv8-M Baseline. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index ff3cfcd..015df50 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -261,7 +261,7 @@ extern void (*arm_lang_output_object_attributes_hook)(void); #define TARGET_HAVE_LDACQ (TARGET_ARM_ARCH >= 8 && arm_arch_notm) /* Nonzero if this chip provides the movw and movt instructions. */ -#define TARGET_HAVE_MOVT (arm_arch_thumb2) +#define TARGET_HAVE_MOVT (arm_arch_thumb2 || arm_arch8) /* Nonzero if integer division instructions supported. */ #define TARGET_IDIV((TARGET_ARM && arm_arch_arm_hwdiv) \ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 51d501e..d832309 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -8158,6 +8158,12 @@ arm_legitimate_constant_p_1 (machine_mode, rtx x) static bool thumb_legitimate_constant_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x) { + /* Splitters for TARGET_USE_MOVT call arm_emit_movpair which creates high + RTX. These RTX must therefore be allowed for Thumb-1 so that when run + for ARMv8-M baseline or later the result is valid. */ + if (TARGET_HAVE_MOVT && GET_CODE (x) == HIGH) +x = XEXP (x, 0); + return (CONST_INT_P (x) || CONST_DOUBLE_P (x) || CONSTANT_ADDRESS_P (x) @@ -8244,7 +8250,8 @@ thumb1_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer) case CONST_INT: if (outer == SET) { - if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256) + if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256 + || (TARGET_HAVE_MOVT && !(INTVAL (x) & 0x))) return 0; if (thumb_shiftable_const (INTVAL (x))) return COSTS_N_INSNS (2); @@ -8994,16 +9001,24 @@ thumb1_size_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer) the mode. 
*/ words = ARM_NUM_INTS (GET_MODE_SIZE (GET_MODE (SET_DEST (x; return COSTS_N_INSNS (words) -+ COSTS_N_INSNS (1) * (satisfies_constraint_J (SET_SRC (x)) - || satisfies_constraint_K (SET_SRC (x)) - /* thumb1_movdi_insn. */ - || ((words > 1) && MEM_P (SET_SRC (x; ++ COSTS_N_INSNS (1) + * (satisfies_constraint_J (SET_SRC (x)) + || satisfies_constraint_K (SET_SRC (x)) +/* Too big immediate for 2byte mov, using movt. */ + || ((unsigned HOST_WIDE_INT) INTVAL (SET_SRC (x)) >= 256 + && TARGET_HAVE_MOVT + && satisfies_constraint_j (SET_SRC (x))) +/* thumb1_movdi_insn. */ + || ((words > 1) && MEM_P (SET_SRC (x; case CONST_INT: if (outer == SET) {
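A sketch of a constant load affected by this change (value invented): with MOVT/MOVW enabled for ARMv8-M Baseline, a 32-bit constant that does not fit the 8-bit Thumb-1 immediate can be synthesised with a movw/movt pair instead of being loaded from a literal pool.

unsigned int magic_constant (void)
{
  /* Expected to become roughly "movw r0, #0x5678" followed by
     "movt r0, #0x1234" (sketch, not verified compiler output).  */
  return 0x12345678u;
}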
[arm-embedded][PATCH, ARM 4/6] Factor out MOVW/MOVT availability and desirability checks
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 3:59 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, ARM 4/6] Factor out MOVW/MOVT availability and > desirability checks > > Hi, > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch factors out the checks for MOVW/MOVT availability > and whether to use it. To this end, the new macro TARGET_HAVE_MOVT > is introduced and code is modified to use it or the existing > TARGET_USE_MOVT as needed. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-11-09 Thomas Preud'homme > > * config/arm/arm.h (TARGET_USE_MOVT): Check MOVT/MOVW > availability > with TARGET_HAVE_MOVT. > (TARGET_HAVE_MOVT): Define. > * config/arm/arm.c (const_ok_for_op): Check MOVT/MOVW > availability with TARGET_HAVE_MOVT. > * config/arm/arm.md (arm_movt): Use TARGET_HAVE_MOVT to > check movt > availability. > (addsi splitter): Use TARGET_USE_MOVT to check whether to use > movt + movw. > (symbol_refs movsi splitter): Remove TARGET_32BIT check. > (arm_movtas_ze): Use TARGET_HAVE_MOVT to check movt > availability. > * config/arm/constraints.md (define_constraint "j"): Use > TARGET_HAVE_MOVT to check movt availability. > > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index fed3205..1831d01 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -233,7 +233,7 @@ extern void > (*arm_lang_output_object_attributes_hook)(void); > > /* Should MOVW/MOVT be used in preference to a constant pool. */ > #define TARGET_USE_MOVT \ > - (arm_arch_thumb2 \ > + (TARGET_HAVE_MOVT \ > && (arm_disable_literal_pool \ > || (!optimize_size && !current_tune->prefer_constant_pool))) > > @@ -268,6 +268,9 @@ extern void > (*arm_lang_output_object_attributes_hook)(void); > /* Nonzero if this chip supports load-acquire and store-release. */ > #define TARGET_HAVE_LDACQ(TARGET_ARM_ARCH >= 8) > > +/* Nonzero if this chip provides the movw and movt instructions. */ > +#define TARGET_HAVE_MOVT (arm_arch_thumb2) > + > /* Nonzero if integer division instructions supported. */ > #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \ >|| (TARGET_THUMB2 && > arm_arch_thumb_hwdiv)) > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 62287bc..ec5197a 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -3851,7 +3851,7 @@ const_ok_for_op (HOST_WIDE_INT i, enum > rtx_code code) > { > case SET: >/* See if we can use movw. */ > - if (arm_arch_thumb2 && (i & 0x) == 0) > + if (TARGET_HAVE_MOVT && (i & 0x) == 0) > return 1; >else > /* Otherwise, try mvn. 
*/ > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md > index 8ebb1bf..78dafa0 100644 > --- a/gcc/config/arm/arm.md > +++ b/gcc/config/arm/arm.md > @@ -5736,7 +5736,7 @@ >[(set (match_operand:SI 0 "nonimmediate_operand" "=r") > (lo_sum:SI (match_operand:SI 1 "nonimmediate_operand" "0") > (match_operand:SI 2 "general_operand" "i")))] > - "arm_arch_thumb2 && arm_valid_symbolic_address_p (operands[2])" > + "TARGET_HAVE_MOVT && arm_valid_symbolic_address_p > (operands[2])" >"movt%?\t%0, #:upper16:%c2" >[(set_attr "predicable" "yes") > (set_attr "predicable_short_it" "no") > @@ -5796,8 +5796,7 @@ >[(set (match_operand:SI 0 "arm_general_register_operand" "") > (const:SI (plus:SI (match_operand:SI 1 "general_operand" "") > (match_operand:SI 2 "const_int_operand" > ""] > - "TARGET_THUMB2 > - && arm_disable_literal_pool > + "TARGET_USE_MOVT > && reload_completed > && GET_CODE (operands[1]) == SYMBOL_REF" >[(clobber (const_int 0))] > @@ -5827,8 +5826,7 @@ > (define_split >[(set (match_operand:SI
[PATCH, ARM 4/6] Factor out MOVW/MOVT availability and desirability checks
Hi, This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch factors out the checks for MOVW/MOVT availability and whether to use it. To this end, the new macro TARGET_HAVE_MOVT is introduced and code is modified to use it or the existing TARGET_USE_MOVT as needed. [1] For a quick overview of ARMv8-M please refer to the initial cover letter. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-11-09 Thomas Preud'homme * config/arm/arm.h (TARGET_USE_MOVT): Check MOVT/MOVW availability with TARGET_HAVE_MOVT. (TARGET_HAVE_MOVT): Define. * config/arm/arm.c (const_ok_for_op): Check MOVT/MOVW availability with TARGET_HAVE_MOVT. * config/arm/arm.md (arm_movt): Use TARGET_HAVE_MOVT to check movt availability. (addsi splitter): Use TARGET_USE_MOVT to check whether to use movt + movw. (symbol_refs movsi splitter): Remove TARGET_32BIT check. (arm_movtas_ze): Use TARGET_HAVE_MOVT to check movt availability. * config/arm/constraints.md (define_constraint "j"): Use TARGET_HAVE_MOVT to check movt availability. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index fed3205..1831d01 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -233,7 +233,7 @@ extern void (*arm_lang_output_object_attributes_hook)(void); /* Should MOVW/MOVT be used in preference to a constant pool. */ #define TARGET_USE_MOVT \ - (arm_arch_thumb2 \ + (TARGET_HAVE_MOVT \ && (arm_disable_literal_pool \ || (!optimize_size && !current_tune->prefer_constant_pool))) @@ -268,6 +268,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void); /* Nonzero if this chip supports load-acquire and store-release. */ #define TARGET_HAVE_LDACQ (TARGET_ARM_ARCH >= 8) +/* Nonzero if this chip provides the movw and movt instructions. */ +#define TARGET_HAVE_MOVT (arm_arch_thumb2) + /* Nonzero if integer division instructions supported. */ #define TARGET_IDIV((TARGET_ARM && arm_arch_arm_hwdiv) \ || (TARGET_THUMB2 && arm_arch_thumb_hwdiv)) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 62287bc..ec5197a 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -3851,7 +3851,7 @@ const_ok_for_op (HOST_WIDE_INT i, enum rtx_code code) { case SET: /* See if we can use movw. */ - if (arm_arch_thumb2 && (i & 0x) == 0) + if (TARGET_HAVE_MOVT && (i & 0x) == 0) return 1; else /* Otherwise, try mvn. 
*/ diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 8ebb1bf..78dafa0 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -5736,7 +5736,7 @@ [(set (match_operand:SI 0 "nonimmediate_operand" "=r") (lo_sum:SI (match_operand:SI 1 "nonimmediate_operand" "0") (match_operand:SI 2 "general_operand" "i")))] - "arm_arch_thumb2 && arm_valid_symbolic_address_p (operands[2])" + "TARGET_HAVE_MOVT && arm_valid_symbolic_address_p (operands[2])" "movt%?\t%0, #:upper16:%c2" [(set_attr "predicable" "yes") (set_attr "predicable_short_it" "no") @@ -5796,8 +5796,7 @@ [(set (match_operand:SI 0 "arm_general_register_operand" "") (const:SI (plus:SI (match_operand:SI 1 "general_operand" "") (match_operand:SI 2 "const_int_operand" ""] - "TARGET_THUMB2 - && arm_disable_literal_pool + "TARGET_USE_MOVT && reload_completed && GET_CODE (operands[1]) == SYMBOL_REF" [(clobber (const_int 0))] @@ -5827,8 +5826,7 @@ (define_split [(set (match_operand:SI 0 "arm_general_register_operand" "") (match_operand:SI 1 "general_operand" ""))] - "TARGET_32BIT - && TARGET_USE_MOVT && GET_CODE (operands[1]) == SYMBOL_REF + "TARGET_USE_MOVT && GET_CODE (operands[1]) == SYMBOL_REF && !flag_pic && !target_word_relocations && !arm_tls_referenced_p (operands[1])" [(clobber (const_int 0))] @@ -11030,7 +11028,7 @@ (const_int 16) (const_int 16)) (match_operand:SI 1 "const_int_operand" ""))] - "arm_arch_thumb2" + "TARGET_HAVE_MOVT" "movt%?\t%0, %L1" [(set_attr "predicable" "yes") (set_attr "predicable_short_it" "no") diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index d01a918..838e031 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/confi
RE: [PATCH, ARM 3/6] Fix indentation of FL_FOR_ARCH* definition after adding support for ARMv8-M
[Fixed the subject and added ARM maintainers to recipient.] > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 3:51 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH, ARM 3/8] Fix indentation of FL_FOR_ARCH* definition > after adding support for ARMv8-M > > Hi, > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch fixes the indentation of FL_FOR_ARCH* macros > definition following the patch to add support for ARMv8-M. Since this is > an obvious change, I'm not expecting a review and will commit it as soon > as the other patches in the series are accepted. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-11-06 Thomas Preud'homme > > * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions. > > > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index 1371ee7..bf0d1b4 100644 > --- a/gcc/config/arm/arm-protos.h > +++ b/gcc/config/arm/arm-protos.h > @@ -391,32 +391,33 @@ extern bool arm_is_constant_pool_ref (rtx); > #define FL_TUNE (FL_WBUF | FL_VFPV2 | FL_STRONG | > FL_LDSCHED \ >| FL_CO_PROC) > > -#define FL_FOR_ARCH2 FL_NOTM > -#define FL_FOR_ARCH3 (FL_FOR_ARCH2 | FL_MODE32) > -#define FL_FOR_ARCH3M(FL_FOR_ARCH3 | FL_ARCH3M) > -#define FL_FOR_ARCH4 (FL_FOR_ARCH3M | FL_ARCH4) > -#define FL_FOR_ARCH4T(FL_FOR_ARCH4 | FL_THUMB) > -#define FL_FOR_ARCH5 (FL_FOR_ARCH4 | FL_ARCH5) > -#define FL_FOR_ARCH5T(FL_FOR_ARCH5 | FL_THUMB) > -#define FL_FOR_ARCH5E(FL_FOR_ARCH5 | FL_ARCH5E) > -#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB) > -#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE > -#define FL_FOR_ARCH6 (FL_FOR_ARCH5TE | FL_ARCH6) > -#define FL_FOR_ARCH6JFL_FOR_ARCH6 > -#define FL_FOR_ARCH6K(FL_FOR_ARCH6 | FL_ARCH6K) > -#define FL_FOR_ARCH6ZFL_FOR_ARCH6 > -#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ) > -#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2) > -#define FL_FOR_ARCH6M(FL_FOR_ARCH6 & ~FL_NOTM) > -#define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) | > FL_ARCH7) > -#define FL_FOR_ARCH7A(FL_FOR_ARCH7 | FL_NOTM | > FL_ARCH6K) > -#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | > FL_ARM_DIV) > -#define FL_FOR_ARCH7R(FL_FOR_ARCH7A | FL_THUMB_DIV) > -#define FL_FOR_ARCH7M(FL_FOR_ARCH7 | FL_THUMB_DIV) > -#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM) > -#define FL_FOR_ARCH8A(FL_FOR_ARCH7VE | FL_ARCH8) > -#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | > FL_THUMB_DIV) > -#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) > +#define FL_FOR_ARCH2 FL_NOTM > +#define FL_FOR_ARCH3 (FL_FOR_ARCH2 | FL_MODE32) > +#define FL_FOR_ARCH3M(FL_FOR_ARCH3 | FL_ARCH3M) > +#define FL_FOR_ARCH4 (FL_FOR_ARCH3M | FL_ARCH4) > +#define FL_FOR_ARCH4T(FL_FOR_ARCH4 | FL_THUMB) > +#define FL_FOR_ARCH5 (FL_FOR_ARCH4 | FL_ARCH5) > +#define FL_FOR_ARCH5T(FL_FOR_ARCH5 | FL_THUMB) > +#define FL_FOR_ARCH5E(FL_FOR_ARCH5 | FL_ARCH5E) > +#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB) > +#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE > +#define FL_FOR_ARCH6 (FL_FOR_ARCH5TE | FL_ARCH6) > +#define FL_FOR_ARCH6JFL_FOR_ARCH6 > +#define FL_FOR_ARCH6K(FL_FOR_ARCH6 | FL_ARCH6K) > +#define FL_FOR_ARCH6ZFL_FOR_ARCH6 > +#define FL_FOR_ARCH6ZK FL_FOR_ARCH6K > +#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ) > +#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2) > +#define FL_FOR_ARCH6M(FL_FOR_ARCH6 & ~FL_NOTM) > +#define FL_FOR_ARCH7 
((FL_FOR_ARCH6T2 & ~FL_NOTM) > | FL_ARCH7) > +#define FL_FOR_ARCH7A(FL_FOR_ARCH7 | FL_NOTM | > FL_ARCH6K) > +#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | > FL_THUMB_DIV | FL_ARM_DIV) > +#define FL_FOR_ARCH7R(FL_FOR_ARCH7A | > FL_THUMB_DIV) > +#define FL_FOR_ARCH7M(FL_FOR_ARCH7 | > FL_THUMB_DIV) > +#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | > FL_ARCH7EM) > +#define FL_FOR_ARCH8A(FL_FOR_ARCH7VE | FL_ARCH8) > +#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | > FL_THUMB_DIV) > +#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) > > /* There are too many feature bits to fit in a single word so the set of > cpu and > fpu capabilities is a structure. A feature set is created and manipulated > > > Is this ok for stage3? > > Best regards, > > Thomas
[arm-embedded][PATCH, ARM 3/6] Fix indentation of FL_FOR_ARCH* definition after adding support for ARMv8-M
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 3:51 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH, ARM 3/8] Fix indentation of FL_FOR_ARCH* definition > after adding support for ARMv8-M > > Hi, > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch fixes the indentation of FL_FOR_ARCH* macros > definition following the patch to add support for ARMv8-M. Since this is > an obvious change, I'm not expecting a review and will commit it as soon > as the other patches in the series are accepted. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entry is as follows: > > > *** gcc/ChangeLog *** > > 2015-11-06 Thomas Preud'homme > > * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions. > > > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index 1371ee7..bf0d1b4 100644 > --- a/gcc/config/arm/arm-protos.h > +++ b/gcc/config/arm/arm-protos.h > @@ -391,32 +391,33 @@ extern bool arm_is_constant_pool_ref (rtx); > #define FL_TUNE (FL_WBUF | FL_VFPV2 | FL_STRONG | > FL_LDSCHED \ >| FL_CO_PROC) > > -#define FL_FOR_ARCH2 FL_NOTM > -#define FL_FOR_ARCH3 (FL_FOR_ARCH2 | FL_MODE32) > -#define FL_FOR_ARCH3M(FL_FOR_ARCH3 | FL_ARCH3M) > -#define FL_FOR_ARCH4 (FL_FOR_ARCH3M | FL_ARCH4) > -#define FL_FOR_ARCH4T(FL_FOR_ARCH4 | FL_THUMB) > -#define FL_FOR_ARCH5 (FL_FOR_ARCH4 | FL_ARCH5) > -#define FL_FOR_ARCH5T(FL_FOR_ARCH5 | FL_THUMB) > -#define FL_FOR_ARCH5E(FL_FOR_ARCH5 | FL_ARCH5E) > -#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB) > -#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE > -#define FL_FOR_ARCH6 (FL_FOR_ARCH5TE | FL_ARCH6) > -#define FL_FOR_ARCH6JFL_FOR_ARCH6 > -#define FL_FOR_ARCH6K(FL_FOR_ARCH6 | FL_ARCH6K) > -#define FL_FOR_ARCH6ZFL_FOR_ARCH6 > -#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ) > -#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2) > -#define FL_FOR_ARCH6M(FL_FOR_ARCH6 & ~FL_NOTM) > -#define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) | > FL_ARCH7) > -#define FL_FOR_ARCH7A(FL_FOR_ARCH7 | FL_NOTM | > FL_ARCH6K) > -#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | > FL_ARM_DIV) > -#define FL_FOR_ARCH7R(FL_FOR_ARCH7A | FL_THUMB_DIV) > -#define FL_FOR_ARCH7M(FL_FOR_ARCH7 | FL_THUMB_DIV) > -#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM) > -#define FL_FOR_ARCH8A(FL_FOR_ARCH7VE | FL_ARCH8) > -#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | > FL_THUMB_DIV) > -#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) > +#define FL_FOR_ARCH2 FL_NOTM > +#define FL_FOR_ARCH3 (FL_FOR_ARCH2 | FL_MODE32) > +#define FL_FOR_ARCH3M(FL_FOR_ARCH3 | FL_ARCH3M) > +#define FL_FOR_ARCH4 (FL_FOR_ARCH3M | FL_ARCH4) > +#define FL_FOR_ARCH4T(FL_FOR_ARCH4 | FL_THUMB) > +#define FL_FOR_ARCH5 (FL_FOR_ARCH4 | FL_ARCH5) > +#define FL_FOR_ARCH5T(FL_FOR_ARCH5 | FL_THUMB) > +#define FL_FOR_ARCH5E(FL_FOR_ARCH5 | FL_ARCH5E) > +#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB) > +#define FL_FOR_ARCH5TEJ FL_FOR_ARCH5TE > +#define FL_FOR_ARCH6 (FL_FOR_ARCH5TE | FL_ARCH6) > +#define FL_FOR_ARCH6JFL_FOR_ARCH6 > +#define FL_FOR_ARCH6K(FL_FOR_ARCH6 | FL_ARCH6K) > +#define FL_FOR_ARCH6ZFL_FOR_ARCH6 > +#define FL_FOR_ARCH6ZK FL_FOR_ARCH6K > +#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ) > +#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2) > +#define FL_FOR_ARCH6M(FL_FOR_ARCH6 
& ~FL_NOTM) > +#define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) > | FL_ARCH7) > +#define FL_FOR_ARCH7A(FL_FOR_ARCH7 | FL_NOTM | > FL_ARCH6K) > +#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | > FL_THUMB_DIV | FL_ARM_DIV) > +#define FL_FOR_ARCH7R(FL_FOR_ARCH7A | > FL_THUMB_DIV) > +#define FL_FOR_ARCH7M(FL_FOR_ARCH7 | > FL_THUMB_DIV) > +#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | > FL_ARCH7EM) > +#define FL_FOR_ARCH8A(FL_FOR_ARCH7VE | FL_ARCH8) > +#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | > FL_THUMB_DIV) > +#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) > > /* There are too many feature bits to fit in a single word so the set of > cpu and > fpu capabilities is a structure. A feature set is created and manipulated > > > Is this ok for stage3? > > Best regards, > > Thomas
[PATCH, ARM 3/8] Fix indentation of FL_FOR_ARCH* definition after adding support for ARMv8-M
Hi, This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch fixes the indentation of FL_FOR_ARCH* macros definition following the patch to add support for ARMv8-M. Since this is an obvious change, I'm not expecting a review and will commit it as soon as the other patches in the series are accepted. [1] For a quick overview of ARMv8-M please refer to the initial cover letter. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-11-06 Thomas Preud'homme * config/arm/arm-protos.h: Reindent FL_FOR_* macro definitions. diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 1371ee7..bf0d1b4 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -391,32 +391,33 @@ extern bool arm_is_constant_pool_ref (rtx); #define FL_TUNE(FL_WBUF | FL_VFPV2 | FL_STRONG | FL_LDSCHED \ | FL_CO_PROC) -#define FL_FOR_ARCH2 FL_NOTM -#define FL_FOR_ARCH3 (FL_FOR_ARCH2 | FL_MODE32) -#define FL_FOR_ARCH3M (FL_FOR_ARCH3 | FL_ARCH3M) -#define FL_FOR_ARCH4 (FL_FOR_ARCH3M | FL_ARCH4) -#define FL_FOR_ARCH4T (FL_FOR_ARCH4 | FL_THUMB) -#define FL_FOR_ARCH5 (FL_FOR_ARCH4 | FL_ARCH5) -#define FL_FOR_ARCH5T (FL_FOR_ARCH5 | FL_THUMB) -#define FL_FOR_ARCH5E (FL_FOR_ARCH5 | FL_ARCH5E) -#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB) -#define FL_FOR_ARCH5TEJFL_FOR_ARCH5TE -#define FL_FOR_ARCH6 (FL_FOR_ARCH5TE | FL_ARCH6) -#define FL_FOR_ARCH6J FL_FOR_ARCH6 -#define FL_FOR_ARCH6K (FL_FOR_ARCH6 | FL_ARCH6K) -#define FL_FOR_ARCH6Z FL_FOR_ARCH6 -#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ) -#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2) -#define FL_FOR_ARCH6M (FL_FOR_ARCH6 & ~FL_NOTM) -#define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7) -#define FL_FOR_ARCH7A (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K) -#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV) -#define FL_FOR_ARCH7R (FL_FOR_ARCH7A | FL_THUMB_DIV) -#define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_THUMB_DIV) -#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM) -#define FL_FOR_ARCH8A (FL_FOR_ARCH7VE | FL_ARCH8) -#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | FL_THUMB_DIV) -#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) +#define FL_FOR_ARCH2 FL_NOTM +#define FL_FOR_ARCH3 (FL_FOR_ARCH2 | FL_MODE32) +#define FL_FOR_ARCH3M (FL_FOR_ARCH3 | FL_ARCH3M) +#define FL_FOR_ARCH4 (FL_FOR_ARCH3M | FL_ARCH4) +#define FL_FOR_ARCH4T (FL_FOR_ARCH4 | FL_THUMB) +#define FL_FOR_ARCH5 (FL_FOR_ARCH4 | FL_ARCH5) +#define FL_FOR_ARCH5T (FL_FOR_ARCH5 | FL_THUMB) +#define FL_FOR_ARCH5E (FL_FOR_ARCH5 | FL_ARCH5E) +#define FL_FOR_ARCH5TE (FL_FOR_ARCH5E | FL_THUMB) +#define FL_FOR_ARCH5TEJFL_FOR_ARCH5TE +#define FL_FOR_ARCH6 (FL_FOR_ARCH5TE | FL_ARCH6) +#define FL_FOR_ARCH6J FL_FOR_ARCH6 +#define FL_FOR_ARCH6K (FL_FOR_ARCH6 | FL_ARCH6K) +#define FL_FOR_ARCH6Z FL_FOR_ARCH6 +#define FL_FOR_ARCH6ZK FL_FOR_ARCH6K +#define FL_FOR_ARCH6KZ (FL_FOR_ARCH6K | FL_ARCH6KZ) +#define FL_FOR_ARCH6T2 (FL_FOR_ARCH6 | FL_THUMB2) +#define FL_FOR_ARCH6M (FL_FOR_ARCH6 & ~FL_NOTM) +#define FL_FOR_ARCH7 ((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7) +#define FL_FOR_ARCH7A (FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K) +#define FL_FOR_ARCH7VE (FL_FOR_ARCH7A | FL_THUMB_DIV | FL_ARM_DIV) +#define FL_FOR_ARCH7R (FL_FOR_ARCH7A | FL_THUMB_DIV) +#define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_THUMB_DIV) +#define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM) +#define FL_FOR_ARCH8A (FL_FOR_ARCH7VE | FL_ARCH8) +#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | FL_THUMB_DIV) +#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) /* There 
are too many feature bits to fit in a single word so the set of cpu and fpu capabilities is a structure. A feature set is created and manipulated Is this ok for stage3? Best regards, Thomas
[arm-embedded][PATCH, ARM 2/6] Add support for ARMv8-M
Hi, We decided to apply the following patch to the ARM embedded 5 branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 3:25 PM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan; > Kyrylo Tkachov > Subject: [PATCH, ARM 2/6] Add support for ARMv8-M > > Hi, > > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch adds basic support for the new architecture, allowing > the new names to be accepted by -march and the compiler to behave > like ARMv6-M (for ARMv8-M Baseline) and or ARMv7-M (for ARMv8-M > Mainline). The changes are divided in two categories: > > * those to recognize the new architecture name > * those to keep the behavior as previous architectures > > Changes to make the compiler generate code with the new instructions > are in follow-up patches. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > > ChangeLog entries are as follow: > > *** gcc/ChangeLog *** > > 2015-11-23 Thomas Preud'homme > > * config/arm/arm-arches.def (armv8-m.base): Define new > architecture. > (armv8-m.main): Likewise. > (armv8-m.main+dsp): Likewise > * config/arm/arm-protos.h (FL_FOR_ARCH8M_BASE): Define. > (FL_FOR_ARCH8M_MAIN): Likewise. > * config/arm/arm-tables.opt: Regenerate. > * config/arm/bpabi.h: Add armv8-m.base, armv8-m.main and > armv8-m.main+dsp to BE8_LINK_SPEC. > * config/arm/arm.h (TARGET_HAVE_LDACQ): Exclude ARMv8-M. > (enum base_architecture): Add BASE_ARCH_8M_BASE and > BASE_ARCH_8M_MAIN. > (TARGET_ARM_V8M): Define. > * config/arm/arm.c (arm_arch_name): Increase size to work with > ARMv8-M > Baseline and Mainline. > (arm_option_override_internal): Also disable arm_restrict_it when > !arm_arch_notm. > (arm_file_start): Increase architecture buffer size. > * doc/invoke.texi: Document architectures armv8-m.base, armv8- > m.main > and armv8-m.main+dsp. > (mno-unaligned-access): Clarify that this is disabled by default for > ARMv8-M Baseline architecture as well. > > > *** gcc/testsuite/ChangeLog *** > > 2015-11-10 Thomas Preud'homme > > * lib/target-supports.exp: Generate > add_options_for_arm_arch_FUNC and > check_effective_target_arm_arch_FUNC_multilib for ARMv8-M > Baseline and > ARMv8-M Mainline architectures. > > > *** libgcc/ChangeLog *** > > 2015-11-10 Thomas Preud'homme > > * config/arm/lib1funcs.S (__ARM_ARCH__): Define to 8 for ARMv8- > M. 
> > > diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm- > arches.def > index > ddf6c3c330f91640d647d266f3d0e2350e7b986a..1d0301a3b9414127d38783 > 4584f3e42c225b6d3f 100644 > --- a/gcc/config/arm/arm-arches.def > +++ b/gcc/config/arm/arm-arches.def > @@ -57,6 +57,12 @@ ARM_ARCH("armv7-m", cortexm3, 7M, > ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ > ARM_ARCH("armv7e-m", cortexm4, 7EM, ARM_FSET_MAKE_CPU1 > (FL_CO_PROC | FL_FOR_ARCH7EM)) > ARM_ARCH("armv8-a", cortexa53, 8A, ARM_FSET_MAKE_CPU1 > (FL_CO_PROC | FL_FOR_ARCH8A)) > ARM_ARCH("armv8-a+crc",cortexa53, 8A, ARM_FSET_MAKE_CPU1 > (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A)) > +ARM_ARCH("armv8-m.base", cortexm0, 8M_BASE, > + ARM_FSET_MAKE_CPU1 ( > FL_FOR_ARCH8M_BASE)) > +ARM_ARCH("armv8-m.main", cortexm7, 8M_MAIN, > + ARM_FSET_MAKE_CPU1(FL_CO_PROC | > FL_FOR_ARCH8M_MAIN)) > +ARM_ARCH("armv8-m.main+dsp", cortexm7, 8M_MAIN, > + ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_ARCH7EM | > FL_FOR_ARCH8M_MAIN)) > ARM_ARCH("iwmmxt", iwmmxt, 5TE, ARM_FSET_MAKE_CPU1 > (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | > FL_IWMMXT)) > ARM_ARCH("iwmmxt2", iwmmxt2,5TE, ARM_FSET_MAKE_CPU1 > (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | > FL_IWMMXT | FL_IWMMXT2)) > > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index > e7328e79650739fca1c3e21b10c194feaa697465..dc7a0871c37bfda267 > 1f197bfe83c20c7888 100644 > --- a/gcc/config/arm/arm-protos.h > +++ b/gcc/config/arm/arm-protos.h > @@ -415,6 +415,8 @@ extern bool arm_is_constant_pool_ref (rtx); > #define FL_FOR_ARCH7M(FL_FOR_ARCH7 | FL_THUMB_DIV) > #define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH
[PATCH, ARM 2/6] Add support for ARMv8-M
Hi, This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch adds basic support for the new architecture, allowing the new names to be accepted by -march and the compiler to behave like ARMv6-M (for ARMv8-M Baseline) and or ARMv7-M (for ARMv8-M Mainline). The changes are divided in two categories: * those to recognize the new architecture name * those to keep the behavior as previous architectures Changes to make the compiler generate code with the new instructions are in follow-up patches. [1] For a quick overview of ARMv8-M please refer to the initial cover letter. ChangeLog entries are as follow: *** gcc/ChangeLog *** 2015-11-23 Thomas Preud'homme * config/arm/arm-arches.def (armv8-m.base): Define new architecture. (armv8-m.main): Likewise. (armv8-m.main+dsp): Likewise * config/arm/arm-protos.h (FL_FOR_ARCH8M_BASE): Define. (FL_FOR_ARCH8M_MAIN): Likewise. * config/arm/arm-tables.opt: Regenerate. * config/arm/bpabi.h: Add armv8-m.base, armv8-m.main and armv8-m.main+dsp to BE8_LINK_SPEC. * config/arm/arm.h (TARGET_HAVE_LDACQ): Exclude ARMv8-M. (enum base_architecture): Add BASE_ARCH_8M_BASE and BASE_ARCH_8M_MAIN. (TARGET_ARM_V8M): Define. * config/arm/arm.c (arm_arch_name): Increase size to work with ARMv8-M Baseline and Mainline. (arm_option_override_internal): Also disable arm_restrict_it when !arm_arch_notm. (arm_file_start): Increase architecture buffer size. * doc/invoke.texi: Document architectures armv8-m.base, armv8-m.main and armv8-m.main+dsp. (mno-unaligned-access): Clarify that this is disabled by default for ARMv8-M Baseline architecture as well. *** gcc/testsuite/ChangeLog *** 2015-11-10 Thomas Preud'homme * lib/target-supports.exp: Generate add_options_for_arm_arch_FUNC and check_effective_target_arm_arch_FUNC_multilib for ARMv8-M Baseline and ARMv8-M Mainline architectures. *** libgcc/ChangeLog *** 2015-11-10 Thomas Preud'homme * config/arm/lib1funcs.S (__ARM_ARCH__): Define to 8 for ARMv8-M. 
diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def index ddf6c3c330f91640d647d266f3d0e2350e7b986a..1d0301a3b9414127d387834584f3e42c225b6d3f 100644 --- a/gcc/config/arm/arm-arches.def +++ b/gcc/config/arm/arm-arches.def @@ -57,6 +57,12 @@ ARM_ARCH("armv7-m", cortexm3,7M, ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ ARM_ARCH("armv7e-m", cortexm4, 7EM, ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ARCH7EM)) ARM_ARCH("armv8-a", cortexa53, 8A,ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ARCH8A)) ARM_ARCH("armv8-a+crc",cortexa53, 8A, ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A)) +ARM_ARCH("armv8-m.base", cortexm0, 8M_BASE, +ARM_FSET_MAKE_CPU1 ( FL_FOR_ARCH8M_BASE)) +ARM_ARCH("armv8-m.main", cortexm7, 8M_MAIN, +ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_FOR_ARCH8M_MAIN)) +ARM_ARCH("armv8-m.main+dsp", cortexm7, 8M_MAIN, +ARM_FSET_MAKE_CPU1(FL_CO_PROC | FL_ARCH7EM | FL_FOR_ARCH8M_MAIN)) ARM_ARCH("iwmmxt", iwmmxt, 5TE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)) ARM_ARCH("iwmmxt2", iwmmxt2,5TE, ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2)) diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index e7328e79650739fca1c3e21b10c194feaa697465..dc7a0871c37bfda2671f197bfe83c20c7888 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -415,6 +415,8 @@ extern bool arm_is_constant_pool_ref (rtx); #define FL_FOR_ARCH7M (FL_FOR_ARCH7 | FL_THUMB_DIV) #define FL_FOR_ARCH7EM (FL_FOR_ARCH7M | FL_ARCH7EM) #define FL_FOR_ARCH8A (FL_FOR_ARCH7VE | FL_ARCH8) +#define FL_FOR_ARCH8M_BASE (FL_FOR_ARCH6M | FL_ARCH8 | FL_THUMB_DIV) +#define FL_FOR_ARCH8M_MAIN (FL_FOR_ARCH7M | FL_ARCH8) /* There are too many feature bits to fit in a single word so the set of cpu and fpu capabilities is a structure. A feature set is created and manipulated diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index 48aac41c37a35e27440f67863d4a92457916dd1b..2f24bf4c9a9ae1a12ba284fd160c34e577b2e4c6 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -416,10 +416,19 @@ EnumValue Enum(arm_arch) String(armv8-a+crc) Value(26) EnumValue -Enum(arm_arch) String(iwmmxt) Value(27) +Enum(arm_arch) String(armv8-m.base) Value(27) EnumValue -Enum(arm_arch) String(iwmmxt2) Value(28) +Enum(arm_arch) String(armv8-m.main) Value(28) + +EnumValue +Enum(arm_arch) String(armv8-m.main+dsp) Value(29) + +EnumValue +Enum(arm_arch) St
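The patch above only teaches the compiler to accept the new -march values; as a hedged aside, the following compile-time sketch (not part of the patch) shows how target code can tell the resulting configurations apart. It assumes the usual ACLE predefined macros (__ARM_ARCH, __ARM_ARCH_PROFILE, __ARM_ARCH_ISA_THUMB and __ARM_FEATURE_DSP) are provided by the compiler for these architectures.

/* Illustration only: distinguishing the new ARMv8-M targets at compile
   time via (assumed) ACLE predefined macros.  */
#if defined (__ARM_ARCH_PROFILE) && __ARM_ARCH_PROFILE == 'M' && __ARM_ARCH >= 8
# if __ARM_ARCH_ISA_THUMB == 1
   /* armv8-m.base: behaves like ARMv6-M plus the Baseline additions.  */
# elif defined (__ARM_FEATURE_DSP)
   /* armv8-m.main+dsp: Mainline with the ARMv7E-M DSP instructions.  */
# else
   /* armv8-m.main: Mainline, comparable to ARMv7-M.  */
# endif
#endif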
RE: [arm-embedded][PATCH, libgcc/ARM 1/6] Fix Thumb-1 only == ARMv6-M & Thumb-2 only == ARMv7-M assumptions
CH_5E__) || defined(__ARM_ARCH_5TE__) \ || defined(__ARM_ARCH_5TEJ__) #define HAVE_ARM_CLZ 1 #endif #ifdef L_clzsi2 -#if defined(__ARM_ARCH_6M__) +#if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 FUNC_START clzsi2 mov r1, #28 mov r3, #1 @@ -1753,7 +1753,7 @@ ARM_FUNC_START clzsi2 #ifdef L_clzdi2 #if !defined(HAVE_ARM_CLZ) -# if defined(__ARM_ARCH_6M__) +# if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 FUNC_START clzdi2 push{r4, lr} # else @@ -1778,7 +1778,7 @@ ARM_FUNC_START clzdi2 bl __clzsi2 # endif 2: -# if defined(__ARM_ARCH_6M__) +# if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 pop {r4, pc} # else RETLDM r4 @@ -1800,7 +1800,7 @@ ARM_FUNC_START clzdi2 #endif /* L_clzdi2 */ #ifdef L_ctzsi2 -#if defined(__ARM_ARCH_6M__) +#if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 FUNC_START ctzsi2 neg r1, r0 and r0, r0, r1 @@ -1915,7 +1915,7 @@ ARM_FUNC_START ctzsi2 /* Don't bother with the old interworking routines for Thumb-2. */ /* ??? Maybe only omit these on "m" variants. */ -#if !defined(__thumb2__) && !defined(__ARM_ARCH_6M__) +#if __ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 #if defined L_interwork_call_via_rX @@ -2150,11 +2150,12 @@ LSYM(Lchange_\register): #endif /* Arch supports thumb. */ #ifndef __symbian__ -#ifndef __ARM_ARCH_6M__ +/* The condition here must match the one in gcc/config/arm/elf.h. */ +#if __ARM_ARCH_ISA_ARM || __ARM_ARCH_ISA_THUMB != 1 #include "ieee754-df.S" #include "ieee754-sf.S" #include "bpabi.S" -#else /* __ARM_ARCH_6M__ */ +#else /* !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 */ #include "bpabi-v6m.S" -#endif /* __ARM_ARCH_6M__ */ +#endif /* !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 */ #endif /* !__symbian__ */ diff --git a/libgcc/config/arm/libunwind.S b/libgcc/config/arm/libunwind.S index cac102231914aa85320ff579168a17afa8479f67..393ec8aaee43948154956b72960860902400df50 100644 --- a/libgcc/config/arm/libunwind.S +++ b/libgcc/config/arm/libunwind.S @@ -58,7 +58,7 @@ #endif #endif -#ifdef __ARM_ARCH_6M__ +#if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1 /* r0 points to a 16-word block. Upload these values to the actual core state. */ @@ -169,7 +169,7 @@ FUNC_START gnu_Unwind_Save_WMMXC UNPREFIX \name .endm -#else /* !__ARM_ARCH_6M__ */ +#else /* __ARM_ARCH_ISA_ARM || __ARM_ARCH_ISA_THUMB != 1 */ /* r0 points to a 16-word block. Upload these values to the actual core state. */ @@ -351,7 +351,7 @@ ARM_FUNC_START gnu_Unwind_Save_WMMXC UNPREFIX \name .endm -#endif /* !__ARM_ARCH_6M__ */ +#endif /* __ARM_ARCH_ISA_ARM || __ARM_ARCH_ISA_THUMB != 1 */ UNWIND_WRAPPER _Unwind_RaiseException 1 UNWIND_WRAPPER _Unwind_Resume 1 diff --git a/libgcc/config/arm/t-softfp b/libgcc/config/arm/t-softfp index 4ede438baf6a297737e52db00395f6c3a359f681..554ec9bc47b04445e79e84b1f957bf88680c08d1 100644 --- a/libgcc/config/arm/t-softfp +++ b/libgcc/config/arm/t-softfp @@ -1,2 +1,2 @@ -softfp_wrap_start := '\#ifdef __ARM_ARCH_6M__' +softfp_wrap_start := '\#if !__ARM_ARCH_ISA_ARM && __ARM_ARCH_ISA_THUMB == 1' softfp_wrap_end := '\#endif' Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Thursday, December 17, 2015 1:58 PM > To: gcc-patches@gcc.gnu.org > Subject: [arm-embedded][PATCH, libgcc/ARM 1/6] Fix Thumb-1 only == > ARMv6-M & Thumb-2 only == ARMv7-M assumptions > > Hi, > > We decided to apply the following patch to the ARM embedded 5 branch. > This is *not* intended for trunk for now. We will send a separate email > for trunk. 
> > This patch is part of a patch series to add support for ARMv8-M[1] to GCC. > This specific patch fixes some assumptions related to M profile > architectures. Currently GCC (mostly libgcc) contains several assumptions > that the only ARM architecture with Thumb-1 only instructions is ARMv6- > M and the only one with Thumb-2 only instructions is ARMv7-M. ARMv8- > M [1] make this wrong since ARMv8-M baseline is also (mostly) Thumb-1 > only and ARMv8-M mainline is also Thumb-2 only. This patch replace > checks for __ARM_ARCH_*__ for checks against > __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM instead. For > instance, Thumb-1 only can be checked with > #if !defined(__ARM_ARCH_ISA_ARM) && (__ARM_ARCH_ISA_THUMB > == 1). It also fixes the guard for DIV code to not apply to ARMv8-M > Baseline since it uses Thumb-2 instructions. > > [1] For a quick overview of ARMv8-M please refer to the initial cover > letter. > > ChangeLog entries are as follow: &g
[arm-embedded][PATCH, libgcc/ARM 1/6] Fix Thumb-1 only == ARMv6-M & Thumb-2 only == ARMv7-M assumptions
Hi, We decided to apply the following patch to the ARM embedded 5 branch. This is *not* intended for trunk for now. We will send a separate email for trunk. This patch is part of a patch series to add support for ARMv8-M[1] to GCC. This specific patch fixes some assumptions related to M profile architectures. Currently GCC (mostly libgcc) contains several assumptions that the only ARM architecture with Thumb-1 only instructions is ARMv6-M and the only one with Thumb-2 only instructions is ARMv7-M. ARMv8-M [1] make this wrong since ARMv8-M baseline is also (mostly) Thumb-1 only and ARMv8-M mainline is also Thumb-2 only. This patch replace checks for __ARM_ARCH_*__ for checks against __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM instead. For instance, Thumb-1 only can be checked with #if !defined(__ARM_ARCH_ISA_ARM) && (__ARM_ARCH_ISA_THUMB == 1). It also fixes the guard for DIV code to not apply to ARMv8-M Baseline since it uses Thumb-2 instructions. [1] For a quick overview of ARMv8-M please refer to the initial cover letter. ChangeLog entries are as follow: *** gcc/ChangeLog *** 2015-11-13 Thomas Preud'homme * config/arm/elf.h: Use __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM to decide whether to prevent some libgcc routines being included for some multilibs rather than __ARM_ARCH_6M__ and add comment to indicate the link between this condition and the one in libgcc/config/arm/lib1func.S. * config/arm/arm.h (TARGET_ARM_V6M): Add check to TARGET_ARM_ARCH. (TARGET_ARM_V7M): Likewise. *** gcc/testsuite/ChangeLog *** 2015-11-10 Thomas Preud'homme * lib/target-supports.exp (check_effective_target_arm_cortex_m): Use __ARM_ARCH_ISA_ARM to test for Cortex-M devices. *** libgcc/ChangeLog *** 2015-11-13 Thomas Preud'homme * config/arm/bpabi-v6m.S: Fix header comment to mention Thumb-1 rather than ARMv6-M. * config/arm/lib1funcs.S (__prefer_thumb__): Define among other cases for all Thumb-1 only targets. (__only_thumb1__): Define for all Thumb-1 only targets. (THUMB_LDIV0): Test for __only_thumb1__ rather than __ARM_ARCH_6M__. (EQUIV): Likewise. (ARM_FUNC_ALIAS): Likewise. (umodsi3): Add check to __only_thumb1__ to guard the idiv version. (modsi3): Likewise. (HAVE_ARM_CLZ): Test for __only_thumb1__ rather than __ARM_ARCH_6M__. (clzsi2): Likewise. (clzdi2): Likewise. (ctzsi2): Likewise. (L_interwork_call_via_rX): Test for __ARM_ARCH_ISA_ARM rather than __ARM_ARCH_6M__ in guard for checking whether it is defined. (final includes): Test for __only_thumb1__ rather than __ARM_ARCH_6M__ and add comment to indicate the connection between this condition and the one in gcc/config/arm/elf.h. * config/arm/libunwind.S: Test for __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM rather than __ARM_ARCH_6M__. * config/arm/t-softfp: Likewise. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 6ed8ad3..06abcf3 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2181,8 +2181,10 @@ extern int making_const_table; #define TARGET_ARM_ARCH\ (arm_base_arch) \ -#define TARGET_ARM_V6M (!arm_arch_notm && !arm_arch_thumb2) -#define TARGET_ARM_V7M (!arm_arch_notm && arm_arch_thumb2) +#define TARGET_ARM_V6M (TARGET_ARM_ARCH == BASE_ARCH_6M && !arm_arch_notm \ + && !arm_arch_thumb2) +#define TARGET_ARM_V7M (TARGET_ARM_ARCH == BASE_ARCH_7M && !arm_arch_notm \ + && arm_arch_thumb2) /* The highest Thumb instruction set version supported by the chip. 
*/ #define TARGET_ARM_ARCH_ISA_THUMB \ diff --git a/gcc/config/arm/elf.h b/gcc/config/arm/elf.h index 3795728..579a580 100644 --- a/gcc/config/arm/elf.h +++ b/gcc/config/arm/elf.h @@ -148,8 +148,9 @@ while (0) /* Horrible hack: We want to prevent some libgcc routines being included - for some multilibs. */ -#ifndef __ARM_ARCH_6M__ + for some multilibs. The condition should match the one in + libgcc/config/arm/lib1funcs.S. */ +#if __ARM_ARCH_ISA_ARM || __ARM_ARCH_ISA_THUMB != 1 #undef L_fixdfsi #undef L_fixunsdfsi #undef L_truncdfsf2 diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 254c4e3..6cf7ee1 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -3210,10 +3210,8 @@ proc check_effective_target_arm_cortex_m { } { return 0 } return [check_no_compiler_messages arm_cortex_m assembly { - #if !defined(__ARM_ARCH_7M__) \ -&& !defined (__ARM_ARCH_7EM__) \ -&& !defined (__ARM_ARCH_6M__) - #error !__ARM_ARCH_7M__ && !__ARM_ARCH_7EM__ && !__ARM_ARCH_6M__ + #i
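To make the rewrite concrete, here is a small sketch (not taken from the patch; THUMB1_ONLY is just an illustrative name, similar in spirit to the __only_thumb1__ macro the patch introduces in lib1funcs.S) contrasting the old architecture-name test with the new ISA-based test, assuming the compiler predefines the ACLE macros __ARM_ARCH_ISA_ARM and __ARM_ARCH_ISA_THUMB as described above.

/* Old style: enumerate architecture names, which silently misses
   ARMv8-M Baseline.  */
#if defined (__ARM_ARCH_6M__)
# define THUMB1_ONLY 1
#endif

/* New style: test the property itself - no ARM ISA and only Thumb-1 -
   so ARMv6-M and ARMv8-M Baseline are both covered, while ARMv8-M
   Mainline falls on the Thumb-2 side together with ARMv7-M.  */
#if !defined (__ARM_ARCH_ISA_ARM) && __ARM_ARCH_ISA_THUMB == 1
# define THUMB1_ONLY 1
#endif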
[PATCH, GCC, V8M 0/6] Add support for ARMv8-M
Hi,

I'll be posting a patch series intended for trunk whose aim is to add support for ARMv8-M. This patch series does not include changes to support the security extensions [nor does it include atomics for ARMv8-M Baseline]. Those will be posted as a separate patch series.

=== Quick overview of ARMv8-M ===

ARMv8-M has two profiles[1]: Baseline and Mainline. In terms of features they can be defined as:

ARMv8-M Baseline (armv8-m.base):
* All ARMv6-M features
* 16-bit immediate moves
* Wide branch
* Compare & branch if (not) zero
* Integer divide
* Load/store exclusives
* Atomic load/stores
* Security extensions

ARMv8-M Mainline (armv8-m.main):
* All ARMv7-M features
* Atomic load/stores
* Security extensions

ARMv8-M Mainline with DSP extension (armv8-m.main+dsp):
* ARMv8-M Mainline
* The instructions added to ARMv7E-M on top of ARMv7-M

Note that although certain architectural features of the security extensions are optional for cores implementing ARMv8-M, some of the new instructions are always available in the architecture. Note also that only the security extension instructions are new; all other instructions have previously been available in other ARM architecture profiles.

[1] http://www.arm.com/products/processors/instruction-set-architectures/armv8-m-architecture.php
[PATCH, ARM, 3/3] Add multilib support for bare-metal ARM architectures
Hi Ramana, As suggested in your initial answer to this thread, we updated the multilib patch provided in ARM's embedded branch to be up-to-date with regards to supported CPUs in GCC. As to the need to modify Makefile.in and configure.ac, this is because the patch aims to let control to the user as to what multilib should be built. To this effect, it takes a list of architecture at configure time and that list needs to be passed down to t-baremetal Makefile to set the multilib variables appropriately. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-12-15 Thomas Preud'homme * Makefile.in (with_multilib_list): New variables substituted by configure. * config.gcc: Handle bare-metal multilibs in --with-multilib-list option. * config/arm/t-baremetal: New file. * configure.ac (with_multilib_list): New AC_SUBST. * configure: Regenerate. * doc/install.texi (--with-multilib-list): Update description for arm*-*-* targets to mention bare-metal multilibs. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 1f698798aa2df3f44d6b3a478bb4bf48e9fa7372..18b790afa114aa7580be0662d3ac9ffbc94e919d 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -546,6 +546,7 @@ lang_opt_files=@lang_opt_files@ $(srcdir)/c-family/c.opt $(srcdir)/common.opt lang_specs_files=@lang_specs_files@ lang_tree_files=@lang_tree_files@ target_cpu_default=@target_cpu_default@ +with_multilib_list=@with_multilib_list@ OBJC_BOEHM_GC=@objc_boehm_gc@ extra_modes_file=@extra_modes_file@ extra_opt_files=@extra_opt_files@ diff --git a/gcc/config.gcc b/gcc/config.gcc index af948b5e203f6b4f53dfca38e9d02d060d00c97b..d8098ed3cefacd00cb10590db1ec86d48e9fcdbc 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3787,15 +3787,25 @@ case "${target}" in default) ;; *) - echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2 - exit 1 + for arm_multilib in ${arm_multilibs}; do + case ${arm_multilib} in + armv6-m | armv7-m | armv7e-m | armv7-r | armv8-m.base | armv8-m.main) + tmake_profile_file="arm/t-baremetal" + ;; + *) + echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2 + exit 1 + ;; + esac + done ;; esac if test "x${tmake_profile_file}" != x ; then - # arm/t-aprofile is only designed to work - # without any with-cpu, with-arch, with-mode, - # with-fpu or with-float options. + # arm/t-aprofile and arm/t-baremetal are only + # designed to work without any with-cpu, + # with-arch, with-mode, with-fpu or with-float + # options. if test "x$with_arch" != x \ || test "x$with_cpu" != x \ || test "x$with_float" != x \ diff --git a/gcc/config/arm/t-baremetal b/gcc/config/arm/t-baremetal new file mode 100644 index ..ffd29815e6ec22c747e77747ed9b69e0ae21b63a --- /dev/null +++ b/gcc/config/arm/t-baremetal @@ -0,0 +1,130 @@ +# A set of predefined MULTILIB which can be used for different ARM targets. +# Via the configure option --with-multilib-list, user can customize the +# final MULTILIB implementation. 
+ +comma := , + +with_multilib_list := $(subst $(comma), ,$(with_multilib_list + +MULTILIB_OPTIONS = mthumb/marm +MULTILIB_DIRNAMES = thumb arm +MULTILIB_OPTIONS += march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7/march=armv8-m.base/march=armv8-m.main +MULTILIB_DIRNAMES += armv6-m armv7-m armv7e-m armv7-ar armv8-m.base armv8-m.main +MULTILIB_OPTIONS += mfloat-abi=softfp/mfloat-abi=hard +MULTILIB_DIRNAMES += softfp fpu +MULTILIB_OPTIONS += mfpu=fpv5-sp-d16/mfpu=fpv5-d16/mfpu=fpv4-sp-d16/mfpu=vfpv3-d16 +MULTILIB_DIRNAMES += fpv5-sp-d16 fpv5-d16 fpv4-sp-d16 vfpv3-d16 + +MULTILIB_MATCHES = march?armv6s-m=mcpu?cortex-m0 +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0.small-multiply +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0plus +MULTILIB_MATCHES += march?armv6s-m=mcpu?cortex-m0plus.smal
[PATCH, GCC/ARM, 2/3] Error out for incompatible ARM multilibs
Currently in config.gcc, only the first multilib in a multilib list is checked for validity and the following elements are ignored due to the break which only breaks out of loop in shell. A loop is also done over the multilib list elements despite no combination being legal. This patch rework the code to address both issues. ChangeLog entry is as follows: 2015-11-24 Thomas Preud'homme * config.gcc: Error out when conflicting multilib is detected. Do not loop over multilibs since no combination is legal. diff --git a/gcc/config.gcc b/gcc/config.gcc index 59aee2c..be3c720 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3772,38 +3772,40 @@ case "${target}" in # Add extra multilibs if test "x$with_multilib_list" != x; then arm_multilibs=`echo $with_multilib_list | sed -e 's/,/ /g'` - for arm_multilib in ${arm_multilibs}; do - case ${arm_multilib} in - aprofile) + case ${arm_multilibs} in + aprofile) # Note that arm/t-aprofile is a # stand-alone make file fragment to be # used only with itself. We do not # specifically use the # TM_MULTILIB_OPTION framework because # this shorthand is more - # pragmatic. Additionally it is only - # designed to work without any - # with-cpu, with-arch with-mode + # pragmatic. + tmake_profile_file="arm/t-aprofile" + ;; + default) + ;; + *) + echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2 + exit 1 + ;; + esac + + if test "x${tmake_profile_file}" != x ; then + # arm/t-aprofile is only designed to work + # without any with-cpu, with-arch, with-mode, # with-fpu or with-float options. - if test "x$with_arch" != x \ - || test "x$with_cpu" != x \ - || test "x$with_float" != x \ - || test "x$with_fpu" != x \ - || test "x$with_mode" != x ; then - echo "Error: You cannot use any of --with-arch/cpu/fpu/float/mode with --with-multilib-list=aprofile" 1>&2 - exit 1 - fi - tmake_file="${tmake_file} arm/t-aprofile" - break - ;; - default) - ;; - *) - echo "Error: --with-multilib-list=${with_multilib_list} not supported." 1>&2 - exit 1 - ;; - esac - done + if test "x$with_arch" != x \ + || test "x$with_cpu" != x \ + || test "x$with_float" != x \ + || test "x$with_fpu" != x \ + || test "x$with_mode" != x ; then + echo "Error: You cannot use any of --with-arch/cpu/fpu/float/mode with --with-multilib-list=${arm_multilib}" 1>&2 + exit 1 + fi + + tmake_file="${tmake_file} ${tmake_profile_file}" + fi fi ;; Tested with the following multilib lists: + foo -> "Error: --with-multilib-list=foo not supported" as expected + default,aprofile -> "Error: --with-multilib-list=default,aprofile not supported" as expected + aprofile,default -> "Error: --with-multilib-list=aprofile,default not supported" as expected + (nothing) -> libraries in $installdir/arm-none-eabi/lib{,fpu,thumb} + default -> libraries in $installdir/arm-none-eabi/lib{,fpu,thumb} as expected + aprofile -> $installdir/arm-none-eabi/lib contains all supported multilib Is this ok for trunk? Best regards, Thomas
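As an aside, the pitfall described above is easy to misread when thinking in C terms, so here is a standalone C analogy (illustration only, not code from the patch, and the list contents are made up): in C, break inside a switch leaves only the switch and the loop goes on to validate the next element, whereas in the original shell code the break inside the case arm left the enclosing for loop, so elements after the first valid one were never checked.

#include <stdio.h>

int main (void)
{
  /* Hypothetical --with-multilib-list=aprofile,foo split into elements.  */
  const char *multilibs[] = { "aprofile", "foo" };
  unsigned i;

  for (i = 0; i < sizeof multilibs / sizeof multilibs[0]; i++)
    switch (multilibs[i][0])    /* toy dispatch on the first character */
      {
      case 'a':                 /* "aprofile" */
        printf ("valid multilib: %s\n", multilibs[i]);
        break;                  /* leaves only the switch; the for loop
                                   continues and "foo" is still diagnosed */
      case 'd':                 /* "default" */
        break;
      default:
        printf ("Error: --with-multilib-list=%s not supported\n", multilibs[i]);
        return 1;
      }
  return 0;
}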
[PATCH, ARM, 1/3] Document --with-multilib-list for arm*-*-* targets
Currently, the documentation for --with-multilib-list in gcc/doc/install.texi only mentions sh*-*-* and x86-64-*-linux* targets. However, arm*-*-* targets also support this option. This patch adds documentation for the meaning of this option for arm*-*-* targets. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-12-09 Thomas Preud'homme * doc/install.texi (--with-multilib-list): Describe the meaning of the option for arm*-*-* targets. diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 57399ed..2c93eb0 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1102,9 +1102,19 @@ sysv, aix. @item --with-multilib-list=@var{list} @itemx --without-multilib-list Specify what multilibs to build. -Currently only implemented for sh*-*-* and x86-64-*-linux*. +Currently only implemented for arm*-*-*, sh*-*-* and x86-64-*-linux*. @table @code +@item arm*-*-* +@var{list} is either @code{default} or @code{aprofile}. Specifying +@code{default} is equivalent to omitting this option while specifying +@code{aprofile} builds multilibs for each combination of ISA (@code{-marm} or +@code{-mthumb}), architecture (@code{-march=armv7-a}, @code{-march=armv7ve}, +or @code{-march=armv8-a}), FPU available (none, @code{-mfpu=vfpv3-d16}, +@code{neon}, @code{vfpv4-d16}, @code{neon-vfpv4} or @code{neon-fp-armv8} +depending on architecture) and floating-point ABI (@code{-mfloat-abi=softfp} +or @code{-mfloat-abi=hard}). + @item sh*-*-* @var{list} is a comma separated list of CPU names. These must be of the form @code{sh*} or @code{m*} (in which case they match the compiler option PDF builds fine out of the updated file and looks as expected. Is this ok for trunk? Best regards, Thomas
[PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0
During the reorg pass, thumb1_reorg () is tasked with rewriting mov rd, rn to subs rd, rn, 0 so as to avoid a separate compare-against-0 instruction before a conditional branch based on it. The cmp itself is elided in the C output template of the cbranchsi4_insn instruction. When the condition is met, the source register (rn) is also propagated into the comparison in place of the destination register (rd). However, right now thumb1_reorg () only looks for a mov followed by a cbranchsi but does not check whether the comparison in cbranchsi is against the constant 0. This is not safe because a non-clobbering instruction that modifies the source register could exist between the mov and the comparison. This is what happens here with a post increment of the source register after the mov, which skips the &a[i] == &a[1] comparison for iteration i == 1. This patch fixes the issue by checking that the comparison is against the constant 0. ChangeLog entry is as follows: *** gcc/ChangeLog *** 2015-12-07 Thomas Preud'homme * config/arm/arm.c (thumb1_reorg): Check that the comparison is against the constant 0. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 42bf272..49c0a06 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -17195,7 +17195,7 @@ thumb1_reorg (void) FOR_EACH_BB_FN (bb, cfun) { rtx dest, src; - rtx pat, op0, set = NULL; + rtx cmp, op0, op1, set = NULL; rtx_insn *prev, *insn = BB_END (bb); bool insn_clobbered = false; @@ -17208,8 +17208,13 @@ thumb1_reorg (void) continue; /* Get the register with which we are comparing. */ - pat = PATTERN (insn); - op0 = XEXP (XEXP (SET_SRC (pat), 0), 0); + cmp = XEXP (SET_SRC (PATTERN (insn)), 0); + op0 = XEXP (cmp, 0); + op1 = XEXP (cmp, 1); + + /* Check that comparison is against ZERO. */ + if (!CONST_INT_P (op1) || INTVAL (op1) != 0) + continue; /* Find the first flag setting insn before INSN in basic block BB. */ gcc_assert (insn != BB_HEAD (bb)); @@ -17249,7 +17254,7 @@ thumb1_reorg (void) PATTERN (prev) = gen_rtx_SET (dest, src); INSN_CODE (prev) = -1; /* Set test register in INSN to dest. */ - XEXP (XEXP (SET_SRC (pat), 0), 0) = copy_rtx (dest); + XEXP (cmp, 0) = copy_rtx (dest); INSN_CODE (insn) = -1; } } Testsuite shows no regression when run for arm-none-eabi with -mcpu=cortex-m0 -mthumb. Is this ok for trunk? Best regards, Thomas
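For readers without the testcase at hand, here is a rough standalone sketch of the kind of loop that trips the old code; it is an illustration of the pattern described above, not the actual gcc.c-torture/execute/loop-2b.c source, and the function name is made up.

int a[4];

int
count_until_second_element (void)
{
  int *p = a;
  int n = 0;

  /* The exit test uses the value of p from before the post-increment.
     A peephole that instead folds the comparison onto the register
     holding the already-incremented pointer would miss the exit when
     p reaches &a[1] and loop too far.  */
  do
    n++;
  while (p++ != &a[1]);

  return n;
}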
[PATCH, testsuite] Fix PR68632: gcc.target/arm/lto/pr65837 failure on M profile ARM targets
gcc.target/arm/lto/pr65837 fails on M profile ARM targets because of lack of neon instructions. This patch adds the necessary arm_neon_ok effective target requirement to avoid running this test for such targets. ChangeLog entry is as follows: * gcc/testsuite/ChangeLog *** 2015-12-08 Thomas Preud'homme PR testsuite/68632 * gcc.target/arm/lto/pr65837_0.c: Require arm_neon_ok effective target. diff --git a/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c b/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c index 000fc2a..fcc26a1 100644 --- a/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c +++ b/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c @@ -1,4 +1,5 @@ /* { dg-lto-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ /* { dg-lto-options {{-flto -mfpu=neon}} } */ /* { dg-suppress-ld-options {-mfpu=neon} } */ Testcase fails without the patch and succeeds with. Is this ok for trunk? Best regards, Thomas
[PATCH, testsuite] Fix PR68629: attr-simd-3.c failure on arm-none-eabi targets
c-c++-common/attr-simd-3.c fails to compile on arm-none-eabi targets due to -fcilkplus needing -pthread which is not available for those targets. This patch solves this issue by adding a condition to the cilkplus effective target that compiling with -fcilkplus succeeds and requires cilkplus as an effective target for attr-simd-3.c testcase. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2015-12-08 Thomas Preud'homme PR testsuite/68629 * lib/target-supports.exp (check_effective_target_cilkplus): Also check that compiling with -fcilkplus does not give an error. * c-c++-common/attr-simd-3.c: Require cilkplus effective target. diff --git a/gcc/testsuite/c-c++-common/attr-simd-3.c b/gcc/testsuite/c-c++-common/attr-simd-3.c index d61ba82..1970c67 100644 --- a/gcc/testsuite/c-c++-common/attr-simd-3.c +++ b/gcc/testsuite/c-c++-common/attr-simd-3.c @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-require-effective-target "cilkplus" } */ /* { dg-options "-fcilkplus" } */ /* { dg-prune-output "undeclared here \\(not in a function\\)|\[^\n\r\]* was not declared in this scope" } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 4e349e9..95b903c 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -1432,7 +1432,12 @@ proc check_effective_target_cilkplus { } { if { [istarget avr-*-*] } { return 0; } -return 1 +return [ check_no_compiler_messages_nocache fcilkplus_available executable { + #ifdef __cplusplus + extern "C" + #endif + int dummy; + } "-fcilkplus" ] } proc check_linker_plugin_available { } { Testsuite shows no regression when run with + an arm-none-eabi GCC cross-compiler targeting Cortex-M3 + a bootstrapped x86_64-linux-gnu GCC native compiler Is this ok for trunk? Best regards, Thomas
[arm-embedded][PATCHv2, ARM, libgcc] New aeabi_idiv function for armv6-m
We decided to apply this to ARM/embedded-5-branch. Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Andre Vieira > Sent: Wednesday, October 28, 2015 1:03 AM > To: gcc-patches@gcc.gnu.org > Subject: Re: [PING][PATCHv2, ARM, libgcc] New aeabi_idiv function for > armv6-m > > Ping. > > BR, > Andre > > On 13/10/15 18:01, Andre Vieira wrote: > > This patch ports the aeabi_idiv routine from Linaro Cortex-Strings > > (https://git.linaro.org/toolchain/cortex-strings.git), which was > > contributed by ARM under Free BSD license. > > > > The new aeabi_idiv routine is used to replace the one in > > libgcc/config/arm/lib1funcs.S. This replacement happens within the > > Thumb1 wrapper. The new routine is under LGPLv3 license. > > > > The main advantage of this version is that it can improve the > > performance of the aeabi_idiv function for Thumb1. This solution will > > also increase the code size. So it will only be used if > > __OPTIMIZE_SIZE__ is not defined. > > > > Make check passed for armv6-m. > > > > libgcc/ChangeLog: > > 2015-08-10 Hale Wang > > Andre Vieira > > > > * config/arm/lib1funcs.S: Add new wrapper. > >
FW: [PATCH, ARM/testsuite] Fix thumb2-slow-flash-data.c failures
[Forwarding to gcc-patches, doh!] Best regards, Thomas --- Begin Message --- Hi, ARM-specific thumb2-slow-flash-data.c testcase shows 2 failures when running for arm-none-eabi with -mcpu=cortex-m7: FAIL: gcc.target/arm/thumb2-slow-flash-data.c (test for excess errors) FAIL: gcc.target/arm/thumb2-slow-flash-data.c scan-assembler-times movt 13 The first one is due to a missing type specifier in the declaration of labelref while the second one is due to different constant synthesis as a result of a different tuning for the CPU selected. This patch fixes these issues by adding the missing type specifier and checking for .word and similar directive instead of the number of movt. The new test passes for all of -mcpu=cortex-m{3,4,7} but fail when removing the -mslow-flash-data switch. ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2015-11-04 Thomas Preud'homme * gcc.target/arm/thumb2-slow-flash-data.c: Add missing typespec for labelref and check use of constant pool by looking for .word and similar directives. diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c index 9852ea5..089a72b 100644 --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c @@ -50,7 +50,7 @@ int foo (int a, int b) { int i; - volatile *labelref = &&label1; + volatile int *labelref = &&label1; if (a > b) { @@ -70,5 +70,4 @@ label1: return a + b; } -/* { dg-final { scan-assembler-times "movt" 13 } } */ -/* { dg-final { scan-assembler-times "movt.*LC0\\+4" 1 } } */ +/* { dg-final { scan-assembler-not "\\.(float|l\\?double|\d?byte|short|int|long|quad|word)\\s+\[^.\]" } } */ Is this ok for trunk? Best regards, Thomas --- End Message ---
[PATCH, ARM] List Cs and US constraints as being used
Hi, The header in gcc/config/arm/constraints.md lists all the ARM-specific constraints defined and the states they apply to, but misses a couple of them. This patch adds the missing Cs and US constraints to the list. Patch was tested by verifying that the arm-none-eabi-gcc cross-compiler can still be built (i.e. the comment remains a comment). diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index 42935a4..2d9ffb8 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/config/arm/constraints.md @@ -21,7 +21,7 @@ ;; The following register constraints have been used: ;; - in ARM/Thumb-2 state: t, w, x, y, z ;; - in Thumb state: h, b -;; - in both states: l, c, k, q, US +;; - in both states: l, c, k, q, Cs, Ts, US ;; In ARM state, 'l' is an alias for 'r' ;; 'f' and 'v' were previously used for FPA and MAVERICK registers. Committed as obvious with the following ChangeLog entry: 2015-08-25 Thomas Preud'homme * config/arm/constraints.md: Also list Cs and US ARM-specific constraints as used. Best regards, Thomas
RE: [PATCH] Obvious fix for PR66828: left shift with undefined behavior in bswap pass
Hi, > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Tuesday, July 28, 2015 3:04 PM > > > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > > > ChangeLog entry is as follows: > > > > 2015-07-28 Thomas Preud'homme > > > > PR tree-optimization/66828 > > * tree-ssa-math-opts.c (perform_symbolic_merge): Change type > of > > inc > > from int64_t to uint64_t. Can I backport this change to GCC 5 branch? The patch applies cleanly on GCC 5 and shows no regression on a native x86_64-linux-gnu bootstrapped GCC and an arm-none-eabi GCC cross-compiler. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index ba37d96..a301c23 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,12 @@ +2015-08-04 Thomas Preud'homme + + Backport from mainline + 2015-07-28 Thomas Preud'homme + + PR tree-optimization/66828 + * tree-ssa-math-opts.c (perform_symbolic_merge): Change type of inc + from int64_t to uint64_t. + 2015-08-03 John David Anglin PR target/67060 diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index c22a677..c699dcadb 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -1856,7 +1856,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1, the same base (array, structure, ...). */ if (gimple_assign_rhs1 (source_stmt1) != gimple_assign_rhs1 (source_stmt2)) { - int64_t inc; + uint64_t inc; HOST_WIDE_INT start_sub, end_sub, end1, end2, end; struct symbolic_number *toinc_n_ptr, *n_end; Best regards, Thomas
[PATCH, loop-invariant] Fix PR67043: -fcompare-debug failure with -O3
Hi, Since commit r223113, loop-invariant pass rely on luids to determine if an invariant can be hoisted out of a loop without introducing temporaries. However, nothing is made to ensure luids are up-to-date. This patch adds a DF_LIVE problem and mark all blocks as dirty before using luids to ensure these will be recomputed. ChangeLog entries are as follows: 2015-07-31 Thomas Preud'homme PR tree-optimization/67043 * loop-invariant.c (find_defs): Force recomputation of all luids. 2015-07-29 Thomas Preud'homme PR tree-optimization/67043 * gcc.dg/pr67043.c: New test. Note: the testcase was heavily reduced from the Linux kernel sources by Markus Trippelsdorf and formatted to follow GNU code style. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index 1fdb84d..fc53e09 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -676,6 +676,8 @@ find_defs (struct loop *loop) df_remove_problem (df_chain); df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); + df_live_add_problem (); + df_live_set_all_dirty (); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); diff --git a/gcc/testsuite/gcc.dg/pr67043.c b/gcc/testsuite/gcc.dg/pr67043.c new file mode 100644 index 000..36aa686 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr67043.c @@ -0,0 +1,32 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fcompare-debug -w" } */ + +extern void rt_mutex_owner (void); +extern void rt_mutex_deadlock_account_lock (int); +extern void signal_pending (void); +__typeof__ (int *) a; +int b; + +int +try_to_take_rt_mutex (int p1) { + rt_mutex_owner (); + if (b) +return 0; + rt_mutex_deadlock_account_lock (p1); + return 1; +} + +void +__rt_mutex_slowlock (int p1) { + int c; + for (;;) { +c = ({ + asm ("" : "=r"(a)); + a; +}); +if (try_to_take_rt_mutex (c)) + break; +if (__builtin_expect (p1 == 0, 0)) + signal_pending (); + } +} Patch was tested by running the testsuite against a bootstrapped native x86_64-linux-gnu GCC and against an arm-none-eabi GCC cross-compiler without any regression. Is this ok for trunk? Best regards, Thomas Preud'homme
RE: [PATCH] Obvious fix for PR66828: left shift with undefined behavior in bswap pass
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > ChangeLog entry is as follows: > > 2015-07-28 Thomas Preud'homme > > PR tree-optimization/66828 > * tree-ssa-math-opts.c (perform_symbolic_merge): Change type of > inc > from int64_t to uint64_t. And the patch is: diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 55382f3..c3098db 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2122,7 +2122,7 @@ perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1, the same base (array, structure, ...). */ if (gimple_assign_rhs1 (source_stmt1) != gimple_assign_rhs1 (source_stmt2)) { - int64_t inc; + uint64_t inc; HOST_WIDE_INT start_sub, end_sub, end1, end2, end; struct symbolic_number *toinc_n_ptr, *n_end; Best regards, Thomas
[PATCH] Obvious fix for PR66828: left shift with undefined behavior in bswap pass
The bswap pass contains the following loop: for (i = 0; i < size; i++, inc <<= BITS_PER_MARKER) In the update to inc and i just before exiting the loop, inc can be shifted by a total of more than 62 bits, making the value too large to be represented by int64_t. This is undefined behavior [1] and it triggers an error under an ubsan bootstrap. This patch changes the type of inc to be unsigned, removing the undefined behavior. [1] C++ 98 standard section 5.8 paragraph 2: "The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined." ChangeLog entry is as follows: 2015-07-28 Thomas Preud'homme PR tree-optimization/66828 * tree-ssa-math-opts.c (perform_symbolic_merge): Change type of inc from int64_t to uint64_t. Testsuite was run on a native x86_64-linux-gnu bootstrapped GCC and an arm-none-eabi cross-compiler without any regression. Committed as obvious as suggested by Markus Trippelsdorf in PR66828. Best regards, Thomas
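To make the failure mode concrete, here is a minimal standalone sketch, not GCC code: the starting value, the loop bound and the marker width (8, its value in tree-ssa-math-opts.c) are assumptions chosen for illustration. With int64_t, the final update would shift the only set bit past the sign bit, which the wording quoted above leaves undefined and which ubsan reports; with uint64_t every shift is well defined and the value simply wraps modulo 2^64.

#include <stdint.h>
#include <stdio.h>

#define BITS_PER_MARKER 8   /* assumed marker width, see above */

int
main (void)
{
  /* Stand-in for the symbolic number manipulated by the pass.  */
  uint64_t inc = 1;
  int i, size = 8;

  /* Same loop shape as the one quoted above: after the last update the
     value has been shifted by a total of 64 bits, which is fine for an
     unsigned type.  */
  for (i = 0; i < size; i++, inc <<= BITS_PER_MARKER)
    ;

  printf ("%016llx\n", (unsigned long long) inc);  /* prints 0000000000000000 */
  return 0;
}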
RE: [PATCH, ARM] Restrict pr65647 testcase to ARMv6-M effective target
> From: James Greenhalgh [mailto:james.greenha...@arm.com] > Sent: Friday, June 26, 2015 6:15 PM > > This should already have been covered by: > > https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01105.html > > 2015-06-16 James Greenhalgh > > * gcc.target/arm/pr65647.c: Do not override -mfloat-abi > directives > passed by the testsuite driver. Indeed, time to git pull. Sorry for the noise :-( Best regards, Thomas
[PATCH, ARM] Restrict pr65647 testcase to ARMv6-M effective target
Hi, Testcase for PR65647 assumes that the compiler can compile for ARMv6-M which might not be the case if passing some extra options via RUNTESTFLAGS (eg. -marm/-mcpu=cortex-a9). This patch restricts the testcase to ARMv6-M effective targets. Testsuite ChangeLog entry is as follows: 2015-06-25 Thomas Preud'homme * gcc.target/arm/pr65647.c: Restrict to ARMv6-M effective targets. diff --git a/gcc/testsuite/gcc.target/arm/pr65647.c b/gcc/testsuite/gcc.target/arm/pr65647.c index d3b44b2..d828d23 100644 --- a/gcc/testsuite/gcc.target/arm/pr65647.c +++ b/gcc/testsuite/gcc.target/arm/pr65647.c @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v6m_ok } */ /* { dg-options "-march=armv6-m -mthumb -O3 -w -mfloat-abi=soft" } */ a, b, c, e, g = &e, h, i = 7, l = 1, m, n, o, q = &m, r, s = &r, u, w = 9, x, Patch was tested by running the testcase once with -mcpu=cortex-a9 (skipped as expected) and once with -mcpu=cortex-m0 (passes). Is this ok for trunk? Best regards, Thomas
RE: [PATCH] Fix PR66168: ICE due to incorrect invariant register info
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Wednesday, May 27, 2015 11:24 PM > Ah, OK. I was looking at the code prior to the call for > can_move_invariant_reg in move_invariant_reg which implies that DEST > can > be a subreg, but REG can not. > > But with that check in can_move_invariant_reg obviously won't matter. > It feels like we've likely got some dead code here, but that can be a > follow-up if you want to pursue. Are you referring to the subreg code? It's used at the end of the function: inv->reg = reg; inv->orig_regno = regno; > > OK for the trunk. Thanks, committed. Best regards, Thomas
RE: [PATCH 2/3, ARM, libgcc, ping7] Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Ping? > -Original Message- > From: Thomas Preud'homme [mailto:thomas.preudho...@arm.com] > Sent: Thursday, April 30, 2015 3:19 PM > To: Thomas Preud'homme; Richard Earnshaw; 'gcc-patches@gcc.gnu.org'; > Marcus Shawcroft; Ramana Radhakrishnan > (ramana.radhakrish...@arm.com) > Subject: RE: [PATCH 2/3, ARM, libgcc, ping6] Code size optimization for > the fmul/fdiv and dmul/ddiv function in libgcc > > Here is an updated patch that prefix local symbols with __ for more > safety. > They appear in the symtab as local so it is not strictly necessary but one is > never too cautious. Being local, they also do not generate any PLT entry. > They appear only because the jumps are from one section to another > (which is the whole purpose of this patch) and thus need a static > relocation. > > I hope this revised version address all your concerns. > > ChangeLog entry is unchanged: > > *** gcc/libgcc/ChangeLog *** > > 2015-04-30 Tony Wang > > * config/arm/ieee754-sf.S: Expose symbols around fragment > boundaries as function symbols. > * config/arm/ieee754-df.S: Same with above > > diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754- > df.S > index c1468dc..39b0028 100644 > --- a/libgcc/config/arm/ieee754-df.S > +++ b/libgcc/config/arm/ieee754-df.S > @@ -559,7 +559,7 @@ ARM_FUNC_ALIAS aeabi_l2d floatdidf > > #ifdef L_arm_muldivdf3 > > -ARM_FUNC_START muldf3 > +ARM_FUNC_START muldf3, function_section > ARM_FUNC_ALIAS aeabi_dmul muldf3 > do_push {r4, r5, r6, lr} > > @@ -571,7 +571,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3 > COND(and,s,ne) r5, ip, yh, lsr #20 > teqne r4, ip > teqne r5, ip > - bleqLSYM(Lml_s) > + bleq__Lml_s > > @ Add exponents together > add r4, r4, r5 > @@ -689,7 +689,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3 > subsip, r4, #(254 - 1) > do_it hi > cmphi ip, #0x700 > - bhi LSYM(Lml_u) > + bhi __Lml_u > > @ Round the result, merge final exponent. > cmp lr, #0x8000 > @@ -716,9 +716,12 @@ LSYM(Lml_1): > mov lr, #0 > subsr4, r4, #1 > > -LSYM(Lml_u): > + FUNC_END aeabi_dmul > + FUNC_END muldf3 > + > +ARM_SYM_START __Lml_u > @ Overflow? > - bgt LSYM(Lml_o) > + bgt __Lml_o > > @ Check if denormalized result is possible, otherwise return > signed 0. > cmn r4, #(53 + 1) > @@ -778,10 +781,11 @@ LSYM(Lml_u): > do_it eq > biceq xl, xl, r3, lsr #31 > RETLDM "r4, r5, r6" > + SYM_END __Lml_u > > @ One or both arguments are denormalized. > @ Scale them leftwards and preserve sign bit. > -LSYM(Lml_d): > +ARM_SYM_START __Lml_d > teq r4, #0 > bne 2f > and r6, xh, #0x8000 > @@ -804,8 +808,9 @@ LSYM(Lml_d): > beq 3b > orr yh, yh, r6 > RET > + SYM_END __Lml_d > > -LSYM(Lml_s): > +ARM_SYM_START __Lml_s > @ Isolate the INF and NAN cases away > teq r4, ip > and r5, ip, yh, lsr #20 > @@ -817,10 +822,11 @@ LSYM(Lml_s): > orrsr6, xl, xh, lsl #1 > do_it ne > COND(orr,s,ne) r6, yl, yh, lsl #1 > - bne LSYM(Lml_d) > + bne __Lml_d > + SYM_END __Lml_s > > @ Result is 0, but determine sign anyway. 
> -LSYM(Lml_z): > +ARM_SYM_START __Lml_z > eor xh, xh, yh > and xh, xh, #0x8000 > mov xl, #0 > @@ -832,41 +838,42 @@ LSYM(Lml_z): > moveq xl, yl > moveq xh, yh > COND(orr,s,ne) r6, yl, yh, lsl #1 > - beq LSYM(Lml_n) @ 0 * INF or INF * 0 -> NAN > + beq __Lml_n @ 0 * INF or INF * 0 -> NAN > teq r4, ip > bne 1f > orrsr6, xl, xh, lsl #12 > - bne LSYM(Lml_n) @ NAN * -> NAN > + bne __Lml_n @ NAN * -> NAN > 1: teq r5, ip > - bne LSYM(Lml_i) > + bne __Lml_i > orrsr6, yl, yh, lsl #12 > do_it ne, t > movne xl, yl > movne xh, yh > - bne LSYM(Lml_n) @ * NAN -> NAN > + bne __Lml_n @ * NAN -> NAN > + SYM_END __Lml_z > > @ Result is INF, but we need to determine its sign. > -LSYM(Lml_i): > +ARM_SYM_START __Lml_i > eor xh, xh, yh > + SYM_END __Lml_i > > @ Overflow: return INF (sign already in xh). > -LSYM(Lml_o): > +ARM_SYM_START __Lml_o > and xh, xh, #0x8000 &
RE: [PATCH] Fix PR66168: ICE due to incorrect invariant register info
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Saturday, May 23, 2015 6:54 AM > > > > - if (!can_move_invariant_reg (loop, inv, reg)) > > + if (!can_move_invariant_reg (loop, inv, dest)) > Won't this run into the same problem if DEST is a SUBREG? One of the very first tests in can_move_invariant_reg is: if (!REG_P (reg) || HARD_REGISTER_P (reg)) return false; So in the case of a subreg the insn will not be moved, which will execute the same code as before my patch. It would be nicer if it could work with subregs, of course, but this makes for a much smaller and safer patch. Best regards, Thomas
RE: [PATCH] Fix PR66168: ICE due to incorrect invariant register info
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > From: Steven Bosscher [mailto:stevenb@gmail.com] > > Sent: Tuesday, May 19, 2015 7:21 PM > > > > Not OK. > > This will break in move_invariants() when it looks at REGNO (inv->reg). > > Indeed. I'm even surprised all tests passed. Ok I will just prevent moving > in such a case. I'm running the tests now and will get back to you > tomorrow. Patch is now tested via bootstrap + testsuite run on x86_64-linux-gnu and building arm-none-eabi cross-compiler + testsuite run. Both testsuite run show no regression. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index 76a009f..4ce3576 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1626,7 +1626,7 @@ move_invariant_reg (struct loop *loop, unsigned invno) if (REG_P (reg)) regno = REGNO (reg); - if (!can_move_invariant_reg (loop, inv, reg)) + if (!can_move_invariant_reg (loop, inv, dest)) { reg = gen_reg_rtx_and_attrs (dest); diff --git a/gcc/testsuite/gcc.c-torture/compile/pr66168.c b/gcc/testsuite/gcc.c-torture/compile/pr66168.c new file mode 100644 index 000..d6bfc7b --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr66168.c @@ -0,0 +1,15 @@ +int a, b; + +void +fn1 () +{ + for (;;) +{ + for (b = 0; b < 3; b++) + { + char e[2]; + char f = e[1]; + a ^= f ? 1 / f : 0; + } +} +} Ok for trunk? Best regards, Thomas
RE: [PATCH] Fix PR66168: ICE due to incorrect invariant register info
> From: Steven Bosscher [mailto:stevenb@gmail.com] > Sent: Tuesday, May 19, 2015 7:21 PM > > Not OK. > This will break in move_invariants() when it looks at REGNO (inv->reg). Indeed. I'm even surprised all tests passed. Ok I will just prevent moving in such a case. I'm running the tests now and will get back to you tomorrow. Best regards, Thomas
[PATCH] Fix PR66168: ICE due to incorrect invariant register info
Hi, r223113 made it possible for invariant to actually be moved rather than moving the source to a new pseudoregister. However, when doing so the inv->reg is not set up properly: in case of a subreg destination it holds the inner register rather than the subreg expression. This patch fixes that. ChangeLog entries are as follow: *** gcc/ChangeLog *** 2015-05-18 Thomas Preud'homme PR rtl-optimization/66168 * loop-invariant.c (move_invariant_reg): Set inv->reg to destination of inv->insn when moving an invariant without introducing a temporary register. *** gcc/testsuite/ChangeLog *** 2015-05-18 Thomas Preud'homme PR rtl-optimization/66168 * gcc.c-torture/compile/pr66168.c: New test. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index 76a009f..30e1945 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1642,9 +1642,13 @@ move_invariant_reg (struct loop *loop, unsigned invno) emit_insn_after (gen_move_insn (dest, reg), inv->insn); } - else if (dump_file) - fprintf (dump_file, "Invariant %d moved without introducing a new " - "temporary register\n", invno); + else + { + reg = SET_DEST (set); + if (dump_file) + fprintf (dump_file, "Invariant %d moved without introducing a new " + "temporary register\n", invno); + } reorder_insns (inv->insn, inv->insn, BB_END (preheader)); /* If there is a REG_EQUAL note on the insn we just moved, and the diff --git a/gcc/testsuite/gcc.c-torture/compile/pr66168.c b/gcc/testsuite/gcc.c-torture/compile/pr66168.c new file mode 100644 index 000..d6bfc7b --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr66168.c @@ -0,0 +1,15 @@ +int a, b; + +void +fn1 () +{ + for (;;) +{ + for (b = 0; b < 3; b++) + { + char e[2]; + char f = e[1]; + a ^= f ? 1 / f : 0; + } +} +} Tested by bootstrapping on x86_64-linux-gnu and building an arm-none-eabi cross-compiler. Testsuite run shows no regression for both of them. Ok for trunk? Best regards, Thomas
RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Wednesday, May 13, 2015 4:05 AM > OK for the trunk. > > Thanks for your patience, Thanks. Committed with the added "PR rtl-optimization/64616" to both ChangeLog entries. Best regards, Thomas
RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > > From: Jeff Law [mailto:l...@redhat.com] > > Sent: Tuesday, May 12, 2015 4:17 AM > > > > >> > > >> + > > >> + /* Check that all uses reached by the def in insn would still be > > reached > > >> + it. */ > > >> + dest_regno = REGNO (reg); > > >> + for (use = DF_REG_USE_CHAIN (dest_regno); use; use = > > >> DF_REF_NEXT_REG (use)) > > [ ... ] > > So isn't this overly conservative if DEST_REGNO is set multiple times > > since it's going to look at all the uses, even those not necessarily > > reached by the original SET of DEST_REGNO? > > > > Or is that not an issue for some reason? And I'm not requiring you to > > make this optimal, but if I'm right, a comment here seems wise. > > My apologize, it is the comment that is incorrect since it doesn't match > the code (a remaining of an old version of this patch). The code actually > checks that the use was dominated by the instruction before it is moved > out of the loop. > > > > > > I think with the wrapping nits fixed and closure on the multi-set issue > > noted immediately above and this will be good for the trunk. > > I'll fix this comment right away. Please find below a patch with the comment fixed. *** gcc/ChangeLog *** 2015-05-12 Thomas Preud'homme * loop-invariant.c (can_move_invariant_reg): New. (move_invariant_reg): Call above new function to decide whether instruction can just be moved, skipping creation of temporary register. *** gcc/testsuite/ChangeLog *** 2015-05-12 Thomas Preud'homme * gcc.dg/loop-8.c: New test. * gcc.dg/loop-9.c: New test. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index e3b560d..76a009f 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1511,6 +1511,79 @@ replace_uses (struct invariant *inv, rtx reg, bool in_group) return 1; } +/* Whether invariant INV setting REG can be moved out of LOOP, at the end of + the block preceding its header. */ + +static bool +can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg) +{ + df_ref def, use; + unsigned int dest_regno, defs_in_loop_count = 0; + rtx_insn *insn = inv->insn; + basic_block bb = BLOCK_FOR_INSN (inv->insn); + + /* We ignore hard register and memory access for cost and complexity reasons. + Hard register are few at this stage and expensive to consider as they + require building a separate data flow. Memory access would require using + df_simulate_* and can_move_insns_across functions and is more complex. */ + if (!REG_P (reg) || HARD_REGISTER_P (reg)) +return false; + + /* Check whether the set is always executed. We could omit this condition if + we know that the register is unused outside of the loop, but it does not + seem worth finding out. */ + if (!inv->always_executed) +return false; + + /* Check that all uses that would be dominated by def are already dominated + by it. */ + dest_regno = REGNO (reg); + for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use)) +{ + rtx_insn *use_insn; + basic_block use_bb; + + use_insn = DF_REF_INSN (use); + use_bb = BLOCK_FOR_INSN (use_insn); + + /* Ignore instruction considered for moving. */ + if (use_insn == insn) + continue; + + /* Don't consider uses outside loop. */ + if (!flow_bb_inside_loop_p (loop, use_bb)) + continue; + + /* Don't move if a use is not dominated by def in insn. 
*/ + if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID (use_insn)) + return false; + if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb)) + return false; +} + + /* Check for other defs. Any other def in the loop might reach a use + currently reached by the def in insn. */ + for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = DF_REF_NEXT_REG (def)) +{ + basic_block def_bb = DF_REF_BB (def); + + /* Defs in exit block cannot reach a use they weren't already. */ + if (single_succ_p (def_bb)) + { + basic_block def_bb_succ; + + def_bb_succ = single_succ (def_bb); + if (!flow_bb_inside_loop_p (loop, def_bb_succ)) + continue; + } + + if (++defs_in_loop_count > 1) + return false; +} + + return true; +} + /* Move invariant INVNO out of the LOOP. Returns true if this succeeds, false otherwise. */ @@ -1544,11 +1617,8 @@ move_invariant_reg (struct loop *loop, unsigned invno) } } - /* Move the set out of the loop. If the set is always executed (we could
RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Tuesday, May 12, 2015 4:17 AM > > On 05/06/2015 03:47 AM, Thomas Preud'homme wrote: > > Ping? > Something to consider as future work -- I'm pretty sure PRE sets up the > same kind of problematical pattern with a new pseudo (reaching reg) > holding the result of the redundant expression and the original > evaluations turned into copies from the reaching reg to the final > destination. Yes absolutely, this is how the pattern I was interested in was created. The reason I solved it in loop-invariant is that I thought this was done on purpose, with the cleanup left to loop-invariant. When I found a TODO comment about this in loop-invariant, I thought it confirmed my initial thoughts. > > That style is easy to prove correct. There was an issue with the copies > not propagating away that was pretty inherent in the partial redundancy > cases that I could probably dig out of my archives if you're interested. If you think this should also (or instead) be fixed in PRE I can take a look at some point later since it shouldn't be much more work. > It looks like there's a variety of line wrapping issues. Please > double-check line wrapping using an 80 column window. Minor I know, > but > the consistency with the rest of the code is good. Looking in vim, lines seem to be systematically cut at 80 columns, and check_GNU_style.sh only complains about the dg-final line in the new testcases. Could you point me to such an occurrence? > > >> > >> + > >> + /* Check that all uses reached by the def in insn would still be > reached > >> + it. */ > >> + dest_regno = REGNO (reg); > >> + for (use = DF_REG_USE_CHAIN (dest_regno); use; use = > >> DF_REF_NEXT_REG (use)) > [ ... ] > So isn't this overly conservative if DEST_REGNO is set multiple times > since it's going to look at all the uses, even those not necessarily > reached by the original SET of DEST_REGNO? > > Or is that not an issue for some reason? And I'm not requiring you to > make this optimal, but if I'm right, a comment here seems wise. My apologies, it is the comment that is incorrect since it doesn't match the code (a leftover from an old version of this patch). The code actually checks that the use was dominated by the instruction before it is moved out of the loop. This is to prevent the code motion in cases like: foo = 1; bar = 0; for () { bar += foo; foo = 42; } which I met in some of the testsuite cases. > > I think with the wrapping nits fixed and closure on the multi-set issue > noted immediately above and this will be good for the trunk. I'll fix this comment right away. Best regards, Thomas
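For readability, here is the foo/bar pattern mentioned in the message above written out as a self-contained C function. The wrapper function, the declarations and the loop bound are additions for illustration, and whether foo actually lives in a pseudo register at the RTL level depends on target and options; the point is only that hoisting foo = 42 above the loop would change the value added to bar on the first iteration, which is exactly what the dominance check prevents.

int foo, bar;

void
f (int n)
{
  int i;

  foo = 1;
  bar = 0;
  for (i = 0; i < n; i++)
    {
      bar += foo;   /* this use of foo is not dominated by ...            */
      foo = 42;     /* ... this set, so the set must stay inside the loop */
    }
}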
RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > Based on my understanding of your answer quoted above, I'll commit > it as is, despite not having been able to come up with a testcase. I'll > wait tomorrow to do so though in case you changed your mind about it. Committed. Best regards, Thomas
[PATCH, ARM] Fix testcase for PR64616
Hi, The testcase made for PR64616 was only passing when using a literal pool. Rather than having an alternative for systems where this is not true, this patch changes the test to check that a global copy propagation occurs in cprop2. This should work across all ARM targets (it works when targeting Cortex-M0, Cortex-M3 and whatever the default core is for ARMv7-a with vfpv3-d16 FPU). ChangeLog entry is as follows: *** gcc/testsuite/ChangeLog *** 2015-05-04 Thomas Preud'homme * gcc.target/arm/pr64616.c: Test dump rather than assembly to work across ARM targets. diff --git a/gcc/testsuite/gcc.target/arm/pr64616.c b/gcc/testsuite/gcc.target/arm/pr64616.c index c686ffa..2280f21 100644 --- a/gcc/testsuite/gcc.target/arm/pr64616.c +++ b/gcc/testsuite/gcc.target/arm/pr64616.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -fdump-rtl-cprop2" } */ int f (int); unsigned int glob; @@ -11,4 +11,5 @@ g () glob = 5; } -/* { dg-final { scan-assembler-times "ldr" 2 } } */ +/* { dg-final { scan-rtl-dump "GLOBAL COPY-PROP" "cprop2" } } */ +/* { dg-final { cleanup-rtl-dump "cprop2" } } */ Patch was tested by verifying that the pattern appears when targeting Cortex-M0, Cortex-M3 and the default core for ARMv7-a with vfpv3-d16 FPU. Best regards, Thomas
RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
Ping? Best regards, Thomas > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Monday, March 16, 2015 8:39 PM > To: 'Steven Bosscher' > Cc: GCC Patches; Eric Botcazou > Subject: RE: [PATCH, stage1] Move insns without introducing new > temporaries in loop2_invariant > > > From: Steven Bosscher [mailto:stevenb@gmail.com] > > Sent: Monday, March 09, 2015 7:48 PM > > To: Thomas Preud'homme > > Cc: GCC Patches; Eric Botcazou > > Subject: Re: [PATCH, stage1] Move insns without introducing new > > temporaries in loop2_invariant > > New patch below. > > > > > It looks like this would run for all candidate loop invariants, right? > > > > If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a > > potential compile time hog for large loops. > > > > But why compute this at all? Perhaps I'm missing something, but you > > already have inv->always_executed available, no? > > Indeed. I didn't realize the information was already there. > > > > > > > > + basic_block use_bb; > > > + > > > + ref = DF_REF_INSN (use); > > > + use_bb = BLOCK_FOR_INSN (ref); > > > > You can use DF_REF_BB. > > Since I need use_insn here I kept BLOCK_FOR_INSN but I used > DF_REF_BB for the def below. > > > So here are the new ChangeLog entries: > > *** gcc/ChangeLog *** > > 2015-03-11 Thomas Preud'homme > > * loop-invariant.c (can_move_invariant_reg): New. > (move_invariant_reg): Call above new function to decide whether > instruction can just be moved, skipping creation of temporary > register. > > *** gcc/testsuite/ChangeLog *** > > 2015-03-12 Thomas Preud'homme > > * gcc.dg/loop-8.c: New test. > * gcc.dg/loop-9.c: New test. > > > diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c > index f79b497..8217d62 100644 > --- a/gcc/loop-invariant.c > +++ b/gcc/loop-invariant.c > @@ -1512,6 +1512,79 @@ replace_uses (struct invariant *inv, rtx reg, > bool in_group) >return 1; > } > > And the new patch: > > +/* Whether invariant INV setting REG can be moved out of LOOP, at the > end of > + the block preceding its header. */ > + > +static bool > +can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx > reg) > +{ > + df_ref def, use; > + unsigned int dest_regno, defs_in_loop_count = 0; > + rtx_insn *insn = inv->insn; > + basic_block bb = BLOCK_FOR_INSN (inv->insn); > + > + /* We ignore hard register and memory access for cost and complexity > reasons. > + Hard register are few at this stage and expensive to consider as they > + require building a separate data flow. Memory access would require > using > + df_simulate_* and can_move_insns_across functions and is more > complex. */ > + if (!REG_P (reg) || HARD_REGISTER_P (reg)) > +return false; > + > + /* Check whether the set is always executed. We could omit this > condition if > + we know that the register is unused outside of the loop, but it does > not > + seem worth finding out. */ > + if (!inv->always_executed) > +return false; > + > + /* Check that all uses reached by the def in insn would still be reached > + it. */ > + dest_regno = REGNO (reg); > + for (use = DF_REG_USE_CHAIN (dest_regno); use; use = > DF_REF_NEXT_REG (use)) > +{ > + rtx_insn *use_insn; > + basic_block use_bb; > + > + use_insn = DF_REF_INSN (use); > + use_bb = BLOCK_FOR_INSN (use_insn); > + > + /* Ignore instruction considered for moving. */ > + if (use_insn == insn) > + continue; > + > + /* Don't consider uses outside loop. 
*/ > + if (!flow_bb_inside_loop_p (loop, use_bb)) > + continue; > + > + /* Don't move if a use is not dominated by def in insn. */ > + if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID > (use_insn)) > + return false; > + if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb)) > + return false; > +} > + > + /* Check for other defs. Any other def in the loop might reach a use > + currently reached by the def in insn. */ > + for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = > DF_REF_NEXT_REG (def)) > +{ > + basic_block def_bb = DF_REF_BB (def); > + > + /* Defs in exit block cannot reach a use they weren't already. */ > + if (single_succ_p (def_bb)) > +
RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Tuesday, April 28, 2015 12:27 AM > OK. No need for heroics -- give it a shot, but don't burn an insane > amount of time on it. If we can't get to a reasonable testcase, then so > be it. Ok, I tried but really didn't manage to create a testcase. I did, however, understand the conditions under which this patch is helpful. In the function reg_nonzero_bits_for_combine () in combine.c there is a test to check if last_set_nonzero_bits for a given register is still valid. In the case I'm considering, the test evaluates to false because: (i) the register rX whose nonzero bits are being evaluated was set in an earlier basic block than the one with the instruction using rX (hence rsp->last_set_label < label_tick) (ii) the predecessor of the basic block for that same insn is not the previous basic block analyzed by combine_instructions (hence label_tick_ebb_start == label_tick) (iii) the register rX is set multiple times (hence REG_N_SETS (REGNO (x)) != 1) Yet, the block being processed is dominated by the SET for rX so there is a REG_EQUAL available to narrow down the set of nonzero bits. Based on my understanding of your answer quoted above, I'll commit it as is, despite not having been able to come up with a testcase. I'll wait until tomorrow to do so, though, in case you changed your mind about it. Best regards, Thomas
RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Tuesday, April 28, 2015 12:27 AM > To: Thomas Preud'homme; 'Eric Botcazou' > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits > > On 04/27/2015 04:26 AM, Thomas Preud'homme wrote: > >> From: Jeff Law [mailto:l...@redhat.com] > >> Sent: Saturday, April 25, 2015 3:00 AM > >> Do you have a testcase where this change can result in better > generated > >> code. If so please add that testcase. It's OK if it's ARM specific. > > > > Hi Jeff, > > > > Last time I tried I couldn't reduce the code to a small testcase but if I > remember > > well it was mostly due to the problem of finding a good test for creduce > > (zero extension is not unique enough). I'll try again with a more manual > approach > > and get back to you. > OK. No need for heroics -- give it a shot, but don't burn an insane > amount of time on it. If we can't get to a reasonable testcase, then so > be it. Sadly I couldn't get a testcase. I get almost the same sequence of instructions as the program we found the problem in, but couldn't get exactly the same one. In all the cases I constructed, the nonzero_bits info we already had was enough for combine to do its job. I couldn't find what causes this information to be inaccurate. I will try to investigate a bit further on Monday as another pass might not be doing its job properly. Or maybe there's something that prevents the information from being propagated. Best regards, Thomas
RE: [PATCH 2/3, ARM, libgcc, ping6] Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc
Here is an updated patch that prefix local symbols with __ for more safety. They appear in the symtab as local so it is not strictly necessary but one is never too cautious. Being local, they also do not generate any PLT entry. They appear only because the jumps are from one section to another (which is the whole purpose of this patch) and thus need a static relocation. I hope this revised version address all your concerns. ChangeLog entry is unchanged: *** gcc/libgcc/ChangeLog *** 2015-04-30 Tony Wang * config/arm/ieee754-sf.S: Expose symbols around fragment boundaries as function symbols. * config/arm/ieee754-df.S: Same with above diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754-df.S index c1468dc..39b0028 100644 --- a/libgcc/config/arm/ieee754-df.S +++ b/libgcc/config/arm/ieee754-df.S @@ -559,7 +559,7 @@ ARM_FUNC_ALIAS aeabi_l2d floatdidf #ifdef L_arm_muldivdf3 -ARM_FUNC_START muldf3 +ARM_FUNC_START muldf3, function_section ARM_FUNC_ALIAS aeabi_dmul muldf3 do_push {r4, r5, r6, lr} @@ -571,7 +571,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3 COND(and,s,ne) r5, ip, yh, lsr #20 teqne r4, ip teqne r5, ip - bleqLSYM(Lml_s) + bleq__Lml_s @ Add exponents together add r4, r4, r5 @@ -689,7 +689,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3 subsip, r4, #(254 - 1) do_it hi cmphi ip, #0x700 - bhi LSYM(Lml_u) + bhi __Lml_u @ Round the result, merge final exponent. cmp lr, #0x8000 @@ -716,9 +716,12 @@ LSYM(Lml_1): mov lr, #0 subsr4, r4, #1 -LSYM(Lml_u): + FUNC_END aeabi_dmul + FUNC_END muldf3 + +ARM_SYM_START __Lml_u @ Overflow? - bgt LSYM(Lml_o) + bgt __Lml_o @ Check if denormalized result is possible, otherwise return signed 0. cmn r4, #(53 + 1) @@ -778,10 +781,11 @@ LSYM(Lml_u): do_it eq biceq xl, xl, r3, lsr #31 RETLDM "r4, r5, r6" + SYM_END __Lml_u @ One or both arguments are denormalized. @ Scale them leftwards and preserve sign bit. -LSYM(Lml_d): +ARM_SYM_START __Lml_d teq r4, #0 bne 2f and r6, xh, #0x8000 @@ -804,8 +808,9 @@ LSYM(Lml_d): beq 3b orr yh, yh, r6 RET + SYM_END __Lml_d -LSYM(Lml_s): +ARM_SYM_START __Lml_s @ Isolate the INF and NAN cases away teq r4, ip and r5, ip, yh, lsr #20 @@ -817,10 +822,11 @@ LSYM(Lml_s): orrsr6, xl, xh, lsl #1 do_it ne COND(orr,s,ne) r6, yl, yh, lsl #1 - bne LSYM(Lml_d) + bne __Lml_d + SYM_END __Lml_s @ Result is 0, but determine sign anyway. -LSYM(Lml_z): +ARM_SYM_START __Lml_z eor xh, xh, yh and xh, xh, #0x8000 mov xl, #0 @@ -832,41 +838,42 @@ LSYM(Lml_z): moveq xl, yl moveq xh, yh COND(orr,s,ne) r6, yl, yh, lsl #1 - beq LSYM(Lml_n) @ 0 * INF or INF * 0 -> NAN + beq __Lml_n @ 0 * INF or INF * 0 -> NAN teq r4, ip bne 1f orrsr6, xl, xh, lsl #12 - bne LSYM(Lml_n) @ NAN * -> NAN + bne __Lml_n @ NAN * -> NAN 1: teq r5, ip - bne LSYM(Lml_i) + bne __Lml_i orrsr6, yl, yh, lsl #12 do_it ne, t movne xl, yl movne xh, yh - bne LSYM(Lml_n) @ * NAN -> NAN + bne __Lml_n @ * NAN -> NAN + SYM_END __Lml_z @ Result is INF, but we need to determine its sign. -LSYM(Lml_i): +ARM_SYM_START __Lml_i eor xh, xh, yh + SYM_END __Lml_i @ Overflow: return INF (sign already in xh). -LSYM(Lml_o): +ARM_SYM_START __Lml_o and xh, xh, #0x8000 orr xh, xh, #0x7f00 orr xh, xh, #0x00f0 mov xl, #0 RETLDM "r4, r5, r6" + SYM_END __Lml_o @ Return a quiet NAN. 
-LSYM(Lml_n): +ARM_SYM_START __Lml_n orr xh, xh, #0x7f00 orr xh, xh, #0x00f8 RETLDM "r4, r5, r6" + SYM_END __Lml_n - FUNC_END aeabi_dmul - FUNC_END muldf3 - -ARM_FUNC_START divdf3 +ARM_FUNC_START divdf3 function_section ARM_FUNC_ALIAS aeabi_ddiv divdf3 do_push {r4, r5, r6, lr} @@ -985,7 +992,7 @@ ARM_FUNC_ALIAS aeabi_ddiv divdf3 subsip, r4, #(254 - 1) do_it hi cmphi ip, #0x700 - bhi LSYM(Lml_u) + bhi __Lml_u @ Round the result, merge final exponent. subsip, r5, yh @@ -1009,13 +1016,13 @@ LSYM(Ldv_1): orr xh, xh, #0x0010 mov lr, #0 subsr4, r4, #1 - b LSYM(Lml_u) + b __Lml_u @ Result mightt need to be denormaliz
RE: [PATCH, Aarch64] Add FMA steering pass for Cortex-A57
> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com] > Sent: Thursday, February 05, 2015 5:17 PM > > > > *** gcc/ChangeLog *** > > > > 2015-01-26 Thomas Preud'homme thomas.preudho...@arm.com > > > > * config.gcc: Add cortex-a57-fma-steering.o to extra_objs for > > aarch64-*-*. > > * config/aarch64/t-aarch64: Add a rule for cortex-a57-fma-steering.o. > > * config/aarch64/aarch64.h > (AARCH64_FL_USE_FMA_STEERING_PASS): Define. > > (AARCH64_TUNE_FMA_STEERING): Likewise. > > * config/aarch64/aarch64-cores.def: Set > > AARCH64_FL_USE_FMA_STEERING_PASS for cores with dynamic > steering of > > FMUL/FMADD instructions. > > * config/aarch64/aarch64.c (aarch64_register_fma_steering): Declare. > > (aarch64_override_options): Include cortex-a57-fma-steering.h. Call > > aarch64_register_fma_steering () if > AARCH64_TUNE_FMA_STEERING is true. > > * config/aarch64/cortex-a57-fma-steering.h: New file. > > * config/aarch64/cortex-a57-fma-steering.c: Likewise. > > OK but wait for stage-1 to open for general development before you > commit it please. Done after rebasing it (context line change in aarch64.c due to new header include and adaptation to new signature of AARCH64_CORE macro in aarch64-cores.def). Committed patch below: diff --git a/gcc/config.gcc b/gcc/config.gcc index a1df043..9fec1e8 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -302,7 +302,7 @@ m32c*-*-*) aarch64*-*-*) cpu_type=aarch64 extra_headers="arm_neon.h arm_acle.h" - extra_objs="aarch64-builtins.o aarch-common.o" + extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o" target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c" target_has_targetm_common=yes ;; diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index 7c285ba..dfc9cc8 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -40,7 +40,7 @@ /* V8 Architecture Processors. */ AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03") -AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07") +AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_USE_FMA_STEERING_PASS, cortexa57, "0x41", "0xd07") AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08") AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, "0x53", "0x001") AARCH64_CORE("thunderx",thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, "0x43", "0x0a1") @@ -48,5 +48,5 @@ AARCH64_CORE("xgene1", xgene1,xgene1,8, AARCH64_FL_FOR_ARCH8, xgen /* V8 big.LITTLE implementations. 
*/ -AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03") +AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_USE_FMA_STEERING_PASS, cortexa57, "0x41", "0xd07.0xd03") AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd08.0xd03") diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 1f7187b..3fd1b3f 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -200,6 +200,8 @@ extern unsigned aarch64_architecture_version; #define AARCH64_FL_CRYPTO (1 << 2) /* Has crypto. */ #define AARCH64_FL_SLOWMUL(1 << 3) /* A slow multiply core. */ #define AARCH64_FL_CRC(1 << 4) /* Has CRC. */ +/* Has static dispatch of FMA. */ +#define AARCH64_FL_USE_FMA_STEERING_PASS (1 << 5) /* Has FP and SIMD. */ #define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD) @@ -220,6 +222,8 @@ extern unsigned long aarch64_isa_flags; /* Macros to test tuning flags. */ extern unsigned long aarch64_tune_flags; #define AARCH64_TUNE_SLOWMUL (aarch64_tune_flags & AARCH64_FL_SLOWMUL) +#define AARCH64_TUNE_FMA_STEERING \ + (aarch64_tune_flags & AARCH64_FL_USE_FMA_STEERING_PASS) /* Crypto is an optional extension to AdvSIMD. */ #define TARGET_CRYPTO (TARGET_SIMD &
RE: [PATCH 1/2, combine] Try REG_EQUAL for nonzero_bits
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Saturday, April 25, 2015 2:57 AM > > +static rtx > > +sign_extend_short_imm (rtx src, machine_mode mode, unsigned int > prec) > > +{ > > + if (GET_MODE_PRECISION (mode) < prec && CONST_INT_P (src) > > + && INTVAL (src) > 0 && val_signbit_known_set_p (mode, INTVAL > (src))) > > +src = GEN_INT (INTVAL (src) | ~GET_MODE_MASK (mode)); > Can you go ahead and put each condition of the && on a separate line. > It uses more vertical space, but IMHO makes this easier to read. As I > said, it was a nit :-) You're perfectly right. Anything that can improve readability of source code is a good thing. > > OK with that fix. Committed. Best regards, Thomas
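As a small worked example of the transformation quoted above (the 8-bit mode width and the constant 0xc0 are hypothetical values; GEN_INT and INTVAL wrap rtx constants in GCC, plain integers stand in for them here), ORing in the complement of the mode mask sign-extends the short immediate to the full width of the constant:

#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  int64_t src = 0xc0;          /* sign bit of the 8-bit mode is set */
  uint64_t mode_mask = 0xff;   /* mode mask of an 8-bit mode        */

  /* The operation from the snippet above: src | ~mode_mask.  */
  int64_t extended = src | ~(int64_t) mode_mask;

  printf ("%#llx\n", (unsigned long long) extended);  /* 0xffffffffffffffc0 */
  return 0;
}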
RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Saturday, April 25, 2015 3:00 AM > Do you have a testcase where this change can result in better generated > code. If so please add that testcase. It's OK if it's ARM specific. Hi Jeff, Last time I tried I couldn't reduce the code to a small testcase but if I remember well it was mostly due to the problem of finding a good test for creduce (zero extension is not unique enough). I'll try again with a more manual approach and get back to you. Best regards, Thomas
RE: [PATCH 1/2, combine] Try REG_EQUAL for nonzero_bits
Hi, first of all, sorry for the delay. We quickly entered stage 4 and I thought it was best to wait for stage 1 to update you on this. > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > Of course both approaches are not exclusive. I'll try to test with *both* > rs6000 bootstrap and with a cross-compiler for one of these targets. I did two experiments where I checked the impact of removing the code guarded by SHORT_IMMEDIATES_SIGN_EXTEND. In the first one I removed the code in both rtlanal.c and combine.c. In the second, I only removed the code from combine.c (in both occurrences). In both cases powerpc bootstrap succeeded. I then proceeded to use these 2 resulting compilers to compile the same gcc source (actually the source from removing all code guarded by the macro). I compared the output of objdump on the resulting g++ and found that in both cases the output was different from the one without any modification. Both diffs look like: Disassembly of section .init: @@ -1359,7 +1359,7 @@ Disassembly of section .text: 10003a94: f8 21 ff 81 stdu r1,-128(r1) 10003a98: eb e4 00 00 ld r31,0(r4) 10003a9c: 3c 82 ff f8 addis r4,r2,-8 -10003aa0: 38 84 d7 60 addi r4,r4,-10400 +10003aa0: 38 84 d7 70 addi r4,r4,-10384 10003aa4: 7f e3 fb 78 mr r3,r31 10003aa8: 4b ff f0 d9 bl 10002b80 <003d.plt_call.strcmp@@GLIBC_2.3+0> 10003aac: e8 41 00 28 ld r2,40(r1) @@ -1371,7 +1371,7 @@ Disassembly of section .text: 10003ac4: 79 2a ff e3 rldicl. r10,r9,63,63 10003ac8: 41 82 00 78 beq- 10003b40 <._ZL22sanitize_spec_functioniPPKc+0xc0> 10003acc: 3c 62 ff f8 addis r3,r2,-8 -10003ad0: 38 63 f5 70 addi r3,r3,-2704 +10003ad0: 38 63 f5 b0 addi r3,r3,-2640 10003ad4: 38 21 00 80 addi r1,r1,128 10003ad8: e8 01 00 10 ld r0,16(r1) 10003adc: eb e1 ff f8 ld r31,-8(r1) (This one compares the g++ compiled by the GCC with partial removal of the code guarded by the macro against the g++ compiled by an unmodified GCC.) I may have made a mistake when doing the experiment though and can do it again if you wish. Best regards, Thomas
[PATCH, ARM, regression] Fix ternary operator in arm/unknown-elf.h
I just committed the obvious fix below that fix build failure introduced by revision 222371. *** gcc/ChangeLog *** 2015-04-24 Thomas Preud'homme * config/arm/unknown-elf.h (ASM_OUTPUT_ALIGNED_DECL_LOCAL): fix ternary operator in fprintf and harmonize spacing. diff --git a/gcc/config/arm/unknown-elf.h b/gcc/config/arm/unknown-elf.h index df0b9ce..2e5ab7e 100644 --- a/gcc/config/arm/unknown-elf.h +++ b/gcc/config/arm/unknown-elf.h @@ -80,9 +80,9 @@ \ ASM_OUTPUT_ALIGN (FILE, floor_log2 (ALIGN / BITS_PER_UNIT)); \ ASM_OUTPUT_LABEL (FILE, NAME); \ - fprintf (FILE, "\t.space\t%d\n", SIZE ? (int)(SIZE) : 1); \ + fprintf (FILE, "\t.space\t%d\n", SIZE ? (int) SIZE : 1); \ fprintf (FILE, "\t.size\t%s, %d\n", \ - NAME, SIZE ? (int) SIZE, 1); \ + NAME, SIZE ? (int) SIZE : 1);\ } \ while (0) Best regards, Thomas
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Friday, April 24, 2015 11:15 AM > > So revised review is "ok for the trunk" :-) Committed. Best regards, Thomas
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Friday, April 24, 2015 10:59 AM > Hi Jeff, > > + > > +static bool > > +cprop_reg_p (const_rtx x) > > +{ > > + return REG_P (x) && !HARD_REGISTER_P (x); > > +} > How about instead this move to a more visible location (perhaps a macro > in regs.h or an inline function). Then as a followup, change the > various places that have this sequence to use that common definition > that exist outside of cprop.c. According to Steven this was proposed in the past but was refused (see end of [1]). [1] https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01066.html > > > @@ -1191,7 +1192,7 @@ do_local_cprop (rtx x, rtx_insn *insn) > > /* Rule out USE instructions and ASM statements as we don't want > to > >change the hard registers mentioned. */ > > if (REG_P (x) > > - && (REGNO (x) >= FIRST_PSEUDO_REGISTER > > + && (cprop_reg_p (x) > > || (GET_CODE (PATTERN (insn)) != USE > > && asm_noperands (PATTERN (insn)) < 0))) > Isn't the REG_P test now redundant? I made the same mistake when reviewing that change and indeed it's not. Note the opening parenthesis before cprop_reg_p that contains a bitwise OR expression. So in the case where cprop_reg_p is false, REG_P still needs to be true. We could keep a check on FIRST_PSEUDO_REGISTER but the intent (checking that the register is suitable for propagation) is clearer now, as pointed out by Steven to me. > > OK for the trunk with those changes. > > jeff Given the above I intent to keep the REG_P in the second excerpt and will wait for your input about moving cprop_reg_p to rtl.h Best regards, Thomas
RE: [PATCH, ping1] Fix removing of df problem in df_finish_pass
Committed. I'll wait a week and then ask for approval for a backport to 5.1.1 once 5.1 is released. Best regards, Thomas > -Original Message- > From: Kenneth Zadeck [mailto:zad...@naturalbridge.com] > Sent: Monday, April 20, 2015 9:26 PM > To: Thomas Preud'homme; 'Bernhard Reutner-Fischer'; gcc- > patc...@gcc.gnu.org; 'Paolo Bonzini'; 'Seongbae Park' > Subject: Re: [PATCH, ping1] Fix removing of df problem in df_finish_pass > > As a dataflow maintainer, I approve this patch for the next release. > However, you will have to get approval of a release manager to get it > into 5.0. > > > > On 04/20/2015 04:22 AM, Thomas Preud'homme wrote: > > Ping? > > > >> -Original Message----- > >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > >> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > >> Sent: Tuesday, March 03, 2015 12:02 PM > >> To: 'Bernhard Reutner-Fischer'; gcc-patches@gcc.gnu.org; 'Paolo > Bonzini'; > >> 'Seongbae Park'; 'Kenneth Zadeck' > >> Subject: RE: [PATCH] Fix removing of df problem in df_finish_pass > >> > >>> From: Bernhard Reutner-Fischer [mailto:rep.dot@gmail.com] > >>> Sent: Saturday, February 28, 2015 4:00 AM > >>>>use df_remove_problem rather than manually removing > problems, > >>> living > >>> > >>> leaving > >> Indeed. Please find updated changelog below: > >> > >> 2015-03-03 Thomas Preud'homme > > >> > >>* df-core.c (df_finish_pass): Iterate over df- > >>> problems_by_index[] and > >>use df_remove_problem rather than manually removing > >> problems, leaving > >>holes in df->problems_in_order[]. > >> > >> Best regards, > >> > >> Thomas > >> > >> > >> > >> > > > >
RE: [PATCH, ping1] Fix removing of df problem in df_finish_pass
Ping? > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > Sent: Tuesday, March 03, 2015 12:02 PM > To: 'Bernhard Reutner-Fischer'; gcc-patches@gcc.gnu.org; 'Paolo Bonzini'; > 'Seongbae Park'; 'Kenneth Zadeck' > Subject: RE: [PATCH] Fix removing of df problem in df_finish_pass > > > From: Bernhard Reutner-Fischer [mailto:rep.dot@gmail.com] > > Sent: Saturday, February 28, 2015 4:00 AM > > > use df_remove_problem rather than manually removing problems, > > living > > > > leaving > > Indeed. Please find updated changelog below: > > 2015-03-03 Thomas Preud'homme > > * df-core.c (df_finish_pass): Iterate over df- > >problems_by_index[] and > use df_remove_problem rather than manually removing > problems, leaving > holes in df->problems_in_order[]. > > Best regards, > > Thomas > > > >
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Monday, April 13, 2015 8:48 PM > > I know there were several followups between Steven and yourself. > With > stage1 now open, can you post a final version and do a final > bootstrap/test with it? Here is what came out of our discussion with Steven: The RTL cprop pass in GCC operates by doing a local constant/copy propagation first and then a global one. In the local one, if a constant cannot be propagated (eg. due to constraints of the destination instruction) a copy propagation is done instead. However, at the global level copy propagation is only tried if no constant can be propagated, ie. if a constant can be propagated but the constraints of the destination instruction forbids it, no copy propagation will be tried. This patch fixes this issue. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2015-04-15 Thomas Preud'homme Steven Bosscher * cprop.c (cprop_reg_p): New. (hash_scan_set): Use above function to check if register can be propagated. (find_avail_set): Return up to two sets, one whose source is a register and one whose source is a constant. Sets are returned in an array passed as parameter rather than as a return value. (cprop_insn): Use a do while loop rather than a goto. Try each of the sets returned by find_avail_set, starting with the one whose source is a constant. Use cprop_reg_p to check if register can be propagated. (do_local_cprop): Use cprop_reg_p to check if register can be propagated. (implicit_set_cond_p): Likewise. *** gcc/testsuite/ChangeLog *** 2015-04-15 Thomas Preud'homme Steven Bosscher * gcc.target/arm/pr64616.c: New file. And the patch is: diff --git a/gcc/cprop.c b/gcc/cprop.c index c9fb2fc..78541cf 100644 --- a/gcc/cprop.c +++ b/gcc/cprop.c @@ -285,6 +285,15 @@ cprop_constant_p (const_rtx x) return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x)); } +/* Determine whether the rtx X should be treated as a register that can + be propagated. Any pseudo-register is fine. */ + +static bool +cprop_reg_p (const_rtx x) +{ + return REG_P (x) && !HARD_REGISTER_P (x); +} + /* Scan SET present in INSN and add an entry to the hash TABLE. IMPLICIT is true if it's an implicit set, false otherwise. */ @@ -295,8 +304,7 @@ hash_scan_set (rtx set, rtx_insn *insn, struct hash_table_d *table, rtx src = SET_SRC (set); rtx dest = SET_DEST (set); - if (REG_P (dest) - && ! HARD_REGISTER_P (dest) + if (cprop_reg_p (dest) && reg_available_p (dest, insn) && can_copy_p (GET_MODE (dest))) { @@ -321,9 +329,8 @@ hash_scan_set (rtx set, rtx_insn *insn, struct hash_table_d *table, src = XEXP (note, 0), set = gen_rtx_SET (VOIDmode, dest, src); /* Record sets for constant/copy propagation. */ - if ((REG_P (src) + if ((cprop_reg_p (src) && src != dest - && ! HARD_REGISTER_P (src) && reg_available_p (src, insn)) || cprop_constant_p (src)) insert_set_in_table (dest, src, insn, table, implicit); @@ -821,15 +828,15 @@ try_replace_reg (rtx from, rtx to, rtx_insn *insn) return success; } -/* Find a set of REGNOs that are available on entry to INSN's block. Return - NULL no such set is found. */ +/* Find a set of REGNOs that are available on entry to INSN's block. If found, + SET_RET[0] will be assigned a set with a register source and SET_RET[1] a + set with a constant source. If not found the corresponding entry is set to + NULL. 
*/ -static struct cprop_expr * -find_avail_set (int regno, rtx_insn *insn) +static void +find_avail_set (int regno, rtx_insn *insn, struct cprop_expr *set_ret[2]) { - /* SET1 contains the last set found that can be returned to the caller for - use in a substitution. */ - struct cprop_expr *set1 = 0; + set_ret[0] = set_ret[1] = NULL; /* Loops are not possible here. To get a loop we would need two sets available at the start of the block containing INSN. i.e. we would @@ -869,8 +876,10 @@ find_avail_set (int regno, rtx_insn *insn) If the source operand changed, we may still use it for the next iteration of this loop, but we may not use it for substitutions. */ - if (cprop_constant_p (src) || reg_not_set_p (src, insn)) - set1 = set; + if (cprop_constant_p (src)) + set_ret[1] = set; + else if (reg_not_set_p (src, insn)) + set_ret[0] = set; /* If the source of the set is anything except a register, then we have reached the end of the copy chain. */ @@ -881,10 +890,6 @@ find_avail_set (int regno, rtx_insn *insn) and see if we have an available copy into SRC. */ regno = REGNO (
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: Jeff Law [mailto:l...@redhat.com] > Sent: Monday, April 13, 2015 8:48 PM > Thomas, > > I know there were several followups between Steven and yourself. > With > stage1 now open, can you post a final version and do a final > bootstrap/test with it? Sure, I'm testing it right now. Sorry for not doing it earlier, I wasn't sure what constitutes "too much disruption" as per the GCC 6.0 Status Report email. Best regards, Thomas
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > FYI testing your patch with the one cprop_reg_p negated as said in my > previous email shows no regression on arm-none-eabi cross-compiler > targeting Cortex-M3. Testing for x86_64 is ongoing. Sorry, I forgot to report back on this. No regressions on x86_64-linux-gnu either. Do you want me to respin the patch (adding the testcase from the patch I sent, fixing the indentation and adding a ChangeLog)? Best regards, Thomas
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: Steven Bosscher [mailto:stevenb@gmail.com] > Sent: Friday, March 20, 2015 8:14 PM > > I put the cprop_reg_p check there instead of !HARD_REGISTER_P > because > I like to be able to quickly find all places where a similar check is > performed. The check is whether the reg is something that copy > propagation can handle, and that is what I added cprop_reg_p for. Makes sense indeed. I didn't think about the meaning of it. > (Note that cprop can _currently_ handle only pseudos but there is no > reason why a limited set of hard regs can't be handled also, e.g. the > flag registers like in targetm.fixed_condition_code_regs). > > In this case, the result is that REG_P is checked twice. > But then again, cprop_reg_p will be inlined and the double check > optimized away. True. > > Anyway, I guess we've bikeshedded long enough over this patch as it is > :-) Let's post a final form and declare it OK for stage1. What about the cprop_reg_p that needs to be negated? Did I miss something that makes it ok? > > As for PSEUDO_REG_P: If it were up to me, I'd like to have in rtl.h: > > static bool > hard_register_p (rtx x) > { > return (REG_P (x) && HARD_REGISTER_NUM_P (REGNO (x))); > } > > static bool > pseudo_register_p (rtx x) > { > return (REG_P (x) && !HARD_REGISTER_NUM_P (REGNO (x))); > } > > and do away with all the FIRST_PSEUDO_REGISTER tests. But I've > proposed this in the past and there was opposition. Perhaps when we > introduce a rtx_reg class... Ok I'll try to dig up what was the reasons presented. Anyway, it would be done in a separate patch so not a problem for this one. FYI testing your patch with the one cprop_reg_p negated as said in my previous email shows no regression on arm-none-eabi cross-compiler targeting Cortex-M3. Testing for x86_64 is ongoing. Best regards, Thomas
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > > I noticed in do_local_cprop you replace >= FIRST_PSEUDO_REGISTER by > cprop_reg_p without removing the REG_P as well. Sorry, I missed the parenthesis. REG_P does indeed need to be kept. I'd be tempted to use !HARD_REGISTER_P instead since REG_P is already checked but I don't mind either way. Best regards, Thomas
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
Hi Steven, > From: Steven Bosscher [mailto:stevenb@gmail.com] > Sent: Friday, March 20, 2015 3:54 PM > > > What I meant, is that I believe the tests are already done in > hash_scan_set and should be redundant in cprop_insn (i.e. the test can > be replaced with gcc_[checking_]assert). Ok. > > I've attached a patch with some changes to it: introduce cprop_reg_p() > to get rid of all the "REG_P && regno > FIRST_PSEUDO_REGISTER" tests. > I still have the cprop_constant_p and cprop_reg_p tests in cprop_insn > but this weekend I'll try with gcc_checking_asserts instead. Please > have a look at the patch and let me know if you like it (given it's > mostly yours I hope you do like it ;-) I think it would be preferable to introduce PSEUDO_REG_P in rtl.h as this seems like a common enough pattern [1]. It would be nice to have a HARD_REG_P that would cover the other common patterns REG_P && < FIRST_PSEUDO_REGISTER and REG_P && HARD_REGISTER_P but I can't come up with a good name (HARD_REGISTER_P is confusing because it doesn't check if it's a register in the first place). I noticed in do_local_cprop you replace >= FIRST_PSEUDO_REGISTER by cprop_reg_p without removing the REG_P as well. In implicit_set_cond_p there is a replacement of !REG_P || HARD_REGISTER_P by cprop_reg_p. It seems to me it should rather be replaced by !cprop_reg_p. Otherwise it looks ok. [1] grep -R "REG_P .*&&.*>= FIRST_PSEUDO_REGISTER" . | wc -l returns 23 > > Bootstrapped & tested on powerpc64-unknown-linux-gnu. In building all > of cc1, 35 extra copies are propagated with the patch. I'll try to launch a build and testsuite run with these 2 issues fixed before I leave tonight and will report the result on Monday. Best regards, Thomas
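For concreteness, here is a sketch of what the proposed rtl.h helper could look like, written in terms of the existing REG_P and HARD_REGISTER_P macros so that it matches both cprop_reg_p from the patch and the pseudo_register_p helper Steven proposed elsewhere in this thread; the name PSEUDO_REG_P is the one suggested above, and this is not an actual GCC definition:

/* Sketch only: true for any register rtx that is not a hard register.  */
#define PSEUDO_REG_P(X) (REG_P (X) && !HARD_REGISTER_P (X))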
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Monday, March 09, 2015 7:48 PM
> To: Thomas Preud'homme
> Cc: GCC Patches; Eric Botcazou
> Subject: Re: [PATCH, stage1] Move insns without introducing new
> temporaries in loop2_invariant

New patch below.

> It looks like this would run for all candidate loop invariants, right?
>
> If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a
> potential compile time hog for large loops.
>
> But why compute this at all? Perhaps I'm missing something, but you
> already have inv->always_executed available, no?

Indeed. I didn't realize the information was already there.

> > +      basic_block use_bb;
> > +
> > +      ref = DF_REF_INSN (use);
> > +      use_bb = BLOCK_FOR_INSN (ref);
>
> You can use DF_REF_BB.

Since I need use_insn here I kept BLOCK_FOR_INSN, but I used DF_REF_BB for the def below.

So here are the new ChangeLog entries:

*** gcc/ChangeLog ***

2015-03-11  Thomas Preud'homme

        * loop-invariant.c (can_move_invariant_reg): New.
        (move_invariant_reg): Call above new function to decide whether
        instruction can just be moved, skipping creation of temporary
        register.

*** gcc/testsuite/ChangeLog ***

2015-03-12  Thomas Preud'homme

        * gcc.dg/loop-8.c: New test.
        * gcc.dg/loop-9.c: New test.

And the new patch:

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..8217d62 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1512,6 +1512,79 @@ replace_uses (struct invariant *inv, rtx reg, bool in_group)
   return 1;
 }
 
+/* Whether invariant INV setting REG can be moved out of LOOP, at the end of
+   the block preceding its header.  */
+
+static bool
+can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg)
+{
+  df_ref def, use;
+  unsigned int dest_regno, defs_in_loop_count = 0;
+  rtx_insn *insn = inv->insn;
+  basic_block bb = BLOCK_FOR_INSN (inv->insn);
+
+  /* We ignore hard registers and memory accesses for cost and complexity
+     reasons.  Hard registers are few at this stage and expensive to consider
+     as they require building a separate data flow.  Memory accesses would
+     require using df_simulate_* and can_move_insns_across functions and are
+     more complex.  */
+  if (!REG_P (reg) || HARD_REGISTER_P (reg))
+    return false;
+
+  /* Check whether the set is always executed.  We could omit this condition
+     if we know that the register is unused outside of the loop, but it does
+     not seem worth finding out.  */
+  if (!inv->always_executed)
+    return false;
+
+  /* Check that all uses that are reached by the def in insn would still be
+     reached by it.  */
+  dest_regno = REGNO (reg);
+  for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use))
+    {
+      rtx_insn *use_insn;
+      basic_block use_bb;
+
+      use_insn = DF_REF_INSN (use);
+      use_bb = BLOCK_FOR_INSN (use_insn);
+
+      /* Ignore instruction considered for moving.  */
+      if (use_insn == insn)
+        continue;
+
+      /* Don't consider uses outside loop.  */
+      if (!flow_bb_inside_loop_p (loop, use_bb))
+        continue;
+
+      /* Don't move if a use is not dominated by def in insn.  */
+      if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID (use_insn))
+        return false;
+      if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb))
+        return false;
+    }
+
+  /* Check for other defs.  Any other def in the loop might reach a use
+     currently reached by the def in insn.  */
+  for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = DF_REF_NEXT_REG (def))
+    {
+      basic_block def_bb = DF_REF_BB (def);
+
+      /* Defs in an exit block cannot reach a use they weren't already
+         reaching.  */
+      if (single_succ_p (def_bb))
+        {
+          basic_block def_bb_succ;
+
+          def_bb_succ = single_succ (def_bb);
+          if (!flow_bb_inside_loop_p (loop, def_bb_succ))
+            continue;
+        }
+
+      if (++defs_in_loop_count > 1)
+        return false;
+    }
+
+  return true;
+}
+
 /* Move invariant INVNO out of the LOOP.  Returns true if this succeeds, false
    otherwise.  */
 
@@ -1545,11 +1618,8 @@ move_invariant_reg (struct loop *loop, unsigned invno)
 	}
     }
 
-  /* Move the set out of the loop.  If the set is always executed (we could
-     omit this condition if we know that the register is unused outside of
-     the loop, but it does not seem worth finding out) and it has no uses
-     that would not be dominated by it, we may just move it (TODO).
-     Otherwise we need to create a temporary register.  */
+  /* If possible, just move the set out of the loop.  Otherwise, we
+     need to create a temporary register.  */
   set = single_set (inv->insn);
   reg = dest = SET_DEST (set);
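For illustration, here is a hypothetical C-level sketch (not one of the new gcc.dg/loop-8.c or loop-9.c tests, whose contents are not reproduced here) of the situation the second loop above guards against: the pseudo holding c gets a second def inside the loop, so the invariant set cannot simply be moved to the preheader.

/* Hypothetical sketch only: whether loop2_invariant sees exactly this
   shape depends on earlier passes.  The first assignment to c is loop
   invariant, but c is assigned again inside the loop, so moving the
   first set before the loop could change which def reaches the use in
   the addition.  can_move_invariant_reg therefore returns false and
   the pass falls back to creating a temporary register.  */
int
f (int *a, int n, int flag)
{
  int i, c, s = 0;

  for (i = 0; i < n; i++)
    {
      c = 123;          /* invariant set, candidate for moving */
      if (flag)
        c = a[i];       /* second def of the same pseudo in the loop */
      s += c;
    }
  return s;
}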
RE: [PATCH, stage1] Make function names visible in -fdump-rtl-*-graph
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Friday, March 13, 2015 5:02 PM
>
> > Is this ok for stage1? It's not a bug but it helps debuggability so is
> > this something we might consider backporting?
>
> It's ok now given you bootstrapped the change.

I did, plus the regression testsuite, on both arm-none-eabi and x86_64-linux-gnu. I forgot to mention it, sorry. I just committed it.

Best regards,

Thomas
[PATCH, stage1] Make function names visible in -fdump-rtl-*-graph
Hi,

The description is longer than the patch so you might want to skip directly to it.

The dot file generated by the -fdump-rtl-*-graph switches groups basic blocks of a given function together in a subgraph and uses the function name as its label. However, when generating an image (for instance an SVG with "dot -Tsvg") the label does not appear. This makes analyzing the resulting file more difficult than it should be.

The section "Subgraphs and clusters" of "The DOT language" document contains the following excerpt:

"The third role for subgraphs directly involves how the graph will be laid out by certain layout engines. If the name of the subgraph begins with cluster, Graphviz notes the subgraph as a special cluster subgraph. If supported, the layout engine will do the layout so that the nodes belonging to the cluster are drawn together, with the entire drawing of the cluster contained within a bounding rectangle. Note that, for good and bad, cluster subgraphs are not part of the DOT language, but solely a syntactic convention adhered to by certain of the layout engines."

Hence prepending "cluster_" to the subgraph id (not its label) improves the output image with many layout engines while making no difference for the other layout engines. The patch also makes the subgraph boundary visible with dashed lines and adds "()" to the label of the subgraph (so for a function f the label would be "f ()").

ChangeLog entry is as follows:

*** gcc/ChangeLog ***

2015-03-10  Thomas Preud'homme

        * graph.c (print_graph_cfg): Make function names visible and append
        parentheses to them.  Also make groups of basic blocks belonging to
        the same function visible.

diff --git a/gcc/graph.c b/gcc/graph.c
index a1eb24c..5fb0d78 100644
--- a/gcc/graph.c
+++ b/gcc/graph.c
@@ -292,9 +292,10 @@ print_graph_cfg (const char *base, struct function *fun)
   pretty_printer graph_slim_pp;
   graph_slim_pp.buffer->stream = fp;
   pretty_printer *const pp = &graph_slim_pp;
-  pp_printf (pp, "subgraph \"%s\" {\n"
-	     "\tcolor=\"black\";\n"
-	     "\tlabel=\"%s\";\n",
+  pp_printf (pp, "subgraph \"cluster_%s\" {\n"
+	     "\tstyle=\"dashed\";\n"
+	     "\tcolor=\"black\";\n"
+	     "\tlabel=\"%s ()\";\n",
 	     funcname, funcname);
   draw_cfg_nodes (pp, fun);
   draw_cfg_edges (pp, fun);

Is this ok for stage1? It's not a bug fix, but it helps debuggability, so is this something we might consider backporting?

Best regards,

Thomas
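P.S. To make the effect concrete: assuming a function named foo (a hypothetical name), the subgraph header emitted by the patched print_graph_cfg would look roughly as follows, derived from the format string above with the node and edge statements omitted:

subgraph "cluster_foo" {
	style="dashed";
	color="black";
	label="foo ()";
	/* basic block nodes and edges follow here */
}

Layout engines that honour the cluster convention then draw the function's basic blocks inside a dashed, labelled box; engines that do not simply treat it as an ordinary subgraph, as the quoted documentation explains.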
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Monday, March 09, 2015 7:48 PM
> To: Thomas Preud'homme
> Cc: GCC Patches; Eric Botcazou
> Subject: Re: [PATCH, stage1] Move insns without introducing new
> temporaries in loop2_invariant
>
> On Thu, Mar 5, 2015 at 10:53 AM, Thomas Preud'homme wrote:
> > diff --git a/gcc/dominance.c b/gcc/dominance.c
> > index 33d4ae4..09c8c90 100644
> > --- a/gcc/dominance.c
> > +++ b/gcc/dominance.c
> > @@ -982,7 +982,7 @@ nearest_common_dominator_for_set (enum cdi_direction dir, bitmap blocks)
> >
> >     A_Dominated_by_B (node A, node B)
> >     {
> > -      return DFS_Number_In(A) >= DFS_Number_In(A)
> > +      return DFS_Number_In(A) >= DFS_Number_In(B)
> >         && DFS_Number_Out (A) <= DFS_Number_Out(B);
> >     }  */
>
> This hunk is obvious enough ;-)

Thus committed.

Best regards,

Thomas
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
> From: Jiong Wang [mailto:jiong.w...@arm.com]
> Sent: Friday, March 06, 2015 8:10 PM
>
> On 05/03/15 09:53, Thomas Preud'homme wrote:
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2015-02-16  Thomas Preud'homme
> >
> >         * gcc.dg/loop-7.c: Run on all targets and check for loop2_invariant
> >         being able to move instructions without introducing new temporary
> >         register.
>
> Thomas,
>
>   Can you please confirm that relaxing this to all targets will not fail on
> AArch64? It does fail in my quick test.

Indeed, I made a very naïve assumption here and should have tested on a wider range of targets. I'll rework the testcases associated with this patch.

Thanks for catching it, Jiong.

Best regards,

Thomas
[PATCH] Fix PR63743: Incorrect ordering of operands in sequence of commutative operations
Hi,

The improved canonicalization after r216728 causes GCC to more often generate poor code due to a suboptimal ordering of the operands of commutative libcalls. Indeed, if one of the operands of a commutative operation is the result of a previous operation, both operations being implemented by libcalls, the wrong ordering of the operands in the second operation can lead to extra movs. Consider the following case on softfloat targets:

double
test1 (double x, double y)
{
  return x * (x + y);
}

If the result of x + y is passed as the operand that uses the same registers as the return value of the __aeabi_dadd libcall, then no mov is generated; otherwise movs are needed. The following happens on arm softfloat with the right ordering:

	bl	__aeabi_dadd
	ldr	r2, [sp]
	ldr	r3, [sp, #4]
	/* r0, r1 are reused from the return values of the __aeabi_dadd libcall.  */
	bl	__aeabi_dmul

With the wrong ordering one gets:

	bl	__aeabi_dadd
	mov	r2, r0
	mov	r3, r1
	ldr	r0, [sp]
	ldr	r1, [sp, #4]
	bl	__aeabi_dmul

This patch extends the patch written by Yuri Rumyantsev in r219646 to also deal with the case where only one of the operands is the result of a previous operation.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2015-03-05  Thomas Preud'homme

        PR tree-optimization/63743
        * cfgexpand.c (reorder_operands): Also reorder if only second operand
        had its definition forwarded by TER.

*** gcc/testsuite/ChangeLog ***

2015-03-05  Thomas Preud'homme

        PR tree-optimization/63743
        * gcc.dg/pr63743.c: New test.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 7dfe1f6..4fbc037 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5117,13 +5117,11 @@ reorder_operands (basic_block bb)
 	continue;
       /* Swap operands if the second one is more expensive.  */
       def0 = get_gimple_for_ssa_name (op0);
-      if (!def0)
-	continue;
       def1 = get_gimple_for_ssa_name (op1);
       if (!def1)
 	continue;
       swap = false;
-      if (lattice[gimple_uid (def1)] > lattice[gimple_uid (def0)])
+      if (!def0 || lattice[gimple_uid (def1)] > lattice[gimple_uid (def0)])
 	swap = true;
       if (swap)
 	{
@@ -5132,7 +5130,7 @@ reorder_operands (basic_block bb)
 	    fprintf (dump_file, "Swap operands in stmt:\n");
 	    print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 	    fprintf (dump_file, "Cost left opnd=%d, right opnd=%d\n",
-		     lattice[gimple_uid (def0)],
+		     def0 ? lattice[gimple_uid (def0)] : 0,
 		     lattice[gimple_uid (def1)]);
 	  }
 	  swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt),
diff --git a/gcc/testsuite/gcc.dg/pr63743.c b/gcc/testsuite/gcc.dg/pr63743.c
new file mode 100644
index 000..87254ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr63743.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-rtl-expand-details" } */
+
+double
+libcall_dep (double x, double y)
+{
+  return x * (x + y);
+}
+
+/* { dg-final { scan-rtl-dump-times "Swap operands" 1 "expand" } } */
+/* { dg-final { cleanup-rtl-dump "expand" } } */

The testsuite was run in QEMU for an arm-none-eabi GCC cross-compiler targeting Cortex-M3 and natively for a bootstrapped x86_64 GCC, without any regression. CSiBE shows a 0.5034% code size decrease on arm-none-eabi and a 0.0058% code size increase on x86_64-linux-gnu.

Is it ok for trunk (since it fixes a code size regression in 5.0)?

Best regards,

Thomas
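P.S. To make the TER interaction concrete, the GIMPLE for libcall_dep looks roughly like this at expand time (a sketch, not an actual dump):

  _1 = x_2(D) + y_3(D);
  _4 = x_2(D) * _1;

For the multiplication, op0 is x_2(D), a default definition with no defining statement, so get_gimple_for_ssa_name returns NULL for it, while op1 (_1) does have a definition forwarded by TER. Before this patch the NULL def0 made reorder_operands give up on the statement; with the patch the operands are still swapped, so the operand computed by the __aeabi_dadd libcall comes first and its result registers can be reused directly by the __aeabi_dmul call, as in the first assembly sequence above.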
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Thursday, March 05, 2015 7:12 PM
>
> > loop header
> > start of loop body
> > //stuff
> > (set (reg 128) (const_int 0))
> > //other stuff
> > end of loop body
> >
> > becomes:
> >
> > (set (reg 129) (const_int 0))
> > loop header
> > start of loop body
> > //stuff
> > (set (reg 128) (reg 129))
> > //other stuff
> > end of loop body
>
> Why doesn't copy-propagation clean this up?  It's run after loop2.

Actually cprop3 is what makes the situation worse in this case, as it propagates the constant that is set outside the loop into the mov that is inside the loop. In the case of PR64616 the constant is a symbol_ref, which makes it a memory access, so the memory access is propagated into the loop and the load ends up being executed on every iteration.

Note that, as I said in the intro, this bug is also solved by [1], which fixes the first thing that goes wrong for this example. That being said, the loop invariant pass ought to simply move instructions when it can safely do so.

[1] https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00933.html

Best regards,

Thomas
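P.S. A rough sketch of what goes wrong in the PR64616 case (illustrative only; the symbol name is made up and this is not the actual RTL from the PR). After loop2_invariant has hoisted the set through a new temporary, the code looks like

  (set (reg 129) (symbol_ref "some_var"))	;; before the loop, loaded once
  ... loop body ...
  (set (reg 128) (reg 129))

and cprop3 then propagates the constant into the loop:

  ... loop body ...
  (set (reg 128) (symbol_ref "some_var"))	;; typically a literal-pool load on ARM,
						;; now executed on every iteration

which is how the useless ldr ends up inside the loop. Simply moving the original instruction, as this patch does, leaves nothing for cprop3 to push back in.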
[PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
Note: this is stage1 material.

Currently the loop2_invariant pass hoists instructions out of a loop by creating a new temporary for the destination register of the instruction and leaving a mov from the new temporary to the old register in the loop, as shown below:

loop header
start of loop body
//stuff
(set (reg 128) (const_int 0))
//other stuff
end of loop body

becomes:

(set (reg 129) (const_int 0))
loop header
start of loop body
//stuff
(set (reg 128) (reg 129))
//other stuff
end of loop body

This is one of the errors that led to a useless ldr ending up inside a loop (PR64616). This patch fixes this specific bit (another bit was fixed in [1]) by simply moving the instruction when it is known to be safe. This is decided by looking at all the uses of the register set in the instruction and checking that (i) they are all dominated by the instruction and (ii) there is no other def in the loop that could end up reaching one of the uses.

[1] https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00933.html

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2015-02-16  Thomas Preud'homme

        * dominance.c (nearest_common_dominator_for_set): Fix A_Dominated_by_B
        code in comment.
        * loop-invariant.c (can_move_invariant_reg): New.
        (move_invariant_reg): Call above new function to decide whether
        instruction can just be moved, skipping creation of temporary
        register.

*** gcc/testsuite/ChangeLog ***

2015-02-16  Thomas Preud'homme

        * gcc.dg/loop-7.c: Run on all targets and check for loop2_invariant
        being able to move instructions without introducing new temporary
        register.
        * gcc.dg/loop-8.c: New test.

diff --git a/gcc/dominance.c b/gcc/dominance.c
index 33d4ae4..09c8c90 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -982,7 +982,7 @@ nearest_common_dominator_for_set (enum cdi_direction dir, bitmap blocks)
 
    A_Dominated_by_B (node A, node B)
    {
-     return DFS_Number_In(A) >= DFS_Number_In(A)
+     return DFS_Number_In(A) >= DFS_Number_In(B)
        && DFS_Number_Out (A) <= DFS_Number_Out(B);
    }  */
 
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..ab2a45c 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1512,6 +1512,99 @@ replace_uses (struct invariant *inv, rtx reg, bool in_group)
   return 1;
 }
 
+/* Whether invariant INV setting REG can be moved out of LOOP, at the end of
+   the block preceding its header.  */
+
+static bool
+can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg)
+{
+  df_ref def, use;
+  bool ret = false;
+  unsigned int i, dest_regno, defs_in_loop_count = 0;
+  rtx_insn *insn = inv->insn;
+  bitmap may_exit, has_exit, always_executed;
+  basic_block *body, bb = BLOCK_FOR_INSN (inv->insn);
+
+  /* We ignore hard registers and memory accesses for cost and complexity
+     reasons.  Hard registers are few at this stage and expensive to consider
+     as they require building a separate data flow.  Memory accesses would
+     require using df_simulate_* and can_move_insns_across functions and are
+     more complex.  */
+  if (!REG_P (reg) || HARD_REGISTER_P (reg))
+    return false;
+
+  /* Check whether the set is always executed.  We could omit this condition
+     if we know that the register is unused outside of the loop, but it does
+     not seem worth finding out.  */
+  may_exit = BITMAP_ALLOC (NULL);
+  has_exit = BITMAP_ALLOC (NULL);
+  always_executed = BITMAP_ALLOC (NULL);
+  body = get_loop_body_in_dom_order (loop);
+  find_exits (loop, body, may_exit, has_exit);
+  compute_always_reached (loop, body, has_exit, always_executed);
+  /* Find bit position for basic block bb.  */
+  for (i = 0; i < loop->num_nodes && body[i] != bb; i++);
+  if (!bitmap_bit_p (always_executed, i))
+    goto cleanup;
+
+  /* Check that all uses that are reached by the def in insn would still be
+     reached by it.  */
+  dest_regno = REGNO (reg);
+  for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use))
+    {
+      rtx ref;
+      basic_block use_bb;
+
+      ref = DF_REF_INSN (use);
+      use_bb = BLOCK_FOR_INSN (ref);
+
+      /* Ignore instruction considered for moving.  */
+      if (ref == insn)
+        continue;
+
+      /* Don't consider uses outside loop.  */
+      if (!flow_bb_inside_loop_p (loop, use_bb))
+        continue;
+
+      /* Don't move if a use is not dominated by def in insn.  */
+      if (use_bb == bb && DF_INSN_LUID (insn) > DF_INSN_LUID (ref))
+        goto cleanup;
+      if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb))
+        goto cleanup;
+
+      /* Check for other defs.  Any other def in the loop might reach a use
+         currently reached by the def in insn.  */
+      if (!defs_in_loop_count)
+        {
+          for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = DF_REF_NEXT_REG (def))
+            {
+              basic_block def_bb = BLOCK_F
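To give a C-level picture of the transformation described above, here is a hypothetical sketch (not the PR64616 testcase and not one of the tests added by the patch): the set of p is loop invariant, its pseudo has a single def and all uses are dominated by it, so with this patch the set can simply be moved before the loop instead of going through a new temporary plus an in-loop register copy.

/* Hypothetical sketch only: whether the RTL has exactly this shape
   depends on earlier passes.  The assignment to p is a loop-invariant
   set of a symbol_ref; moving it out of the loop directly avoids both
   the extra pseudo and the in-loop copy that cprop could later turn
   back into a load.  */
extern int data[16];

int
sum (int n)
{
  int i, s = 0;

  for (i = 0; i < n; i++)
    {
      int *p = data;	/* invariant set: single def, all uses dominated */
      s += p[i];
    }
  return s;
}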
RE: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible
Ping?

> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org]
> On Behalf Of Thomas Preud'homme
[SNIP]
> > Likewise for the REG_P and ">= FIRST_PSEUDO_REGISTER" tests here (with
> > the equivalent and IMHO preferable HARD_REGISTER_P test in
> > find_avail_set()).
>
> I'm not sure I follow you here. First, it seems to me that the equivalent
> test is rather REG_P && !HARD_REGISTER_P since here it checks if it's
> a pseudo register.
>
> Then, do you mean the test can be simply removed because of the
> REG_P && !HARD_REGISTER_P in hash_scan_set () called indirectly by
> compute_hash_table () when called in one_cprop_pass () before any
> cprop_insn ()? Or do you mean I should move the check in
> find_avail_set ()?
>
> Best regards,
>
> Thomas