[Patch, avr] Fix PR 50739 - nameless error with -fmerge-all-constants

2016-07-04 Thread Senthil Kumar Selvaraj
Hi,

  This patch fixes a problem with -fmerge-all-constants and the progmem
  attribute - on trunk, the below testcase errors out with a section
  conflict error.

  When avr_asm_select_section renames a .rodata.xyz section to
  .progmem.xyz and calls get_section, it passes in the same flags as in
  sect. If the flags include SECTION_DECLARED, get_section fails with a
  section conflict error - the section flag comparison logic strips off
  SECTION_DECLARED from the existing section's flags before comparing
  them with the new incoming flags.

  With -fmerge-all-constants, default_elf_select_section always returns
  .rodata.strx.x. varasm switches to that section when writing out the
  non progmem string literal, and that sets SECTION_DECLARED. The first
  call to get_section with the section name transformed to
  .progmem.data.strx.x then includes SECTION_DECLARED, but because this
  is a new section, the section flag conflict logic doesn't kick in. The
  second call to get_section, again including SECTION_DECLARED, triggers
  the section flag conflict logic and causes the error.

  Stripping off SECTION_DECLARED before calling get_section fixes the
  problem - the flag is supposed to be set by switch_section anyway.
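
  The sequence above can be modelled with a small standalone C sketch.
  This is not the varasm.c code: toy_get_section, the toy_ flag values
  and the fixed-size table are made up for illustration, and only the
  comparison rule described above (DECLARED stripped from the stored
  flags, but not from the incoming ones) is taken from the real logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's section flags; values are arbitrary.  */
#define TOY_SECTION_DECLARED 0x1u
#define TOY_SECTION_WRITE    0x2u

struct toy_section
{
  const char *name;
  unsigned flags;
  int in_use;
};

static struct toy_section toy_table[8];

/* Toy model of a named-section lookup.  Returns the section, or NULL
   on a flag conflict (or if the toy table is full).  */
static struct toy_section *
toy_get_section (const char *name, unsigned flags)
{
  assert (name != NULL);

  for (size_t i = 0; i < 8; i++)
    {
      struct toy_section *s = &toy_table[i];
      if (s->in_use && strcmp (s->name, name) == 0)
        {
          /* DECLARED is stripped from the stored flags only, so an
             incoming DECLARED bit can never match.  */
          if ((s->flags & ~TOY_SECTION_DECLARED) != flags)
            return NULL;
          return s;
        }
    }

  /* New section: no comparison happens, flags are stored as given.  */
  for (size_t i = 0; i < 8; i++)
    if (!toy_table[i].in_use)
      {
        toy_table[i].name = name;
        toy_table[i].flags = flags;
        toy_table[i].in_use = 1;
        return &toy_table[i];
      }
  return NULL;
}
```

  In this model the first lookup of a new name with DECLARED set
  succeeds and the second one conflicts, while masking the bit at the
  call site makes both lookups succeed.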

  Reg testing showed no new regressions. Ok for trunk and backport to 6.x?

Regards
Senthil


gcc/testsuite/ChangeLog:

2016-07-05  Senthil Kumar Selvaraj  

PR target/50739 
* gcc.target/avr/pr50739.c: New test.


gcc/ChangeLog:

2016-07-05  Senthil Kumar Selvaraj  

PR target/50739 
* config/avr/avr.c (avr_asm_select_section): Strip off
SECTION_DECLARED from flags when calling get_section.


diff --git gcc/config/avr/avr.c gcc/config/avr/avr.c
index 18ed766..9b7b392 100644
--- gcc/config/avr/avr.c
+++ gcc/config/avr/avr.c
@@ -9641,7 +9641,7 @@ avr_asm_select_section (tree decl, int reloc, unsigned HOST_WIDE_INT align)
 {
   const char *sname = ACONCAT ((new_prefix,
 name + strlen (old_prefix), NULL));
-  return get_section (sname, sect->common.flags, sect->named.decl);
+  return get_section (sname, sect->common.flags & ~SECTION_DECLARED, sect->named.decl);
 }
 }
 
diff --git gcc/testsuite/gcc.target/avr/pr50739.c gcc/testsuite/gcc.target/avr/pr50739.c
new file mode 100644
index 000..a6850b7
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr50739.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-fmerge-all-constants" } */
+
+char *ca = "123";
+
+const char a[] __attribute__((__progmem__))= "a";
+const char b[] __attribute__((__progmem__))= "b";
-- 
2.7.4



Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-07-04 Thread Alessandro Fanfarillo
* PING *

2016-06-21 10:59 GMT-06:00 Alessandro Fanfarillo :
> * PING *
>
> 2016-06-06 15:05 GMT-06:00 Alessandro Fanfarillo :
>> Dear all,
>>
>> please find in attachment the first patch (of n) for the FAILED IMAGES
>> capability defined in the coarray TS 18508.
>> The patch adds support for three new intrinsic functions defined in
>> the TS for simulating a failure (fail image), checking an image status
>> (image_status) and getting the list of failed images (failed_images).
>> The patch has been built and regtested on x86_64-pc-linux-gnu.
>>
>> Ok for trunk?
>>
>> Alessandro


Re: -fopt-info handling

2016-07-04 Thread Ulrich Drepper
Anyone?

On Mon, Jun 27, 2016 at 1:31 PM, Ulrich Drepper  wrote:
> The manual says about -fopt-info:
>
>If OPTIONS is omitted, it defaults to 'all-all', which means
> dump all available optimization info from all the passes.
>
> The current implementation (at least in recent gcc 6.1) doesn't follow
> that, though.  It just ignores the option in that case.
>
> How about the attached patch?  It is simple and doesn't duplicate the
> information about what "all-all" means, and instead lets the option
> parser do the hard work.


[patch, fortran] Bug 66575 - Endless compilation on missing end interface

2016-07-04 Thread Jerry DeLisle

This patch and test case regression tested on x86-64.

Will commit under simple/obvious rule.

Regards,

Jerry

2016-07-04  Jerry DeLisle  

PR fortran/66575
* decl.c (match_procedure_interface): Exit loop if procedure
interface refers to itself.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 724f14f7..1b62833f 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -5454,7 +5454,8 @@ match_procedure_interface (gfc_symbol **proc_if)
   /* Resolve interface if possible. That way, attr.procedure is only set
 	 if it is declared by a later procedure-declaration-stmt, which is
 	 invalid per F08:C1216 (cf. resolve_procedure_interface).  */
-  while ((*proc_if)->ts.interface)
+  while ((*proc_if)->ts.interface
+	 && *proc_if != (*proc_if)->ts.interface)
 	*proc_if = (*proc_if)->ts.interface;
 
   if ((*proc_if)->attr.flavor == FL_UNKNOWN
! { dg-do compile }
! Bug 66575 - Endless compilation on missing end interface 
program p
   procedure(g) :: g ! { dg-error "may not be used as its own interface" }
   procedure(g) ! { dg-error "Syntax error in PROCEDURE statement" }
end
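
For readers less familiar with decl.c, the shape of the fix can be shown
with a standalone C sketch. The names toy_symbol and
toy_resolve_interface are hypothetical; only the loop condition mirrors
the patch. Like the patch, it guards against a symbol that is its own
interface, not against longer cycles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol with an optional interface link.  */
struct toy_symbol
{
  struct toy_symbol *interface;
};

/* Follow the interface chain to its end, stopping early if a symbol
   names itself as its own interface (which would otherwise loop
   forever).  */
static struct toy_symbol *
toy_resolve_interface (struct toy_symbol *sym)
{
  assert (sym != NULL);
  while (sym->interface != NULL && sym != sym->interface)
    sym = sym->interface;
  return sym;
}
```

A chain a -> b -> c resolves to c, and a symbol whose interface is
itself is returned unchanged instead of hanging.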


Re: [Fortran] Help with STAT= attribute in coarray reference

2016-07-04 Thread Mikael Morin

On 30/06/2016 06:05, Alessandro Fanfarillo wrote:

Dear Mikael,

thanks for your review and for the test. The attached patch, built and
regtested for x86_64-pc-linux-gnu, addresses all the suggestions.

The next patch will change the documentation related to the caf_get
and caf_send functions and will add support for STAT= to the sendget
function.

In the meantime, is this patch OK for trunk?


Yes, thanks.

Mikael




Re: [PATCH] Add code-hoisting to GIMPLE

2016-07-04 Thread Steven Bosscher
On Mon, Jul 4, 2016 at 1:26 PM, Richard Biener wrote:
>
> The following patch is Steven's code-hoisting based on PRE forward-ported
> and fixed for bootstrap plus the case of hoisting code across loops
> which we generally do not want (expressions in the loop exit target block
> are antic-in throughout the whole loop unless they are killed and thus
> get inserted into the exit block and then PREd before the loop).
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> I'm going to try making the bitmap_set ops in do_hoist_insert a bit
> faster - Steven, do you remember any issues with the approach from the
> time you worked on it?

Hi Richi,

It's been almost 8 years since I worked on this, so I really don't
recall much about this at all. Sorry :-)

But thank you for picking this old work up!

Ciao!
Steven


Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread Mike Stump
On Jul 4, 2016, at 12:36 PM, Markus Trippelsdorf  wrote:
> 
> On 2016.07.04 at 10:08 -0700, H.J. Lu wrote:
>> On Sun, Jul 3, 2016 at 9:38 PM, Davide Italiano  
>> wrote:
>>> On Thu, Jun 23, 2016 at 9:11 PM, Davide Italiano  
>>> wrote:
>>>> + HJ who wrote the code for the option originally.
>>>>
>>>> On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano
>>>> wrote:
>>>>> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
>>>>> I experiment a lot with gcc and lld so it would be nice if
>>>>> -fuse-ld=lld is supported (considering the linker is now mature enough
>>>>> to link large C/C++ applications).
>>>>>
>>>>> Also, IMHO, -fuse-ld should be a generic facility which accept other
>>>>> linkers (as long as they follow the convention ld.), and should
>>>>> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
>>>>> Probably outside of the scope of this patch, but I thought worth
>>>>> mentioning.
>>> 
>>> Hi, can anybody take a look?
>> 
>> lld isn't compatible with GCC:
>> 
>> https://llvm.org/bugs/show_bug.cgi?id=28414
> 
> Besides the technical issues, this also raises the question if it is
> right to support lld at all. Because this project was obviously started
> to replace the GNU linkers (ld.bfd and gold) in the long run.
> So I see no reason why it should be supported in GCC.
> 
> (And who needs a buggy new ELF linker anyway?)

So, this is off-topic for the list; gnu.misc.discuss is a better forum for
such things, if you want.  The GNU tools have no prohibition against working
with non-free system libraries or non-free tools such as ar, nm, ld and as,
or even simulators.  Contributions for interoperability with other tools
will be considered.  gcc has always been widely compatible and interoperable
with more than just Linux systems.

[PING] Re: Some fixes for autofdo test cases

2016-07-04 Thread Andi Kleen
Andi Kleen  writes:

Ping!

> This fixes some of the problems with profile test cases running with autofdo
> There are still remaining failures that need to be addressed, but this is the
> low hanging fruit.
>
> -Andi
>
>

-- 
a...@linux.intel.com -- Speaking for myself only


Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread Davide Italiano
On Mon, Jul 4, 2016 at 9:12 AM, H.J. Lu  wrote:
> On Sun, Jul 3, 2016 at 9:38 PM, Davide Italiano  wrote:
>> On Thu, Jun 23, 2016 at 9:11 PM, Davide Italiano  
>> wrote:
>>> + HJ who wrote the code for the option originally.
>>>
>>> On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano  
>>> wrote:
>>>> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
>>>> I experiment a lot with gcc and lld so it would be nice if
>>>> -fuse-ld=lld is supported (considering the linker is now mature enough
>>>> to link large C/C++ applications).
>>>>
>>>> Also, IMHO, -fuse-ld should be a generic facility which accept other
>>>> linkers (as long as they follow the convention ld.), and should
>>>> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
>>>> Probably outside of the scope of this patch, but I thought worth
>>>> mentioning.
>>>>
>>>> Thanks,

>>
>> Hi, can anybody take a look?
>>
>> Thanks,
>
> lld won't build on Fedora 24/x86-64 with GCC 6:
>
> [ 39%] Building CXX object
> tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o
> /export/gnu/import/git/llvm/tools/lld/ELF/OutputSections.cpp: In
> member function ‘void
> lld::elf::GnuHashTableSection::addSymbols(std::vector long unsigned int> >&)’:
> /export/gnu/import/git/llvm/tools/lld/ELF/OutputSections.cpp:585:8:
> error: inconsistent deduction for ‘auto’: ‘auto’ and then
> ‘__gnu_cxx::__normal_iterator unsigned int>*, std::vector unsigned int> > >’
> tools/lld/ELF/CMakeFiles/lldELF.dir/build.make:302: recipe for target
> 'tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o' failed
> gmake[4]: *** [tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o] 
> Error 1
>
> Can you fix it?



>> --
>> Davide
>>
 --
 Davide

 From 323c23d79c91d7dcee2f29b9ced8c1c00703d346 Mon Sep 17 00:00:00 2001
 From: Davide Italiano 
 Date: Thu, 23 Jun 2016 20:51:53 -0700
 Subject: [PATCH] Driver: Add support for -fuse-ld=lld.

 * collect2.c  (main): Support -fuse-ld=lld.

 * common.opt: Add fuse-ld=lld

 * doc/invoke.texi:  Document -fuse-ld=lld

 * opts.c: Ignore -fuse-ld=lld
 ---
  gcc/collect2.c  | 11 ---
  gcc/common.opt  |  4 
  gcc/doc/invoke.texi |  4 
  gcc/opts.c  |  1 +
  4 files changed, 17 insertions(+), 3 deletions(-)

 diff --git a/gcc/collect2.c b/gcc/collect2.c
 index bffac80..6a8387c 100644
 --- a/gcc/collect2.c
 +++ b/gcc/collect2.c
 @@ -831,6 +831,7 @@ main (int argc, char **argv)
USE_PLUGIN_LD,
USE_GOLD_LD,
USE_BFD_LD,
 +  USE_LLD_LD,
USE_LD_MAX
  } selected_linker = USE_DEFAULT_LD;
static const char *const ld_suffixes[USE_LD_MAX] =
 @@ -838,7 +839,8 @@ main (int argc, char **argv)
"ld",
PLUGIN_LD_SUFFIX,
"ld.gold",
 -  "ld.bfd"
 +  "ld.bfd",
 +  "ld.lld"
  };
static const char *const real_ld_suffix = "real-ld";
static const char *const collect_ld_suffix = "collect-ld";
 @@ -1004,6 +1006,8 @@ main (int argc, char **argv)
selected_linker = USE_BFD_LD;
  else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
selected_linker = USE_GOLD_LD;
 +  else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
 +selected_linker = USE_LLD_LD;

  #ifdef COLLECT_EXPORT_LIST
  /* These flags are position independent, although their order
 @@ -1093,7 +1097,8 @@ main (int argc, char **argv)
/* Maybe we know the right file to use (if not cross).  */
ld_file_name = 0;
  #ifdef DEFAULT_LINKER
 -  if (selected_linker == USE_BFD_LD || selected_linker == USE_GOLD_LD)
 +  if (selected_linker == USE_BFD_LD || selected_linker == USE_GOLD_LD ||
 +  selected_linker == USE_LLD_LD)
  {
char *linker_name;
  # ifdef HOST_EXECUTABLE_SUFFIX
 @@ -1307,7 +1312,7 @@ main (int argc, char **argv)
else if (!use_collect_ld
 && strncmp (arg, "-fuse-ld=", 9) == 0)
  {
 -  /* Do not pass -fuse-ld={bfd|gold} to the linker. */
 +  /* Do not pass -fuse-ld={bfd|gold|lld} to the linker. */
ld1--;
ld2--;
  }
 diff --git a/gcc/common.opt b/gcc/common.opt
 index 5d90385..2a95a1f 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -2536,6 +2536,10 @@ fuse-ld=gold
  Common Driver Negative(fuse-ld=bfd)
  Use the gold linker instead of the default linker.

 +fuse-ld=lld
 +Common Driver Negative(fuse-ld=lld)
 +Use the lld LLVM linker instead of the default linker.
 +
  

[PING] Re: [PATCH] Fix MPX tests on systems with MPX disabled

2016-07-04 Thread Andi Kleen
Andi Kleen  writes:

PING!

> From: Andi Kleen 
>
> I have a Skylake system with MPX in the CPU, but MPX is disabled
> in the kernel configuration.
>
> This makes all the MPX tests fail because they assume if MPX
> is in CPUID it works
>
> Check the output of XGETBV too to detect non MPX kernels.
>
> gcc/testsuite/:
>
> 2016-06-25  Andi Kleen  
>
>   * gcc.target/i386/mpx/mpx-check.h: Check XGETBV output
>   if kernel supports MPX.
> ---
>  gcc/testsuite/gcc.target/i386/mpx/mpx-check.h | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/mpx-check.h b/gcc/testsuite/gcc.target/i386/mpx/mpx-check.h
> index 3afa460..73aa01f 100644
> --- a/gcc/testsuite/gcc.target/i386/mpx/mpx-check.h
> +++ b/gcc/testsuite/gcc.target/i386/mpx/mpx-check.h
> @@ -16,6 +16,16 @@ mpx_test (int, const char **);
>  
>  #define DEBUG
>  
> +#define XSTATE_BNDREGS (1 << 3)
> +
> +/* This should be an intrinsic, but isn't.  */
> +static int xgetbv (unsigned x)
> +{
> +   unsigned eax, edx;
> +   asm ("xgetbv" : "=a" (eax), "=d" (edx) : "c" (x)); 
> +   return eax;
> +}
> +
>  int
>  main (int argc, const char **argv)
>  {
> @@ -27,7 +37,7 @@ main (int argc, const char **argv)
>__cpuid_count (7, 0, eax, ebx, ecx, edx);
>  
>/* Run MPX test only if host has MPX support.  */
> -  if (ebx & bit_MPX)
> +  if ((ebx & bit_MPX) && (xgetbv (0) & XSTATE_BNDREGS))
>  mpx_test (argc, argv);
>else
>  {

-- 
a...@linux.intel.com -- Speaking for myself only


Fwd: Re: [lra] Cleanup the use of offmemok and don't count spilling cost for it

2016-07-04 Thread Vladimir Makarov
gcc-patches has rejected the original message as it contained invalid 
MIME type.  Therefore I am re-sending it.



 Forwarded Message 
Subject: 	Re: [lra] Cleanup the use of offmemok and don't count spilling 
cost for it

Date:   Mon, 4 Jul 2016 15:44:25 -0400
From:   Vladimir Makarov 
To: Jiong Wang , gcc-patches@gcc.gnu.org
CC: 	Andreas Krebbel , 
v...@linux.vnet.ibm.com, Jeff Law , Robin Dapp 





On 06/30/2016 01:22 PM, Jiong Wang wrote:


Here is the patch,

From my understanding, "offmemok" is used to represent a memory
operand whose address we want to reload, and searching for its
reference locations seems to confirm my understanding, as it's always
used together with a MEM_P check. So this patch makes the following
modifications:

* Only set offmemok to true if MEM_P is also true, as otherwise
  offmemok is not used.
* Remove the redundant MEM_P check which was used together with
  offmemok.
* Avoid the addition of spilling cost if offmemok is true, as an
  address calculation reload is not spilling.

Bootstrap & gcc/g++ regression OK on x86_64/aarch64/arm. OK for trunk?


Yes.  The patch looks OK to me.  Thank you for working on the solution,
Jiong.  As I wrote, the code is very sensitive and any change to it might
affect some targets.  Usually patches for this part of LRA can take a
few iterations.




2016-06-30  Jiong Wang

gcc/
* lra-constraints.c (process_alt_operands): Only set "offmemok" for
MEM_P.  Remove redundant MEM_P check if it's used together with
"offmemok" check.  Don't add spilling cost for "offmemok".
(curr_insn_transform): Remove redundant MEM_P check.








Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread Davide Italiano
On Mon, Jul 4, 2016 at 12:36 PM, Markus Trippelsdorf
 wrote:
> On 2016.07.04 at 10:08 -0700, H.J. Lu wrote:
>> On Sun, Jul 3, 2016 at 9:38 PM, Davide Italiano  
>> wrote:
>> > On Thu, Jun 23, 2016 at 9:11 PM, Davide Italiano  
>> > wrote:
>> >> + HJ who wrote the code for the option originally.
>> >>
>> >> On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano  
>> >> wrote:
>> >>> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
>> >>> I experiment a lot with gcc and lld so it would be nice if
>> >>> -fuse-ld=lld is supported (considering the linker is now mature enough
>> >>> to link large C/C++ applications).
>> >>>
>> >>> Also, IMHO, -fuse-ld should be a generic facility which accept other
>> >>> linkers (as long as they follow the convention ld.), and should
>> >>> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
>> >>> Probably outside of the scope of this patch, but I thought worth
>> >>> mentioning.
>> >
>> > Hi, can anybody take a look?
>>
>> lld isn't compatible with GCC:
>>
>> https://llvm.org/bugs/show_bug.cgi?id=28414
>
> Besides the technical issues, this also raises the question if it is
> right to support lld at all. Because this project was obviously started
> to replace the GNU linkers (ld.bfd and gold) in the long run.
> So I see no reason why it should be supported in GCC.
>
> (And who needs a buggy new ELF linker anyway?)


Fair enough. Consider this patch withdrawn, sorry for the noise.

--
Davide


Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread Markus Trippelsdorf
On 2016.07.04 at 10:08 -0700, H.J. Lu wrote:
> On Sun, Jul 3, 2016 at 9:38 PM, Davide Italiano  wrote:
> > On Thu, Jun 23, 2016 at 9:11 PM, Davide Italiano  
> > wrote:
> >> + HJ who wrote the code for the option originally.
> >>
> >> On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano  
> >> wrote:
> >>> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
> >>> I experiment a lot with gcc and lld so it would be nice if
> >>> -fuse-ld=lld is supported (considering the linker is now mature enough
> >>> to link large C/C++ applications).
> >>>
> >>> Also, IMHO, -fuse-ld should be a generic facility which accept other
> >>> linkers (as long as they follow the convention ld.), and should
> >>> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
> >>> Probably outside of the scope of this patch, but I thought worth
> >>> mentioning.
> >
> > Hi, can anybody take a look?
> 
> lld isn't compatible with GCC:
> 
> https://llvm.org/bugs/show_bug.cgi?id=28414

Besides the technical issues, this also raises the question if it is
right to support lld at all. Because this project was obviously started
to replace the GNU linkers (ld.bfd and gold) in the long run.
So I see no reason why it should be supported in GCC.

(And who needs a buggy new ELF linker anyway?)

-- 
Markus


[patch, fortran] Bug 35849 - "wrong" line shown in error message for parameter

2016-07-04 Thread Jerry DeLisle
I will commit attached patch provided by Steve and reviewed and tested 
by myself to trunk.  Test case provided.


Self explanatory.

Regards,

Jerry

2016-07-04  Jerry DeLisle  
Steven G. Kargl  

PR fortran/35849
* simplify.c (gfc_simplify_ishftc): Check that absolute value of
SHIFT is less than or equal to SIZE.

Index: simplify.c
===
--- simplify.c	(revision 237855)
+++ simplify.c	(working copy)
@@ -3280,7 +3280,6 @@ gfc_simplify_ishftc (gfc_expr *e, gfc_ex
 	return NULL;
 
   gfc_extract_int (sz, );
-
 }
   else
 ssize = isize;
@@ -3294,7 +3293,10 @@ gfc_simplify_ishftc (gfc_expr *e, gfc_ex
 {
   if (sz == NULL)
 	gfc_error ("Magnitude of second argument of ISHFTC exceeds "
-		   "BIT_SIZE of first argument at %L", >where);
+		   "BIT_SIZE of first argument at %C");
+  else
+	gfc_error ("Absolute value of SHIFT shall be less than or equal "
+		   "to SIZE at %C");
   return _bad_expr;
 }
! { dg-do compile }
! PR35849
INTEGER, PARAMETER :: j = 15
INTEGER, PARAMETER, DIMENSION(10)  :: A = [(i, i = 1,10)]
INTEGER, PARAMETER, DIMENSION(10)  :: B = ISHFTC(j, A, -20) ! { dg-error "must be positive" }
INTEGER, PARAMETER, DIMENSION(10)  :: C = ISHFTC(1_1, A, j) ! { dg-error "less than or equal to BIT_SIZE" }
INTEGER, PARAMETER, DIMENSION(10)  :: D = ISHFTC(3, A, 5) ! { dg-error "Absolute value of SHIFT shall be less than or equal" }
INTEGER, PARAMETER, DIMENSION(10)  :: E = ISHFTC(3_1, A) ! { dg-error "second argument of ISHFTC exceeds BIT_SIZE of first argument" }
end


Re: [PATCH PR c/71699] Handle pointer arithmetic in nonzero tree checks

2016-07-04 Thread Mike Stump
On Jul 1, 2016, at 6:10 AM, Manish Goregaokar  wrote:
> 
> +}
> \ No newline at end of file

Minor nit, please end all files with a newline...



Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-07-04 Thread Martin Sebor

On 07/04/2016 10:44 AM, Jakub Jelinek wrote:

On Mon, Jul 04, 2016 at 10:23:06AM -0600, Martin Sebor wrote:

1) Making use of -Wformat machinery in c-family/c-format.c.  This
seemed preferable to duplicating some of the same code elsewhere
(I initially started implementing it in expand_builtin in
builtins.c).  It makes the implementation readily extensible
to all the same formats as those already handled for -Wformat.
One drawback is that unlike in expand_builtin, calls to these
functions cannot readily be folded.  Another drawback pointed


folded?  You mean this -W option changes code generation?


No, it doesn't.  What I meant is that the same code, when added
in builtins.c instead, could readily be extended to fold into
strings expressions like

   sprintf (buf, "%i", 123);


I've commented in some PR a few years ago that I'm not convinced we want to
do it, or at least not without careful consideration; consider .rodata
size.  Say the user has, in 1000 different places,
sprintf (buf, "foobarbaz %i", NNN); for various values of NNN; then such an
"optimization" would replace a single string literal of length 13 bytes
with 1000 string literals of 12-20 bytes.
Consider a larger string literal, with %s and long additions, and it might
not be a win even for 2 occurrences.


I agree that's something to consider.  But even if the call itself
weren't folded, the return value (i.e., the number of characters
computed by the checker) could be.
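
As a rough illustration of the value in question, the C sketch below
obtains the same count at run time through the standard
snprintf (NULL, 0, ...) idiom. The helper name formatted_length is made
up for this example; nothing here is part of the patch, which would
compute the number at compile time for constant arguments.

```c
#include <assert.h>
#include <stdio.h>

/* Number of characters sprintf (buf, "%i", value) would write, not
   counting the terminating NUL.  With a constant argument this is the
   quantity a compile-time checker could substitute for the call's
   return value.  */
static int
formatted_length (int value)
{
  int n = snprintf (NULL, 0, "%i", value);
  assert (n >= 0);
  return n;
}
```

For sprintf (buf, "%i", 123) that count is 3, so a caller using only the
return value would not need the library call to learn it.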

Martin


Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-07-04 Thread Bernd Schmidt

On 07/04/2016 06:44 PM, Jakub Jelinek wrote:

On Mon, Jul 04, 2016 at 10:23:06AM -0600, Martin Sebor wrote:



No, it doesn't.  What I meant is that the same code, when added
in builtins.c instead, could readily be extended to fold into
strings expressions like

  sprintf (buf, "%i", 123);


I've commented in some PR a few years ago that I'm not convinced we want to
do it, or at least not without careful consideration; consider .rodata
size.  Say the user has, in 1000 different places,
sprintf (buf, "foobarbaz %i", NNN); for various values of NNN; then such an
"optimization" would replace a single string literal of length 13 bytes
with 1000 string literals of 12-20 bytes.
Consider a larger string literal, with %s and long additions, and it might
not be a win even for 2 occurrences.


I think that's not a highly likely scenario, and it would still be a 
massive speed optimization over calling sprintf. Each such call is 
likely to be larger than the string literal anyway.



Bernd



Re: [PATCH][expr.c] PR middle-end/71700: zero-extend sub-word value when widening constructor element

2016-07-04 Thread Bernd Schmidt

On 07/01/2016 11:18 AM, Kyrill Tkachov wrote:

In this arm wrong-code PR the struct assignment goes wrong when
expanding constructor elements to a register destination
when the constructor elements are signed bitfields less than a word wide.
In this testcase we're initialising a struct with a 16-bit signed
bitfield to -1 followed by a 1-bit bitfield to 0.
Before it starts storing the elements it zeroes out the register.
 The code in store_constructor extends the first field to word size
because it appears at the beginning of a word.
It sign-extends the -1 to word size. However, when it later tries to
store the 0 to bitposition 16 it has some logic
to avoid redundant zeroing since the destination was originally cleared,
so it doesn't emit the zero store.
But the previous sign-extended -1 took up the whole word, so the
position of the second bitfield contains a set bit.

This patch fixes the problem by zeroing out the bits of the widened
field that did not appear in the original value,
so that we can safely avoid storing the second zero in the constructor.
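
A minimal C sketch of the arithmetic involved, not of the
store_constructor code itself: the helper names are invented, and a
32-bit word with the 16-bit field in its low bits is assumed purely for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 16-bit signed field to a full 32-bit word by sign extension,
   as the problematic path effectively does.  */
static uint32_t
widen_sign_extended (int16_t field)
{
  return (uint32_t) (int32_t) field;
}

/* Widen the same field but clear every bit above its original width,
   which is the effect the fix is after.  */
static uint32_t
widen_masked (int16_t field, unsigned bits)
{
  assert (bits > 0 && bits < 32);
  uint32_t mask = (UINT32_C (1) << bits) - 1;
  return (uint32_t) (int32_t) field & mask;
}
```

With a field value of -1, the sign-extended word has bit 16 set, so a
skipped zero store for the following 1-bit field leaves it reading as 1;
the masked word keeps bit 16 clear.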

[...]


Bootstrapped and tested on arm, aarch64, x86_64 though the codepath is
gated on WORD_REGISTER_OPERATIONS I didn't
expect any effect on aarch64 and x86_64 anyway.


So - that code path starts with this comment:

/* If this initializes a field that is smaller than a
   word, at the start of a word, try to widen it to a full
   word.  This special case allows us to output C++ member
   function initializations in a form that the optimizers
   can understand.  */

Doesn't your patch completely defeat the purpose of this? Would you get 
better/identical code by just deleting this block? It seems unfortunate 
to have two different code generation approaches like this.


It would be interesting to know the effects of your patch, and the 
effects of removing this code entirely, on generated code. Try to find 
the motivating C++ member function example perhaps? Maybe another 
possibility is to ensure this doesn't happen if the value would be 
interpreted as signed.



Bernd


Re: [lra] Cleanup the use of offmemok and don't count spilling cost for it

2016-07-04 Thread Bernd Schmidt

On 07/04/2016 04:05 PM, Jiong Wang wrote:

And the corresponding s390 pattern is "mov" for V_128.

(define_insn "mov"
  [(set (match_operand:V_128 0 "" "=v,v,R,  v,  v,  v,  v,  v,v,d")
  (match_operand:V_128 1 "" "v,R,v,j00,jm1,jyy,jxx,jKK,d,v"))]

As the offset "-16" does not qualify s390_short_displacement, we need a
reload.

Ideally we want alternative 2, for which gcc simply reloads the mem
address into an address register.

r157:DI=r116:DI+r69:DI-0x10
[r157:DI]=r134:V16QI#0

While after r237277, gcc is treating the reload of insn 41 as a spill
and thus increases the cost for it, so alternative 8 beats alternative
2, and the following reload sequences are generated.

r157:V4SI=r134:V16QI#0
[r116:DI+r69:DI-0x10]=r157:V4SI

GCC moves the vector register into a general register, then uses a second
instruction to store the general register into memory so it can match
alternative 8, which is "v", "d".

However the second instruction still contains the illegal mem address,
so a further reload is triggered, and gcc hits the above max number of
reloads issue.

The functional change of this patch is to make gcc not treat a memory
address reload as a spill, which is a regression caused by r237277.

Does this explanation make sense?


Yes, it explains well what's going on. I think the part of your patch 
that avoids counting a reload of an address as a spill looks ok. I'm 
uncertain whether the code still has issues after that, it seems a 
little iffy not to count the cost of reloading the memory address at 
all. We might have to add code at some point to detect if we're 
reloading a move instruction and would be generating an identical one 
when picking a given alternative.



Bernd


[committed] Fix ICE with C++11 attributes (PR c++/71739)

2016-07-04 Thread Jakub Jelinek
Hi!

We ICE with C++11 attributes, because their TREE_PURPOSE is not
IDENTIFIER_NODE, but TREE_LIST.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk and 6.2 as obvious.

2016-07-04  Jakub Jelinek  

PR c++/71739
* tree.c (attribute_value_equal): Use get_attribute_name instead of
directly using TREE_PURPOSE.

* g++.dg/cpp0x/pr71739.C: New test.

--- gcc/tree.c.jj   2016-06-24 12:59:22.0 +0200
+++ gcc/tree.c  2016-07-04 11:17:48.425814493 +0200
@@ -5009,7 +5009,7 @@ attribute_value_equal (const_tree attr1,
   && TREE_CODE (TREE_VALUE (attr2)) == TREE_LIST)
 {
   /* Handle attribute format.  */
-  if (is_attribute_p ("format", TREE_PURPOSE (attr1)))
+  if (is_attribute_p ("format", get_attribute_name (attr1)))
{
  attr1 = TREE_VALUE (attr1);
  attr2 = TREE_VALUE (attr2);
--- gcc/testsuite/g++.dg/cpp0x/pr71739.C.jj 2016-07-04 11:19:16.629676012 +0200
+++ gcc/testsuite/g++.dg/cpp0x/pr71739.C    2016-07-04 11:18:56.0 +0200
@@ -0,0 +1,5 @@
+// PR c++/71739
+// { dg-do compile { target c++11 } }
+
+template  struct alignas(N) A;
+template  struct alignas(N) A {};

Jakub


Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread H.J. Lu
On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano  wrote:
> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
> I experiment a lot with gcc and lld so it would be nice if
> -fuse-ld=lld is supported (considering the linker is now mature enough
> to link large C/C++ applications).
>
> Also, IMHO, -fuse-ld should be a generic facility which accept other
> linkers (as long as they follow the convention ld.), and should
> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
> Probably outside of the scope of this patch, but I thought worth
> mentioning.
>
> Thanks,
>
> --
> Davide
>
> From 323c23d79c91d7dcee2f29b9ced8c1c00703d346 Mon Sep 17 00:00:00 2001
> From: Davide Italiano 
> Date: Thu, 23 Jun 2016 20:51:53 -0700
> Subject: [PATCH] Driver: Add support for -fuse-ld=lld.
>
> * collect2.c  (main): Support -fuse-ld=lld.
>
> * common.opt: Add fuse-ld=lld
>
> * doc/invoke.texi:  Document -fuse-ld=lld
>
> * opts.c: Ignore -fuse-ld=lld

Remove blank line between them.

> ---
>  gcc/collect2.c  | 11 ---
>  gcc/common.opt  |  4 
>  gcc/doc/invoke.texi |  4 
>  gcc/opts.c  |  1 +
>  4 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/collect2.c b/gcc/collect2.c
> index bffac80..6a8387c 100644
> --- a/gcc/collect2.c
> +++ b/gcc/collect2.c
> @@ -831,6 +831,7 @@ main (int argc, char **argv)
>USE_PLUGIN_LD,
>USE_GOLD_LD,
>USE_BFD_LD,
> +  USE_LLD_LD,
>USE_LD_MAX
>  } selected_linker = USE_DEFAULT_LD;
>static const char *const ld_suffixes[USE_LD_MAX] =
> @@ -838,7 +839,8 @@ main (int argc, char **argv)
>"ld",
>PLUGIN_LD_SUFFIX,
>"ld.gold",
> -  "ld.bfd"
> +  "ld.bfd",
> +  "ld.lld"
>  };
>static const char *const real_ld_suffix = "real-ld";
>static const char *const collect_ld_suffix = "collect-ld";
> @@ -1004,6 +1006,8 @@ main (int argc, char **argv)
>selected_linker = USE_BFD_LD;
>  else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
>selected_linker = USE_GOLD_LD;
> +  else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
> +selected_linker = USE_LLD_LD;
>
>  #ifdef COLLECT_EXPORT_LIST
>  /* These flags are position independent, although their order
> @@ -1093,7 +1097,8 @@ main (int argc, char **argv)
>/* Maybe we know the right file to use (if not cross).  */
>ld_file_name = 0;
>  #ifdef DEFAULT_LINKER
> -  if (selected_linker == USE_BFD_LD || selected_linker == USE_GOLD_LD)
> +  if (selected_linker == USE_BFD_LD || selected_linker == USE_GOLD_LD ||
> +  selected_linker == USE_LLD_LD)

Please make each condition on a separate line starting with ||.
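
For reference, the layout being asked for looks roughly like the sketch
below. The toy_ enum and function are hypothetical stand-ins for the
collect2.c code; only the line-breaking style is the point.

```c
#include <assert.h>

/* Hypothetical mirror of the linker selection enum in the patch.  */
enum toy_linker
{
  TOY_USE_DEFAULT_LD,
  TOY_USE_PLUGIN_LD,
  TOY_USE_GOLD_LD,
  TOY_USE_BFD_LD,
  TOY_USE_LLD_LD
};

/* Nonzero if a specific linker was requested with -fuse-ld=.  Each
   condition sits on its own line and continuation lines start with
   the operator.  */
static int
toy_explicit_linker_p (enum toy_linker selected)
{
  assert (selected >= TOY_USE_DEFAULT_LD && selected <= TOY_USE_LLD_LD);
  return (selected == TOY_USE_BFD_LD
          || selected == TOY_USE_GOLD_LD
          || selected == TOY_USE_LLD_LD);
}
```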

>  {
>char *linker_name;
>  # ifdef HOST_EXECUTABLE_SUFFIX
> @@ -1307,7 +1312,7 @@ main (int argc, char **argv)
>else if (!use_collect_ld
> && strncmp (arg, "-fuse-ld=", 9) == 0)
>  {
> -  /* Do not pass -fuse-ld={bfd|gold} to the linker. */
> +  /* Do not pass -fuse-ld={bfd|gold|lld} to the linker. */
>ld1--;
>ld2--;
>  }
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 5d90385..2a95a1f 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2536,6 +2536,10 @@ fuse-ld=gold
>  Common Driver Negative(fuse-ld=bfd)
>  Use the gold linker instead of the default linker.
>
> +fuse-ld=lld
> +Common Driver Negative(fuse-ld=lld)
> +Use the lld LLVM linker instead of the default linker.
> +

This is wrong.  It should be

fuse-ld=bfd
Common Driver Negative(fuse-ld=gold)
Use the bfd linker instead of the default linker.

fuse-ld=gold
Common Driver Negative(fuse-ld=lld)
Use the gold linker instead of the default linker.

fuse-ld=lld
Common Driver Negative(fuse-ld=bfd)
Use the lld LLVM linker instead of the default linker.
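
The reason for the ordering above is that the Negative properties of a group of mutually exclusive options are expected to form one circular chain. A rough way to check that property is sketched below; negative_map and forms_single_cycle are hypothetical names used only for this illustration and are not part of GCC's option machinery.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model: each option name maps to the option named in its
// Negative() property.
using negative_map = std::map<std::string, std::string>;

// Return true if following the Negative() links from START visits every
// option in NEG exactly once and then arrives back at START.
static bool
forms_single_cycle (const negative_map &neg, const std::string &start)
{
  std::set<std::string> seen;
  std::string cur = start;
  // Walk the chain until an option repeats.
  while (seen.insert (cur).second)
    {
      auto it = neg.find (cur);
      if (it == neg.end ())
        return false;   // Link points at an option outside the group.
      cur = it->second;
    }
  return cur == start && seen.size () == neg.size ();
}
```

With the chain bfd -> gold -> lld -> bfd the check succeeds. With the entries as posted in the patch (lld pointing at itself, and assuming the fuse-ld=bfd entry, which the diff does not show, still names gold) the walk from bfd never reaches lld, so the check fails.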


>  fuse-linker-plugin
>  Common Undocumented Var(flag_use_linker_plugin)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 2c87c53..4b8acff 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -10651,6 +10651,10 @@ Use the @command{bfd} linker instead of the
> default linker.
>  @opindex fuse-ld=gold
>  Use the @command{gold} linker instead of the default linker.
>
> +@item -fuse-ld=lld
> +@opindex fuse-ld=lld
> +Use the LLVM @command{lld} linker instead of the default linker.
> +
>  @cindex Libraries
>  @item -l@var{library}
>  @itemx -l @var{library}
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 7406210..f2c86f7 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -2178,6 +2178,7 @@ common_handle_option (struct gcc_options *opts,
>
>  case OPT_fuse_ld_bfd:
>  case OPT_fuse_ld_gold:
> +case OPT_fuse_ld_lld:
>  case OPT_fuse_linker_plugin:
>/* No-op. Used by the driver and passed to us because it starts with 
> f.*/
>break;
> --
> 2.5.5



-- 
H.J.


Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread H.J. Lu
On Sun, Jul 3, 2016 at 9:38 PM, Davide Italiano  wrote:
> On Thu, Jun 23, 2016 at 9:11 PM, Davide Italiano  
> wrote:
>> + HJ who wrote the code for the option originally.
>>
>> On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano  
>> wrote:
>>> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
>>> I experiment a lot with gcc and lld so it would be nice if
>>> -fuse-ld=lld is supported (considering the linker is now mature enough
>>> to link large C/C++ applications).
>>>
>>> Also, IMHO, -fuse-ld should be a generic facility which accept other
>>> linkers (as long as they follow the convention ld.), and should
>>> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
>>> Probably outside of the scope of this patch, but I thought worth
>>> mentioning.
>>>
>>> Thanks,
>>>
>
> Hi, can anybody take a look?

lld isn't compatible with GCC:

https://llvm.org/bugs/show_bug.cgi?id=28414


> Thanks,
>
> --
> Davide
>
>>> --
>>> Davide
>>>

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-07-04 Thread Jakub Jelinek
On Mon, Jul 04, 2016 at 10:23:06AM -0600, Martin Sebor wrote:
> >>1) Making use of -Wformat machinery in c-family/c-format.c.  This
> >>seemed preferable to duplicating some of the same code elsewhere
> >>(I initially started implementing it in expand_builtin in
> >>builtins.c).  It makes the implementation readily extensible
> >>to all the same formats as those already handled for -Wformat.
> >>One drawback is that unlike in expand_builtin, calls to these
> >>functions cannot readily be folded.  Another drawback pointed
> >
> >folded?  You mean this -W option changes code generation?
> 
> No, it doesn't.  What I meant is that the same code, when added
> in builtins.c instead, could readily be extended to fold into
> strings expressions like
> 
>   sprintf (buf, "%i", 123);

I've commented in some PR a few years ago that I'm not convinced we want to
do it, or at least not without careful consideration; consider .rodata
size.  Say the user has, in 1000 different places,
sprintf (buf, "foobarbaz %i", NNN); for various values of NNN.  Then such an
"optimization" would replace a single string literal of length 13 bytes
with 1000 string literals of 12-20 bytes.
Consider a larger string literal, with %s and long additions, and it might not
be a win even for 2 occurrences.
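
The size argument can be made concrete with a small sketch. It assumes every call site passes a distinct constant and ignores any merging of identical or overlapping strings that the toolchain might perform; folded_literal_size and total_folded_size are hypothetical helpers written for this illustration only.

```cpp
#include <cstddef>
#include <cstdio>

// Bytes needed to store the result of formatting "foobarbaz %i" with
// VALUE as a string literal, including the terminating NUL.
static std::size_t
folded_literal_size (int value)
{
  // With a null buffer and size 0, snprintf returns the number of
  // characters that would be written, excluding the terminating NUL.
  int len = std::snprintf (nullptr, 0, "foobarbaz %i", value);
  return len < 0 ? 0 : static_cast<std::size_t> (len) + 1;
}

// Total bytes if each of COUNT call sites, using the values
// 0 .. COUNT-1, got its own folded literal.
static std::size_t
total_folded_size (int count)
{
  std::size_t total = 0;
  for (int i = 0; i < count; ++i)
    total += folded_literal_size (i);
  return total;
}
```

Under those assumptions the unfolded code needs one 13-byte format string (sizeof "foobarbaz %i"), while total_folded_size (1000) adds up one literal per call site.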

Jakub


Re: [ARM][testsuite] neon-testgen.ml removal

2016-07-04 Thread Kyrill Tkachov

Hi Christophe,

On 22/06/16 16:52, Christophe Lyon wrote:
> Hi,
>
> This is a new attempt at removing neon-testgen.ml and generated files.
>
> Compared to my previous version several months ago:
> - I have recently added testcases to make sure we do not lose coverage
> as described in
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02922.html
> - I now also remove neon.ml as requested by Kyrylo in
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01664.html, and moved
> the remaining hand-written tests up to gcc.target/arm.
>
> Doing this, I had to slightly update vst1Q_laneu64-1.c because it's
> now compiled with more pedantic flags and there was a signed/unsigned
> char buffer pointer mismatch.
>
> Sorry, I had to compress the patch, otherwise it's too large and rejected
> by the list server.
>
> OK?
>
> Christophe
>
> [ARM] neon-testgen.ml, neon.ml and generated files removal.
>
> gcc/
>
> 2016-06-17  Christophe Lyon
>
> 	* config/arm/neon-testgen.ml: Delete.
> 	* config/arm/neon.ml: Delete.
>
> gcc/testsuite/
>
> 2016-06-17  Christophe Lyon
>
> 	* gcc.target/arm/neon/polytypes.c: Move to ...
> 	* gcc.target/arm/polytypes.c: ... here.
> 	* gcc.target/arm/neon/pr51534.c: Move to ...
> 	* gcc.target/arm/pr51534.c: ... here.
> 	* gcc.target/arm/neon/vect-vcvt.c: Move to ...
> 	* gcc.target/arm/vect-vcvt.c: ... here.
> 	* gcc.target/arm/neon/vect-vcvtq.c: Move to ...
> 	* gcc.target/arm/vect-vcvtq.c: ... here.
> 	* gcc.target/arm/neon/vfp-shift-a2t2.c: Move to ...
> 	* gcc.target/arm/vfp-shift-a2t2.c: ... here.
> 	* gcc.target/arm/neon/vst1Q_laneu64-1.c: Move to ...
> 	* gcc.target/arm/vst1Q_laneu64-1.c: ... here. Fix foo() prototype.
> 	* gcc.target/arm/neon/neon.exp: Delete.
> 	* gcc.target/arm/neon/*.c: Delete.

I think this should be "* gcc.target/arm/neon/: Delete." to make it clear
that the directory is being removed.

This is ok for trunk.
Thanks for dealing with this!

Kyrill




Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-07-04 Thread Martin Sebor

On 07/04/2016 04:59 AM, Richard Biener wrote:
> On Fri, 1 Jul 2016, Martin Sebor wrote:
>
>> The attached patch enhances compile-time checking for buffer overflow
>> and output truncation in non-trivial calls to the sprintf family of
>> functions under a new option -Wformat-length=[12].  This initial
>> patch handles printf directives with string, integer, and simple
>> floating arguments but eventually I'd like to extend it to all other
>> functions and directives for which it makes sense.
>>
>> I made some choices in the implementation that resulted in trade-offs
>> in the quality of the diagnostics.  I would be grateful for comments
>> and suggestions how to improve them.  Besides the list I include
>> Jakub who already gave me some feedback (thanks), Joseph who as
>> I understand has deep knowledge of the c-format.c code, and Richard
>> for his input on the LTO concern below.
>>
>> 1) Making use of -Wformat machinery in c-family/c-format.c.  This
>> seemed preferable to duplicating some of the same code elsewhere
>> (I initially started implementing it in expand_builtin in
>> builtins.c).  It makes the implementation readily extensible
>> to all the same formats as those already handled for -Wformat.
>> One drawback is that unlike in expand_builtin, calls to these
>> functions cannot readily be folded.  Another drawback pointed
>
> folded?  You mean this -W option changes code generation?

No, it doesn't.  What I meant is that the same code, when added
in builtins.c instead, could readily be extended to fold into
strings expressions like

  sprintf (buf, "%i", 123);

>> out by Jakub is that since the code is only available in the
>> C and C++ compilers, it apparently may not be available with
>> an LTO compiler (I don't completely understand this problem
>> but I mention it in the interest of full disclosure). In light
>> of the dependency in (2) below, I don't see a way to avoid it
>> (moving c-format.c to the middle end was suggested but seemed
>> like too much of a change to me).
>
> Yes, lto1 is not linked with C_COMMON_OBJS (that could be changed
> of course at the expense of dragging in some dead code).  Moving
> all the format stuff to the middle-end (or separated better so
> the overhead in lto1 is lower) would be possible as well.
>
> That said, a langhook as you add it highlights the issue with LTO.


Thanks for the clarification.  IIUC, there are at least three
possibilities for how to proceed: leave it as is (no checking
with LTO), link LTO with C_COMMON_OBJS, or move the c-format.c
code into the middle end.  Do you have a preference for one of
these?  Or is there another solution that I missed?

FWIW, I would expect a good number of other warnings to benefit
from optimization and having a general solution for this problem
to be helpful.  I also suspect this isn't the first time this
issue has come up.  I'm wondering what solutions have already
been considered and with what pros and cons (naively, I would
think that factoring the relevant code out of cc1 into a shared
library that lto1 could load should work).

Martin


Re: [AArch64] Renaming ARMv8.1 to ARMv8.1-A in comments and documentations

2016-07-04 Thread James Greenhalgh
On Mon, Jul 04, 2016 at 05:00:18PM +0100, Jiong Wang wrote:
> As requested in
> 
>   https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01936.html
> 
> This patch replaces all uses of ARMv8.1 with ARMv8.1-A.
> 
> OK for trunk?

Thanks for the follow-up.

OK.

Thanks,
James


> 
> 2016-07-04  Jiong Wang  
> 
> gcc/
>   * config/aarch64/aarch64.h: Rename "ARMv8.1" to "ARMv8.1-A".
>   * config/aarch64/aarch64_neon.h: Likewise.
>   * config/aarch64/arm_neon.h: Likewise.
>   * config/aarch64/atomics.md: Likewise.
>   * config/aarch64/aarch64-simd-builtins.def: Likewise.
>   * doc/invoke.texi: Likewise.
> 



Re: [Driver] Add support for -fuse-ld=lld

2016-07-04 Thread H.J. Lu
On Sun, Jul 3, 2016 at 9:38 PM, Davide Italiano  wrote:
> On Thu, Jun 23, 2016 at 9:11 PM, Davide Italiano  
> wrote:
>> + HJ who wrote the code for the option originally.
>>
>> On Thu, Jun 23, 2016 at 9:01 PM, Davide Italiano  
>> wrote:
>>> LLVM currently ships with a new ELF linker http://lld.llvm.org/.
>>> I experiment a lot with gcc and lld so it would be nice if
>>> -fuse-ld=lld is supported (considering the linker is now mature enough
>>> to link large C/C++ applications).
>>>
>>> Also, IMHO, -fuse-ld should be a generic facility which accept other
>>> linkers (as long as they follow the convention ld.), and should
>>> also support absolute path, e.g. -fuse-ld=/usr/local/bin/ld.mylinker.
>>> Probably outside of the scope of this patch, but I thought worth
>>> mentioning.
>>>
>>> Thanks,
>>>
>
> Hi, can anybody take a look?
>
> Thanks,

lld won't build on Fedora 24/x86-64 with GCC 6:

[ 39%] Building CXX object
tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o
/export/gnu/import/git/llvm/tools/lld/ELF/OutputSections.cpp: In
member function ‘void
lld::elf::GnuHashTableSection::addSymbols(std::vector >&)’:
/export/gnu/import/git/llvm/tools/lld/ELF/OutputSections.cpp:585:8:
error: inconsistent deduction for ‘auto’: ‘auto’ and then
‘__gnu_cxx::__normal_iterator*, std::vector > >’
tools/lld/ELF/CMakeFiles/lldELF.dir/build.make:302: recipe for target
'tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o' failed
gmake[4]: *** [tools/lld/ELF/CMakeFiles/lldELF.dir/OutputSections.cpp.o] Error 1

Can you fix it?

H.J.
> --
> Davide
>
>>> --
>>> Davide
>>>

Re: [testsuite] asan/clone-test-1.c: Handle clone() failure

2016-07-04 Thread Jakub Jelinek
On Mon, Jul 04, 2016 at 05:44:17PM +0200, Christophe Lyon wrote:
> Hello,
> 
> This small patch handles the case where clone() would fail when
> executing asan/clone-test-1.c.

I wonder if the syscall failures shouldn't result in exit of 0 rather than 1
(ideally UNSUPPORTED), because they don't mean the test failed in what it
was testing.
But as the other spots in the test already return 1;, I guess this is fine.

> 2016-07-04  Christophe Lyon  
> 
>   * c-c++-common/asan/clone-test-1.c (main): Handle clone() failure.

> diff --git a/gcc/testsuite/c-c++-common/asan/clone-test-1.c 
> b/gcc/testsuite/c-c++-common/asan/clone-test-1.c
> index eeca09f..c58c376 100644
> --- a/gcc/testsuite/c-c++-common/asan/clone-test-1.c
> +++ b/gcc/testsuite/c-c++-common/asan/clone-test-1.c
> @@ -29,6 +29,10 @@ int main(int argc, char **argv) {
>char *sp = child_stack + kStackSize;  /* Stack grows down. */
>printf("Parent: %p\n", sp);
>pid_t clone_pid = clone(Child, sp, CLONE_FILES | CLONE_VM, NULL, 0, 0, 0);
> +  if (clone_pid == -1) {
> +perror("clone");
> +return 1;
> +  }
>int status;
>pid_t wait_result = waitpid(clone_pid, &status, __WCLONE);
>if (wait_result < 0) {


Jakub


[AArch64] Renaming ARMv8.1 to ARMv8.1-A in comments and documentations

2016-07-04 Thread Jiong Wang

As requested in

  https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01936.html

This patch replaces all uses of ARMv8.1 with ARMv8.1-A.

OK for trunk?

2016-07-04  Jiong Wang  

gcc/
  * config/aarch64/aarch64.h: Rename "ARMv8.1" to "ARMv8.1-A".
  * config/aarch64/aarch64_neon.h: Likewise.
  * config/aarch64/arm_neon.h: Likewise.
  * config/aarch64/atomics.md: Likewise.
  * config/aarch64/aarch64-simd-builtins.def: Likewise.
  * doc/invoke.texi: Likewise.

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 02d465b8a0848d2f1676015462478a83e97e6b9b..3e4740c460a335d8a4d5ce8b19fc311aa14a47d4 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -432,7 +432,7 @@
   VAR1 (TERNOP, qtbx4, 0, v8qi)
   VAR1 (TERNOP, qtbx4, 0, v16qi)
 
-  /* Builtins for ARMv8.1 Adv.SIMD instructions.  */
+  /* Builtins for ARMv8.1-A Adv.SIMD instructions.  */
 
   /* Implemented by aarch64_sqrdmlh.  */
   BUILTIN_VSDQ_HSI (TERNOP, sqrdmlah, 0)
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 59805a9f71abf0639cd6053b88304fbb8fc9e296..19159802d6bbad16e11a23b8b44507507fa4cce8 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -132,9 +132,9 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_FL_FP (1 << 1)	/* Has FP.  */
 #define AARCH64_FL_CRYPTO (1 << 2)	/* Has crypto.  */
 #define AARCH64_FL_CRC(1 << 3)	/* Has CRC.  */
-/* ARMv8.1 architecture extensions.  */
+/* ARMv8.1-A architecture extensions.  */
 #define AARCH64_FL_LSE	  (1 << 4)  /* Has Large System Extensions.  */
-#define AARCH64_FL_V8_1	  (1 << 5)  /* Has ARMv8.1 extensions.  */
+#define AARCH64_FL_V8_1	  (1 << 5)  /* Has ARMv8.1-A extensions.  */
 /* ARMv8.2-A architecture extensions.  */
 #define AARCH64_FL_V8_2	  (1 << 8)  /* Has ARMv8.2-A features.  */
 #define AARCH64_FL_F16	  (1 << 9)  /* Has ARMv8.2-A FP16 extensions.  */
@@ -204,7 +204,7 @@ extern unsigned aarch64_architecture_version;
   ((aarch64_fix_a53_err843419 == 2)	\
   ? TARGET_FIX_ERR_A53_843419_DEFAULT : aarch64_fix_a53_err843419)
 
-/* ARMv8.1 Adv.SIMD support.  */
+/* ARMv8.1-A Adv.SIMD support.  */
 #define TARGET_SIMD_RDMA (TARGET_SIMD && AARCH64_ISA_RDMA)
 
 /* Standard register usage.  */
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index ebf6fa2b63ee6ec1f73e62a0c957b9633e22d2a6..475e200a683436af5026edafa568f16126f4340a 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -10516,7 +10516,7 @@ vbslq_u64 (uint64x2_t __a, uint64x2_t __b, uint64x2_t __c)
   return __builtin_aarch64_simd_bslv2di_ (__a, __b, __c);
 }
 
-/* ARMv8.1 instrinsics.  */
+/* ARMv8.1-A instrinsics.  */
 #pragma GCC push_options
 #pragma GCC target ("arch=armv8.1-a")
 
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 3b65b4b238fd130c17e5c64edfc84790e38fbe51..d84339db2a838f51182f9896d9f0cd446924ef65 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -583,7 +583,7 @@
   }
 )
 
-;; ARMv8.1 LSE instructions.
+;; ARMv8.1-A LSE instructions.
 
 ;; Atomic swap with memory.
 (define_insn "aarch64_atomic_swp"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7e23e1c56ca1711d734ab3f10eaf47495cdaf335..fade21caced16132bb786bcc2e21e7094fbb50d2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13107,7 +13107,7 @@ The value @samp{armv8.2-a} implies @samp{armv8.1-a} and enables compiler
 support for the ARMv8.2-A architecture extensions.
 
 The value @samp{armv8.1-a} implies @samp{armv8-a} and enables compiler
-support for the ARMv8.1 architecture extension.  In particular, it
+support for the ARMv8.1-A architecture extension.  In particular, it
 enables the @samp{+crc} and @samp{+lse} features.
 
 The value @samp{native} is available on native AArch64 GNU/Linux and


[testsuite] asan/clone-test-1.c: Handle clone() failure

2016-07-04 Thread Christophe Lyon
Hello,

This small patch handles the case where clone() would fail when
executing asan/clone-test-1.c.

OK?

Christophe
2016-07-04  Christophe Lyon  

* c-c++-common/asan/clone-test-1.c (main): Handle clone() failure.
diff --git a/gcc/testsuite/c-c++-common/asan/clone-test-1.c 
b/gcc/testsuite/c-c++-common/asan/clone-test-1.c
index eeca09f..c58c376 100644
--- a/gcc/testsuite/c-c++-common/asan/clone-test-1.c
+++ b/gcc/testsuite/c-c++-common/asan/clone-test-1.c
@@ -29,6 +29,10 @@ int main(int argc, char **argv) {
   char *sp = child_stack + kStackSize;  /* Stack grows down. */
   printf("Parent: %p\n", sp);
   pid_t clone_pid = clone(Child, sp, CLONE_FILES | CLONE_VM, NULL, 0, 0, 0);
+  if (clone_pid == -1) {
+perror("clone");
+return 1;
+  }
   int status;
   pid_t wait_result = waitpid(clone_pid, &status, __WCLONE);
   if (wait_result < 0) {


Re: Improve insert/emplace robustness to self insertion

2016-07-04 Thread Jonathan Wakely

On 02/07/16 08:37 +0200, François Dumont wrote:
> I haven't considered in this patch your remark about using the allocator
> to build the instance, so don't hesitate to commit what you want and I
> will rebase.


Here's what I've committed to trunk.

I'm getting nervous about the smart insertion trick to avoid making a
copy, I have a devious testcase in mind which will break with that
change. I'll share the testcase later today.
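
For anyone following along, the aliasing case that the tests below exercise can be sketched as follows. The risk is that an implementation which shifts or reallocates the elements before reading the argument would copy from a moved-from or dangling object. The function names here are hypothetical and chosen only for this illustration.

```cpp
#include <vector>

// Insert a copy of the first element at the front.  The argument is a
// reference into the vector's own storage, so the implementation has to
// preserve its value before moving any element.
// Precondition: VV is not empty.
static void
insert_front_alias (std::vector<std::vector<int>> &vv)
{
  vv.insert (vv.begin (), vv[0]);
}

// The same aliasing pattern through emplace.
// Precondition: VV is not empty.
static void
emplace_front_alias (std::vector<std::vector<int>> &vv)
{
  vv.emplace (vv.begin (), vv[0]);
}
```

Both calls are expected to leave a copy of the old first element at index 0 and the old first element itself at index 1, whether or not the call reallocates.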


commit 6aa6fa55a89c34c51366ed432bd942e09f691a0b
Author: redi 
Date:   Mon Jul 4 14:52:54 2016 +

Add tests for inserting aliased objects into std::vector

2016-07-04  François Dumont  

	* testsuite/23_containers/vector/modifiers/emplace/self_emplace.cc:
	New test.
	* testsuite/23_containers/vector/modifiers/insert/self_insert.cc: New
	test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@237986 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/testsuite/23_containers/vector/modifiers/emplace/self_emplace.cc b/libstdc++-v3/testsuite/23_containers/vector/modifiers/emplace/self_emplace.cc
new file mode 100644
index 000..d452b5b
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/vector/modifiers/emplace/self_emplace.cc
@@ -0,0 +1,144 @@
+// { dg-options "-std=gnu++11" }
+
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <vector>
+#include "testsuite_hooks.h"
+
+bool test __attribute__((unused)) = true;
+
+void
+test01()
+{
+  std::vector<std::vector<int>> vv =
+{
+  { 2, 3 },
+  { 4, 5 },
+  { 0, 1 }
+};
+
+  // Make sure emplace will imply reallocation.
+  VERIFY( vv.capacity() == 3 );
+
+  vv.emplace(vv.begin(), vv[0]);
+
+  VERIFY( vv.size() == 4 );
+  VERIFY( vv[0].size() == 2 );
+  VERIFY( vv[0][0] == 2 );
+  VERIFY( vv[0][1] == 3 );
+}
+
+void
+test02()
+{
+  std::vector<std::vector<int>> vv =
+{
+  { 2, 3 },
+  { 4, 5 },
+  { 0, 1 }
+};
+
+  // Make sure emplace won't reallocate.
+  vv.reserve(4);
+  vv.emplace(vv.begin(), vv[0]);
+
+  VERIFY( vv.size() == 4 );
+  VERIFY( vv[0].size() == 2 );
+  VERIFY( vv[0][0] == 2 );
+  VERIFY( vv[0][1] == 3 );
+}
+
+struct A
+{
+  A(int i) : _i(i)
+  { }
+
+  A(const A& other) : _i(other._i)
+  {
+VERIFY( other._i >= 0 );
+  }
+
+  A(A&& other) : _i(other._i)
+  {
+VERIFY( other._i >= 0 );
+
+other._i = -1;
+  }
+
+  A(std::vector<A>::iterator it) : _i(it->_i)
+  {
+VERIFY( it->_i >= 0 );
+  }
+
+  A& operator=(const A&) = default;
+  A& operator=(A&& other)
+  {
+VERIFY(other._i >= 0 );
+
+_i = other._i;
+other._i = -1;
+return *this;
+  }
+
+  int _i;
+};
+
+void
+test03()
+{
+  std::vector<A> va =
+{
+  { A(1) },
+  { A(2) },
+  { A(3) }
+};
+
+  // Make sure emplace will imply reallocation.
+  VERIFY( va.capacity() == 3 );
+
+  va.emplace(va.begin(), va.begin());
+
+  VERIFY( va.size() == 4 );
+  VERIFY( va[0]._i == 1 );
+}
+
+void
+test04()
+{
+  std::vector<A> va =
+{
+  { A(1) },
+  { A(2) },
+  { A(3) }
+};
+
+  // Make sure emplace won't reallocate.
+  va.reserve(4);
+  va.emplace(va.begin(), va.begin());
+
+  VERIFY( va.size() == 4 );
+  VERIFY( va[0]._i == 1 );
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
+  test04();
+}
diff --git a/libstdc++-v3/testsuite/23_containers/vector/modifiers/insert/self_insert.cc b/libstdc++-v3/testsuite/23_containers/vector/modifiers/insert/self_insert.cc
new file mode 100644
index 000..9944cbb
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/vector/modifiers/insert/self_insert.cc
@@ -0,0 +1,70 @@
+// { dg-options "-std=gnu++11" }
+
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have 

Re: [PATCH v2] S/390: Add support for z13 instructions lochi and locghi.

2016-07-04 Thread Andreas Krebbel
> gcc/ChangeLog
> 
>   * config/s390/s390.md: Add "z13" cpu_facility.
>   ("*movcc"): Add support for z13 instructions lochi and locghi.
>   * config/s390/predicates.md ("loc_operand"): New predicate for "load on
>   condition" type instructions.
> gcc/testsuite/ChangeLog
> 
>   * gcc.target/s390/vector/vec-scalar-cmp-1.c: Expect lochi instead of
>   locr.
>   * gcc.target/s390/loc-1.c: New test.

Applied.  Thanks!

-Andreas-



Re: [PATCH 1/2][v3] Drop excess size used for run time allocated stack variables.

2016-07-04 Thread Andreas Krebbel
> gcc/ChangeLog
> 
>   * explow.c (allocate_dynamic_stack_space): Simplify knowing that
>   MUST_ALIGN was always true and extra_align is always BITS_PER_UNIT.

Applied. Thanks!

-Andreas-



Re: [PATCH 16/17][ARM] Add tests for VFP FP16 ACLE instrinsics.

2016-07-04 Thread Matthew Wahab

On 18/05/16 11:58, Matthew Wahab wrote:
> On 18/05/16 02:06, Joseph Myers wrote:
>> On Tue, 17 May 2016, Matthew Wahab wrote:
>>
>>> In some tests, there are unavoidable differences in precision when
>>> calculating the actual and the expected results of an FP16 operation. A
>>> new support function CHECK_FP_BIAS is used so that these tests can check
>>> for an acceptable margin of error. In these tests, the tolerance is
>>> given as the absolute integer difference between the bitvectors of the
>>> expected and the actual results.
>>
>> As far as I can see, CHECK_FP_BIAS is only used in the following patch,
>> but there is another bias test in vsqrth_f16_1.c in this patch.
>
> This is my mistake, the CHECK_FP_BIAS is used for the NEON tests and should
>  have gone into that patch. The VFP test can do a simpler check so doesn't
> need the macro.
>
>> Could you clarify where the "unavoidable differences in precision" come
>> from? Are the results of some of the new instructions not fully specified,
>> only specified within a given precision?  (As far as I can tell the
>> existing v8 instructions for reciprocal and reciprocal square root
>> estimates do have fully defined results, despite being loosely described
>> as estimates.)
>
> The expected results in the new tests are represented as expressions whose
> value is expected to be calculated at compile-time. This makes the tests
> more readable but differences in the precision between the compiler and
> the HW calculations mean that for vrecpe_f16, vrecps_f16, vrsqrts_f16 and
> vsqrth_f16_1.c the expected and actual results are different.
>
> On reflection, it may be better to remove the CHECK_FP_BIAS macro and, for
> the tests that needed it, to drop the compiler calculation and just use the
>  expected hexadecimal value.
>
> Other tests depending on compiler-time calculations involve relatively
> simple arithmetic operations and it's not clear if they are susceptible to
> the same rounding errors. I have limited knowledge in FP arithmetic though
> so I'll look into this.

The scalar tests added in this patch and the vector tests added in the
next patch have been reworked to use the exact values for the expected
results rather than compile-time expressions. The CHECK_FP_BIAS macro is
not used and is removed from this patch.

The intention with these tests and with the vector tests is to check
that the compiler emits code that produces the same results as the
instruction regardless of any optimizations that it may apply. The
expected results for the tests were produced using inline assembler
taking the same inputs as the intrinsics being tested.
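
To illustrate the two checking styles discussed above (this is not code
from the patch), here is a minimal C sketch that assumes results are
inspected as raw 16-bit patterns. The helper names fp16_bits_equal and
fp16_bits_within are hypothetical; the second one follows the description
of CHECK_FP_BIAS, where the tolerance is the absolute integer difference
between the bitvectors.

```c
#include <assert.h>
#include <stdint.h>

/* Exact check: the result must match the expected bit pattern.  */
static int
fp16_bits_equal (uint16_t actual, uint16_t expected)
{
  return actual == expected;
}

/* Tolerance check in the style described for CHECK_FP_BIAS: the absolute
   integer difference between the two bit patterns must not exceed BIAS.  */
static int
fp16_bits_within (uint16_t actual, uint16_t expected, unsigned bias)
{
  unsigned diff = actual > expected ? actual - expected : expected - actual;
  return diff <= bias;
}

/* Small usage example.  0x3C00 is 1.0 in IEEE binary16 and 0x3C01 is the
   next representable value above it.  */
void
check_fp16_examples (void)
{
  assert (fp16_bits_equal (0x3C00, 0x3C00));
  assert (!fp16_bits_equal (0x3C01, 0x3C00));
  assert (fp16_bits_within (0x3C01, 0x3C00, 1));
  assert (!fp16_bits_within (0x3C03, 0x3C00, 2));
}
```

With exact expected values only the first helper is needed, which is why
the macro can be dropped.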

Other changes are to add and use some (limited) templates for scalar
operations and to add progress and error reporting, making the scalar
tests more consistent with those for the vector operations.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

testsuite/
2016-07-04  Jiong Wang  
Matthew Wahab  

* gcc.target/aarch64/advsimd-intrinsics/binary_scalar_op.inc: New.
* gcc.target/aarch64/advsimd-intrinsics/unary_scalar_op.inc: New.
* gcc.target/aarch64/advsimd-intrinsics/ternary_scalar_op.inc: New.
* gcc.target/aarch64/advsimd-intrinsics/vabsh_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vaddh_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtah_s32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtah_u32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_s32_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_f16_u32_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_s32_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_n_f16_u32_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_n_s32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_n_u32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_s32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvth_u32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtmh_s32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtmh_u32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtnh_s32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtnh_u32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtph_s32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vcvtph_u32_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vdivh_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vfmah_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vfmsh_f16_1.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vmaxnmh_f16_1.c: New.
* 

Re: [PATCH 15/17][ARM] Add tests for ARMv8.2-A FP16 support.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:48, Matthew Wahab wrote:
> Support for using the half-precision floating point operations added by
> the ARMv8.2-A FP16 extension is based on the macros and intrinsics added
> to the ACLE for the extension.
>
> This patch adds tests to check the compiler's treatment of the ACLE
> macros and the code generated for the new intrinsics. It does not
> include the executable tests for the
> gcc.target/aarch64/advsimd-intrinsics testsuite. Those are added later
> in the patch series.

Changes since the previous version are:

- Fix the vsqrte/vrsqrte spelling mistake.

- armv8_2-fp16-scalar-2.c: Set option -std=c11, needed to test that
  vaddh_f16 (vmulh_f16 (a, b), c) generates a VMLA. (Options enabled
  with the default -std=gnu11 mean that VFMA would be generated
  otherwise.)
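
For readers unfamiliar with the distinction, the sketch below shows why
the choice between a separately rounded multiply-add (as with VMLA) and a
fused one (as with VFMA) is observable. As I understand the GCC
documentation, ISO modes such as -std=c11 default to -ffp-contract=off
while GNU modes default to -ffp-contract=fast. The code uses plain float
rather than __fp16 so that it runs anywhere; the function names are
hypothetical, and fused_like only approximates a fused operation by
computing in double and rounding once.

```c
#include <assert.h>

/* Multiply, round to float, then add: two roundings, as with VMLA.  The
   volatile temporary forces the product to be rounded and keeps the
   compiler from contracting the expression.  */
static float
mul_then_add (float a, float b, float c)
{
  volatile float product = a * b;
  return product + c;
}

/* One rounding at the end, standing in for a fused multiply-add such as
   VFMA.  The product of two floats is exact in double, so only the
   final conversion rounds in this example.  */
static float
fused_like (float a, float b, float c)
{
  return (float) ((double) a * (double) b + (double) c);
}

/* a * a = 1 + 2^-22 + 2^-46 exactly; the last term is lost when the
   product is rounded to float but survives in the fused form.  */
void
check_contraction_example (void)
{
  float a = 1.0f + 0x1p-23f;
  float c = -(1.0f + 0x1p-22f);
  assert (mul_then_add (a, a, c) == 0.0f);
  assert (fused_like (a, a, c) == 0x1p-46f);
}
```

A test that scans for a specific instruction therefore has to pin the
contraction mode, which is what setting -std=c11 achieves here.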

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

testsuite/
2016-07-04  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-neon-1.c: New.
* gcc.target/arm/armv8_2-fp16-scalar-1.c: New.
* gcc.target/arm/armv8_2-fp16-scalar-2.c: New.
* gcc.target/arm/attr-fp16-arith-1.c: Add a test of intrinsics
support.

From b8760efc9da23357dc2bccef36e8ba2fc2f7a856 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 13:38:02 +0100
Subject: [PATCH 15/17] [PATCH 15/17][ARM] Add tests for ARMv8.2-A FP16
 support.

testsuite/
2016-07-04  Matthew Wahab  

	* gcc.target/arm/armv8_2-fp16-neon-1.c: New.
	* gcc.target/arm/armv8_2-fp16-scalar-1.c: New.
	* gcc.target/arm/armv8_2-fp16-scalar-2.c: New.
	* gcc.target/arm/attr-fp16-arith-1.c: Add a test of intrinsics
	support.
---
 gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-1.c | 490 +
 .../gcc.target/arm/armv8_2-fp16-scalar-1.c | 203 +
 .../gcc.target/arm/armv8_2-fp16-scalar-2.c |  71 +++
 gcc/testsuite/gcc.target/arm/attr-fp16-arith-1.c   |  13 +
 4 files changed, 777 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/armv8_2-fp16-scalar-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/armv8_2-fp16-scalar-2.c

diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-1.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-1.c
new file mode 100644
index 000..968efae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-1.c
@@ -0,0 +1,490 @@
+/* { dg-do compile }  */
+/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok }  */
+/* { dg-options "-O2" }  */
+/* { dg-add-options arm_v8_2a_fp16_neon }  */
+
+/* Test instructions generated for the FP16 vector intrinsics.  */
+
+#include <arm_neon.h>
+
+#define MSTRCAT(L, str)	L##str
+
+#define UNOP_TEST(insn)\
+  float16x4_t	\
+  MSTRCAT (test_##insn, _16x4) (float16x4_t a)	\
+  {		\
+return MSTRCAT (insn, _f16) (a);		\
+  }		\
+  float16x8_t	\
+  MSTRCAT (test_##insn, _16x8) (float16x8_t a)	\
+  {		\
+return MSTRCAT (insn, q_f16) (a);		\
+  }
+
+#define BINOP_TEST(insn)	\
+  float16x4_t			\
+  MSTRCAT (test_##insn, _16x4) (float16x4_t a, float16x4_t b)	\
+  {\
+return MSTRCAT (insn, _f16) (a, b);\
+  }\
+  float16x8_t			\
+  MSTRCAT (test_##insn, _16x8) (float16x8_t a, float16x8_t b)	\
+  {\
+return MSTRCAT (insn, q_f16) (a, b);			\
+  }
+
+#define BINOP_LANE_TEST(insn, I)	\
+  float16x4_t\
+  MSTRCAT (test_##insn##_lane, _16x4) (float16x4_t a, float16x4_t b)	\
+  {	\
+return MSTRCAT (insn, _lane_f16) (a, b, I);\
+  }	\
+  float16x8_t\
+  MSTRCAT (test_##insn##_lane, _16x8) (float16x8_t a, float16x4_t b)	\
+  {	\
+return MSTRCAT (insn, q_lane_f16) (a, b, I);			\
+  }
+
+#define BINOP_LANEQ_TEST(insn, I)	\
+  float16x4_t\
+  MSTRCAT (test_##insn##_laneq, _16x4) (float16x4_t a, float16x8_t b)	\
+  {	\
+return MSTRCAT (insn, _laneq_f16) (a, b, I);			\
+  }	\
+  float16x8_t\
+  MSTRCAT (test_##insn##_laneq, _16x8) (float16x8_t a, float16x8_t b)	\
+  {	\
+return MSTRCAT (insn, q_laneq_f16) (a, b, I);			\
+  }	\
+
+#define BINOP_N_TEST(insn)	\
+  float16x4_t			\
+  MSTRCAT (test_##insn##_n, _16x4) (float16x4_t a, float16_t b)	\
+  {\
+return MSTRCAT (insn, _n_f16) (a, b);			\
+  }\
+  float16x8_t			\
+  MSTRCAT (test_##insn##_n, _16x8) (float16x8_t a, float16_t b)	\
+  {\
+return MSTRCAT (insn, q_n_f16) (a, b);			\
+  }
+
+#define TERNOP_TEST(insn)		\
+  float16_t\
+  MSTRCAT (test_##insn, _16) (float16_t a, float16_t b, float16_t c)	\
+  {	\
+return MSTRCAT (insn, h_f16) (a, b, c);\
+  }	\
+  float16x4_t\
+  MSTRCAT (test_##insn, _16x4) (float16x4_t a, float16x4_t b,		\
+			   

Re: [PATCH 14/17][ARM] Add NEON FP16 instrinsics.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:46, Matthew Wahab wrote:
> The ARMv8.2-A architecture introduces an optional FP16 extension adding
> half-precision floating point data processing instructions to the
> existing Adv.SIMD (NEON) support. A future version of the ACLE will add
> support for these instructions and this patch implements that support.

Updated to fix the vsqrte/vrsqrte spelling mistake.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab  

* config/arm/arm_neon.h (vabd_f16): New.
(vabdq_f16): New.
(vabs_f16): New.
(vabsq_f16): New.
(vadd_f16): New.
(vaddq_f16): New.
(vcage_f16): New.
(vcageq_f16): New.
(vcagt_f16): New.
(vcagtq_f16): New.
(vcale_f16): New.
(vcaleq_f16): New.
(vcalt_f16): New.
(vcaltq_f16): New.
(vceq_f16): New.
(vceqq_f16): New.
(vceqz_f16): New.
(vceqzq_f16): New.
(vcge_f16): New.
(vcgeq_f16): New.
(vcgez_f16): New.
(vcgezq_f16): New.
(vcgt_f16): New.
(vcgtq_f16): New.
(vcgtz_f16): New.
(vcgtzq_f16): New.
(vcle_f16): New.
(vcleq_f16): New.
(vclez_f16): New.
(vclezq_f16): New.
(vclt_f16): New.
(vcltq_f16): New.
(vcltz_f16): New.
(vcltzq_f16): New.
(vcvt_f16_s16): New.
(vcvt_f16_u16): New.
(vcvt_s16_f16): New.
(vcvt_u16_f16): New.
(vcvtq_f16_s16): New.
(vcvtq_f16_u16): New.
(vcvtq_s16_f16): New.
(vcvtq_u16_f16): New.
(vcvta_s16_f16): New.
(vcvta_u16_f16): New.
(vcvtaq_s16_f16): New.
(vcvtaq_u16_f16): New.
(vcvtm_s16_f16): New.
(vcvtm_u16_f16): New.
(vcvtmq_s16_f16): New.
(vcvtmq_u16_f16): New.
(vcvtn_s16_f16): New.
(vcvtn_u16_f16): New.
(vcvtnq_s16_f16): New.
(vcvtnq_u16_f16): New.
(vcvtp_s16_f16): New.
(vcvtp_u16_f16): New.
(vcvtpq_s16_f16): New.
(vcvtpq_u16_f16): New.
(vcvt_n_f16_s16): New.
(vcvt_n_f16_u16): New.
(vcvtq_n_f16_s16): New.
(vcvtq_n_f16_u16): New.
(vcvt_n_s16_f16): New.
(vcvt_n_u16_f16): New.
(vcvtq_n_s16_f16): New.
(vcvtq_n_u16_f16): New.
(vfma_f16): New.
(vfmaq_f16): New.
(vfms_f16): New.
(vfmsq_f16): New.
(vmax_f16): New.
(vmaxq_f16): New.
(vmaxnm_f16): New.
(vmaxnmq_f16): New.
(vmin_f16): New.
(vminq_f16): New.
(vminnm_f16): New.
(vminnmq_f16): New.
(vmul_f16): New.
(vmul_lane_f16): New.
(vmul_n_f16): New.
(vmulq_f16): New.
(vmulq_lane_f16): New.
(vmulq_n_f16): New.
(vneg_f16): New.
(vnegq_f16): New.
(vpadd_f16): New.
(vpmax_f16): New.
(vpmin_f16): New.
(vrecpe_f16): New.
(vrecpeq_f16): New.
(vrnd_f16): New.
(vrndq_f16): New.
(vrnda_f16): New.
(vrndaq_f16): New.
(vrndm_f16): New.
(vrndmq_f16): New.
(vrndn_f16): New.
(vrndnq_f16): New.
(vrndp_f16): New.
(vrndpq_f16): New.
(vrndx_f16): New.
(vrndxq_f16): New.
(vrsqrte_f16): New.
(vrsqrteq_f16): New.
(vrecps_f16): New.
(vrecpsq_f16): New.
(vrsqrts_f16): New.
(vrsqrtsq_f16): New.
(vsub_f16): New.
(vsubq_f16): New.

From c26f43f3127d18971769f891c252ec5e157026f9 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 15:36:34 +0100
Subject: [PATCH 14/17] [PATCH 14/17][ARM] Add NEON FP16 instrinsics.

2016-07-04  Matthew Wahab  

	* config/arm/arm_neon.h (vabd_f16): New.
	(vabdq_f16): New.
	(vabs_f16): New.
	(vabsq_f16): New.
	(vadd_f16): New.
	(vaddq_f16): New.
	(vcage_f16): New.
	(vcageq_f16): New.
	(vcagt_f16): New.
	(vcagtq_f16): New.
	(vcale_f16): New.
	(vcaleq_f16): New.
	(vcalt_f16): New.
	(vcaltq_f16): New.
	(vceq_f16): New.
	(vceqq_f16): New.
	(vceqz_f16): New.
	(vceqzq_f16): New.
	(vcge_f16): New.
	(vcgeq_f16): New.
	(vcgez_f16): New.
	(vcgezq_f16): New.
	(vcgt_f16): New.
	(vcgtq_f16): New.
	(vcgtz_f16): New.
	(vcgtzq_f16): New.
	(vcle_f16): New.
	(vcleq_f16): New.
	(vclez_f16): New.
	(vclezq_f16): New.
	(vclt_f16): New.
	(vcltq_f16): New.
	(vcltz_f16): New.
	(vcltzq_f16): New.
	(vcvt_f16_s16): New.
	(vcvt_f16_u16): New.
	(vcvt_s16_f16): New.
	(vcvt_u16_f16): New.
	(vcvtq_f16_s16): New.
	(vcvtq_f16_u16): New.
	(vcvtq_s16_f16): New.
	(vcvtq_u16_f16): New.
	(vcvta_s16_f16): New.
	(vcvta_u16_f16): New.
	(vcvtaq_s16_f16): New.
	(vcvtaq_u16_f16): New.
	(vcvtm_s16_f16): New.
	(vcvtm_u16_f16): New.
	

Re: [PATCH 13/17][ARM] Add VFP FP16 instrinsics.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:44, Matthew Wahab wrote:
> The ARMv8.2-A architecture introduces an optional FP16 extension adding
> half-precision floating point data processing instructions to the
> existing scalar (floating point) support. A future version of the ACLE
> will add support for these instructions and this patch implements that
> support.

Updated to use the standard arithmetic operations for vnegh_f16,
vaddh_f16, vsubh_f16, vmulh_f16 and vdivh_f16.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab  

* config.gcc (extra_headers): Add arm_fp16.h
* config/arm/arm_fp16.h: New.
* config/arm/arm_neon.h: Include "arm_fp16.h".

From a9042ae0e0ea4a61436663a1afea81ccf699e9f9 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 15:36:23 +0100
Subject: [PATCH 13/17] [PATCH 13/17][ARM] Add VFP FP16 instrinsics.

2016-07-04  Matthew Wahab  

	* config.gcc (extra_headers): Add arm_fp16.h
	* config/arm/arm_fp16.h: New.
	* config/arm/arm_neon.h: Include "arm_fp16.h".
---
 gcc/config.gcc|   2 +-
 gcc/config/arm/arm_fp16.h | 255 ++
 gcc/config/arm/arm_neon.h |   1 +
 3 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/arm/arm_fp16.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1f75f17..4333bc9 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -320,7 +320,7 @@ arc*-*-*)
 arm*-*-*)
 	cpu_type=arm
 	extra_objs="arm-builtins.o aarch-common.o"
-	extra_headers="mmintrin.h arm_neon.h arm_acle.h"
+	extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h"
 	target_type_format_char='%'
 	c_target_objs="arm-c.o"
 	cxx_target_objs="arm-c.o"
diff --git a/gcc/config/arm/arm_fp16.h b/gcc/config/arm/arm_fp16.h
new file mode 100644
index 000..c72d8c4
--- /dev/null
+++ b/gcc/config/arm/arm_fp16.h
@@ -0,0 +1,255 @@
+/* ARM FP16 intrinsics include file.
+
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_ARM_FP16_H
+#define _GCC_ARM_FP16_H 1
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+/* Intrinsics for FP16 instructions.  */
+#pragma GCC push_options
+#pragma GCC target ("fpu=fp-armv8")
+
+#if defined (__ARM_FEATURE_FP16_SCALAR_ARITHMETIC)
+
+typedef __fp16 float16_t;
+
+__extension__ static __inline float16_t __attribute__ ((__always_inline__))
+vabsh_f16 (float16_t __a)
+{
+  return __builtin_neon_vabshf (__a);
+}
+
+__extension__ static __inline float16_t __attribute__ ((__always_inline__))
+vaddh_f16 (float16_t __a, float16_t __b)
+{
+  return __a + __b;
+}
+
+__extension__ static __inline int32_t __attribute__ ((__always_inline__))
+vcvtah_s32_f16 (float16_t __a)
+{
+  return __builtin_neon_vcvtahssi (__a);
+}
+
+__extension__ static __inline uint32_t __attribute__ ((__always_inline__))
+vcvtah_u32_f16 (float16_t __a)
+{
+  return __builtin_neon_vcvtahusi (__a);
+}
+
+__extension__ static __inline float16_t __attribute__ ((__always_inline__))
+vcvth_f16_s32 (int32_t __a)
+{
+  return __builtin_neon_vcvthshf (__a);
+}
+
+__extension__ static __inline float16_t __attribute__ ((__always_inline__))
+vcvth_f16_u32 (uint32_t __a)
+{
+  return __builtin_neon_vcvthuhf (__a);
+}
+
+__extension__ static __inline float16_t __attribute__ ((__always_inline__))
+vcvth_n_f16_s32 (int32_t __a, const int __b)
+{
+  return __builtin_neon_vcvths_nhf (__a, __b);
+}
+
+__extension__ static __inline float16_t __attribute__ ((__always_inline__))
+vcvth_n_f16_u32 (uint32_t __a, const int __b)
+{
+  return __builtin_neon_vcvthu_nhf ((int32_t)__a, __b);
+}
+
+__extension__ static __inline int32_t __attribute__ ((__always_inline__))
+vcvth_n_s32_f16 (float16_t __a, const int __b)
+{
+  return __builtin_neon_vcvths_nsi (__a, __b);
+}
+
+__extension__ static __inline uint32_t 

Re: [PATCH 12/17][ARM] Add builtins for NEON FP16 intrinsics.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:42, Matthew Wahab wrote:
> This patch adds the builtins data for the ACLE intrinsics introduced to
> support the NEON instructions of the ARMv8.2-A FP16 extension.

Updated to fix the vsqrte/vrsqrte spelling mistake and correct the changelog.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab  

* config/arm/arm_neon_builtins.def (vadd): New (v8hf, v4hf
variants).
(vmulf): New (v8hf, v4hf variants).
(vfma): New (v8hf, v4hf variants).
(vfms): New (v8hf, v4hf variants).
(vsub): New (v8hf, v4hf variants).
(vcage): New (v8hf, v4hf variants).
(vcagt): New (v8hf, v4hf variants).
(vcale): New (v8hf, v4hf variants).
(vcalt): New (v8hf, v4hf variants).
(vceq): New (v8hf, v4hf variants).
(vcgt): New (v8hf, v4hf variants).
(vcge): New (v8hf, v4hf variants).
(vcle): New (v8hf, v4hf variants).
(vclt): New (v8hf, v4hf variants).
(vceqz): New (v8hf, v4hf variants).
(vcgez): New (v8hf, v4hf variants).
(vcgtz): New (v8hf, v4hf variants).
(vcltz): New (v8hf, v4hf variants).
(vclez): New (v8hf, v4hf variants).
(vabd): New (v8hf, v4hf variants).
(vmaxf): New (v8hf, v4hf variants).
(vmaxnm): New (v8hf, v4hf variants).
(vminf): New (v8hf, v4hf variants).
(vminnm): New (v8hf, v4hf variants).
(vpmaxf): New (v4hf variant).
(vpminf): New (v4hf variant).
(vpadd): New (v4hf variant).
(vrecps): New (v8hf, v4hf variants).
(vrsqrts): New (v8hf, v4hf variants).
(vabs): New (v8hf, v4hf variants).
(vneg): New (v8hf, v4hf variants).
(vrecpe): New (v8hf, v4hf variants).
(vrnd): New (v8hf, v4hf variants).
(vrnda): New (v8hf, v4hf variants).
(vrndm): New (v8hf, v4hf variants).
(vrndn): New (v8hf, v4hf variants).
(vrndp): New (v8hf, v4hf variants).
(vrndx): New (v8hf, v4hf variants).
(vrsqrte): New (v8hf, v4hf variants).
(vmul_lane): Add v4hf and v8hf variants.
(vmul_n): Add v4hf and v8hf variants.
(vext): New (v8hf, v4hf variants).
(vcvts): New (v8hi, v4hi variants).
(vcvts): New (v8hf, v4hf variants).
(vcvtu): New (v8hi, v4hi variants).
(vcvtu): New (v8hf, v4hf variants).
(vcvts_n): New (v8hf, v4hf variants).
(vcvtu_n): New (v8hi, v4hi variants).
(vcvts_n): New (v8hi, v4hi variants).
(vcvtu_n): New (v8hf, v4hf variants).
(vbsl): New (v8hf, v4hf variants).
(vcvtas): New (v8hf, v4hf variants).
(vcvtau): New (v8hf, v4hf variants).
(vcvtms): New (v8hf, v4hf variants).
(vcvtmu): New (v8hf, v4hf variants).
(vcvtns): New (v8hf, v4hf variants).
(vcvtnu): New (v8hf, v4hf variants).
(vcvtps): New (v8hf, v4hf variants).
(vcvtpu): New (v8hf, v4hf variants).

From 5df552f65de19667400c63ff939ed5e90a8cbadf Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 13:36:41 +0100
Subject: [PATCH 12/17] [PATCH 12/17][ARM] Add builtins for NEON FP16
 intrinsics.

2016-07-04  Matthew Wahab  

	* config/arm/arm_neon_builtins.def (vadd): New (v8hf, v4hf
	variants).
	(vmulf): New (v8hf, v4hf variants).
	(vfma): New (v8hf, v4hf variants).
	(vfms): New (v8hf, v4hf variants).
	(vsub): New (v8hf, v4hf variants).
	(vcage): New (v8hf, v4hf variants).
	(vcagt): New (v8hf, v4hf variants).
	(vcale): New (v8hf, v4hf variants).
	(vcalt): New (v8hf, v4hf variants).
	(vceq): New (v8hf, v4hf variants).
	(vcgt): New (v8hf, v4hf variants).
	(vcge): New (v8hf, v4hf variants).
	(vcle): New (v8hf, v4hf variants).
	(vclt): New (v8hf, v4hf variants).
	(vceqz): New (v8hf, v4hf variants).
	(vcgez): New (v8hf, v4hf variants).
	(vcgtz): New (v8hf, v4hf variants).
	(vcltz): New (v8hf, v4hf variants).
	(vclez): New (v8hf, v4hf variants).
	(vabd): New (v8hf, v4hf variants).
	(vmaxf): New (v8hf, v4hf variants).
	(vmaxnm): New (v8hf, v4hf variants).
	(vminf): New (v8hf, v4hf variants).
	(vminnm): New (v8hf, v4hf variants).
	(vpmaxf): New (v4hf variant).
	(vpminf): New (v4hf variant).
	(vpadd): New (v4hf variant).
	(vrecps): New (v8hf, v4hf variants).
	(vrsqrts): New (v8hf, v4hf variants).
	(vabs): New (v8hf, v4hf variants).
	(vneg): New (v8hf, v4hf variants).
	(vrecpe): New (v8hf, v4hf variants).
	(vrnd): New (v8hf, v4hf variants).
	(vrnda): New (v8hf, v4hf variants).
	(vrndm): New (v8hf, v4hf variants).
	(vrndn): New (v8hf, v4hf variants).
	(vrndp): New (v8hf, v4hf variants).
	(vrndx): New (v8hf, v4hf variants).
	(vrsqrte): New (v8hf, v4hf variants).
	(vmul_lane): Add v4hf and v8hf variants.
	(vmul_n): Add v4hf and v8hf variants.
	(vext): New (v8hf, v4hf 

Re: [PATCH 11/17][ARM] Add builtins for VFP FP16 intrinsics.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:41, Matthew Wahab wrote:
> The ACLE intrinsics introduced to support the ARMv8.2 FP16 extensions
> require that intrinsics for scalar floating point (VFP) instructions
> are available under different conditions from those for the NEON
> intrinsics.
>
> This patch adds the support code and builtins data for the new VFP
> intrinsics. Because of the similarities between the scalar and NEON
> builtins, the support code for the scalar builtins follows the code for
> the NEON builtins. The declarations for the VFP builtins are also added
> in this patch since the support code expects non-empty tables.

Updated the patch to drop the builtins for vneg, vadd, vsub, vmul and
vdiv, which are no longer needed.
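
For anyone new to the builtins tables: the same .def file is expanded
more than once, first to produce enumerators and then to produce the
data table, so the two always stay in step. The following is only a
sketch of that technique, with an inline macro list standing in for the
.def file. All DEMO_ and demo_ names are hypothetical and the entries
carry just a name, unlike the real builtin data.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a .def file: one entry per builtin.  */
#define DEMO_BUILTINS(X) \
  X (vabs) \
  X (vrnd) \
  X (vsqrt)

/* First expansion: enumerators that follow a base value.  */
enum demo_builtin
{
  DEMO_BUILTIN_BASE,
#define X(NAME) DEMO_BUILTIN_##NAME,
  DEMO_BUILTINS (X)
#undef X
  DEMO_BUILTIN_MAX
};

#define DEMO_BUILTIN_PATTERN_START (DEMO_BUILTIN_BASE + 1)

/* Second expansion: the data table, in the same order.  */
static const char *const demo_builtin_names[] =
{
#define X(NAME) #NAME,
  DEMO_BUILTINS (X)
#undef X
};

/* Map a function code back to its table entry.  */
static const char *
demo_builtin_name (enum demo_builtin code)
{
  return demo_builtin_names[code - DEMO_BUILTIN_PATTERN_START];
}

void
check_demo_builtins (void)
{
  assert (strcmp (demo_builtin_name (DEMO_BUILTIN_vabs), "vabs") == 0);
  assert (strcmp (demo_builtin_name (DEMO_BUILTIN_vsqrt), "vsqrt") == 0);
  assert (DEMO_BUILTIN_MAX - DEMO_BUILTIN_PATTERN_START == 3);
}
```

Dropping a builtin is then a one-line change in the list, with the enum
and the table shrinking together.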

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab  

* config/arm/arm-builtins.c (hf_UP): New.
(si_UP): New.
(vfp_builtin_data): New.  Update comment.
(enum arm_builtins): Include "arm_vfp_builtins.def".
(ARM_BUILTIN_VFP_PATTERN_START): New.
(arm_init_vfp_builtins): New.
(arm_init_builtins): Add arm_init_vfp_builtins.
(arm_expand_vfp_builtin): New.
(arm_expand_builtins): Update for arm_expand_vfp_builtin.  Fix
long line.
* config/arm/arm_vfp_builtins.def: New file.
* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
(arm-builtins.o): Likewise.

From 04896868ba0af25b31e9d23c3af5d3a88e70a564 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 15:33:14 +0100
Subject: [PATCH 11/17] [PATCH 11/17][ARM] Add builtins for VFP FP16
 intrinsics.

2016-07-04  Matthew Wahab  

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(enum arm_builtins): Include "arm_vfp_builtins.def".
	(ARM_BUILTIN_VFP_PATTERN_START): New.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtins): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.
---
 gcc/config/arm/arm-builtins.c   | 75 +
 gcc/config/arm/arm_vfp_builtins.def | 51 +
 gcc/config/arm/t-arm|  4 +-
 3 files changed, 121 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/arm/arm_vfp_builtins.def

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 5dd81b1..70bcc07 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -190,6 +190,8 @@ arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define ti_UP	 TImode
 #define ei_UP	 EImode
 #define oi_UP	 OImode
+#define hf_UP	 HFmode
+#define si_UP	 SImode
 
 #define UP(X) X##_UP
 
@@ -239,12 +241,22 @@ typedef struct {
   VAR11 (T, N, A, B, C, D, E, F, G, H, I, J, K) \
   VAR1 (T, N, L)
 
-/* The NEON builtin data can be found in arm_neon_builtins.def.
-   The mode entries in the following table correspond to the "key" type of the
-   instruction variant, i.e. equivalent to that which would be specified after
-   the assembler mnemonic, which usually refers to the last vector operand.
-   The modes listed per instruction should be the same as those defined for
-   that instruction's pattern in neon.md.  */
+/* The NEON builtin data can be found in arm_neon_builtins.def and
+   arm_vfp_builtins.def.  The entries in arm_neon_builtins.def require
+   TARGET_NEON to be true.  The entries in arm_vfp_builtins.def require
+   TARGET_VFP to be true.  The feature tests are checked when the builtins are
+   expanded.
+
+   The mode entries in the following table correspond to
+   the "key" type of the instruction variant, i.e. equivalent to that which
+   would be specified after the assembler mnemonic, which usually refers to the
+   last vector operand.  The modes listed per instruction should be the same as
+   those defined for that instruction's pattern in neon.md.  */
+
+static neon_builtin_datum vfp_builtin_data[] =
+{
+#include "arm_vfp_builtins.def"
+};
 
 static neon_builtin_datum neon_builtin_data[] =
 {
@@ -534,6 +546,10 @@ enum arm_builtins
 #undef CRYPTO2
 #undef CRYPTO3
 
+  ARM_BUILTIN_VFP_BASE,
+
+#include "arm_vfp_builtins.def"
+
   ARM_BUILTIN_NEON_BASE,
   ARM_BUILTIN_NEON_LANE_CHECK = ARM_BUILTIN_NEON_BASE,
 
@@ -542,6 +558,9 @@ enum arm_builtins
   ARM_BUILTIN_MAX
 };
 
+#define ARM_BUILTIN_VFP_PATTERN_START \
+  (ARM_BUILTIN_VFP_BASE + 1)
+
 #define ARM_BUILTIN_NEON_PATTERN_START \
   (ARM_BUILTIN_NEON_BASE + 1)
 
@@ -1033,6 +1052,20 @@ arm_init_neon_builtins (void)
 }
 }
 
+/* Set up all the scalar floating point builtins.  */
+
+static void

Re: [PATCH 9/17][ARM] Add NEON FP16 arithmetic instructions.

2016-07-04 Thread Matthew Wahab

On 18/05/16 01:58, Joseph Myers wrote:
> On Tue, 17 May 2016, Matthew Wahab wrote:
>
>> As with the VFP FP16 arithmetic instructions, operations on __fp16
>> values are done by conversion to single-precision. Any new optimization
>> supported by the instruction descriptions can only apply to code
>> generated using intrinsics added in this patch series.
>
> As with the scalar instructions, I think it is legitimate in most cases to
> optimize arithmetic via single precision to work direct on __fp16 values
> (and this would be natural for vectorization of __fp16 arithmetic).
>
>> A number of the instructions are modelled as two variants, one using
>> UNSPEC and the other using RTL operations, with the model used decided
>> by the -funsafe-math-optimizations flag. This follows the
>> single-precision instructions and is due to the half-precision
>> operations having the same conditions and restrictions on their use in
>> optimizations (when they are enabled).
>
> (Of course, these restrictions still apply.)

The F16 support generally follows the F32 implementation and, for F32,
direct arithmetic vector operations are only available when
-funsafe-math-optimizations is enabled. I want to check the behaviour of
the F16 operations when unsafe-math is enabled, so I'll defer the change
to use standard names for the vector operations to a follow-up patch.

There are still some changes from the previous patch:

- Two fma/fmsub patterns *fma4 and <*fmsub4 are
  dropped since they just duplicated *fma4_intrinsic and
  <*fmsub4_intrinsic.

- Patterns neon_vadd_unspec and neon_vsub_unspec are
  dropped, they were redundant.

- 2_fp16 is renamed to 2. This
  implements the abs and neg operations which are always safe to use.

- neon_vsqrte is renamed to neon_vrsqrte. This is a
  misspelled intrinsic that wasn't caught in testing because the
  relevant test case is missing. The intrinsic is fixed here and in
  other patches and an advsimd-intrinsics test added later in the
  (updated) series.

- neon_vcvt_n

* config/arm/iterators.md (VCVTHI): New.
(NEON_VCMP): Add UNSPEC_VCLT and UNSPEC_VCLE.  Fix a long line.
(NEON_VAGLTE): New.
(VFM_LANE_AS): New.
(VH_CVTTO): New.
(V_reg): Add HF, V4HF and V8HF.  Fix white-space.
(V_HALF): Add V4HF.  Fix white-space.
(V_if_elem): Add HF, V4HF and V8HF.  Fix white-space.
(V_s_elem): Likewise.
(V_sz_elem): Fix white-space.
(V_elem_ch): Likewise.
(VH_elem_ch): New.
(scalar_mul_constraint): Add V8HF and V4HF.
(Is_float_mode): Fix white-space.
(Is_d_reg): Fix white-space.
(q): Add HF.  Fix white-space.
(float_sup): New.
(float_SUP): New.
(cmp_op_unsp): Add UNSPEC_VCALE and UNSPEC_VCALT.
(neon_vfm_lane_as): New.
* config/arm/neon.md (add3_fp16): New.
(sub3_fp16): New.
(mul3add_neon): New.
(fma4_intrinsic): New.
(fmsub4_intrinsic): Fix white-space.
(fmsub4_intrinsic): New.
(2): New.
(neon_v): New.
(neon_v): New.
(neon_vrsqrte): New.
(neon_vpaddv4hf): New.
(neon_vadd): New.
(neon_vsub): New.
(neon_vmulf): New.
(neon_vfma): New.
(neon_vfms): New.
(neon_vc): New.
(neon_vc_fp16insn): New
(neon_vc_fp16insn_unspec): New.
(neon_vca): New.
(neon_vca_fp16insn): New.
(neon_vca_fp16insn_unspec): New.
(neon_vcz): New.
(neon_vabd): New.
(neon_vf): New.
(neon_vpfv4hf: New.
(neon_): New.
(neon_vrecps): New.
(neon_vrsqrts): New.
(neon_vrecpe): New (VH variant).
(neon_vdup_lane_internal): New.
(neon_vdup_lane): New.
(neon_vcvt): New (VCVTHI variant).
(neon_vcvt): New (VH variant).
(neon_vcvt_n): New (VH variant).
(neon_vcvt_n): New (VCVTHI variant).
(neon_vcvt): New.
(neon_vmul_lane): New.
(neon_vmul_n): New.
* config/arm/unspecs.md (UNSPEC_VCALE): New
(UNSPEC_VCALT): New.
(UNSPEC_VFMA_LANE): New.
(UNSPECS_VFMS_LANE): New.

testsuite/
2016-07-04  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-arith-1.c: Use arm_v8_2a_fp16_neon
options.  Add tests for float16x4_t and float16x8_t.

From 4cbebc297f74f0c2e3ddac600d7902083c09c934 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 16:19:57 +0100
Subject: [PATCH 09/17] [PATCH 9/17][ARM] Add NEON FP16 arithmetic
 instructions.

2016-07-04  Matthew Wahab  

	* config/arm/iterators.md (VCVTHI): New.
	(NEON_VCMP): Add UNSPEC_VCLT and UNSPEC_VCLE.  Fix a long line.
	(NEON_VAGLTE): New.
	(VFM_LANE_AS): New.
	(VH_CVTTO): New.
	(V_reg): Add HF, V4HF and V8HF.  Fix white-space.
	(V_HALF): Add V4HF.  Fix white-space.
	(V_if_elem): Add HF, V4HF and V8HF.  Fix white-space.
	(V_s_elem): 

Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-07-04 Thread Andreas Krebbel
On 07/04/2016 02:19 PM, Dominik Vogt wrote:
> Version 4 with the following change:
> 
>  * Rebased on top of the "Minor cleanup to
>    allocate_dynamic_stack_space" patch.  The "Drop excess size
>    used for run time allocated stack variables." patch needs an
>    update because it touches the same code as the patch in this
>    message.
> 
> Ran the testsuite on s390x biarch, s390 and x86_64.
> 
> On Fri, Jun 24, 2016 at 01:30:44PM +0100, Dominik Vogt wrote:
>>> The only open question I'm aware of is the
>>> stack-usage-2.c test.  I guess foo3() will not generate
>>>
>>>   stack usage might be ... bytes
>>>
>>> on any target anymore, and using alloca() with a constant size
>>> results in "unbounded".  It's unclear to me whether that message
>>> is ever generated, and if so, how to trigger it.
> 
> This point is still open.  If nobody has more comments Andreas
> will commit the (afaik already approved) patch soon and we can
> clean up the test case in a follow up patch.

I would like to see an explicit approval before doing the commit.  I think
it would also make sense to let other target maintainers have a look at
whether this might cause any problems.

Bye,

-Andreas-


> 
> Ciao
> 
> Dominik ^_^  ^_^
> 



Re: [lra] Cleanup the use of offmemok and don't count spilling cost for it

2016-07-04 Thread Jiong Wang



On 04/07/16 14:12, Bernd Schmidt wrote:

On 06/30/2016 07:24 PM, Jiong Wang wrote:

From my understanding, "offmemok" is used to represent a memory operand
who's address we want to reload, and searching of it's reference 
location

seems confirmed my understanding as it's always used together with MEM_P
check.

So this patch does the following modifications:

  * Only set offmemok to true if MEM_P is also true, as otherwise 
offmemok

is not used.

>   * Remove redundant MEM_P check which was used together with offmemok.

I really dislike this part. The various _ok variables say what is 
acceptable - the type of the operand doesn't really factor into that. 
I think the code becomes more confusing when merging the two.


  * Avoid the addition of spilling cost if offmemok is true, as an address
    calculation reload is not spilling.


This part seems to be plausible. I am however unclear how this would 
fix the ICE (if it does - Andreas?) since it only seems to modify cost 
computations. What exactly is preventing the correct sequence of 
events (reloading the address) from triggering without this patch?




Hi Bernd,

The ICE reported was "Max. number of generated reload insns per insn
is achieved".

The input rtx pattern which triggers this issue is:

(insn 41 38 120 4 (set (mem:V4SI
(plus:DI (plus:DI (reg/v/f:DI 116) (reg:DI 69 ))
 (const_int -16))
 (subreg:V4SI (reg:V16QI 134) 0)) {movv4si}

And the corresponding s390 pattern is "mov" for V_128.

(define_insn "mov"
  [(set (match_operand:V_128 0 "" "=v,v,R,  v,  v,  v,  v,  v,v,d")
  (match_operand:V_128 1 "" "v,R,v,j00,jm1,jyy,jxx,jKK,d,v"))]

As the offset "-16" does not qualify as an s390_short_displacement, we need a
reload.

Ideally we want alternative 2, for which gcc simply reloads the mem
address into an address register.

r157:DI=r116:DI+r69:DI-0x10
[r157:DI]=r134:V16QI#0

After r237277, however, gcc treats the reload of insn 41 as a spill
and thus increases its cost, so alternative 8 beats alternative
2 and the following reload sequence is generated.

r157:V4SI=r134:V16QI#0
[r116:DI+r69:DI-0x10]=r157:V4SI

GCC moves the vector register into a general register, then uses a second
instruction to store the general register into memory so it can match
alternative 8, which is "v", "d".

However, the second instruction still contains the illegal mem address,
so a further reload is triggered, and gcc hits the max number of reloads
issue above.

The functional change of this patch is to make gcc not treat a memory
address reload as a spill, which is a regression caused by r237277.
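
To make the cost argument concrete, here is a toy model of the alternative
selection.  It is only a sketch: the struct, the function names, the base
costs and the penalty value are all made up for illustration and do not
mirror the LRA sources.  It shows how charging an address reload as a spill
can flip the choice from alternative 2 to alternative 8.

```cpp
#include <vector>

// Toy model of alternative selection; names and numbers are illustrative
// assumptions and are not taken from the LRA implementation.
struct Alternative {
  int index;            // alternative number in the insn pattern
  int base_cost;        // cost of the reloads this alternative needs
  bool address_reload;  // true if only the memory address is reloaded
};

// Penalty added when a reload is counted as a spill (assumed value).
constexpr int kSpillPenalty = 3;

// Return the index of the cheapest alternative.  When
// count_address_reload_as_spill is true, address reloads are also charged
// the spill penalty, which models the regression described above.
int pick_alternative(const std::vector<Alternative>& alts,
                     bool count_address_reload_as_spill) {
  int best_index = -1;
  int best_cost = 0;
  for (const Alternative& alt : alts) {
    int cost = alt.base_cost;
    if (alt.address_reload && count_address_reload_as_spill)
      cost += kSpillPenalty;
    if (best_index < 0 || cost < best_cost) {
      best_index = alt.index;
      best_cost = cost;
    }
  }
  return best_index;
}

// The two candidates from the discussion, with made-up base costs.
std::vector<Alternative> example_alternatives() {
  return {{2, 1, true}, {8, 2, false}};
}
```

With these assumed costs, pick_alternative selects alternative 2 when the
address reload is not charged as a spill and alternative 8 when it is.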

Does this explanation make sense?

Thanks.

Regards,
Jiong



Bernd




Re: [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-07-04 Thread Matthew Wahab

On 19/05/16 15:54, Matthew Wahab wrote:
> On 18/05/16 16:20, Joseph Myers wrote:
>> On Wed, 18 May 2016, Matthew Wahab wrote:
>>
>> In short: instructions for direct HFmode arithmetic should be described
>> with patterns with the standard names.  It's the job of the
>> architecture-independent compiler to ensure that fp16 arithmetic in the
>> user's source code only generates direct fp16 arithmetic in GIMPLE (and
>> thus ends up using those patterns) if that is a correct representation of
>> the source code's semantics according to ACLE.
>>
>> The intrinsics you provide can then be written to use direct arithmetic,
>> and rely on convert_to_real_1 eliminating the promotions, rather than
>> needing built-in functions at all, just like many arm_neon.h intrinsics
>> make direct use of GNU C vector arithmetic.
>
> I think it's clear that this has exhausted my knowledge of FP semantics.
>
> Forcing promotion to single-precision was to settle concerns brought up in
> internal discussions about __fp16 semantics. I'll see if anybody has any
> problem with the changes you suggest.

This patch changes the implementation to use the standard names for the
HFmode arithmetic. Later patches will also be updated to use the
arithmetic operators where appropriate.

Changes since the last version of this patch:
- The standard names for plus, minus, mult, div and fma are defined for
  HF mode.
- The patterns supporting the new ACLE intrinsics vnegh_f16, vaddh_f16,
  vsubh_f16, vmulh_f16 and vdivh_f16 are removed, the arithmetic
  operators will be used instead.
- The tests are updated to expect f16 instructions rather than the f32
  instructions that were previously emitted.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab  

* config/arm/iterators.md (Code iterators): Fix some white-space
in the comments.
(GLTE): New.
(ABSNEG): New.
(FCVT): Moved from vfp.md.
(VCVT_HF_US_N): New.
(VCVT_SI_US_N): New.
(VCVT_HF_US): New.
(VCVTH_US): New.
(FP16_RND): New.
(absneg_str): New.
(FCVTI32typename): Moved from vfp.md.
(sup): Add UNSPEC_VCVTA_S, UNSPEC_VCVTA_U, UNSPEC_VCVTM_S,
UNSPEC_VCVTM_U, UNSPEC_VCVTN_S, UNSPEC_VCVTN_U, UNSPEC_VCVTP_S,
UNSPEC_VCVTP_U, UNSPEC_VCVT_HF_S_N, UNSPEC_VCVT_HF_U_N,
UNSPEC_VCVT_SI_S_N, UNSPEC_VCVT_SI_U_N,  UNSPEC_VCVTH_S_N,
UNSPEC_VCVTH_U_N, UNSPEC_VCVTH_S and UNSPEC_VCVTH_U.
(vcvth_op): New.
(fp16_rnd_str): New.
(fp16_rnd_insn): New.
* config/arm/unspecs.md (UNSPEC_VCVT_HF_S_N): New.
(UNSPEC_VCVT_HF_U_N): New.
(UNSPEC_VCVT_SI_S_N): New.
(UNSPEC_VCVT_SI_U_N): New.
(UNSPEC_VCVTH_S): New.
(UNSPEC_VCVTH_U): New.
(UNSPEC_VCVTA_S): New.
(UNSPEC_VCVTA_U): New.
(UNSPEC_VCVTM_S): New.
(UNSPEC_VCVTM_U): New.
(UNSPEC_VCVTN_S): New.
(UNSPEC_VCVTN_U): New.
(UNSPEC_VCVTP_S): New.
(UNSPEC_VCVTP_U): New.
(UNSPEC_VCVTP_S): New.
(UNSPEC_VCVTP_U): New.
(UNSPEC_VRND): New.
(UNSPEC_VRNDA): New.
(UNSPEC_VRNDI): New.
(UNSPEC_VRNDM): New.
(UNSPEC_VRNDN): New.
(UNSPEC_VRNDP): New.
(UNSPEC_VRNDX): New.
* config/arm/vfp.md (hf2): New.
(neon_vabshf): New.
(neon_vhf): New.
(neon_vrndihf): New.
(addhf3): New.
(subhf3): New.
(divhf3): New.
(mulhf3): New.
(*mulsf3neghf_vfp): New.
(*negmulhf3_vfp): New.
(*mulsf3addhf_vfp): New.
(*mulhf3subhf_vfp): New.
(*mulhf3neghfaddhf_vfp): New.
(*mulhf3neghfsubhf_vfp): New.
(fmahf4): New.
(neon_vfmahf): New.
(fmsubhf4_fp16): New.
(neon_vfmshf): New.
(*fnmsubhf4): New.
(*fnmaddhf4): New.
(neon_vsqrthf): New.
(neon_vrsqrtshf): New.
(FCVT): Move to iterators.md.
(FCVTI32typename): Likewise.
(neon_vcvthhf): New.
(neon_vcvthsi): New.
(neon_vcvth_nhf_unspec): New.
(neon_vcvth_nhf): New.
(neon_vcvth_nsi_unspec): New.
(neon_vcvth_nsi): New.
(neon_vcvthsi): New.
(neon_hf): New.

testsuite/
2016-07-04  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-arith-1.c: New.
* gcc.target/arm/armv8_2-fp16-conv-1.c: New.

>From 780903a1c5ef2e4393c9ee2843307d9041f36f87 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 14:49:17 +0100
Subject: [PATCH 08/17] [PATCH 8/17][ARM] Add VFP FP16 arithmetic instructions.

2016-07-04  Matthew Wahab  

	* config/arm/iterators.md (Code iterators): Fix some white-space
	in the comments.
	(GLTE): New.
	

Re: [PATCH 7/17][ARM] Add FP16 data movement instructions.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:34, Matthew Wahab wrote:
> The ARMv8.2-A FP16 extension adds a number of instructions to support
> data movement for FP16 values. This patch adds these instructions to the
> backend, making them available to the compiler code generator.

This updates the expected output for the test added by the patch since
gcc now generates ldrh/strh for some indexed loads/stores which were
previously done with vld1/vstr1.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

2016-07-04  Matthew Wahab  
Jiong Wang 

* config/arm/arm.c (coproc_secondary_reload_class): Make HFmode
available when FP16 instructions are available.
(output_move_vfp): Add support for 16-bit data moves.
(arm_validize_comparison): Fix some white-space.  Support HFmode
by conversion to SFmode.
* config/arm/arm.md (truncdfhf2): Fix a comment.
(extendhfdf2): Likewise.
(cstorehf4): New.
(movsicc): Fix some white-space.
(movhfcc): New.
(movsfcc): Fix some white-space.
(*cmovhf): New.
* config/arm/vfp.md (*arm_movhi_vfp): Disable when VFP FP16
instructions are available.
(*thumb2_movhi_vfp): Likewise.
(*arm_movhi_fp16): New.
(*thumb2_movhi_fp16): New.
(*movhf_vfp_fp16): New.
(*movhf_vfp_neon): Disable when VFP FP16 instructions are
available.
(*movhf_vfp): Likewise.
(extendhfsf2): Enable when VFP FP16 instructions are available.
(truncsfhf2):  Enable when VFP FP16 instructions are available.

testsuite/
2016-07-04  Matthew Wahab  

* gcc.target/arm/armv8_2-fp16-move-1.c: New.

>From 0633bbb2f2d43a6994adaeb44898e18c304ee728 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 13:35:04 +0100
Subject: [PATCH 07/17] [PATCH 7/17][ARM] Add FP16 data movement instructions.

2016-07-04  Matthew Wahab  
	Jiong Wang 

	* config/arm/arm.c (coproc_secondary_reload_class): Make HFmode
	available when FP16 instructions are available.
	(output_move_vfp): Add support for 16-bit data moves.
	(arm_validize_comparison): Fix some white-space.  Support HFmode
	by conversion to SFmode.
	* config/arm/arm.md (truncdfhf2): Fix a comment.
	(extendhfdf2): Likewise.
	(cstorehf4): New.
	(movsicc): Fix some white-space.
	(movhfcc): New.
	(movsfcc): Fix some white-space.
	(*cmovhf): New.
	* config/arm/vfp.md (*arm_movhi_vfp): Disable when VFP FP16
	instructions are available.
	(*thumb2_movhi_vfp): Likewise.
	(*arm_movhi_fp16): New.
	(*thumb2_movhi_fp16): New.
	(*movhf_vfp_fp16): New.
	(*movhf_vfp_neon): Disable when VFP FP16 instructions are
	available.
	(*movhf_vfp): Likewise.
	(extendhfsf2): Enable when VFP FP16 instructions are available.
	(truncsfhf2):  Enable when VFP FP16 instructions are available.

testsuite/
2016-07-04  Matthew Wahab  

	* gcc.target/arm/armv8_2-fp16-move-1.c: New.
---
 gcc/config/arm/arm.c   |  16 +-
 gcc/config/arm/arm.md  |  81 -
 gcc/config/arm/vfp.md  | 182 -
 gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c | 165 +++
 4 files changed, 432 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ce18f75..f07e2c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13187,7 +13187,7 @@ coproc_secondary_reload_class (machine_mode mode, rtx x, bool wb)
 {
   if (mode == HFmode)
 {
-  if (!TARGET_NEON_FP16)
+  if (!TARGET_NEON_FP16 && !TARGET_VFP_FP16INST)
 	return GENERAL_REGS;
   if (s_register_operand (x, mode) || neon_vector_mem_operand (x, 2, true))
 	return NO_REGS;
@@ -18638,6 +18638,8 @@ output_move_vfp (rtx *operands)
   rtx reg, mem, addr, ops[2];
   int load = REG_P (operands[0]);
   int dp = GET_MODE_SIZE (GET_MODE (operands[0])) == 8;
+  int sp = (!TARGET_VFP_FP16INST
+	|| GET_MODE_SIZE (GET_MODE (operands[0])) == 4);
   int integer_p = GET_MODE_CLASS (GET_MODE (operands[0])) == MODE_INT;
   const char *templ;
   char buff[50];
@@ -18684,7 +18686,7 @@ output_move_vfp (rtx *operands)
 
   sprintf (buff, templ,
 	   load ? "ld" : "st",
-	   dp ? "64" : "32",
+	   dp ? "64" : sp ? "32" : "16",
 	   dp ? "P" : "",
 	   integer_p ? "\t%@ int" : "");
   output_asm_insn (buff, ops);
@@ -29326,7 +29328,7 @@ arm_validize_comparison (rtx *comparison, rtx * op1, rtx * op2)
 {
   enum rtx_code code = GET_CODE (*comparison);
   int code_int;
-  machine_mode mode = (GET_MODE (*op1) == VOIDmode) 
+  machine_mode mode = (GET_MODE (*op1) == VOIDmode)
 ? GET_MODE (*op2) : GET_MODE (*op1);
 
  

Re: [PATCH 3/17][Testsuite] Add ARM support for ARMv8.2-A with FP16 arithmetic instructions.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:26, Matthew Wahab wrote:
> The ARMv8.2-A FP16 extension adds to both the VFP and the NEON
> instruction sets. This patch adds support to the testsuite to select
> targets and set options for tests that make use of these
> instructions. It also adds documentation for ARMv8.1-A selectors.

This is a rebase of the patch to take account of changes in
sourcebuild.texi.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

2016-07-04  Matthew Wahab  

* doc/sourcebuild.texi (ARM-specific attributes): Add anchor for
arm_v8_1a_neon_ok.  Add entries for arm_v8_2a_fp16_scalar_ok,
arm_v8_2a_fp16_scalar_hw, arm_v8_2a_fp16_neon_ok and
arm_v8_2a_fp16_neon_hw.
(Add options): Add entries for arm_v8_1a_neon, arm_v8_2a_fp16_scalar,
arm_v8_2a_fp16_neon.
* lib/target-supports.exp
(add_options_for_arm_v8_2a_fp16_scalar): New.
(add_options_for_arm_v8_2a_fp16_neon): New.
(check_effective_target_arm_arch_v8_2a_ok): Auto-generate.
(add_options_for_arm_arch_v8_2a): Auto-generate.
(check_effective_target_arm_arch_v8_2a_multilib): Auto-generate.
(check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache): New.
(check_effective_target_arm_v8_2a_fp16_scalar_ok): New.
(check_effective_target_arm_v8_2a_fp16_neon_ok_nocache): New.
(check_effective_target_arm_v8_2a_fp16_neon_ok): New.
(check_effective_target_arm_v8_2a_fp16_scalar_hw): New.
(check_effective_target_arm_v8_2a_fp16_neon_hw): New.

>From 47ead98473ac1f6dda5df2638800e5b4c8ec38a1 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 13:34:30 +0100
Subject: [PATCH 03/17] [PATCH 3/17][Testsuite] Add ARM support for ARMv8.2-A
 with FP16   arithmetic instructions.

2016-07-04  Matthew Wahab  

	* doc/sourcebuild.texi (ARM-specific attributes): Add anchor for
	arm_v8_1a_neon_ok.  Add entries for arm_v8_2a_fp16_scalar_ok,
	arm_v8_2a_fp16_scalar_hw, arm_v8_2a_fp16_neon_ok and
	arm_v8_2a_fp16_neon_hw.
	(Add options): Add entries for arm_v8_1a_neon, arm_v8_2a_fp16_scalar,
	arm_v8_2a_fp16_neon.
	* lib/target-supports.exp
	(add_options_for_arm_v8_2a_fp16_scalar): New.
	(add_options_for_arm_v8_2a_fp16_neon): New.
	(check_effective_target_arm_arch_v8_2a_ok): Auto-generate.
	(add_options_for_arm_arch_v8_2a): Auto-generate.
	(check_effective_target_arm_arch_v8_2a_multilib): Auto-generate.
	(check_effective_target_arm_v8_2a_fp16_scalar_ok_nocache): New.
	(check_effective_target_arm_v8_2a_fp16_scalar_ok): New.
	(check_effective_target_arm_v8_2a_fp16_neon_ok_nocache): New.
	(check_effective_target_arm_v8_2a_fp16_neon_ok): New.
	(check_effective_target_arm_v8_2a_fp16_scalar_hw): New.
	(check_effective_target_arm_v8_2a_fp16_neon_hw): New.
---
 gcc/doc/sourcebuild.texi  |  40 ++
 gcc/testsuite/lib/target-supports.exp | 145 +-
 2 files changed, 184 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 1fa962d..4f83307 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1596,6 +1596,7 @@ ARM target supports @code{-mfpu=neon-fp-armv8 -mfloat-abi=softfp}.
 Some multilibs may be incompatible with these options.
 
 @item arm_v8_1a_neon_ok
+@anchor{arm_v8_1a_neon_ok}
 ARM target supports options to generate ARMv8.1 Adv.SIMD instructions.
 Some multilibs may be incompatible with these options.
 
@@ -1607,6 +1608,28 @@ arm_v8_1a_neon_ok.
 @item arm_acq_rel
 ARM target supports acquire-release instructions.
 
+@item arm_v8_2a_fp16_scalar_ok
+@anchor{arm_v8_2a_fp16_scalar_ok}
+ARM target supports options to generate instructions for ARMv8.2 and
+scalar instructions from the FP16 extension.  Some multilibs may be
+incompatible with these options.
+
+@item arm_v8_2a_fp16_scalar_hw
+ARM target supports executing instructions for ARMv8.2 and scalar
+instructions from the FP16 extension.  Some multilibs may be
+incompatible with these options.  Implies arm_v8_2a_fp16_neon_ok.
+
+@item arm_v8_2a_fp16_neon_ok
+@anchor{arm_v8_2a_fp16_neon_ok}
+ARM target supports options to generate instructions from ARMv8.2 with
+the FP16 extension.  Some multilibs may be incompatible with these
+options.  Implies arm_v8_2a_fp16_scalar_ok.
+
+@item arm_v8_2a_fp16_neon_hw
+ARM target supports executing instructions from ARMv8.2 with the FP16
+extension.  Some multilibs may be incompatible with these options.
+Implies arm_v8_2a_fp16_neon_ok and arm_v8_2a_fp16_scalar_hw.
+
 @item arm_prefer_ldrd_strd
 ARM target prefers @code{LDRD} and @code{STRD} instructions over
 @code{LDM} and @code{STM} instructions.
@@ -2091,6 +2114,23 @@ the @ref{arm_neon_fp16_ok,,arm_neon_fp16_ok effective target keyword}.
 arm vfp3 floating point support; see
 the @ref{arm_vfp3_ok,,arm_vfp3_ok effective 

Re: [PATCH 1/17][ARM] Add ARMv8.2-A command line option and profile.

2016-07-04 Thread Matthew Wahab

On 17/05/16 15:22, Matthew Wahab wrote:
> This patch adds the command options for the architecture ARMv8.2-A and
> the half-precision extension. The architecture is selected by
> -march=armv8.2-a and has all the properties of -march=armv8.1-a.
>
> This patch also enables the CRC extension (+crc) which is required
> for both ARMv8.2-A and ARMv8.1-A architectures but is not currently
> enabled by default for -march=armv8.1-a.
>
> The half-precision extension is selected using the extension +fp16. This
> enables the VFP FP16 instructions if an ARMv8 VFP unit is also
> specified, e.g. by -mfpu=fp-armv8. It also enables the FP16 NEON
> instructions if an ARMv8 NEON unit is specified, e.g. by
> -mfpu=neon-fp-armv8. Note that if the NEON FP16 instructions are enabled
> then so are the VFP FP16 instructions.

This is a minor respin that moves the setting of arm_fp16_inst in
arm_option_override to immediately before it is used to set the required
arm_fp16_format.
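
The rules quoted above for when +fp16 enables the VFP and NEON FP16
instructions can be summarised in a small model.  This is only a sketch:
the enum, struct and function names are hypothetical and are not the
identifiers used in arm.c or arm.h; only the rules come from the
description.

```cpp
// Simplified model of the option handling described above.  All names
// here are hypothetical; only the rules are taken from the description.
enum class Fpu { None, FpArmv8, NeonFpArmv8 };

struct Fp16Features {
  bool vfp_fp16;   // scalar (VFP) FP16 instructions enabled
  bool neon_fp16;  // Adv.SIMD (NEON) FP16 instructions enabled
};

Fp16Features fp16_features(bool arch_has_fp16, Fpu fpu) {
  Fp16Features f{false, false};
  if (!arch_has_fp16)
    return f;
  // An ARMv8 VFP unit is enough for the scalar instructions.  A NEON unit
  // enables the NEON instructions and, with them, the scalar ones.
  f.vfp_fp16 = (fpu == Fpu::FpArmv8 || fpu == Fpu::NeonFpArmv8);
  f.neon_fp16 = (fpu == Fpu::NeonFpArmv8);
  return f;
}
```

In this model -march=armv8.2-a+fp16 with -mfpu=fp-armv8 yields only the
scalar instructions, while -mfpu=neon-fp-armv8 yields both sets.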

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

2016-07-04  Matthew Wahab  

* config/arm/arm-arches.def ("armv8.1-a"): Add FL_CRC32.
("armv8.2-a"): New.
("armv8.2-a+fp16"): New.
* config/arm/arm-protos.h (FL2_ARCH8_2): New.
(FL2_FP16INST): New.
(FL2_FOR_ARCH8_2A): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.c (arm_arch8_2): New.
(arm_fp16_inst): New.
(arm_option_override): Set arm_arch8_2 and arm_fp16_inst.  Check
for incompatible fp16-format settings.
* config/arm/arm.h (TARGET_VFP_FP16INST): New.
(TARGET_NEON_FP16INST): New.
(arm_arch8_2): Declare.
(arm_fp16_inst): Declare.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add entries for
march=armv8.2-a and march=armv8.2-a+fp16.
* config/arm/t-aprofile (Arch Matches): Add entries for armv8.2-a
and armv8.2-a+fp16.
* doc/invoke.texi (ARM Options): Add "-march=armv8.1-a",
"-march=armv8.2-a" and "-march=armv8.2-a+fp16".

>From e165b4e8bc4338608ff9505a7fd1a26d8a996b0a Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Thu, 7 Apr 2016 13:31:24 +0100
Subject: [PATCH 01/17] [PATCH 1/17][ARM] Add ARMv8.2-A command line option and
 profile.

2016-07-04  Matthew Wahab  

	* config/arm/arm-arches.def ("armv8.1-a"): Add FL_CRC32.
	("armv8.2-a"): New.
	("armv8.2-a+fp16"): New.
	* config/arm/arm-protos.h (FL2_ARCH8_2): New.
	(FL2_FP16INST): New.
	(FL2_FOR_ARCH8_2A): New.
	* config/arm/arm-tables.opt: Regenerate.
	* config/arm/arm.c (arm_arch8_2): New.
	(arm_fp16_inst): New.
	(arm_option_override): Set arm_arch8_2 and arm_fp16_inst.  Check
	for incompatible fp16-format settings.
	* config/arm/arm.h (TARGET_VFP_FP16INST): New.
	(TARGET_NEON_FP16INST): New.
	(arm_arch8_2): Declare.
	(arm_fp16_inst): Declare.
	* config/arm/bpabi.h (BE8_LINK_SPEC): Add entries for
	march=armv8.2-a and march=armv8.2-a+fp16.
	* config/arm/t-aprofile (Arch Matches): Add entries for armv8.2-a
	and armv8.2-a+fp16.
	* doc/invoke.texi (ARM Options): Add "-march=armv8.1-a",
	"-march=armv8.2-a" and "-march=armv8.2-a+fp16".
---
 gcc/config/arm/arm-arches.def | 10 --
 gcc/config/arm/arm-protos.h   |  4 
 gcc/config/arm/arm-tables.opt | 10 --
 gcc/config/arm/arm.c  | 15 +++
 gcc/config/arm/arm.h  | 14 ++
 gcc/config/arm/bpabi.h|  4 
 gcc/config/arm/t-aprofile |  2 ++
 gcc/doc/invoke.texi   | 13 +
 8 files changed, 68 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index fd02b18..2b4a80e 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -58,10 +58,16 @@ ARM_ARCH("armv7e-m", cortexm4,  7EM,	ARM_FSET_MAKE_CPU1 (FL_CO_PROC |	  FL_F
 ARM_ARCH("armv8-a", cortexa53,  8A,	ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_FOR_ARCH8A))
 ARM_ARCH("armv8-a+crc",cortexa53, 8A,   ARM_FSET_MAKE_CPU1 (FL_CO_PROC | FL_CRC32  | FL_FOR_ARCH8A))
 ARM_ARCH("armv8.1-a", cortexa53,  8A,
-	  ARM_FSET_MAKE (FL_CO_PROC | FL_FOR_ARCH8A,  FL2_FOR_ARCH8_1A))
+	  ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A,
+			 FL2_FOR_ARCH8_1A))
 ARM_ARCH("armv8.1-a+crc",cortexa53, 8A,
 	  ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A,
 			 FL2_FOR_ARCH8_1A))
+ARM_ARCH ("armv8.2-a", cortexa53,  8A,
+	  ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A,
+			 FL2_FOR_ARCH8_2A))
+ARM_ARCH ("armv8.2-a+fp16", cortexa53,  8A,
+	  ARM_FSET_MAKE (FL_CO_PROC | FL_CRC32 | FL_FOR_ARCH8A,
+			 FL2_FOR_ARCH8_2A | FL2_FP16INST))
 ARM_ARCH("iwmmxt",  iwmmxt, 5TE,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT))
 ARM_ARCH("iwmmxt2", iwmmxt2,5TE,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE 

Re: [PATCH v2] S/390: Add support for z13 instructions lochi and locghi.

2016-07-04 Thread Dominik Vogt
On Mon, Jul 04, 2016 at 02:56:06PM +0200, Andreas Krebbel wrote:
> On 07/01/2016 04:31 PM, Dominik Vogt wrote:
> Could you try merging the two testcases into one by putting the lp64 and
> ! lp64 as condition on the scan assembler expressions?

Done.

> Also I don't think it is really necessary to have these multiline matching
> checks in such a small test. It should be enough to just make sure that the
> expected mnemonic occurs somewhere. Sure this wouldn't catch cases where
> e.g. the mnemonics are in the asm file but not in the right function but
> I think the risk should be really low here.

Done.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.md: Add "z13" cpu_facility.
("*movcc"): Add support for z13 instructions lochi and locghi.
* config/s390/predicates.md ("loc_operand"): New predicate for "load on
condition" type instructions.
gcc/testsuite/ChangeLog

* gcc.target/s390/vector/vec-scalar-cmp-1.c: Expect lochi instead of
locr.
* gcc.target/s390/loc-1.c: New test.
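
For readers unfamiliar with the new predicate: the constant half of
loc_operand in the patch is a signed 16-bit range check.  The sketch below
restates that check and shows the kind of source that can map to a
load-immediate-on-condition.  The function names are made up for
illustration, and whether the compiler really emits lochi or locghi for
the second function depends on target and options.

```cpp
#include <cstdint>

// Same range test as the match_test in the loc_operand predicate: a
// constant is acceptable if it fits a signed 16-bit immediate.
bool fits_loc_immediate(std::int64_t value) {
  return value >= -32768 && value <= 32767;
}

// Source-level shape of a conditional load of a small constant: rc is
// replaced by the constant only if cond is nonzero.
long select_small_constant(long rc, long cond) {
  if (cond)
    rc = 42;
  return rc;
}
```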
>From facfa26100f69f0e648f6a87534e44498cf9a0d2 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 25 May 2016 11:47:00 +0100
Subject: [PATCH] S/390: Add support for z13 instructions lochi and locghi.

---
 gcc/config/s390/predicates.md  |  7 +++
 gcc/config/s390/s390.md| 24 ++
 gcc/testsuite/gcc.target/s390/loc-1.c  | 22 
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c  |  4 ++--
 4 files changed, 47 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/loc-1.c

diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index e66f4a4..75e4cb8 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -182,6 +182,13 @@
   return s390_contiguous_bitmask_p (INTVAL (op), GET_MODE_BITSIZE (mode), NULL, NULL);
 })
 
+;; Return true if OP is legitimate for any LOC instruction.
+
+(define_predicate "loc_operand"
+  (ior (match_operand 0 "nonimmediate_operand")
+  (and (match_code "const_int")
+	   (match_test "INTVAL (op) <= 32767 && INTVAL (op) >= -32768"
+
 ;; operators --
 
 ;; Return nonzero if OP is a valid comparison operator
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index f8c61a8..6d8d041 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -483,7 +483,7 @@
   (const (symbol_ref "s390_tune_attr")))
 
 (define_attr "cpu_facility"
-  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vec"
+  "standard,ieee,zarch,cpu_zarch,longdisp,extimm,dfp,z10,z196,zEC12,vec,z13"
   (const_string "standard"))
 
 (define_attr "enabled" ""
@@ -528,7 +528,12 @@
 
  (and (eq_attr "cpu_facility" "vec")
   (match_test "TARGET_VX"))
-	 (const_int 1)]
+	 (const_int 1)
+
+ (and (eq_attr "cpu_facility" "z13")
+  (match_test "TARGET_Z13"))
+	 (const_int 1)
+	 ]
 	(const_int 0)))
 
 ;; Pipeline description for z900.  For lack of anything better,
@@ -6309,21 +6314,23 @@
  XEXP (operands[1], 1));
 })
 
-; locr, loc, stoc, locgr, locg, stocg
+; locr, loc, stoc, locgr, locg, stocg, lochi, locghi
 (define_insn_and_split "*movcc"
-  [(set (match_operand:GPR 0 "nonimmediate_operand"   "=d,d,d,d,S,S,")
+  [(set (match_operand:GPR 0 "nonimmediate_operand"   "=d,d,d,d,d,d,S,S,")
 	(if_then_else:GPR
 	  (match_operator 1 "s390_comparison"
-	[(match_operand 2 "cc_reg_operand"" c,c,c,c,c,c,c")
+	[(match_operand 2 "cc_reg_operand"" c,c,c,c,c,c,c,c,c")
 	 (match_operand 5 "const_int_operand" "")])
-	  (match_operand:GPR 3 "nonimmediate_operand" " d,0,S,0,d,0,S")
-	  (match_operand:GPR 4 "nonimmediate_operand" " 0,d,0,S,0,d,S")))]
+	  (match_operand:GPR 3 "loc_operand" " d,0,S,0,K,0,d,0,S")
+	  (match_operand:GPR 4 "loc_operand" " 0,d,0,S,0,K,0,d,S")))]
   "TARGET_Z196"
   "@
locr%C1\t%0,%3
locr%D1\t%0,%4
loc%C1\t%0,%3
loc%D1\t%0,%4
+   lochi%C1\t%0,%h3
+   lochi%D1\t%0,%h4
stoc%C1\t%3,%0
stoc%D1\t%4,%0
#"
@@ -6340,7 +6347,8 @@
 	 (match_dup 0)
 	 (match_dup 4)))]
   ""
-  [(set_attr "op_type" "RRF,RRF,RSY,RSY,RSY,RSY,*")])
+  [(set_attr "op_type" "RRF,RRF,RSY,RSY,RIE,RIE,RSY,RSY,*")
+   (set_attr "cpu_facility" "*,*,*,*,z13,z13,*,*,*")])
 
 ;;
 ;;- Multiply instructions.
diff --git a/gcc/testsuite/gcc.target/s390/loc-1.c b/gcc/testsuite/gcc.target/s390/loc-1.c
new file mode 100644
index 000..26dbd9c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/loc-1.c
@@ -0,0 +1,22 @@
+/* Test load on condition patterns.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=z13 -mzarch" } */
+
+unsigned long loc_r (unsigned long rc, unsigned long cond, unsigned long val)
+{
+  if (cond)
+rc = val;
+  return rc;
+}
+/* { dg-final { scan-assembler 

Re: [lra] Cleanup the use of offmemok and don't count spilling cost for it

2016-07-04 Thread Bernd Schmidt

On 06/30/2016 07:24 PM, Jiong Wang wrote:

From my understanding, "offmemok" is used to represent a memory operand
whose address we want to reload, and searching for its reference locations
seems to confirm my understanding, as it's always used together with a MEM_P
check.

So this patch does the following modifications:

  * Only set offmemok to true if MEM_P is also true, as otherwise offmemok
is not used.

>   * Remove redundant MEM_P check which was used together with offmemok.

I really dislike this part. The various _ok variables say what is 
acceptable - the type of the operand doesn't really factor into that. I 
think the code becomes more confusing when merging the two.



  * Avoid the addition of spilling cost if offmemok is true, as an address
    calculation reload is not spilling.


This part seems to be plausible. I am however unclear how this would fix 
the ICE (if it does - Andreas?) since it only seems to modify cost 
computations. What exactly is preventing the correct sequence of events 
(reloading the address) from triggering without this patch?



Bernd


Re: [PATCH] S/390: Add support for z13 instructions lochi and locghi.

2016-07-04 Thread Andreas Krebbel
On 07/01/2016 04:31 PM, Dominik Vogt wrote:
> The attached patch adds patterns to make use of the z13 LOCHI and
> LOCGHI instructions.
> 
> Tested on s390x biarch and s390, regression tested on s390x.
> 
> Ciao
> 
> Dominik ^_^  ^_^
> 

Looks good, thanks!

Could you try merging the two testcases into one by putting the lp64 and
! lp64 as condition on the scan assembler expressions?

Also I don't think it is really necessary to have these multiline matching
checks in such a small test. It should be enough to just make sure that the
expected mnemonic occurs somewhere. Sure this wouldn't catch cases where
e.g. the mnemonics are in the asm file but not in the right function but
I think the risk should be really low here.

Bye,

-Andreas-



Re: Fix fir PR71696 in Libiberty Demangler (6)

2016-07-04 Thread Bernd Schmidt

On 06/30/2016 08:46 AM, Marcel Böhme wrote:

The attached patch fixes the stack overflow in the demangler due to
cycles in the references of “remembered” mangled types
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71696).

The methods demangle_signature and do_arg in cplus-dem.c allow to
“remember” mangled type names that can later be referenced and will
also be demangled. The method demangle_args demangles those types
following any references. So, if there is a cycle in the referencing
(or in the simplest case a self-reference), the method enters
infinite recursion.

The patch tracks the mangled types that are currently being demangled
in a new variable called work->proctypevec. If a referenced type is
currently being demangled, the demangling is marked as not
successful.
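
The idea of tracking what is currently being processed can be sketched in
isolation.  The following is a toy model, not the cplus-dem.c code: the
RememberedType struct and the resolve functions are hypothetical, and a
std::vector<bool> stands in for work->proctypevec.  A type that is reached
again while it is still in progress makes the resolution fail instead of
recursing forever.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// A remembered type is either a plain name or a reference to another
// remembered type by index.  Toy model only.
struct RememberedType {
  bool is_reference;
  std::string name;    // used when !is_reference
  std::size_t target;  // used when is_reference
};

// Resolve type `index`, marking it as in progress for the duration of the
// call.  Returns std::nullopt if the index is out of range or the
// reference chain loops back to a type that is still being resolved.
std::optional<std::string> resolve(const std::vector<RememberedType>& types,
                                   std::size_t index,
                                   std::vector<bool>& in_progress) {
  if (index >= types.size() || in_progress[index])
    return std::nullopt;
  const RememberedType& t = types[index];
  if (!t.is_reference)
    return t.name;
  in_progress[index] = true;
  std::optional<std::string> result = resolve(types, t.target, in_progress);
  in_progress[index] = false;
  return result;
}

// Convenience overload that sets up the in-progress markers.
std::optional<std::string> resolve(const std::vector<RememberedType>& types,
                                   std::size_t index) {
  std::vector<bool> in_progress(types.size(), false);
  return resolve(types, index, in_progress);
}
```

A self-reference or a two-element cycle yields an empty result here, which
corresponds to the demangling being marked as not successful.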


I'll defer reviewing these to someone who understands demangling better. 
You might want to Cc Jason.



Bernd




Re: [v3 PATCH] PR libstdc++/71313

2016-07-04 Thread Jonathan Wakely

On 04/07/16 15:31 +0300, Ville Voutilainen wrote:

On 4 July 2016 at 15:27, Ville Voutilainen  wrote:

Tested on Linux-x64.

2016-07-04  Ville Voutilainen  

PR libstdc++/71313
* src/filesystem/ops.cc (remove_all(const path&, error_code&)):
Call remove_all for children of a directory.
* testsuite/experimental/filesystem/operations/create_directories.cc:
Adjust.


Minor tidy-up, use std::uintmax_t in the test instead of int.


OK for trunk + gcc-6-branch + gcc-5-branch, thanks.



Re: [v3 PATCH] PR libstdc++/71313

2016-07-04 Thread Ville Voutilainen
On 4 July 2016 at 15:27, Ville Voutilainen  wrote:
> Tested on Linux-x64.
>
> 2016-07-04  Ville Voutilainen  
>
> PR libstdc++/71313
> * src/filesystem/ops.cc (remove_all(const path&, error_code&)):
> Call remove_all for children of a directory.
> * testsuite/experimental/filesystem/operations/create_directories.cc:
> Adjust.

Minor tidy-up, use std::uintmax_t in the test instead of int.
diff --git a/libstdc++-v3/src/filesystem/ops.cc 
b/libstdc++-v3/src/filesystem/ops.cc
index 67ed8e6..9fb5b639 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -1194,7 +1194,7 @@ fs::remove_all(const path& p, error_code& ec) noexcept
   uintmax_t count = 0;
   if (ec.value() == 0 && fs.type() == file_type::directory)
 for (directory_iterator d(p, ec), end; ec.value() == 0 && d != end; ++d)
-  count += fs::remove(d->path(), ec);
+  count += fs::remove_all(d->path(), ec);
   if (ec.value())
 return -1;
   return fs::remove(p, ec) ? ++count : -1;  // fs:remove() calls ec.clear()
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
index 4be41a6..a52efe4 100644
--- 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
+++ 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
@@ -65,7 +65,8 @@ test01()
   VERIFY( b );
   VERIFY( is_directory(p/"./d4/../d5") );
 
-  remove_all(p, ec);
+  std::uintmax_t count = remove_all(p, ec);
+  VERIFY( count == 6 );
 }
 
 int
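
As an illustration of the behaviour the adjusted test checks, here is a
small stand-alone sketch.  It assumes a C++17 toolchain and uses
std::filesystem rather than the std::experimental::filesystem code touched
by the patch; the function name and the directory names are made up.  It
builds a nested tree and returns the count reported by remove_all, which
should include the directories below the first level.

```cpp
#include <chrono>
#include <cstdint>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Build base/d1/d2/d3 and base/d4 under the system temp directory, then
// remove the whole tree and return the count reported by remove_all.
std::uintmax_t remove_nested_tree() {
  // Use a clock reading to make the directory name unlikely to collide.
  const auto stamp =
      std::chrono::steady_clock::now().time_since_epoch().count();
  const fs::path base =
      fs::temp_directory_path() / ("remove_all_demo_" + std::to_string(stamp));
  fs::create_directories(base / "d1" / "d2" / "d3");
  fs::create_directories(base / "d4");
  // Five directories exist at this point: base, d1, d2, d3 and d4.
  return fs::remove_all(base);
}
```

Before the fix, the nested directories were handed to remove() instead of
remove_all(), so a non-empty child such as d1 could not be deleted.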


[v3 PATCH] PR libstdc++/71313

2016-07-04 Thread Ville Voutilainen
Tested on Linux-x64.

2016-07-04  Ville Voutilainen  

PR libstdc++/71313
* src/filesystem/ops.cc (remove_all(const path&, error_code&)):
Call remove_all for children of a directory.
* testsuite/experimental/filesystem/operations/create_directories.cc:
Adjust.
diff --git a/libstdc++-v3/src/filesystem/ops.cc 
b/libstdc++-v3/src/filesystem/ops.cc
index 67ed8e6..9fb5b639 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -1194,7 +1194,7 @@ fs::remove_all(const path& p, error_code& ec) noexcept
   uintmax_t count = 0;
   if (ec.value() == 0 && fs.type() == file_type::directory)
 for (directory_iterator d(p, ec), end; ec.value() == 0 && d != end; ++d)
-  count += fs::remove(d->path(), ec);
+  count += fs::remove_all(d->path(), ec);
   if (ec.value())
 return -1;
   return fs::remove(p, ec) ? ++count : -1;  // fs:remove() calls ec.clear()
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
index 4be41a6..7091ab4 100644
--- 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
+++ 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/create_directories.cc
@@ -65,7 +65,8 @@ test01()
   VERIFY( b );
   VERIFY( is_directory(p/"./d4/../d5") );
 
-  remove_all(p, ec);
+  int count = remove_all(p, ec);
+  VERIFY( count == 6 );
 }
 
 int
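
For readers who want to see the counting behaviour in isolation, here is a
minimal sketch using the C++17 std::filesystem API rather than the
experimental TS code touched by the patch. The helper name remove_tree is
hypothetical and the children are collected into a vector before removal,
which is a choice of this sketch and not what ops.cc does. The point is only
that recursing into children makes nested directories contribute to the count.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not the libstdc++ implementation): remove p and
// everything below it, returning the number of entries removed, or
// uintmax_t(-1) on error.
std::uintmax_t remove_tree(const fs::path& p, std::error_code& ec)
{
  const auto failed = static_cast<std::uintmax_t>(-1);
  std::uintmax_t count = 0;

  if (fs::is_directory(fs::symlink_status(p, ec)))
    {
      // Snapshot the children first so the directory is not modified
      // while it is being iterated.
      std::vector<fs::path> children;
      for (fs::directory_iterator it(p, ec), end; !ec && it != end;
           it.increment(ec))
        children.push_back(it->path());

      for (const fs::path& child : children)
        {
          if (ec)
            break;
          // Recurse, instead of calling fs::remove on the child, so that
          // non-empty subdirectories are emptied and counted as well.
          const std::uintmax_t n = remove_tree(child, ec);
          if (!ec)
            count += n;
        }
    }

  if (ec)
    return failed;
  return fs::remove(p, ec) ? count + 1 : failed;
}
```

With a root directory containing a/b/c and d (all directories), this sketch
would report five removed entries: the four subdirectories plus the root.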


Re: [PATCH v2] Allocate constant size dynamic stack space in the prologue

2016-07-04 Thread Dominik Vogt
Version 4 with the following change:

 * Rebased on top of the "Minor cleanup to
   allocate_dynamic_stack_space" patch.  The "Drop excess size
   used for run time allocated stack variables." patch needs an
   update because it touches the same code as the patch in this
   message.

Ran the testsuite on s390x biarch, s390 and x86_64.

On Fri, Jun 24, 2016 at 01:30:44PM +0100, Dominik Vogt wrote:
> > The only open question I'm aware of is the
> > stack-usage-2.c test.  I guess foo3() will not generate
> > 
> >   stack usage might be ... bytes
> > 
> > On any target anymore, and using alloca() with a constant size
> > results in "unbounded".  It's unclear to me whether that message
> > is ever generated, and if so, how to trigger it.

This point is still open.  If nobody has more comments Andreas
will commit the (afaik already approved) patch soon and we can
clean up the test case in a follow up patch.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* cfgexpand.c (expand_stack_vars): Implement dynamic stack space
allocation in the prologue.
* explow.c (get_dynamic_stack_base): New function to return an address
expression for the dynamic stack base.
(get_dynamic_stack_size): New function to do the required dynamic stack
space size calculations.
(allocate_dynamic_stack_space): Use new functions.
(align_dynamic_address): Move some code from
allocate_dynamic_stack_space to new function.
* explow.h (get_dynamic_stack_base, get_dynamic_stack_size): Export.
gcc/testsuite/ChangeLog

* gcc.target/s390/warn-dynamicstack-1.c: New test.
* gcc.dg/stack-usage-2.c (foo3): Adapt expected warning.
stack-layout-dynamic-1.c: New test.
>From 83fafd37883e1af3deb2ff13b9fcaefc9d3b7c7e Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 25 Nov 2015 09:31:19 +0100
Subject: [PATCH] Allocate constant size dynamic stack space in the
 prologue ...

... and place it in the virtual stack vars area, if the platform supports it.
On S/390 this saves adjusting the stack pointer twice and forcing the frame
pointer into existence.  It also removes the warning with -mwarn-dynamicstack
that is triggered by cfun->calls_alloca == 1.

This fixes a problem with the Linux kernel which aligns the page structure to
16 bytes at run time using inefficient code and issuing a bogus warning.
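
As a rough illustration of the run-time alignment arithmetic involved (plain
integer code, not the RTL that explow.c emits), the sketch below rounds an
address up to a power-of-two alignment and computes how much padding must be
reserved so that an aligned block always fits. The names align_up and
padded_size are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Round addr up to the next multiple of align.
// Assumption: align is a power of two.
constexpr std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

// Number of bytes to reserve so that a block of `size` bytes aligned to
// `align` fits wherever the (unaligned) base address happens to fall.
constexpr std::uintptr_t padded_size(std::uintptr_t size, std::uintptr_t align)
{
  return size + align - 1;
}
```

Reserving the padded size up front is what lets the space be accounted for at
compile time, while the rounding itself still happens on the actual address.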
---
 gcc/cfgexpand.c|  21 +-
 gcc/explow.c   | 226 ++---
 gcc/explow.h   |   8 +
 gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c  |  14 ++
 gcc/testsuite/gcc.dg/stack-usage-2.c   |   4 +-
 .../gcc.target/s390/warn-dynamicstack-1.c  |  17 ++
 6 files changed, 209 insertions(+), 81 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/stack-layout-dynamic-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/warn-dynamicstack-1.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 56ef71d..f0ef80f 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1052,7 +1052,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
   size_t si, i, j, n = stack_vars_num;
   HOST_WIDE_INT large_size = 0, large_alloc = 0;
   rtx large_base = NULL;
+  rtx large_allocsize = NULL;
   unsigned large_align = 0;
+  bool large_allocation_done = false;
   tree decl;
 
   /* Determine if there are any variables requiring "large" alignment.
@@ -1099,8 +1101,10 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 
   /* If there were any, allocate space.  */
   if (large_size > 0)
-	large_base = allocate_dynamic_stack_space (GEN_INT (large_size), 0,
-		   large_align, true);
+	{
+	  large_allocsize = GEN_INT (large_size);
+	  get_dynamic_stack_size (&large_allocsize, 0, large_align, NULL);
+	}
 }
 
   for (si = 0; si < n; ++si)
@@ -1186,6 +1190,19 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
 	  /* Large alignment is only processed in the last pass.  */
 	  if (pred)
 	continue;
+
+	  if (large_allocsize && ! large_allocation_done)
+	{
+	  /* Allocate space the virtual stack vars area in the
+	 prologue.  */
+	  HOST_WIDE_INT loffset;
+
+	  loffset = alloc_stack_frame_space
+		(INTVAL (large_allocsize),
+		 PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
+	  large_base = get_dynamic_stack_base (loffset, large_align);
+	  large_allocation_done = true;
+	}
 	  gcc_assert (large_base != NULL);
 
 	  large_alloc += alignb - 1;
diff --git a/gcc/explow.c b/gcc/explow.c
index 09a0330..d505e98 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1146,82 +1146,55 @@ record_new_stack_level (void)
 update_sjlj_context ();
 }
 
-/* Return an rtx representing the address of an area of memory dynamically
-   pushed on the stack.
+/* Return an rtx doing runtime alignment to REQUIRED_ALIGN on 

Re: [Ping, Patch, Fortran, 71623, v1][5/6/7 Regression] Segfault when allocating deferred-length characters to size of a pointer

2016-07-04 Thread Andre Vehreschild
Ping!

Additionally, I found the time to check the patch for gcc-6 and -5. It
applies cleanly to both and has no regressions. Ok for trunk and with
one week delay for gcc-6 and -5?

- Andre

On Wed, 29 Jun 2016 18:43:27 +0200
Andre Vehreschild  wrote:

> Hi all,
> 
> the attached patch fixes the regression at least for trunk (I haven't
> checked the others, but for 6 it should do either, 5 may need a little
> bit more work). The issue here is that computing the typespec involved
> code in se.pre that was not merged to the parent block.
> 
> Bootstrapped and regtested on x86_64-linux-gnu/F23? Ok for trunk?
> 
> I promise to do a backport to gcc-6 and -5 next week, once this patch
> has been living in trunk a few days.
> 
> Regards,
>   Andre


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 84bf749..5aa7778 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -5696,9 +5696,11 @@ gfc_trans_allocate (gfc_code * code)
 	  tmp = gfc_get_char_type (code->ext.alloc.ts.kind);
 	  tmp = TYPE_SIZE_UNIT (tmp);
 	  tmp = fold_convert (TREE_TYPE (se_sz.expr), tmp);
+	  gfc_add_block_to_block (&block, &se_sz.pre);
 	  expr3_esize = fold_build2_loc (input_location, MULT_EXPR,
 	 TREE_TYPE (se_sz.expr),
 	 tmp, se_sz.expr);
+	  expr3_esize = gfc_evaluate_now (expr3_esize, &block);
 	}
 }
 
@@ -5897,6 +5899,7 @@ gfc_trans_allocate (gfc_code * code)
 		 source= or mold= expression.  */
 	  gfc_init_se (&se_sz, NULL);
 	  gfc_conv_expr (&se_sz, code->ext.alloc.ts.u.cl->length);
+	  gfc_add_block_to_block (&block, &se_sz.pre);
 	  gfc_add_modify (&block, al_len,
 			  fold_convert (TREE_TYPE (al_len),
 	se_sz.expr));
@@ -5981,11 +5984,19 @@ gfc_trans_allocate (gfc_code * code)
 		 specified by a type spec for deferred length character
 		 arrays or unlimited polymorphic objects without a
 		 source= or mold= expression.  */
-	  gfc_init_se (&se_sz, NULL);
-	  gfc_conv_expr (&se_sz, code->ext.alloc.ts.u.cl->length);
-	  gfc_add_modify (&block, al_len,
-			  fold_convert (TREE_TYPE (al_len),
-	se_sz.expr));
+	  if (expr3_esize == NULL_TREE || code->ext.alloc.ts.kind != 1)
+		{
+		  gfc_init_se (&se_sz, NULL);
+		  gfc_conv_expr (&se_sz, code->ext.alloc.ts.u.cl->length);
+		  gfc_add_block_to_block (&block, &se_sz.pre);
+		  gfc_add_modify (&block, al_len,
+  fold_convert (TREE_TYPE (al_len),
+		se_sz.expr));
+		}
+	  else
+		gfc_add_modify (&block, al_len,
+fold_convert (TREE_TYPE (al_len),
+	  expr3_esize));
 	}
 	  else
 	/* No length information needed, because type to allocate
diff --git a/gcc/testsuite/gfortran.dg/deferred_character_17.f90 b/gcc/testsuite/gfortran.dg/deferred_character_17.f90
new file mode 100644
index 000..5a9d725
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/deferred_character_17.f90
@@ -0,0 +1,13 @@
+!{ dg-do run }
+
+! Check fix for PR fortran/71623
+
+program allocatemvce
+  implicit none
+  character(len=:), allocatable :: string
+  integer, dimension(4), target :: array = [1,2,3,4]
+  integer, dimension(:), pointer :: array_ptr
+  array_ptr => array
+  ! The allocate used to segfault
+  allocate(character(len=size(array_ptr))::string)
+end program allocatemvce
gcc/fortran/ChangeLog:

2016-06-28  Andre Vehreschild  

PR fortran/71623
* trans-stmt.c (gfc_trans_allocate): Add code of pre block of typespec
in allocate to parent block.

gcc/testsuite/ChangeLog:

2016-06-28  Andre Vehreschild  

PR fortran/71623
* gfortran.dg/deferred_character_17.f90: New test.




[PATCH] Add code-hoisting to GIMPLE

2016-07-04 Thread Richard Biener

The following patch is Steven's code-hoisting based on PRE forward-ported
and fixed for bootstrap plus the case of hoisting code across loops
which we generally do not want (expressions in the loop exit target block
are antic-in throughout the whole loop unless they are killed and thus
get inserted into the exit block and then PREd before the loop).
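
To make the transformation concrete, here is a hand-written source-level
sketch of what code hoisting amounts to. The function names are hypothetical
and this makes no claim about what the pass actually does with this exact
input. It only shows that an expression evaluated on every path can be
computed once, earlier, without changing the result.

```cpp
#include <cassert>

// Before: (a + b) is evaluated separately on both paths to the exit.
int before_hoisting(int a, int b, bool flag)
{
  int r;
  if (flag)
    r = (a + b) * 2;
  else
    r = (a + b) - 1;
  return r;
}

// After: the common subexpression is evaluated once, ahead of the branch.
// Only one copy of the addition remains, which is why the transformation
// is attractive for code size.
int after_hoisting(int a, int b, bool flag)
{
  const int sum = a + b;
  return flag ? sum * 2 : sum - 1;
}
```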

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I'm going to try making the bitmap_set ops in do_hoist_insert a bit
faster - Steven, do you remember any issues with the approach from the
time you worked on it?

Thanks,
Richard.

2016-07-04  Steven Bosscher  

PR tree-optimization/23286
PR tree-optimization/70159
* doc/invoke.texi: Document -ftree-hoist.
* common.opt (ftree-hoist): New flag.
* opts.c (default_options_table): Enable -ftree-hoist at -O2+.
* tree-ssa-pre.c (pre_stats): Add hoist_insert.
(do_regular_insertion): Rename to ...
(do_pre_regular_insertion): ... this and amend general comments
on insertion strategy.
(do_partial_partial_insertion): Rename to ...
(do_pre_partial_partial_insertion): ... this.
(do_hoist_insertion): New function.
(insert_aux): Take flags on whether to do PRE and/or hoist insertion
and call do_hoist_insertion properly.
(insert): Adjust.
(pass_pre::gate): Enable also if -ftree-hoist is enabled.
(pass_pre::execute): Register hoist_insert stats.

* gcc.dg/tree-ssa/ssa-pre-11.c: Disable code hoisting.
* gcc.dg/tree-ssa/ssa-pre-27.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-28.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-2.c: Likewise.
* gcc.dg/tree-ssa/pr35286.c: Likewise.
* gcc.dg/tree-ssa/pr35287.c: Likewise.
* gcc.dg/tree-ssa/loadpre3.c: Adjust so hoisting doesn't apply.
* gcc.dg/tree-ssa/pr43491.c: Scan optimized dump for desired result.
* gcc.dg/tree-ssa/ssa-hoist-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-2.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-3.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-4.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-5.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-6.c: New testcase.

Index: gcc/doc/invoke.texi
===
*** gcc/doc/invoke.texi.orig2016-07-04 11:31:18.171727125 +0200
--- gcc/doc/invoke.texi 2016-07-04 11:31:33.391904573 +0200
*** Objective-C and Objective-C++ Dialects}.
*** 404,410 
  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
  -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
! -ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
  -ftree-loop-if-convert-stores -ftree-loop-im @gol
  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
--- 404,410 
  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
  -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
  -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
! -ftree-dse -ftree-forwprop -ftree-fre -ftree-hoist -ftree-loop-if-convert @gol
  -ftree-loop-if-convert-stores -ftree-loop-im @gol
  -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
  -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
*** also turns on the following optimization
*** 6372,6377 
--- 6372,6378 
  -fstrict-aliasing -fstrict-overflow @gol
  -ftree-builtin-call-dce @gol
  -ftree-switch-conversion -ftree-tail-merge @gol
+ -ftree-hoist @gol
  -ftree-pre @gol
  -ftree-vrp @gol
  -fipa-ra}
*** and the @option{large-stack-frame-growth
*** 7265,7270 
--- 7266,7279 
  Perform reassociation on trees.  This flag is enabled by default
  at @option{-O} and higher.
  
+ @item -ftree-hoist
+ @opindex ftree-hoist
+ Perform code hoisting on trees.  Code hoisting tries to move the
+ evaluation of expressions executed on all paths to the function exit
+ as early as possible.  This is especially useful as a code size
+ optimization, but it often helps for code speed as well.
+ This flag is enabled by default at @option{-O2} and higher.
+ 
  @item -ftree-pre
  @opindex ftree-pre
  Perform partial redundancy elimination (PRE) on trees.  This flag is
*** Dump each function after STORE-CCP@.  Th
*** 1,12229 
  
  @item pre
  @opindex fdump-tree-pre
! Dump trees after partial redundancy elimination.  The file name is made
! by appending @file{.pre} to the source file name.
  
  @item fre
  @opindex fdump-tree-fre
--- 12231,12238 
  
  @item pre
  @opindex fdump-tree-pre
! Dump trees after partial redundancy elimination and/or code hoisting.
! The file name is 

Re: [PATCH][RTL ifcvt] PR rtl-optimization/71594: ICE in noce_emit_cmove due to mismatched source modes

2016-07-04 Thread Bernd Schmidt

On 07/04/2016 01:18 PM, Kyrill Tkachov wrote:

> That does seem like it could cause trouble but I couldn't think of how
> that sequence could appear or what its semantics would be. Would
> assigning to the SImode reg 0 in your example not touch the upper bits
> of the DImode value?

No, multi-word subreg accesses are per-word.

> In any case, bb_ok_for_noce_convert_multiple_sets doesn't keep track of
> dependencies between the instructions so I think the best place to
> handle this case would be in noce_convert_multiple_sets where instead of
> the assert in this patch we'd just end_sequence () and return FALSE.
> Would that be preferable?

That should at least work, and I'd be ok with that.


Bernd


Re: [PATCH][RTL ifcvt] PR rtl-optimization/71594: ICE in noce_emit_cmove due to mismatched source modes

2016-07-04 Thread Kyrill Tkachov

Hi Bernd,

On 04/07/16 12:08, Bernd Schmidt wrote:



On 07/04/2016 12:28 PM, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01731.html

Thanks,
Kyrill

On 24/06/16 09:32, Kyrill Tkachov wrote:

Hi all,

In this PR we get an ICE when trying to emit a conditional move
through noce_convert_multiple_sets.
The comment in the patch explains the situation but we get a
two-instruction sequence like:
(insn 20 19 21 3 (set (reg:SI 89 [ _5 ])
(reg:SI 88 [ _4 ])) wice.c:8 82 {*movsi_internal}
 (nil))
(insn 21 20 25 3 (set (reg:HI 90 [ a_lsm.10 ])
(subreg:HI (reg:SI 89 [ _5 ]) 0)) wice.c:8 84 {*movhi_internal}
 (nil))



+
+  /* We allow simple lowpart register subreg SET sources in
+ bb_ok_for_noce_convert_multiple_sets.  Be careful when processing
+ sequences like:
+ (set (reg:SI r1) (reg:SI r2))
+ (set (reg:HI r3) (subreg:HI (r1)))
+ For the second insn new_val or old_val (r1 in this example) will be
+ taken from the temporaries and have the wider mode which will not
+ match with the mode of the other source of the conditional move, so
+ we'll end up trying to emit r4:HI = cond ? (r1:SI) : (r3:HI).
+ Wrap the two cmove operands into subregs if appropriate to prevent
+ that.  */
+  if (GET_MODE (new_val) != GET_MODE (temp))
+{
+  machine_mode src_mode = GET_MODE (new_val);
+  machine_mode dst_mode = GET_MODE (temp);
+  gcc_assert (GET_MODE_SIZE (src_mode) > GET_MODE_SIZE (dst_mode));
+  new_val = lowpart_subreg (dst_mode, new_val, src_mode);


The question I have would be what happens if you have the inverse of the 
sequence you expect, maybe with multi-word regs?

(set (reg:SI 0) (reg:SI x))
(set (reg:DI y) (reg:DI 0))

That seems like it would fail the assert. Maybe this is something we need to 
catch in the bb_ok function.



That does seem like it could cause trouble but I couldn't think of how that 
sequence could appear or what its
semantics would be. Would assigning to the SImode reg 0 in your example not 
touch the upper bits of the DImode
value?

In any case, bb_ok_for_noce_convert_multiple_sets doesn't keep track of 
dependencies between the instructions
so I think the best place to handle this case would be in 
noce_convert_multiple_sets where instead of the assert
in this patch we'd just end_sequence () and return FALSE.
Would that be preferable?

Thanks for the review,
Kyrill




Bernd





Re: [PATCH][RTL ifcvt] PR rtl-optimization/71594: ICE in noce_emit_cmove due to mismatched source modes

2016-07-04 Thread Bernd Schmidt



On 07/04/2016 12:28 PM, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01731.html

Thanks,
Kyrill

On 24/06/16 09:32, Kyrill Tkachov wrote:

Hi all,

In this PR we get an ICE when trying to emit a conditional move
through noce_convert_multiple_sets.
The comment in the patch explains the situation but we get a
two-instruction sequence like:
(insn 20 19 21 3 (set (reg:SI 89 [ _5 ])
(reg:SI 88 [ _4 ])) wice.c:8 82 {*movsi_internal}
 (nil))
(insn 21 20 25 3 (set (reg:HI 90 [ a_lsm.10 ])
(subreg:HI (reg:SI 89 [ _5 ]) 0)) wice.c:8 84 {*movhi_internal}
 (nil))



+
+  /* We allow simple lowpart register subreg SET sources in
+bb_ok_for_noce_convert_multiple_sets.  Be careful when processing
+sequences like:
+(set (reg:SI r1) (reg:SI r2))
+(set (reg:HI r3) (subreg:HI (r1)))
+For the second insn new_val or old_val (r1 in this example) will be
+taken from the temporaries and have the wider mode which will not
+match with the mode of the other source of the conditional move, so
+we'll end up trying to emit r4:HI = cond ? (r1:SI) : (r3:HI).
+Wrap the two cmove operands into subregs if appropriate to prevent
+that.  */
+  if (GET_MODE (new_val) != GET_MODE (temp))
+   {
+ machine_mode src_mode = GET_MODE (new_val);
+ machine_mode dst_mode = GET_MODE (temp);
+ gcc_assert (GET_MODE_SIZE (src_mode) > GET_MODE_SIZE (dst_mode));
+ new_val = lowpart_subreg (dst_mode, new_val, src_mode);


The question I have would be what happens if you have the inverse of the 
sequence you expect, maybe with multi-word regs?


(set (reg:SI 0) (reg:SI x))
(set (reg:DI y) (reg:DI 0))

That seems like it would fail the assert. Maybe this is something we 
need to catch in the bb_ok function.



Bernd


Re: [PATCH 2/3] Add support for arm*-*-phoenix* targets.

2016-07-04 Thread Jakub Sejdak
Ping. If this is OK for both branches (or at least one), would you
prefer a separate patch?

2016-06-23 9:37 GMT+02:00 Jakub Sejdak :
> How about backporting this to gcc-6 and gcc-5?
>
> 2016-06-21 22:10 GMT+02:00 Jeff Law :
>> On 06/15/2016 08:22 AM, Kuba Sejdak wrote:
>>>
>>> Is it ok for trunk? If possible, please merge it also to
>>> GCC-6 and GCC-5 branches.
>>>
>>> 2016-06-15  Jakub Sejdak  
>>>
>>>* config.gcc: Add support for arm*-*-phoenix* targets.
>>>* config/arm/t-phoenix: New.
>>>* config/phoenix.h: New.
>>>
>>> ---
>>>  gcc/ChangeLog|  6 ++
>>>  gcc/config.gcc   | 11 +++
>>>  gcc/config/arm/t-phoenix | 29 +
>>>  gcc/config/phoenix.h | 33 +
>>>  4 files changed, 79 insertions(+)
>>>  create mode 100644 gcc/config/arm/t-phoenix
>>>  create mode 100644 gcc/config/phoenix.h
>>>
>>
>>> +arm*-*-phoenix*)
>>> +   tm_file="dbxelf.h elfos.h arm/unknown-elf.h arm/elf.h arm/bpabi.h"
>>> +   tm_file="${tm_file} newlib-stdint.h phoenix.h"
>>> +   tm_file="${tm_file} arm/aout.h arm/arm.h"
>>> +   tmake_file="${tmake_file} arm/t-arm arm/t-bpabi arm/t-phoenix"
>>
>> Do you really need dbxelf.h?  We're trying to get away from stabs, so unless
>> there's a strong need, avoid dbxelf.h :-)
>>
>> OK for the trunk with dbxelf.h removed.
>>
>> jeff
>
>
>
> --
> Jakub Sejdak
> Software Engineer
> Phoenix Systems (www.phoesys.com)
> +48 608 050 163



-- 
Jakub Sejdak
Software Engineer
Phoenix Systems (www.phoesys.com)
+48 608 050 163


Re: [PATCH 3/3] Add support for arm*-*-phoenix* targets in libgcc.

2016-07-04 Thread Jakub Sejdak
Ping. If this is OK for both branches (or at least one), would you
prefer a separate patch?

2016-06-23 9:37 GMT+02:00 Jakub Sejdak :
> How about backporting this to gcc-6 and gcc-5?
>
> 2016-06-21 22:11 GMT+02:00 Jeff Law :
>> On 06/15/2016 08:22 AM, Kuba Sejdak wrote:
>>>
>>> Is it ok for trunk? If possible, please merge it also to
>>> GCC-6 and GCC-5 branches.
>>>
>>> 2016-06-15  Jakub Sejdak  
>>>
>>>* config.host: Add support for arm*-*-phoenix* targets.
>>
>> OK for the trunk.
>>
>> jeff
>>
>
>
>
> --
> Jakub Sejdak
> Software Engineer
> Phoenix Systems (www.phoesys.com)
> +48 608 050 163



-- 
Jakub Sejdak
Software Engineer
Phoenix Systems (www.phoesys.com)
+48 608 050 163


Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-07-04 Thread Richard Biener
On Fri, 1 Jul 2016, Martin Sebor wrote:

> The attached patch enhances compile-time checking for buffer overflow
> and output truncation in non-trivial calls to the sprintf family of
> functions under a new option -Wformat-length=[12].  This initial
> patch handles printf directives with string, integer, and simple
> floating arguments but eventually I'd like to extend it to all other
> functions and directives for which it makes sense.
> 
> I made some choices in the implementation that resulted in trade-offs
> in the quality of the diagnostics.  I would be grateful for comments
> and suggestions how to improve them.  Besides the list I include
> Jakub who already gave me some feedback (thanks), Joseph who as
> I understand has deep knowledge of the c-format.c code, and Richard
> for his input on the LTO concern below.
> 
> 1) Making use of -Wformat machinery in c-family/c-format.c.  This
>seemed preferable to duplicating some of the same code elsewhere
>(I initially started implementing it in expand_builtin in
>builtins.c).  It makes the implementation readily extensible
>to all the same formats as those already handled for -Wformat.
>One drawback is that unlike in expand_builtin, calls to these
>functions cannot readily be folded.  Another drawback pointed

folded?  You mean this -W option changes code generation?

>out by Jakub is that since the code is only available in the
>C and C++ compilers, it apparently may not be available with
>an LTO compiler (I don't completely understand this problem
>but I mention it in the interest of full disclosure). In light
>of the dependency in (2) below, I don't see a way to avoid it
>(moving c-format.c to the middle end was suggested but seemed
>like too much of a change to me).

Yes, lto1 is not linked with C_COMMON_OBJS (that could be changed
of course at the expense of dragging in some dead code).  Moving
all the format stuff to the middle-end (or separated better so
the overhead in lto1 is lower) would be possible as well.

That said, a langhook as you add it highlights the issue with LTO.

Richard.

> 2) Optimization.
>In keeping with the other -Wformat options, the checking is
>enabled without optimization.  Especially at level 2, the
>warnings can be useful even without it.  But to make buffer
>sizes and non-constant argument values available in calls to
>functions like sprintf (via __builtin_object_size) better
>results are obtained with optimization.
> 
> 3) Truncation warnings.
>Although calls to bounded functions like snprintf aren't subject
>to buffer overflow, they can be subject to accidental truncation
>when the destination buffer isn't sized appropriately.  With the
>patch, such calls are diagnosed under the same option, but I
>wonder if having a separate warning option for them might be
>preferable (e.g., -Wformat-trunc=[01] or something like that).
>Independently, it might be useful to differentiate between
>truncating calls that check the return value and those that
>don't.
> 
> Besides the usual testing I compiled several packages with the
> warning.  It found a few bugs in boundary cases in Binutils that
> are being fixed.
> 
> Thanks
> Martin
> 
> PS There are a few FIXME notes in the patch that I will either
> fix or remove, depending on feedback, before committing the
> patch.
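
For readers unfamiliar with the truncation case in (3), the following is a
small sketch of the run-time check that such a compile-time warning would
complement. It relies only on the standard snprintf contract (the return
value is the length the full output would have had). The helper name
format_int_checked is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Format value into buf and report whether the whole result fitted.
// snprintf never writes more than size bytes, but it returns the length
// the complete output would have needed, so a return value >= size means
// the output was truncated.
bool format_int_checked(char* buf, std::size_t size, int value)
{
  const int n = std::snprintf(buf, size, "%d", value);
  return n >= 0 && static_cast<std::size_t>(n) < size;
}
```

A caller that ignores the return value gets a silently shortened string,
which is the situation the proposed diagnostics aim to flag.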
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [ARM][testsuite] neon-testgen.ml removal

2016-07-04 Thread Christophe Lyon
ping ^2 ?
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01615.html

On 27 June 2016 at 12:54, Christophe Lyon  wrote:
> ping?
>
> On 22 June 2016 at 17:52, Christophe Lyon  wrote:
>> Hi,
>>
>> This is a new attempt at removing neon-testgen.ml and generated files.
>>
>> Compared to my previous version several months ago:
>> - I have recently added testcases to make sure we do not lose coverage
>> as described in
>> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02922.html
>> - I now also remove neon.ml as requested by Kyrylo in
>> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01664.html, and moved
>> the remaining hand-written tests up to gcc.target/arm.
>>
>> Doing this, I had to slightly update vst1Q_laneu64-1.c because it's
>> now compiled with more pedantic flags and there was a signed/unsigned
>> char buffer pointer mismatch.
>>
>> Sorry, I had to compress the patch, otherwise it's too large and rejected
>> by the list server.
>>
>> OK?
>>
>> Christophe


Re: [PATCH] Allow fwprop to undo vectorization harm (PR68961)

2016-07-04 Thread Richard Biener
On Sun, 3 Jul 2016, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Wed, 15 Jun 2016, Richard Sandiford wrote:
> >
> >> Richard Biener  writes:
> >> > With the proposed cost change for vector construction we will end up
> >> > vectorizing the testcase in PR68961 again (on x86_64 and likely
> >> > on ppc64le as well after that target gets adjustments).  Currently
> >> > we can't optimize that away again noticing the direct overlap of
> >> > argument and return registers.  The obstacle is
> >> >
> >> > (insn 7 4 8 2 (set (reg:V2DF 93)
> >> > (vec_concat:V2DF (reg/v:DF 91 [ a ])
> >> > (reg/v:DF 92 [ aa ]))) 
> >> > ...
> >> > (insn 21 8 24 2 (set (reg:DI 97 [ D.1756 ])
> >> > (subreg:DI (reg:TI 88 [ D.1756 ]) 0))
> >> > (insn 24 21 11 2 (set (reg:DI 100 [+8 ])
> >> > (subreg:DI (reg:TI 88 [ D.1756 ]) 8))
> >> >
> >> > which we eventually optimize to DFmode subregs of (reg:V2DF 93).
> >> >
> >> > First of all simplify_subreg doesn't handle the subregs of a vec_concat
> >> > (easy fix below).
> >> >
> >> > Then combine doesn't like to simplify the multi-use (it tries some
> >> > parallel it seems).  So I went to forwprop which eventually manages
> >> > to do this but throws away the result (reg:DF 91) or (reg:DF 92)
> >> > because it is not a constant.  Thus I allow arbitrary simplification
> >> > results for SUBREGs of [VEC_]CONCAT operations.  There doesn't seem
> >> > to be a magic flag to tell it to restrict to the case where all
> >> > uses can be simplified or so, nor to restrict simplifications to a REG.
> >> > But I don't see any undesirable simplifications of (subreg 
> >> > ([vec_]concat)).
> >> 
> >> Adding that as a special case to propagate_rtx feels like a hack though :-)
> >> I think:
> >> 
> >> > Index: gcc/fwprop.c
> >> > ===
> >> > *** gcc/fwprop.c (revision 237286)
> >> > --- gcc/fwprop.c (working copy)
> >> > *** propagate_rtx (rtx x, machine_mode mode,
> >> > *** 664,670 
> >> > || (GET_CODE (new_rtx) == SUBREG
> >> >&& REG_P (SUBREG_REG (new_rtx))
> >> >&& (GET_MODE_SIZE (mode)
> >> > !  <= GET_MODE_SIZE (GET_MODE (SUBREG_REG (new_rtx))
> >> >   flags |= PR_CAN_APPEAR;
> >> > if (!varying_mem_p (new_rtx))
> >> >   flags |= PR_HANDLE_MEM;
> >> > --- 664,673 
> >> > || (GET_CODE (new_rtx) == SUBREG
> >> >&& REG_P (SUBREG_REG (new_rtx))
> >> >&& (GET_MODE_SIZE (mode)
> >> > !  <= GET_MODE_SIZE (GET_MODE (SUBREG_REG (new_rtx)
> >> > !   || ((GET_CODE (new_rtx) == VEC_CONCAT
> >> > !   || GET_CODE (new_rtx) == CONCAT)
> >> > !  && GET_CODE (x) == SUBREG))
> >> >   flags |= PR_CAN_APPEAR;
> >> > if (!varying_mem_p (new_rtx))
> >> >   flags |= PR_HANDLE_MEM;
> >> 
> >> ...this if statement should fundamentally only test new_rtx.
> >> E.g. we'd want the same thing for any SUBREG inside X.
> >> 
> >> How about changing:
> >> 
> >>   /* The replacement we made so far is valid, if all of the recursive
> >>  replacements were valid, or we could simplify everything to
> >>  a constant.  */
> >>   return valid_ops || can_appear || CONSTANT_P (tem);
> >> 
> >> so that (REG_P (tem) && !HARD_REGISTER_P (tem)) is also valid?
> >> I suppose that's likely to increase register pressure though,
> >> if only some uses of new_rtx simplify.  (There again, requiring all
> >> uses to be replaceable could make hot code the hostage of cold code.)
> >
> > Yes, my fear was about register pressure increase for the case not all
> > uses can be replaced (fwprop doesn't seem to have code to verify or
> > require that).
> >
> > I can avoid checking for GET_CODE (x) == SUBREG and add a PR_REG
> > case to restrict REG_P (tem) && !HARD_REGISTER_P (tem) to the
> > new_rtx == [VEC_]CONCAT case for example.
> 
> I don't think that helps though.  There might be other uses of a
> VEC_CONCAT that aren't SUBREGs, in which case we'd have the same
> problem of keeping both values live at once.
> 
> How about restricting the REG_P (tem) && !HARD_REGISTER_P (tem)
> to cases where new_rtx has more words than tem?

So would you really make a simple mode-size check here?  I wonder
which cases are there other than the subreg of [vec_concat] that
would end up with this case.  That is,

  if (REG_P (tem) && !HARD_REGISTER_P (tem)
  && GET_MODE (tem) == GET_MODE_INNER (GET_MODE (new_rtx))
  && (VECTOR_MODE_P (GET_MODE (new_rtx))
  || COMPLEX_MODE_P (GET_MODE (new_rtx))))
return true;

works as would constraining new_rtx to [VEC_]CONCAT instead of
vector/complex modes.  I'm worried about relaxing it further
as partial int modes also have a GET_MODE_INNER - relaxing to

  && GET_MODE_INNER (GET_MODE (new_rtx)) != GET_MODE (new_rtx)

I mean.  If we restrict it to [VEC_]CONCAT we can even allow
any REG_P && 

Re: [PATCH][RTL ifcvt] PR rtl-optimization/71594: ICE in noce_emit_cmove due to mismatched source modes

2016-07-04 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01731.html

Thanks,
Kyrill

On 24/06/16 09:32, Kyrill Tkachov wrote:

Hi all,

In this PR we get an ICE when trying to emit a conditional move through 
noce_convert_multiple_sets.
The comment in the patch explains the situation but we get a two-instruction 
sequence like:
(insn 20 19 21 3 (set (reg:SI 89 [ _5 ])
(reg:SI 88 [ _4 ])) wice.c:8 82 {*movsi_internal}
 (nil))
(insn 21 20 25 3 (set (reg:HI 90 [ a_lsm.10 ])
(subreg:HI (reg:SI 89 [ _5 ]) 0)) wice.c:8 84 {*movhi_internal}
 (nil))

The first instruction feeds the second, but the second takes the lowpart subreg 
of the first destination.
This leads to the noce_emit_cmove call taking as arguments the first SImode 
destination and the second HImode
destination.  This causes an assertion failure deeper down the line.

The solution in this patch is to catch this case and wrap the first destination in 
a lowpart subreg so that the two
operands of the cmove have the same mode.
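
As a loose analogy in ordinary C++ (fixed-width integers standing in for
SImode and HImode, hypothetical function names, nothing taken from ifcvt.c),
the fix corresponds to narrowing the wider operand to its low part before the
two-way select, so that both arms have the same width:

```cpp
#include <cassert>
#include <cstdint>

// Analogue of taking the lowpart subreg of a 32-bit value as 16 bits.
std::uint16_t lowpart_hi(std::uint32_t si_value)
{
  return static_cast<std::uint16_t>(si_value);
}

// Analogue of the conditional move: both arms are 16-bit values, the
// wider candidate having been narrowed first.
std::uint16_t select_hi(bool cond, std::uint32_t new_val_si,
                        std::uint16_t old_val_hi)
{
  return cond ? lowpart_hi(new_val_si) : old_val_hi;
}
```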

Bootstrapped and tested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf and 
x86_64-unknown-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-06-24  Kyrylo Tkachov  

PR rtl-optimization/71594
* ifcvt.c (noce_convert_multiple_sets): Wrap new_val or old_val
into subregs of appropriate mode before trying to emit a conditional
move.

2016-06-24  Kyrylo Tkachov  

PR rtl-optimization/71594
* gcc.dg/torture/pr71594.c: New test.




Re: [PATCH, libgcc/ARM 1a/6, ping2] Fix Thumb-1 only == ARMv6-M & Thumb-2 only == ARMv7-M assumptions

2016-07-04 Thread Thomas Preudhomme
Ping?

Best regards,

Thomas

On Monday 27 June 2016 16:52:50 Thomas Preudhomme wrote:
> Ping?
> 
> Best regards,
> 
> Thomas
> 
> On Friday 17 June 2016 18:21:44 Thomas Preudhomme wrote:
> > On Wednesday 01 June 2016 10:00:52 Ramana Radhakrishnan wrote:
> > > Please fix up the macros, post back and redo the test. Otherwise this
> > > is ok from a quick read.
> > 
> > What about the updated patch in attachment? As for the original patch,
> > I've
> > checked that code generation does not change for a number of combinations
> > of ISAs (ARM/Thumb), optimization levels (Os/O2), and architectures
> > (armv4, armv4t, armv5, armv5t, armv5te, armv6, armv6j, armv6k, armv6s-m,
> > armv6kz, armv6t2, armv6z, armv6zk, armv7, armv7-a, armv7e-m, armv7-m,
> > armv7-r, armv7ve, armv8-a, armv8-a+crc, iwmmxt and iwmmxt2).
> > 
> > Note, I renumbered this patch 1a to not make the numbering of other
> > patches
> > look strange. The CLZ part is now in patch 1b/7.
> > 
> > ChangeLog entries are now as follows:
> > 
> > 
> > *** gcc/ChangeLog ***
> > 
> > 2016-05-23  Thomas Preud'homme  
> > 
> > * config/arm/elf.h: Use __ARM_ARCH_ISA_THUMB and
> > __ARM_ARCH_ISA_ARM
> > 
> > to decide whether to prevent some libgcc routines being included for some
> > multilibs rather than __ARM_ARCH_6M__ and add comment to indicate the link
> > between this condition and the one in
> > 
> > libgcc/config/arm/lib1func.S.
> > 
> > *** gcc/testsuite/ChangeLog ***
> > 
> > 2015-11-10  Thomas Preud'homme  
> > 
> > * lib/target-supports.exp (check_effective_target_arm_cortex_m):
> > Use
> > 
> > __ARM_ARCH_ISA_ARM to test for Cortex-M devices.
> > 
> > 
> > *** libgcc/ChangeLog ***
> > 
> > 2016-06-01  Thomas Preud'homme  
> > 
> > * config/arm/bpabi-v6m.S: Clarify what architectures is the
> > implementation suitable for.
> > * config/arm/lib1funcs.S (__prefer_thumb__): Define among other
> > 
> > cases for all Thumb-1 only targets.
> > 
> > (NOT_ISA_TARGET_32BIT): Define for Thumb-1 only targets.
> > (THUMB_LDIV0): Test for NOT_ISA_TARGET_32BIT rather than
> > __ARM_ARCH_6M__.
> > (EQUIV): Likewise.
> > (ARM_FUNC_ALIAS): Likewise.
> > (umodsi3): Add check to __ARM_ARCH_ISA_THUMB != 1 to guard the
> > idiv
> > version.
> > (modsi3): Likewise.
> > (clzsi2): Test for NOT_ISA_TARGET_32BIT rather than
> > __ARM_ARCH_6M__.
> > 
> > (clzdi2): Likewise.
> > 
> > (ctzsi2): Likewise.
> > (L_interwork_call_via_rX): Test for __ARM_ARCH_ISA_ARM rather than
> > __ARM_ARCH_6M__ in guard for checking whether it is defined.
> > (final includes): Test for NOT_ISA_TARGET_32BIT rather than
> > __ARM_ARCH_6M__ and add comment to indicate the connection between
> > this condition and the one in gcc/config/arm/elf.h.
> > * config/arm/libunwind.S: Test for __ARM_ARCH_ISA_THUMB and
> > __ARM_ARCH_ISA_ARM rather than __ARM_ARCH_6M__.
> > * config/arm/t-softfp: Likewise.
> > 
> > Best regards,
> > 
> > Thomas
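
For readers unfamiliar with the feature macros named in the ChangeLog above, a
minimal sketch of testing for a Thumb-1-only target with the ACLE macros
__ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM might look like the following. The
function name is hypothetical, and the exact conditions used in lib1funcs.S are
those in the patch itself, not this sketch:

```c
/* Return 1 when compiling for a target that provides only Thumb-1
   (Thumb present at level 1, no ARM instruction set), judged by the
   ACLE feature macros; return 0 otherwise, including on non-ARM
   hosts where neither macro is defined.
   thumb1_only_target is a made-up name for illustration.  */
int
thumb1_only_target (void)
{
#if defined (__ARM_ARCH_ISA_THUMB) && __ARM_ARCH_ISA_THUMB == 1 \
    && !defined (__ARM_ARCH_ISA_ARM)
  return 1;
#else
  return 0;
#endif
}
```

Unlike a test on __ARM_ARCH_6M__, this condition names the instruction sets
themselves rather than one architecture that happens to have them.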



Re: [PATCH 2/4] PR c++/62314: add fixit hint for "expected ';' after class definition"

2016-07-04 Thread Bernd Schmidt

On 07/01/2016 07:40 PM, David Malcolm wrote:


A better argument is that as of r237712 we now have
-fdiagnostics-parseable-fixits.  This allows for an IDE to offer to
automatically apply a fix-it hint.  Hence by providing a fix-it here, an
IDE can potentially insert the semicolon itself:



In that light, is the patch OK?


Yeah, that's an argument I can buy.


Bernd


[Ada] Gnatfind crash on references to unknown files

2016-07-04 Thread Arnaud Charlet
This patch fixes an obscure bug in gnatfind that could cause it to
crash on references to unknown files. The crash was caused by
dereferencing an uninitialized pointer value, so it was flaky.
No test is available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-07-04  Bob Duff  

* xref_lib.adb (Parse_X_Filename, Parse_Identifier_Info): Ignore
unknown files. Check that File_Nr is in the range of files we
know about. The previous code was checking the lower bound,
but not the upper bound.
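
The bounds check described above can be sketched in C as follows. This is only
an illustration of checking both ends of the range; the structure and function
names are hypothetical and do not correspond to the Ada code in xref_lib.adb:

```c
#include <stddef.h>

/* Hypothetical stand-in for a dependency table whose valid indices
   run from FIRST to LAST inclusive.  */
struct dep_table
{
  const char **names;
  int first;
  int last;
};

/* Return the entry for FILE_NR, or NULL (playing the role of the
   "empty file") when FILE_NR lies outside FIRST .. LAST.  Checking
   only the lower bound would read past the end of the table for a
   too-large FILE_NR.  */
const char *
lookup_file (const struct dep_table *tab, int file_nr)
{
  if (file_nr < tab->first || file_nr > tab->last)
    return NULL;
  return tab->names[file_nr - tab->first];
}
```

Callers then treat a NULL result as "unknown file" and skip the reference.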

Index: xref_lib.adb
===
--- xref_lib.adb(revision 237957)
+++ xref_lib.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1998-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 1998-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -890,9 +890,13 @@
 
   Parse_Token (Ali, Ptr, E_Name);
 
-  --  Exit if the symbol does not match
-  --  or if we have a local symbol and we do not want it
+  --  Exit if the symbol does not match or if we have a local
+  --  symbol and we do not want it or if the file is unknown.
 
+  if File.X_File = Empty_File then
+ return;
+  end if;
+
   if (not Local_Symbols and not E_Global)
 or else (Pattern.Initialized
   and then not Match (Ali (E_Name .. Ptr - 1), Pattern.Entity))
@@ -1261,8 +1265,12 @@
  Ptr := Ptr + 1;
  Parse_Number (Ali, Ptr, File_Nr);
 
- if File_Nr > 0 then
+ --  If the referenced file is unknown, we simply ignore it
+
+ if File_Nr in Dependencies_Tables.First .. Last (File.Dep) then
 File.X_File := File.Dep.Table (File_Nr);
+ else
+File.X_File := Empty_File;
  end if;
 
  Parse_EOL (Ali, Ptr);


[Ada] Spurious type errors because of views confusion in predicate functions

2016-07-04 Thread Arnaud Charlet
In the context of a predicate function the formal and the actual in a call may
have different views of the same type, because of the delayed analysis of
predicates aspects. This patch extends existing code that handles this
discrepancy, to cover private and full views as well.

Executing the following:

   gnatmake -q main
   main

must yield:

   toto

---
with GPR2.Attribute; use GPR2.Attribute;
procedure Main is
   Q_Name : constant GPR2.Attribute.Qualified_Name :=
  GPR2.Attribute.Create ("toto");
begin
   Dump (Q_Name);
end Main;
---
package GPR2 is
   subtype Name_Type is String
 with Dynamic_Predicate => Name_Type'Length > 0;
end GPR2;
---
with Text_IO; use Text_IO;
package body GPR2.Attribute is

   function Create (Name : Name_Type) return Qualified_Name is
   begin
  return Qualified_Name (Name);
   end;

   procedure Dump (Obj : Qualified_Name) is
   begin
  Put_Line (String (Obj));
   end;
end GPR2.Attribute;
---
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
package GPR2.Attribute is

   type Qualified_Name (<>) is private;

   function Create (Name : Name_Type) return Qualified_Name;
   procedure Dump (Obj : Qualified_Name);
private

   type Qualified_Name is new Name_Type;
end GPR2.Attribute;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-07-04  Ed Schonberg  

* sem_ch4.adb (Resolve_One_Call): In the context of a predicate
function the formal and the actual in a call may have different
views of the same type, because of the delayed analysis of
predicates aspects. Extend the patch that handles this potential
discrepancy to handle private and full views as well.
* sem_ch8.adb (Find_Selected_Component): Refine predicate that
produces additional error when an illegal selected component
looks like a prefixed call whose first formal is untagged.

Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 237957)
+++ sem_ch4.adb (working copy)
@@ -3413,9 +3413,17 @@
--  an incomplete type, while resolution of the corresponding
--  predicate function may see the full view, as a consequence
--  of the delayed resolution of the corresponding expressions.
+   --  This can occur in the body of a predicate function, or in
+   --  a call to such.
 
-   elsif Ekind (Etype (Formal)) = E_Incomplete_Type
- and then Full_View (Etype (Formal)) = Etype (Actual)
+   elsif ((Ekind (Current_Scope) = E_Function
+   and then Is_Predicate_Function (Current_Scope))
+ or else (Ekind (Nam) = E_Function
+   and then Is_Predicate_Function (Nam)))
+  and then
+   (Base_Type (Underlying_Type (Etype (Formal))) =
+ Base_Type (Underlying_Type (Etype (Actual
+  and then Serious_Errors_Detected = 0
then
   Set_Etype (Formal, Etype (Actual));
   Next_Actual (Actual);
Index: sem_ch8.adb
===
--- sem_ch8.adb (revision 237957)
+++ sem_ch8.adb (working copy)
@@ -6983,7 +6983,8 @@
 elsif Nkind (P) /= N_Attribute_Reference then
 
--  This may have been meant as a prefixed call to a primitive
-   --  of an untagged type.
+   --  of an untagged type. If it is a function call check type of
+   --  its first formal and add explanation.
 
declare
   F : constant Entity_Id :=
@@ -6992,8 +6993,7 @@
   if Present (F)
 and then Is_Overloadable (F)
 and then Present (First_Entity (F))
-and then Etype (First_Entity (F)) = Etype (P)
-and then not Is_Tagged_Type (Etype (P))
+and then not Is_Tagged_Type (Etype (First_Entity (F)))
   then
  Error_Msg_N
("prefixed call is only allowed for objects "


[Ada] Use chained locations in GNATprove for inherited pre and post

2016-07-04 Thread Arnaud Charlet
When a class-wide pre- or postcondition is inherited by an overriding
subprogram, the locations of the inherited pragma and of its expression
are the same as the locations of the original pragma. This makes it hard
to distinguish properties proved on the overridden and the overriding
subprograms. This patch changes these locations to use chained locations
in such a case, similarly to what we get on generic instantiations and
inlined subprograms.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-07-04  Yannick Moy  

* sem_ch12.adb, sem_ch12.ads Update calls to
Create_Instantiation_Source to use default argument.
(Adjust_Inherited_Pragma_Sloc): New function to adjust sloc
of inherited pragma.
(Set_Copied_Sloc_For_Inherited_Pragma):
New function that wraps call to Create_Instantiation_Source for
copying an inherited pragma.
(Set_Copied_Sloc_For_Inlined_Body): Update call to
Create_Instantiation_Source with new arguments.
* sem_prag.adb (Build_Pragma_Check_Equivalent): In the case
of inherited pragmas, use the generic machinery to get chained
locations for the pragma and its sub-expressions.
* sinput-c.adb: Adapt to new type Source_File_Record.
* sinput-l.adb, sinput-l.ads (Create_Instantiation_Source):
Add parameter Inherited_Pragma and make parameter Inlined_Body
optional.
* sinput.adb, sinput.ads (Comes_From_Inherited_Pragma): New
function to return when a location comes from an inherited pragma.
(Inherited_Pragma): New function to detect when a location comes
from an inherited pragma.
(Source_File_Record): New component Inherited_Pragma.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 237961)
+++ sem_prag.adb(working copy)
@@ -26395,7 +26395,11 @@
-- Build_Classwide_Expression --

 
-   procedure Build_Classwide_Expression (Prag : Node_Id; Subp : Entity_Id) is
+   procedure Build_Classwide_Expression
+ (Prag: Node_Id;
+  Subp: Entity_Id;
+  Adjust_Sloc : Boolean)
+   is
   function Replace_Entity (N : Node_Id) return Traverse_Result;
   --  Replace reference to formal of inherited operation or to primitive
   --  operation of root type, with corresponding entity for derived type,
@@ -26410,6 +26414,10 @@
  New_E : Entity_Id;
 
   begin
+ if Adjust_Sloc then
+Adjust_Inherited_Pragma_Sloc (N);
+ end if;
+
  if Nkind (N) = N_Identifier
and then Present (Entity (N))
and then
@@ -26576,15 +26584,22 @@
 Next_Formal (Inher_Formal);
 Next_Formal (Subp_Formal);
  end loop;
-  end if;
 
-  --  Copy the original pragma while performing substitutions (if
-  --  applicable).
+ --  Use generic machinery to copy inherited pragma, as if it were an
+ --  instantiation, resetting source locations appropriately, so that
+ --  expressions inside the inherited pragma use chained locations.
+ --  This is used in particular in GNATprove to locate precisely
+ --  messages on a given inherited pragma.
 
-  Check_Prag := New_Copy_Tree (Source => Prag);
+ Set_Copied_Sloc_For_Inherited_Pragma
+   (Unit_Declaration_Node (Subp_Id), Inher_Id);
+ Check_Prag := New_Copy_Tree (Source => Prag);
+ Build_Classwide_Expression (Check_Prag, Subp_Id, Adjust_Sloc => True);
 
-  if Present (Inher_Id) then
- Build_Classwide_Expression (Check_Prag, Subp_Id);
+  --  Otherwise simply copy the original pragma
+
+  else
+ Check_Prag := New_Copy_Tree (Source => Prag);
   end if;
 
   --  Mark the pragma as being internally generated and reset the Analyzed
Index: sem_prag.ads
===
--- sem_prag.ads(revision 237962)
+++ sem_prag.ads(working copy)
@@ -244,16 +244,21 @@
procedure Analyze_Test_Case_In_Decl_Part (N : Node_Id);
--  Perform preanalysis of pragma Test_Case
 
-   procedure Build_Classwide_Expression (Prag : Node_Id; Subp : Entity_Id);
+   procedure Build_Classwide_Expression
+ (Prag: Node_Id;
+  Subp: Entity_Id;
+  Adjust_Sloc : Boolean);
--  Build the expression for an inherited classwide condition. Prag is
--  the pragma constructed from the corresponding aspect of the parent
-   --  subprogram, and Subp is the overridding operation.
-   --  The routine is also called to check whether an inherited operation
-   --  that is not overridden but has inherited conditions need a wrapper,
-   --  because the inherited condition includes calls to other primitives that
-   --  have been overridden. In that case the first argument is the expression
-   --  of 

[Ada] Confusing pragma unreferenced

2016-07-04 Thread Arnaud Charlet
Two pragmas exist, Unmodified and Unreferenced, which issue warnings if the
respective entities are written or read, respectively. Additionally,
pragma Unreferenced will suppress compiler-generated warnings for unread
variables. However, this can lead to confusion about pragma Unreferenced,
whereby its assumed meaning would encompass writing as well as reading;
to achieve this effect both pragmas would have to be used, which is
inefficient. This patch adds a new pragma "Unused" to serve as a combination
of Unmodified and Unreferenced.


-- Source --


--  main.adb

with Ada.Text_IO;

--  Context clause
pragma Unused (Ada.Text_IO);--  Warn Unused
pragma Unmodified (Ada.Text_IO);--  Warn Unmodified
pragma Unreferenced (Ada.Text_IO);  --  Valid

procedure Main is

   --  Improper use
   X, Y, Z : Boolean := False;

   --  Non-variable
   procedure Test is begin null; end;
   pragma Unmodified (Test);--  Warn Unmodified
   pragma Unused (Test);--  Warn Unused
   pragma Unreferenced (Test);  --  Valid

   --  Equivalence of Unused to Unmodified + Unreferenced
   pragma Unmodified (X);   --  Valid
   pragma Unmodified (X);   --  Warn Unmodified
   pragma Unreferenced (X); --  Valid
   pragma Unused (Y);   --  Valid

   --  Duplicate error messages
   pragma Unreferenced (X); --  Warn Unreferenced
   pragma Unused (X);   --  Warn Unmodified and Unreferenced
   pragma Unused (Y);   --  Warn Unused
   pragma Unmodified (Y);   --  Warn Unused
   pragma Unreferenced (Y); --  Warn Unused

   --  Proper use
   A, B, C, D : Boolean := True;
   pragma Unmodified (A);   --  Valid
   pragma Unreferenced (B); --  Valid
   pragma Unmodified (C);   --  Valid
   pragma Unreferenced (C); --  Valid
   pragma Unused (D);   --  Valid

begin
   X := True;   --  Warn Unmodified
   Z := X;  --  Warn Unreferenced
   Y := True;   --  Warn Unused
   Z := Y;  --  Warn Unused
   Z := A;  --  Valid
   B := False;  --  Valid
end Main;


-- Compilation and output --


$ gcc -gnatl -c main.adb

[...]

 1. with Ada.Text_IO;
 2.
 3. --  Context clause
 4. pragma Unused (Ada.Text_IO);--  Warn Unused
  |
>>> pragma "Unused" argument must be in same declarative part

 5. pragma Unmodified (Ada.Text_IO);--  Warn Unmodified
  |
>>> pragma "Unmodified" argument must be in same declarative part

 6. pragma Unreferenced (Ada.Text_IO);  --  Valid
 7.
 8. procedure Main is
 9.
10.--  Improper use
11.X, Y, Z : Boolean := False;
12.
13.--  Non-variable
14.procedure Test is begin null; end;
15.pragma Unmodified (Test);--  Warn Unmodified
  |
>>> pragma "Unmodified" can only be applied to a variable

16.pragma Unused (Test);--  Warn Unused
  |
>>> pragma "Unused" can only be applied to a variable

17.pragma Unreferenced (Test);  --  Valid
18.
19.--  Equivalence of Unused to Unmodified + Unreferenced
20.pragma Unmodified (X);   --  Valid
21.pragma Unmodified (X);   --  Warn Unmodified
  |
>>> warning: pragma Unmodified given for "X"

22.pragma Unreferenced (X); --  Valid
23.pragma Unused (Y);   --  Valid
24.
25.--  Duplicate error messages
26.pragma Unreferenced (X); --  Warn Unreferenced
|
>>> warning: pragma Unreferenced given for "X"

27.pragma Unused (X);   --  Warn Unmodified and Unreferenced
  |
>>> warning: pragma Unmodified given for "X"
>>> warning: pragma Unreferenced given for "X"

28.pragma Unused (Y);   --  Warn Unused
  |
>>> warning: pragma Unused given for "Y"

29.pragma Unmodified (Y);   --  Warn Unused
  |
>>> warning: pragma Unused given for "Y"

30.pragma Unreferenced (Y); --  Warn Unused
|
>>> warning: pragma Unused given for "Y"

31.
32.--  Proper use
33.A, B, C, D : Boolean := True;
34.pragma Unmodified (A);   --  Valid
35.pragma Unreferenced (B); --  Valid
36.pragma Unmodified (C);   --  Valid
37.pragma Unreferenced (C); --  Valid
38.pragma Unused (D);   --  Valid
39.
40. begin
41.  

[Ada] Early finalization of ctrl func result clobbers array element

2016-07-04 Thread Arnaud Charlet
This patch modifies the expansion of array aggregates to perform in-place
side effect removal when a controlled function call acts as an initialization
expression. This eliminates the transient property of the function call and
ensures the proper order of copy, adjustment, and finalization.


-- Source --


--  types.ads

with Ada.Finalization; use Ada.Finalization;

package Types is
   type Ctrl is new Controlled with record
  Id : Natural := 0;
   end record;

   procedure Adjust (Obj : in out Ctrl);
   procedure Finalize (Obj : in out Ctrl);
   procedure Initialize (Obj : in out Ctrl);

   function Make_Ctrl return Ctrl;

   type Arr_1 is array (1 .. 10) of Ctrl;
   type Arr_2 is array (Integer range <>) of Ctrl;

   type Arr_3 is array (-10 .. -1) of Arr_1;
end Types;

--  types.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Types is
   Id_Gen : Natural := 100;

   procedure Adjust (Obj : in out Ctrl) is
  Old_Id : constant Natural := Obj.Id;
  New_Id : constant Natural := Old_Id + 1;

   begin
  if Old_Id = 0 then
 Put_Line ("ERROR: adjusting finalized object");
  end if;

  Put_Line ("  adj" & Old_Id'Img & " ->" & New_Id'Img);
  Obj.Id := New_Id;
   end Adjust;

   procedure Finalize (Obj : in out Ctrl) is
   begin
  Put_Line ("  fin" & Obj.Id'Img);
  Obj.Id := 0;
   end Finalize;

   procedure Initialize (Obj : in out Ctrl) is
   begin
  Id_Gen := Id_Gen + 100;
  Obj.Id := Id_Gen;
  Put_Line ("  ini" & Obj.Id'Img);
   end Initialize;

   function Make_Ctrl return Ctrl is
   begin
  return Result : Ctrl;
   end Make_Ctrl;
end Types;

--  aggregates.ads

with Types; use Types;

package Aggregates is
   function Func_4 (Build : Boolean) return Arr_3;
end Aggregates;

--  aggregates.adb

package body Aggregates is
   function Func_4 (Build : Boolean) return Arr_3 is
   begin
  if Build then
 return (-4 =>--  1) resolve 6) transient scope
   (others  =>--  2) resolve
 Make_Ctrl),  -- 13) transient scope
 -1 =>
   (others  =>--  3) resolve
 Make_Ctrl),  -- 14) transient scope
 -9 .. -5   =>
   (others  =>-- 10) resolve 11) transient scope
 Make_Ctrl),  -- 12) transient scope
 -10=>
   (1 .. 3  =>--  4) resolve
  Make_Ctrl,  --  8) transient scope
4   =>
  Make_Ctrl,  --  5) transient scope
others  =>
  Make_Ctrl), --  9) transient scope
 others =>
   (1 .. 10 =>--  7) resolve 15) resolve 16) transient s
 Make_Ctrl)); -- 17) transient scope
  else
 raise Program_Error;
  end if;
   end Func_4;
end Aggregates;

--  main.adb

with Ada.Finalization; use Ada.Finalization;
with Ada.Text_IO;  use Ada.Text_IO;
with Aggregates;   use Aggregates;
with Types;use Types;

procedure Main is
begin
   Put_Line ("Complex mixed aggregate");
   declare
 Obj_4 : constant Arr_3 := Func_4 (True);
   begin null; end;

   Put_Line ("End");
end Main;


-- Compilation and output --


$ gnatmake -q main.adb
$ ./main
Complex mixed aggregate
  ini 200
  adj 200 -> 201
  fin 200
  ini 300
  adj 300 -> 301
  fin 300
  adj 301 -> 302
  fin 301
  ini 400
  adj 400 -> 401
  fin 400
  adj 401 -> 402
  fin 401
  ini 500
  adj 500 -> 501
  fin 500
  adj 501 -> 502
  fin 501
  adj 201 -> 202
  ini 600
  adj 600 -> 601
  fin 600
  adj 601 -> 602
  fin 601
  ini 700
  adj 700 -> 701
  fin 700
  adj 701 -> 702
  fin 701
  ini 800
  adj 800 -> 801
  fin 800
  adj 801 -> 802
  fin 801
  ini 900
  adj 900 -> 901
  fin 900
  adj 901 -> 902
  fin 901
  ini 1000
  adj 1000 -> 1001
  fin 1000
  adj 1001 -> 1002
  fin 1001
  ini 1100
  adj 1100 -> 1101
  fin 1100
  adj 1101 -> 1102
  fin 1101
  ini 1200
  adj 1200 -> 1201
  fin 1200
  adj 1201 -> 1202
  fin 1201
  ini 1300
  adj 1300 -> 1301
  fin 1300
  adj 1301 -> 1302
  fin 1301
  ini 1400
  adj 1400 -> 1401
  fin 1400
  adj 1401 -> 1402
  fin 1401
  ini 1500
  adj 1500 -> 1501
  fin 1500
  adj 1501 -> 1502
  fin 1501
  ini 1600
  adj 1600 -> 1601
  fin 1600
  adj 1601 -> 1602
  fin 1601
  ini 1700
  adj 1700 -> 1701
  fin 1700
  adj 1701 -> 1702
  fin 1701
  ini 1800
  adj 1800 -> 1801
  fin 1800
  adj 1801 -> 1802
  fin 1801
  ini 1900
  adj 1900 -> 1901
  fin 1900
  adj 1901 -> 1902
  fin 1901
  ini 2000
  adj 2000 -> 2001
  fin 2000
  adj 2001 -> 2002
  fin 2001
  ini 2100
  adj 2100 -> 2101
  fin 2100
  adj 2101 -> 2102
  fin 2101
  fin 2102
  fin 2002
  fin 1902
  fin 1802
  fin 1702
  fin 1602
  fin 1502
  fin 1402
  fin 1302
  fin 1202
  ini 2200
  adj 2200 -> 2201
  fin 2200
  adj 2201 -> 2202
  fin 2201
  ini 2300
  adj 

Re: [6/7] Explicitly classify vector loads and stores

2016-07-04 Thread Richard Biener
On Sun, Jul 3, 2016 at 7:10 PM, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Wed, Jun 15, 2016 at 10:52 AM, Richard Sandiford
>>  wrote:
>>> This is the main patch in the series.  It adds a new enum and routines
>>> for classifying a vector load or store implementation.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Why's the setting and checking of the memory access type conditional on !slp?
>> I'd rather avoid doing this :/
>
> For loads we need it for hybrid SLP, since we can vectorise the
> same load twice, once for SLP and once not.  (See e.g. pr62075.c.)

Ah, indeed.

> For stores it was unnecessary cut-&-paste.
>
> Is it OK with the !slp restricted to vectorizable_load?

Yes.

Thanks,
Richard.

> Thanks,
> Richard
>


Re: [PATCH 0/9] separate shrink-wrapping

2016-07-04 Thread Segher Boessenkool
On Wed, Jun 29, 2016 at 06:16:10PM -0500, Segher Boessenkool wrote:
> On Thu, Jun 30, 2016 at 01:03:17AM +0200, Bernd Schmidt wrote:
> > On 06/08/2016 07:26 PM, Segher Boessenkool wrote:
> > >One thing I should try is put a USE of the saved registers at such
> > >exits, maybe that helps those passes that now delete frame restores
> > >to not do that.
> > 
> > Have you had a chance to try this?
> 
> Not yet.  I have tried to get dwarf2cfi not to complain when one path
> entering a block has a restore and some other path doesn't (and mark
> the register as unavailable).  That works great in most cases but it
> seems sometimes this also then happens for exception handlers, which
> is disastrous of course (and horrendous to debug).
> 
> The USE thing should be much easier, I might have results tomorrow,
> if not, next week.

Not so easy actually; but I have results now.

Putting USEs of all separately shrink-wrapped variables at each "strange"
exit (i.e. those blocks that have no successors) helps regrename (makes
the first regrename patch unnecessary), but not any of the other patches
(which makes sense if you look at what they do).  It does not help the
problem the second regrename patch is for though (the patch that disables
regrename altogether if separate shrink-wrapping is used).

So the problem shows up on PowerPC with hash_map_rand on -m32.  This is
a huge testcase, and randomised; sometimes it works, more often it corrupts
various data structures.

I tracked it down to regrename renaming one variable (r9, the first choice
for allocation) to some non-volatile register.  This however is a register
that is not separately shrink-wrapped!  So it seems the problem could
always happen; it just rarely does (one in a million or so testcases, and
almost nobody uses regrename anyway).

If I put a USE of every register that *could* be separately shrink-wrapped
at all blocks without successors, the problem goes away.  I do not
yet know why exactly; I don't understand what regrename does very well.

Doing this requires yet another hook (I hardcoded the PowerPC requirements
for now, adding hooks is painful, probably as it should be ;-) ).  It
feels wrong anyway.  I'll have to dig even deeper to find out if this
can or cannot happen without this patch series.


Segher


Re: [PATCH 2/2] gcc/genrecog: Don't warn for missing mode on special predicates

2016-07-04 Thread Richard Sandiford
Andrew Burgess  writes:
> +/* Return true if OPERAND is a MATCH_OPERAND using a special predicate
> +   function.  */
> +
> +static bool
> +special_predicate_operand_p (rtx operand)
> +{
> +  if (GET_CODE (operand) == MATCH_OPERAND)
> +{
> +  const char *pred_name = predicate_name (operand);
> +  if (pred_name[0] != 0)
> + {
> +   const struct pred_data *pred;
> +
> +   pred = lookup_predicate (pred_name);
> +   return pred->special;

Thanks for removing the duplicated error check for unknown predicates.
I think that error gets reported later though, so we should check for
null here:

  return pred && pred->special;

OK with that change, thanks.
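
To show why the null check matters, here is a self-contained toy version of
the pattern. The table, the predicate names and the toy_* functions are all
made up for this sketch and are not genrecog's data structures; the only point
is that a failed lookup must not be dereferenced:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy predicate record; a stand-in, not GCC's struct pred_data.  */
struct toy_pred
{
  const char *name;
  bool special;
};

/* Illustrative table contents only.  */
static const struct toy_pred toy_preds[] = {
  { "plain_pred", false },
  { "special_pred", true },
};

/* Return the record for NAME, or NULL if NAME is unknown.  */
const struct toy_pred *
toy_lookup (const char *name)
{
  for (size_t i = 0; i < sizeof toy_preds / sizeof toy_preds[0]; i++)
    if (strcmp (toy_preds[i].name, name) == 0)
      return &toy_preds[i];
  return NULL;
}

/* Return true only for a known predicate marked special.  An unknown
   name yields false here and is left for a later pass to report.  */
bool
toy_special_p (const char *name)
{
  if (name[0] == '\0')
    return false;
  const struct toy_pred *pred = toy_lookup (name);
  return pred && pred->special;
}
```

With a plain "return pred->special;" the unknown-name case would dereference
a null pointer before the later error report had a chance to run.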

Richard


Re: [AArch64] ARMv8.2 command line and feature macros support

2016-07-04 Thread Jiong Wang

On 29/06/16 09:43, James Greenhalgh wrote:
This is OK for trunk otherwise. 


Thanks. Committed attached patch as r237956.
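
The extension dependencies introduced by this patch (enabling "fp16" also
enables "fp"; disabling "fp" also disables "simd", "crypto" and "fp16") can be
sketched with plain bit masks. The FL_* values and function names below are
hypothetical and chosen for the example; they are not the definitions in
aarch64.h or the option-extensions machinery:

```c
/* Hypothetical flag bits for this sketch only.  */
#define FL_FP     (1u << 0)
#define FL_SIMD   (1u << 1)
#define FL_CRYPTO (1u << 2)
#define FL_F16    (1u << 3)

/* "+fp16": enabling fp16 also enables fp.  */
unsigned
enable_fp16 (unsigned flags)
{
  return flags | FL_F16 | FL_FP;
}

/* "+nofp16": disabling fp16 just disables fp16.  */
unsigned
disable_fp16 (unsigned flags)
{
  return flags & ~FL_F16;
}

/* "+nofp": disabling fp also disables simd, crypto and fp16.  */
unsigned
disable_fp (unsigned flags)
{
  return flags & ~(FL_FP | FL_SIMD | FL_CRYPTO | FL_F16);
}
```

The asymmetry is deliberate: turning a feature on pulls in what it needs,
while turning a base feature off removes everything that depends on it.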

Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 237955)
+++ gcc/ChangeLog	(revision 237956)
@@ -1,3 +1,20 @@
+2016-07-04  Matthew Wahab  
+	Jiong Wang  
+
+	* config/aarch64/aarch64-arches.def: Add "armv8.2-a".
+	* config/aarch64/aarch64.h (AARCH64_FL_V8_2): New.
+	(AARCH64_FL_F16): New.
+	(AARCH64_FL_FOR_ARCH8_2): New.
+	(AARCH64_ISA_8_2): New.
+	(AARCH64_ISA_F16): New.
+	(TARGET_FP_F16INST): New.
+	(TARGET_SIMD_F16INST): New.
+	* config/aarch64/aarch64-option-extensions.def ("fp16"): New entry.
+	("fp"): Disabling "fp" also disables "fp16".
+	* config/aarch64/aarch64-c.c (arch64_update_cpp_builtins): Conditionally define
+	__ARM_FEATURE_FP16_SCALAR_ARITHMETIC and __ARM_FEATURE_FP16_VECTOR_ARITHMETIC.
+	* doc/invoke.texi (AArch64 Options): Document "armv8.2-a" and "fp16".
+
 2016-07-04  Jan Beulich  
 
 	* gcc.c (default_compilers["@c-header"]): Conditionalize "-o".
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 237955)
+++ gcc/doc/invoke.texi	(revision 237956)
@@ -13101,8 +13101,11 @@
 @option{-march=@var{arch}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}.
 
 The permissible values for @var{arch} are @samp{armv8-a},
-@samp{armv8.1-a} or @var{native}.
+@samp{armv8.1-a}, @samp{armv8.2-a} or @var{native}.
 
+The value @samp{armv8.2-a} implies @samp{armv8.1-a} and enables compiler
+support for the ARMv8.2-A architecture extensions.
+
 The value @samp{armv8.1-a} implies @samp{armv8-a} and enables compiler
 support for the ARMv8.1 architecture extension.  In particular, it
 enables the @samp{+crc} and @samp{+lse} features.
@@ -13208,6 +13211,8 @@
 @item lse
 Enable Large System Extension instructions.  This is on by default for
 @option{-march=armv8.1-a}.
+@item fp16
+Enable FP16 extension.  This also enables floating-point instructions.
 
 @end table
 
Index: gcc/config/aarch64/aarch64-arches.def
===
--- gcc/config/aarch64/aarch64-arches.def	(revision 237955)
+++ gcc/config/aarch64/aarch64-arches.def	(revision 237956)
@@ -32,4 +32,5 @@
 
 AARCH64_ARCH("armv8-a",	  generic,	 8A,	8,  AARCH64_FL_FOR_ARCH8)
 AARCH64_ARCH("armv8.1-a", generic,	 8_1A,	8,  AARCH64_FL_FOR_ARCH8_1)
+AARCH64_ARCH("armv8.2-a", generic,	 8_2A,	8,  AARCH64_FL_FOR_ARCH8_2)
 
Index: gcc/config/aarch64/aarch64-option-extensions.def
===
--- gcc/config/aarch64/aarch64-option-extensions.def	(revision 237955)
+++ gcc/config/aarch64/aarch64-option-extensions.def	(revision 237956)
@@ -39,8 +39,8 @@
that are required.  Their order is not important.  */
 
 /* Enabling "fp" just enables "fp".
-   Disabling "fp" also disables "simd", "crypto".  */
-AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO, "fp")
+   Disabling "fp" also disables "simd", "crypto" and "fp16".  */
+AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO | AARCH64_FL_F16, "fp")
 
 /* Enabling "simd" also enables "fp".
Disabling "simd" also disables "crypto".  */
@@ -55,3 +55,7 @@
 
 /* Enabling or disabling "lse" only changes "lse".  */
 AARCH64_OPT_EXTENSION("lse", AARCH64_FL_LSE, 0, 0, "atomics")
+
+/* Enabling "fp16" also enables "fp".
+   Disabling "fp16" just disables "fp16".  */
+AARCH64_OPT_EXTENSION("fp16", AARCH64_FL_F16, AARCH64_FL_FP, 0, "fp16")
Index: gcc/config/aarch64/aarch64-c.c
===
--- gcc/config/aarch64/aarch64-c.c	(revision 237955)
+++ gcc/config/aarch64/aarch64-c.c	(revision 237956)
@@ -95,6 +95,11 @@
   else
 cpp_undef (pfile, "__ARM_FP");
 
+  aarch64_def_or_undef (TARGET_FP_F16INST,
+			"__ARM_FEATURE_FP16_SCALAR_ARITHMETIC", pfile);
+  aarch64_def_or_undef (TARGET_SIMD_F16INST,
+			"__ARM_FEATURE_FP16_VECTOR_ARITHMETIC", pfile);
+
   aarch64_def_or_undef (TARGET_SIMD, "__ARM_FEATURE_NUMERIC_MAXMIN", pfile);
   aarch64_def_or_undef (TARGET_SIMD, "__ARM_NEON", pfile);
 
Index: gcc/config/aarch64/aarch64.h
===
--- gcc/config/aarch64/aarch64.h	(revision 237955)
+++ gcc/config/aarch64/aarch64.h	(revision 237956)
@@ -135,6 +135,9 @@
 /* ARMv8.1 architecture extensions.  */
 #define AARCH64_FL_LSE	  (1 << 4)  /* Has Large System Extensions.  */
 #define AARCH64_FL_V8_1	  (1 << 5)  /* Has ARMv8.1 extensions.  */
+/* ARMv8.2-A architecture extensions.  */
+#define AARCH64_FL_V8_2	  (1 << 8)  /* Has ARMv8.2-A features.  */
+#define AARCH64_FL_F16	  (1 << 9)  /* Has ARMv8.2-A FP16 extensions.  */
 
 /* Has FP and SIMD.  */
 #define AARCH64_FL_FPSIMD (AARCH64_FL_FP |