RE: [Patch, ARM, ping1] Fix PR target/56846

2014-11-27 Thread Thomas Preud'homme
Thanks. Ccing release manager for their opinion.

Best regards,

Thomas

 -Original Message-
 From: Jonathan Wakely [mailto:jwak...@redhat.com]
 Sent: Wednesday, November 26, 2014 5:33 PM
 To: Thomas Preud'homme
 Cc: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
 g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
 libstd...@gcc.gnu.org
 Subject: Re: [Patch, ARM, ping1] Fix PR target/56846
 
 On 26/11/14 17:23 -, Thomas Preud'homme wrote:
 Ping?
 
 I'm OK with backporting it if a release manager approves it.
 
 
  -Original Message-
  From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
  ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
  Sent: Wednesday, November 19, 2014 6:00 PM
  To: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
  g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
  libstd...@gcc.gnu.org
  Subject: RE: [Patch ARM] Fix PR target/56846
 
   From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
   ow...@gcc.gnu.org] On Behalf Of Tony Wang
  
 
  
   Hi all,
  
   The bug is reported at
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846, and it’s about
 the
   problem that
   when exception handler is involved in the function, then
   _Unwind_Backtrace function will run into deadloop on
   arm target.
 
  The patch (in r215101) can be backported without any change on 4.8
 and
  4.9 branches. I checked in QEMU with and without the patch on both
  branches and it indeed solves the problem.
 
  Testsuite run without regression when compiled with arm-none-eabi
  cross compiler and executed on QEMU emulating Cortex-M3.
 
  I also bootstrapped gcc on x86_64-linux-gnu and run the testsuite
  without
  regressions.
 
  Is it ok for backport?
 
  Best regards,
 
  Thomas
 
 
 
 
 
 
 
 






Re: [PATCH][ARM] Add Cortex-A17 support

2014-11-27 Thread Ramana Radhakrishnan
On Thu, Nov 13, 2014 at 5:25 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 This patch adds support for the Cortex-A17 processor to the arm backend.
 Cortex-A17 is an ARMv7ve core with the same architectural features as the
 Cortex-A7, A12 and A15 cores.
 The -m{tune, cpu}=cortex-a17 option is added and a pipeline description for
 instruction scheduling is provided.
 This has given an uplift over -mcpu=cortex-a15 tuning on a number of
 benchmarks.

 The patch is fairly self-contained with the bulk of the diffstat being the
 pipeline description files.

 Bootstrapped and tested on arm-none-linux-gnueabihf.

 Ok for trunk?


Ok with documentation update in invoke.texi and an entry for changes.html

Ramana

 Thanks,
 Kyrill

 2014-11-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm.md (generic_sched): Specify cortexa17 in 'no' list.
 Include cortex-a17.md.
 * config/arm/arm.c (arm_issue_rate): Specify 2 for cortexa17.
 * config/arm/arm-cores.def (cortex-a17): New entry.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm-tune.md: Regenerate.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Specify mcpu=cortex-a17.
 * config/arm/cortex-a17.md: New file.
 * config/arm/cortex-a17-neon.md: New file.
 * config/arm/driver-arm.c (arm_cpu_table): Add entry for cortex-a17.
 * config/arm/t-aprofile: Add cortex-a17 entries to MULTILIB_MATCHES.


Re: [PATCH][ARM] Use Cortex-A17 tuning parameters for Cortex-A12

2014-11-27 Thread Ramana Radhakrishnan
On Thu, Nov 13, 2014 at 5:25 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 The Cortex-A12 very close to the Cortex-A17 and this patch updates the
 tuning struct
 parameters to match the Cortex-A17 ones.

 This has improved performance in a number of benchmarks that I tried.
 The instruction scheduling is also changed to match the Cortex-A17
 scheduling

 Tested arm-none-eabi.

 Ok for trunk once the Cortex-A17 patch goes in?


Ok.

Ramana

 Thanks,
 Kyrill

 2014-11-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm.c (arm_cortex_a12_tune): Update entries to match
 Cortex-A17 tuning parameters.
 * config/arm/arm-cores.def (cortex-a12): Schedule for cortex-a17.


Re: path fixing PPC bootstrap

2014-11-27 Thread Richard Sandiford
Vladimir Makarov vmaka...@redhat.com writes:
 Mike Meissner pointed me out that my last patch broke PPC bootstrap.  I 
 submitted a quick fix for it and now I am sending the path after 
 bootstrap on ppc is done successfully.

 Sorry for the inconvinience.

 2014-11-25  Vladimir Makarov  vmaka...@redhat.com

  * ira-lives.c (process_bb_node_lives): Make code with conditional
  REAL_PIC_OFFSET_TABLE_REGNUM.

REAL_PIC_OFFSET_TABLE_REGNUM is really an i386-local thing though.
There's no documentation for it in tm.texi and this is the first
use of it in generic code.

I suppose I'm just being lazy and not building and trying it out
for myself, but could you describe the original fix a bit more?
The comment says:

  /* Processing insn usage in call insn can create conflict
 with pic pseudo and pic hard reg and that is wrong.
 Check this situation and fix it at the end of the insn
 processing.  */

but where does the conflict come from?

Thanks,
Richard



Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-27 Thread Richard Biener
On Wed, 26 Nov 2014, Jakub Jelinek wrote:

 On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
  Well, if you want to aggressively prune unused bits then you could 
  back-propagate the shift count value-range.
  
  And note that all the frontend shorten-optimizations should move to the 
  middle-end.
  
  That said, instead of casting to int we could cast to unsigned char...
 
 As long as we don't have  256 bitsize integers, yes, for the purposes of
 GIMPLE.  Though, as with type demotion, then it might be worthwhile to try
 to promote it to wider type if the shift instructions operate on wider shift
 count modes than QImode.

True.  I suppose the promotion pass could promote/demote the shift count
to a mode appropriate for the target.  But that's again orthogonal to
optimizing the computation of the shift count based on its value-range
which does not invoke undefined behavior (thus, shortening of those
computations).  We don't have a pass yet that allows this kind of
back-propagation though of course a simple shortening pass wouldn't
be hard to implement (or similar to what the FEs do now we can add
some patterns to match.pd catching those opportunities).

Richard.


RE: [Patch, ARM, ping1] Fix PR target/56846

2014-11-27 Thread Richard Biener
On Thu, 27 Nov 2014, Thomas Preud'homme wrote:

 Thanks. Ccing release manager for their opinion.

It doesn't look ARM specific and frankly I have not too much expertise
in this area.  The patch has been on trunk for more than two months
though so I guess it is ok to backport.

Richard.

 Best regards,
 
 Thomas
 
  -Original Message-
  From: Jonathan Wakely [mailto:jwak...@redhat.com]
  Sent: Wednesday, November 26, 2014 5:33 PM
  To: Thomas Preud'homme
  Cc: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
  g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
  libstd...@gcc.gnu.org
  Subject: Re: [Patch, ARM, ping1] Fix PR target/56846
  
  On 26/11/14 17:23 -, Thomas Preud'homme wrote:
  Ping?
  
  I'm OK with backporting it if a release manager approves it.
  
  
   -Original Message-
   From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
   ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
   Sent: Wednesday, November 19, 2014 6:00 PM
   To: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
   g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
   libstd...@gcc.gnu.org
   Subject: RE: [Patch ARM] Fix PR target/56846
  
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
ow...@gcc.gnu.org] On Behalf Of Tony Wang
   
  
   
Hi all,
   
The bug is reported at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846, and it’s about
  the
problem that
when exception handler is involved in the function, then
_Unwind_Backtrace function will run into deadloop on
arm target.
  
   The patch (in r215101) can be backported without any change on 4.8
  and
   4.9 branches. I checked in QEMU with and without the patch on both
   branches and it indeed solves the problem.
  
   Testsuite run without regression when compiled with arm-none-eabi
   cross compiler and executed on QEMU emulating Cortex-M3.
  
   I also bootstrapped gcc on x86_64-linux-gnu and run the testsuite
   without
   regressions.
  
   Is it ok for backport?
  
   Best regards,
  
   Thomas
  
  
  
  
  
  
  
  
 
 
 
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany

Re: [Ping] Port of VTV for Cygwin and MinGW

2014-11-27 Thread Patrick Wollgast
On 12.11.2014 19:40, Kai Tietz wrote:
 TerminateProcess is actually bad, as it doesn't call any of the atexit
 handlers.  You simply nuke the process off.  For cygwin this behavior
 is inacceptable.  Why a classical abort, or a classical exit call
 cause for you that issues?  It seems to me more related to some other
 thing you try to paper over by this.
 

It turns out the test program made some trouble. I rewrote it to the
attached program (virtual_func_test_min_AW.cpp). I changed obstack.c and
vtv_rts.cc to the C-runtime functions. For testing I used a program just
containing an abort and all three tests in the attached test program.
The call stack, passed parameters and behavior matched at the crucial
parts (tested again on MinGW 32/64bit).

 
 Regarding the question, why I reimplemented mprotect, I also haven't
 changed anything in the patch but answered the question.
 
 And this doesn't make it better.  It is present in the static part of
 libgcc.  Have you tried to declare it with extern C (for C++ case)
 and simply use it?
 Cygwin provides its own version too.  So there seems to me no real
 need to re-implement it.
 

You're right. I was stuck with the idea of importing it dynamically, but
changed it to extern C now.

Regards,
Patrick
* gcc/config/i386/cygwin.h (STARTFILE_SPEC): Add vtv_start.o,
if -fvtable-verify=preinit/std is used.
* gcc/config/i386/mingw-w64.h (STARTFILE_SPEC): Likewise.
* gcc/config/i386/mingw32.h (STARTFILE_SPEC): Likewise.
* gcc/config/i386/cygwin.h (ENDFILE_SPEC): Add vtv_end.o,
if -fvtable-verify=preinit/std is used.
* gcc/config/i386/mingw32.h (ENDFILE_SPEC): Likewise.
* gcc/config/i386/cygwin.h (LIB_SPEC): Pass -lvtv and -lpsapi,
if -fvtable-verify=preinit/std is used.
* gcc/config/i386/mingw-w64.h (LIB_SPEC): Likewise.
* gcc/config/i386/mingw32.h (LIB_SPEC): Likewise.

* gcc/cp/vtable-class-hierarchy.c (vtv_generate_init_routine): Add
check for not TARGET_PECOFF at the VTV_PREINIT_PRIORITY checks.

* gcc/varasm.c (assemble_variable): Add code to properly set the comdat
section and name for the .vtable_map_vars section in case the
target is PE or COFF.


* libgcc/Makefile.in: Move rules to build vtv_*.o out of the check
for CUSTOM_CRTSTUFF.
* libgcc/config.host (i[34567]86-*-cygwin*, x86_64-*-cygwin*, 
i[34567]86-*-mingw*)
(x86_64-*-mingw*): Only add vtv_*.o to extra_parts if enable_vtable_verify.


* libstdc++-v3/acinclude.m4: Define VTV_CYGMIN.
* libstdc++-v3/configure: Regenerate.

* libstdc++-v3/libsupc++/Makefile.am: Add vtv_sources only to
libsupc___la_SOURCES and libsupc__convenience_la_SOURCES if VTV_CYGMIN is
not set.
* libstdc++-v3/libsupc++/Makefile.in: Regenerated.
* libstdc++-v3/libsupc++/vtv_stubs.cc: Add none weak declaration of every
function for Cygwin and MinGW.

* libstdc++-v3/src/Makefile.am: Add libvtv.la to toolexeclib_LTLIBRARIES,
if VTV_CYGMIN is set. Define libvtv_la_SOURCES, libvtv_la_LDFLAGS,
libvtv_la_AM_CXXFLAGS and libvtv_la_LINK if VTV_CYGMIN is set.
* libstdc++-v3/src/Makefile.in: Regenerate.


* libvtv/Makefile.am : Add libvtv.la to toolexeclib_LTLIBRARIES, if VTV_CYGMIN
is set. Define libvtv_la_LIBADD, libvtv_la_LDFLAGS, libvtv_stubs_la_LDFLAGS
and libvtv_stubs_la_SOURCES if VTV_CYGMIN is set. Add obstac.c to
libvtv_la_SOURCES if VTV_CYGMIN is set.
* libvtv/Makefile.in : Regenerate.
* libvtv/aclocal.m4 : Regenerate.
* libvtv/configure : Regenerate.
* libvtv/configure.ac : Add ACX_LT_HOST_FLAGS. Define VTV_CYGMIN.
* libvtv/configure.tgt : (x86_64-*-cygwin*, i?86-*-cygwin*, x86_64-*-mingw*)
(i?86-*-mingw*): Add to supported targets.
* libvtv/vtv_fail.cc : Skip inclusion of execinfo.h on Cygwin and MinGW.
(log_error_message): Skip calls to backtrace and backtrace_symbols_fd on Cygwin
and MinGW.
* libvtv/vtv_malloc.cc : Include windows.h and skip sys/mman.h inclusion on
Cygwin and MinGW. Add sysconf port on Cygwin and MinGW.
(obstack_chunk_alloc): Exchange call to mmap with call to VirtualAlloc on Cygwin
and MinGW.
(__vtv_malloc_init): Exchange call to sysconf with call to port of sysconf on
Cygwin and MinGW.
* libvtv/vtv_malloc.h : Declare mprotect and define PROT_READ and PROT_WRITE on
Cygwin and MinGW.
* libvtv/map.h : Include stdint.h on MinGW.
* libvtv/rts.cc : Include windows.h, winternl.h and psapi.h, skip include of
execinfo.h, sys/mman.h and link.h on Cygwin and MinGW.
Add port of __fortify_fail on Cygwin and MinGW.
Change ElfW (Addr) to uintptr_t on Cygwin and MinGW.
(read_section_offset_and_length): Add port for Cygwin and MinGW
(iterate_modules): New function.
(vtv_unprotect_vtable_vars): Use iterate_modules instead of dl_iterate_phdr on
Cygwin and MinGW.
(vtv_protect_vtable_vars): Likewise.
(count_all_pages): Likewise.
(dl_iterate_phdr_count_pages): Don't build on Cygwin and MinGW.
* libvtv/utils.cc : Include windows.h and skip execinfo.h inclusion on
Cygwin and MinGW.
(__vtv_open_log): Exchange call to getuid and getpid 

Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-27 Thread Jakub Jelinek
On Thu, Nov 27, 2014 at 10:30:16AM +0100, Richard Biener wrote:
 On Wed, 26 Nov 2014, Jakub Jelinek wrote:
 
  On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
   Well, if you want to aggressively prune unused bits then you could 
   back-propagate the shift count value-range.
   
   And note that all the frontend shorten-optimizations should move to the 
   middle-end.
   
   That said, instead of casting to int we could cast to unsigned char...
  
  As long as we don't have  256 bitsize integers, yes, for the purposes of
  GIMPLE.  Though, as with type demotion, then it might be worthwhile to try
  to promote it to wider type if the shift instructions operate on wider shift
  count modes than QImode.
 
 True.  I suppose the promotion pass could promote/demote the shift count
 to a mode appropriate for the target.  But that's again orthogonal to
 optimizing the computation of the shift count based on its value-range
 which does not invoke undefined behavior (thus, shortening of those
 computations).  We don't have a pass yet that allows this kind of
 back-propagation though of course a simple shortening pass wouldn't
 be hard to implement (or similar to what the FEs do now we can add
 some patterns to match.pd catching those opportunities).

My point is that because we don't have infrastructure for that in GCC 5,
I think it would be best to keep the status quo in GIMPLE for this, and only
when we have infrastructure to do something better, change it.
As Marek wants to have the original (promoted) type in the FEs (which is of
course desirable, not only for ubsan, but also for diagnostics),
transforming that into the old way in c_gimplify_expr would keep the
middle-end seeing what it was used to.

Jakub


[PATCH] Fix PR64083

2014-11-27 Thread Richard Biener

The following fixes an omission from when I introduced the
mark_loop_for_removal abstraction.  thread_through_all_blocks still
uses the old way of killing a loop.  The following patch fixes this
by not killing it at all (it's just likely that the loop will break
and a loop-fixup is scheduled anyway if we change the CFG and that
fixup will deal with this just fine).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-11-26  Richard Biener  rguent...@suse.de

PR tree-optimization/64083
* tree-ssa-threadupdate.c (thread_through_all_blocks): Do not
forcibly mark loop for removal the wrong way.

* gcc.dg/torture/pr64083.c: New testcase.

Index: gcc/tree-ssa-threadupdate.c
===
--- gcc/tree-ssa-threadupdate.c (revision 218078)
+++ gcc/tree-ssa-threadupdate.c (working copy)
@@ -2428,16 +2428,8 @@ thread_through_all_blocks (bool may_peel
/* Our path is still valid, thread it.  */
if (e-aux)
  {
-   struct loop *loop = (*path)[0]-e-dest-loop_father;
-
if (thread_block ((*path)[0]-e-dest, false))
- {
-   /* This jump thread likely totally scrambled this loop.
-  So arrange for it to be fixed up.  */
-   loop-header = NULL;
-   loop-latch = NULL;
-   e-aux = NULL;
- }
+ e-aux = NULL;
else
  {
delete_jump_thread_path (path);
Index: gcc/testsuite/gcc.dg/torture/pr64083.c
===
--- gcc/testsuite/gcc.dg/torture/pr64083.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr64083.c  (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+
+int a, b;
+void
+fn1 ()
+{
+  int c = 0;
+  while (b)
+{
+  switch (c)
+case 1:
+   fn1 ();
+   if (a)
+ c = 1;
+   b = 0;
+}
+}


[PATCH] Fix match-and-simplify regression

2014-11-27 Thread Richard Biener

The following allows gimple_fold_stmt_to_constant_1 in SCCVN to follow
SSA edges.  This fixes regressions caused by no longer dispatching to
fold_unary from it.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-27  Richard Biener  rguent...@suse.de

* tree-ssa-sccvn.c (try_to_simplify): Allow
gimple_fold_stmt_to_constant_1 to follow SSA edges.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 218114)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -3461,7 +3461,7 @@ try_to_simplify (gassign *stmt)
 return NULL_TREE;
 
   /* First try constant folding based on our current lattice.  */
-  tem = gimple_fold_stmt_to_constant_1 (stmt, vn_valueize);
+  tem = gimple_fold_stmt_to_constant_1 (stmt, vn_valueize, vn_valueize);
   if (tem
(TREE_CODE (tem) == SSA_NAME
  || is_gimple_min_invariant (tem)))


Re: [Patch, ARM, ping1] Fix PR target/56846

2014-11-27 Thread Ramana Radhakrishnan



On 27/11/14 09:34, Richard Biener wrote:

On Thu, 27 Nov 2014, Thomas Preud'homme wrote:


Thanks. Ccing release manager for their opinion.


It doesn't look ARM specific and frankly I have not too much expertise
in this area.  The patch has been on trunk for more than two months
though so I guess it is ok to backport.



It is ARM specific because the whole thing sits in a #ifdef 
__ARM_EABI_UNWINDER__ in eh_personality.cc.


regards
Ramana


Richard.


Best regards,

Thomas


-Original Message-
From: Jonathan Wakely [mailto:jwak...@redhat.com]
Sent: Wednesday, November 26, 2014 5:33 PM
To: Thomas Preud'homme
Cc: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
libstd...@gcc.gnu.org
Subject: Re: [Patch, ARM, ping1] Fix PR target/56846

On 26/11/14 17:23 -, Thomas Preud'homme wrote:

Ping?


I'm OK with backporting it if a release manager approves it.



-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
Sent: Wednesday, November 19, 2014 6:00 PM
To: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
libstd...@gcc.gnu.org
Subject: RE: [Patch ARM] Fix PR target/56846


From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
ow...@gcc.gnu.org] On Behalf Of Tony Wang





Hi all,

The bug is reported at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846, and it’s about

the

problem that
when exception handler is involved in the function, then
_Unwind_Backtrace function will run into deadloop on
arm target.


The patch (in r215101) can be backported without any change on 4.8

and

4.9 branches. I checked in QEMU with and without the patch on both
branches and it indeed solves the problem.

Testsuite run without regression when compiled with arm-none-eabi
cross compiler and executed on QEMU emulating Cortex-M3.

I also bootstrapped gcc on x86_64-linux-gnu and run the testsuite
without
regressions.

Is it ok for backport?

Best regards,

Thomas



















Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-27 Thread Richard Biener
On Thu, 27 Nov 2014, Jakub Jelinek wrote:

 On Thu, Nov 27, 2014 at 10:30:16AM +0100, Richard Biener wrote:
  On Wed, 26 Nov 2014, Jakub Jelinek wrote:
  
   On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
Well, if you want to aggressively prune unused bits then you could 
back-propagate the shift count value-range.

And note that all the frontend shorten-optimizations should move to the 
middle-end.

That said, instead of casting to int we could cast to unsigned char...
   
   As long as we don't have  256 bitsize integers, yes, for the purposes of
   GIMPLE.  Though, as with type demotion, then it might be worthwhile to try
   to promote it to wider type if the shift instructions operate on wider 
   shift
   count modes than QImode.
  
  True.  I suppose the promotion pass could promote/demote the shift count
  to a mode appropriate for the target.  But that's again orthogonal to
  optimizing the computation of the shift count based on its value-range
  which does not invoke undefined behavior (thus, shortening of those
  computations).  We don't have a pass yet that allows this kind of
  back-propagation though of course a simple shortening pass wouldn't
  be hard to implement (or similar to what the FEs do now we can add
  some patterns to match.pd catching those opportunities).
 
 My point is that because we don't have infrastructure for that in GCC 5,
 I think it would be best to keep the status quo in GIMPLE for this, and only
 when we have infrastructure to do something better, change it.
 As Marek wants to have the original (promoted) type in the FEs (which is of
 course desirable, not only for ubsan, but also for diagnostics),
 transforming that into the old way in c_gimplify_expr would keep the
 middle-end seeing what it was used to.

But then rather than using integer_type_node I'd convert to
unsigned_type_node (if that doesn't work you know we have some nasty
dependence on negative shift counts working)

Richard.


Re: [Patch, ARM, ping1] Fix PR target/56846

2014-11-27 Thread Richard Biener
On Thu, 27 Nov 2014, Ramana Radhakrishnan wrote:

 
 
 On 27/11/14 09:34, Richard Biener wrote:
  On Thu, 27 Nov 2014, Thomas Preud'homme wrote:
  
   Thanks. Ccing release manager for their opinion.
  
  It doesn't look ARM specific and frankly I have not too much expertise
  in this area.  The patch has been on trunk for more than two months
  though so I guess it is ok to backport.
  
 
 It is ARM specific because the whole thing sits in a #ifdef
 __ARM_EABI_UNWINDER__ in eh_personality.cc.

Ah, too little patch context then.

Richard.

 regards
 Ramana
 
  Richard.
  
   Best regards,
   
   Thomas
   
-Original Message-
From: Jonathan Wakely [mailto:jwak...@redhat.com]
Sent: Wednesday, November 26, 2014 5:33 PM
To: Thomas Preud'homme
Cc: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
libstd...@gcc.gnu.org
Subject: Re: [Patch, ARM, ping1] Fix PR target/56846

On 26/11/14 17:23 -, Thomas Preud'homme wrote:
 Ping?

I'm OK with backporting it if a release manager approves it.


  -Original Message-
  From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
  ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
  Sent: Wednesday, November 19, 2014 6:00 PM
  To: Tony Wang; gcc-patches@gcc.gnu.org; d...@debian.org; aph-
  g...@littlepinkcloud.com; Richard Earnshaw; Ramana Radhakrishnan;
  libstd...@gcc.gnu.org
  Subject: RE: [Patch ARM] Fix PR target/56846
  
   From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
   ow...@gcc.gnu.org] On Behalf Of Tony Wang
   
  
   
   Hi all,
   
   The bug is reported at
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56846, and it’s about
the
   problem that
   when exception handler is involved in the function, then
   _Unwind_Backtrace function will run into deadloop on
   arm target.
  
  The patch (in r215101) can be backported without any change on 4.8
and
  4.9 branches. I checked in QEMU with and without the patch on both
  branches and it indeed solves the problem.
  
  Testsuite run without regression when compiled with arm-none-eabi
  cross compiler and executed on QEMU emulating Cortex-M3.
  
  I also bootstrapped gcc on x86_64-linux-gnu and run the testsuite
  without
  regressions.
  
  Is it ok for backport?
  
  Best regards,
  
  Thomas
  
  
  
  
 
 
 
 
   
   
   
   
   
  
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany

Re: [PATCH][AArch64] Implement vsqrt_f64 intrinsic

2014-11-27 Thread Kyrill Tkachov


On 26/11/14 10:14, Christophe Lyon wrote:

Hi Kyrill,

Hi Christophe,




On 21 November 2014 at 16:52, Marcus Shawcroft
marcus.shawcr...@gmail.com wrote:

On 17 November 2014 17:35, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:


2014-11-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/aarch64/arm_neon.h (vsqrt_f64): New intrinsic.

2014-11-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * gcc.target/aarch64/simd/vsqrt_f64_1.c

OK /Marcus

Your new test fails at the scan-assembly step because all the code is
optimized away (even at -O1).


Sorry about that, I could have sworn I saw it pass when I initially 
wrote it...
In any case, I've committed this patch (r218117) as obvious to mark one 
of the variables as volatile

to make sure it's not optimised away.
I've confirmed that the scan-assembler test fails without this patch and 
passes with it.


Thanks,
Kyrill



Christophe.

diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vsqrt_f64_1.c b/gcc/testsuite/gcc.target/aarch64/simd/vsqrt_f64_1.c
index 57fb6bb..7f99bd5 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/vsqrt_f64_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vsqrt_f64_1.c
@@ -11,7 +11,7 @@ extern void abort (void);
 int
 main (void)
 {
-  float64x1_t in = vcreate_f64(0x3febd3e560634d7bULL);
+  volatile float64x1_t in = vcreate_f64(0x3febd3e560634d7bULL);
   float64x1_t result = vsqrt_f64 (in);
   float64_t expected = 0.9325321502142351;
 

Re: [PATCH][AArch64] Implement vsqrt_f64 intrinsic

2014-11-27 Thread Kyrill Tkachov


On 27/11/14 10:03, Kyrill Tkachov wrote:

On 26/11/14 10:14, Christophe Lyon wrote:

Hi Kyrill,

Hi Christophe,



On 21 November 2014 at 16:52, Marcus Shawcroft
marcus.shawcr...@gmail.com wrote:

On 17 November 2014 17:35, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:


2014-11-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/aarch64/arm_neon.h (vsqrt_f64): New intrinsic.

2014-11-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * gcc.target/aarch64/simd/vsqrt_f64_1.c

OK /Marcus

Your new test fails at the scan-assembly step because all the code is
optimized away (even at -O1).

Sorry about that, I could have sworn I saw it pass when I initially
wrote it...
In any case, I've committed this patch (r218117) as obvious to mark one
of the variables as volatile
to make sure it's not optimised away.
I've confirmed that the scan-assembler test fails without this patch and
passes with it.


With the following ChangeLog for completeness sake:

2014-11-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/simd/vsqrt_f64_1.c: Mark variable volatile.



Thanks,
Kyrill


Christophe.







Re: [PATCH] Fix simd clone vectorization (PR tree-optimization/64024)

2014-11-27 Thread Richard Biener
On Wed, 26 Nov 2014, Jakub Jelinek wrote:

 Hi!
 
 As discussed in the PR and on IRC, the problem here is that peeling
 for alignment can for some linear argument that during vect analysis
 passed simple_iv no longer pass it during vect transform phase.
 
 So, to fix this, this patch remembers the base and step values from
 simple_iv during vect analysis and uses them during transform phase
 (biased by what the peeling for alignment advanced of course).
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2014-11-26  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/64024
   * tree-vectorizer.h (struct _stmt_vec_info): Remove simd_clone_fndecl
   field.  Add simd_clone_info field.
   (STMT_VINFO_SIMD_CLONE_FNDECL): Remove.
   (STMT_VINFO_SIMD_CLONE_INFO): Define.
   * tree-vect-stmts.c (vectorizable_simd_clone_call): Adjust for
   STMT_VINFO_SIMD_CLONE_FNDECL becoming first element of
   STMT_VINFO_SIMD_CLONE_INFO vector.  For linear arguments, remember
   base and linear_step from analysis phase and use it during transform
   phase, biased by the difference between LOOP_VINFO_NITERS{_UNCHANGED,}
   multiplied by linear_step.
   (free_stmt_vec_info): Release STMT_VINFO_SIMD_CLONE_INFO.
 
   * gcc.dg/vect/vect-simd-clone-13.c: New test.
   * gcc.dg/vect/vect-simd-clone-14.c: New test.
 
 --- gcc/tree-vectorizer.h.jj  2014-11-19 18:48:07.0 +0100
 +++ gcc/tree-vectorizer.h 2014-11-26 12:56:00.899824766 +0100
 @@ -602,8 +602,10 @@ typedef struct _stmt_vec_info {
   of this stmt.  */
vecdr_p same_align_refs;
  
 -  /* Selected SIMD clone's function decl.  */
 -  tree simd_clone_fndecl;
 +  /* Selected SIMD clone's function info.  First vector element
 + is SIMD clone's function decl, followed by a pair of trees (base + step)
 + for linear arguments (pair of NULLs for other arguments).  */
 +  vectree simd_clone_info;
  
/* Classify the def of this stmt.  */
enum vect_def_type def_type;
 @@ -677,7 +679,7 @@ typedef struct _stmt_vec_info {
  #define STMT_VINFO_RELATED_STMT(S) (S)-related_stmt
  #define STMT_VINFO_PATTERN_DEF_SEQ(S)  (S)-pattern_def_seq
  #define STMT_VINFO_SAME_ALIGN_REFS(S)  (S)-same_align_refs
 -#define STMT_VINFO_SIMD_CLONE_FNDECL(S) (S)-simd_clone_fndecl
 +#define STMT_VINFO_SIMD_CLONE_INFO(S)   (S)-simd_clone_info
  #define STMT_VINFO_DEF_TYPE(S) (S)-def_type
  #define STMT_VINFO_GROUP_FIRST_ELEMENT(S)  (S)-first_element
  #define STMT_VINFO_GROUP_NEXT_ELEMENT(S)   (S)-next_element
 --- gcc/tree-vect-stmts.c.jj  2014-11-19 18:47:59.0 +0100
 +++ gcc/tree-vect-stmts.c 2014-11-26 15:38:59.883409014 +0100
 @@ -2715,12 +2715,40 @@ vectorizable_simd_clone_call (gimple stm
else
   gcc_assert (thisarginfo.vectype != NULL_TREE);
  
 -  if (thisarginfo.dt != vect_constant_def
 -thisarginfo.dt != vect_external_def
 -loop_vinfo
 -TREE_CODE (op) == SSA_NAME
 -simple_iv (loop, loop_containing_stmt (stmt), op, iv, false)
 -tree_fits_shwi_p (iv.step))
 +  /* For linear arguments, the analyze phase should have saved
 +  the base and step in STMT_VINFO_SIMD_CLONE_INFO.  */
 +  if (i * 2 + 3 = STMT_VINFO_SIMD_CLONE_INFO (stmt_info).length ()
 +STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2])
 + {
 +   gcc_assert (vec_stmt);
 +   thisarginfo.linear_step
 + = tree_to_shwi (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]);
 +   thisarginfo.op
 + = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 1];
 +   /* If loop has been peeled for alignment, we need to adjust it.  */
 +   tree n1 = LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo);
 +   tree n2 = LOOP_VINFO_NITERS (loop_vinfo);
 +   if (n1 != n2)
 + {
 +   tree bias = fold_build2 (MINUS_EXPR, TREE_TYPE (n1), n1, n2);
 +   tree step = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2];
 +   tree opt = TREE_TYPE (thisarginfo.op);
 +   bias = fold_convert (TREE_TYPE (step), bias);
 +   bias = fold_build2 (MULT_EXPR, TREE_TYPE (step), bias, step);
 +   thisarginfo.op
 + = fold_build2 (POINTER_TYPE_P (opt)
 +? POINTER_PLUS_EXPR : PLUS_EXPR, opt,
 +thisarginfo.op, bias);
 + }
 + }
 +  else if (!vec_stmt
 + thisarginfo.dt != vect_constant_def
 + thisarginfo.dt != vect_external_def
 + loop_vinfo
 + TREE_CODE (op) == SSA_NAME
 + simple_iv (loop, loop_containing_stmt (stmt), op,
 +  iv, false)
 + tree_fits_shwi_p (iv.step))
   {
 thisarginfo.linear_step = tree_to_shwi (iv.step);
 thisarginfo.op = iv.base;
 @@ -2735,8 +2763,8 @@ vectorizable_simd_clone_call (gimple stm
  
unsigned int badness = 0;

Re: [PATCH] Fix expansion with COMPOUND_LITERAL_EXPR (PR middle-end/64067)

2014-11-27 Thread Richard Biener
On Wed, 26 Nov 2014, Jakub Jelinek wrote:

 Hi!
 
 The testcase shows that when expanding ARRAY_REF from a const static var
 with initializer that contains COMPOUND_LITERAL_EXPR, we can try to expand
 COMPOUND_LITERAL_EXPR even when modifier is not EXPAND_INITIALIZER (it is
 EXPAND_SUM in that testcase, but could be many others).
 While gimplification should get rid of all the compound literal exprs
 for automatic compound literals, COMPOUND_LITERAL_EXPR for static compound
 literals can survive in the initializers, thus this patch handles them
 always.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.  (I wonder if the fallthru still makes sense or if we should
hit a gcc_unreachable () for other uses of COMPOUND_LITERAL_EXPR).

Thanks,
Richard.

 2014-11-26  Jakub Jelinek  ja...@redhat.com
 
   PR middle-end/64067
   * expr.c (expand_expr_addr_expr_1) case COMPOUND_LITERAL_EXPR:
   Handle it by returning address of COMPOUND_LITERAL_EXPR_DECL
   not only if modifier is EXPAND_INITIALIZER, but whenever
   COMPOUND_LITERAL_EXPR_DECL is non-NULL and TREE_STATIC.
 
   * gcc.c-torture/compile/pr64067.c: New test.
 
 --- gcc/expr.c.jj 2014-11-18 08:26:45.0 +0100
 +++ gcc/expr.c2014-11-26 17:38:52.877675088 +0100
 @@ -7677,11 +7677,13 @@ expand_expr_addr_expr_1 (tree exp, rtx t
break;
  
  case COMPOUND_LITERAL_EXPR:
 -  /* Allow COMPOUND_LITERAL_EXPR in initializers, if e.g.
 -  rtl_for_decl_init is called on DECL_INITIAL with
 -  COMPOUNT_LITERAL_EXPRs in it, they aren't gimplified.  */
 -  if (modifier == EXPAND_INITIALIZER
 -COMPOUND_LITERAL_EXPR_DECL (exp))
 +  /* Allow COMPOUND_LITERAL_EXPR in initializers or coming from
 +  initializers, if e.g. rtl_for_decl_init is called on DECL_INITIAL
 +  with COMPOUND_LITERAL_EXPRs in it, or ARRAY_REF on a const static
 +  array with address of COMPOUND_LITERAL_EXPR in DECL_INITIAL;
 +  the initializers aren't gimplified.  */
 +  if (COMPOUND_LITERAL_EXPR_DECL (exp)
 +TREE_STATIC (COMPOUND_LITERAL_EXPR_DECL (exp)))
   return expand_expr_addr_expr_1 (COMPOUND_LITERAL_EXPR_DECL (exp),
   target, tmode, modifier, as);
/* FALLTHRU */
 --- gcc/testsuite/gcc.c-torture/compile/pr64067.c.jj  2014-11-26 
 17:37:37.835865797 +0100
 +++ gcc/testsuite/gcc.c-torture/compile/pr64067.c 2014-11-26 
 17:37:02.0 +0100
 @@ -0,0 +1,10 @@
 +/* PR middle-end/64067 */
 +
 +struct S { int s; };
 +int *const v[1] = { ((struct S) { .s = 42 }).s };
 +
 +int *
 +foo (void)
 +{
 +  return v[0];
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany


Re: [PATCH] Fix find_base_term in 32-bit -fpic code (PR lto/64025, take 2)

2014-11-27 Thread Richard Biener
On Wed, 26 Nov 2014, Jakub Jelinek wrote:

 On Tue, Nov 25, 2014 at 09:20:13AM +0100, Jakub Jelinek wrote:
  Actually, thinking about it more, at least according to
  commutative_operand_precedence the canonical order is
  what we used to return (i.e. (something - _G_O_T_) + (symbol_ref)
  or
  (something - _G_O_T_) + (const (symbol_ref +- const))
  So perhaps better fix is to follow find_base_value, which does something
  like:
  /* Guess which operand is the base address:
 If either operand is a symbol, then it is the base.  If
 either operand is a CONST_INT, then the other is the base.  */
  if (CONST_INT_P (src_1) || CONSTANT_P (src_0))
return find_base_value (src_0);
  else if (CONST_INT_P (src_0) || CONSTANT_P (src_1))
return find_base_value (src_1);
  and do something similar in find_base_term too.  I.e. perhaps even with
  higher precedence over REG_P with REG_POINTER (or lower, in these cases
  it doesn't really matter, neither argument is REG_P), choose first
  operand that is CONSTANT_P and not CONST_INT_P.

Yeah, that makes sense.

 Here it is.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
 trunk?

Ok.

Thanks,
Richard.

 2014-11-26  Jakub Jelinek  ja...@redhat.com
 
   PR lto/64025
   * alias.c (find_base_term): Use std::swap.  Prefer tmp2
   if it is CONSTANT_P other than CONST_INT.
 
 --- gcc/alias.c.jj2014-11-21 10:17:17.0 +0100
 +++ gcc/alias.c   2014-11-26 12:31:24.719485590 +0100
 @@ -1756,11 +1756,11 @@ find_base_term (rtx x)
   if (REG_P (tmp1)  REG_POINTER (tmp1))
 ;
   else if (REG_P (tmp2)  REG_POINTER (tmp2))
 -   {
 - rtx tem = tmp1;
 - tmp1 = tmp2;
 - tmp2 = tem;
 -   }
 +   std::swap (tmp1, tmp2);
 + /* If second argument is constant which has base term, prefer it
 +over variable tmp1.  See PR64025.  */
 + else if (CONSTANT_P (tmp2)  !CONST_INT_P (tmp2))
 +   std::swap (tmp1, tmp2);
  
   /* Go ahead and find the base term for both operands.  If either base
  term is from a pointer or is a named object or a special address
 
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany


RE: [Patch, ARM, ping1] Fix PR target/56846

2014-11-27 Thread Thomas Preud'homme


 -Original Message-
 From: Richard Biener [mailto:rguent...@suse.de]
 Sent: Thursday, November 27, 2014 9:57 AM
 To: Ramana Radhakrishnan
 Cc: Thomas Preud'homme; 'Jonathan Wakely'; Jakub Jelinek; Tony Wang;
 gcc-patches@gcc.gnu.org; d...@debian.org; aph-
 g...@littlepinkcloud.com; Richard Earnshaw; libstd...@gcc.gnu.org
 Subject: Re: [Patch, ARM, ping1] Fix PR target/56846
 
 On Thu, 27 Nov 2014, Ramana Radhakrishnan wrote:
 
 
 
  On 27/11/14 09:34, Richard Biener wrote:
   On Thu, 27 Nov 2014, Thomas Preud'homme wrote:
  
Thanks. Ccing release manager for their opinion.
  
   It doesn't look ARM specific and frankly I have not too much
 expertise
   in this area.  The patch has been on trunk for more than two months
   though so I guess it is ok to backport.
  
 
  It is ARM specific because the whole thing sits in a #ifdef
  __ARM_EABI_UNWINDER__ in eh_personality.cc.
 
 Ah, too little patch context then.

Sorry, my bad. Below is the patch that were sent by Tony at that time with 20 
lines of context:

diff --git a/libstdc++-v3/libsupc++/eh_personality.cc 
b/libstdc++-v3/libsupc++/eh_personality.cc
index f315a83..cb4467a 100644
--- a/libstdc++-v3/libsupc++/eh_personality.cc
+++ b/libstdc++-v3/libsupc++/eh_personality.cc
@@ -361,40 +361,46 @@ PERSONALITY_FUNCTION (int version,
 found_cleanup,
 found_handler
   } found_type;
 
   lsda_header_info info;
   const unsigned char *language_specific_data;
   const unsigned char *action_record;
   const unsigned char *p;
   _Unwind_Ptr landing_pad, ip;
   int handler_switch_value;
   void* thrown_ptr = 0;
   bool foreign_exception;
   int ip_before_insn = 0;
 
 #ifdef __ARM_EABI_UNWINDER__
   _Unwind_Action actions;
 
   switch (state  _US_ACTION_MASK)
 {
 case _US_VIRTUAL_UNWIND_FRAME:
+  // If the unwind state pattern is
+  // _US_VIRTUAL_UNWIND_FRAME | _US_FORCE_UNWIND
+  // then we don't need to search for any handler as it is not a real
+  // exception. Just unwind the stack.
+  if (state  _US_FORCE_UNWIND)
+   CONTINUE_UNWINDING;
   actions = _UA_SEARCH_PHASE;
   break;
 
 case _US_UNWIND_FRAME_STARTING:
   actions = _UA_CLEANUP_PHASE;
   if (!(state  _US_FORCE_UNWIND)
   ue_header-barrier_cache.sp == _Unwind_GetGR(context,
  UNWIND_STACK_REG))
actions |= _UA_HANDLER_FRAME;
   break;
 
 case _US_UNWIND_FRAME_RESUME:
   CONTINUE_UNWINDING;
   break;
 
 default:
   std::abort();
 }
   actions |= state  _US_FORCE_UNWIND;

Best regards,

Thomas





Re: [PATCH] Fix expansion with COMPOUND_LITERAL_EXPR (PR middle-end/64067)

2014-11-27 Thread Jakub Jelinek
On Thu, Nov 27, 2014 at 11:03:20AM +0100, Richard Biener wrote:
  The testcase shows that when expanding ARRAY_REF from a const static var
  with initializer that contains COMPOUND_LITERAL_EXPR, we can try to expand
  COMPOUND_LITERAL_EXPR even when modifier is not EXPAND_INITIALIZER (it is
  EXPAND_SUM in that testcase, but could be many others).
  While gimplification should get rid of all the compound literal exprs
  for automatic compound literals, COMPOUND_LITERAL_EXPR for static compound
  literals can survive in the initializers, thus this patch handles them
  always.
  
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Ok.  (I wonder if the fallthru still makes sense or if we should
 hit a gcc_unreachable () for other uses of COMPOUND_LITERAL_EXPR).

I think the fallthru can't work, because it will hit the gcc_unreachable ();
on unhandled COMPOUND_LITERAL_EXPR in expand_expr_real_*.
So changing the if in the patch into gcc_assert and removing /* FALLTHRU */
might be ok too (and removing == COMPOUND_LITERAL_EXPR from the fallthru
if).

  2014-11-26  Jakub Jelinek  ja...@redhat.com
  
  PR middle-end/64067
  * expr.c (expand_expr_addr_expr_1) case COMPOUND_LITERAL_EXPR:
  Handle it by returning address of COMPOUND_LITERAL_EXPR_DECL
  not only if modifier is EXPAND_INITIALIZER, but whenever
  COMPOUND_LITERAL_EXPR_DECL is non-NULL and TREE_STATIC.
  
  * gcc.c-torture/compile/pr64067.c: New test.
  
  --- gcc/expr.c.jj   2014-11-18 08:26:45.0 +0100
  +++ gcc/expr.c  2014-11-26 17:38:52.877675088 +0100
  @@ -7677,11 +7677,13 @@ expand_expr_addr_expr_1 (tree exp, rtx t
 break;
   
   case COMPOUND_LITERAL_EXPR:
  -  /* Allow COMPOUND_LITERAL_EXPR in initializers, if e.g.
  -rtl_for_decl_init is called on DECL_INITIAL with
  -COMPOUNT_LITERAL_EXPRs in it, they aren't gimplified.  */
  -  if (modifier == EXPAND_INITIALIZER
  -  COMPOUND_LITERAL_EXPR_DECL (exp))
  +  /* Allow COMPOUND_LITERAL_EXPR in initializers or coming from
  +initializers, if e.g. rtl_for_decl_init is called on DECL_INITIAL
  +with COMPOUND_LITERAL_EXPRs in it, or ARRAY_REF on a const static
  +array with address of COMPOUND_LITERAL_EXPR in DECL_INITIAL;
  +the initializers aren't gimplified.  */
  +  if (COMPOUND_LITERAL_EXPR_DECL (exp)
  +  TREE_STATIC (COMPOUND_LITERAL_EXPR_DECL (exp)))
  return expand_expr_addr_expr_1 (COMPOUND_LITERAL_EXPR_DECL (exp),
  target, tmode, modifier, as);
 /* FALLTHRU */

Jakub


Re: [PATCH 1/3, ARM, libgcc, ping5] Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc

2014-11-27 Thread Ramana Radhakrishnan
On Thu, Nov 13, 2014 at 4:03 PM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 [Taking over Tony's patch]

 Ping?

 Best regards,

 Thomas

 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Tony Wang
 Sent: Thursday, August 21, 2014 7:15 AM
 To: gcc-patches@gcc.gnu.org
 Subject: [PATCH 1/3,ARM,libgcc]Code size optimization for the fmul/fdiv
 and dmul/ddiv function in libgcc

 Hi there,

 In libgcc the file ieee754-sf.S and ieee754-df.S have some function pairs
 which will be bundled into one .o
 file and sharing the same .text section. For example, the fmul and fdiv,
 the libgcc makefile will build them
 into one .o file and archived into libgcc.a. So when user only call single
 float point multiply functions, the
 fdiv function will also be linked, and as fmul and fdiv share the same .text
 section, linker option
 --gc-sections or -flot can't remove the dead code.

 So this optimization just separates the function pair(fmul/fdiv and
 dmul/ddiv) into different sections,
 following the naming pattern of -ffunction-
 sections(.text.__functionname), through which the unused sections
 of fdiv/ddiv can be eliminated through option --gcc-sections when users
 only use fmul/dmul.The solution is to
 add a conditional statement in the macro FUNC_START, which will
 conditional change the section of a function
 from .text to .text.__\name. when compiling with the L_arm_muldivsf3
 or L_arm_muldivdf3 macro.

 GCC regression test has been done on QEMU for Cortex-M3. No new
 regressions when turn on this patch.

 The code reduction for thumb2 on cortex-m3 is:
 1. When user only use single float point multiply:
 fmul+fdiv = fmul will have a code size reduction of 318 bytes.

 2. When user only use double float point multiply:
 dmul+ddiv = dmul will have a code size reduction of 474 bytes.

 Ok for trunk?

This is OK.


Ramana

 BR,
 Tony

 Step 1: Provide another option: sp-scetion to control whether to split the
 section of a function pair into two
 part.

 gcc/libgcc/ChangeLog:
 2014-08-21  Tony Wang  tony.w...@arm.com

 * config/arm/lib1funcs.S (FUNC_START): Add conditional section
 redefine for macro L_arm_muldivsf3 and L_arm_muldivdf3
 (SYM_END, ARM_SYM_START): Add macros used to expose function
 Symbols

 diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
 index b617137..0f87111 100644
 --- a/libgcc/config/arm/lib1funcs.S
 +++ b/libgcc/config/arm/lib1funcs.S
 @@ -418,8 +418,12 @@ SYM (\name):
  #define THUMB_SYNTAX
  #endif

 -.macro FUNC_START name
 +.macro FUNC_START name sp_section=
 +  .ifc \sp_section, function_section
 + .section.text.__\name,ax,%progbits
 +  .else
   .text
 +  .endif
   .globl SYM (__\name)
   TYPE (__\name)
   .align 0
 @@ -429,14 +433,24 @@ SYM (\name):
  SYM (__\name):
  .endm

 +.macro ARM_SYM_START name
 +   TYPE (\name)
 +   .align 0
 +SYM (\name):
 +.endm
 +
 +.macro SYM_END name
 +   SIZE (\name)
 +.endm
 +
  /* Special function that will always be coded in ARM assembly, even if
 in Thumb-only compilation.  */

  #if defined(__thumb2__)

  /* For Thumb-2 we build everything in thumb mode.  */
 -.macro ARM_FUNC_START name
 -   FUNC_START \name
 +.macro ARM_FUNC_START name sp_section=
 +   FUNC_START \name \sp_section
 .syntax unified
  .endm
  #define EQUIV .thumb_set
 @@ -467,8 +481,12 @@ _L__\name:
  #ifdef __ARM_ARCH_6M__
  #define EQUIV .thumb_set
  #else
 -.macro   ARM_FUNC_START name
 +.macro   ARM_FUNC_START name sp_section=
 +  .ifc \sp_section, function_section
 + .section.text.__\name,ax,%progbits
 +  .else
   .text
 +  .endif
   .globl SYM (__\name)
   TYPE (__\name)
   .align 0





Re: [PATCH][ARM] Fix PR59593/PR63742: make consttable_{1,2} available to ARM

2014-11-27 Thread Ramana Radhakrishnan
On Tue, Nov 11, 2014 at 3:40 PM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 Fixed the subject.

 Best regards,

 Thomas

 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
 Sent: Tuesday, November 11, 2014 3:31 PM
 To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan; Richard Earnshaw
 Subject: [PATCH][ARM] Fix PR59593/PR63742: arm *movhi_insn_arch4
 pattern for big-endian

 Currently, constant pool entries for QImode, HImode and SImode values
 are filled as 32-bit quantities. This works fine for little endian system but
 gives some incorrect results for big endian system when the value is
 *accessed* with a mode smaller than 32-bit in size. Suppose the case of
 the value 0x1234 that is accessed as an HImode value. It would be output
 as 0x0 0x0 0x12 0x34 in a constant pool entry and the 16-bit load that
 would be done would lead to the value 0x0 in register.

 This patch makes the consttable_1 and consttable_2 pattern available to
 ARM as well so that values are output according to their mode. This is a
 different approach than the one proposed by Felix Yang at [1].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00258.html

 ChangeLog entries are as follows:

 *** gcc/ChangeLog ***

 2014-09-03  Thomas Preud'homme  thomas.preudho...@arm.com

   PR target/59593
   * config/arm/arm.c (dump_minipool): dispatch to consttable
 pattern
   based on mode size.
   * config/arm/arm.md (consttable_1): Move from
 config/arm/thumb1.md and
   make it TARGET_EITHER.
   (consttable_2): Move from config/arm/thumb1.md, make it
 TARGET_EITHER
   and move HFmode handling from consttable_4 to it.
   (consttable_4): Move HFmode handling to consttable_2 pattern.
   * config/arm/thumb1.md (consttable_1): Move to
 config/arm/arm.md.
   (consttable_2): Ditto.

 *** gcc/testsuite/ChangeLog ***

 2014-10-08  Thomas Preud'homme  thomas.preudho...@arm.com

   PR target/59593
   * gcc.target/arm/constant-pool.c: New test.


 diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
 index d8bfda3..63babd2 100644
 --- a/gcc/config/arm/arm.c
 +++ b/gcc/config/arm/arm.c
 @@ -16608,7 +16608,7 @@ dump_minipool (rtx_insn *scan)
 fputc ('\n', dump_file);
   }

 -   switch (mp-fix_size)
 +   switch (GET_MODE_SIZE (mp-mode))
   {
  #ifdef HAVE_consttable_1
   case 1:
 diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
 index cd9ab6c..8ca88b6 100644
 --- a/gcc/config/arm/arm.md
 +++ b/gcc/config/arm/arm.md
 @@ -10567,6 +10567,42 @@
[(set_attr type no_insn)]
  )

 +(define_insn consttable_1
 +  [(unspec_volatile [(match_operand 0  )] VUNSPEC_POOL_1)]
 +  TARGET_EITHER
 +  *
 +  making_const_table = TRUE;
 +  assemble_integer (operands[0], 1, BITS_PER_WORD, 1);
 +  assemble_zeros (3);
 +  return \\;
 +  
 +  [(set_attr length 4)
 +   (set_attr type no_insn)]
 +)
 +
 +(define_insn consttable_2
 +  [(unspec_volatile [(match_operand 0  )] VUNSPEC_POOL_2)]
 +  TARGET_EITHER
 +  *
 +  {
 +rtx x = operands[0];
 +making_const_table = TRUE;
 +switch (GET_MODE_CLASS (GET_MODE (x)))
 +  {
 +  case MODE_FLOAT:
 + arm_emit_fp16_const (x);
 + break;
 +  default:
 + assemble_integer (operands[0], 2, BITS_PER_WORD, 1);
 + assemble_zeros (2);
 + break;
 +  }
 +return \\;
 +  }
 +  [(set_attr length 4)
 +   (set_attr type no_insn)]
 +)
 +
  (define_insn consttable_4
[(unspec_volatile [(match_operand 0  )] VUNSPEC_POOL_4)]
TARGET_EITHER
 @@ -10577,15 +10613,12 @@
  switch (GET_MODE_CLASS (GET_MODE (x)))
{
case MODE_FLOAT:
 - if (GET_MODE (x) == HFmode)
 -   arm_emit_fp16_const (x);
 - else
 -   {
 - REAL_VALUE_TYPE r;
 - REAL_VALUE_FROM_CONST_DOUBLE (r, x);
 - assemble_real (r, GET_MODE (x), BITS_PER_WORD);
 -   }
 - break;
 + {
 +   REAL_VALUE_TYPE r;
 +   REAL_VALUE_FROM_CONST_DOUBLE (r, x);
 +   assemble_real (r, GET_MODE (x), BITS_PER_WORD);
 +   break;
 + }
default:
   /* XXX: Sometimes gcc does something really dumb and ends up
 with
  a HIGH in a constant pool entry, usually because it's trying to
 diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
 index 020d83b..80d470b 100644
 --- a/gcc/config/arm/thumb1.md
 +++ b/gcc/config/arm/thumb1.md
 @@ -1735,33 +1735,6 @@
 (set_attr conds clob)]
  )

 -(define_insn consttable_1
 -  [(unspec_volatile [(match_operand 0  )] VUNSPEC_POOL_1)]
 -  TARGET_THUMB1
 -  *
 -  making_const_table = TRUE;
 -  assemble_integer (operands[0], 1, BITS_PER_WORD, 1);
 -  assemble_zeros (3);
 -  return \\;
 -  
 -  [(set_attr length 4)
 -   (set_attr type no_insn)]
 -)
 -
 -(define_insn consttable_2
 -  [(unspec_volatile [(match_operand 0  )] VUNSPEC_POOL_2)]
 -  TARGET_THUMB1
 -  *
 -  making_const_table = TRUE;
 -  gcc_assert (GET_MODE_CLASS 

Re: [PATCH][ARM] Add -mcpu=cortex-a17.cortex-a7

2014-11-27 Thread Ramana Radhakrishnan
On Tue, Nov 18, 2014 at 10:40 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 Following up from adding Cortex-A17 support this patch adds a big.LITTLE
 option cortex-a17.cortex-a7.
 Similar to the existing cortex-a15.cortex-a7 support we schedule for
 Cortex-A7 and make the other tuning
 decisions as for Cortex-A17.

 Tested arm-none-eabi.

 Ok for trunk?

Ok when the rest go in.

Ramana


 Thanks,
 Kyrill

 2014-11-18  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm-cores.def (cortex-a17.cortex-a7): New entry.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm-tune.md: Regenerate.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Add mcpu=cortex-a17.cortex-a7.
 * config/arm/t-aprofile: Add cortex-a17.cortex-a7 entry to
 MULTILIB_MATCHES.


Re: [PATCH][ARM] Optimize copysign/copysignf for soft-float using BFI

2014-11-27 Thread Ramana Radhakrishnan
On Wed, Oct 29, 2014 at 10:20 AM, Jiong Wang jiong.w...@arm.com wrote:
 On 26/08/14 13:36, Richard Earnshaw wrote:

 On 29/07/14 15:49, Jiong Wang wrote:

 test done
 ===
 no regression on the full toolchain test on arm-none-eabi.


 ok to install?

 Hmm, I think this is wrong for DF mode.  The principle the patch works
 on is by tying the output to the value containing the sign bit, and then
 copying the rest of the other value into that value.  However, for DF
 mode it only copies 31 of the 63 bits needed; the least significant 32
 bits of the mantissa are not copied over.

 R.


 updated the patch. fixed the DF mode bug.

 no regression on arm-none-eabi multi-lib test.

 ok to trunk?

 gcc/
   * config/arm/arm.md (copysignsf3): New define_expand for SImode.

New pattern.

   (copysigndf3): New define_expand for DImode.

Likewise.

Ok for trunk if no regressions.

Sorry about the slow review.

Ramana



 gcc/testsuite/
   * gcc.target/arm/copysign_softfloat_1.c: New copysign/copysignf testcase
 for soft-float.


Re: [PATCH, ARM] attribute target (thumb,arm) [0/6]

2014-11-27 Thread Ramana Radhakrishnan
On Wed, Nov 19, 2014 at 2:54 PM, Christian Bruel christian.br...@st.com wrote:


 On 11/19/2014 03:18 PM, Ramana Radhakrishnan wrote:

 On Wed, Nov 19, 2014 at 1:24 PM, Christian Bruel christian.br...@st.com
 wrote:

 I think I missed the stage3, Anyway would it be OK for stage1 when it
 reopens ?


 Since you submitted this well during stage1 and given that these
 patches address comments from earlier in the review process we should
 aim to get these in for 5.0. If at some point during the review
 process it looks risky and there needs to be significant rework we can
 always stop. It's on the list of patches to be reviewed and I will
 find some dedicated time later this week to set down and review / play
 with the patches in an attempt to move this forward as it is a
 reasonably large chunk of work.


 Thanks, also I forgot to mention that you need
 https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01231.html
 to play with the attribute. Will be part of the same shot.

Isn't that Ok'd for committing ? Why then isn't it applied ?

Ramana


 Christian



 Thanks for continuing to work on these patches and addressing the
 earlier review comments.

 regards
 Ramana


 Christian




Re: [PING][PATCH] Change contrib/test_installed for testing cross compilers

2014-11-27 Thread Alan Lawrence
Well, FWIW, I was able to do parallel out-of-tree testing using only 
contrib/test_installed and the Makefile from build/gcc/ (plus touch 
config.status and some fiddling around with paths). Again I needed both 
target_triplet and also target_list, but that may be particular to my setup!!


--Alan

Jeff Law wrote:

On 11/24/14 09:51, Alan Lawrence wrote:

Having just been experimenting with testing of installed compilers - yes
something like this could be useful, however: to do cross-testing I
found I also (a) had to set my target_list; so either an extra flag for
that, or maybe just a generic 'extra_site_flags' parameter?
(b) I had to set up some boards...so maybe could have got there with the
--tmpdir flag, ok;
(c) lost all the parallelism provided by the Makefile in build/gcc. It
should be possible to use the (check-parallel-xxx rules from) Makefile
in conjunction with the site.exp from contrib/test_installed, haven't
got that far yet...

This does leave me wondering (1) whether a one-step $ test_installed
is feasible, or a two-stage setup and then run is inevitable; (2)
whether having all that parallelism expressed in the Makefile is the
best place for it. Not that I have an alternative proposal at this point...
It might be inevitable to have a two stage setup.  Red Hat does 
installed compiler testing and I'm sure increased parallelism for 
install testing would be appreciated by the team doing that work.


As for the --target, change itself, seems reasonable.

Jeff






Re: [PATCH][ARM] Add Cortex-A17 support

2014-11-27 Thread Kyrill Tkachov


On 27/11/14 08:52, Ramana Radhakrishnan wrote:

On Thu, Nov 13, 2014 at 5:25 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

This patch adds support for the Cortex-A17 processor to the arm backend.
Cortex-A17 is an ARMv7ve core with the same architectural features as the
Cortex-A7, A12 and A15 cores.
The -m{tune, cpu}=cortex-a17 option is added and a pipeline description for
instruction scheduling is provided.
This has given an uplift over -mcpu=cortex-a15 tuning on a number of
benchmarks.

The patch is fairly self-contained with the bulk of the diffstat being the
pipeline description files.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?


Ok with documentation update in invoke.texi and an entry for changes.html


Hi Ramana,

Thanks for the review. The invoke.texi changes were posted and approved 
at https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02683.html

and I'm attaching the proposed changes.html patch.

Kyrill



Ramana


Thanks,
Kyrill

2014-11-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm.md (generic_sched): Specify cortexa17 in 'no' list.
 Include cortex-a17.md.
 * config/arm/arm.c (arm_issue_rate): Specify 2 for cortexa17.
 * config/arm/arm-cores.def (cortex-a17): New entry.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm-tune.md: Regenerate.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Specify mcpu=cortex-a17.
 * config/arm/cortex-a17.md: New file.
 * config/arm/cortex-a17-neon.md: New file.
 * config/arm/driver-arm.c (arm_cpu_table): Add entry for cortex-a17.
 * config/arm/t-aprofile: Add cortex-a17 entries to MULTILIB_MATCHES.
Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.43
diff -U 3 -r1.43 changes.html
--- htdocs/gcc-5/changes.html	27 Nov 2014 09:50:04 -	1.43
+++ htdocs/gcc-5/changes.html	27 Nov 2014 09:57:19 -
@@ -398,6 +398,14 @@
   configure option with either of code--with-tune/code or
   code--with-arch/code.
   /li
+  li Support for the Cortex-A17 processor has been added through the
+  code-mcpu=cortex-a17/code and code-mtune=cortex-a17/code options.
+  /li
+  li Initial big.LITTLE tuning support for the combination of Cortex-A17
+   and Cortex-A7 was added through the code-mcpu=cortex-a17.cortex-a7
+   /code and code-mtune=cortex-a17.cortex-a7/code options.
+  /li
+
  /ul
 
 h3 id=x86IA-32/x86-64/h3

Re: [PATCH][ARM] Add Cortex-A17 support

2014-11-27 Thread Ramana Radhakrishnan



On 27/11/14 11:09, Kyrill Tkachov wrote:


On 27/11/14 08:52, Ramana Radhakrishnan wrote:

On Thu, Nov 13, 2014 at 5:25 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

This patch adds support for the Cortex-A17 processor to the arm backend.
Cortex-A17 is an ARMv7ve core with the same architectural features as the
Cortex-A7, A12 and A15 cores.
The -m{tune, cpu}=cortex-a17 option is added and a pipeline description for
instruction scheduling is provided.
This has given an uplift over -mcpu=cortex-a15 tuning on a number of
benchmarks.

The patch is fairly self-contained with the bulk of the diffstat being the
pipeline description files.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?


Ok with documentation update in invoke.texi and an entry for changes.html


Hi Ramana,

Thanks for the review. The invoke.texi changes were posted and approved
at https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02683.html
and I'm attaching the proposed changes.html patch.


Ok thanks.

Ramana



Kyrill



Ramana


Thanks,
Kyrill

2014-11-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/arm/arm.md (generic_sched): Specify cortexa17 in 'no' list.
  Include cortex-a17.md.
  * config/arm/arm.c (arm_issue_rate): Specify 2 for cortexa17.
  * config/arm/arm-cores.def (cortex-a17): New entry.
  * config/arm/arm-tables.opt: Regenerate.
  * config/arm/arm-tune.md: Regenerate.
  * config/arm/bpabi.h (BE8_LINK_SPEC): Specify mcpu=cortex-a17.
  * config/arm/cortex-a17.md: New file.
  * config/arm/cortex-a17-neon.md: New file.
  * config/arm/driver-arm.c (arm_cpu_table): Add entry for cortex-a17.
  * config/arm/t-aprofile: Add cortex-a17 entries to MULTILIB_MATCHES.




[RFC PATCH, i386]: Prefer %ebx in set_got patterns

2014-11-27 Thread Uros Bizjak
Hello!

Attached patch helps RA to choose the most appropriate PIC register by
changing the register preference for set_got patterns. Using this
patch, there should really be a reason for RA to avoid ABI mandated
hard PIC reg.

This patch avoids many mov %exx,%ebx in front of the calls, that
happen with unpatched compiler even with Vladimir's latest RA patch to
avoid duplicated PIC registers.

As a smoke test, I have checked 32bit libgo.so.6.0.0 library, where now we have:

[uros@omen7 .libs]$ grep thunk.bx aaa | wc -l
7693
[uros@omen7 .libs]$ grep thunk.ax aaa | wc -l
10
[uros@omen7 .libs]$ grep thunk.cx aaa | wc -l
4
[uros@omen7 .libs]$ grep thunk.dx aaa | wc -l
8
[uros@omen7 .libs]$ grep thunk.bp aaa | wc -l
497
[uros@omen7 .libs]$ grep thunk.si aaa | wc -l
145
[uros@omen7 .libs]$ grep thunk.di aaa | wc -l
198

2014-11-27  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.md (set_got): Use =b,?r constraint for operand 0.
(set_got_labelled): Ditto.
(set_got_rex64): Ditto.
(set_rip_rex64): Ditto.
(set_got_offset_rex64): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Thoughts?

Uros
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 218111)
+++ config/i386/i386.md (working copy)
@@ -12101,7 +12101,7 @@
   ix86_expand_prologue (); DONE;)
 
 (define_insn set_got
-  [(set (match_operand:SI 0 register_operand =r)
+  [(set (match_operand:SI 0 register_operand =b,?r)
(unspec:SI [(const_int 0)] UNSPEC_SET_GOT))
(clobber (reg:CC FLAGS_REG))]
   !TARGET_64BIT
@@ -12110,7 +12110,7 @@
(set_attr length 12)])
 
 (define_insn set_got_labelled
-  [(set (match_operand:SI 0 register_operand =r)
+  [(set (match_operand:SI 0 register_operand =b,?r)
(unspec:SI [(label_ref (match_operand 1))]
 UNSPEC_SET_GOT))
(clobber (reg:CC FLAGS_REG))]
@@ -12120,7 +12120,7 @@
(set_attr length 12)])
 
 (define_insn set_got_rex64
-  [(set (match_operand:DI 0 register_operand =r)
+  [(set (match_operand:DI 0 register_operand =b,?r)
(unspec:DI [(const_int 0)] UNSPEC_SET_GOT))]
   TARGET_64BIT
   lea{q}\t{_GLOBAL_OFFSET_TABLE_(%%rip), %0|%0, _GLOBAL_OFFSET_TABLE_[rip]}
@@ -12129,7 +12129,7 @@
(set_attr mode DI)])
 
 (define_insn set_rip_rex64
-  [(set (match_operand:DI 0 register_operand =r)
+  [(set (match_operand:DI 0 register_operand =b,?r)
(unspec:DI [(label_ref (match_operand 1))] UNSPEC_SET_RIP))]
   TARGET_64BIT
   lea{q}\t{%l1(%%rip), %0|%0, %l1[rip]}
@@ -12138,7 +12138,7 @@
(set_attr mode DI)])
 
 (define_insn set_got_offset_rex64
-  [(set (match_operand:DI 0 register_operand =r)
+  [(set (match_operand:DI 0 register_operand =b,?r)
(unspec:DI
  [(label_ref (match_operand 1))]
  UNSPEC_SET_GOT_OFFSET))]


[PATCH][AARCH64]Use selected cpu's tuning when no tuning parameter is specified.

2014-11-27 Thread Renlin Li

Hi all,

We have the following code in aarch64_override_options() function.

  /* The selected cpu may be an architecture, so lookup tuning by core 
ID.  */

  if (!selected_tune)
selected_tune = all_cores[selected_cpu-core];

However, the logic here is not right any more according to our current 
code. selected_cpu will never be an architecture at this point.


This patch will correct the behaviour and use selected_cpu's tuning 
directly if selected_tune is still not decided at this point.



We have the following consequence after the change.
Previously we have the following tuning settings with the following 
command line options:
-march=armv8-a -- selected_cpu = generic, aarch64_tune_params = 
cortexa53_tunings
Nothing specified and selected_cpu == generic -- aarch64_tune_params = 
cortexa53_tunings


After the change, we got something different:
-march=armv8-a -- selected_cpu = generic, aarch64_tune_params = 
generic_tunings
Nothing specified and selected_cpu == generic -- aarch64_tune_params = 
generic_tunings


All other configuration(-march, -mcpu, -mtune) combinations should be 
unaffected.



aarch64-none-elf has been built and tested on the model, no issue.

Is it Okay for trunk?


gcc/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

* config/aarch64/aarch64.c (aarch64_parse_cpu): Don't define 
selected_tune.

(aarch64_override_options): Use selected_cpu's tuning.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1809513..0a8c303 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6613,7 +6613,6 @@ aarch64_parse_cpu (void)
   if (strlen (cpu-name) == len  strncmp (cpu-name, str, len) == 0)
 	{
 	  selected_cpu = cpu;
-	  selected_tune = cpu;
 	  aarch64_isa_flags = selected_cpu-flags;
 
 	  if (ext != NULL)
@@ -6709,9 +6708,8 @@ aarch64_override_options (void)
 
   gcc_assert (selected_cpu);
 
-  /* The selected cpu may be an architecture, so lookup tuning by core ID.  */
   if (!selected_tune)
-selected_tune = all_cores[selected_cpu-core];
+selected_tune = selected_cpu;
 
   aarch64_tune_flags = selected_tune-flags;
   aarch64_tune = selected_tune-core;

Re: [PATCH] pr61324 pr 63649 - fix crash in ipa_comdats

2014-11-27 Thread Trevor Saunders
On Tue, Nov 25, 2014 at 09:59:46PM +0100, Jan Hubicka wrote:
  From: Trevor Saunders tsaund...@mozilla.com
  
  Hi,
  
  the interesting symbol in the test case for pr61324 is __GLOBAL__sub_I_s.  
  It
  refers to nothing, and is called by nothing, however it is kept (I believe
  because of -fkeep-inline-functions).  That means ipa_comdats never tries to 
  put
 
 Aha, that explans why it is around.

 Ok, so pr 63649 is more interesting.  We start out with static ctors for
 a and b.  ipa-pure-const decides the static ctor for b is const, and so
 calls cgraph_node::set_const_flag ().  That function sets
 DECL_STATIC_CONSTRUCTOR on the static ctor for b to 0, but doesn't
 actually remove the function (I guess it relies on something else doing
 that).  Then ipa-comdats comes along and the story continues more or
 less the same as pr 61324.

  it in a comdat, and so it never ends up in the hash table.  It seems like 
  the
  simplest solution is to just check if symbol is not in the map before 
  trying to
  get the comdat it should go in, but another approach might be to keep 
  separate
  hash maps for comdat functions and functions that can't be in any comdat, 
  and
  then iterate over only the functions that belong in a comdat.
 
 Well, -fkeep-inline-functions promise you that you can call any inline 
 function from debugger.
 I suppose in this case you also want to be able to call static functions. 
 Comdat pass may
 bundle the function into comdat that is later optimized away by linker, so I 
 would say we
 just want to disable the whole comdat pass when -fkeep-inline-functions is 
 used?

So we might still want to do that, but I think fixing the above issue
may resolve this one too?

Trev

 
 Patch for that is preapproved.
 Honza
  
  boottstrapped + regtested x86_64-unknown-linux-gnu, ok?
  
  Trev
  
  gcc/
  
  * ipa-comdats.c (ipa_commdats): check if map contains symbol before
  trying to put symbol in a comdat.
  
  diff --git a/gcc/ipa-comdats.c b/gcc/ipa-comdats.c
  index af2aef8..8695a7e 100644
  --- a/gcc/ipa-comdats.c
  +++ b/gcc/ipa-comdats.c
  @@ -327,18 +327,18 @@ ipa_comdats (void)
 !symbol-alias
 symbol-real_symbol_p ())
  {
  - tree group = *map.get (symbol);
  + tree *group = map.get (symbol);
   
  - if (group == error_mark_node)
  + if (!group || *group == error_mark_node)
  continue;
if (dump_file)
  {
fprintf (dump_file, Localizing symbol\n);
symbol-dump (dump_file);
  - fprintf (dump_file, To group: %s\n, IDENTIFIER_POINTER (group));
  + fprintf (dump_file, To group: %s\n, IDENTIFIER_POINTER 
  (*group));
  }
symbol-call_for_symbol_and_aliases (set_comdat_group,
  -*comdat_head_map.get (group),
  +*comdat_head_map.get (*group),
   true);
  }
   }
  diff --git a/gcc/testsuite/g++.dg/pr61324.C b/gcc/testsuite/g++.dg/pr61324.C
  new file mode 100644
  index 000..6102574
  --- /dev/null
  +++ b/gcc/testsuite/g++.dg/pr61324.C
  @@ -0,0 +1,13 @@
  +// { dg-do compile }
  +// { dg-options -O -fkeep-inline-functions -fno-use-cxa-atexit }
  +void foo ();
  +
  +struct S
  +{
  +  ~S ()
  +  {
  +foo ();
  +  }
  +};
  +
  +S s;
  -- 
  2.1.3


Re: [PATCH] Fix regressions in libgomp testsuite: set flag_fat_lto_objects for offload

2014-11-27 Thread Ilya Verbin
On 25 Nov 10:51, Richard Biener wrote:
 In the patch adding flag_generate_offload sounds like a good solution,
 I didn't like emitting fat LTO objects unconditionally just because we 
 offload.

Here is updated patch.  Bootstrap and make check passed on i686-linux and
x86_64-linux.  Tests with offloading also passed with trunk binutils.
OK for trunk?

  -- Ilya


gcc/
* cgraphunit.c (ipa_passes): Handle flag_generate_offload.
(symbol_table::compile): Set flag_generate_offload if there is something
to offload.
* collect2.c (scan_prog_file): Look for the offload info marker symbol.
* common.opt (flag_generate_offload): New Variable declaration.
* dwarf2out.c (dwarf2out_finish): Handle flag_generate_offload.
* ipa-inline-analysis.c (inline_generate_summary): Do not skip if
flag_generate_offload is set.
* lto-streamer.c (gate_lto_out): Handle flag_generate_offload.
* passes.c (ipa_write_summaries): Do not skip if flag_generate_offload
is set.
* toplev.c (compile_file): Emit offload marker if offload info has been
previously emitted.
* tree.c (free_lang_data): Do not skip if flag_generate_offload is set.


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 2fd99a7..fed1a3e 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2075,7 +2075,7 @@ ipa_passes (void)
 }
 
   /* Some targets need to handle LTO assembler output specially.  */
-  if (flag_generate_lto)
+  if (flag_generate_lto || flag_generate_offload)
 targetm.asm_out.lto_start ();
 
   if (!in_lto_p)
@@ -2092,7 +2092,7 @@ ipa_passes (void)
}
 }
 
-  if (flag_generate_lto)
+  if (flag_generate_lto || flag_generate_offload)
 targetm.asm_out.lto_end ();
 
   if (!flag_ltrans  (in_lto_p || !flag_lto || flag_fat_lto_objects))
@@ -2176,10 +2176,10 @@ symbol_table::compile (void)
 
   /* Offloading requires LTO infrastructure.  */
   if (!in_lto_p  g-have_offload)
-flag_generate_lto = 1;
+flag_generate_offload = 1;
 
   /* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
-  if (flag_generate_lto)
+  if (flag_generate_lto || flag_generate_offload)
 lto_streamer_hooks_init ();
 
   /* Don't run the IPA passes if there was any error or sorry messages.  */
diff --git a/gcc/collect2.c b/gcc/collect2.c
index 9c3a1c5..2dcebcd 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -2392,12 +2392,16 @@ scan_prog_file (const char *prog_name, scanpass 
which_pass,
   if (found_lto)
 continue;
 
-  /* Look for the LTO info marker symbol, and add filename to
+  /* Look for the LTO or offload info marker symbol, and add filename 
to
  the LTO objects list if found.  */
   for (p = buf; (ch = *p) != '\0'  ch != '\n'; p++)
 if (ch == ' '   p[1] == '_'  p[2] == '_'
-(strncmp (p + (p[3] == '_' ? 2 : 1), __gnu_lto_v1, 12) == 
0)
-ISSPACE (p[p[3] == '_' ? 14 : 13]))
+(((strncmp (p + (p[3] == '_' ? 2 : 1),
+  __gnu_lto_v1, 12) == 0)
+ ISSPACE (p[p[3] == '_' ? 14 : 13]))
+   || ((strncmp (p + (p[3] == '_' ? 2 : 1),
+ __gnu_offload_v1, 16) == 0)
+ISSPACE (p[p[3] == '_' ? 18 : 17]
   {
 add_lto_object (lto_objects, prog_name);
 
diff --git a/gcc/common.opt b/gcc/common.opt
index 41c8d4e..752d939 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -67,6 +67,10 @@ int *param_values
 Variable
 int flag_generate_lto
 
+; Nonzero if we should write GIMPLE bytecode for offload compilation.
+Variable
+int flag_generate_offload = 0
+
 ; True to warn about any objects definitions whose size is larger
 ; than N bytes.  Also want about function definitions whose returned
 ; values are larger than N bytes, where N is 'larger_than_size'.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 25f0e7d..4ee3102 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -24420,7 +24420,8 @@ dwarf2out_finish (const char *filename)
   /* When generating LTO bytecode we can not generate new assembler
  names at this point and all important decls got theirs via
 free-lang-data.  */
-  if ((!flag_generate_lto || DECL_ASSEMBLER_NAME_SET_P (decl))
+  if (((!flag_generate_lto  !flag_generate_offload)
+  || DECL_ASSEMBLER_NAME_SET_P (decl))
   DECL_ASSEMBLER_NAME (decl) != DECL_NAME (decl))
{
  add_linkage_attr (node-die, decl);
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 2f2993c..9d62722 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -4031,7 +4031,7 @@ inline_generate_summary (void)
 
   /* When not optimizing, do not bother to analyze.  Inlining is still done
  because edge redirection needs to happen there.  */
-  if (!optimize  !flag_generate_lto  !flag_wpa)

Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)

2014-11-27 Thread Marek Polacek
On Wed, Nov 26, 2014 at 12:03:45PM -0500, Jason Merrill wrote:
 On 11/20/2014 02:04 PM, Marek Polacek wrote:
 +  if (fun == NULL_TREE)
 +switch (CALL_EXPR_IFN (t))
 +  {
 +  case IFN_UBSAN_NULL:
 +  case IFN_UBSAN_BOUNDS:
 +return void_node;
 +  default:
 +break;
 +  }
 
 Other IFNs should make the call non-constant.

Ok.
 
 -/* { dg-error is not a constant expression  { target c++ } 12 } */
 +/* { dg-warning right shift count is negative  { target c++ } 12 } */
 
 This should be an xfail (pending the delayed folding work) instead of a
 different test.

Just to check that I understand correctly, the point is that with
delayed folding we'd give an error here?

I added the xfails.

 +constexpr int n3 = fn7 ((const int *) 0, 8); // { dg-error is not a 
 constant expression|constexpr call flows off }
 
 The flows off the end error is a bug and should not be tested for. I'm
 going to check in a fix.

Fixed.

Bootstrapped/regtested on ppc64-linux.

2014-11-27  Marek Polacek  pola...@redhat.com

PR sanitizer/63956
* ubsan.c (is_ubsan_builtin_p): Check also built-in class.
cp/
* constexpr.c: Include ubsan.h.
(cxx_eval_call_expression): Bail out for IFN_UBSAN_{NULL,BOUNDS}
internal functions and for ubsan builtins.
* error.c: Include internal-fn.h.
(dump_expr): Add printing of internal functions.
testsuite/
* c-c++-common/ubsan/shift-5.c: Add xfails.
* g++.dg/ubsan/div-by-zero-1.C: Don't use -w.  Add xfail.
* g++.dg/ubsan/pr63956.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index ef9ef70..45d5959 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include gimplify.h
 #include builtins.h
 #include tree-inline.h
+#include ubsan.h
 
 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X) \
@@ -1151,6 +1152,19 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
   constexpr_call *entry;
   bool depth_ok;
 
+  if (fun == NULL_TREE)
+switch (CALL_EXPR_IFN (t))
+  {
+  case IFN_UBSAN_NULL:
+  case IFN_UBSAN_BOUNDS:
+   return void_node;
+  default:
+   if (!ctx-quiet)
+ error_at (loc, call to internal function);
+   *non_constant_p = true;
+   return t;
+  }
+
   if (TREE_CODE (fun) != FUNCTION_DECL)
 {
   /* Might be a constexpr function pointer.  */
@@ -1171,6 +1185,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
 }
   if (DECL_CLONED_FUNCTION_P (fun))
 fun = DECL_CLONED_FUNCTION (fun);
+
+  if (is_ubsan_builtin_p (fun))
+return void_node;
+
   if (is_builtin_fn (fun))
 return cxx_eval_builtin_function_call (ctx, t,
   addr, non_constant_p, overflow_p);
diff --git gcc/cp/error.c gcc/cp/error.c
index 5dcc149..ff26fb9 100644
--- gcc/cp/error.c
+++ gcc/cp/error.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include tree-pretty-print.h
 #include c-family/c-objc.h
 #include ubsan.h
+#include internal-fn.h
 
 #include new// For placement-new.
 
@@ -2039,6 +2040,14 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
tree fn = CALL_EXPR_FN (t);
bool skipfirst = false;
 
+   /* Deal with internal functions.  */
+   if (fn == NULL_TREE)
+ {
+   pp_string (pp, internal_fn_name (CALL_EXPR_IFN (t)));
+   dump_call_expr_args (pp, t, flags, skipfirst);
+   break;
+ }
+
if (TREE_CODE (fn) == ADDR_EXPR)
  fn = TREE_OPERAND (fn, 0);
 
diff --git gcc/testsuite/c-c++-common/ubsan/shift-5.c 
gcc/testsuite/c-c++-common/ubsan/shift-5.c
index 6f9c52a..d779571 100644
--- gcc/testsuite/c-c++-common/ubsan/shift-5.c
+++ gcc/testsuite/c-c++-common/ubsan/shift-5.c
@@ -2,31 +2,41 @@
 /* { dg-options -fsanitize=shift -w } */
 /* { dg-shouldfail ubsan } */
 
-int x;
 int
-foo (void)
+foo (int x)
 {
   /* None of the following should pass.  */
   switch (x)
 {
 case 1  -1:
-/* { dg-error case label does not reduce to an integer constant  {target c 
} 12 } */
-/* { dg-error is not a constant expression  { target c++ } 12 } */
+/* { dg-error case label does not reduce to an integer constant  { target 
c } 11 } */
+/* { dg-error   { xfail { *-*-* } } 11 } */
 case -1  -1:
-/* { dg-error case label does not reduce to an integer constant  {target c 
} 15 } */
-/* { dg-error is not a constant expression  { target c++ } 15 } */
+/* { dg-error case label does not reduce to an integer constant  { target 
c } 14 } */
+/* { dg-error   { xfail { *-*-* } } 14 } */
 case 1  -1:
-/* { dg-error case label does not reduce to an integer constant  {target c 
} 18 } */
-/* { dg-error is not a constant expression  { target c++ } 18 } */
+/* { dg-error case label does not reduce to an integer constant  { target 
c } 

[PING][PATCH] Ignore alignment by option

2014-11-27 Thread Marat Zakirov


On 11/19/2014 06:01 PM, Marat Zakirov wrote:

Hi all!

Here is the patch which forces ASan to ignore alignment of memory 
access. It increases ASan overhead but it's still useful because some 
programs like linux kernel often cheat with alignment which may cause 
false negatives.


--Marat


gcc/ChangeLog:

2014-11-14  Marat Zakirov  m.zaki...@samsung.com

	* asan.c (asan_expand_check_ifn): Ignore alignment by option.  
	* doc/invoke.texi: Document.  
	* params.def (asan-alignment-optimize): New.  
	* params.h: Likewise.  

gcc/testsuite/ChangeLog:

2014-11-14  Marat Zakirov  m.zaki...@samsung.com

	* c-c++-common/asan/red-align-3.c: New test.  


diff --git a/gcc/asan.c b/gcc/asan.c
index 79dede7..4f86088 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -2518,6 +2518,12 @@ asan_expand_check_ifn (gimple_stmt_iterator *iter, bool use_calls)
 
   HOST_WIDE_INT size_in_bytes
 = is_scalar_access  tree_fits_shwi_p (len) ? tree_to_shwi (len) : -1;
+  
+  if (!ASAN_ALIGNMENT_OPTIMIZE  size_in_bytes  1)
+{
+  size_in_bytes = -1;
+  align = 1;
+}
 
   if (use_calls)
 {
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 13270bc..8f43c06 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10580,6 +10580,12 @@ is greater or equal to this number, use callbacks instead of inline checks.
 E.g. to disable inline code use
 @option{--param asan-instrumentation-with-call-threshold=0}.
 
+@item asan-alignment-optimize
+Enable asan optimization for aligned accesses.
+It is enabled by default when using @option{-fsanitize=address} option.
+To disable optimization for aligned accesses use 
+@option{--param asan-alignment-optimize=0}.
+
 @item chkp-max-ctor-size
 Static constructors generated by Pointer Bounds Checker may become very
 large and significantly increase compile time at optimization level
diff --git a/gcc/params.def b/gcc/params.def
index d2d2add..fbccf46 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1114,6 +1114,11 @@ DEFPARAM (PARAM_ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD,
  in function becomes greater or equal to this number,
  7000, 0, INT_MAX)
 
+DEFPARAM (PARAM_ASAN_ALIGNMENT_OPTIMIZE,
+ asan-alignment-optimize,
+ Enable asan optimization for aligned access,
+ 1, 0, 1)
+
 DEFPARAM (PARAM_UNINIT_CONTROL_DEP_ATTEMPTS,
 	  uninit-control-dep-attempts,
 	  Maximum number of nested calls to search for control dependencies 
diff --git a/gcc/params.h b/gcc/params.h
index 4779e17..e2973d4 100644
--- a/gcc/params.h
+++ b/gcc/params.h
@@ -238,5 +238,7 @@ extern void init_param_values (int *params);
   PARAM_VALUE (PARAM_ASAN_USE_AFTER_RETURN)
 #define ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD \
   PARAM_VALUE (PARAM_ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD)
+#define ASAN_ALIGNMENT_OPTIMIZE \
+  PARAM_VALUE (PARAM_ASAN_ALIGNMENT_OPTIMIZE)
 
 #endif /* ! GCC_PARAMS_H */
diff --git a/gcc/testsuite/c-c++-common/asan/ignore_align.c b/gcc/testsuite/c-c++-common/asan/ignore_align.c
new file mode 100644
index 000..989958b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/ignore_align.c
@@ -0,0 +1,34 @@
+/* { dg-options -fdump-tree-sanopt --param asan-alignment-optimize=0 } */
+/* { dg-do compile } */
+/* { dg-skip-if  { *-*-* } { -flto } {  } } */
+
+#ifdef __cplusplus
+extern C {
+#endif
+void *memset (void *, int, __SIZE_TYPE__);
+#ifdef __cplusplus
+}
+#endif
+
+struct dummy {
+  int a;
+  int b;
+  int c;
+  int d;
+};
+
+volatile struct dummy * new_p;
+volatile struct dummy * old_p; 
+
+void foo(void)
+{
+  *(volatile char *)(0x12);
+  *(volatile short int *)(0x12);
+  *(volatile unsigned int *)(0x12);
+  *(volatile unsigned long long *)(0x12);
+  *new_p = *old_p;
+}
+
+/* { dg-final { scan-tree-dump-times  3 11 sanopt } } */
+/* { dg-final { scan-tree-dump-times  7 11 sanopt } } */
+/* { dg-final { cleanup-tree-dump sanopt } } */


[PATCH] Fix PR64088

2014-11-27 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-27  Richard Biener  rguent...@suse.de

PR middle-end/64088
* fold-const.c (const_unop): Re-instantiate missing condition
before calling fold_abs_const.

* gcc.dg/torture/pr64088.c: New testcase.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(revision 218121)
+++ gcc/fold-const.c(working copy)
@@ -1481,7 +1489,9 @@ const_unop (enum tree_code code, tree ty
   }
 
 case ABS_EXPR:
-  return fold_abs_const (arg0, type);
+  if (TREE_CODE (arg0) == INTEGER_CST || TREE_CODE (arg0) == REAL_CST)
+   return fold_abs_const (arg0, type);
+  break;
 
 case CONJ_EXPR:
   if (TREE_CODE (arg0) == COMPLEX_CST)
Index: gcc/testsuite/gcc.dg/torture/pr64088.c
===
--- gcc/testsuite/gcc.dg/torture/pr64088.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr64088.c  (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+
+extern int abs (int);
+extern void attenuate_curve(int*);
+int a;
+void
+setup_tone_curves_center_boost ()
+{
+  int b[0];
+  a = 0;
+  for (; a  6; a++)
+{
+  int c = abs (a);
+  b[a] = c;
+}
+  attenuate_curve (b);
+}


[PATCH] Fix PR64091

2014-11-27 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-27  Richard Biener  rguent...@suse.de

PR tree-optimization/64088
* tree-ssa-tail-merge.c (update_debug_stmt): After resetting
the stmt break from the loop over use operands.

* gcc.dg/torture/pr64091.c: New testcase.

Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c   (revision 218117)
+++ gcc/tree-ssa-tail-merge.c   (working copy)
@@ -1606,9 +1613,7 @@ update_debug_stmt (gimple stmt)
 {
   use_operand_p use_p;
   ssa_op_iter oi;
-  basic_block bbdef, bbuse;
-  gimple def_stmt;
-  tree name;
+  basic_block bbuse;
 
   if (!gimple_debug_bind_p (stmt))
 return;
@@ -1616,19 +1621,16 @@ update_debug_stmt (gimple stmt)
   bbuse = gimple_bb (stmt);
   FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, oi, SSA_OP_USE)
 {
-  name = USE_FROM_PTR (use_p);
-  gcc_assert (TREE_CODE (name) == SSA_NAME);
-
-  def_stmt = SSA_NAME_DEF_STMT (name);
-  gcc_assert (def_stmt != NULL);
-
-  bbdef = gimple_bb (def_stmt);
+  tree name = USE_FROM_PTR (use_p);
+  gimple def_stmt = SSA_NAME_DEF_STMT (name);
+  basic_block bbdef = gimple_bb (def_stmt);
   if (bbdef == NULL || bbuse == bbdef
  || dominated_by_p (CDI_DOMINATORS, bbuse, bbdef))
continue;
 
   gimple_debug_bind_reset_value (stmt);
   update_stmt (stmt);
+  break;
 }
 }
 
Index: gcc/testsuite/gcc.dg/torture/pr64091.c
===
--- gcc/testsuite/gcc.dg/torture/pr64091.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr64091.c  (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options -g } */
+
+extern int foo(void);
+
+int main(void)
+{
+  int i, a, b;
+
+  if (foo())
+return 0;
+
+  for (i = 0, a = 0, b = 0; i  3; i++, a++)
+  {
+if (foo())
+  break;
+
+if (b += a)
+  a = 0;
+  }
+
+  if (!a)
+return 2;
+
+  b += a;
+
+  return 0;
+}


Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-11-27 Thread Marek Polacek
On Wed, Nov 26, 2014 at 11:56:26AM -0500, Jason Merrill wrote:
 Please give diagnostics explaining what's wrong with the shift rather than
 the generic is not a constant expression.

Done.
 
 +  tree t = build_int_cst (unsigned_type_node, uprec - 1);
 +  t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs);
 +  tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs);
 +  t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t);
 +  if (tree_int_cst_lt (integer_one_node, t))
 
 This could also use a comment explaining the logic.

Ok.

How about the following then?

Bootstrapped/regtested on ppc64-linux.

2014-11-27  Marek Polacek  pola...@redhat.com

* constexpr.c (cxx_eval_check_shift_p): New function.
(cxx_eval_binary_expression): Call it.

* g++.dg/cpp0x/constexpr-shift1.C: New test.
* g++.dg/cpp1y/constexpr-shift1.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index ef9ef70..d305298 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -1451,6 +1451,48 @@ verify_constant (tree t, bool allow_non_constant, bool 
*non_constant_p,
   return *non_constant_p;
 }
 
+/* Return true if the shift operation on LHS and RHS is undefined.  */
+
+static bool
+cxx_eval_check_shift_p (enum tree_code code, tree lhs, tree rhs)
+{
+  if ((code != LSHIFT_EXPR  code != RSHIFT_EXPR)
+  || TREE_CODE (lhs) != INTEGER_CST
+  || TREE_CODE (rhs) != INTEGER_CST)
+return false;
+
+  tree lhstype = TREE_TYPE (lhs);
+  unsigned HOST_WIDE_INT uprec = TYPE_PRECISION (TREE_TYPE (lhs));
+
+  /* [expr.shift] The behavior is undefined if the right operand
+ is negative, or greater than or equal to the length in bits
+ of the promoted left operand.  */
+  if (tree_int_cst_sgn (rhs) == -1 || compare_tree_int (rhs, uprec) = 0)
+return true;
+
+  /* The value of E1  E2 is E1 left-shifted E2 bit positions; [...]
+ if E1 has a signed type and non-negative value, and E1x2^E2 is
+ representable in the corresponding unsigned type of the result type,
+ then that value, converted to the result type, is the resulting value;
+ otherwise, the behavior is undefined.  */
+  if (code == LSHIFT_EXPR  !TYPE_UNSIGNED (lhstype))
+{
+  if (tree_int_cst_sgn (lhs) == -1)
+   return true;
+  /* For signed x  y the following:
+(unsigned) x  ((prec (lhs) - 1) - y)
+if  1, is undefined.  */
+  tree t = build_int_cst (unsigned_type_node, uprec - 1);
+  t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs);
+  tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs);
+  t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t);
+  if (tree_int_cst_lt (integer_one_node, t))
+   return true;
+}
+
+  return false;
+}
+
 /* Subroutine of cxx_eval_constant_expression.
Attempt to reduce the unary expression tree T to a compile time value.
If successful, return the value.  Otherwise issue a diagnostic
@@ -1513,6 +1555,9 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, 
tree t,
   else
r = build2_loc (loc, code, type, lhs, rhs);
 }
+  else if (cxx_eval_check_shift_p (code, lhs, rhs))
+error_at (loc, shift expression %q+E overflows,
+ build2_loc (loc, code, type, lhs, rhs));
   VERIFY_CONSTANT (r);
   return r;
 }
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
index e69de29..48c2ab1 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
@@ -0,0 +1,73 @@
+// { dg-do compile { target c++11 } }
+
+constexpr int
+fn1 (int i, int j)
+{
+  return i  j; // { dg-error shift expression }
+}
+
+constexpr int i1 = fn1 (1, -1);
+
+constexpr int
+fn2 (int i, int j)
+{
+  return i  j; // { dg-error shift expression }
+}
+
+constexpr int i2 = fn2 (1, 200);
+
+constexpr int
+fn3 (int i, int j)
+{
+  return i  j; // { dg-error shift expression }
+}
+
+constexpr int i3 = fn3 (-1, 2);
+
+constexpr int
+fn4 (int i, int j)
+{
+  return i  j; // { dg-error shift expression }
+}
+
+constexpr int i4 = fn4 (__INT_MAX__, 2);
+
+constexpr int
+fn5 (int i, int j)
+{
+  return i  j;
+}
+
+constexpr int i5 = fn5 (__INT_MAX__, 1);
+
+constexpr int
+fn6 (unsigned int i, unsigned int j)
+{
+  return i  j; // { dg-error shift expression }
+}
+
+constexpr int i6 = fn6 (1, -1);
+
+constexpr int
+fn7 (int i, int j)
+{
+  return i  j; // { dg-error shift expression }
+}
+
+constexpr int i7 = fn7 (1, -1);
+
+constexpr int
+fn8 (int i, int j)
+{
+  return i  j;
+}
+
+constexpr int i8 = fn8 (-1, 1);
+
+constexpr int
+fn9 (int i, int j)
+{
+  return i  j;  // { dg-error shift expression }
+}
+
+constexpr int i9 = fn9 (1, 200);
diff --git gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C 
gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C
index e69de29..a739fd2 100644
--- gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C
@@ -0,0 +1,9 @@
+// { dg-do 

Re: [PATCH] IPA ICF: memory leak fix

2014-11-27 Thread Martin Liška

On 11/23/2014 10:01 AM, Markus Trippelsdorf wrote:

On 2014.11.22 at 17:46 +0100, Markus Trippelsdorf wrote:

On 2014.11.22 at 16:04 +0100, Martin Liška wrote:

On 11/22/2014 10:09 AM, Markus Trippelsdorf wrote:

On 2014.11.22 at 09:05 +0100, Martin Liška wrote:


Following patch removes memory leak that was introduced by very first IPA ICF 
patch.
I would like to thank David for hunting the leak.

Patch an bootstrap on x86_86-linux-pc and no regression is introduced.


I gave the patch a quick spin on gcc112:

*** Error in `/home/trippels/gcc_build_dir/./prev-gcc/lto1': free(): invalid 
next size (fast): 0x01000a5fc160 ***
=== Backtrace: =
/lib64/libc.so.6(+0xa3d9c)[0x3fff7b6b3d9c]
/lib64/libc.so.6(+0xaf0b4)[0x3fff7b6bf0b4]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN3vecIi7va_heap6vl_ptrE7releaseEv-0x1d4bc00)[0x1025dd88]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf12sem_function14equals_privateEPNS_8sem_itemER8hash_mapIP11symtab_nodeS2_22default_hashmap_traitsE-0x9c083c)[0x116586bc]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf12sem_function6equalsEPNS_8sem_itemER8hash_mapIP11symtab_nodeS2_22default_hashmap_traitsE-0x9c0578)[0x11658998]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf18sem_item_optimizer7executeEv-0x9b8774)[0x11660a84]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf12pass_ipa_icf7executeEP8function-0x9b0314)[0x11668efc]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_Z16execute_one_passP8opt_pass-0x1647588)[0x1098a0a8]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_Z21execute_ipa_pass_listP8opt_pass-0x1644c2c)[0x1098ca7c]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_Z8lto_mainv-0x1df20e4)[0x101b494c]
/home/trippels/gcc_build_dir/./prev-gcc/lto1[0x10b599b8]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN6toplev4mainEiPPc-0x1e8be70)[0x101507b8]
/home/trippels/gcc_build_dir/./prev-gcc/lto1(main-0x1ec8d8c)[0x1015493c]
/lib64/libc.so.6(+0x447ac)[0x3fff7b6547ac]
/lib64/libc.so.6(__libc_start_main-0x19cbf4)[0x3fff7b6549d4]
=== Memory map: 
...



Thank you for testing, problem is that I should grow the vector by 1, because 
'0' is used as NULL value.

Please try my fixed patch.


This one survives bootstrap-lto. Thanks.


But Firefox doesn't build:

/home/trippels/gcc_test/usr/local/bin/c++ -fPIC -Wall -Wempty-body 
-Woverloaded-virtual -Wsign-compare -Wwrite-strings -Werror=endif-labels 
-Werror=int-to-pointer-cast -Werror=missing-braces -Werror=pointer-arith 
-Werror=return-type -Werror=sequence-point -Werror=unused-label 
-Werror=trigraphs -Werror=type-limits -Wno-invalid-offsetof -Wcast-align 
-flto=160 --param lto-partitions=160 -mcpu=power8 -ffunction-sections 
-fdata-sections -fno-exceptions -fno-strict-aliasing -frtti -fno-exceptions 
-fno-math-errno -std=gnu++0x -pthread -pipe -UDEBUG -DNDEBUG -O3 
-DU_STATIC_IMPLEMENTATION -fvisibility=hidden -W -Wall -pedantic 
-Wpointer-arith -Wwrite-strings -Wno-long-long -Wno-unused 
-Wno-unused-parameter   -lpthread 
-Wl,--hash-style=gnu,--as-needed,--gc-sections,--icf=all -Wl,-z,noexecstack 
-Wl,-z,text -Wl,--build-id -Wl,--gc-sections  -o ../../bin/makeconv makeconv.o 
ucnvstat.o genmbcs.o gencnvex.o -L../../lib -licutu -L../../lib -licui18n 
-L../../lib -licuuc -L../../stubdata -licudata -

l
pthread -ldl -lm

lto1: internal compiler error: in operator[], at vec.h:736
0x10122377 vecint, va_heap, vl_embed::operator[](unsigned int)
 ../../gcc/gcc/vec.h:736
0x10d1a0f3 vecint, va_heap, vl_embed::operator[](unsigned int)
 ../../gcc/gcc/ipa-icf.c:963
0x10d1a0f3 vecint, va_heap, vl_ptr::operator[](unsigned int)
 ../../gcc/gcc/vec.h:1202
0x10d1a0f3 ipa_icf::sem_function::bb_dict_test(auto_vecint, 0ul, int, int)
 ../../gcc/gcc/ipa-icf.c:970
0x10d1aa73 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*, 
hash_mapsymtab_node*, ipa_icf::sem_item*, default_hashmap_traits)
 ../../gcc/gcc/ipa-icf.c:512
0x10d1afb3 ipa_icf::sem_function::equals(ipa_icf::sem_item*, hash_mapsymtab_node*, 
ipa_icf::sem_item*, default_hashmap_traits)
 ../../gcc/gcc/ipa-icf.c:384
0x10d1dba3 ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
 ../../gcc/gcc/ipa-icf.c:1833
0x10d2397f ipa_icf::sem_item_optimizer::execute()
 ../../gcc/gcc/ipa-icf.c:1652
0x10d24a93 ipa_icf_driver
 ../../gcc/gcc/ipa-icf.c:2382
0x10d24a93 ipa_icf::pass_ipa_icf::execute(function*)
 ../../gcc/gcc/ipa-icf.c:2430
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
lto-wrapper: fatal error: /home/trippels/gcc_test/usr/local/bin/c++ returned 1 
exit status
compilation terminated.
/home/trippels/bin/ld: fatal error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[8]: *** [../../bin/makeconv] Error 1



Hi.

Well, this final version can run profiled-bootstrap and I capable of building 
FF with 

RE: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

2014-11-27 Thread David Sherwood
 On 18 November 2014 10:14, David Sherwood david.sherw...@arm.com wrote:
  Hi Christophe,
 
  Ah sorry. My mistake - it fixes this in bugzilla:
 
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59810
 
 I did look at that PR, but since it has no testcase attached, I was unsure.
 And I am still not :-)
 PR 59810 is [AArch64] LDn/STn implementations are not ABI-conformant
 for bigendian.
 but the advsimd-intrinsics/vldX.c and vldX_lane.c now PASS with Alan's
 patches on aarch64_be, so I thought Alan's patches solve PR59810.
 
 What am I missing?

Hi Christophe,

I think probably this is our fault for making our lives way too difficult and
artificially splitting all these patches up. :)

Alan's patch:

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00952.html

fixes some issues on aarch64_be, but also causes regressions. For example,


Tests that now fail, but worked before:

aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects execution 
test
aarch64_be-elf-aem: gcc.dg/vect/slp-perm-8.c execution test
aarch64_be-elf-aem: gcc.dg/vect/vect-over-widen-1-big-array.c -flto 
-ffat-lto-objects execution test
...

Tests that now work, but didn't before:

aarch64_be-elf-aem: gcc.dg/vect/fast-math-vect-complex-3.c execution test
aarch64_be-elf-aem: gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c execution test
aarch64_be-elf-aem: gcc.dg/vect/no-scevccp-outer-10a.c execution test
...


His patch is only half of the story and must be applied at the same time as the
[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.
patch. With both patches applied the result looks much healthier:


# Comparing 1 common sum files
## /bin/sh ./src/gcc/contrib/compare_tests  /tmp/gxx-sum1.10051 
/tmp/gxx-sum2.10051
Tests that now work, but didn't before:

aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer  
execution test
aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer 
-funroll-all-loops -finline-functions  execution test
aarch64_be-elf-aem: gcc.dg/torture/pr52028.c   -O3 -fomit-frame-pointer 
-funroll-loops  execution test
...


with no new regressions. After applying both patches the aarch64_be gcc 
testsuite is
on a parity with the aarch64 testsuite. Furthermore, after applying both of 
these patches:

[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe
[AArch64] [BE] Fix vector load/stores to not use ld1/st1

it then becomes safe for us to remove the CCMC macro, which is the cause of 
unnecessary spills to the stack for certain auto-vectorised code. So really I
suppose when I posted my second patch

[AArch64] [BE] [2/2] Make large opaque integer modes endianness-safe

I should have really just called this

[AArch64] [BE] Remove CCMC for aarch64

in order to make it clear exactly what the purpose of these patches is.

Kind Regards,
David Sherwood.






[PATCH] Fix PR64084

2014-11-27 Thread Richard Biener

Currently the order in which patterns in match.pd match is pretty
random (well, it's deterministic of course).  This is bad if there
are multiple possible matches such as in

  z = x + 1;
  w = z + 0;

where we have both z + 0 - z and (x + 1) + 0 - x + (1 + 0).
Currently the latter happens to be applied and thus we simplify to

  z = x + 1;
  w = x + 1;

which is bad for propagation engines which only handle constant
or SSA name results (and also either generates redundant code or leaves
dead code).

Thus the following patch makes sure that if there are multiple
matches possible then the pattern first appearing in match.pd
is chosen.

This slightly increases the number of instructions executed during
the matching in the worst case but is certainly the best course
of action here (other possibilities are to make the patterns
disjunct by disabling the latter for the + 0 case).  At least
it matches how other generated matchers behave.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2014-11-27  Richard Biener  rguent...@suse.de

PR middle-end/64084
* genmatch.c (dt_node::gen_kids_1): New function, split out
from dt_node::gen_kids.
(decision_tree::cmp_node): DT_TRUE are generally not equal.
(decision_tree::find_node): Treat DT_TRUE as barrier for
node CSE on the same level.
(dt_node::append_node): Do not keep DT_TRUE last.
(dt_node::gen_kids): Emit code after each DT_TRUE node seen.

* gcc.dg/tree-ssa/ssa-ccp-34.c: New testcase.
* gcc.dg/tree-ssa/forwprop-31.c: Likewise.

Index: gcc/genmatch.c
===
*** gcc/genmatch.c.orig 2014-11-14 12:13:31.029117775 +0100
--- gcc/genmatch.c  2014-11-27 15:03:24.085664272 +0100
*** struct dt_node
*** 1098,1103 
--- 1098,1106 
virtual void gen (FILE *, bool) {}
  
void gen_kids (FILE *, bool);
+   void gen_kids_1 (FILE *, bool,
+  vecdt_operand *, vecdt_operand *, vecdt_operand *,
+  vecdt_operand *, vecdt_operand *, vecdt_node *);
  };
  
  /* Generic decision tree node used for DT_OPERAND and DT_MATCH.  */
*** decision_tree::cmp_node (dt_node *n1, dt
*** 1203,1211 
if (!n1 || !n2 || n1-type != n2-type)
  return false;
  
!   if (n1 == n2 || n1-type == dt_node::DT_TRUE)
  return true;
  
if (n1-type == dt_node::DT_OPERAND)
  return cmp_operand ((as_adt_operand * (n1))-op,
(as_adt_operand * (n2))-op);
--- 1206,1217 
if (!n1 || !n2 || n1-type != n2-type)
  return false;
  
!   if (n1 == n2)
  return true;
  
+   if (n1-type == dt_node::DT_TRUE)
+ return false;
+ 
if (n1-type == dt_node::DT_OPERAND)
  return cmp_operand ((as_adt_operand * (n1))-op,
(as_adt_operand * (n2))-op);
*** decision_tree::cmp_node (dt_node *n1, dt
*** 1220,1229 
  dt_node *
  decision_tree::find_node (vecdt_node * ops, dt_node *p)
  {
!   for (unsigned i = 0; i  ops.length (); ++i)
! if (decision_tree::cmp_node (ops[i], p))
!   return ops[i];
! 
return NULL;
  }
  
--- 1226,1246 
  dt_node *
  decision_tree::find_node (vecdt_node * ops, dt_node *p)
  {
!   /* We can merge adjacent DT_TRUE.  */
!   if (p-type == dt_node::DT_TRUE
!!ops.is_empty ()
!ops.last ()-type == dt_node::DT_TRUE)
! return ops.last ();
!   for (int i = ops.length () - 1; i = 0; --i)
! {
!   /* But we can't merge across DT_TRUE nodes as they serve as
!  pattern order barriers to make sure that patterns apply
!in order of appearance in case multiple matches are possible.  */
!   if (ops[i]-type == dt_node::DT_TRUE)
!   return NULL;
!   if (decision_tree::cmp_node (ops[i], p))
!   return ops[i];
! }
return NULL;
  }
  
*** dt_node::append_node (dt_node *n)
*** 1242,1256 
kids.safe_push (n);
n-level = this-level + 1;
  
-   unsigned len = kids.length ();
- 
-   if (len  1  kids[len - 2]-type == dt_node::DT_TRUE)
- {
-   dt_node *p = kids[len - 2];
-   kids[len - 2] = kids[len - 1];
-   kids[len - 1] = p;
- }
- 
return n;
  }
  
--- 1259,1264 
*** dt_node::gen_kids (FILE *f, bool gimple)
*** 2006,2012 
auto_vecdt_operand * generic_fns;
auto_vecdt_operand * preds;
auto_vecdt_node * others;
-   dt_node *true_operand = NULL;
  
for (unsigned i = 0; i  kids.length (); ++i)
  {
--- 2014,2019 
*** dt_node::gen_kids (FILE *f, bool gimple)
*** 2044,2054 
   || kids[i]-type == dt_node::DT_SIMPLIFY)
others.safe_push (kids[i]);
else if (kids[i]-type == dt_node::DT_TRUE)
!   true_operand = kids[i];
else
gcc_unreachable ();
  }
  
char buf[128];
char *kid_opname = buf;
  
--- 2051,2090 
   || kids[i]-type == dt_node::DT_SIMPLIFY)

Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

2014-11-27 Thread Kyrill Tkachov

Ping.

Thanks,
Kyrill

On 11/11/14 11:55, Kyrill Tkachov wrote:

Hi all,

This is the arm implementation of the macro fusion hook.
It tries to fuse movw+movt operations together. It also tries to take
lo_sum RTXs into account since those generate movt instructions as well.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2014-11-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
  * config/arm/arm.c (arm_macro_fusion_p): New function.
  (arm_macro_fusion_pair_p): Likewise.
  (TARGET_SCHED_MACRO_FUSION_P): Define.
  (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
  (ARM_FUSE_NOTHING): Likewise.
  (ARM_FUSE_MOVW_MOVT): Likewise.
  (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
  arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
  arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
  arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
  arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
  arm_cortex_a5_tune): Specify fuseable_ops value.





Re: [PATCH] Fix regressions in libgomp testsuite: set flag_fat_lto_objects for offload

2014-11-27 Thread Richard Biener
On Thu, Nov 27, 2014 at 2:31 PM, Ilya Verbin iver...@gmail.com wrote:
 On 25 Nov 10:51, Richard Biener wrote:
 In the patch adding flag_generate_offload sounds like a good solution,
 I didn't like emitting fat LTO objects unconditionally just because we 
 offload.

 Here is updated patch.  Bootstrap and make check passed on i686-linux and
 x86_64-linux.  Tests with offloading also passed with trunk binutils.
 OK for trunk?

If it also works without the new __gnu_offload symbol then I'd prefer
to re-use __gnu_LTO here.

Ok with that change.

Thanks,
Richard.

   -- Ilya


 gcc/
 * cgraphunit.c (ipa_passes): Handle flag_generate_offload.
 (symbol_table::compile): Set flag_generate_offload if there is 
 something
 to offload.
 * collect2.c (scan_prog_file): Look for the offload info marker 
 symbol.
 * common.opt (flag_generate_offload): New Variable declaration.
 * dwarf2out.c (dwarf2out_finish): Handle flag_generate_offload.
 * ipa-inline-analysis.c (inline_generate_summary): Do not skip if
 flag_generate_offload is set.
 * lto-streamer.c (gate_lto_out): Handle flag_generate_offload.
 * passes.c (ipa_write_summaries): Do not skip if flag_generate_offload
 is set.
 * toplev.c (compile_file): Emit offload marker if offload info has 
 been
 previously emitted.
 * tree.c (free_lang_data): Do not skip if flag_generate_offload is 
 set.


 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index 2fd99a7..fed1a3e 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -2075,7 +2075,7 @@ ipa_passes (void)
  }

/* Some targets need to handle LTO assembler output specially.  */
 -  if (flag_generate_lto)
 +  if (flag_generate_lto || flag_generate_offload)
  targetm.asm_out.lto_start ();

if (!in_lto_p)
 @@ -2092,7 +2092,7 @@ ipa_passes (void)
 }
  }

 -  if (flag_generate_lto)
 +  if (flag_generate_lto || flag_generate_offload)
  targetm.asm_out.lto_end ();

if (!flag_ltrans  (in_lto_p || !flag_lto || flag_fat_lto_objects))
 @@ -2176,10 +2176,10 @@ symbol_table::compile (void)

/* Offloading requires LTO infrastructure.  */
if (!in_lto_p  g-have_offload)
 -flag_generate_lto = 1;
 +flag_generate_offload = 1;

/* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
 -  if (flag_generate_lto)
 +  if (flag_generate_lto || flag_generate_offload)
  lto_streamer_hooks_init ();

/* Don't run the IPA passes if there was any error or sorry messages.  */
 diff --git a/gcc/collect2.c b/gcc/collect2.c
 index 9c3a1c5..2dcebcd 100644
 --- a/gcc/collect2.c
 +++ b/gcc/collect2.c
 @@ -2392,12 +2392,16 @@ scan_prog_file (const char *prog_name, scanpass 
 which_pass,
if (found_lto)
  continue;

 -  /* Look for the LTO info marker symbol, and add filename to
 +  /* Look for the LTO or offload info marker symbol, and add 
 filename to
   the LTO objects list if found.  */
for (p = buf; (ch = *p) != '\0'  ch != '\n'; p++)
  if (ch == ' '   p[1] == '_'  p[2] == '_'
 -(strncmp (p + (p[3] == '_' ? 2 : 1), __gnu_lto_v1, 12) 
 == 0)
 -ISSPACE (p[p[3] == '_' ? 14 : 13]))
 +(((strncmp (p + (p[3] == '_' ? 2 : 1),
 +  __gnu_lto_v1, 12) == 0)
 + ISSPACE (p[p[3] == '_' ? 14 : 13]))
 +   || ((strncmp (p + (p[3] == '_' ? 2 : 1),
 + __gnu_offload_v1, 16) == 0)
 +ISSPACE (p[p[3] == '_' ? 18 : 17]
{
  add_lto_object (lto_objects, prog_name);

 diff --git a/gcc/common.opt b/gcc/common.opt
 index 41c8d4e..752d939 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -67,6 +67,10 @@ int *param_values
  Variable
  int flag_generate_lto

 +; Nonzero if we should write GIMPLE bytecode for offload compilation.
 +Variable
 +int flag_generate_offload = 0
 +
  ; True to warn about any objects definitions whose size is larger
  ; than N bytes.  Also want about function definitions whose returned
  ; values are larger than N bytes, where N is 'larger_than_size'.
 diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
 index 25f0e7d..4ee3102 100644
 --- a/gcc/dwarf2out.c
 +++ b/gcc/dwarf2out.c
 @@ -24420,7 +24420,8 @@ dwarf2out_finish (const char *filename)
/* When generating LTO bytecode we can not generate new assembler
   names at this point and all important decls got theirs via
  free-lang-data.  */
 -  if ((!flag_generate_lto || DECL_ASSEMBLER_NAME_SET_P (decl))
 +  if (((!flag_generate_lto  !flag_generate_offload)
 +  || DECL_ASSEMBLER_NAME_SET_P (decl))
DECL_ASSEMBLER_NAME (decl) != DECL_NAME (decl))
 {
   add_linkage_attr (node-die, decl);
 diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
 index 2f2993c..9d62722 

Re: [PATCH] IPA ICF: memory leak fix

2014-11-27 Thread Richard Biener
On Thu, Nov 27, 2014 at 3:36 PM, Martin Liška mli...@suse.cz wrote:
 On 11/23/2014 10:01 AM, Markus Trippelsdorf wrote:

 On 2014.11.22 at 17:46 +0100, Markus Trippelsdorf wrote:

 On 2014.11.22 at 16:04 +0100, Martin Liška wrote:

 On 11/22/2014 10:09 AM, Markus Trippelsdorf wrote:

 On 2014.11.22 at 09:05 +0100, Martin Liška wrote:


 Following patch removes memory leak that was introduced by very first
 IPA ICF patch.
 I would like to thank David for hunting the leak.

 Patch an bootstrap on x86_86-linux-pc and no regression is introduced.


 I gave the patch a quick spin on gcc112:

 *** Error in `/home/trippels/gcc_build_dir/./prev-gcc/lto1': free():
 invalid next size (fast): 0x01000a5fc160 ***
 === Backtrace: =
 /lib64/libc.so.6(+0xa3d9c)[0x3fff7b6b3d9c]
 /lib64/libc.so.6(+0xaf0b4)[0x3fff7b6bf0b4]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN3vecIi7va_heap6vl_ptrE7releaseEv-0x1d4bc00)[0x1025dd88]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf12sem_function14equals_privateEPNS_8sem_itemER8hash_mapIP11symtab_nodeS2_22default_hashmap_traitsE-0x9c083c)[0x116586bc]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf12sem_function6equalsEPNS_8sem_itemER8hash_mapIP11symtab_nodeS2_22default_hashmap_traitsE-0x9c0578)[0x11658998]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf18sem_item_optimizer7executeEv-0x9b8774)[0x11660a84]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN7ipa_icf12pass_ipa_icf7executeEP8function-0x9b0314)[0x11668efc]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_Z16execute_one_passP8opt_pass-0x1647588)[0x1098a0a8]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_Z21execute_ipa_pass_listP8opt_pass-0x1644c2c)[0x1098ca7c]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_Z8lto_mainv-0x1df20e4)[0x101b494c]
 /home/trippels/gcc_build_dir/./prev-gcc/lto1[0x10b599b8]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(_ZN6toplev4mainEiPPc-0x1e8be70)[0x101507b8]

 /home/trippels/gcc_build_dir/./prev-gcc/lto1(main-0x1ec8d8c)[0x1015493c]
 /lib64/libc.so.6(+0x447ac)[0x3fff7b6547ac]
 /lib64/libc.so.6(__libc_start_main-0x19cbf4)[0x3fff7b6549d4]
 === Memory map: 
 ...


 Thank you for testing, problem is that I should grow the vector by 1,
 because '0' is used as NULL value.

 Please try my fixed patch.


 This one survives bootstrap-lto. Thanks.


 But Firefox doesn't build:

 /home/trippels/gcc_test/usr/local/bin/c++ -fPIC -Wall -Wempty-body
 -Woverloaded-virtual -Wsign-compare -Wwrite-strings -Werror=endif-labels
 -Werror=int-to-pointer-cast -Werror=missing-braces -Werror=pointer-arith
 -Werror=return-type -Werror=sequence-point -Werror=unused-label
 -Werror=trigraphs -Werror=type-limits -Wno-invalid-offsetof -Wcast-align
 -flto=160 --param lto-partitions=160 -mcpu=power8 -ffunction-sections
 -fdata-sections -fno-exceptions -fno-strict-aliasing -frtti -fno-exceptions
 -fno-math-errno -std=gnu++0x -pthread -pipe -UDEBUG -DNDEBUG -O3
 -DU_STATIC_IMPLEMENTATION -fvisibility=hidden -W -Wall -pedantic
 -Wpointer-arith -Wwrite-strings -Wno-long-long -Wno-unused
 -Wno-unused-parameter   -lpthread
 -Wl,--hash-style=gnu,--as-needed,--gc-sections,--icf=all -Wl,-z,noexecstack
 -Wl,-z,text -Wl,--build-id -Wl,--gc-sections  -o ../../bin/makeconv
 makeconv.o ucnvstat.o genmbcs.o gencnvex.o -L../../lib -licutu -L../../lib
 -licui18n -L../../lib -licuuc -L../../stubdata -licudata -

 l
 pthread -ldl -lm

 lto1: internal compiler error: in operator[], at vec.h:736
 0x10122377 vecint, va_heap, vl_embed::operator[](unsigned int)
  ../../gcc/gcc/vec.h:736
 0x10d1a0f3 vecint, va_heap, vl_embed::operator[](unsigned int)
  ../../gcc/gcc/ipa-icf.c:963
 0x10d1a0f3 vecint, va_heap, vl_ptr::operator[](unsigned int)
  ../../gcc/gcc/vec.h:1202
 0x10d1a0f3 ipa_icf::sem_function::bb_dict_test(auto_vecint, 0ul, int,
 int)
  ../../gcc/gcc/ipa-icf.c:970
 0x10d1aa73 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*,
 hash_mapsymtab_node*, ipa_icf::sem_item*, default_hashmap_traits)
  ../../gcc/gcc/ipa-icf.c:512
 0x10d1afb3 ipa_icf::sem_function::equals(ipa_icf::sem_item*,
 hash_mapsymtab_node*, ipa_icf::sem_item*, default_hashmap_traits)
  ../../gcc/gcc/ipa-icf.c:384
 0x10d1dba3
 ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
  ../../gcc/gcc/ipa-icf.c:1833
 0x10d2397f ipa_icf::sem_item_optimizer::execute()
  ../../gcc/gcc/ipa-icf.c:1652
 0x10d24a93 ipa_icf_driver
  ../../gcc/gcc/ipa-icf.c:2382
 0x10d24a93 ipa_icf::pass_ipa_icf::execute(function*)
  ../../gcc/gcc/ipa-icf.c:2430
 Please submit a full bug report,
 with preprocessed source if appropriate.
 Please include the complete backtrace with any bug report.
 See http://gcc.gnu.org/bugs.html for instructions.
 lto-wrapper: fatal error: /home/trippels/gcc_test/usr/local/bin/c++
 returned 1 exit status
 compilation terminated.
 /home/trippels/bin/ld: fatal error: lto-wrapper failed
 collect2: error: ld returned 1 exit 

Re: PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

2014-11-27 Thread Richard Biener
On Tue, Nov 25, 2014 at 5:05 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 7:01 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 The enclosed testcase fails on x86 when compiled with -Os since we pass
 a byte parameter with a byte load in caller and read it as an int in
 callee.  The reason it only shows up with -Os is x86 backend encodes
 a byte load with an int load if -O isn't used.  When a byte load is
 used, the upper 24 bits of the register have random value for none
 WORD_REGISTER_OPERATIONS targets.

 It happens because setup_incoming_promotions in combine.c has

   /* The mode and signedness of the argument before any promotions 
 happen
  (equal to the mode of the pseudo holding it at that stage).  */
   mode1 = TYPE_MODE (TREE_TYPE (arg));
   uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

   /* The mode and signedness of the argument after any source language 
 and
  TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
   mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
   uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

   /* The mode and signedness of the argument as it is actually passed,
  after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
   mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, uns3,
  TREE_TYPE (cfun-decl), 0);

 while they are actually passed in register by assign_parm_setup_reg in
 function.c:

   /* Store the parm in a pseudoregister during the function, but we may
  need to do it in a wider mode.  Using 2 here makes the result
  consistent with promote_decl_mode and thus expand_expr_real_1.  */
   promoted_nominal_mode
 = promote_function_mode (data-nominal_type, data-nominal_mode, 
 unsignedp,
  TREE_TYPE (current_function_decl), 2);

 where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
 and TYPE_MODE (nominal_type). TREE_TYPE here is

 I think the bug is here, not in combine.c.  Can you try going back in history
 for both snippets and see if they matched at some point?


 The bug was introduced by

 https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html

 commit 5d93234932c3d8617ce92b77b7013ef6bede9508
 Author: shinwell shinwell@138bc75d-0d04-0410-961f-82ee72b054a4
 Date:   Thu Sep 20 11:01:18 2007 +

   gcc/
   * combine.c: Include cgraph.h.
   (setup_incoming_promotions): Rework to allow more aggressive
   elimination of sign extensions when all call sites of the
   current function are known to lie within the current unit.


 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618
 138bc75d-0d04-0410-961f-82ee72b054a4

 Before this commit, combine.c has

   enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
   int uns = TYPE_UNSIGNED (TREE_TYPE (arg));

   mode = promote_mode (TREE_TYPE (arg), mode, uns, 1);
   if (mode == GET_MODE (reg)  mode != DECL_MODE (arg))
 {
   rtx x;
   x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
   x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
   record_value_for_reg (reg, first, x);
 }

 It matches function.c:

   /* This is not really promoting for a call.  However we need to be
  consistent with assign_parm_find_data_types and expand_expr_real_1.  */
   promoted_nominal_mode
 = promote_mode (data-nominal_type, data-nominal_mode, unsignedp, 1);

 r128618 changed

 mode = promote_mode (TREE_TYPE (arg), mode, uns, 1);

 to

 mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, uns3, 1);

 It breaks none WORD_REGISTER_OPERATIONS targets.

Hmm, I think that DECL_ARG_TYPE makes a difference only
for non-WORD_REGISTER_OPERATIONS targets.

But yeah, isolated the above change looks wrong.

Your patch is ok for trunk if nobody objects within 24h and for branches
after a week.

Thanks,
Richard.

 --
 H.J.


Re: [PATCH] VRP: don't assume strict overflow semantics when checking if a loop wraps

2014-11-27 Thread Richard Biener
On Fri, Nov 21, 2014 at 2:32 PM, Patrick Palka patr...@parcs.ath.cx wrote:
 On Fri, Nov 21, 2014 at 7:18 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Nov 21, 2014 at 12:29 PM, Patrick Palka patr...@parcs.ath.cx wrote:
 When adjusting the value range of an induction variable using SCEV, VRP
 calls scev_probably_wraps_p() with use_overflow_semantics=true.  This
 parameter set to true makes scev_probably_wraps_p() assume that signed
 induction variables never wrap, so for these variables it always returns
 false (when strict overflow rules are in effect).  This is wrong because
 if a signed induction variable really does overflow then we want to give
 it an INF(OVF) value range and not the (finite) estimation returned by
 SCEV.

 While this change shouldn't make a difference in code generation, it
 should help improve the coverage of -Wstrict-overflow warnings on
 induction variables like in the test case.

 OK after bootstrap + regtest on x86_64-unknown-linux-gnu?

 Hmm, I don't think the change won't affect code-generation.  In fact
 we check for overflow ourselves in the most interesting case
 (the first block) - only the path adjusting min/max based on the
 init value and the max value of the type needs to know whether
 overflow may happen and fail or drop to +-INF(OVF).

 So I'd rather open-code the relevant cases and not call
 scev_probably_wraps_p at all.

 What kind of tests for overflow do you have in mind?
 max_loop_iterations() in this test case always return INT_MAX so there
 will be no overflow when computing the upper bound using the number of
 loop iterations. Do you mean to compare what max_loop_iterations()
 returns with the range that VRP has inferred for the induction
 variable?

I'm talking about

  /* Try to use estimated number of iterations for the loop to constrain the
 final value in the evolution.  */
  if (TREE_CODE (step) == INTEGER_CST
   is_gimple_val (init)
   (TREE_CODE (init) != SSA_NAME
  || get_value_range (init)-type == VR_RANGE))
{
  widest_int nit;

  /* We are only entering here for loop header PHI nodes, so using
 the number of latch executions is the correct thing to use.  */
  if (max_loop_iterations (loop, nit))

which should be fine without the scev_probably_wraps check and
the fallback tmin/tmax with the min/max of the type only being
valid for TYPE_OVERFLOW_UNDEFINED types.

At least it should boil down to that, no?

Thanks,
Richard.


 Richard.

 gcc/
 * tree-vrp.c (adjust_range_with_scev): Call
 scev_probably_wraps_p with use_overflow_semantics=false.

 gcc/testsuite/
 * gcc.dg/Wstrict-overflow-27.c: New test.
 ---
  gcc/testsuite/gcc.dg/Wstrict-overflow-27.c | 22 ++
  gcc/tree-vrp.c |  2 +-
  2 files changed, 23 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.dg/Wstrict-overflow-27.c

 diff --git a/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c 
 b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
 new file mode 100644
 index 000..c1f27ab
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/Wstrict-overflow-27.c
 @@ -0,0 +1,22 @@
 +/* { dg-do compile } */
 +/* { dg-options -fstrict-overflow -O2 -Wstrict-overflow } */
 +
 +/* Warn about an overflow when folding i  0.  */
 +
 +void bar (unsigned *p);
 +
 +int
 +foo (unsigned *p)
 +{
 +  int i;
 +  int sum = 0;
 +
 +  for (i = 0; i  *p; i++)
 +{
 +  if (i  0) /* { dg-warning signed overflow } */
 +   sum += 2;
 +  bar (p);
 +}
 +
 +  return sum;
 +}
 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
 index a75138f..bf9ff61 100644
 --- a/gcc/tree-vrp.c
 +++ b/gcc/tree-vrp.c
 @@ -4270,7 +4270,7 @@ adjust_range_with_scev (value_range_t *vr, struct 
 loop *loop,
dir == EV_DIR_UNKNOWN
/* ... or if it may wrap.  */
|| scev_probably_wraps_p (init, step, stmt, get_chrec_loop (chrec),
 -   true))
 +   /*use_overflow_semantics=*/false))
  return;

/* We use TYPE_MIN_VALUE and TYPE_MAX_VALUE here instead of
 --
 2.2.0.rc1.23.gf570943



Re: [Ping]Re: [PR63762][4.9] Backport the patch which fixes GCC generates UNPREDICTABLE STR with Rn = Rt for arm

2014-11-27 Thread Renlin Li

On 26/11/14 18:12, H.J. Lu wrote:

On Wed, Nov 26, 2014 at 10:09 AM, Renlin Li renlin...@arm.com wrote:

On 26/11/14 12:16, H.J. Lu wrote:

On Wed, Nov 26, 2014 at 4:07 AM, Renlin Li renlin...@arm.com wrote:

On 20/11/14 16:17, Renlin Li wrote:

Hi all,

This is a backport for gcc-4_9-branch of the patch [PR63762]GCC
generates
UNPREDICTABLE STR with Rn = Rt for arm posted in:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html

arm-none-eabi has been test on the model, no new issues. bootstrapping
and
regression tested on x86, no new issues.

Is it Okay for gcc-4_9-branch?

gcc/ChangeLog:

2014-11-20  Renlin Li  renlin...@arm.com

  PR middle-end/63762
  * ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-20  Renlin Li  renlin...@arm.com

  PR middle-end/63762
  * gcc.dg/pr63762.c: New.

Ping for it.


Please verify if it is the real fix for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

If yes, please add a testcase for PR 63661 and mention it in
your ChangeLog entry.

Thanks.



Hi H.J.

Yes, I have verified that, this patch additionally fixes PR 63661.

I observed the same behaviour as I saw on arm backend. It will be great if
you can double check they are caused by exactly the same reason.

I will ask our people to take a look.



A new testcase has been added, ChangeLog has been updated to reflect the
change. Updated patch has bee attached.
Okay for gcc-4_9-branch?

Regards,
Renlin Li


gcc/ChangeLog:

2014-11-26  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 PR middle-end/63661
 * ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-26  Renlin Li  renlin...@arm.com

 PR middle-end/63661
 PR middle-end/63762
 * testsuite/gcc.dg/pr63661.c: New.
 * testsuite/gcc.dg/pr63762.c: New.



pr63661.c should be moved to gcc.target/i386 and run it
on PIC target.

Thanks.


Hi H.J.

The patch has been adjusted according to your suggestion.


gcc/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

PR middle-end/63762
PR target/63661
* ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

PR middle-end/63762
PR target/63661
* testsuite/gcc.dg/pr63762.c: New.
* testsuite/gcc.target/i386/pr63661.c: New.

diff --git a/gcc/ira.c b/gcc/ira.c
index 4d91d21..0c703c5 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5347,7 +5347,18 @@ ira (FILE *f)
 	  ira_allocno_iterator ai;
 
 	  FOR_EACH_ALLOCNO (a, ai)
-		ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)-reg);
+		{
+		  int old_regno = ALLOCNO_REGNO (a);
+		  int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)-reg);
+
+		  ALLOCNO_REGNO (a) = new_regno;
+
+		  if (old_regno != new_regno)
+		setup_reg_classes (new_regno, reg_preferred_class (old_regno),
+		   reg_alternate_class (old_regno),
+		   reg_allocno_class (old_regno));
+		}
+
 	}
 	  else
 	{
diff --git a/gcc/testsuite/gcc.dg/pr63762.c b/gcc/testsuite/gcc.dg/pr63762.c
new file mode 100644
index 000..df11067
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr63762.c
@@ -0,0 +1,77 @@
+/* PR middle-end/63762 */
+/* { dg-do assemble } */
+/* { dg-options -O2 } */
+
+#include stdlib.h
+
+void *astFree ();
+void *astMalloc ();
+void astNegate (void *);
+int astGetNegated (void *);
+void astGetRegionBounds (void *, double *, double *);
+int astResampleF (void *, ...);
+
+extern int astOK;
+
+int
+MaskF (int inside, int ndim, const int lbnd[], const int ubnd[],
+   float in[], float val)
+{
+
+  void *used_region;
+  float *c, *d, *out, *tmp_out;
+  double *lbndgd, *ubndgd;
+  int *lbndg, *ubndg, idim, ipix, nax, nin, nout, npix, npixg, result = 0;
+  if (!astOK) return result;
+  lbndg = astMalloc (sizeof (int)*(size_t) ndim);
+  ubndg = astMalloc (sizeof (int)*(size_t) ndim);
+  lbndgd = astMalloc (sizeof (double)*(size_t) ndim);
+  ubndgd = astMalloc (sizeof (double)*(size_t) ndim);
+  if (astOK)
+{
+  astGetRegionBounds (used_region, lbndgd, ubndgd);
+  npix = 1;
+  npixg = 1;
+  for (idim = 0; idim  ndim; idim++)
+{
+  lbndg[ idim ] = lbnd[ idim ];
+  ubndg[ idim ] = ubnd[ idim ];
+  npix *= (ubnd[ idim ] - lbnd[ idim ] + 1);
+  if (npixg = 0) npixg *= (ubndg[ idim ] - lbndg[ idim ] + 1);
+}
+  if (npixg = 0  astOK)
+{
+  if ((inside != 0) == (astGetNegated( used_region ) != 0))
+{
+  c = in;
+  for (ipix = 0; ipix  npix; ipix++) *(c++) = val;
+  result = npix;
+}
+}
+  else if (npixg  0  astOK)
+{
+  if ((inside != 0) == (astGetNegated (used_region) != 0))
+{
+  tmp_out = astMalloc (sizeof (float)*(size_t) npix);
+  if (tmp_out)
+{
+  c = tmp_out;
+  for (ipix = 0; ipix  npix; ipix++) *(c++) = val;
+  result = 

Re: [PATCH][AArch64] Use std::swap instead of manually swapping

2014-11-27 Thread Kyrill Tkachov

Ping.

Thanks,
Kyrill

On 13/11/14 09:42, Kyrill Tkachov wrote:

Hi all,

Following the trend in i386 and alpha, this patch uses std::swap to
perform swapping of values in the aarch64 backend instead of declaring
temporaries.
Tested and bootstrapped on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill


2014-11-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/aarch64/aarch64.c (aarch64_evpc_ext): Use std::swap instead
  of manual swapping implementation.
  (aarch64_expand_vec_perm_const_1): Likewise.





Re: [Ping]Re: [PR63762][4.9] Backport the patch which fixes GCC generates UNPREDICTABLE STR with Rn = Rt for arm

2014-11-27 Thread H.J. Lu
On Thu, Nov 27, 2014 at 7:32 AM, Renlin Li renlin...@arm.com wrote:
 On 26/11/14 18:12, H.J. Lu wrote:

 On Wed, Nov 26, 2014 at 10:09 AM, Renlin Li renlin...@arm.com wrote:

 On 26/11/14 12:16, H.J. Lu wrote:

 On Wed, Nov 26, 2014 at 4:07 AM, Renlin Li renlin...@arm.com wrote:

 On 20/11/14 16:17, Renlin Li wrote:

 Hi all,

 This is a backport for gcc-4_9-branch of the patch [PR63762]GCC
 generates
 UNPREDICTABLE STR with Rn = Rt for arm posted in:
 https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html

 arm-none-eabi has been test on the model, no new issues. bootstrapping
 and
 regression tested on x86, no new issues.

 Is it Okay for gcc-4_9-branch?

 gcc/ChangeLog:

 2014-11-20  Renlin Li  renlin...@arm.com

   PR middle-end/63762
   * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-20  Renlin Li  renlin...@arm.com

   PR middle-end/63762
   * gcc.dg/pr63762.c: New.

 Ping for it.

 Please verify if it is the real fix for

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

 If yes, please add a testcase for PR 63661 and mention it in
 your ChangeLog entry.

 Thanks.


 Hi H.J.

 Yes, I have verified that, this patch additionally fixes PR 63661.

 I observed the same behaviour as I saw on arm backend. It will be great
 if
 you can double check they are caused by exactly the same reason.

 I will ask our people to take a look.


 A new testcase has been added, ChangeLog has been updated to reflect the
 change. Updated patch has bee attached.
 Okay for gcc-4_9-branch?

 Regards,
 Renlin Li


 gcc/ChangeLog:

 2014-11-26  Renlin Li  renlin...@arm.com

  PR middle-end/63762
  PR middle-end/63661
  * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-26  Renlin Li  renlin...@arm.com

  PR middle-end/63661
  PR middle-end/63762
  * testsuite/gcc.dg/pr63661.c: New.
  * testsuite/gcc.dg/pr63762.c: New.


 pr63661.c should be moved to gcc.target/i386 and run it
 on PIC target.

 Thanks.

 Hi H.J.

 The patch has been adjusted according to your suggestion.


 gcc/ChangeLog:

 2014-11-27  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 PR target/63661
 * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-27  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 PR target/63661
 * testsuite/gcc.dg/pr63762.c: New.
 * testsuite/gcc.target/i386/pr63661.c: New.
^^^  No testsuite/


gcc.target/i386/pr63661.c should be checked into trunk first.

-- 
H.J.


[PATCH][PR63661]Add a test case for trunk.

2014-11-27 Thread Renlin Li

Hi all,

PR63661 is effectively fixed by my previous patch here: 
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html


This patch add a test case for it.

Okay for trunk?

gcc/testsuite/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

 PR target/63661
* gcc.target/i386/pr63661.c: New.diff --git a/gcc/testsuite/gcc.target/i386/pr63661.c b/gcc/testsuite/gcc.target/i386/pr63661.c
new file mode 100644
index 000..8b55146
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr63661.c
@@ -0,0 +1,77 @@
+/* PR target/63661 */
+/* { dg-do run } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options -mtune=nehalem -fPIC -O2 } */
+
+static void __attribute__((noinline,noclone,hot))
+foo (double a, double q, double *ff, double *gx, int e, int ni)
+{
+  union
+{
+  double n;
+  unsigned long long o;
+} punner;
+
+  punner.n = q;
+   __builtin_printf(B: 0x%016llx  %g\n, punner.o, q);
+
+  if(q != 5)
+__builtin_abort();
+}
+
+static int __attribute__((noinline,noclone,hot))
+bar (int order, double q, double c[])
+{
+  int ni, nn, i, e;
+  double g2, x2, de, s, ratio, ff;
+
+  nn = 0;
+  e = order  1;
+  s = 0;
+  ratio = 0;
+  x2 = 0;
+  g2 = 0;
+
+  if(q == 0.0)
+return 0;
+
+  if (order  5)
+{
+  ratio = 1.0 / q;
+  nn = order;
+}
+
+  ni = -nn;
+
+  while(1)
+{
+  de = ratio - g2 - x2;
+
+  foo (0, q, ff, g2, e, ni);
+
+  if((int)de == 0)
+break;
+}
+
+  s += 2 * nn * c[nn];
+
+  for (i = 0; i  1; i++)
+{
+  c[0] = nn;
+  for (; i  10; i++)
+c[i] = 0.0;
+  c[0] /= s;
+}
+
+  return 0;
+}
+
+int
+main ()
+{
+  double c[1000];
+
+  __builtin_printf(A: 0x%016llx\n, (unsigned long long)c);
+  bar (1, 5.0, c);
+  return 0;
+}

Re: [Ping]Re: [PR63762][4.9] Backport the patch which fixes GCC generates UNPREDICTABLE STR with Rn = Rt for arm

2014-11-27 Thread Renlin Li

On 27/11/14 15:37, H.J. Lu wrote:

On Thu, Nov 27, 2014 at 7:32 AM, Renlin Li renlin...@arm.com wrote:

On 26/11/14 18:12, H.J. Lu wrote:

On Wed, Nov 26, 2014 at 10:09 AM, Renlin Li renlin...@arm.com wrote:

On 26/11/14 12:16, H.J. Lu wrote:

On Wed, Nov 26, 2014 at 4:07 AM, Renlin Li renlin...@arm.com wrote:

On 20/11/14 16:17, Renlin Li wrote:

Hi all,

This is a backport for gcc-4_9-branch of the patch [PR63762]GCC
generates
UNPREDICTABLE STR with Rn = Rt for arm posted in:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html

arm-none-eabi has been test on the model, no new issues. bootstrapping
and
regression tested on x86, no new issues.

Is it Okay for gcc-4_9-branch?

gcc/ChangeLog:

2014-11-20  Renlin Li  renlin...@arm.com

   PR middle-end/63762
   * ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-20  Renlin Li  renlin...@arm.com

   PR middle-end/63762
   * gcc.dg/pr63762.c: New.

Ping for it.


Please verify if it is the real fix for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

If yes, please add a testcase for PR 63661 and mention it in
your ChangeLog entry.

Thanks.



Hi H.J.

Yes, I have verified that, this patch additionally fixes PR 63661.

I observed the same behaviour as I saw on arm backend. It will be great
if
you can double check they are caused by exactly the same reason.

I will ask our people to take a look.


A new testcase has been added, ChangeLog has been updated to reflect the
change. Updated patch has bee attached.
Okay for gcc-4_9-branch?

Regards,
Renlin Li


gcc/ChangeLog:

2014-11-26  Renlin Li  renlin...@arm.com

  PR middle-end/63762
  PR middle-end/63661
  * ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-26  Renlin Li  renlin...@arm.com

  PR middle-end/63661
  PR middle-end/63762
  * testsuite/gcc.dg/pr63661.c: New.
  * testsuite/gcc.dg/pr63762.c: New.



pr63661.c should be moved to gcc.target/i386 and run it
on PIC target.

Thanks.


Hi H.J.

The patch has been adjusted according to your suggestion.


gcc/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 PR target/63661
 * ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 PR target/63661
 * testsuite/gcc.dg/pr63762.c: New.
 * testsuite/gcc.target/i386/pr63661.c: New.

 ^^^  No testsuite/


Hi H.J.

gcc.target/i386/pr63661.c should be checked into trunk first.



A separate patch is sent to mailing list for this.


ChangeLog has been corrected.

Regards,
Renlin Li

gcc/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

PR middle-end/63762
PR target/63661
* ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

PR middle-end/63762
PR target/63661
* gcc.dg/pr63762.c: New.
* gcc.target/i386/pr63661.c: New.


diff --git a/gcc/ira.c b/gcc/ira.c
index 4d91d21..0c703c5 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5347,7 +5347,18 @@ ira (FILE *f)
 	  ira_allocno_iterator ai;
 
 	  FOR_EACH_ALLOCNO (a, ai)
-		ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)-reg);
+		{
+		  int old_regno = ALLOCNO_REGNO (a);
+		  int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)-reg);
+
+		  ALLOCNO_REGNO (a) = new_regno;
+
+		  if (old_regno != new_regno)
+		setup_reg_classes (new_regno, reg_preferred_class (old_regno),
+		   reg_alternate_class (old_regno),
+		   reg_allocno_class (old_regno));
+		}
+
 	}
 	  else
 	{
diff --git a/gcc/testsuite/gcc.dg/pr63762.c b/gcc/testsuite/gcc.dg/pr63762.c
new file mode 100644
index 000..df11067
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr63762.c
@@ -0,0 +1,77 @@
+/* PR middle-end/63762 */
+/* { dg-do assemble } */
+/* { dg-options -O2 } */
+
+#include stdlib.h
+
+void *astFree ();
+void *astMalloc ();
+void astNegate (void *);
+int astGetNegated (void *);
+void astGetRegionBounds (void *, double *, double *);
+int astResampleF (void *, ...);
+
+extern int astOK;
+
+int
+MaskF (int inside, int ndim, const int lbnd[], const int ubnd[],
+   float in[], float val)
+{
+
+  void *used_region;
+  float *c, *d, *out, *tmp_out;
+  double *lbndgd, *ubndgd;
+  int *lbndg, *ubndg, idim, ipix, nax, nin, nout, npix, npixg, result = 0;
+  if (!astOK) return result;
+  lbndg = astMalloc (sizeof (int)*(size_t) ndim);
+  ubndg = astMalloc (sizeof (int)*(size_t) ndim);
+  lbndgd = astMalloc (sizeof (double)*(size_t) ndim);
+  ubndgd = astMalloc (sizeof (double)*(size_t) ndim);
+  if (astOK)
+{
+  astGetRegionBounds (used_region, lbndgd, ubndgd);
+  npix = 1;
+  npixg = 1;
+  for (idim = 0; idim  ndim; idim++)
+{
+  lbndg[ idim ] = lbnd[ idim ];
+  ubndg[ idim ] = ubnd[ idim ];
+  npix *= (ubnd[ idim ] - lbnd[ idim ] + 1);
+  if (npixg = 0) npixg *= (ubndg[ idim ] - lbndg[ 

Re: [PATCH] Fix PR lto/64075

2014-11-27 Thread Mike Stump
On Nov 26, 2014, at 1:35 AM, Ilya Enkovich enkovich@gmail.com wrote:
 This patch fixes LTO streamers which were not adjusted when function_code 
 field was extended up to 12 bits.
 
 OK for trunk after bootstrap and check?

Please could you burry this in a #define someplace.  I think I have a patch 
that takes it up to 13, and don't look forward to the same issue.  Also 12 is 
kinda short sighted for a port.  13 is plenty big enough for the next 10 years 
I suspect.

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-27 Thread Mike Stump
On Nov 26, 2014, at 4:24 AM, Andrew Stubbs andrew_stu...@mentor.com wrote:

 On 14/11/14 11:12, Andrew Stubbs wrote:
 On 07/11/14 10:35, Andrew Stubbs wrote:
   if armv6 never co-exist with NEON, personally I think your original
 patch is better
   because TARGET_NEON generally will be used when all options are
 processed.
 
   any way, this needs gate keeper's approval.
 
 Ping, Richard.
 
 Ping.
 
 Ping.

Could you include a link or the patch.  If the test suite, I'll review it if no 
one else steps up.

Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-11-27 Thread Martin Liška

On 11/21/2014 04:21 PM, Martin Liška wrote:

On 11/21/2014 04:02 PM, Richard Biener wrote:

On Fri, Nov 21, 2014 at 3:39 PM, Martin Liška mli...@suse.cz wrote:


Hello.

Ok, this is simplified, one can use sreal a = 12345 and it works ;)


that's a  new API, right?  There is no max () and I think that using
LONG_MIN here is asking for trouble (host dependence).  The
comment in the file says the max should be
sreal (SREAL_MAX_SIG, SREAL_MAX_EXP) and the min
sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP)?



Sure, sreal can store much bigger(smaller) numbers :)


Where do you need sreal::to_double?  The host shouldn't perform
double calculations so it can be only for dumping?  In which case
the user should have used sreal::dump (), maybe with extra
arguments.



That new function was request from Honza, only for debugging purpose.
I agree that dump should this kind of job.

If no other problem, I will run tests once more and commit it.
Thanks,
Martin


-#define SREAL_MAX_EXP (INT_MAX / 4)
+#define SREAL_MAX_EXP (INT_MAX / 8)

this change doesn't look necessary anymore?

Btw, it's also odd that...

  #define SREAL_PART_BITS 32
...
  #define SREAL_MIN_SIG ((uint64_t) 1  (SREAL_PART_BITS - 1))
  #define SREAL_MAX_SIG (((uint64_t) 1  SREAL_PART_BITS) - 1)

thus all m_sig values fit in 32bits but we still use a uint64_t m_sig ...
(the implementation uses 64bit for internal computations, but still
the storage is wasteful?)

Of course the way normalize() works requires that storage to be
64bits to store unnormalized values.

I'd say ok with the SREAL_MAX_EXP change reverted.



Hi.

You are right, this change was done because I used one bit for m_negative 
(bitfield), not needed any more.

Final version attached.

Thank you,
Martin


Thanks,
Richard.





Otherwise looks good to me and sorry for not noticing the above
earlier.

Thanks,
Richard.


Thanks,
Martin



};

extern void debug (sreal ref);
@@ -76,12 +133,12 @@ inline sreal operator+= (sreal a, const sreal
b)

inline sreal operator-= (sreal a, const sreal b)
{
-return a = a - b;
+  return a = a - b;
}

inline sreal operator/= (sreal a, const sreal b)
{
-return a = a / b;
+  return a = a / b;
}

inline sreal operator*= (sreal a, const sreal b)
--
2.1.2










Hello.

After IRC discussions, I decided to give sreal another refactoring where I
use int64_t for m_sig.

This approach looks much easier and straightforward. I would like to
ask folk for comments?

I am able to run profiled bootstrap on x86_64-linux-pc and ppc64-linux-pc
and new regression is introduced.

Thanks,
Martin


From bff0b4b803271788cd90cfd4032ed6d4e6e95707 Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Wed, 26 Nov 2014 15:46:42 +0100
Subject: [PATCH] New sreal implementation which uses int64_t as m_sig.

gcc/ChangeLog:

2014-11-27  Martin Liska  mli...@suse.cz

	* sreal.c (sreal::shift_right): New implementation
	for int64_t as m_sig.
	(sreal::normalize): Likewise.
	(sreal::to_int): Likewise.
	(sreal::operator+): Likewise.
	(sreal::operator-): Likewise.
	(sreal::operator*): Likewise.
	(sreal::operator/): Likewise.
	(sreal::signedless_minus): Removed.
	(sreal::signedless_plus): Removed.
	* sreal.h (sreal::operator): New implementation
	for int64_t as m_sig.
---
 gcc/sreal.c | 129 +++-
 gcc/sreal.h |  52 ++--
 2 files changed, 61 insertions(+), 120 deletions(-)

diff --git a/gcc/sreal.c b/gcc/sreal.c
index 2b5e3ae..304feb0 100644
--- a/gcc/sreal.c
+++ b/gcc/sreal.c
@@ -91,7 +91,7 @@ sreal::shift_right (int s)
 
   m_exp += s;
 
-  m_sig += (uint64_t) 1  (s - 1);
+  m_sig += (int64_t) 1  (s - 1);
   m_sig = s;
 }
 
@@ -100,43 +100,46 @@ sreal::shift_right (int s)
 void
 sreal::normalize ()
 {
+  int64_t s = m_sig  0 ? -1 : 1;
+  uint64_t sig = m_sig == LONG_MIN ? LONG_MAX : std::abs (m_sig);
+
   if (m_sig == 0)
 {
-  m_negative = 0;
   m_exp = -SREAL_MAX_EXP;
 }
-  else if (m_sig  SREAL_MIN_SIG)
+  else if (sig  SREAL_MIN_SIG)
 {
   do
 	{
-	  m_sig = 1;
+	  sig = 1;
 	  m_exp--;
+	  gcc_checking_assert (sig);
 	}
-  while (m_sig  SREAL_MIN_SIG);
+  while (sig  SREAL_MIN_SIG);
 
   /* Check underflow.  */
   if (m_exp  -SREAL_MAX_EXP)
 	{
 	  m_exp = -SREAL_MAX_EXP;
-	  m_sig = 0;
+	  sig = 0;
 	}
 }
-  else if (m_sig  SREAL_MAX_SIG)
+  else if (sig  SREAL_MAX_SIG)
 {
   int last_bit;
   do
 	{
-	  last_bit = m_sig  1;
-	  m_sig = 1;
+	  last_bit = sig  1;
+	  sig = 1;
 	  m_exp++;
 	}
-  while (m_sig  SREAL_MAX_SIG);
+  while (sig  SREAL_MAX_SIG);
 
   /* Round the number.  */
-  m_sig += last_bit;
-  if (m_sig  SREAL_MAX_SIG)
+  sig += last_bit;
+  if (sig  SREAL_MAX_SIG)
 	{
-	  m_sig = 1;
+	  sig = 1;
 	  m_exp++;
 	}
 
@@ -144,9 +147,11 @@ sreal::normalize ()
   if (m_exp  SREAL_MAX_EXP)
 	{
 	  m_exp = SREAL_MAX_EXP;
-	  m_sig = SREAL_MAX_SIG;
+	  sig = SREAL_MAX_SIG;
 	}
 }
+
+ 

Re: [Build, Graphite] PR64017 - support ISL-0.14 (gcc/configure.ac and gcc/graphite*.c)

2014-11-27 Thread Jack Howarth
   This patch bootstraps fine on x86_64-apple-darwin14 against isl
0.14 at r218098.
  Jack

On Wed, Nov 26, 2014 at 10:55 AM, Tobias Burnus
tobias.bur...@physik.fu-berlin.de wrote:
 This patch adds a configure check for 
 isl_schedule_constraints_compute_schedule,
 which is new in ISL 0.13 - and uses it for conditional compilation.

 The graphite*c patch is based on the one of Jack Howarth,
 https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00906.html - without checking
 whether the ISL replacements make sense.

 With the patch, I have successfully bootstrapped GCC with ISL 0.12.2 and
 ISL 0.14 (both using an in-tree build).  [I still have to regtest the two
 variants and I also want to do a system-ISL built with ISL 0.12.2]

 Is the patch OK for the trunk?

 Tobias

 PS: I'd be happy if some others could run additional tests.

 PPS: If the patch is in, can someone put ISL 0.14 to infrastructure? Then
 one can update contrib/download_prerequisites - such that the graphite
 failure is at least fixed for in-tree builds (cf. PR64017, PR62289).


[RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Manuel López-Ibáñez
This is just a RFC so no Changelog yet. It bootstraps and passes the
testsuite. I have three major questions:

* Dodji: Do the common diagnostics part look reasonable? I tried to be
as least invasive as possible. If you have comments or suggestions
they are very welcome.

* Fortran devs: Is this approach acceptable? The main idea is to have
an output_buffer called pp_warning_buffer with the flush_p bit unset
if we are buffering. When printing buffered warnings, use this
output_buffer in the global_dc-printer instead of the (unbuffered
one) used by the *_now variants. In principle this could support
several buffered diagnostics, but Fortran only seems to buffer at most
one.

The ugliest part is how to handle warningcount and werrorcount. I
could handle this in the common machinery in a better way by storing
DK_WERROR in the diagnostic-kind and checking it after printing. This
way we can first decrease both counters, then increase back the one
not changed and store the kind somewhere. then if the diagnostic is
canceled nothing is done, if it is flushed, then increase the
appropriate counter (perhaps calling into the common part for any post
action due to Wfatal-errors or -fmax-errors=).

I can also hide the output_buffer switching inside two helper
functions, but the helper function would need to use either a static
variable or a global one to save and restore the tmp_buffer. I'm not
sure that is better or worse (the current code uses a global pointer
cur_error_buffer, so perhaps I should have used a similar approach).

* Fortran devs #2: The testsuite is testing that the warning is
eventually printed. However, I'm not sure it is testing when the
warning is buffered and then discarded, is it? If not, how can I
produce such a test?

Thanks,

Manuel.
Index: gcc/pretty-print.c
===
--- gcc/pretty-print.c  (revision 218090)
+++ gcc/pretty-print.c  (working copy)
@@ -40,7 +40,8 @@
 cur_chunk_array (),
 stream (stderr),
 line_length (),
-digit_buffer ()
+digit_buffer (),
+flush_p (true)
 {
   obstack_init (formatted_obstack);
   obstack_init (chunk_obstack);
@@ -679,12 +680,25 @@
   pp_wrapping_mode (pp) = oldmode;
 }
 
-/* Flush the content of BUFFER onto the attached stream.  */
+/* Flush the content of BUFFER onto the attached stream.  This
+   function does nothing unless pp-output_buffer-flush_p.  */
 void
 pp_flush (pretty_printer *pp)
 {
+  pp_clear_state (pp);
+  if (!pp-buffer-flush_p)
+return;
   pp_write_text_to_stream (pp);
+  fflush (pp_buffer (pp)-stream);
+}
+
+/* Flush the content of BUFFER onto the attached stream independently
+   of the value of pp-output_buffer-flush_p.  */
+void
+pp_really_flush (pretty_printer *pp)
+{
   pp_clear_state (pp);
+  pp_write_text_to_stream (pp);
   fflush (pp_buffer (pp)-stream);
 }
 
Index: gcc/pretty-print.h
===
--- gcc/pretty-print.h  (revision 218090)
+++ gcc/pretty-print.h  (working copy)
@@ -100,6 +100,11 @@
   /* This must be large enough to hold any printed integer or
  floating-point value.  */
   char digit_buffer[128];
+
+  /* Nonzero means that text should be flushed when
+ appropriate. Otherwise, text is buffered until either
+ pp_pp_really_flush or pp_clear_output_area are called. */
+  bool flush_p;
 };
 
 /* The type of pretty-printer flags passed to clients.  */
@@ -314,6 +319,7 @@
 extern void pp_verbatim (pretty_printer *, const char *, ...)
  ATTRIBUTE_GCC_PPDIAG(2,3);
 extern void pp_flush (pretty_printer *);
+extern void pp_really_flush (pretty_printer *);
 extern void pp_format (pretty_printer *, text_info *);
 extern void pp_output_formatted_text (pretty_printer *);
 extern void pp_format_verbatim (pretty_printer *, text_info *);
Index: gcc/fortran/gfortran.h
===
--- gcc/fortran/gfortran.h  (revision 218090)
+++ gcc/fortran/gfortran.h  (working copy)
@@ -2443,7 +2443,6 @@
   int dump_fortran_optimized;
 
   int warn_aliasing;
-  int warn_ampersand;
   int warn_function_elimination;
   int warn_implicit_interface;
   int warn_implicit_procedure;
@@ -2691,6 +2690,7 @@
 const char *gfc_print_wide_char (gfc_char_t);
 
 void gfc_warning (const char *, ...) ATTRIBUTE_GCC_GFC(1,2);
+bool gfc_warning_1 (int opt, const char *, ...) ATTRIBUTE_GCC_GFC(2,3);
 void gfc_warning_now_1 (const char *, ...) ATTRIBUTE_GCC_GFC(1,2);
 bool gfc_warning_now (const char *, ...) ATTRIBUTE_GCC_GFC(1,2);
 bool gfc_warning_now (int opt, const char *, ...) ATTRIBUTE_GCC_GFC(2,3);
@@ -2719,7 +2719,7 @@
 void gfc_free_error (gfc_error_buf *);
 
 void gfc_get_errors (int *, int *);
-void gfc_errors_to_warnings (int);
+void gfc_errors_to_warnings (bool);
 
 /* arith.c */
 void gfc_arith_init_1 (void);
Index: gcc/fortran/error.c
===
--- gcc/fortran/error.c (revision 

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-27 Thread Andrew Stubbs

On 27/11/14 17:05, Mike Stump wrote:

Could you include a link or the patch.  If the test suite, I'll review it if no 
one else steps up.


The original patch is here:

https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01119.html

Thanks

Andrew


Re: [PATCH] Fix PR lto/64075

2014-11-27 Thread Ilya Enkovich
2014-11-27 19:55 GMT+03:00 Mike Stump mikest...@comcast.net:
 On Nov 26, 2014, at 1:35 AM, Ilya Enkovich enkovich@gmail.com wrote:
 This patch fixes LTO streamers which were not adjusted when function_code 
 field was extended up to 12 bits.

 OK for trunk after bootstrap and check?

 Please could you burry this in a #define someplace.  I think I have a patch 
 that takes it up to 13, and don't look forward to the same issue.  Also 12 is 
 kinda short sighted for a port.  13 is plenty big enough for the next 10 
 years I suspect.

I've committed the patch already.


Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Steve Kargl
On Thu, Nov 27, 2014 at 06:25:01PM +0100, Manuel López-Ibáñez wrote:
 
 The ugliest part is how to handle warningcount and werrorcount. I
 could handle this in the common machinery in a better way by storing
 DK_WERROR in the diagnostic-kind and checking it after printing. This
 way we can first decrease both counters, then increase back the one
 not changed and store the kind somewhere. then if the diagnostic is
 canceled nothing is done, if it is flushed, then increase the
 appropriate counter (perhaps calling into the common part for any post
 action due to Wfatal-errors or -fmax-errors=).
 
 I can also hide the output_buffer switching inside two helper
 functions, but the helper function would need to use either a static
 variable or a global one to save and restore the tmp_buffer. I'm not
 sure that is better or worse (the current code uses a global pointer
 cur_error_buffer, so perhaps I should have used a similar approach).
 

Hi Manuel,

First, thanks for all your work on the error/warning code in gfortran.

Second, I'm dredging through my memories from a decade or so ago.  Others
are encouraged to jump in and correct me.

The design of gfortran's warning/error code is largely unchanged from 
when g95 was originally imported into the tree.  There have been, of
course, some changes but the design was set.  My understanding of how
it all works(ed) is that gfortran's parsing runs a series of matchers
over the input.  A matcher may generate an error/warning, which is
written into a buffer.  But, instead of immediately reporting the issue
gfortran may continue to run a series of matchers to see if the input
is in fact validate.  gfortran will report the last buffered error/warning
message only after the last matcher fails.  In running the series of
matchers, there are times when a matcher may hit an error/warning
condition and it is, in fact, a problem and gfortran wants to report
the problem NOW.  This is the origin of the *_now variants.

I would need to go dredge through ChangeLogs, but I believe I may be
responsible for the counting of error messages. I know that I at least
set the default value for -fmax-error.  The origins to the counting
is that once gfortran encountered and reported an error, she would
discard the problematic statement and continue parsing the input.  This
often leads to a series of spurious run-on error messages, which are
all caused by the first error.  Thus, if one does -fmax-error=1 and
fixes the problem, then all the other problems disappear.

Finally, gfortran's error/warning mechanism isn't immediately available
at the start of execution.  However, errors can occur before the mechanism
is initialized.  This is one reason why one finds fatal_error() spinkled
throughout the code.


 * Fortran devs #2: The testsuite is testing that the warning is
 eventually printed. However, I'm not sure it is testing when the
 warning is buffered and then discarded, is it? If not, how can I
 produce such a test?
 

I think that you can't check for a discarded buffered message.  In
fact, why bother?

-- 
Steve


Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-27 Thread Marek Polacek
On Thu, Nov 27, 2014 at 10:56:06AM +0100, Richard Biener wrote:
 But then rather than using integer_type_node I'd convert to
 unsigned_type_node (if that doesn't work you know we have some nasty
 dependence on negative shift counts working)

Yeah, I think adding the conversion into c_gimplify_expr is the best
thing we can do until we have type demotion/promotion pass.  Seems
that unsigned_type_node there works well.
(What happened to Kai's type elevation pass?)

Bootstrapped/regtested on ppc64-linux, ok for trunk?

2014-11-27  Marek Polacek  pola...@redhat.com

PR c/63862
c-family/
* c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR
to op1_utype.
* c-gimplify.c (c_gimplify_expr): Convert right operand of a shift
expression to unsigned_type_node.
c/
* c-typeck.c (build_binary_op) RSHIFT_EXPR, LSHIFT_EXPR: Don't
convert the right operand to integer type.
cp/
* typeck.c (cp_build_binary_op) RSHIFT_EXPR, LSHIFT_EXPR: Don't
convert the right operand to integer type.
testsuite/
* gcc.c-torture/execute/shiftopt-1.c: Don't XFAIL anymore.
* c-c++-common/ubsan/shift-7.c: New test.

diff --git gcc/c-family/c-gimplify.c gcc/c-family/c-gimplify.c
index 85b4223..8e0eb4c 100644
--- gcc/c-family/c-gimplify.c
+++ gcc/c-family/c-gimplify.c
@@ -242,6 +242,24 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p 
ATTRIBUTE_UNUSED,
 
   switch (code)
 {
+case LSHIFT_EXPR:
+case RSHIFT_EXPR:
+  {
+   /* We used to convert the right operand of a shift-expression
+  to an integer_type_node in the FEs.  But it is unnecessary
+  and not desirable for diagnostics and sanitizers.  We keep
+  this here to not pessimize the code, but we convert to an
+  unsigned type, because negative shift counts are undefined
+  anyway.
+  We should get rid of this conversion when we have a proper
+  type demotion/promotion pass.  */
+   tree *op1_p = TREE_OPERAND (*expr_p, 1);
+   if (TREE_CODE (TREE_TYPE (*op1_p)) != VECTOR_TYPE
+TYPE_MAIN_VARIANT (TREE_TYPE (*op1_p)) != unsigned_type_node)
+ *op1_p = convert (unsigned_type_node, *op1_p);
+   break;
+  }
+
 case DECL_EXPR:
   /* This is handled mostly by gimplify.c, but we have to deal with
 not warning about int x = x; as it is a GCC extension to turn off
diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c
index 90b03f2..96afc67 100644
--- gcc/c-family/c-ubsan.c
+++ gcc/c-family/c-ubsan.c
@@ -151,7 +151,7 @@ ubsan_instrument_shift (location_t loc, enum tree_code code,
!TYPE_UNSIGNED (type0)
flag_isoc99)
 {
-  tree x = fold_build2 (MINUS_EXPR, unsigned_type_node, uprecm1,
+  tree x = fold_build2 (MINUS_EXPR, op1_utype, uprecm1,
fold_convert (op1_utype, op1));
   tt = fold_convert_loc (loc, unsigned_type_for (type0), op0);
   tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 67efb46..bf0f306 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -10513,11 +10513,6 @@ build_binary_op (location_t location, enum tree_code 
code,
 
  /* Use the type of the value to be shifted.  */
  result_type = type0;
- /* Convert the non vector shift-count to an integer, regardless
-of size of value being shifted.  */
- if (TREE_CODE (TREE_TYPE (op1)) != VECTOR_TYPE
-  TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-   op1 = convert (integer_type_node, op1);
  /* Avoid converting op1 to result_type later.  */
  converted = 1;
}
@@ -10563,11 +10558,6 @@ build_binary_op (location_t location, enum tree_code 
code,
 
  /* Use the type of the value to be shifted.  */
  result_type = type0;
- /* Convert the non vector shift-count to an integer, regardless
-of size of value being shifted.  */
- if (TREE_CODE (TREE_TYPE (op1)) != VECTOR_TYPE
-  TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-   op1 = convert (integer_type_node, op1);
  /* Avoid converting op1 to result_type later.  */
  converted = 1;
}
diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 8b66acc..6ca346b 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4295,10 +4295,6 @@ cp_build_binary_op (location_t location,
 right shift count = width of type);
}
}
- /* Convert the shift-count to an integer, regardless of
-size of value being shifted.  */
- if (TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-   op1 = cp_convert (integer_type_node, op1, complain);
  /* Avoid converting op1 to result_type later.  */
  converted = 1;
}
@@ -4344,10 +4340,6 @@ cp_build_binary_op (location_t location,
   

Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Tobias Burnus

Manuel López-Ibáñez wrote:

* Fortran devs: Is this approach acceptable? The main idea is to have
an output_buffer called pp_warning_buffer with the flush_p bit unset
if we are buffering. When printing buffered warnings, use this
output_buffer in the global_dc-printer instead of the (unbuffered
one) used by the *_now variants. In principle this could support
several buffered diagnostics, but Fortran only seems to buffer at most
one.


I think the approach is fine. As the _now version overrides the buffer, 
one might even do with a single buffer by clearing it, setting flush_p 
temporarily to true and printing the message. It only might collide with 
buffered warnings (for error_now) and errors (for warning_now), but I 
don't see whether that's an issue. For warnings/error_now it probably 
isn't, for errors/warning_now, it might. Thus, having two buffers is 
probably better.



The ugliest part is how to handle warningcount and werrorcount. I
could handle this in the common machinery in a better way by storing
DK_WERROR in the diagnostic-kind and checking it after printing.


I'm missing the but. Does it require more modifications in common code 
and thus you shy away? Or is there another reason?



I can also hide the output_buffer switching inside two helper
functions, but the helper function would need to use either a static
variable or a global one to save and restore the tmp_buffer. I'm not
sure that is better or worse (the current code uses a global pointer
cur_error_buffer, so perhaps I should have used a similar approach).


Me neither. The current approach is rather localized in error.c; thus, 
it doesn't really matter.



* Fortran devs #2: The testsuite is testing that the warning is
eventually printed. However, I'm not sure it is testing when the
warning is buffered and then discarded, is it? If not, how can I
produce such a test?


Well, for nearly every Fortran program, at least for one line multiple 
attempts have to be done by the parser. Thus, if you set a break point 
in gfc_error, you will see syntax error messages which never make it 
to the screen. As the test suite checks for excess errors, nearly every 
test-suite file tests this.


* * *

With the patch in place, plus its gfc_error cousin, and after the 
follow-up conversion work, only the following would remain:


* gfc_warning: a single occurence of two locations in the same error 
string (of ~120 calls)

* gfc_error: approx. 34 times two locations (of ~1630 calls)
* gfc_error_now_1: 11 times two locations

* In scanner.c:
- gfc_warning_now_1: 6 calls in scanner.c
- gfc_warning: 3
- fprintf: 2
- gfc_error: 0
Some of those can probably be simply converted, others either need to 
remain, or one has to setup a proper location (currently, using 
gfc_warning_now would ICE), or one creates a work around and constructs 
manually the error string (at least for the fprintf cases).


* * *

Regarding the use of libcpp: Currently, it is only used with explicit 
preprocessing (-cpp, file extension) and it is used to output into a 
temporary file which is then read back. I experimented with

  token = cpp_get_token (cpp_in);
  const char *str = cpp_token_as_text (cpp_in, token);
which kind of works okay.

But I do not understand how one gets linebreaks and spacing correctly. 
For linebreaks, I can use flags  BOL or a callback. But I'm still 
trying to understand how one can get the number of spacings. flags  
PREV_WHITE seems to record only a single one and seems to convert all 
of them (\t, \n and ' ') into a single type.


For fixed-form Fortran, spaces are essential. On punch cards,* only the 
first 72 characters were read; as (some?) punch cards had additional 8 
characters, those were used for comments (e.g. to enumerate the cards). 
Hence, there are codes out there, which assumes that everything beyond 
72 characters is ignored. Thus, the number of spaces is crucial – at 
least with -fpreprocessed, if one wants to always goes through libcpp.


Looking at the current code, it seems to use a line-break callback with 
SOURCE_COLUMN to reconstruct the indentation. I think saving the token 
src_column and the length of the token string and subtracting it from 
the new source location, will also work for mid-line spaces. However, 
that's really a kludge – and it still doesn't permit one to distiniguish 
between spaces and tabs.


Finally, the token handling seems to get in trouble with
  print *, Hello 
  world!
where a single string extends over two lines.

Taking everything together, I think using libcpp for reading Fortran 
files is a topic for GCC 6. Any suggestion how to properly handle the 
spacing?


For the lexing itself, one probably needs a Fortran mode, which can 
recognize Fortran comments, continuation lines and some special 
properties about Fortran string. When implemented, one could then also 
turn of the traditional mode.


Tobias

* Disclaimer: I started with Fortran 95 and a rather object-oriented 
code. Hence, I might 

Re: [C/C++ PATCH] Don't convert RHS of a shift-expression to int (PR c/63862)

2014-11-27 Thread Jakub Jelinek
On Thu, Nov 27, 2014 at 08:24:03PM +0100, Marek Polacek wrote:
 On Thu, Nov 27, 2014 at 10:56:06AM +0100, Richard Biener wrote:
  But then rather than using integer_type_node I'd convert to
  unsigned_type_node (if that doesn't work you know we have some nasty
  dependence on negative shift counts working)
 
 Yeah, I think adding the conversion into c_gimplify_expr is the best
 thing we can do until we have type demotion/promotion pass.  Seems
 that unsigned_type_node there works well.
 (What happened to Kai's type elevation pass?)
 
 Bootstrapped/regtested on ppc64-linux, ok for trunk?

LGTM.

 2014-11-27  Marek Polacek  pola...@redhat.com
 
   PR c/63862
 c-family/
   * c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR
   to op1_utype.
   * c-gimplify.c (c_gimplify_expr): Convert right operand of a shift
   expression to unsigned_type_node.
 c/
   * c-typeck.c (build_binary_op) RSHIFT_EXPR, LSHIFT_EXPR: Don't
   convert the right operand to integer type.
 cp/
   * typeck.c (cp_build_binary_op) RSHIFT_EXPR, LSHIFT_EXPR: Don't
   convert the right operand to integer type.
 testsuite/
   * gcc.c-torture/execute/shiftopt-1.c: Don't XFAIL anymore.
   * c-c++-common/ubsan/shift-7.c: New test.

Jakub


Re: [patch c++]: Fix PR/53904

2014-11-27 Thread H.J. Lu
On Wed, Nov 26, 2014 at 9:52 AM, Kai Tietz ktiet...@googlemail.com wrote:
 Ok.  Adjusted patch attached.  Nevertheless we should use here
 unsigned HWI instead of possible truncation to signed int.  I admit
 that it is unlikely to have more then 2^31 elements, but well 

 Ok for apply with adjusted ChangeLog?

 Regards,
 Kai


On Linux/x86, the testcase fails with

output is:
/export/gnu/import/git/gcc/gcc/testsuite/g++.dg/cpp0x/pr63904.C: In
instantiation of 'struct foo0':^M
/export/gnu/import/git/gcc/gcc/testsuite/g++.dg/cpp0x/pr63904.C:10:26:
  required from here^M
/export/gnu/import/git/gcc/gcc/testsuite/g++.dg/cpp0x/pr63904.C:6:12:
error: ISO C++ forbids zero-size array [-Wpedantic]^M

FAIL: g++.dg/cpp0x/pr63904.C  -std=c++11 (test for excess errors)

H.J.


Re: PATCH: PR target/63833: REAL_PIC_OFFSET_TABLE_REGNUM is wrong for x86-64

2014-11-27 Thread H.J. Lu
On Wed, Nov 12, 2014 at 6:53 AM, H.J. Lu hongjiu...@intel.com wrote:
 We have been using the wrong register to hold GOT in 64-bit large model,
 which is used by the large model PLT.  The only reason we haven't run into
 any problem is linker doesn't support the large model PLT. I am looking
 into linker issue.  This patch corrects REAL_PIC_OFFSET_TABLE_REGNUM for
 64-bit large model.  OK to install?

 Thanks.

 H.J.
 ---
 2014-11-12  H.J. Lu  hongjiu...@intel.com

  PR target/63833
  * config/i386/i386.h (REAL_PIC_OFFSET_TABLE_REGNUM): Use
  R15_REG for 64-bit.
  * config/i386/rdos64.h (REAL_PIC_OFFSET_TABLE_REGNUM): Removed.

 diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
 index 53dfd22..0ccfb21 100644
 --- a/gcc/config/i386/i386.h
 +++ b/gcc/config/i386/i386.h
 @@ -1244,7 +1244,7 @@ extern const char *host_detect_local_cpu (int argc, 
 const char **argv);
 the pic register when possible.  The change is visible after the
 prologue has been emitted.  */

 -#define REAL_PIC_OFFSET_TABLE_REGNUM  BX_REG
 +#define REAL_PIC_OFFSET_TABLE_REGNUM  (TARGET_64BIT ? R15_REG : BX_REG)

  #define PIC_OFFSET_TABLE_REGNUM  
   \
((TARGET_64BIT  (ix86_cmodel == CM_SMALL_PIC   \
 diff --git a/gcc/config/i386/rdos64.h b/gcc/config/i386/rdos64.h
 index e6f089a..f902651 100644
 --- a/gcc/config/i386/rdos64.h
 +++ b/gcc/config/i386/rdos64.h
 @@ -17,8 +17,5 @@ You should have received a copy of the GNU General Public 
 License
  along with GCC; see the file COPYING3.  If not see
  http://www.gnu.org/licenses/.  */

 -#undef REAL_PIC_OFFSET_TABLE_REGNUM
 -#define REAL_PIC_OFFSET_TABLE_REGNUM  R15_REG
 -
  #undef DEFAULT_LARGE_SECTION_THRESHOLD
  #define DEFAULT_LARGE_SECTION_THRESHOLD 16

Hi Uros,

The psABI

http://www.x86-64.org/svn/trunk/x86-64-ABI/

has been fixed.  Is this OK for trunk?

Thanks.


-- 
H.J.


Re: PATCH: PR target/63833: REAL_PIC_OFFSET_TABLE_REGNUM is wrong for x86-64

2014-11-27 Thread Uros Bizjak
On Wed, Nov 12, 2014 at 3:53 PM, H.J. Lu hongjiu...@intel.com wrote:
 We have been using the wrong register to hold GOT in 64-bit large model,
 which is used by the large model PLT.  The only reason we haven't run into
 any problem is linker doesn't support the large model PLT. I am looking
 into linker issue.  This patch corrects REAL_PIC_OFFSET_TABLE_REGNUM for
 64-bit large model.  OK to install?

 Thanks.

 H.J.
 ---
 2014-11-12  H.J. Lu  hongjiu...@intel.com

  PR target/63833
  * config/i386/i386.h (REAL_PIC_OFFSET_TABLE_REGNUM): Use
  R15_REG for 64-bit.
  * config/i386/rdos64.h (REAL_PIC_OFFSET_TABLE_REGNUM): Removed.

OK, the ABI documentation has just been fixed.

Thanks,
Uros.


Re: [patch c++]: Fix PR/53904

2014-11-27 Thread Kai Tietz
Sorry, missed to add needed hunk to disable pedantic warnings for this testcase.
Committed it as obvious fix at rev 218130.

Kai


[PATCH][ARM] FreeBSD arm support, EABI, v2

2014-11-27 Thread Andreas Tobler

Hi all,

this is the second attempt.

I reworked the issues Richard mentioned in the previous review.
I also found one issue which will break build/bootstrap if I pass
--enable-gnu-indirect-function, also fixed.

One thing which came up is the way we generate code for the
armv6*-*-freebsd* triplet versus the arm-*-freebsd* triplet.

I think the thing which confuses is the fact that we have only two fixed 
triplets where we build a complete OS with. Means the whole OS is built 
with the same optimization, not only the kernel or one binary.


For the armv6* we want to benefit from the cpu's functionality by 
default. We build all __ARM_ARCH = 6 with TARGET_CPU_arm1176jzs,

on the other hand all __ARM_ARCH =5 will be built with TARGET_CPU_arm9.

Now who becomes arm-*-freebsd* and who becomes armv6*-*-freebsd*?

As tried above, we only know two triplets, so __ARM_ARCH = 6 becomes
armv6*-*-freebsd* and __ARM_ARCH =5 becomes arm-*-freebsd*.

armv8 is not yet in the portfolio and it will become something 
different, either arm64 or aarch64, I do not know.


I'd like to keep this since our system compilers, clang and gcc-4.2.1 
behave the same.

If we have to change here, we would confuse people quite a lot.

The whole thing is FreeBSD specific and does not touch others.

As usual, bootstrapped, cross compiled, tested.

Ok for trunk?

TIA,
Andreas

toplevel:

* configure.ac: Don't add ${libgcj} for arm*-*-freebsd*.
* configure: Regenerate.
gcc:
* config.gcc (arm*-*-freebsd*): New configuration.
* config/arm/freebsd.h: New file.
* config.host: Add extra components for arm*-*-freebsd*.
* config/arm/arm.h: Introduce MAX_SYNC_LIBFUNC_SIZE.
* config/arm/arm.c (arm_init_libfuncs): Use MAX_SYNC_LIBFUNC_SIZE.

libgcc:

* config.host (arm*-*-freebsd*): Add new configuration for
arm*-*-freebsd*.
* config/arm/freebsd-atomic.c: New file.
* config/arm/t-freebsd: Likewise.
* config/arm/unwind-arm.h: Add __FreeBSD__ to the list of
'PC-relative indirect' OS's.

libatomic:

* configure.tgt: Exclude arm*-*-freebsd* from try_ifunc.

libstdc++-v3:

* configure.host: Add arm*-*-freebsd* port_specific_symbol_files.
Index: configure
===
--- configure   (revision 218129)
+++ configure   (working copy)
@@ -3427,6 +3427,9 @@
   alpha*-*-*vms*)
 noconfigdirs=$noconfigdirs ${libgcj}
 ;;
+  arm*-*-freebsd*)
+noconfigdirs=$noconfigdirs ${libgcj}
+;;
   arm-wince-pe)
 noconfigdirs=$noconfigdirs ${libgcj}
 ;;
Index: configure.ac
===
--- configure.ac(revision 218129)
+++ configure.ac(working copy)
@@ -781,6 +781,9 @@
   alpha*-*-*vms*)
 noconfigdirs=$noconfigdirs ${libgcj}
 ;;
+  arm*-*-freebsd*)
+noconfigdirs=$noconfigdirs ${libgcj}
+;;
   arm-wince-pe)
 noconfigdirs=$noconfigdirs ${libgcj}
 ;;
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 218129)
+++ gcc/config/arm/arm.c(working copy)
@@ -2160,7 +2160,7 @@
 {
   /* For Linux, we have access to kernel support for atomic operations.  */
   if (arm_abi == ARM_ABI_AAPCS_LINUX)
-init_sync_libfuncs (2 * UNITS_PER_WORD);
+init_sync_libfuncs (MAX_SYNC_LIBFUNC_SIZE);
 
   /* There are no special library functions unless we are using the
  ARM BPABI.  */
Index: gcc/config/arm/arm.h
===
--- gcc/config/arm/arm.h(revision 218129)
+++ gcc/config/arm/arm.h(working copy)
@@ -766,6 +766,11 @@
 #define PCC_BITFIELD_TYPE_MATTERS TARGET_AAPCS_BASED
 #endif
 
+/* The maximum size of the sync library functions supported.  */
+#ifndef MAX_SYNC_LIBFUNC_SIZE
+#define MAX_SYNC_LIBFUNC_SIZE (2 * UNITS_PER_WORD);
+#endif
+
 
 /* Standard register usage.  */
 
Index: gcc/config/arm/freebsd.h
===
--- gcc/config/arm/freebsd.h(revision 0)
+++ gcc/config/arm/freebsd.h(working copy)
@@ -0,0 +1,180 @@
+/* Definitions of target machine for GNU compiler, FreeBSD/arm version.
+   Copyright (C) 2002-2014 Free Software Foundation, Inc.
+   Contributed by Wasabi Systems, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described 

Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Manuel López-Ibáñez
On 27 November 2014 at 20:28, Tobias Burnus bur...@net-b.de wrote:
 I think the approach is fine. As the _now version overrides the buffer, one
 might even do with a single buffer by clearing it, setting flush_p
 temporarily to true and printing the message. It only might collide with
 buffered warnings (for error_now) and errors (for warning_now), but I don't
 see whether that's an issue. For warnings/error_now it probably isn't, for
 errors/warning_now, it might. Thus, having two buffers is probably better.

Oh, I didn't notice that the _now versions override the buffered
messages. Where do you see that? It seems that gfc_warning_1 saves and
restores buffer_flag but does not touch the buffers (it prints
directly to the stream).

However, as you say, I may still need two buffers, one for errors and
other for warnings. (If this were not necessary, it would be great! Is
it really?).

In fact, we will need more buffers, since there is gfc_push/pop_error.
This is why I didn't start with gfc_error, it is a bit more complex
than gfc_warning.

 The ugliest part is how to handle warningcount and werrorcount. I
 could handle this in the common machinery in a better way by storing
 DK_WERROR in the diagnostic-kind and checking it after printing.


 I'm missing the but. Does it require more modifications in common code and
 thus you shy away? Or is there another reason?

I was going to write Exactly and a long explanation. But now I
realize that in this particular case I can simply check DK_ERROR,
since within gfc_warning that can only come from a warning converted
to an error. That simplifies the function a bit:

+/* Issue a warning.  */
+
+bool
+gfc_warning_1 (int opt, const char *gmsgid, ...)
+{
+  va_list argp;
+  diagnostic_info diagnostic;
+  bool fatal_errors = global_dc-fatal_errors;
+  pretty_printer *pp = global_dc-printer;
+  output_buffer *tmp_buffer = pp-buffer;
+
+  if (!pp_warning_buffer.flush_p)
+{
+  /* To prevent -fmax-errors= triggering.  */
+  --werrorcount;
+  pp-buffer = pp_warning_buffer;
+  global_dc-fatal_errors = false;
+}
+
+  va_start (argp, gmsgid);
+  diagnostic_set_info (diagnostic, gmsgid, argp, UNKNOWN_LOCATION,
+  DK_WARNING);
+  diagnostic.option_index = opt;
+  bool ret = report_diagnostic (diagnostic);
+
+  if (ret  !pp_warning_buffer.flush_p)
+{
+  warningcount_buffered = 0;
+  werrorcount_buffered = 0;
+  /* Undo the above --werrorcount if not Werror, otherwise werrorcount is
+correct already.  */
+  diagnostic.kind == DK_ERROR
+   ? (++werrorcount_buffered)
+   : (++werrorcount, --warningcount, ++warningcount_buffered);
+
+  pp-buffer = tmp_buffer;
+  global_dc-fatal_errors = fatal_errors;
+}
+
+  va_end (argp);
+  return ret;
+}


 Well, for nearly every Fortran program, at least for one line multiple
 attempts have to be done by the parser. Thus, if you set a break point in
 gfc_error, you will see syntax error messages which never make it to the
 screen. As the test suite checks for excess errors, nearly every test-suite
 file tests this.

Sure, but I would like to test specifically triggering and discarding
the gfc_warning that I converted (or another single one that you
suggest), if this were possible.

 Some of those can probably be simply converted, others either need to
 remain, or one has to setup a proper location (currently, using
 gfc_warning_now would ICE), or one creates a work around and constructs
 manually the error string (at least for the fprintf cases).

Why does it ICE? At worst it should give a wrong location, but the
computation of the offset is fairly similar to what Fortran already
does. Could  you give me a patch+testcase that ICEs?

 Taking everything together, I think using libcpp for reading Fortran files
 is a topic for GCC 6. Any suggestion how to properly handle the spacing?

No idea. My knowledge of cpp is limited. I think you need to ask
Dodji, Tom or Joseph (in IRC or a new thread). I was hoping it was
going to be easier to lex Fortran with cpp since cpp can already
preprocess it.

Thanks for the comment. They are very helpful.

Cheers,

Manuel.


Re: [RFC PATCH, i386]: Prefer %ebx in set_got patterns

2014-11-27 Thread H.J. Lu
On Thu, Nov 27, 2014 at 3:19 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Attached patch helps RA to choose the most appropriate PIC register by
 changing the register preference for set_got patterns. Using this
 patch, there should really be a reason for RA to avoid ABI mandated
 hard PIC reg.

 This patch avoids many mov %exx,%ebx in front of the calls, that
 happen with unpatched compiler even with Vladimir's latest RA patch to
 avoid duplicated PIC registers.

 As a smoke test, I have checked 32bit libgo.so.6.0.0 library, where now we 
 have:

 [uros@omen7 .libs]$ grep thunk.bx aaa | wc -l
 7693
 [uros@omen7 .libs]$ grep thunk.ax aaa | wc -l
 10
 [uros@omen7 .libs]$ grep thunk.cx aaa | wc -l
 4
 [uros@omen7 .libs]$ grep thunk.dx aaa | wc -l
 8
 [uros@omen7 .libs]$ grep thunk.bp aaa | wc -l
 497
 [uros@omen7 .libs]$ grep thunk.si aaa | wc -l
 145
 [uros@omen7 .libs]$ grep thunk.di aaa | wc -l
 198

 2014-11-27  Uros Bizjak  ubiz...@gmail.com

 * config/i386/i386.md (set_got): Use =b,?r constraint for operand 0.

EBX is needed only if PLT is used.  If PLT isn't used, we should try
caller saved registers first.

 (set_got_labelled): Ditto.
 (set_got_rex64): Ditto.
 (set_rip_rex64): Ditto.
 (set_got_offset_rex64): Ditto.

We shouldn't do it for 64-bit.

 Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

 Thoughts?

 Uros



-- 
H.J.


[PATCH, i386]: Use preferred_for_{size,speed} instead of Yx and Yd register constraints

2014-11-27 Thread Uros Bizjak
Hello!

Yx and Yd register constraints depend on optimize_function_for_speed.
However, optimize_function_for_size is not stable during the
compilation, so this can result in unrecognized insn (as was the case
with *floatSWI48:modeMODEF:mode2_sse insn failure).

The patch uses preferred_for_{size,speed} infrastructure instead to
achieve the same functionality as with Yx and Yd register constraint.

2014-11-27  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.md (preferred_for_size): New attribute
(*pushxf): Split Yx*r constraints to r,*r.  Use preferred_for_size
attribute to conditionally disable alternative 1.
(*pushdf): Split Yd*r constraints to r,*r.  Use preferred_for_size
and prefered_for_speed attributes to conditionally disable
alternative 1.
(*movxf_internal): Split Yx*r constraints to r,*r.  Use
preferred_for_size attribute to conditionally disable
alternatives 3 and 4.
(*movdf_internal): Split Yd*r constraints to r,*r.  Use
preferred_for_size and prefered_for_speed attributes to conditionally
disable alternatives 3 and 4.
* config/i386/constraints.md (Yd, Yx): Remove register constraints.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32} and was committed to mainline SVN.

Uros.
Index: config/i386/constraints.md
===
--- config/i386/constraints.md  (revision 218129)
+++ config/i386/constraints.md  (working copy)
@@ -105,8 +105,6 @@
 ;;  n  MMX inter-unit moves from MMX register enabled
 ;;  a  Integer register when zero extensions with AND are disabled
 ;;  p  Integer register when TARGET_PARTIAL_REG_STALL is disabled
-;;  d  Integer register when integer DFmode moves are enabled
-;;  x  Integer register when integer XFmode moves are enabled
 ;;  f  x87 register when 80387 floating point arithmetic is enabled
 
 (define_register_constraint Yz TARGET_SSE ? SSE_FIRST_REG : NO_REGS
@@ -137,15 +135,6 @@
   ? NO_REGS : GENERAL_REGS
  @internal Any integer register when zero extensions with AND are disabled.)
 
-(define_register_constraint Yd
- TARGET_INTEGER_DFMODE_MOVES  optimize_function_for_speed_p (cfun)
-  ? GENERAL_REGS : NO_REGS
- @internal Any integer register when integer DFmode moves are enabled.)
-
-(define_register_constraint Yx
- optimize_function_for_speed_p (cfun) ? GENERAL_REGS : NO_REGS
- @internal Any integer register when integer XFmode moves are enabled.)
-
 (define_register_constraint Yf
  (ix86_fpmath  FPMATH_387) ? FLOAT_REGS : NO_REGS
  @internal Any x87 register when 80387 FP arithmetic is enabled.)
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 218129)
+++ config/i386/i386.md (working copy)
@@ -816,6 +816,7 @@
]
(const_int 1)))
 
+(define_attr preferred_for_size  (const_int 1))
 (define_attr preferred_for_speed  (const_int 1))
 
 ;; Describe a user's asm statement.
@@ -2811,8 +2812,8 @@
 })
 
 (define_insn *pushxf
-  [(set (match_operand:XF 0 push_operand =,)
-   (match_operand:XF 1 general_no_elim_operand f,Yx*roF))]
+  [(set (match_operand:XF 0 push_operand =,,,)
+   (match_operand:XF 1 general_no_elim_operand f,r,*r,oF))]
   
 {
   /* This insn should be already split before reg-stack.  */
@@ -2819,14 +2820,18 @@
   gcc_unreachable ();
 }
   [(set_attr type multi)
-   (set_attr unit i387,*)
+   (set_attr unit i387,*,*,*)
(set (attr mode)
-   (cond [(eq_attr alternative 1)
+   (cond [(eq_attr alternative 1,2,3)
 (if_then_else (match_test TARGET_64BIT)
   (const_string DI)
   (const_string SI))
  ]
- (const_string XF)))])
+ (const_string XF)))
+   (set (attr preferred_for_size)
+ (cond [(eq_attr alternative 1)
+  (symbol_ref false)]
+   (symbol_ref true)))])
 
 ;; %%% Kill this when call knows how to work this out.
 (define_split
@@ -2842,18 +2847,26 @@
 })
 
 (define_insn *pushdf
-  [(set (match_operand:DF 0 push_operand =,,,)
-   (match_operand:DF 1 general_no_elim_operand f,Yd*roF,rmF,x))]
+  [(set (match_operand:DF 0 push_operand =,)
+   (match_operand:DF 1 general_no_elim_operand f,r,*r,oF,rmF,x))]
   
 {
   /* This insn should be already split before reg-stack.  */
   gcc_unreachable ();
 }
-  [(set_attr isa *,nox64,x64,sse2)
+  [(set_attr isa *,nox64,nox64,nox64,x64,sse2)
(set_attr type multi)
-   (set_attr unit i387,*,*,sse)
-   (set_attr mode DF,SI,DI,DF)])
-
+   (set_attr unit i387,*,*,*,*,sse)
+   (set_attr mode DF,SI,SI,SI,DI,DF)
+   (set (attr preferred_for_size)
+ (cond [(eq_attr alternative 1)
+  (symbol_ref false)]
+   (symbol_ref true)))
+   (set (attr preferred_for_speed)
+ (cond [(eq_attr alternative 1)
+  (symbol_ref TARGET_INTEGER_DFMODE_MOVES)]
+   (symbol_ref true)))])
+   
 ;; %%% Kill this when call knows how to work this out.
 

Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Manuel López-Ibáñez
On 27 November 2014 at 22:33, Manuel López-Ibáñez lopeziba...@gmail.com wrote:
 On 27 November 2014 at 20:28, Tobias Burnus bur...@net-b.de wrote:
 I think the approach is fine. As the _now version overrides the buffer, one
 might even do with a single buffer by clearing it, setting flush_p
 temporarily to true and printing the message. It only might collide with
 buffered warnings (for error_now) and errors (for warning_now), but I don't
 see whether that's an issue. For warnings/error_now it probably isn't, for
 errors/warning_now, it might. Thus, having two buffers is probably better.

 Oh, I didn't notice that the _now versions override the buffered
 messages. Where do you see that? It seems that gfc_warning_1 saves and
 restores buffer_flag but does not touch the buffers (it prints
 directly to the stream).

I meant gfc_warning_now_1. However, gfc_error_now_1 does reset
error_buffer (and sets it as curr_buffer, why if without buffering it
is not used?) . Hum, if this is really necessary, I'm afraid I just
introduced a subtle bug, since the new gfc_error_now does not reset
error_buffer, thus, potentially it may give an error that should have
been discarded. Tobias, what do you think?


Re: PR 13631 Problems in messages

2014-11-27 Thread François Dumont

Hi

Here is a revisited version. I finally go without compiling in 
C++11 as I prefer to use __builtin_alloca instead of using 
std::unique_ptr and heap memory. I also realized that I could use 
codecvt::max_length to know the max number of multibytes.


Tested under Linux x86_64 with all locales installed. There are 
some failures but not coming from this patch, will surely be the subject 
of another mail.


2014-11-28  François Dumont  fdum...@gcc.gnu.org

DR libstdc++/13631
* include/bits/codecvt.h (__get_c_locale): New.
* config/locale/gnu/messages_member.h, messages_member.cc: Prefer
dgettext usage to gettext to allow usage of several catalogs at the
same time. Add an internal cache to make catalog names to catalog ids.
Add charset management.
* testsuite/22_locale/messages/13631.cc: New.
* testsuite/22_locale/messages/members/char/2.cc: Use also fr_FR locale
for charset conversion to get the expected accentuated character.

Do we need any abi tag ? Do we need to wait ?

François


On 24/11/2014 01:23, Jonathan Wakely wrote:

On 24/11/14 00:13 +0100, François Dumont wrote:

Hello

   As we are at doing some evolution in the ABI I would like to take 
the opportunity to merge branch libstdcxx_so_7-2. The first fix was 


I don't think we want to merge everything, but it's certainly worth
looking to see if there are some fixes on that branch worth taking.

It would have been better to do during stage 1 though :-\

about a messages facet issue. So here is the version for the trunk 
which is the one from the branch plus management of the charset part. 
This way messageswchar_t works too.


   There are still some uncovered points in this patch:
- I had to make codecvt _M_c_locale_codecvt public to access it from 
the messages facet code. How do you want to handle it properly ? A 
friend function to access it or do I make messages facet friend ?
- I haven't use the ABI tag yet. I know that there is a plan to tag 
locale facets, will it work for what I am doing ?


Unless I'm missing something you're not making any ABI changes to
std::messages, just making the definitiosn of some functions no longer
inline.

   Note that I could use std::tuple instead of combination of 
std::pair and std::unique_ptr instead of wchar_buffer if we were 
building it in C++11 mode. Is it possible ?


Yes, the symlink to the messages_members.cc file would need to be
moved from src/c++98/Makefile.am to src/c++11/Makefile.am


Index: include/bits/locale_facets_nonio.h
===
--- include/bits/locale_facets_nonio.h(revision 217816)
+++ include/bits/locale_facets_nonio.h(working copy)
@@ -1842,22 +1842,6 @@
  */
  virtual void
  do_close(catalog) const;
-
-  // Returns a locale and codeset-converted string, given a 
char* message.

-  char*
-  _M_convert_to_char(const string_type __msg) const
-  {
-// XXX
-return reinterpret_castchar*(const_cast_CharT*(__msg.c_str()));
-  }
-
-  // Returns a locale and codeset-converted string, given a 
char* message.

-  string_type
-  _M_convert_from_char(char*) const
-  {
-// XXX
-return string_type();
-  }
 };


Those members are used by the ieee_1003.1-2001 locale.




Index: config/locale/gnu/messages_members.cc
===
--- config/locale/gnu/messages_members.cc	(revision 218027)
+++ config/locale/gnu/messages_members.cc	(working copy)
@@ -31,54 +31,272 @@
 #include locale
 #include bits/c++locale_internal.h
 
+#include algorithm
+#include utility
+
+#include ext/concurrence.h
+
+namespace
+{
+  using namespace std;
+
+  class Catalogs
+  {
+  public:
+typedef messages_base::catalog catalog_id;
+
+struct catalog_info
+{
+  catalog_info()
+  { }
+
+  catalog_info(catalog_id __id, const char* __domain, locale __loc)
+	: _M_id(__id), _M_domain(__domain), _M_locale(__loc)
+  { }
+
+  catalog_id _M_id;
+  const char* _M_domain;
+  locale _M_locale;
+};
+
+typedef pairconst char*, locale result_type;
+
+Catalogs() : _M_counter(0), _M_nb_entry(0) { }
+
+~Catalogs()
+{
+  if (_M_nb_entry)
+	{
+	  for (size_t i = 0; i != _M_nb_entry; ++i)
+	delete[] _M_map[i]._M_domain;
+	  delete[] _M_map;
+	}
+}
+
+catalog_id
+_M_add(const string __domain, locale __l)
+{
+  __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+  catalog_info* __new_map = new catalog_info[_M_nb_entry + 1];
+  __try
+	{
+	  copy(_M_map, _M_map + _M_nb_entry, __new_map);
+	  char* __domain_copy = new char[__domain.size() + 1];
+	  __domain.copy(__domain_copy, __domain.size());
+	  __domain_copy[__domain.size()] = 0;
+	  __new_map[_M_nb_entry] = catalog_info(_M_counter, __domain_copy, __l);
+	}
+  __catch(...)
+	{
+	  delete[] __new_map;
+	  __throw_exception_again;
+	}
+
+  // The counter is not likely to 

Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Tobias Burnus

Manuel López-Ibáñez wrote:
Oh, I didn't notice that the _now versions override the buffered 
messages. Where do you see that?


I think I messed up a bit - they do not seem to get cleared. I was 
reading gfc_error_now_1, which has:


  error_buffer.flag = 1;
  error_buffer.index = 0;
  cur_error_buffer = error_buffer;

and I misread the error_buffer.index = 0 as resetting the output. 
However, this index = 0 is completely unused as one has unbuffered 
output and it only applies to the next error.


Actually, I wonder whether that can lead to printing an error message 
multiple times. Assume a buffered message, that's printed with 
gfc_error_check(), which sets flags = 0, but the buffer-message still 
contains the message. gfc_error_now_1 sets the flag to 1; it doesn't 
touch the buffer itself as it directly outputs the message. If one now 
calls gfc_error_check(), it outputs the previous buffer again. I wonder 
whether that's sometimes happens. I do recall seeing the same message 
multiple times without understanding why.


Thus, I withdraw the comment. And regarding error push/pop: I also 
missed that point.



Well, for nearly every Fortran program, at least for one line multiple
attempts have to be done by the parser. Thus, if you set a break point in
gfc_error, you will see syntax error messages which never make it to the
screen. As the test suite checks for excess errors, nearly every test-suite
file tests this.

Sure, but I would like to test specifically triggering and discarding
the gfc_warning that I converted (or another single one that you
suggest), if this were possible.


Hmm, theoretically, it would be possible. However, my feeling is that 
there is no such case. What would be needed is something which is 
ambiguous, the compiler assumes first the wrong kind of statement, 
fails, and retries is with a different kind of statement.


That's simple. However, either one has already aborted (gfc_error or 
NO_MATCH) *before* reaching a gfc_warning – or the code is valid and, 
hence, the buffered warning is printed.


Hence, sorry, I cannot deliver for gfc_warning.

For error, I'd could create one.



Some of those can probably be simply converted, others either need to
remain, or one has to setup a proper location (currently, using
gfc_warning_now would ICE), or one creates a work around and constructs
manually the error string (at least for the fprintf cases).

Why does it ICE? At worst it should give a wrong location, but the
computation of the offset is fairly similar to what Fortran already
does. Could  you give me a patch+testcase that ICEs?


Maybe it has been fixed by your latest patch, which added %L, but using
printf '\tend'  test.f90; gfortran -Wtabs test.f90
used to ICE for me. However, it no longer ICEs. Thus, the following 
patch could be committed:


--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -1391 +1391 @@ gfc_gobble_whitespace (void)
- gfc_warning_now_1 (Nonconforming tab character at %C);
+ gfc_warning_now (OPT_Wtabs, Nonconforming tab character 
at %C);



I also confirm that the !$OMP warning now works correctly.


However, the change

- gfc_warning_now_1 (Line truncated at %L, gfc_current_locus);
+ gfc_warning_now (Line truncated at %L, gfc_current_locus);

still triggers an ICE with gfortran -Wall test.f90 and with test.f 
(two locations in scanner.c; see attached test cases and attache patch). 
The backtrace for test.f90 is:


Aborted
0xb6c38f crash_signal
../../gcc/toplev.c:359
0x1244620 linemap_position_for_loc_and_offset(line_maps*, unsigned int, 
unsigned int)

../../libcpp/line-map.c:648
0x63247a gfc_format_decoder
../../gcc/fortran/error.c:991
0x1231e8c pp_format(pretty_printer*, text_info*)
../../gcc/pretty-print.c:616
0x122f6ec diagnostic_report_diagnostic(diagnostic_context*, 
diagnostic_info*)

../../gcc/diagnostic.c:820
0x63380b gfc_warning_now(char const*, ...)
../../gcc/fortran/error.c:1133
0x69aca9 gfc_next_char_literal(gfc_instring)
../../gcc/fortran/scanner.c:1059


Besides those, there is additionally:
  gfc_warning_now_1 (%s:%d: Illegal preprocessor directive,
which I haven't tested.

Tobias
print *, 'aa'
end 
  print *, ''
diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index 884fe70..5b42777 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -777,6 +777,6 @@ skip_free_comments (void)
 			  else
-			gfc_warning_now_1 (!$OMP at %C starts a commented 
-	   line as it neither is followed 
-	   by a space nor is a 
-	   continuation line);
+			gfc_warning_now (!$OMP at %C starts a commented 
+	 line as it neither is followed 
+	 by a space nor is a 
+	 continuation line);
 			}
@@ 

Re: [PATCH] Fix regressions in libgomp testsuite: set flag_fat_lto_objects for offload

2014-11-27 Thread Ilya Verbin
On 27 Nov 16:00, Richard Biener wrote:
 On Thu, Nov 27, 2014 at 2:31 PM, Ilya Verbin iver...@gmail.com wrote:
  On 25 Nov 10:51, Richard Biener wrote:
  In the patch adding flag_generate_offload sounds like a good solution,
  I didn't like emitting fat LTO objects unconditionally just because we 
  offload.
 
  Here is updated patch.  Bootstrap and make check passed on i686-linux and
  x86_64-linux.  Tests with offloading also passed with trunk binutils.
  OK for trunk?
 
 If it also works without the new __gnu_offload symbol then I'd prefer
 to re-use __gnu_LTO here.
 
 Ok with that change.

Do you mean a patch like this?  Yes, it also works.
Bootstrap and make check passed.  OK?

  -- Ilya


gcc/
* cgraphunit.c (ipa_passes): Handle flag_generate_offload.
(symbol_table::compile): Set flag_generate_offload if there is something
to offload.
* common.opt (flag_generate_offload): New Variable declaration.
* dwarf2out.c (dwarf2out_finish): Handle flag_generate_offload.
* ipa-inline-analysis.c (inline_generate_summary): Do not skip if
flag_generate_offload is set.
* lto-streamer.c (gate_lto_out): Handle flag_generate_offload.
* passes.c (ipa_write_summaries): Do not skip if flag_generate_offload
is set.
* toplev.c (compile_file): Emit LTO marker if offload info has been
previously emitted.  Do not emit lto_slim marker if
flag_generate_offload is without flag_generate_lto.
* tree.c (free_lang_data): Do not skip if flag_generate_offload is set.


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 2fd99a7..fed1a3e 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2075,7 +2075,7 @@ ipa_passes (void)
 }
 
   /* Some targets need to handle LTO assembler output specially.  */
-  if (flag_generate_lto)
+  if (flag_generate_lto || flag_generate_offload)
 targetm.asm_out.lto_start ();
 
   if (!in_lto_p)
@@ -2092,7 +2092,7 @@ ipa_passes (void)
}
 }
 
-  if (flag_generate_lto)
+  if (flag_generate_lto || flag_generate_offload)
 targetm.asm_out.lto_end ();
 
   if (!flag_ltrans  (in_lto_p || !flag_lto || flag_fat_lto_objects))
@@ -2176,10 +2176,10 @@ symbol_table::compile (void)
 
   /* Offloading requires LTO infrastructure.  */
   if (!in_lto_p  g-have_offload)
-flag_generate_lto = 1;
+flag_generate_offload = 1;
 
   /* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
-  if (flag_generate_lto)
+  if (flag_generate_lto || flag_generate_offload)
 lto_streamer_hooks_init ();
 
   /* Don't run the IPA passes if there was any error or sorry messages.  */
diff --git a/gcc/common.opt b/gcc/common.opt
index 41c8d4e..752d939 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -67,6 +67,10 @@ int *param_values
 Variable
 int flag_generate_lto
 
+; Nonzero if we should write GIMPLE bytecode for offload compilation.
+Variable
+int flag_generate_offload = 0
+
 ; True to warn about any objects definitions whose size is larger
 ; than N bytes.  Also want about function definitions whose returned
 ; values are larger than N bytes, where N is 'larger_than_size'.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 25f0e7d..4ee3102 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -24420,7 +24420,8 @@ dwarf2out_finish (const char *filename)
   /* When generating LTO bytecode we can not generate new assembler
  names at this point and all important decls got theirs via
 free-lang-data.  */
-  if ((!flag_generate_lto || DECL_ASSEMBLER_NAME_SET_P (decl))
+  if (((!flag_generate_lto  !flag_generate_offload)
+  || DECL_ASSEMBLER_NAME_SET_P (decl))
   DECL_ASSEMBLER_NAME (decl) != DECL_NAME (decl))
{
  add_linkage_attr (node-die, decl);
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 2f2993c..9d62722 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -4031,7 +4031,7 @@ inline_generate_summary (void)
 
   /* When not optimizing, do not bother to analyze.  Inlining is still done
  because edge redirection needs to happen there.  */
-  if (!optimize  !flag_generate_lto  !flag_wpa)
+  if (!optimize  !flag_generate_lto  !flag_generate_offload  !flag_wpa)
 return;
 
   function_insertion_hook_holder =
diff --git a/gcc/lto-streamer.c b/gcc/lto-streamer.c
index e8347dc..af20330 100644
--- a/gcc/lto-streamer.c
+++ b/gcc/lto-streamer.c
@@ -328,7 +328,7 @@ lto_streamer_init (void)
 bool
 gate_lto_out (void)
 {
-  return ((flag_generate_lto || in_lto_p)
+  return ((flag_generate_lto || flag_generate_offload || in_lto_p)
  /* Don't bother doing anything if the program has errors.  */
   !seen_error ());
 }
diff --git a/gcc/passes.c b/gcc/passes.c
index a3be0bb..74b40e5 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -2466,7 +2466,7 @@ ipa_write_summaries (bool offload_lto_mode)
   struct cgraph_node *node;
   struct cgraph_node **order;
 
-  if 

Re: [PATCH][OpenMP] Fix named critical sections inside target functions

2014-11-27 Thread Ilya Verbin
On 21 Nov 21:36, Jakub Jelinek wrote:
 On Fri, Nov 21, 2014 at 11:19:26PM +0300, Ilya Verbin wrote:
  '#pragma omp critical (name)' can be placed in the function, marked
  with '#pragma omp declare target', in this case the corresponding node
  should be marked as offloadable too.
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Please add a testcase for this.

Here is the testcase, based on critical-2.c, ok for trunk?


gcc/
* omp-low.c (lower_omp_critical): Mark critical sections
inside target functions as offloadable.

libgomp/
* testsuite/libgomp.c/target-critical-1.c: New test.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index dff1504..621958c 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9387,16 +9387,6 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
  DECL_ARTIFICIAL (decl) = 1;
  DECL_IGNORED_P (decl) = 1;
 
- /* If '#pragma omp critical' is inside target region, the symbol must
-be marked for offloading.  */
- omp_context *octx;
- for (octx = ctx-outer; octx; octx = octx-outer)
-   if (is_targetreg_ctx (octx))
- {
-   varpool_node::get_create (decl)-offloadable = 1;
-   break;
- }
-
  varpool_node::finalize_decl (decl);
 
  critical_name_mutexes-put (name, decl);
@@ -9404,6 +9394,20 @@ lower_omp_critical (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
   else
decl = *n;
 
+  /* If '#pragma omp critical' is inside target region or
+inside function marked as offloadable, the symbol must be
+marked as offloadable too.  */
+  omp_context *octx;
+  if (cgraph_node::get (current_function_decl)-offloadable)
+   varpool_node::get_create (decl)-offloadable = 1;
+  else
+   for (octx = ctx-outer; octx; octx = octx-outer)
+ if (is_targetreg_ctx (octx))
+   {
+ varpool_node::get_create (decl)-offloadable = 1;
+ break;
+   }
+
   lock = builtin_decl_explicit (BUILT_IN_GOMP_CRITICAL_NAME_START);
   lock = build_call_expr_loc (loc, lock, 1, build_fold_addr_expr_loc (loc, 
decl));
 
diff --git a/libgomp/testsuite/libgomp.c/target-critical-1.c 
b/libgomp/testsuite/libgomp.c/target-critical-1.c
new file mode 100644
index 000..84ad558
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/target-critical-1.c
@@ -0,0 +1,72 @@
+/* { dg-do run } */
+
+#include omp.h
+#include stdlib.h
+
+#define N 2000
+
+#pragma omp declare target
+int foo ()
+{
+  int A[N];
+  int i, nthreads;
+  int res = 0;
+
+  #pragma omp parallel shared (A, nthreads)
+{
+  #pragma omp master
+   nthreads = omp_get_num_threads ();
+
+  #pragma omp for
+   for (i = 0; i  N; i++)
+ A[i] = 0;
+
+  #pragma omp critical (crit1)
+for (i = 0; i  N; i++)
+ A[i]++;
+}
+
+  for (i = 0; i  N; i++)
+if (A[i] != nthreads)
+  res = 1;
+
+  return res;
+}
+#pragma omp end declare target
+
+int main ()
+{
+  int res1, res2;
+
+  #pragma omp target map (from: res1, res2)
+{
+  int B[N];
+  int i, nthreads;
+
+  res1 = foo ();
+
+  #pragma omp parallel shared (B, nthreads)
+   {
+ #pragma omp master
+   nthreads = omp_get_num_threads ();
+
+ #pragma omp for
+   for (i = 0; i  N; i++)
+ B[i] = 0;
+
+ #pragma omp critical (crit2)
+   for (i = 0; i  N; i++)
+ B[i]++;
+   }
+
+  res2 = 0;
+  for (i = 0; i  N; i++)
+   if (B[i] != nthreads)
+ res2 = 1;
+}
+
+  if (res1 || res2)
+abort ();
+
+  return 0;
+}

  -- Ilya


Re: [PATCH][ARM] Add Cortex-A17 support

2014-11-27 Thread Gerald Pfeifer

On Thursday 2014-11-27 11:09, Kyrill Tkachov wrote:
Thanks for the review. The invoke.texi changes were posted and 
approved at https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02683.html 
and I'm attaching the proposed changes.html patch.


Looks good.

Note that once you have has been added and once was added.
I _think_ the former may be more appropriate, but let's one of
our native speakers comment in case.

GeraldIndex: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.43
diff -U 3 -r1.43 changes.html
--- htdocs/gcc-5/changes.html	27 Nov 2014 09:50:04 -	1.43
+++ htdocs/gcc-5/changes.html	27 Nov 2014 09:57:19 -
@@ -398,6 +398,14 @@
   configure option with either of code--with-tune/code or
   code--with-arch/code.
   /li
+  li Support for the Cortex-A17 processor has been added through the
+  code-mcpu=cortex-a17/code and code-mtune=cortex-a17/code options.
+  /li
+  li Initial big.LITTLE tuning support for the combination of Cortex-A17
+   and Cortex-A7 was added through the code-mcpu=cortex-a17.cortex-a7
+   /code and code-mtune=cortex-a17.cortex-a7/code options.
+  /li
+
  /ul
 
 h3 id=x86IA-32/x86-64/h3

Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2014-11-27 Thread Ilya Verbin
On 11 Oct 18:49, Ilya Verbin wrote:
   (run_gcc): Outline options handling into the new functions:
   find_and_merge_options, append_compiler_options, append_linker_options.
 ...
 @@ -625,18 +893,13 @@ run_gcc (unsigned argc, char *argv[])
/* Look at saved options in the IL files.  */
for (i = 1; i  argc; ++i)
  {
 ...
 +  have_lto
 + = find_and_merge_options (fd, file_offset, LTO_SECTION_NAME_PREFIX,
 +   fdecoded_options, fdecoded_options_count,
 +   collect_gcc);
 +  have_offload
 + = find_and_merge_options (fd, file_offset, OFFLOAD_SECTION_NAME_PREFIX,
 +   offload_fdecoded_options,
 +   offload_fdecoded_options_count, collect_gcc);
close (fd);
  }

I found a bug here, have_{lto,offload} must be set if at least one file contains
lto/offload sections, but currently they are overwritten by the last file.
Fix is bootstrapped and regtested on x86_64-linux.  OK for trunk?


gcc/
* lto-wrapper.c (run_gcc): Set have_lto and have_offload if at least one
file contains sections with LTO and offload IR, respectively.


diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 9a540b4..0f69457 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -921,13 +921,14 @@ run_gcc (unsigned argc, char *argv[])
continue;
 
   have_lto
-   = find_and_merge_options (fd, file_offset, LTO_SECTION_NAME_PREFIX,
- fdecoded_options, fdecoded_options_count,
- collect_gcc);
+   |= find_and_merge_options (fd, file_offset, LTO_SECTION_NAME_PREFIX,
+  fdecoded_options, fdecoded_options_count,
+  collect_gcc);
   have_offload
-   = find_and_merge_options (fd, file_offset, OFFLOAD_SECTION_NAME_PREFIX,
- offload_fdecoded_options,
- offload_fdecoded_options_count, collect_gcc);
+   |= find_and_merge_options (fd, file_offset, OFFLOAD_SECTION_NAME_PREFIX,
+  offload_fdecoded_options,
+  offload_fdecoded_options_count,
+  collect_gcc);
   close (fd);
 }
 

  -- Ilya


[PATCH] rs6000: fix for PR64093

2014-11-27 Thread Segher Boessenkool
A slight oversight.  andmode3_imm is predicated by rs6000_gen_cell_microcode
so we had better not generate it otherwise; and in that case, we need to force
the constant to a register.

Bootstrap and testsuite run are fine (of course, since this isn't
tested at all); comparing testsuite runs between -mcpu=cell and
default shows a few thousand fails (mainly ICEs in dfp), and no
differences between with and without this patch.

But this PR was a failure in Linux kernel build; I checked it did fail
to build without this patch, and works with (I built cell_defconfig).

Okay for trunk?


Segher


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
PR target/64093
* config/rs6000/rs6000.md (andmode3): Don't generate
andmode3_imm unless rs6000_gen_cell_microcode is true.


---
 gcc/config/rs6000/rs6000.md | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c8c3a68..f3b5aae 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2646,8 +2646,13 @@ (define_expand andmode3
   if (logical_const_operand (operands[2], MODEmode)
!any_mask_operand (operands[2], MODEmode))
 {
-  emit_insn (gen_andmode3_imm (operands[0], operands[1], operands[2]));
-  DONE;
+  if (rs6000_gen_cell_microcode)
+   {
+ emit_insn (gen_andmode3_imm (operands[0], operands[1], 
operands[2]));
+ DONE;
+   }
+  else
+   operands[2] = force_reg (MODEmode, operands[2]);
 }
 
   if ((MODEmode == DImode  !and64_2_operand (operands[2], MODEmode))
-- 
1.8.1.4



[PATCH 0/6] various combine patches, v2

2014-11-27 Thread Segher Boessenkool
The first three of these are from the previous series, with hopefully
all comments addressed.

The fourth simplifies the code that breaks up an arith insn with a
compare to zero, using the helpers that the third patch introduced.

The fifth and the sixth are separate from the rest and each other;
they fix PRs that people optimistically thought this patch series
would fix.  Well, now they are right ;-)


Segher Boessenkool (6):
  combine: add regno field to LOG_LINKS
  combine: distribute_log_links for PARALLELs of SETs
  combine: handle I2 a parallel of two SETs
  combine: simplify the CC thing code
  combine: handle REG_UNUSED in reg_dead_at_p (PR59278)
  combine: allow combining two insns into two in some cases (PR52714)

 gcc/combine.c | 326 +-
 1 file changed, 229 insertions(+), 97 deletions(-)

-- 
1.8.1.4



[PATCH 1/6] combine: add regno field to LOG_LINKS

2014-11-27 Thread Segher Boessenkool
With this new field in place, we can have LOG_LINKS for insns that set
more than one register and distribute them properly in distribute_links.
This then allows many more PARALLELs to be combined.

Also split off new functions can_combine_{def,use}_p from the
create_log_links function.

v2: Add used_between_p check to the only case where I2 can be a multiple
set and still be combined: we have a log_link for the first use of each
def now, so try_combine can be called with I3 later than the first use
of one of the sets, so we better check if we have multiple sets.


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (struct insn_link): New field `regno'.
(alloc_insn_link): New parameter `regno'.  Use it.
(find_single_use): Check the new field.
(can_combine_def_p, can_combine_use_p): New functions.  Split
off from ...
(create_log_links): ... here.  Correct data type of `regno'.
Adjust call to alloc_insn_link.
(adjust_for_new_dest): Find regno, use it in call to
alloc_insn_link.
(try_combine): Check reg_used_between_p when combining a PARALLEL
as earlier insn.  Adjust call to alloc_insn_link.
(distribute_links): Check the new field.


---
 gcc/combine.c | 144 --
 1 file changed, 89 insertions(+), 55 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 1808f97..9b53a93 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -328,6 +328,7 @@ static int *uid_insn_cost;
 
 struct insn_link {
   rtx_insn *insn;
+  unsigned int regno;
   struct insn_link *next;
 };
 
@@ -346,12 +347,13 @@ static struct obstack insn_link_obstack;
 /* Allocate a link.  */
 
 static inline struct insn_link *
-alloc_insn_link (rtx_insn *insn, struct insn_link *next)
+alloc_insn_link (rtx_insn *insn, unsigned int regno, struct insn_link *next)
 {
   struct insn_link *l
 = (struct insn_link *) obstack_alloc (insn_link_obstack,
  sizeof (struct insn_link));
   l-insn = insn;
+  l-regno = regno;
   l-next = next;
   return l;
 }
@@ -686,7 +688,7 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc)
 if (INSN_P (next)  dead_or_set_p (next, dest))
   {
FOR_EACH_LOG_LINK (link, next)
- if (link-insn == insn)
+ if (link-insn == insn  link-regno == REGNO (dest))
break;
 
if (link)
@@ -982,6 +984,43 @@ delete_noop_moves (void)
 }
 
 
+/* Return false if we do not want to (or cannot) combine DEF.  */
+static bool
+can_combine_def_p (df_ref def)
+{
+  /* Do not consider if it is pre/post modification in MEM.  */
+  if (DF_REF_FLAGS (def)  DF_REF_PRE_POST_MODIFY)
+return false;
+
+  unsigned int regno = DF_REF_REGNO (def);
+
+  /* Do not combine frame pointer adjustments.  */
+  if ((regno == FRAME_POINTER_REGNUM
+(!reload_completed || frame_pointer_needed))
+#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
+  || (regno == HARD_FRAME_POINTER_REGNUM
+  (!reload_completed || frame_pointer_needed))
+#endif
+#if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
+  || (regno == ARG_POINTER_REGNUM  fixed_regs[regno])
+#endif
+  )
+return false;
+
+  return true;
+}
+
+/* Return false if we do not want to (or cannot) combine USE.  */
+static bool
+can_combine_use_p (df_ref use)
+{
+  /* Do not consider the usage of the stack pointer by function call.  */
+  if (DF_REF_FLAGS (use)  DF_REF_CALL_STACK_USAGE)
+return false;
+
+  return true;
+}
+
 /* Fill in log links field for all insns.  */
 
 static void
@@ -1015,67 +1054,46 @@ create_log_links (void)
 
  FOR_EACH_INSN_DEF (def, insn)
 {
-  int regno = DF_REF_REGNO (def);
+  unsigned int regno = DF_REF_REGNO (def);
   rtx_insn *use_insn;
 
   if (!next_use[regno])
 continue;
 
-  /* Do not consider if it is pre/post modification in MEM.  */
-  if (DF_REF_FLAGS (def)  DF_REF_PRE_POST_MODIFY)
-continue;
+ if (!can_combine_def_p (def))
+   continue;
 
-  /* Do not make the log link for frame pointer.  */
-  if ((regno == FRAME_POINTER_REGNUM
-(! reload_completed || frame_pointer_needed))
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  || (regno == HARD_FRAME_POINTER_REGNUM
-   (! reload_completed || frame_pointer_needed))
-#endif
-#if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
-  || (regno == ARG_POINTER_REGNUM  fixed_regs[regno])
-#endif
-  )
-continue;
+ use_insn = next_use[regno];
+ next_use[regno] = NULL;
 
-  use_insn = next_use[regno];
-  if (BLOCK_FOR_INSN (use_insn) == bb)
-{
-  /* flow.c claimed:
-
- We don't build a LOG_LINK 

[PATCH 4/6] combine: simplify the CC thing code

2014-11-27 Thread Segher Boessenkool
[What a great subject line].

This simplifies the code right before the new code of the previous patch
to use one of the new helper functions that introduces.


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (try_combine): Use is_parallel_of_n_reg_sets some more.


---
 gcc/combine.c | 42 --
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 6c94e53..83a0a10 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2832,43 +2832,33 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
  This undoes a previous combination and allows us to match a branch-and-
  decrement insn.  */
 
-  if (i1 == 0  GET_CODE (PATTERN (i2)) == PARALLEL
-   XVECLEN (PATTERN (i2), 0) = 2
-   GET_CODE (XVECEXP (PATTERN (i2), 0, 0)) == SET
+  if (i1 == 0
+   is_parallel_of_n_reg_sets (i2, 2)
(GET_MODE_CLASS (GET_MODE (SET_DEST (XVECEXP (PATTERN (i2), 0, 0
  == MODE_CC)
GET_CODE (SET_SRC (XVECEXP (PATTERN (i2), 0, 0))) == COMPARE
XEXP (SET_SRC (XVECEXP (PATTERN (i2), 0, 0)), 1) == const0_rtx
-   GET_CODE (XVECEXP (PATTERN (i2), 0, 1)) == SET
-   REG_P (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)))
rtx_equal_p (XEXP (SET_SRC (XVECEXP (PATTERN (i2), 0, 0)), 0),
  SET_SRC (XVECEXP (PATTERN (i2), 0, 1)))
!reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
!reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3))
 {
-  for (i = XVECLEN (PATTERN (i2), 0) - 1; i = 2; i--)
-   if (GET_CODE (XVECEXP (PATTERN (i2), 0, i)) != CLOBBER)
- break;
-
-  if (i == 1)
-   {
- /* We make I1 with the same INSN_UID as I2.  This gives it
-the same DF_INSN_LUID for value tracking.  Our fake I1 will
-never appear in the insn stream so giving it the same INSN_UID
-as I2 will not cause a problem.  */
+  /* We make I1 with the same INSN_UID as I2.  This gives it
+the same DF_INSN_LUID for value tracking.  Our fake I1 will
+never appear in the insn stream so giving it the same INSN_UID
+as I2 will not cause a problem.  */
 
- i1 = gen_rtx_INSN (VOIDmode, NULL, i2, BLOCK_FOR_INSN (i2),
-XVECEXP (PATTERN (i2), 0, 1), INSN_LOCATION (i2),
--1, NULL_RTX);
- INSN_UID (i1) = INSN_UID (i2);
+  i1 = gen_rtx_INSN (VOIDmode, NULL, i2, BLOCK_FOR_INSN (i2),
+XVECEXP (PATTERN (i2), 0, 1), INSN_LOCATION (i2),
+-1, NULL_RTX);
+  INSN_UID (i1) = INSN_UID (i2);
 
- SUBST (PATTERN (i2), XVECEXP (PATTERN (i2), 0, 0));
- SUBST (XEXP (SET_SRC (PATTERN (i2)), 0),
-SET_DEST (PATTERN (i1)));
- unsigned int regno = REGNO (SET_DEST (PATTERN (i1)));
- SUBST_LINK (LOG_LINKS (i2),
- alloc_insn_link (i1, regno, LOG_LINKS (i2)));
-   }
+  SUBST (PATTERN (i2), XVECEXP (PATTERN (i2), 0, 0));
+  SUBST (XEXP (SET_SRC (PATTERN (i2)), 0),
+SET_DEST (PATTERN (i1)));
+  unsigned int regno = REGNO (SET_DEST (PATTERN (i1)));
+  SUBST_LINK (LOG_LINKS (i2),
+ alloc_insn_link (i1, regno, LOG_LINKS (i2)));
 }
 
   /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
-- 
1.8.1.4



[PATCH 2/6] combine: distribute_log_links for PARALLELs of SETs

2014-11-27 Thread Segher Boessenkool
Now that LOG_LINKS are per regno, we can distribute them on PARALLELs
just fine.  Do so.  This makes PARALLELs not lose their LOG_LINKS early
when e.g. a trivial reg-reg move is combined, so that they can be used
in more useful combinations as well.


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (distribute_links): Handle multiple SETs.

---
 gcc/combine.c | 52 +---
 1 file changed, 37 insertions(+), 15 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 9b53a93..7e3f4e6 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -13817,24 +13817,46 @@ distribute_links (struct insn_link *links)
 
   next_link = link-next;
 
-  /* If the insn that this link points to is a NOTE or isn't a single
-set, ignore it.  In the latter case, it isn't clear what we
-can do other than ignore the link, since we can't tell which
-register it was for.  Such links wouldn't be used by combine
-anyway.
-
-It is not possible for the destination of the target of the link to
-have been changed by combine.  The only potential of this is if we
-replace I3, I2, and I1 by I3 and I2.  But in that case the
-destination of I2 also remains unchanged.  */
-
-  if (NOTE_P (link-insn)
- || (set = single_set (link-insn)) == 0)
+  /* If the insn that this link points to is a NOTE, ignore it.  */
+  if (NOTE_P (link-insn))
+   continue;
+
+  set = 0;
+  rtx pat = PATTERN (link-insn);
+  if (GET_CODE (pat) == SET)
+   set = pat;
+  else if (GET_CODE (pat) == PARALLEL)
+   {
+ int i;
+ for (i = 0; i  XVECLEN (pat, 0); i++)
+   {
+ set = XVECEXP (pat, 0, i);
+ if (GET_CODE (set) != SET)
+   continue;
+
+ reg = SET_DEST (set);
+ while (GET_CODE (reg) == ZERO_EXTRACT
+|| GET_CODE (reg) == STRICT_LOW_PART
+|| GET_CODE (reg) == SUBREG)
+   reg = XEXP (reg, 0);
+
+ if (!REG_P (reg))
+   continue;
+
+ if (REGNO (reg) == link-regno)
+   break;
+   }
+ if (i == XVECLEN (pat, 0))
+   continue;
+   }
+  else
continue;
 
   reg = SET_DEST (set);
-  while (GET_CODE (reg) == SUBREG || GET_CODE (reg) == ZERO_EXTRACT
-|| GET_CODE (reg) == STRICT_LOW_PART)
+
+  while (GET_CODE (reg) == ZERO_EXTRACT
+|| GET_CODE (reg) == STRICT_LOW_PART
+|| GET_CODE (reg) == SUBREG)
reg = XEXP (reg, 0);
 
   /* A LOG_LINK is defined as being placed on the first insn that uses
-- 
1.8.1.4



Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Manuel López-Ibáñez
On 27 November 2014 at 23:58, Tobias Burnus bur...@net-b.de wrote:
 Manuel López-Ibáñez wrote:
 Sure, but I would like to test specifically triggering and discarding
 the gfc_warning that I converted (or another single one that you
 suggest), if this were possible.


 Hmm, theoretically, it would be possible. However, my feeling is that there
 is no such case. What would be needed is something which is ambiguous, the
 compiler assumes first the wrong kind of statement, fails, and retries is
 with a different kind of statement.

 That's simple. However, either one has already aborted (gfc_error or
 NO_MATCH) *before* reaching a gfc_warning - or the code is valid and, hence,
 the buffered warning is printed.

Then, I don't understand the point of gfc_warning.

If gfc_warning is never reached, then we could have used gfc_warning_now.
If the warning is printed, then gfc_warning was reached, thus we could
have used gfc_warning_now.

The only difference is that gfc_warning_now will print the warning as
soon as gfc_warning is reached, but since there is at most one
buffered warning and there can not be a previous error, then this
should not make a difference. What am I missing here?

 Why does it ICE? At worst it should give a wrong location, but the
 computation of the offset is fairly similar to what Fortran already
 does. Could  you give me a patch+testcase that ICEs?

 However, the change

 - gfc_warning_now_1 (Line truncated at %L, gfc_current_locus);
 + gfc_warning_now (Line truncated at %L, gfc_current_locus);

 still triggers an ICE with gfortran -Wall test.f90 and with test.f (two
 locations in scanner.c; see attached test cases and attache patch). The
 backtrace for test.f90 is:

The problem seems to be that linemap_position_for_loc_and_offset
cannot handle an offset larger than the column_hint. Fortran hardcodes
that to 120. There are two bugs there: (1) we should not crash, at
worst we should use offset=0 and (2) we should update the column_hint
to the largest column seen. I'm testing a patch that seems to fix the
problem.

Cheers,

Manuel.


[PATCH 3/6] combine: handle I2 a parallel of two SETs

2014-11-27 Thread Segher Boessenkool
If I2 is a PARALLEL of two SETs, split it into two instructions, I1
and I2.  If there already was an I1, rename it to I0.  If there
already was an I0, don't do anything.

This surprisingly simple patch is enough to let combine handle such
PARALLELs properly.

v2: Add some functions to make the checking for a suitable PARALLEL
more readable (and more general).


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (is_parallel_of_n_reg_sets): New function.
(can_split_parallel_of_n_reg_sets): New function.
(try_combine): If I2 is a PARALLEL of two SETs, split it into
two insns if possible.


---
 gcc/combine.c | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index 7e3f4e6..6c94e53 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2461,6 +2461,59 @@ update_cfg_for_uncondjump (rtx_insn *insn)
 }
 }
 
+/* Return whether INSN is a PARALLEL of exactly N register SETs followed
+   by an arbitrary number of CLOBBERs.  */
+static bool
+is_parallel_of_n_reg_sets (rtx_insn *insn, int n)
+{
+  rtx pat = PATTERN (insn);
+
+  if (GET_CODE (pat) != PARALLEL)
+return false;
+
+  int len = XVECLEN (pat, 0);
+  if (len  n)
+return false;
+
+  int i;
+  for (i = 0; i  n; i++)
+if (GET_CODE (XVECEXP (pat, 0, i)) != SET
+   || !REG_P (SET_DEST (XVECEXP (pat, 0, i
+  return false;
+  for ( ; i  len; i++)
+if (GET_CODE (XVECEXP (pat, 0, i)) != CLOBBER)
+  return false;
+
+  return true;
+}
+
+/* Return whether INSN, a PARALLEL of N register SETs (and maybe some
+   CLOBBERs), can be split into individual SETs in that order, without
+   changing semantics.  */
+static bool
+can_split_parallel_of_n_reg_sets (rtx_insn *insn, int n)
+{
+  if (!insn_nothrow_p (insn))
+return false;
+
+  rtx pat = PATTERN (insn);
+
+  int i, j;
+  for (i = 0; i  n; i++)
+{
+  if (side_effects_p (SET_SRC (XVECEXP (pat, 0, i
+   return false;
+
+  rtx reg = SET_DEST (XVECEXP (pat, 0, i));
+
+  for (j = i + 1; j  n; j++)
+   if (reg_referenced_p (reg, XVECEXP (pat, 0, j)))
+ return false;
+}
+
+  return true;
+}
+
 /* Try to combine the insns I0, I1 and I2 into I3.
Here I0, I1 and I2 appear earlier than I3.
I0 and I1 can be zero; then we combine just I2 into I3, or I1 and I2 into
@@ -2817,6 +2870,31 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
  alloc_insn_link (i1, regno, LOG_LINKS (i2)));
}
 }
+
+  /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
+ make those two SETs separate I1 and I2 insns, and make an I0 that is
+ the original I1.  */
+  if (i0 == 0
+   is_parallel_of_n_reg_sets (i2, 2)
+   can_split_parallel_of_n_reg_sets (i2, 2)
+   !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 0)), i2, i3)
+   !reg_used_between_p (SET_DEST (XVECEXP (PATTERN (i2), 0, 1)), i2, i3))
+{
+  /* If there is no I1, there is no I0 either.  */
+  i0 = i1;
+
+  /* We make I1 with the same INSN_UID as I2.  This gives it
+the same DF_INSN_LUID for value tracking.  Our fake I1 will
+never appear in the insn stream so giving it the same INSN_UID
+as I2 will not cause a problem.  */
+
+  i1 = gen_rtx_INSN (VOIDmode, NULL, i2, BLOCK_FOR_INSN (i2),
+XVECEXP (PATTERN (i2), 0, 0), INSN_LOCATION (i2),
+-1, NULL_RTX);
+  INSN_UID (i1) = INSN_UID (i2);
+
+  SUBST (PATTERN (i2), XVECEXP (PATTERN (i2), 0, 1));
+}
 #endif
 
   /* Verify that I2 and I1 are valid for combining.  */
-- 
1.8.1.4



[PATCH 5/6] combine: handle REG_UNUSED in reg_dead_at_p (PR59278)

2014-11-27 Thread Segher Boessenkool
Currently reg_dead_at_p returns 0 for registers that are set but never
used.  This patch repairs that oversight.  This fixes PR59278.


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
PR rtl-optimization/59278
combine (reg_dead_at_p): Consider REG_UNUSED notes.

---
 gcc/combine.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index 83a0a10..91ddff4 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -12990,6 +12990,9 @@ reg_dead_at_p (rtx reg, rtx_insn *insn)
 {
   if (INSN_P (insn))
 {
+ if (find_regno_note (insn, REG_UNUSED, reg_dead_regno))
+   return 1;
+
  note_stores (PATTERN (insn), reg_dead_at_p_1, NULL);
  if (reg_dead_flag)
return reg_dead_flag == 1 ? 1 : 0;
-- 
1.8.1.4



[PATCH 6/6] combine: allow combining two insns into two in some cases (PR52714)

2014-11-27 Thread Segher Boessenkool
If one of the resulting insns is a noop set it is safe to allow the
combination, for it will never lead to a loop: any following combination
using that noop will delete it.  In fact, in most cases combine deletes
the noop immediately; for cc0 targets it does not in some cases, but it
deletes another insn in that case (a following jump insn).

This fixes PR52714.


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
PR rtl-optimization/52714
* combine.c (try_combine): Allow combining two insns into two
new insns if at least one of those is a noop.

---
 gcc/combine.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 91ddff4..66007d8 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -3812,15 +3812,20 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
   /* Similarly, check for a case where we have a PARALLEL of two independent
  SETs but we started with three insns.  In this case, we can do the sets
  as two separate insns.  This case occurs when some SET allows two
- other insns to combine, but the destination of that SET is still live.  */
+ other insns to combine, but the destination of that SET is still live.
 
-  else if (i1  insn_code_number  0  asm_noperands (newpat)  0
+ Also do this if we started with two insns and (at least) one of the
+ resulting sets is a noop; this noop will be deleted later.  */
+
+  else if (insn_code_number  0  asm_noperands (newpat)  0
GET_CODE (newpat) == PARALLEL
XVECLEN (newpat, 0) == 2
GET_CODE (XVECEXP (newpat, 0, 0)) == SET
+   GET_CODE (XVECEXP (newpat, 0, 1)) == SET
+   (i1 || set_noop_p (XVECEXP (newpat, 0, 0))
+ || set_noop_p (XVECEXP (newpat, 0, 1)))
GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
-   GET_CODE (XVECEXP (newpat, 0, 1)) == SET
GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
-- 
1.8.1.4



Re: [PATCH, ARM] Fix PR63718, Thumb1 bootstrap -- disable fuse-caller-save for Thumb1

2014-11-27 Thread Joey Ye
This work around brings thumb1 bootstrap back. Though I cannot
approve, I would like it in stage 3. A complete fix in thumb1 pattern
will be welcomed.

Thanks,
Joey

On Thu, Nov 20, 2014 at 7:54 PM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,

 This patch fixes PR63718, which currently breaks Thumb1 bootstrap.

 The problem is that in Thumb1 mode, we emit the epilogue in RTL, but the
 last insn - epilogue_insns - does not accurately model the corresponding
 insns
 emitted in the asm file. F.i., the asm file may contain an insn:
 ...
   pop {r0}
 
 while the corresponding RTL pattern looks like this:
 ...
 (jump_insn (unspec_volatile [
 (return)
  ] VUNSPEC_EPILOGUE))
 ...

 As a consequence, the epilogue may clobber registers without
 fuse-caller-save being able to analyze that.

 Adding the missing clobbers to epilogue_insns is not trivial, and probably
 not a good idea for stage3. The patch works around the problem by disabling
 fuse-caller-save in Thumb1 mode.

 Build and reg-tested on arm-none-eabi.

 OK for stage3?

 Thanks,
 - Tom


Re: [PATCH] rs6000: fix for PR64093

2014-11-27 Thread David Edelsohn
On Thu, Nov 27, 2014 at 7:21 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 A slight oversight.  andmode3_imm is predicated by rs6000_gen_cell_microcode
 so we had better not generate it otherwise; and in that case, we need to force
 the constant to a register.

 Bootstrap and testsuite run are fine (of course, since this isn't
 tested at all); comparing testsuite runs between -mcpu=cell and
 default shows a few thousand fails (mainly ICEs in dfp), and no
 differences between with and without this patch.

 But this PR was a failure in Linux kernel build; I checked it did fail
 to build without this patch, and works with (I built cell_defconfig).

 Okay for trunk?


 Segher


 2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

 gcc/
 PR target/64093
 * config/rs6000/rs6000.md (andmode3): Don't generate
 andmode3_imm unless rs6000_gen_cell_microcode is true.

Okay.

Thanks, David


Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-11-27 Thread Mike Stump
On Nov 27, 2014, at 9:28 AM, Andrew Stubbs a...@codesourcery.com wrote:
 On 27/11/14 17:05, Mike Stump wrote:
 Could you include a link or the patch.  If the test suite, I'll review it if 
 no one else steps up.
 
 The original patch is here:
 
 https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01119.html

Sorry, arm people will have to approve...  Many reasonable choices, yet I bet 
only one is best.

Re: [RFC diagnostics/fortran] Move gfc_warning (buffered) to the common diagnostics machinery

2014-11-27 Thread Tobias Burnus

Manuel López-Ibáñez wrote:

On 27 November 2014 at 23:58, Tobias Burnus bur...@net-b.de wrote:

Hmm, theoretically, it would be possible. However, my feeling is that there
is no such case. What would be needed is something which is ambiguous, the
compiler assumes first the wrong kind of statement, fails, and retries is
with a different kind of statement.

That's simple. However, either one has already aborted (gfc_error or
NO_MATCH) *before* reaching a gfc_warning - or the code is valid and, hence,
the buffered warning is printed.

Then, I don't understand the point of gfc_warning.


I think the idea is that a warning could be printed in the wrong code 
path (assumption of which kind of statement or expression one has) – and 
then one re-tries a different code path, see that it fits and drops the 
buffered message.


Grepping through the 132 gfc_warning, I have the impression that this 
cannot happen, i.e. that the message is always shown. However, in order 
to be sure, I had to check all code paths and had to think about ways 
they become unreachable – with the chance that I miss something.



If gfc_warning is never reached, then we could have used gfc_warning_now.
If the warning is printed, then gfc_warning was reached, thus we could
have used gfc_warning_now.


I concur; the question is only whether my impression is correct. You 
could try to map gfc_warning to gfc_warning_now_1 and see whether 
something in the testsuite breaks.



The only difference is that gfc_warning_now will print the warning as
soon as gfc_warning is reached, but since there is at most one
buffered warning and there can not be a previous error, then this
should not make a difference. What am I missing here?


(a) That, conceptionally, warning buffering makes sense, if you 
buffering errors for rewinding - even if it only occurs for error and 
not for warnings. (b) That I might have missed some code path. (c) That 
one might add such a code path in future.


Except for (b), I do not see a reason not to map gfc_warning to 
gfc_warning_now. If one later needs buffering, one can still take 
gfc_error as model for implementing it as it surely needs buffering.


FX, Steve et al.: Comments or additional facts?


Why does it ICE? At worst it should give a wrong location, but the
computation of the offset is fairly similar to what Fortran already
does. Could  you give me a patch+testcase that ICEs?

However, the change

- gfc_warning_now_1 (Line truncated at %L, gfc_current_locus);
+ gfc_warning_now (Line truncated at %L, gfc_current_locus);

still triggers an ICE with gfortran -Wall test.f90 and with test.f (two
locations in scanner.c; see attached test cases and attache patch). The
backtrace for test.f90 is:

The problem seems to be that linemap_position_for_loc_and_offset
cannot handle an offset larger than the column_hint. Fortran hardcodes
that to 120. There are two bugs there: (1) we should not crash, at
worst we should use offset=0 and (2) we should update the column_hint
to the largest column seen. I'm testing a patch that seems to fix the
problem.


Fortran (old code) chops off items from the beginning of the line if the 
column number is high (and also trimps the output line to the right). 
Thus, instead of

  print 'aa.'
(1)
is just show
   aaa'
   (1)
I don't know how C/C++ handles this. If one doesn't chop off, the output 
can become extremely long – especially with topformflat 0. And then 
the ^ is also not helpful as one has to count the numbe of lines ...


Tobias