date:20160327

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread wilson at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #4 from Jim Wilson  ---
The broken targets all define flag_section_anchors at -O1 and up.  x86_64 does
not.  I don't know why this makes a difference yet.

[PATCH 1/4] Add gcc-auto-profile script

2016-03-27 Thread Andi Kleen

From: Andi Kleen 

Using autofdo is currently somewhat difficult. It requires using the
model-specific branches-taken event, which differs between CPUs.
The example shown in the manual requires a specially patched version of
perf that is non-standard, and will likely not work everywhere.

This patch adds a new gcc-auto-profile script that figures out the
correct event and runs perf. The script is installed on Linux systems.

Since maintaining the script would be somewhat tedious (needs changes
every time a new CPU comes out) I auto generated it from the online
Intel event database. The script to do that is in contrib and can be
rerun.

Right now configure does not test whether perf works. The result
would vary depending on the build and target system, and since perf
currently doesn't work under virtualization and needs an up-to-date
kernel, it may often fail in common distribution build setups.

So Linux just hardcodes installing the script, but it may fail at runtime.

This is needed to actually make use of autofdo in a generic way
in the build system and in the test suite.

So far the script is not installed.

gcc/:
2016-03-27  Andi Kleen  

* doc/invoke.texi: Document gcc-auto-profile
* gcc-auto-profile: Create.

contrib/:

2016-03-27  Andi Kleen  

* gen_autofdo_event.py: New file to regenerate
gcc-auto-profile.
---
 contrib/gen_autofdo_event.py | 155 +++
 gcc/doc/invoke.texi  |  31 +++--
 gcc/gcc-auto-profile |  70 +++
 3 files changed, 251 insertions(+), 5 deletions(-)
 create mode 100755 contrib/gen_autofdo_event.py
 create mode 100755 gcc/gcc-auto-profile

diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py
new file mode 100755
index 000..db4db33
--- /dev/null
+++ b/contrib/gen_autofdo_event.py
@@ -0,0 +1,155 @@
+#!/usr/bin/python
+# generate Intel taken branches Linux perf event script for autofdo profiling
+
+# Copyright (C) 2016 Free Software Foundation, Inc.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+# WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.  */
+
+# run it with perf record -b -e EVENT program ...
+# The Linux Kernel needs to support the PMU of the current CPU, and
+# it will likely not work in VMs.
+# add --all to print for all cpus, otherwise for current cpu
+# add --script to generate shell script to run correct event
+#
+# requires internet (https) access. this may require setting up a proxy
+# with export https_proxy=...
+#
+import urllib2
+import sys
+import json
+import argparse
+import collections
+
+baseurl = "https://download.01.org/perfmon"
+
+target_events = (u'BR_INST_RETIRED.NEAR_TAKEN',
+ u'BR_INST_EXEC.TAKEN',
+ u'BR_INST_RETIRED.TAKEN_JCC',
+ u'BR_INST_TYPE_RETIRED.COND_TAKEN')
+
+ap = argparse.ArgumentParser()
+ap.add_argument('--all', '-a', help='Print for all CPUs', action='store_true')
+ap.add_argument('--script', help='Generate shell script', action='store_true')
+args = ap.parse_args()
+
+eventmap = collections.defaultdict(list)
+
+def get_cpu_str():
+    with open('/proc/cpuinfo', 'r') as c:
+        vendor, fam, model = None, None, None
+        for j in c:
+            n = j.split()
+            if n[0] == 'vendor_id':
+                vendor = n[2]
+            elif n[0] == 'model' and n[1] == ':':
+                model = int(n[2])
+            elif n[0] == 'cpu' and n[1] == 'family':
+                fam = int(n[3])
+            if vendor and fam and model:
+                return "%s-%d-%X" % (vendor, fam, model), model
+    return None, None
+
+def find_event(eventurl, model):
+    print >>sys.stderr, "Downloading", eventurl
+    u = urllib2.urlopen(eventurl)
+    events = json.loads(u.read())
+    u.close()
+
+    found = 0
+    for j in events:
+        if j[u'EventName'] in target_events:
+            event = "cpu/event=%s,umask=%s/" % (j[u'EventCode'], j[u'UMask'])
+            if u'PEBS' in j and j[u'PEBS'] > 0:
+                event += "p"
+            if args.script:
+                eventmap[event].append(model)
+            else:
+                print j[u'EventName'], "event for model", model, "is", event
+            found += 1
+    return found
+
+if not args.all:
+    cpu, model = get_cpu_str()
+    if not cpu:
+        sys.exit("Unknown CPU type")
+
+url = baseurl +
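
The CPU identification and event-string steps of the script above can be
sketched in isolation. This is a minimal Python 3 rewrite for illustration
only, not the code from the patch (which is Python 2). The function names
cpu_model_string and perf_event_string are hypothetical, the cpuinfo text is
passed in as a string instead of being read from /proc/cpuinfo, and the PEBS
field is converted with int() on the assumption that the event database may
store it as a string.

```python
def cpu_model_string(cpuinfo_text):
    """Return ("vendor-family-MODEL", model) for the first CPU, or (None, None)."""
    vendor = fam = model = None
    for line in cpuinfo_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank lines separate CPUs in /proc/cpuinfo
        if fields[0] == "vendor_id":
            vendor = fields[2]
        elif fields[0] == "model" and fields[1] == ":":
            model = int(fields[2])
        elif fields[:2] == ["cpu", "family"]:
            fam = int(fields[3])
        if vendor and fam and model:
            # Same format as the patch: model is printed in upper-case hex.
            return "%s-%d-%X" % (vendor, fam, model), model
    return None, None


def perf_event_string(event):
    """Build a perf event spec from one event-database entry (a dict)."""
    spec = "cpu/event=%s,umask=%s/" % (event["EventCode"], event["UMask"])
    # Assumption: PEBS may be stored as a string, so convert before comparing.
    if int(event.get("PEBS", 0)) > 0:
        spec += "p"
    return spec
```

The resulting string is what would be passed to perf record -b -e EVENT, as
described in the header comment of the script.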

[PATCH 3/4] Run profile feedback tests with autofdo

2016-03-27 Thread Andi Kleen

From: Andi Kleen 

Extend the existing bprob and tree-prof tests to also run with autofdo.
The test runtimes are really a bit too short for autofdo, but it's
a reasonable sanity check.

This only works natively for now.

dejagnu doesn't seem to support a wrapper for unix tests, so I had
to open code running these tests.  That should be ok due to the
native run restrictions.

gcc/testsuite/:
2016-03-27  Andi Kleen  

* g++.dg/bprob/bprob.exp: Support autofdo.
* g++.dg/tree-prof/tree-prof.exp: Ditto.
* gcc.dg/tree-prof/tree-prof.exp: Ditto.
* gcc.misc-tests/bprob.exp: Ditto.
* gfortran.dg/prof/prof.exp: Ditto.
* lib/profopt.exp: Ditto.
* lib/target-supports.exp: Check for autofdo.
---
 gcc/testsuite/g++.dg/bprob/bprob.exp | 19 +
 gcc/testsuite/g++.dg/tree-prof/tree-prof.exp | 19 +
 gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp | 19 +
 gcc/testsuite/gcc.misc-tests/bprob.exp   | 23 +++
 gcc/testsuite/gfortran.dg/prof/prof.exp  | 18 +
 gcc/testsuite/lib/profopt.exp| 58 ++--
 gcc/testsuite/lib/target-supports.exp| 23 +++
 7 files changed, 176 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/g++.dg/bprob/bprob.exp b/gcc/testsuite/g++.dg/bprob/bprob.exp
index d07..92a1e94 100644
--- a/gcc/testsuite/g++.dg/bprob/bprob.exp
+++ b/gcc/testsuite/g++.dg/bprob/bprob.exp
@@ -53,6 +53,7 @@ if $tracelevel then {
 
 set profile_options "-fprofile-arcs"
 set feedback_options "-fbranch-probabilities"
+set profile_wrapper ""
 
 # Main loop.
 foreach profile_option $profile_options feedback_option $feedback_options {
@@ -65,4 +66,22 @@ foreach profile_option $profile_options feedback_option $feedback_options {
 }
 }
 
+set profile_wrapper [profopt-perf-wrapper]
+set profile_options "-g"
+set feedback_options "-fauto-profile"
+set run_autofdo 1
+
+foreach profile_option $profile_options feedback_option $feedback_options {
+    foreach src [lsort [glob -nocomplain $srcdir/$subdir/bprob-*.c]] {
+        if ![runtest_file_p $runtests $src] then {
+            continue
+        }
+        set base [file tail $src]
+        profopt-execute $src
+    }
+}
+
+set run_autofdo ""
+set profile_wrapper ""
+
 set PROFOPT_OPTIONS $bprob_save_profopt_options
diff --git a/gcc/testsuite/g++.dg/tree-prof/tree-prof.exp b/gcc/testsuite/g++.dg/tree-prof/tree-prof.exp
index 7a4b5cb..7220217 100644
--- a/gcc/testsuite/g++.dg/tree-prof/tree-prof.exp
+++ b/gcc/testsuite/g++.dg/tree-prof/tree-prof.exp
@@ -44,6 +44,7 @@ set PROFOPT_OPTIONS [list {}]
 # profile data.
 set profile_option "-fprofile-generate -D_PROFILE_GENERATE"
 set feedback_option "-fprofile-use -D_PROFILE_USE"
+set profile_wrapper ""
 
 foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.C]] {
 # If we're only testing specific files and this isn't one of them, skip it.
@@ -53,4 +54,22 @@ foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.C]] {
 profopt-execute $src
 }
 
+set profile_wrapper [profopt-perf-wrapper]
+set profile_options "-g"
+set feedback_options "-fauto-profile"
+set run_autofdo 1
+
+foreach profile_option $profile_options feedback_option $feedback_options {
+    foreach src [lsort [glob -nocomplain $srcdir/$subdir/bprob-*.c]] {
+        if ![runtest_file_p $runtests $src] then {
+            continue
+        }
+        set base [file tail $src]
+        profopt-execute $src
+    }
+}
+
+set run_autofdo ""
+set profile_wrapper ""
+
 set PROFOPT_OPTIONS $treeprof_save_profopt_options
diff --git a/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp b/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp
index 650ad8d..7fff52c 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp
+++ b/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp
@@ -44,6 +44,7 @@ set PROFOPT_OPTIONS [list {}]
 # profile data.
 set profile_option "-fprofile-generate -D_PROFILE_GENERATE"
 set feedback_option "-fprofile-use -D_PROFILE_USE"
+set profile_wrapper ""
 
 foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.c]] {
 # If we're only testing specific files and this isn't one of them, skip it.
@@ -53,4 +54,22 @@ foreach src [lsort [glob -nocomplain $srcdir/$subdir/*.c]] {
 profopt-execute $src
 }
 
+set profile_wrapper [profopt-perf-wrapper]
+set profile_options "-g"
+set feedback_options "-fauto-profile"
+set run_autofdo 1
+
+foreach profile_option $profile_options feedback_option $feedback_options {
+    foreach src [lsort [glob -nocomplain $srcdir/$subdir/bprob-*.c]] {
+        if ![runtest_file_p $runtests $src] then {
+            continue
+        }
+        set base [file tail $src]
+        profopt-execute $src
+    }
+}
+
+set run_autofdo ""
+set profile_wrapper ""
+
 set PROFOPT_OPTIONS $treeprof_save_profopt_options
diff --git a/gcc/testsuite/gcc.misc-tests/bprob.exp b/gcc/testsuite/gcc.misc-tests/bprob.exp
index 52dcb1f..cc12f1f 100644
---

[PATCH 2/4] Don't cause ICEs when auto profile file is not found with checking

2016-03-27 Thread Andi Kleen

From: Andi Kleen 

Currently, on a checking-enabled compiler, when -fauto-profile does
not find the profile feedback file it errors out with assertion
failures. Add proper errors for this case.

gcc/:

2016-03-27  Andi Kleen  

* auto-profile.c (read_profile): Replace asserts with errors
when file does not exist.
* gcov-io.c (gcov_read_words): Ditto.
---
 gcc/auto-profile.c | 32 +---
 gcc/gcov-io.c  |  4 +++-
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 5c0640a..5cb94a6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -887,16 +887,25 @@ static void
 read_profile (void)
 {
   if (gcov_open (auto_profile_file, 1) == 0)
-    error ("Cannot open profile file %s.", auto_profile_file);
+    {
+      error ("Cannot open profile file %s.", auto_profile_file);
+      return;
+    }
 
   if (gcov_read_unsigned () != GCOV_DATA_MAGIC)
-    error ("AutoFDO profile magic number does not mathch.");
+    {
+      error ("AutoFDO profile magic number does not mathch.");
+      return;
+    }
 
   /* Skip the version number.  */
   unsigned version = gcov_read_unsigned ();
   if (version != AUTO_PROFILE_VERSION)
-    error ("AutoFDO profile version %u does match %u.",
-           version, AUTO_PROFILE_VERSION);
+    {
+      error ("AutoFDO profile version %u does match %u.",
+             version, AUTO_PROFILE_VERSION);
+      return;
+    }
 
   /* Skip the empty integer.  */
   gcov_read_unsigned ();
@@ -904,19 +913,28 @@ read_profile (void)
   /* string_table.  */
   afdo_string_table = new string_table ();
   if (!afdo_string_table->read())
-    error ("Cannot read string table from %s.", auto_profile_file);
+    {
+      error ("Cannot read string table from %s.", auto_profile_file);
+      return;
+    }
 
   /* autofdo_source_profile.  */
   afdo_source_profile = autofdo_source_profile::create ();
   if (afdo_source_profile == NULL)
-    error ("Cannot read function profile from %s.", auto_profile_file);
+    {
+      error ("Cannot read function profile from %s.", auto_profile_file);
+      return;
+    }
 
   /* autofdo_module_profile.  */
   fake_read_autofdo_module_profile ();
 
   /* Read in the working set.  */
   if (gcov_read_unsigned () != GCOV_TAG_AFDO_WORKING_SET)
-    error ("Cannot read working set from %s.", auto_profile_file);
+    {
+      error ("Cannot read working set from %s.", auto_profile_file);
+      return;
+    }
 
   /* Skip the length of the section.  */
   gcov_read_unsigned ();
diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c
index 17fcae0..95ead22 100644
--- a/gcc/gcov-io.c
+++ b/gcc/gcov-io.c
@@ -493,7 +493,9 @@ gcov_read_words (unsigned words)
   const gcov_unsigned_t *result;
   unsigned excess = gcov_var.length - gcov_var.offset;
 
-  gcov_nonruntime_assert (gcov_var.mode > 0);
+  if (gcov_var.mode <= 0)
+    return NULL;
+
   if (excess < words)
     {
       gcov_var.start += gcov_var.offset;
-- 
2.7.3
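
The control-flow change in this patch (report the error, then return, rather
than report and keep reading) can be shown with a small model. This is only a
sketch in Python, not GCC code: read_profile_header is a hypothetical name,
the header is assumed to be two little-endian 32-bit words, the version value
is an assumption that should be checked against auto-profile.c, and the
truncation check is an addition of the sketch.

```python
import struct

GCOV_DATA_MAGIC = 0x67636461   # "gcda"; assumed to match gcov-io.h
AUTO_PROFILE_VERSION = 1       # assumed value, check auto-profile.c


def read_profile_header(data):
    """Return (ok, message), stopping at the first problem found."""
    if data is None:
        # Models gcov_open failing: nothing further may be read.
        return False, "Cannot open profile file."
    if len(data) < 8:
        # Not part of the patch; added so the sketch never reads past the end.
        return False, "Profile file is truncated."
    magic, version = struct.unpack_from("<II", data, 0)
    if magic != GCOV_DATA_MAGIC:
        return False, "AutoFDO profile magic number does not match."
    if version != AUTO_PROFILE_VERSION:
        return False, "AutoFDO profile version %d does not match %d." % (
            version, AUTO_PROFILE_VERSION)
    return True, "header ok"
```

Without the early returns, a missing file would still lead to the later
reads, which is where the checking asserts used to trigger.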

[PATCH] gcc/final.c: -fdebug-prefix-map support to remap sources with relative path

2016-03-27 Thread Hongxu Jia

PR other/70428
* final.c (remap_debug_filename): Use lrealpath to translate
relative path before remapping

Signed-off-by: Hongxu Jia 
---
 gcc/ChangeLog |  6 ++
 gcc/final.c   | 15 ---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f02e3d8..8b7207c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -26,6 +26,12 @@
in all constraint alternatives.
(movtd_64bit_nodm): Delete "j" constraint alternative.
 
+2016-03-24  Hongxu Jia  
+
+   PR other/70428
+   * final.c (remap_debug_filename): Use lrealpath to translate
+   relative path before remapping
+
 2016-03-24  Aldy Hernandez  
 
* tree-ssa-propagate.c: Enhance docs for
diff --git a/gcc/final.c b/gcc/final.c
index 55cf509..23293e5 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -1554,16 +1554,25 @@ remap_debug_filename (const char *filename)
   const char *name;
   size_t name_len;
 
+  /* Support to remap filename with relative path  */
+  char *realpath = lrealpath (filename);
+  if (realpath == NULL)
+    return filename;
+
   for (map = debug_prefix_maps; map; map = map->next)
-    if (filename_ncmp (filename, map->old_prefix, map->old_len) == 0)
+    if (filename_ncmp (realpath, map->old_prefix, map->old_len) == 0)
       break;
   if (!map)
-    return filename;
-  name = filename + map->old_len;
+    {
+      free (realpath);
+      return filename;
+    }
+  name = realpath + map->old_len;
   name_len = strlen (name) + 1;
   s = (char *) alloca (name_len + map->new_len);
   memcpy (s, map->new_prefix, map->new_len);
   memcpy (s + map->new_len, name, name_len);
+  free (realpath);
   return ggc_strdup (s);
 }
 
-- 
2.7.4

Re: rs6000 stack_tie mishap again

2016-03-27 Thread Segher Boessenkool

On Wed, Mar 23, 2016 at 05:04:39PM +0100, Olivier Hainque wrote:
> The visible effect is a powerpc-eabispe compiler (no red-zone) producing an
> epilogue sequence like
> 
>addi %r11,%r1,184# temp pointer within frame

The normal -m32 compiler here generates code like

lwz 11,0(1)

and try as I might I cannot get it to fail.  Maybe because the GPR11
setup here involves a load?

>addi %r1,%r11,104# release frame
> 
>evldd %r21,16(%r11)  # restore registers
>...  # ...
>evldd %r31,96(%r11)  # ...
> 
>blr  # return

> We have observed this with a gcc 4.7 back-end and weren't able to reproduce
> with a more recent version.

This makes it not a regression and thus out of scope for GCC 6.  We're
supposed to stabilise things now ;-)

>   if (! writep)
>     {
>       base = find_base_term (mem_addr);
>       if (base && (GET_CODE (base) == LABEL_REF
>                    || (GET_CODE (base) == SYMBOL_REF
>                        && CONSTANT_POOL_ADDRESS_P (base))))
>         return 0;
>     }
> 
> 
> with
> 
> (gdb) pr mem_addr
> (plus:SI (reg:SI 11 11)
> (const_int 96 [0x60]))
>  
> and
>  
> (gdb) pr base
> (symbol_ref/u:SI ("*.LC0") [flags 0x82])
>  
> coming from insn 710, despite 894 in between. Ug.

Yeah that is just Wrong.

> The reason why 894 is not accounted in the base ref computation is because it
> is part of the epilogue sequence, and init_alias_analysis has:
> 
>   /* Walk the insns adding values to the new_reg_base_value array.  */
>   for (i = 0; i < rpo_cnt; i++)
>   { ...
> if (could_be_prologue_epilogue
> && prologue_epilogue_contains (insn))
>   continue;
> 
> The motivation for this is unclear to me.

Alan linked to the history.  It seems clear that just considering the
prologue is enough to fix the original problem (frame pointer setup),
and problems like yours cannot happen in the prologue.

Better would be not to have this hack at all.

> My rough understanding is that we probably really care about frame_related
> insns only here, at least on targets where the flag is supposed to be set
> accurately.

On targets with DWARF2 unwind info the flag should be set on those insns
that need unwind info.  This does not include all insns in the epilogue
that access the frame, so I don't think this will help you?

> This is the basis of the proposed patch, which essentially disconnects the
> shortcut above for !frame_related insns on targets which need precise alias
> analysis within epilogues, as indicated by a new target hook.

Eww.  Isn't that really all targets that schedule the epilogue at all?

> On the key insn at hand, the frame_related bit was cleared on purpose,
> per:
> 
> https://gcc.gnu.org/ml/gcc-patches/2011-11/msg00543.html
> 
>   "1) Marking an instruction setting up r11 for use by _save64gpr_* as
>frame-related results in r11 being set as the cfa for the rest of the
>function.  That's bad when r11 gets used for other things later.
>   "

And that is correct.

> So, aside from the dependency issue which needs to be fixed somehow, I
> think it would make sense to consider using a strong blockage mechanism in
> expand_epilogue.

It would be very nice if we could directly express "the set of GPR1 should
stay behind any frame accesses", yeah.


Segher

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread wilson at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Jim Wilson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilson at gcc dot gnu.org

--- Comment #3 from Jim Wilson  ---
I can reproduce on armhf and aarch64, but not on x86_64.

stage2 is built with -g -gtoggle.  stage3 is built with -g.  Debug info is
stripped before the compare, so in theory that shouldn't matter.

I am looking at statistics.c, as it is a conveniently small file.  On aarch64,
in stage2 statistics.s, I see
        .section        .rodata._ZN10hash_tableI20stats_counter_hasher11xcallocatorE6expandEv.str1.8,"aMS",@progbits,1
        .align  3
.LC17:
        .string "alloc_entries"

In stage3 statistics.s I see
        .section        .rodata.str1.8,"aMS",@progbits,1
        .align  3
...
.LC17:
        .string "alloc_entries"
        .zero   2
...
        .section        .debug_str,"MS",@progbits,1
...
.LASF1861:
        .string "alloc_entries"

So something about debug info caused the string to move from the function
specific rodata section to the general rodata section, and that causes the
comparison failure.  On x86_64, the string is in the function specific rodata
section in both cases, so no comparison failure.
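
To find other strings that moved between sections, the comparison above can
be automated. The following is a rough Python sketch, not an existing tool:
section_of_label is a hypothetical helper that only understands explicit
.section directives (not .text, .data, or .pushsection), which is enough for
the string constants discussed here.

```python
def section_of_label(asm_text, label):
    """Return the section that is active where `label:` is defined, or None."""
    current = None
    for line in asm_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".section"):
            # ".section <name>,<flags>,..." : keep only the section name.
            rest = stripped[len(".section"):].strip()
            current = rest.split(",", 1)[0]
        elif stripped == label + ":":
            return current
    return None
```

Running it on the stage2 and stage3 versions of statistics.s for .LC17 should
report the function-specific section in one case and .rodata.str1.8 in the
other, matching the excerpts above.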

[Bug tree-optimization/59124] [4.9/5/6 Regression] Wrong warnings "array subscript is above array bounds"

2016-03-27 Thread ppalka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124

--- Comment #36 from Patrick Palka  ---
Patch posted at https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01439.html

[Bug tree-optimization/70427] autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

--- Comment #3 from Andi Kleen  ---

After analyzing the code further, it looks like the compiler generates it
correctly; the edge returned should not be 0 here.

[Bug tree-optimization/70427] autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

--- Comment #2 from Andi Kleen  ---
Created attachment 38110
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38110&action=edit
somewhat reduced input file, only single function

[Bug other/70428] New: -fdebug-prefix-map did not support to remap sources with relative path

2016-03-27 Thread hongxu.jia at windriver dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70428

Bug ID: 70428
   Summary: -fdebug-prefix-map did not support to remap sources
with relative path
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hongxu.jia at windriver dot com
  Target Milestone: ---

1. Prepare sources and build dir

$ pwd
/folk/hjia

$ mkdir dir1/dir2 test1/test2/ -p

$ cat > test1/test2/test.c << ENDOF
#include "test.h"

int main(int argc, char *argv[])
{
  func();
  return 0;
}

ENDOF

$ cat > test1/test2/test.h << ENDOF
void func()
{
  return;
}
ENDOF

$ cd dir1/dir2

2. Enter build dir to compile with relative path sources

$ gcc ../../test1/test2/test.c -g  -o test.o
$ objdump -g test.o | less

 <0>: Abbrev Number: 1 (DW_TAG_compile_unit)
   DW_AT_producer: (indirect string, offset: 0x47): GNU C 4.8.4
-mtune=generic -march=x86-64 -g -fstack-protector
<10>   DW_AT_language: 1(ANSI C)
<11>   DW_AT_name: (indirect string, offset: 0x5):
../../test1/test2/test.c 
<15>   DW_AT_comp_dir: (indirect string, offset: 0x1e):
/folk/hjia/dir1/dir2

Contents of the .debug_str section:


3. Compile with option -fdebug-prefix-map; sources given by a relative path
are not remapped

$ gcc ../../test1/test2/test.c \
    -fdebug-prefix-map=/folk/hjia/test1/test2=/usr/src -g  -o test.o
$ objdump -g test.o | less

 <0>: Abbrev Number: 1 (DW_TAG_compile_unit)
   DW_AT_producer: (indirect string, offset: 0x47): GNU C 4.8.4
-mtune=generic -march=x86-64 -g
-fdebug-prefix-map=/folk/hjia/test1/test2=/usr/src -fstack-protector 
<10>   DW_AT_language: 1(ANSI C)
<11>   DW_AT_name: (indirect string, offset: 0x5):
../../test1/test2/test.c 
<15>   DW_AT_comp_dir: (indirect string, offset: 0x1e):
/folk/hjia/dir1/dir2 


What we expected is:

<11>   DW_AT_name: (indirect string, offset: 0x5): /usr/src/test.c  
<15>   DW_AT_comp_dir: (indirect string, offset: 0x15):
/folk/hjia/dir1/dir2
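
The expected behaviour can be modelled in a few lines. This is a Python
sketch under stated assumptions, not the GCC implementation: the function
name mirrors remap_debug_filename from final.c but the signature (explicit
cwd and a list of prefix pairs) is made up for the example, and
posixpath.normpath only collapses ".." components, whereas lrealpath would
also resolve symlinks.

```python
import posixpath


def remap_debug_filename(filename, cwd, prefix_maps):
    """Apply the first matching (old_prefix, new_prefix) pair to filename."""
    # Make the path absolute first, so that a relative name can match an
    # absolute old_prefix.
    absolute = posixpath.normpath(posixpath.join(cwd, filename))
    for old_prefix, new_prefix in prefix_maps:
        if absolute.startswith(old_prefix):
            return new_prefix + absolute[len(old_prefix):]
    # No map applies: keep the name exactly as it was given.
    return filename
```

With the directories from this report, the relative name
../../test1/test2/test.c seen from /folk/hjia/dir1/dir2 maps to
/usr/src/test.c, which is the DW_AT_name shown as expected above.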

[Bug tree-optimization/70427] autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

--- Comment #1 from Andi Kleen  ---
Created attachment 38109
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38109&action=edit
ipa-profile input

Here's the source of the miscompiled file from the compiler

cc1plus -O2 ipa-profile.i  -S

Unfortunately you have to inspect the assembler output to see the
miscompilation:

look for ipa_profile_generate_summary

then look for get_edge

        call    _ZN11cgraph_node8get_edgeEP6gimple
        testq   %rax, %rax
        movq    %rax, %r15
        je      .L836           <--- jump if rax/r15 is 0
        testb   $2, 96(%rax)
        je      .L837
.L836:                          <--- it can be here
        movq    16(%r12), %rax
        movq    64(%r15), %rsi  <--- BAD

This is the same miscompilation (just with another register): r15 is
dereferenced after it has been tested for NULL.

[Bug tree-optimization/70427] New: autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

Bug ID: 70427
   Summary: autofdo bootstrap generates wrong code
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andi-gcc at firstfloor dot org
  Target Milestone: ---

I've been working on building gcc with an autofdo bootstrap.

Currently I always run into a crash while rebuilding tree.c with the stage2
compiler and the autofdo information.

Looking at the code it is clearly miscompiled in ipa_profile_generate_summary:

struct cgraph_edge * e = node->get_edge (stmt);
if (e && !e->indirect_unknown_callee)
  continue;


   0x0093bb16 <+326>:   callq  0x7be530 <_ZN11cgraph_node8get_edgeEP6gimple>
   0x0093bb1b <+331>:   test   %rax,%rax   # check for NULL
   0x0093bb1e <+334>:   mov    %rax,%r8
   0x0093bb21 <+337>:   je     0x93bb2d <_ZL28ipa_profile_generate_summaryv+349>
   0x0093bb23 <+339>:   testb  $0x2,0x60(%rax)
   0x0093bb27 <+343>:   je     0x93baa7 <_ZL28ipa_profile_generate_summaryv+215>
   0x0093bb2d <+349>:   mov    0x10(%r13),%rax # go here because of NULL
=> 0x0093bb31 <+353>:   mov    0x40(%r8),%rsi  # but we still reference!

(gdb) p $r8
$4 = 0

The crash is on bb31 because r8 is NULL. The code checked the return value of
the call, but then references it afterwards before doing the continue.

Command line option:

cc1plus -fauto-profile=cc1plus.fda  -g -O2 tree.i

cc1plus.fda is at http://halobates.de/cc1plus.fda (too big to attach)

Re: Constexpr in intrinsics?

2016-03-27 Thread Allan Sandfeld Jensen

On Sunday 27 March 2016, Marc Glisse wrote:
> On Sun, 27 Mar 2016, Allan Sandfeld Jensen wrote:
> > Would it be possible to add constexpr to the intrinsics headers?
> > 
> > For instance _mm_set_XX and _mm_setzero intrinsics.
> 
> Already suggested here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65197
> 
> A patch would be welcome (I started doing it at some point, I don't
> remember if it was functional, the patch is attached).
> 
That looks very similar to the patch I experimented with, and that at least 
works for using them in C++11 constexpr functions.

> > Ideally it could also be added all intrinsics that can be evaluated at
> > compile time, but it is harder to tell which those are.
> > 
> > Does gcc have a C extension we can use to set constexpr?
> 
> What for?

To have similar functionality in C. For instance, to explicitly allow those
functions to be evaluated at compile time, and values with similar attributes
to be optimized out completely. And of course to avoid preprocessor noise in
shared C/C++ headers like these.

Best regards
`Allan

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-27 Thread olegendo at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

--- Comment #14 from Oleg Endo  ---
(In reply to Kazumoto Kojima from comment #12)
> 
> (insn 516 508 510 18 (set (reg:SI 0 r0)
> (plus:SI (reg:SI 2 r2)
> (const_int 4 [0x4]))) xxx.i:100 67 {*addsi3}
>  (nil))
> 
> which is invalid.

I haven't checked the details... but we've added those "special" addsi patterns
and the above seems to be covered by at least one of them.

Maybe at that stage in the reload code it will end up using the last *addsi3
pattern and not try to look for a new pattern in the .md when it wants to
change it.  In other words, maybe it'll help if the *addsi3 patterns are merged
into a single pattern somehow.  I'll give it a try...

Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-27 Thread Segher Boessenkool

On Fri, Mar 25, 2016 at 05:29:26PM -0400, Jason Merrill wrote:
> 70353 is a problem with the function-local static declaration of 
> __func__.  Normally constexpr functions can't have local statics, so 
> this is only an issue with __func__.  Meanwhile, core issue 1962 looks 
> like it's going to be resolved by changing __func__ et al to be prvalue 
> constants of type const char * rather than static local array variables, 
> so implementing that proposed resolution also resolves this issue, as 
> well as 62466 which complains about the strings not being merged between 
> translation units.  This patch proceeds from Martin's work last year.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

This patch caused PR70422, a bootstrap comparison failure on aarch64,
ia64, and powerpc64.


Segher

[Bug other/70426] New: decl_expr contains too little information

2016-03-27 Thread JamesMikeDuPont at googlemail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70426

Bug ID: 70426
   Summary: decl_expr contains too little information
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: JamesMikeDuPont at googlemail dot com
  Target Milestone: ---

using gcc (Debian 4.9.2-10) 4.9.2
In the 001t.tu file, the decl_expr contains no real information. 

here is the context of relevant statements :

@9529   function_decl    name: @9547    type: @5191    scpe: @155
                         srcp: eval.c:199              chain: @9548
                         link: static   body: @9549

@9549   bind_expr        type: @129     vars: @9568    body: @9569

@9569   statement_list   0   : @9587    1   : @9588    2   : @9589
                         3   : @9590    4   : @9591    5   : @9592
                         6   : @9593
@9585   identifier_node  strg: pwd      lngt: 3
@9568   var_decl         name: @9585    type: @144     scpe: @9529
                         srcp: eval.c:201              chain: @9586
                         size: @22      algn: 64       used: 1

@9587   decl_expr        type: @129
@9588   decl_expr        type: @129
@9589   modify_expr      type: @144     op 0: @9586    op 1: @9615

@129    void_type        name: @126     algn: 8
@126    type_decl        name: @128     type: @129     chain: @130
@128    identifier_node  strg: void     lngt: 4



The source code around 199 is :
  198 static void
  199 send_pwd_to_eterm ()
  200 {
  201   char *pwd, *f;
  202 
  203   f = 0;
  204   pwd = get_string_value ("PWD");
  205   if (pwd == 0)
  206 f = pwd = get_working_directory ("eterm");
  207   fprintf (stderr, "\032/%s\n", pwd);
  208   free (f);
  209 }

So can I infer that @9587 refers to line 201 for the pwd variable?

See https://archive.org/details/bash.compilation for a full snapshot of the
compile. build/eval.c.001t.tu is the file.


So please tell me if this is correct or are we missing important fields in the
decl_expr.

[Bug other/70425] New: decl_expr contains too little information

2016-03-27 Thread JamesMikeDuPont at googlemail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70425

Bug ID: 70425
   Summary: decl_expr contains too little information
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: JamesMikeDuPont at googlemail dot com
  Target Milestone: ---

using gcc (Debian 4.9.2-10) 4.9.2
In the 001t.tu file, the decl_expr contains no real information. 

here is the context of relevant statements :

@9529   function_decl    name: @9547    type: @5191    scpe: @155
                         srcp: eval.c:199              chain: @9548
                         link: static   body: @9549

@9549   bind_expr        type: @129     vars: @9568    body: @9569

@9569   statement_list   0   : @9587    1   : @9588    2   : @9589
                         3   : @9590    4   : @9591    5   : @9592
                         6   : @9593
@9585   identifier_node  strg: pwd      lngt: 3
@9568   var_decl         name: @9585    type: @144     scpe: @9529
                         srcp: eval.c:201              chain: @9586
                         size: @22      algn: 64       used: 1

@9587   decl_expr        type: @129
@9588   decl_expr        type: @129
@9589   modify_expr      type: @144     op 0: @9586    op 1: @9615

@129    void_type        name: @126     algn: 8
@126    type_decl        name: @128     type: @129     chain: @130
@128    identifier_node  strg: void     lngt: 4



The source code around 199 is :
  198 static void
  199 send_pwd_to_eterm ()
  200 {
  201   char *pwd, *f;
  202 
  203   f = 0;
  204   pwd = get_string_value ("PWD");
  205   if (pwd == 0)
  206 f = pwd = get_working_directory ("eterm");
  207   fprintf (stderr, "\032/%s\n", pwd);
  208   free (f);
  209 }

So can I infer that @9587 refers to line 201 for the pwd variable?

See https://archive.org/details/bash.compilation for a full snapshot of the
compile. build/eval.c.001t.tu is the file.


So please tell me if this is correct or are we missing important fields in the
decl_expr.

Re: Patches to fix GCC’s C++ exception handling on NetBSD/VAX

2016-03-27 Thread Jake Hamby

The results you want to see for the test program are the following:

throwtest(0) returned 0
throwtest(1) returned 1
Caught int exception: 123
Caught double exception: 123.45
Caught float exception: 678.900024
enter recursive_throw(6)
calling recursive_throw(5)
enter recursive_throw(5)
calling recursive_throw(4)
enter recursive_throw(4)
calling recursive_throw(3)
enter recursive_throw(3)
calling recursive_throw(2)
enter recursive_throw(2)
calling recursive_throw(1)
enter recursive_throw(1)
calling recursive_throw(0)
enter recursive_throw(0)
throwing exception
Caught int exception: 456

Before I made the changes I've submitted, this worked on m68k and presumably 
everywhere else but VAX. On VAX, it crashed due to the pointer size 
discrepancies that I already talked about. I believe that it should be possible 
to improve GCC's backend by allowing %ap to be used as an additional general 
register (removing it from FIXED_REGISTERS, but leaving it in 
CALL_USED_REGISTERS, since it's modified on CALLS), without breaking the DWARF 
stack unwinding stuff, since the .cfi information it's emitting notes the 
correct %fp offset to find the frame, which it actually uses instead of the %ap 
in stack unwinding.

Gaining an additional general register to use within a function would be a nice 
win if it worked. But there are at least two problems that must be solved before 
doing this (that I know of). The first is that certain combinations of VAX 
instructions and addressing modes seem to have problems when %ap, %fp, and/or 
%sp are used. I discovered this in the OpenVMS VAX Macro reference (which is 
essentially an updated version of the 1977 VAX architecture handbook), in Table 
8-5, General Register Addressing.

An additional source of info on which modes fail with address faults when AP or 
above is used is SimH's vax_cpu.c, which handles this correctly. You can trace 
these macros to find the conditions:

#define CHECK_FOR_PC    if (rn == nPC) \
                            RSVD_ADDR_FAULT
#define CHECK_FOR_SP    if (rn >= nSP) \
                            RSVD_ADDR_FAULT
#define CHECK_FOR_AP    if (rn >= nAP) \
                            RSVD_ADDR_FAULT

So as long as the correct code is added to vax.md and vax.c to never emit move 
instructions under the wrong circumstances when %ap is involved, it could be 
used as a general register. I wonder if the use of %ap to find address 
arguments is a special case that happens to never emit anything that would fail 
(with a SIGILL, I believe).
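As a rough illustration of the three conditions above (this is not code from SimH or GCC), the checks can be written as a single C predicate. The register numbers follow the VAX convention of R0-R11 followed by AP, FP, SP and PC; the names `reg_check` and `rsvd_addr_fault` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* VAX general register numbers: R0-R11, then AP, FP, SP, PC. */
enum { nAP = 12, nFP = 13, nSP = 14, nPC = 15 };

/* Which of the three checks an addressing mode applies.
   These names are hypothetical, chosen to mirror the macros above. */
enum reg_check { CHECK_PC, CHECK_SP, CHECK_AP };

/* Return true when using register rn under the given check would
   raise a reserved addressing fault. */
static bool rsvd_addr_fault(int rn, enum reg_check check)
{
  assert(rn >= 0 && rn <= nPC);
  switch (check) {
  case CHECK_PC: return rn == nPC;   /* only PC is rejected */
  case CHECK_SP: return rn >= nSP;   /* SP and PC are rejected */
  case CHECK_AP: return rn >= nAP;   /* AP, FP, SP and PC are rejected */
  }
  return false;
}
```

A backend predicate built along these lines would only need to know which check applies to each instruction and addressing mode combination; that mapping is the part that has to come from the architecture reference.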

But the other problem with making %ap available as a general (with a few 
special cases) register is that it would break one part of the patch I 
submitted at the beginning of the thread to fix C++ exceptions. One necessary 
part of that fix was to change "#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0" to 
"#define ARG_POINTER_CFA_OFFSET(FNDECL) 0", which correctly generates the code 
to return the value for __builtin_dwarf_cfa () (as an offset of 0 from %ap).

When I was working on that fix, it seemed that a better fix should be possible. 
The DWARF / CFA code that's in there now already uses an offset from %fp that it 
knows, so an improved fix would define FRAME_POINTER_CFA_OFFSET(FNDECL) as 
something that returns the current CFA (canonical frame address) as an offset 
from %fp, since that's what it's using for all the .cfi_* directives. But I 
don't know what a correct definition of FRAME_POINTER_CFA_OFFSET should be in 
order for it to return that value instead of 0. I do know that a 0 offset from 
%fp is definitely wrong, and it's not a fixed offset either (it depends on the 
number of registers saved in the procedure entry mask). Fortunately, %ap points 
directly to CFA, so that always works.
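To make the "not a fixed offset" point concrete, here is a small C sketch of how the distance from %fp to the CFA could be computed from an entry mask. It rests on assumptions that are not stated in this thread: that a CALLS frame keeps five longwords (condition handler, mask/PSW, saved AP, saved FP, saved PC) starting at %fp, followed by one longword per register named in bits 0-11 of the entry mask, followed by 0-3 bytes of stack alignment. The function names are hypothetical.

```c
#include <assert.h>

/* Assumed fixed part of a CALLS frame starting at %fp: condition handler,
   mask/PSW, saved AP, saved FP, saved PC (five longwords). */
enum { CALLS_FIXED_BYTES = 20 };

/* Count the registers R0-R11 named in bits 0-11 of an entry mask. */
static int saved_reg_count(unsigned entry_mask)
{
  int n = 0;
  for (int bit = 0; bit < 12; bit++)
    if (entry_mask & (1u << bit))
      n++;
  return n;
}

/* Offset from %fp to the CFA under the assumptions stated above.
   align_bytes is the 0-3 byte stack alignment recorded at call time. */
static int cfa_offset_from_fp(unsigned entry_mask, int align_bytes)
{
  assert(align_bytes >= 0 && align_bytes <= 3);
  return CALLS_FIXED_BYTES + 4 * saved_reg_count(entry_mask) + align_bytes;
}
```

Under these assumptions a function saving R6-R11 (mask 0x0FC0) would have its CFA 44 bytes above %fp, while one saving nothing would have it 20 bytes above, which is why a constant does not work.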

Just some thoughts on future areas for improvement... I'm very happy to be able to 
run the NetBSD testsuite on VAX now. It gives me a lot of confidence as to what 
works and what doesn't. Most of the stuff I expected to fail (like libm tests, 
since it's not IEEE FP) failed, and most of the rest succeeded.

-Jake

> On Mar 27, 2016, at 15:34, Jake Hamby  wrote:
> 
> I'm very pleased to report that I was able to successfully build a NetBSD/vax 
> system using the checked-in GCC 5.3, with the patches I've submitted, setting 
> FIRST_PSEUDO_REGISTER to 17 and DWARF_FRAME_REGISTERS to 16. The kernel 
> produced with GCC 5.3 crashes (on a real VS4000/90 and also SimH) in UVM, 
> which may be a bug in the kernel that different optimization exposed, or a 
> bug in GCC's generated code.
> 
> If you don't set DWARF_FRAME_REGISTERS to 16, then C++ exceptions break 
> again, and GDB suddenly can't deal with the larger debug frames because of 
> the data structure size mismatch between GCC and GDB. But with that 
> additional define, you can raise FIRST_PSEUDO_REGISTER to include PSW (which 
> is correct, since DWARF already uses that meaning), remove the "#ifdef 
> notworking" around the asserts that Christos added in the

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-27
     Ever confirmed|0                           |1

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread segher at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|aarch64-*-*, ia64-*-*       |aarch64-*-*, ia64-*-*,
                   |                            |powerpc64-*-*
           Priority|P3                          |P1
                 CC|                            |segher at gcc dot gnu.org

--- Comment #2 from Segher Boessenkool  ---
Also on powerpc64-linux.

Re: [PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-27 Thread Patrick Palka

On Sun, Mar 27, 2016 at 2:58 PM, Patrick Palka  wrote:
> In unrolling of the inner loop in the test case below we introduce
> unreachable code that otherwise contains out-of-bounds array accesses.
> This is because the estimation of the maximum number of iterations of
> the inner loop is too conservative: we assume 6 iterations instead of
> the actual 4.
>
> Nonetheless, VRP should be able to tell that the code is unreachable so
> that it doesn't warn about it.  The only thing holding VRP back is that
> it doesn't look through conditionals of the form
>
>   if (j_10 != CST1)    where j_10 = j_9 + CST2
>
> so that it could add the assertion
>
>   j_9 != (CST1 - CST2)
>
> This patch teaches VRP to detect such conditionals and to add such
> assertions, so that it could remove instead of warn about the
> unreachable code created during loop unrolling.
>
> What this addition does with the test case below is something like this:
>
> ASSERT_EXPR (i <= 5);
> for (i = 1; i < 6; i++)
>   {
>     j = i - 1;
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 1)
>     bar[j] = baz[j];
>
>     j = i - 2
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 2)
>     bar[j] = baz[j];
>
>     j = i - 3
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 3)
>     bar[j] = baz[j];
>
>     j = i - 4
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 4)
>     bar[j] = baz[j];
>
>     j = i - 5
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 5)
>     bar[j] = baz[j];
>
>     j = i - 6
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 6)
>     bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is always false
>   }

Er, sorry, this illustration is wrong.  First off, break; should say
continue;.  Second, we actually find that the second-to-last bar[j] =
baz[j]; assignment is unreachable, since VRP can use the ASSERT_EXPRs
to determine that i == 5 when evaluating the conditional immediately
preceding the second-to-last array access.  And because the
second-to-last assignment is unreachable then so is the last
assignment.  So we remove two unreachable array accesses (and their
enclosing basic blocks) and thus suppress the two -Warray-bounds
warnings.
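The arithmetic behind the new assertion can be checked with a few lines of C. This is only a sketch of the identity (NAME != CST1 with NAME = A + CST2 implies A != CST1 - CST2), using unsigned wraparound as in the test case; it is not the tree-vrp.c code, and the helper names `excluded_value` and `assertion_holds` are hypothetical.

```c
/* For a condition NAME != cst1 where NAME = A + cst2, return the
   constant K such that the condition is equivalent to A != K.
   Unsigned subtraction wraps, which matches "j = i - 1" being
   "i + (unsigned) -1" in the unrolled loop. */
static unsigned excluded_value(unsigned cst1, unsigned cst2)
{
  return cst1 - cst2;
}

/* Check, for one concrete value of A, that (A + cst2 != cst1) and
   (A != excluded_value (cst1, cst2)) agree. Returns 1 when they do. */
static int assertion_holds(unsigned a, unsigned cst1, unsigned cst2)
{
  unsigned name = a + cst2;
  return (name != cst1) == (a != excluded_value(cst1, cst2));
}
```

For example, with `j = i - 3` the constant CST2 is `0u - 3u`, and `excluded_value(0u, 0u - 3u)` is 3, which corresponds to the `ASSERT_EXPR (i != 3)` comment in the illustration.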

gcc-6-20160327 is now available

2016-03-27 Thread gccadmin

Snapshot gcc-6-20160327 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20160327/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 234496

You'll find:

 gcc-6-20160327.tar.bz2   Complete GCC

  MD5=3cb4291a666bd58256f2071ab39b778c
  SHA1=03bfff42128015ad5ba816eb273e589e91c26bda

Diffs from 6-20160320 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

Re: Patches to fix GCC’s C++ exception handling on NetBSD/VAX

2016-03-27 Thread Jake Hamby

I'm very pleased to report that I was able to successfully build a NetBSD/vax 
system using the checked-in GCC 5.3, with the patches I've submitted, setting 
FIRST_PSEUDO_REGISTER to 17 and DWARF_FRAME_REGISTERS to 16. The kernel 
produced with GCC 5.3 crashes (on a real VS4000/90 and also SimH) in UVM, which 
may be a bug in the kernel that different optimization exposed, or a bug in 
GCC's generated code.

If you don't set DWARF_FRAME_REGISTERS to 16, then C++ exceptions break again, 
and GDB suddenly can't deal with the larger debug frames because of the data 
structure size mismatch between GCC and GDB. But with that additional define, 
you can raise FIRST_PSEUDO_REGISTER to include PSW (which is correct, since 
DWARF already uses that meaning), remove the "#ifdef notworking" around the 
asserts that Christos added in the NetBSD tree, and everything works as well as 
it did with #define FIRST_PSEUDO_REGISTER 16.

Here's the C++ test case I've been using to verify that the stack unwinding 
works and that different simple types can be thrown and caught. My ultimate 
goal is to be able to run GCC's testsuite because I'm far from certain that the 
OS, or even the majority of packages, really exercise all of the different 
paths in this very CISC architecture.

#include <stdio.h>

int recursive_throw(int i) {
  printf("enter recursive_throw(%d)\n", i);
  if (i > 0) {
    printf("calling recursive_throw(%d)\n", i - 1);
    recursive_throw(i - 1);
  } else {
    printf("throwing exception\n");
    throw 456;
  }
  printf("exit recursive_throw(%d)\n", i);
  return i;  // not reached: the innermost call always throws
}

/* Test several kinds of throws. */
int throwtest(int test) {
  switch (test) {
    case 0:
    case 1:
      return test;

    case 2:
      throw 123;

    case 3:
      throw 123.45;

    case 4:
      throw 678.9f;

    case 5:
      recursive_throw(6);
      return 666;  // fail

    default:
      return 999;  // not used in test
  }
}

int main() {
  for (int i = 0; i < 6; i++) {
    try {
      int ret = throwtest(i);
      printf("throwtest(%d) returned %d\n", i, ret);
    } catch (int e) {
      printf("Caught int exception: %d\n", e);
    } catch (double d) {
      printf("Caught double exception: %f\n", d);
    } catch (float f) {
      printf("Caught float exception: %f\n", (double)f);
    }
  }
}

I'm pleased that I got it working, but the change I made to except.c to add:

RTX_FRAME_RELATED_P (insn) = 1;

below:

#ifdef EH_RETURN_HANDLER_RTX
  rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);

isn't really correct, I don't think. It adds an additional .cfi directive that 
wasn't there before, and a GCC ./build.sh release fails building unwind-dw2.c 
(that's the place where the build either succeeds or fails or generates bad 
code) if you try to compile with assertions (and it doesn't without my change 
to except.c).

Unfortunately, I don't have a patch for the root cause for me having to add 
that line to except.c, which is that the required mov instruction to copy the 
__builtin_eh_return() return address into the old stack frame is being deleted 
for some reason, otherwise. Since I know the #ifdef EH_RETURN_HANDLER_RTX line 
*is* staying in the final code on other archs, I presume the problem is 
something VAX-related in the .md file.

If anyone knows about GCC's liveness detection, and specifically any potential 
problems that might cause this to be happening (removing a required 
"emit_move_insn (EH_RETURN_HANDLER_RTX, ...)" before a return call that was 
added in expand_eh_return () but then deleted if -O or higher is used), any 
advice would be appreciated as to where to look.

What I'm working on now is cleaning up and refactoring the .md insn 
definitions, but I'm not ready to share that code until it works and does 
something useful. I'm hoping to be able to optimize the removal of unneeded tst 
/ cmp instructions when the condition codes have been set to something useful 
by a previous insn. I don't think the code in vax_notice_update_cc () is 
necessarily correct, and it's very difficult to understand exactly how it's 
working, because it's trying to determine this based entirely on looking at the 
RTL of the insn (set, call, zero_extract, etc), which I think may have become 
out of sync with the types of instructions that are actually emitted in vax.md 
for those operations.

So I've just finished tagging the define_insn calls in vax.md with a "cc" 
attribute (like the avr backend) which my (hopefully more correct and more 
optimized) version of vax_notice_update_cc will use to know exactly what the 
flag status is after the insn, for Z, N, and C. Here's my definition of "cc". 
I'll share the rest after I'm sure that it works.

;; Condition code settings.  On VAX, the default value is "clobber".
;; The V flag is often set to zero, or else it has a special meaning,
;; usually related to testing for a signed integer range overflow.
;; "cmp_czn", "cmp_zn", and "cmp_z" are all assumed to modify V, and
;;

Re: Patches to fix GCC's C++ exception handling on NetBSD/VAX

2016-03-27 Thread Jake Hamby

Thank you for the offer. I already tested it on an Amiga 3000 w/ 68040 card 
when I made the fix. The bug manifested as the cross-compiler crashing with a 
failure to find a suitable insn, because it couldn’t find the correct FP 
instruction to expand to. I believe it manifested when running ./build.sh 
release with “-m68040” set in CPUFLAGS. I will test it myself and see if it’s 
still an issue without the patch. If you look at the .md file, there’s an 
entirely different code path to generate the same instructions when 
"TARGET_68881 && TUNE_68040" aren't defined.

At the time I made the change, I had already been testing the code on an Amiga 
3000 w/ 68040 card, so I know that the generated code is likely correct (also, 
the assembler accepted it). So I’m assuming that it’s a fairly safe thing. It 
was the difference between the build succeeding or failing, and not an issue 
with the generated code.

So the only thing I can suggest is that you can try a build with the patch and 
make sure it's stable. I was never able to produce a build without it, because 
"TARGET_68881 && TUNE_68040" was excluding the other choices when building, I 
believe, libm or libc or the kernel or something like that. I do have a test 
case for C++ exceptions on VAX, which I will send separately.

Thanks,
Jake

> On Mar 27, 2016, at 10:08, Mikael Pettersson  wrote:
> 
> Jake Hamby writes:
>> As an added bonus, I see that my patch set also included an old m68k patch
>> that had been sitting in my tree, which fixes a crash when -m68040 is 
>> defined.
>> I may have submitted it to port-m68k before. It hasn't been tested with the
>> new compiler either. Here's that patch separately. It only matters when
>> TARGET_68881 && TUNE_68040.
> 
> Do you have a test case or some recipe for reproducing the crash?
> I'd be happy to test this patch on Linux/M68K.
> 
> /Mikael
> 
>> 
>> -Jake
>> 
>> 
>> Index: external/gpl3/gcc/dist/gcc/config/m68k/m68k.md
>> 
>> RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/m68k/m68k.md,v
>> retrieving revision 1.4
>> diff -u -u -r1.4 m68k.md
>> --- external/gpl3/gcc/dist/gcc/config/m68k/m68k.md   24 Jan 2016 09:43:33 
>> -  1.4
>> +++ external/gpl3/gcc/dist/gcc/config/m68k/m68k.md   26 Mar 2016 10:42:41 
>> -
>> @@ -2132,9 +2132,9 @@
>> ;; into the kernel to emulate fintrz.  They should also be faster
>> ;; than calling the subroutines fixsfsi or fixdfsi.
>> 
>> -(define_insn "fix_truncdfsi2"
>> +(define_insn "fix_truncsi2"
>>   [(set (match_operand:SI 0 "nonimmediate_operand" "=dm")
>> -(fix:SI (fix:DF (match_operand:DF 1 "register_operand" "f"))))
>> +(fix:SI (match_operand:FP 1 "register_operand" "f")))
>>(clobber (match_scratch:SI 2 "=d"))
>>(clobber (match_scratch:SI 3 "=d"))]
>>   "TARGET_68881 && TUNE_68040"
>> @@ -2143,9 +2143,9 @@
>>   return "fmovem%.l %!,%2\;moveq #16,%3\;or%.l %2,%3\;and%.w 
>> #-33,%3\;fmovem%.l %3,%!\;fmove%.l %1,%0\;fmovem%.l %2,%!";
>> })
>> 
>> -(define_insn "fix_truncdfhi2"
>> +(define_insn "fix_trunchi2"
>>   [(set (match_operand:HI 0 "nonimmediate_operand" "=dm")
>> -(fix:HI (fix:DF (match_operand:DF 1 "register_operand" "f"))))
>> +(fix:HI (match_operand:FP 1 "register_operand" "f")))
>>(clobber (match_scratch:SI 2 "=d"))
>>(clobber (match_scratch:SI 3 "=d"))]
>>   "TARGET_68881 && TUNE_68040"
>> @@ -2154,9 +2154,9 @@
>>   return "fmovem%.l %!,%2\;moveq #16,%3\;or%.l %2,%3\;and%.w 
>> #-33,%3\;fmovem%.l %3,%!\;fmove%.w %1,%0\;fmovem%.l %2,%!";
>> })
>> 
>> -(define_insn "fix_truncdfqi2"
>> +(define_insn "fix_truncqi2"
>>   [(set (match_operand:QI 0 "nonimmediate_operand" "=dm")
>> -(fix:QI (fix:DF (match_operand:DF 1 "register_operand" "f"))))
>> +(fix:QI (match_operand:FP 1 "register_operand" "f")))
>>(clobber (match_scratch:SI 2 "=d"))
>>(clobber (match_scratch:SI 3 "=d"))]
>>   "TARGET_68881 && TUNE_68040"
> 
> --

Re: [DOC Patch] Add sample for @cc constraint

2016-03-27 Thread David Wohlferd

Thanks for the feedback.  While I agree with some of this, there are 
parts that I want to defend.  If after explaining why I did what I did 
you still feel it should be changed, I'm prepared to do that.

On 3/24/2016 8:00 AM, Bernd Schmidt wrote:
> More problematic than a lack of documentation is that I haven't been 
> able to find an executable testcase. If you could adapt your example for 
> use in gcc.target/i386, that would be even more important.

It looks like Richard included some "scan-assembler" statements in the 
suites with the original checkin 
(https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=225122). Is that 
not sufficient?  If not, I'm certainly prepared to create a couple 
executable cases for the next rev of this patch.

>> +Do not clobber flags if they are being used as outputs.
>
> I don't think the manual should point out the obvious. I'd be 
> surprised if this wasn't documented or at least strongly implied 
> elsewhere for normal operands.

Well, *I* thought it was obvious, because it is both documented and 
implied elsewhere.

However, the compiler doesn't see it that way.  Normally, attempting to 
overlap 'clobbers' and 'outputs' generates compile errors, but not when 
outputting and clobbering flags.  I filed pr68095 about this (including 
a rough draft of a patch), but apparently not everyone sees this the way 
I do.
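For readers who have not used the feature, here is a minimal sketch of a flag output operand written the way the docs recommend, with no "cc" clobber alongside the flag output. It assumes an x86 target, the default AT&T assembler syntax, and a compiler that defines `__GCC_ASM_FLAG_OUTPUTS__`; on anything else it falls back to plain C. The function name `bit_is_set` is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Return 1 if bit number `bit` of `value` is set, else 0. */
static int bit_is_set(unsigned long value, unsigned long bit)
{
  assert(bit < sizeof value * CHAR_BIT);
#if defined(__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined(__x86_64__) || defined(__i386__))
  int carry;
  /* bt copies the selected bit into the carry flag, and "=@ccc" hands
     CF back as an output. "cc" is deliberately not listed as a clobber,
     because the flags are an output of this asm. */
  __asm__ ("bt %2, %1" : "=@ccc" (carry) : "r" (value), "r" (bit));
  return carry;
#else
  /* Portable fallback for builds without flag output operands. */
  return (int) ((value >> bit) & 1ul);
#endif
}
```

Adding `: "cc"` to the asm above is the overlap under discussion: it currently compiles without complaint, which is the behaviour the proposed text warns against relying on.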

Outputting flags is new to v6, so changing the compiler to reject 
overlaps before the feature is ever released would be ideal.  If we try 
to patch this in v7, will it get rejected because it would break 
backward compatibility?

If we aren't going to change the code, then I decided it needed to be 
hammered home in the docs.  Because someday someone is going to want to 
do something more with flags, but they won't be able to because it will 
break backward compatibility with all the people who have written this 
the "obviously wrong" way.  Hopefully this text will serve as 
justification for that future someone to do it anyway.

That said, I'm ok with any of:

1) Leave this text in.
2) Remove the text and add the compiler check to v6.
3) Remove the text and add the compiler check to v7.
4) Leave the text in v6, then in v7: remove the text and add the 
compiler check.

5) (Reluctantly) remove the text and hope this never becomes a problem.

I'll update the patch with whichever option seems best.  If it were my 
choice to make, I'd go with #4 (followed by 3, 1, 5).  2 would actually 
be the best, but probably isn't realistic at this late date.

>> +For builds that don't support flag output operands,
>
> This feels strange, we should just be documenting the capabilities of 
> this feature. Other parts of the docs already show what to do without it.

While I liked using the #define to contrast how this used to work (not 
sure where you think we show this?) with how the feature makes things 
better, I think I prefer the shorter example you are proposing.  I'll 
change this in the next rev of the patch.

>> +Note: On the x86 platform, flags are normally considered clobbered by
>> +extended asm whether the @code{"cc"} clobber is specified or not.
>
> Is it really necessary or helpful to mention that here? Not only is 
> it not strictly correct (an output operand is not also considered 
> clobbered), but to me it breaks the flow because you're left wondering 
> how that sentence relates to the example (it doesn't).

The problem I am trying to fix here is that on x86, the "cc" is implicit 
for all extended asm statements, whether it is specified or not and 
whether there is a flags output or not.  However, that fact isn't 
documented anywhere.  So, where does that info go?  It could go right by 
the docs for "cc", but since this behavior only applies to x86, that 
would make the docs there messy.

Since the 'output flags' section already has an x86-specific section, 
that seemed like a plausible place to put it.  But no matter where I put 
it in that section, it always looks weird for exactly the reasons you state.

I'll try moving it up by the "cc" clobber in the next rev.  Let me know 
what you think.

>> +For platform-specific uses of flags, see also
>> +@ref{FlagOutputOperands,Flag Output Operands}.
>
> Is this likely to be helpful? Someone who's looking at how to use 
> flag outputs probably isn't looking in the "Clobbers" section?

People reading about "cc" may be interested in knowing that you can do 
something with flags other than clobbering them.  And of course this 
lets us put the note about "x86 always clobbers flags" in that other 
section.

dw

Re: [PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-27 Thread Patrick Palka

On Sun, 27 Mar 2016, Patrick Palka wrote:

> In unrolling of the inner loop in the test case below we introduce
> unreachable code that otherwise contains out-of-bounds array accesses.
> This is because the estimation of the maximum number of iterations of
> the inner loop is too conservative: we assume 6 iterations instead of
> the actual 4.
> 
> Nonetheless, VRP should be able to tell that the code is unreachable so
> that it doesn't warn about it.  The only thing holding VRP back is that
> it doesn't look through conditionals of the form
> 
>   if (j_10 != CST1)    where j_10 = j_9 + CST2
> 
> so that it could add the assertion
> 
>   j_9 != (CST1 - CST2)
> 
> This patch teaches VRP to detect such conditionals and to add such
> assertions, so that it could remove instead of warn about the
> unreachable code created during loop unrolling.
> 
> What this addition does with the test case below is something like this:
> 
> ASSERT_EXPR (i <= 5);
> for (i = 1; i < 6; i++)
>   {
>     j = i - 1;
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 1)
>     bar[j] = baz[j];
> 
>     j = i - 2
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 2)
>     bar[j] = baz[j];
> 
>     j = i - 3
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 3)
>     bar[j] = baz[j];
> 
>     j = i - 4
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 4)
>     bar[j] = baz[j];
> 
>     j = i - 5
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 5)
>     bar[j] = baz[j];
> 
>     j = i - 6
>     if (j == 0)
>       break;
>     // ASSERT_EXPR (i != 6)
>     bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is always false
>   }
> 
> (I think the patch I sent a year ago that improved the
>  register_edge_assert stuff would have fixed this too.  I'll try to
>  post it again during next stage 1.
>  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00908.html)
> 
> Bootstrap + regtest in progress on x86_64-pc-linux-gnu, does this look
> OK to commit after testing?
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/59124
>   * tree-vrp.c (register_edge_assert_for): For NAME != CST1
>   where NAME = A + CST2 add the assertion A != (CST1 - CST2).
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/59124
>   * gcc.dg/Warray-bounds-19.c: New test.
> ---
>  gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
>  gcc/tree-vrp.c  | 22 ++
>  2 files changed, 39 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-19.c
> 
> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-19.c 
> b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
> new file mode 100644
> index 0000000..e2f9661
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/59124 */
> +/* { dg-options "-O3 -Warray-bounds" } */
> +
> +unsigned baz[6];
> +
> +void foo(unsigned *bar, unsigned n)
> +{
> +  unsigned i, j;
> +
> +  if (n > 6)
> +n = 6;
> +
> +  for (i = 1; i < n; i++)
> +for (j = i - 1; j > 0; j--)
> +  bar[j - 1] = baz[j - 1];
> +}
> +
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index b5654c5..31bd575 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -5820,6 +5820,28 @@ register_edge_assert_for (tree name, edge e, 
> gimple_stmt_iterator si,
>   }
>  }
>  
> +  /* In the case of NAME != CST1 where NAME = A + CST2 we can
> + assert that NAME != (CST1 - CST2).  */

This should say A != (...) not NAME != (...)

> +  if ((comp_code == EQ_EXPR || comp_code == NE_EXPR)
> +  && TREE_CODE (val) == INTEGER_CST)
> +{
> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
> +
> +  if (is_gimple_assign (def_stmt)
> +   && gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
> + {
> +   tree op0 = gimple_assign_rhs1 (def_stmt);
> +   tree op1 = gimple_assign_rhs2 (def_stmt);
> +   if (TREE_CODE (op0) == SSA_NAME
> +   && TREE_CODE (op1) == INTEGER_CST)
> + {
> +   op1 = int_const_binop (MINUS_EXPR, val, op1);
> +   register_edge_assert_for_2 (op0, e, si, comp_code,
> +   op0, op1, is_else_edge);

The last argument to register_edge_assert_for_2() should be false not
is_else_edge since comp_code is already inverted.

Consider these two things fixed.  Also I moved down the new code so that
it's at the very bottom of register_edge_assert_for.  Here's an updated
patch that passes bootstrap + regtest.

-- 8< --

gcc/ChangeLog:

PR tree-optimization/59124
* tree-vrp.c (register_edge_assert_for): For NAME != CST1
where NAME = A + CST2 add the assertion A != (CST1 - CST2).

gcc/testsuite/ChangeLog:

PR tree-optimization/59124
* gcc.dg/Warray-bounds-19.c: New test.
---
 gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
 gcc/tree-vrp.c  | 22 ++
 2 files changed, 39 insertions(+)
 create

[Bug middle-end/70424] [4.9/5/6 Regression] Pointer derived from integer gets reduced alignment

2016-03-27 Thread bugdal at aerifal dot cx

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70424

Rich Felker  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #1 from Rich Felker  ---
If correct, this can likely break MMIO access in bare-metal applications or
kernel drivers that derive the MMIO addresses via certain types of arithmetic
expressions. Accessing a 32-bit MMIO register as multiple 16-bit or 8-bit
loads/stores is likely to do the wrong thing or not work at all.

I see no reason why GCC should even try to account for the possibility that the
resulting pointer might be misaligned. Unless the pointed-to type has
__attribute__((__aligned__(1))) applied to it, misaligned access is simply UB.

[Bug middle-end/70424] New: [4.9/5/6 Regression] Pointer derived from integer gets reduced alignment

2016-03-27 Thread amonakov at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70424

Bug ID: 70424
   Summary: [4.9/5/6 Regression] Pointer derived from integer gets
reduced alignment
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

int f(long a)
{
  int *p=(int*)(a<<1);
  //asm("" : "+r"(p));
  return *p;
}

Starting from 4.9, in the above example GCC assumes that *p is aligned to 16
bits (on 4.8 and earlier, to 32 bits, like normal int*). This causes the load
to be torn in two on strict-alignment targets; using -O0 or uncommenting the
asm restores old behavior (one 32-bit load). This change seems unintended.
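To show what "torn in two" means at the source level, here is a small C sketch contrasting one 32-bit access with two 16-bit accesses to the same bytes. It models the semantics only; it says nothing about the instructions a given compiler emits, and the function names are hypothetical. On ordinary memory both give the same value, which is why the difference only becomes visible with things like MMIO registers.

```c
#include <stdint.h>
#include <string.h>

/* Read four bytes at p as one 32-bit access. */
static uint32_t load32_single(const void *p)
{
  uint32_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* Read the same four bytes as two separate 16-bit accesses and
   reassemble them in memory order, so the result does not depend
   on the host's endianness. */
static uint32_t load32_torn(const void *p)
{
  const unsigned char *bytes = p;
  uint16_t half[2];
  uint32_t v;
  memcpy(&half[0], bytes, sizeof half[0]);      /* first 16-bit access */
  memcpy(&half[1], bytes + 2, sizeof half[1]);  /* second 16-bit access */
  memcpy(&v, half, sizeof v);
  return v;
}
```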

On x86_64 it's visible on RTL level (note A32->A16 change):

gcc-4.8.0 -S t.c -Os -o- -dP

#(insn:TI 7 3 15 2 (set (reg:SI 0 ax [orig:66 *p_3 ] [66])
#(mem:SI (plus:DI (reg/v:DI 5 di [orig:63 a ] [63])
#(reg/v:DI 5 di [orig:63 a ] [63])) [2 *p_3+0 S4 A32]))
align.c:5 89 {*movsi_internal}
# (expr_list:REG_DEAD (reg/v:DI 5 di [orig:63 a ] [63])
#(nil)))
movl    (%rdi,%rdi), %eax   # 7 *movsi_internal/1   [length = 3]

gcc-4.9.2 -S t.c -Os -o- -dP

#(insn:TI 7 3 13 2 (set (reg:SI 0 ax [orig:90 *p_3 ] [90])
#(mem:SI (plus:DI (reg/v:DI 5 di [orig:87 a ] [87])
#(reg/v:DI 5 di [orig:87 a ] [87])) [2 *p_3+0 S4 A16]))
align.c:5 90 {*movsi_internal}
# (expr_list:REG_DEAD (reg/v:DI 5 di [orig:87 a ] [87])
#(nil)))
movl    (%rdi,%rdi), %eax   # 7 *movsi_internal/1   [length = 3]

[Bug target/70421] [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-27 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2016-03-27
                 CC|                            |jakub at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #2 from Jakub Jelinek  ---
Untested fix:
--- gcc/config/i386/i386.c  (revision 234449)
+++ gcc/config/i386/i386.c  (working copy)
@@ -46930,7 +46930,7 @@ half:
 {
   tmp = gen_reg_rtx (mode);
   emit_insn (gen_rtx_SET (tmp, gen_rtx_VEC_DUPLICATE (mode, val)));
-  emit_insn (gen_blendm (target, tmp, target,
+  emit_insn (gen_blendm (target, target, tmp,
 force_reg (mmode,
gen_int_mode (1 << elt, mmode))));
 }

Both the
(define_insn "_blendm"
  [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
(vec_merge:V48_AVX512VL
  (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
  (match_operand:V48_AVX512VL 1 "register_operand" "v")
  (match_operand: 3 "register_operand" "Yk")))]
  "TARGET_AVX512F"
  "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "")])

(define_insn "_blendm"
  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
(vec_merge:VI12_AVX512VL
  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
  (match_operand:VI12_AVX512VL 1 "register_operand" "v")
  (match_operand: 3 "register_operand" "Yk")))]
  "TARGET_AVX512BW"
  "vpblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "")])
patterns have the order of operands swapped vs. VEC_MERGE, and for VEC_MERGE
we use the
  tmp = gen_rtx_VEC_MERGE (mode, tmp, target, GEN_INT (1 << elt));
order, so I believe the above patch is right.  Will test it on Tuesday.
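The operand-order issue is easier to see with a scalar model of the RTL vec_merge operation, in which lane i of the result comes from the first vector when bit i of the mask is set and from the second vector otherwise. The C sketch below models that and uses it to set a single lane, as in the `1 << elt` case above. It is an illustration only, not i386.c code; `NELTS`, `vec_merge` and `set_lane` are names made up for the example.

```c
#include <assert.h>

enum { NELTS = 16 };  /* e.g. the 16 lanes of a v16si vector */

/* Model of (vec_merge a b mask): lane i comes from a when bit i of
   mask is set, otherwise from b. */
static void vec_merge(int *dst, const int *a, const int *b, unsigned mask)
{
  for (int i = 0; i < NELTS; i++)
    dst[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}

/* Set lane elt of target to val by merging a broadcast of val (first
   operand) with the old contents of target (second operand). */
static void set_lane(int *target, int val, int elt)
{
  int dup[NELTS], out[NELTS];
  assert(elt >= 0 && elt < NELTS);
  for (int i = 0; i < NELTS; i++)
    dup[i] = val;
  vec_merge(out, dup, target, 1u << elt);
  for (int i = 0; i < NELTS; i++)
    target[i] = out[i];
}
```

Swapping `dup` and `target` in the `vec_merge` call would keep only lane `elt` of the old vector and overwrite every other lane with `val`, which is the kind of wrong result an operand-order mix-up produces.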

[Bug driver/70423] New: -shared option description isn't clear about exactly when -fpic/-fPIC is required

2016-03-27 Thread britton.kerin at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70423

Bug ID: 70423
   Summary: -shared option description isn't clear about exactly
when -fpic/-fPIC is required
   Product: gcc
   Version: 5.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: britton.kerin at gmail dot com
  Target Milestone: ---

Section 3.13 Options for Linking includes this:

-shared
Produce a shared object which can then be linked with other objects to form
an executable. Not all systems support this option. For predictable results,
you must also specify the same set of options used for compilation (-fpic,
-fPIC, or model suboptions) when you specify this linker option.

This makes it sound like -fpic/-fPIC would be required when performing a
link-only gcc invocation (i.e. with arguments consisting only of .o files).
Most people don't include -fpic/-fPIC in this situation and in fact it
apparently isn't required:

Cary Coutant wrote in a bug report elsewhere:

 The -fpic and model suboptions are not linker options. They're only
 required when you pass -shared to gcc if you're also compiling source
 files at the same time. If you're just running gcc to link a bunch of .o
 files, compiler options like -fpic are unnecessary.

This problem could be fixed by changing the -shared description to read like
this:

-shared
Produce a shared object which can then be linked with other objects to form
an executable. Not all systems support this option. For predictable results,
you must also specify the same set of options used for compilation (-fpic,
-fPIC, or model suboptions) when you specify this linker option in a gcc
invocation that will perform both compilation and linking.

Actually this is still a little imperfect since if I understand correctly the
purpose is to get the same set of options in the compile/link invocation as
those used for any previous compilations that produced object files to be
included in the link.  If the invocation compiles everything that gets linked
there's no possibility to go wrong.  But I think the above gets the point
across.

This is worth fixing because -shared -fPIC -fpic etc. form a somewhat
complicated nest so the specifications need to be precise.  I believe most
existing build systems don't mix compilation and linking, so don't use
-fpic/-fPIC at link time, so violate the most likely (but wrong) interpretation
of the situations in which -fpic/-fPIC are required according to the current
-shared option description.

Re: [Patch, Fortran, pr70397, gcc-5, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-27 Thread Dominique d'Humières

Andre,

To apply the patch on a recent trunk, the hunk

@@ -1070,7 +1089,7 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
   if (unlimited)
 {
   if (from_class_base != NULL_TREE)
-   from_len = gfc_class_len_get (from_class_base);
+   from_len = gfc_class_len_or_zero_get (from_class_base);
   else
from_len = integer_zero_node;
 }
should be something such as

@@ -1120,7 +1142,7 @@ gfc_copy_class_to_class (tree from, tree
   if (unlimited)
 {
   if (from != NULL_TREE && unlimited)
-   from_len = gfc_class_len_get (from);
+   from_len = gfc_class_len_or_zero_get (from);
   else
from_len = integer_zero_node;
 }

With my patched tree I also see the regression

FAIL: gfortran.dg/coarray_allocate_4.f08  * (internal compiler error)

/opt/gcc/work/gcc/testsuite/gfortran.dg/coarray_allocate_4.f08:39:0:

allocate (z, source=x)
 
internal compiler error: tree check: expected record_type or union_type or qual_union_type, have void_type in gfc_class_len_or_zero_get, at fortran/trans-expr.c:186

Dominique

[Bug c++/70275] -w disables all -Werror flags

2016-03-27 Thread manu at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70275

--- Comment #4 from Manuel López-Ibáñez  ---
(In reply to Kevin Tucker from comment #3)
> I'm new to this.  How is it determined if this is a desired change or not?

Suggestion #10 applies also to non-patches: https://gcc.gnu.org/wiki/Community

In short, I would recommend writing to g...@gcc.gnu.org, CCing the relevant
MAINTAINERS, choosing an appropriate subject, and writing a concise but
clear-cut email so that they don't overlook it for lack of time or clarity.

[PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-27 Thread Patrick Palka

In unrolling of the inner loop in the test case below we introduce
unreachable code that otherwise contains out-of-bounds array accesses.
This is because the estimation of the maximum number of iterations of
the inner loop is too conservative: we assume 6 iterations instead of
the actual 4.

Nonetheless, VRP should be able to tell that the code is unreachable so
that it doesn't warn about it.  The only thing holding VRP back is that
it doesn't look through conditionals of the form

   if (j_10 != CST1)    where j_10 = j_9 + CST2

so that it could add the assertion

   j_9 != (CST1 - CST2)

This patch teaches VRP to detect such conditionals and to add such
assertions, so that it can remove, rather than warn about, the
unreachable code created during loop unrolling.
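
The arithmetic behind the new assertion can be checked in isolation.  The
sketch below is not GCC code: derived_constant and inference_holds are
hypothetical names, and 8-bit unsigned values stand in for SSA names so that
the check can be exhaustive.  It assumes wrapping (modular) arithmetic, under
which NAME != CST1 with NAME = A + CST2 holds exactly when A != CST1 - CST2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the constant that A is compared against in the
   derived assertion, i.e. CST1 - CST2 with 8-bit wraparound.  */
static uint8_t
derived_constant (uint8_t cst1, uint8_t cst2)
{
  return (uint8_t) (cst1 - cst2);
}

/* Return 1 if, for every 8-bit value of A, the original test
   (A + CST2 != CST1) agrees with the derived test
   (A != CST1 - CST2); return 0 otherwise.  */
static int
inference_holds (uint8_t cst1, uint8_t cst2)
{
  for (unsigned a = 0; a < 256; a++)
    {
      uint8_t name = (uint8_t) (a + cst2);   /* NAME = A + CST2 */
      int original = name != cst1;           /* NAME != CST1 */
      int derived = (uint8_t) a != derived_constant (cst1, cst2);
      if (original != derived)
        return 0;
    }
  return 1;
}
```

Calling inference_holds for every pair of constants is a cheap way to convince
oneself that the subtraction goes in the right direction.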

What this addition does with the test case below is something like this:

ASSERT_EXPR (i <= 5);
for (i = 1; i < 6; i++)
  {
    j = i - 1;
    if (j == 0)
      break;
    // ASSERT_EXPR (i != 1)
    bar[j] = baz[j];

    j = i - 2;
    if (j == 0)
      break;
    // ASSERT_EXPR (i != 2)
    bar[j] = baz[j];

    j = i - 3;
    if (j == 0)
      break;
    // ASSERT_EXPR (i != 3)
    bar[j] = baz[j];

    j = i - 4;
    if (j == 0)
      break;
    // ASSERT_EXPR (i != 4)
    bar[j] = baz[j];

    j = i - 5;
    if (j == 0)
      break;
    // ASSERT_EXPR (i != 5)
    bar[j] = baz[j];

    j = i - 6;
    if (j == 0)
      break;
    // ASSERT_EXPR (i != 6)
    bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is always false
  }

(I think the patch I sent a year ago that improved the
 register_edge_assert stuff would have fixed this too.  I'll try to
 post it again during next stage 1.
 https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00908.html)

Bootstrap + regtest in progress on x86_64-pc-linux-gnu, does this look
OK to commit after testing?

gcc/ChangeLog:

PR tree-optimization/59124
* tree-vrp.c (register_edge_assert_for): For NAME != CST1
where NAME = A + CST2 add the assertion A != (CST1 - CST2).

gcc/testsuite/ChangeLog:

PR tree-optimization/59124
* gcc.dg/Warray-bounds-19.c: New test.
---
 gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
 gcc/tree-vrp.c  | 22 ++
 2 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-19.c

diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-19.c b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
new file mode 100644
index 000..e2f9661
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
@@ -0,0 +1,17 @@
+/* PR tree-optimization/59124 */
+/* { dg-options "-O3 -Warray-bounds" } */
+
+unsigned baz[6];
+
+void foo(unsigned *bar, unsigned n)
+{
+  unsigned i, j;
+
+  if (n > 6)
+    n = 6;
+
+  for (i = 1; i < n; i++)
+    for (j = i - 1; j > 0; j--)
+      bar[j - 1] = baz[j - 1];
+}
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b5654c5..31bd575 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -5820,6 +5820,28 @@ register_edge_assert_for (tree name, edge e, gimple_stmt_iterator si,
}
 }
 
+  /* In the case of NAME != CST1 where NAME = A + CST2 we can
+     assert that A != (CST1 - CST2).  */
+  if ((comp_code == EQ_EXPR || comp_code == NE_EXPR)
+      && TREE_CODE (val) == INTEGER_CST)
+    {
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+
+      if (is_gimple_assign (def_stmt)
+          && gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
+        {
+          tree op0 = gimple_assign_rhs1 (def_stmt);
+          tree op1 = gimple_assign_rhs2 (def_stmt);
+          if (TREE_CODE (op0) == SSA_NAME
+              && TREE_CODE (op1) == INTEGER_CST)
+            {
+              op1 = int_const_binop (MINUS_EXPR, val, op1);
+              register_edge_assert_for_2 (op0, e, si, comp_code,
+                                          op0, op1, is_else_edge);
+            }
+        }
+    }
+
   /* In the case of NAME == 0 or NAME != 1, for BIT_IOR_EXPR defining
  statement of NAME we can assert both operands of the BIT_IOR_EXPR
  have zero value.  */
-- 
2.8.0.rc3.27.gade0865

[Bug tree-optimization/59124] [4.9/5/6 Regression] Wrong warnings "array subscript is above array bounds"

2016-03-27 Thread ppalka at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124

Patrick Palka  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
                 CC|                              |ppalka at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org

--- Comment #35 from Patrick Palka  ---
I have a rather simple patch that teaches VRP to insert the relevant
ASSERT_EXPRs so that it knows to remove the unreachable code inserted by the
loop unrolling.

Re: [Patch, Fortran, pr70397, gcc-5, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-27 Thread Paul Richard Thomas

Hi Andre,

The patch looks to be fine to me for both trunk and 5-branch.

Thanks for the patch.

Paul

On 27 March 2016 at 18:53, Andre Vehreschild  wrote:
> Hi all,
>
> and here is already the follow-up. In the initial patch, a save wasn't done
> before generating the patch, so the renaming of the new function was only
> partial. Sorry for the noise.
>
> - Andre
>
> Am Sun, 27 Mar 2016 18:49:18 +0200
> schrieb Andre Vehreschild :
>
>> Hi all,
>>
>> attached is a patch to fix an ICE on allocating an unlimited polymorphic
>> entity from a non-poly class or type without a length component. The routine
>> gfc_copy_class_to_class() assumed that both the source and destination
>> objects' types are unlimited polymorphic, but in this case it is true for the
>> destination only, which made gfortran look for a non-existent _len component
>> in the source object and therefore ICE. This is fixed by the patch by adding
>> a function to return either the _len component, when it exists, or a constant
>> zero node to init the destination object's _len component with.
>>
>> Bootstrapped and regtested ok on x86_64-linux-gnu/F23. (Might have some
>> line deltas, because my git is a bit older. Sorry, only have limited/slow
>> net-access currently.)
>>
>> The same patch should be adaptable to trunk. To come...
>>
>> Ok for 5-trunk?
>>
>> Regards,
>>   Andre
>
>
>
> --
> Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen
> Email: ve...@gmx.de * Tel: +49 241 9291018



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: [Patch, Fortran] STOP issue with coarrays

2016-03-27 Thread Paul Richard Thomas

Hi Alessandro,

The patch is fine for trunk and 5-branch - verging on trivial, I would
say! Are you going to add the testcase?

Thanks a lot! I am impressed that you are doing these between
celebrating your doctorate and preparing for your move :-)

Paul

On 27 March 2016 at 17:10, Alessandro Fanfarillo
 wrote:
> Dear all,
>
> the attached patch fixes the issue reported by Anton Shterenlikht
> (https://gcc.gnu.org/ml/fortran/2016-03/msg00037.html). The compiler
> now delegates handling of the STOP statement to the external library
> when -fcoarray=lib is used.
>
> Built and regtested on x86_64-pc-linux-gnu.
>
> Ok for trunk and gcc-5-branch?

-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: Patches to fix GCC's C++ exception_handling on NetBSD/VAX

2016-03-27 Thread Mikael Pettersson

Jake Hamby writes:
 > As an added bonus, I see that my patch set also included an old m68k patch
 > that had been sitting in my tree, which fixes a crash when -m68040 is defined.
 > I may have submitted it to port-m68k before. It hasn't been tested with the
 > new compiler either. Here's that patch separately. It only matters when
 > TARGET_68881 && TUNE_68040.

Do you have a test case or some recipe for reproducing the crash?
I'd be happy to test this patch on Linux/M68K.

/Mikael

 > 
 > -Jake
 > 
 > 
 > Index: external/gpl3/gcc/dist/gcc/config/m68k/m68k.md
 > 
 > RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/m68k/m68k.md,v
 > retrieving revision 1.4
 > diff -u -u -r1.4 m68k.md
 > --- external/gpl3/gcc/dist/gcc/config/m68k/m68k.md   24 Jan 2016 09:43:33 
 > -  1.4
 > +++ external/gpl3/gcc/dist/gcc/config/m68k/m68k.md   26 Mar 2016 10:42:41 
 > -
 > @@ -2132,9 +2132,9 @@
 >  ;; into the kernel to emulate fintrz.  They should also be faster
 >  ;; than calling the subroutines fixsfsi or fixdfsi.
 > 
 > -(define_insn "fix_truncdfsi2"
 > +(define_insn "fix_truncsi2"
 >[(set (match_operand:SI 0 "nonimmediate_operand" "=dm")
 > -(fix:SI (fix:DF (match_operand:DF 1 "register_operand" "f"
 > +(fix:SI (match_operand:FP 1 "register_operand" "f")))
 > (clobber (match_scratch:SI 2 "=d"))
 > (clobber (match_scratch:SI 3 "=d"))]
 >"TARGET_68881 && TUNE_68040"
 > @@ -2143,9 +2143,9 @@
 >    return "fmovem%.l %!,%2\;moveq #16,%3\;or%.l %2,%3\;and%.w #-33,%3\;fmovem%.l %3,%!\;fmove%.l %1,%0\;fmovem%.l %2,%!";
 >  })
 > 
 > -(define_insn "fix_truncdfhi2"
 > +(define_insn "fix_trunchi2"
 >[(set (match_operand:HI 0 "nonimmediate_operand" "=dm")
 > -(fix:HI (fix:DF (match_operand:DF 1 "register_operand" "f"
 > +(fix:HI (match_operand:FP 1 "register_operand" "f")))
 > (clobber (match_scratch:SI 2 "=d"))
 > (clobber (match_scratch:SI 3 "=d"))]
 >"TARGET_68881 && TUNE_68040"
 > @@ -2154,9 +2154,9 @@
 >    return "fmovem%.l %!,%2\;moveq #16,%3\;or%.l %2,%3\;and%.w #-33,%3\;fmovem%.l %3,%!\;fmove%.w %1,%0\;fmovem%.l %2,%!";
 >  })
 > 
 > -(define_insn "fix_truncdfqi2"
 > +(define_insn "fix_truncqi2"
 >[(set (match_operand:QI 0 "nonimmediate_operand" "=dm")
 > -(fix:QI (fix:DF (match_operand:DF 1 "register_operand" "f"
 > +(fix:QI (match_operand:FP 1 "register_operand" "f")))
 > (clobber (match_scratch:SI 2 "=d"))
 > (clobber (match_scratch:SI 3 "=d"))]
 >"TARGET_68881 && TUNE_68040"

--

Re: [Patch, Fortran, pr70397, gcc-5, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-27 Thread Andre Vehreschild

Hi all,

and here is already the follow-up. In the initial patch, a save wasn't done
before generating the patch, so the renaming of the new function was only
partial. Sorry for the noise.

- Andre

Am Sun, 27 Mar 2016 18:49:18 +0200
schrieb Andre Vehreschild :

> Hi all,
> 
> attached is a patch to fix an ICE on allocating an unlimited polymorphic
> entity from a non-poly class or type without a length component. The routine
> gfc_copy_class_to_class() assumed that both the source and destination
> objects' types are unlimited polymorphic, but in this case it is true for the
> destination only, which made gfortran look for a non-existent _len component
> in the source object and therefore ICE. This is fixed by the patch by adding
> a function to return either the _len component, when it exists, or a constant
> zero node to init the destination object's _len component with.
> 
> Bootstrapped and regtested ok on x86_64-linux-gnu/F23. (Might have some
> line deltas, because my git is a bit older. Sorry, only have limited/slow
> net-access currently.)
> 
> The same patch should be adaptable to trunk. To come...
> 
> Ok for 5-trunk?
> 
> Regards,
>   Andre



-- 
Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen
Email: ve...@gmx.de * Tel: +49 241 9291018
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 1681d14..642ce26 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -173,6 +173,24 @@ gfc_class_len_get (tree decl)
 }
 
 
+/* Try to get the _len component of a class.  When the class is not unlimited
+   poly, i.e. no _len field exists, then return a zero node.  */
+
+tree
+gfc_class_len_or_zero_get (tree decl)
+{
+  tree len;
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref_loc (input_location, decl);
+  len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)),
+			   CLASS_LEN_FIELD);
+  return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
+	 TREE_TYPE (len), decl, len,
+	 NULL_TREE)
+			  : integer_zero_node;
+}
+
+
 /* Get the specified FIELD from the VPTR.  */
 
 static tree
@@ -250,6 +268,7 @@ gfc_vptr_size_get (tree vptr)
 
 #undef CLASS_DATA_FIELD
 #undef CLASS_VPTR_FIELD
+#undef CLASS_LEN_FIELD
 #undef VTABLE_HASH_FIELD
 #undef VTABLE_SIZE_FIELD
 #undef VTABLE_EXTENDS_FIELD
@@ -1070,7 +1089,7 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
   if (unlimited)
 {
   if (from_class_base != NULL_TREE)
-	from_len = gfc_class_len_get (from_class_base);
+	from_len = gfc_class_len_or_zero_get (from_class_base);
   else
 	from_len = integer_zero_node;
 }
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index e6544f9..9a181be 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -356,6 +356,7 @@ tree gfc_class_set_static_fields (tree, tree, tree);
 tree gfc_class_data_get (tree);
 tree gfc_class_vptr_get (tree);
 tree gfc_class_len_get (tree);
+tree gfc_class_len_or_zero_get (tree);
 gfc_expr * gfc_find_and_cut_at_last_class_ref (gfc_expr *);
 /* Get an accessor to the class' vtab's * field, when a class handle is
available.  */
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
new file mode 100644
index 000..d0b2a2e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
@@ -0,0 +1,40 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUBLIC :: base_type
+  END TYPE base_type
+
+  TYPE, PUBLIC :: dict_entry_type
+ CLASS( * ), ALLOCATABLE :: key
+ CLASS( * ), ALLOCATABLE :: val
+  END TYPE dict_entry_type
+
+
+contains
+
+  SUBROUTINE dict_put ( this, key, val )
+CLASS(dict_entry_type), INTENT(INOUT) :: this
+CLASS(base_type), INTENT(IN) :: key, val
+INTEGER  :: istat
+ALLOCATE( this%key, SOURCE=key, STAT=istat )
+  end SUBROUTINE dict_put
+end module mod
+
+program test
+  use mod
+  type(dict_entry_type) :: t
+  type(base_type) :: a, b
+  call dict_put(t, a, b)
+
+  if (.NOT. allocated(t%key)) call abort()
+  select type (x => t%key)
+type is (base_type)
+class default
+  call abort()
+  end select
+  deallocate(t%key)
+end
+
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90
new file mode 100644
index 000..1300069
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90
@@ -0,0 +1,47 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUBLIC :: dict_entry_type
+ CLASS( * ), ALLOCATABLE :: key
+ CLASS( * ), ALLOCATABLE :: val
+  END TYPE dict_entry_type
+
+
+contains
+
+  SUBROUTINE dict_put ( this, key, val )
+CLASS(dict_entry_type), INTENT(INOUT) :: this
+CLASS(*), INTENT(IN)

[Patch, Fortran, pr70397, gcc-5, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-27 Thread Andre Vehreschild

Hi all,

attached is a patch to fix an ICE on allocating an unlimited polymorphic entity
from a non-poly class or type without a length component. The routine
gfc_copy_class_to_class() assumed that both the source and destination objects'
types are unlimited polymorphic, but in this case it is true for the destination
only, which made gfortran look for a non-existent _len component in the source
object and therefore ICE. This is fixed by the patch by adding a function to
return either the _len component, when it exists, or a constant zero node to
init the destination object's _len component with.
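
The "component or zero" fallback described above can be illustrated with a toy
model.  This is a sketch only, not the gfortran implementation: struct field,
toy_advance_chain and toy_len_or_zero are hypothetical stand-ins for GCC trees
and helpers, and the field order (_data, _vptr, _len) is an assumption made
for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a chain of class components.  */
struct field
{
  const char *name;
  long value;
  const struct field *next;
};

/* Assumed position of the _len component in the chain.  */
enum { TOY_LEN_FIELD = 2 };

/* Follow N links; yields NULL when the chain has fewer than N + 1
   elements.  */
static const struct field *
toy_advance_chain (const struct field *f, int n)
{
  while (f != NULL && n-- > 0)
    f = f->next;
  return f;
}

/* Return the _len value when the component exists, otherwise zero,
   so the caller never dereferences a missing field.  */
static long
toy_len_or_zero (const struct field *fields)
{
  const struct field *len = toy_advance_chain (fields, TOY_LEN_FIELD);
  return len != NULL ? len->value : 0;
}
```

A three-element chain yields the stored length, while a two-element chain (no
_len) yields zero instead of failing.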

Bootstrapped and regtested ok on x86_64-linux-gnu/F23. (Might have some
line deltas, because my git is a bit older. Sorry, only have limited/slow
net-access currently.)

The same patch should be adaptable to trunk. To come...

Ok for 5-trunk?

Regards,
Andre
-- 
Andre Vehreschild * vehre at gcc dot gnu.org

pr70391_1.clog
Description: Binary data
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 1681d14..642ce26 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -173,6 +173,24 @@ gfc_class_len_get (tree decl)
 }
 
 
+/* Try to get the _len component of a class.  When the class is not unlimited
+   poly, i.e. no _len field exists, then return a zero node.  */
+
+tree
+gfc_class_len_or_zero_get (tree decl)
+{
+  tree len;
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref_loc (input_location, decl);
+  len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)),
+			   CLASS_LEN_FIELD);
+  return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
+	 TREE_TYPE (len), decl, len,
+	 NULL_TREE)
+			  : integer_zero_node;
+}
+
+
 /* Get the specified FIELD from the VPTR.  */
 
 static tree
@@ -250,6 +268,7 @@ gfc_vptr_size_get (tree vptr)
 
 #undef CLASS_DATA_FIELD
 #undef CLASS_VPTR_FIELD
+#undef CLASS_LEN_FIELD
 #undef VTABLE_HASH_FIELD
 #undef VTABLE_SIZE_FIELD
 #undef VTABLE_EXTENDS_FIELD
@@ -1070,7 +1089,7 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
   if (unlimited)
 {
   if (from_class_base != NULL_TREE)
-	from_len = gfc_class_len_get (from_class_base);
+	from_len = gfc_class_len_or_zero_get (from_class_base);
   else
 	from_len = integer_zero_node;
 }
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index e6544f9..38cffa4 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -356,6 +356,7 @@ tree gfc_class_set_static_fields (tree, tree, tree);
 tree gfc_class_data_get (tree);
 tree gfc_class_vptr_get (tree);
 tree gfc_class_len_get (tree);
+tree gfc_class_len_get_try (tree);
 gfc_expr * gfc_find_and_cut_at_last_class_ref (gfc_expr *);
 /* Get an accessor to the class' vtab's * field, when a class handle is
available.  */
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
new file mode 100644
index 000..d0b2a2e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
@@ -0,0 +1,40 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUBLIC :: base_type
+  END TYPE base_type
+
+  TYPE, PUBLIC :: dict_entry_type
+ CLASS( * ), ALLOCATABLE :: key
+ CLASS( * ), ALLOCATABLE :: val
+  END TYPE dict_entry_type
+
+
+contains
+
+  SUBROUTINE dict_put ( this, key, val )
+CLASS(dict_entry_type), INTENT(INOUT) :: this
+CLASS(base_type), INTENT(IN) :: key, val
+INTEGER  :: istat
+ALLOCATE( this%key, SOURCE=key, STAT=istat )
+  end SUBROUTINE dict_put
+end module mod
+
+program test
+  use mod
+  type(dict_entry_type) :: t
+  type(base_type) :: a, b
+  call dict_put(t, a, b)
+
+  if (.NOT. allocated(t%key)) call abort()
+  select type (x => t%key)
+type is (base_type)
+class default
+  call abort()
+  end select
+  deallocate(t%key)
+end
+
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90
new file mode 100644
index 000..1300069
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90
@@ -0,0 +1,47 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUBLIC :: dict_entry_type
+ CLASS( * ), ALLOCATABLE :: key
+ CLASS( * ), ALLOCATABLE :: val
+  END TYPE dict_entry_type
+
+
+contains
+
+  SUBROUTINE dict_put ( this, key, val )
+CLASS(dict_entry_type), INTENT(INOUT) :: this
+CLASS(*), INTENT(IN) :: key, val
+INTEGER  :: istat
+ALLOCATE( this%key, SOURCE=key, STAT=istat )
+ALLOCATE( this%val, SOURCE=val, STAT=istat )
+  end SUBROUTINE dict_put
+end module mod
+
+program test
+  use mod
+  type(dict_entry_type) :: t
+  call dict_put(t, "foo", 42)
+
+  if (.NOT. allocated(t%key)) call abort()
+

[Bug target/70421] [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-27 Thread zsojka at seznam dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

--- Comment #1 from Zdenek Sojka  ---
The operation done by the vmovdqa32 instruction is inverted; this fixes the
assembly (-O3, intel syntax):

@@ -72,7 +72,7 @@
        and     rsp, -64        #,
        push    QWORD PTR [r10-8]       #
        push    rbp     #
-       mov     eax, 2  # tmp108,
+       mov     eax, 0xfd       # tmp108,
        kmovw   k1, eax # tmp108, tmp108
        xor     edx, ecx        # tmp106, tmp100
        .cfi_escape 0x10,0x6,0x2,0x76,0
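
The two constants in the diff are bitwise complements of each other within
8 bits, which seems consistent with the "inverted" description.  A minimal
sketch of that relationship follows; invert_lane_mask is a hypothetical helper,
not part of GCC, and the lane count is a parameter because the comment does not
state how many mask bits are significant.

```c
#include <assert.h>
#include <stdint.h>

/* Complement MASK within the low LANES bits (LANES may be up to 32).  */
static uint32_t
invert_lane_mask (uint32_t mask, unsigned lanes)
{
  uint32_t all = lanes >= 32 ? UINT32_MAX : ((UINT32_C (1) << lanes) - 1);
  return ~mask & all;
}
```

With 8 lanes, invert_lane_mask (2, 8) gives 0xfd, matching the replacement
constant shown above.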

[Patch, Fortran] STOP issue with coarrays

2016-03-27 Thread Alessandro Fanfarillo

Dear all,

the attached patch fixes the issue reported by Anton Shterenlikht
(https://gcc.gnu.org/ml/fortran/2016-03/msg00037.html). The compiler
now delegates handling of the STOP statement to the external library
when -fcoarray=lib is used.
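
The selection the patch performs can be summarised as a small decision
function.  This is a sketch, not the code in trans-stmt.c: stop_decl_name is a
hypothetical helper that returns the name of the declaration chosen, using the
gfor_fndecl_* identifiers that appear in the patch, and the three booleans are
simplifications of the conditions tested there.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Name of the function declaration used for STOP / ERROR STOP.
   COARRAY_LIB models flag_coarray == GFC_FCOARRAY_LIB; NUMERIC models
   an integer stop code as opposed to a string (or no) stop code.  */
static const char *
stop_decl_name (bool error_stop, bool coarray_lib, bool numeric)
{
  if (error_stop)
    {
      if (coarray_lib)
        return numeric ? "gfor_fndecl_caf_error_stop"
                       : "gfor_fndecl_caf_error_stop_str";
      return numeric ? "gfor_fndecl_error_stop_numeric"
                     : "gfor_fndecl_error_stop_string";
    }

  /* Plain STOP: with the patch, the coarray library is consulted here
     too instead of always using the libgfortran routines.  */
  if (coarray_lib)
    return numeric ? "gfor_fndecl_caf_stop_numeric"
                   : "gfor_fndecl_caf_stop_str";
  return numeric ? "gfor_fndecl_stop_numeric_f08"
                 : "gfor_fndecl_stop_string";
}
```

Only the plain STOP branch with the coarray library enabled changes; the other
combinations keep the declarations they used before.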

Built and regtested on x86_64-pc-linux-gnu.

Ok for trunk and gcc-5-branch?
gcc/fortran/ChangeLog
2016-03-27  Alessandro Fanfarillo  

* trans-decl.c (gfc_build_builtin_function_decls):
caf_stop_numeric and caf_stop_str definition.
* trans-stmt.c (gfc_trans_stop): invoke external functions
for stop and stop_str when coarrays are used.
* trans.h: extern for new functions.

libgfortran/ChangeLog
2016-03-27  Alessandro Fanfarillo  

* caf/libcaf.h: caf_stop_numeric and caf_stop_str prototype.
* caf/single.c: _gfortran_caf_stop_numeric and
_gfortran_caf_stop_str implementation.

commit bb407679e918dfb9cbc769594cf39a6bd09dd9d9
Author: Alessandro Fanfarillo 
Date:   Sun Mar 27 16:42:59 2016 +0200

Adding caf_stop_str and caf_stop_numeric

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 4bd7dc4..309baf1 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -137,6 +137,8 @@ tree gfor_fndecl_caf_sendget;
 tree gfor_fndecl_caf_sync_all;
 tree gfor_fndecl_caf_sync_memory;
 tree gfor_fndecl_caf_sync_images;
+tree gfor_fndecl_caf_stop_str;
+tree gfor_fndecl_caf_stop_numeric;
 tree gfor_fndecl_caf_error_stop;
 tree gfor_fndecl_caf_error_stop_str;
 tree gfor_fndecl_caf_atomic_def;
@@ -3550,6 +3552,18 @@ gfc_build_builtin_function_decls (void)
   /* CAF's ERROR STOP doesn't return.  */
   TREE_THIS_VOLATILE (gfor_fndecl_caf_error_stop_str) = 1;
 
+  gfor_fndecl_caf_stop_numeric = gfc_build_library_function_decl_with_spec (
+get_identifier (PREFIX("caf_stop_numeric")), ".R.",
+void_type_node, 1, gfc_int4_type_node);
+  /* CAF's STOP doesn't return.  */
+  TREE_THIS_VOLATILE (gfor_fndecl_caf_stop_numeric) = 1;
+
+  gfor_fndecl_caf_stop_str = gfc_build_library_function_decl_with_spec (
+get_identifier (PREFIX("caf_stop_str")), ".R.",
+void_type_node, 2, pchar_type_node, gfc_int4_type_node);
+  /* CAF's STOP doesn't return.  */
+  TREE_THIS_VOLATILE (gfor_fndecl_caf_stop_str) = 1;
+
   gfor_fndecl_caf_atomic_def = gfc_build_library_function_decl_with_spec (
get_identifier (PREFIX("caf_atomic_define")), "R..RW",
void_type_node, 7, pvoid_type_node, size_type_node, integer_type_node,
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index cb54499..2fc43ed 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -635,7 +635,9 @@ gfc_trans_stop (gfc_code *code, bool error_stop)
 ? (flag_coarray == GFC_FCOARRAY_LIB
? gfor_fndecl_caf_error_stop_str
: gfor_fndecl_error_stop_string)
-: gfor_fndecl_stop_string,
+: (flag_coarray == GFC_FCOARRAY_LIB
+   ? gfor_fndecl_caf_stop_str
+   : gfor_fndecl_stop_string),
 2, build_int_cst (pchar_type_node, 0), tmp);
 }
   else if (code->expr1->ts.type == BT_INTEGER)
@@ -646,7 +648,9 @@ gfc_trans_stop (gfc_code *code, bool error_stop)
 ? (flag_coarray == GFC_FCOARRAY_LIB
? gfor_fndecl_caf_error_stop
: gfor_fndecl_error_stop_numeric)
-: gfor_fndecl_stop_numeric_f08, 1,
+: (flag_coarray == GFC_FCOARRAY_LIB
+   ? gfor_fndecl_caf_stop_numeric
+   : gfor_fndecl_stop_numeric_f08), 1,
 fold_convert (gfc_int4_type_node, se.expr));
 }
   else
@@ -657,7 +661,9 @@ gfc_trans_stop (gfc_code *code, bool error_stop)
 ? (flag_coarray == GFC_FCOARRAY_LIB
? gfor_fndecl_caf_error_stop_str
: gfor_fndecl_error_stop_string)
-: gfor_fndecl_stop_string,
+: (flag_coarray == GFC_FCOARRAY_LIB
+   ? gfor_fndecl_caf_stop_str
+   : gfor_fndecl_stop_string),
 2, se.expr, se.string_length);
 }
 
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 316ee9b..add0cea 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -762,6 +762,8 @@ extern GTY(()) tree gfor_fndecl_caf_sendget;
 extern GTY(()) tree gfor_fndecl_caf_sync_all;
 extern GTY(()) tree gfor_fndecl_caf_sync_memory;
 extern GTY(()) tree

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread sch...@linux-m68k.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #1 from Andreas Schwab  ---
@@ -1,5 +1,5 @@

-stage2-gcc/bitmap.o: file format elf64-littleaarch64
+stage3-gcc/bitmap.o: file format elf64-littleaarch64


 Disassembly of section .text:
@@ -4788,11 +4788,11 @@
  22c:  aa0003f8mov x24, x0
  230:  b5fff200cbnzx0, 70
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv+0x70>
  234:  9002adrpx2, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
-   234: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
+   234: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN21mem_alloc_descriptionI12bitmap_usageEC2Ev.str1.8
  238:  9000adrpx0, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
238: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE26find_empty_slot_for_expandEj.str1.8+0x20
  23c:  9142add x2, x2, #0x0
-   23c: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
+   23c: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN21mem_alloc_descriptionI12bitmap_usageEC2Ev.str1.8
  240:  9100add x0, x0, #0x0
240: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE26find_empty_slot_for_expandEj.str1.8+0x20
  244:  528051a1mov w1, #0x28d  // #653
@@ -4827,13 +4827,13 @@
  2a4:  f920str x0, [x1]
  2a8:  177ab   90
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv+0x90>
  2ac:  9002adrpx2, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
-   2ac: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x10
+   2ac: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
  2b0:  9000adrpx0, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
-   2b0: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x28
+   2b0: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x18
  2b4:  9142add x2, x2, #0x0
-   2b4: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x10
+   2b4: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
  2b8:  9100add x0, x0, #0x0
-   2b8: R_AARCH64_ADD_ABS_LO12_NC

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-27 Thread olegendo at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

Oleg Endo  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #38105|0                           |1
        is obsolete|                            |

--- Comment #13 from Oleg Endo  ---
Created attachment 38108
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38108&action=edit
reduced test case for -O2 -fpic

It seems it can be reduced even a bit further.

[Bug bootstrap/70422] New: [6 regression] Bootstrap comparison failure

2016-03-27 Thread sch...@linux-m68k.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Bug ID: 70422
   Summary: [6 regression] Bootstrap comparison failure
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sch...@linux-m68k.org
CC: jason at gcc dot gnu.org
Blocks: 64266, 70353
  Target Milestone: ---
Target: aarch64-*-*, ia64-*-*

Both aarch64 and ia64 fail to bootstrap due to a comparison failure.

a478a028f1e445c05b162236d708de6935d4b5e2 is the first bad commit
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234484
138bc75d-0d04-0410-961f-82ee72b054a4


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64266
[Bug 64266] Can GCC produce local mergeable symbols for *.__FUNCTION__ and
*.__PRETTY_FUNCTION__ functions?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70353
[Bug 70353] [5/6 regression] ICE on __PRETTY_FUNCTION__ in a constexpr function

New Danish PO file for 'cpplib' (version 6.1-b20160131)

2016-03-27 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Danish team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/da.po

(This file, 'cpplib-6.1-b20160131.da.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Contents of PO file 'cpplib-6.1-b20160131.da.po'

2016-03-27 Thread Translation Project Robot



cpplib-6.1-b20160131.da.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

[Bug fortran/70235] [4.9/5/6 Regression] Incorrect output with PF format

2016-03-27 Thread dominiq at lps dot ens.fr

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70235

--- Comment #22 from Dominique d'Humieres  ---
Created attachment 38107
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38107&action=edit
New patch with test.

With the patch we now get for y=6431.25

ru,-8pf18.2 y=  0.01

IMO this is the correct rounding. Does someone disagree with that?

What tests should be removed/added from gfortran.dg/fmt_pf.f90?

[Bug bootstrap/67728] Build fails when cross-compiling with in-tree GMP and ISL

2016-03-27 Thread bernd.edlinger at hotmail dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67728

--- Comment #26 from Bernd Edlinger  ---
with unpatched trunk and mpfr-3.1.4 and mpc-1.0.3 in-tree

I've got this in mpc/src/libmpc.la:
dependency_libs=' -lmpfr /home/ed/gnu/gcc-build1/./gmp/.libs/libgmp.la -lm'

and check-mpc fails to build this:
libtool: link: /home/ed/gnu/gcc-build1/./prev-gcc/xgcc
-B/home/ed/gnu/gcc-build1/./prev-gcc/
-B/home/ed/gnu/install1/x86_64-pc-linux-gnu/bin/
-B/home/ed/gnu/install1/x86_64-pc-linux-gnu/bin/
-B/home/ed/gnu/install1/x86_64-pc-linux-gnu/lib/ -isystem
/home/ed/gnu/install1/x86_64-pc-linux-gnu/include -isystem
/home/ed/gnu/install1/x86_64-pc-linux-gnu/sys-include -g -O2 -static-libstdc++
-static-libgcc -o tabs tabs.o  ./.libs/libmpc-tests.a ../src/.libs/libmpc.a
-lmpfr /home/ed/gnu/gcc-build1/./gmp/.libs/libgmp.a -lm
/usr/bin/ld: cannot find -lmpfr

and with the patch this line in mpc/src/libmpc.la changed to:
dependency_libs=' /home/ed/gnu/gcc-build/./mpfr/.libs/libmpfr.la
/home/ed/gnu/gcc-build/./gmp/.libs/libgmp.la'

and the check-mpc succeeds on a plain x86_64-ubuntu14.04 with definitely no
gmp or mpfr libs installed

Re: Constexpr in intrinsics?

2016-03-27 Thread Marc Glisse


On Sun, 27 Mar 2016, Allan Sandfeld Jensen wrote:

> Would it be possible to add constexpr to the intrinsics headers?
>
> For instance _mm_set_XX and _mm_setzero intrinsics.

Already suggested here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65197

A patch would be welcome (I started doing it at some point, I don't
remember if it was functional, the patch is attached).

> Ideally it could also be added to all intrinsics that can be evaluated at
> compile time, but it is harder to tell which those are.
>
> Does gcc have a C extension we can use to set constexpr?

What for?

--
Marc Glisse

Index: gcc/config/i386/avx2intrin.h
===
--- gcc/config/i386/avx2intrin.h(revision 223886)
+++ gcc/config/i386/avx2intrin.h(working copy)
@@ -93,41 +93,45 @@ _mm256_packus_epi32 (__m256i __A, __m256
   return (__m256i)__builtin_ia32_packusdw256 ((__v8si)__A, (__v8si)__B);
 }
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_packus_epi16 (__m256i __A, __m256i __B)
 {
   return (__m256i)__builtin_ia32_packuswb256 ((__v16hi)__A, (__v16hi)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_add_epi8 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v32qu)__A + (__v32qu)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_add_epi16 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v16hu)__A + (__v16hu)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_add_epi32 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v8su)__A + (__v8su)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_add_epi64 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v4du)__A + (__v4du)__B);
 }
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_adds_epi8 (__m256i __A, __m256i __B)
@@ -167,20 +171,21 @@ _mm256_alignr_epi8 (__m256i __A, __m256i
 }
 #else
 /* In that case (__N*8) will be in vreg, and insn will not be matched. */
 /* Use define instead */
 #define _mm256_alignr_epi8(A, B, N)   \
   ((__m256i) __builtin_ia32_palignr256 ((__v4di)(__m256i)(A), \
(__v4di)(__m256i)(B),  \
(int)(N) * 8))
 #endif
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_and_si256 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v4du)__A & (__v4du)__B);
 }
 
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_andnot_si256 (__m256i __A, __m256i __B)
@@ -219,69 +224,77 @@ _mm256_blend_epi16 (__m256i __X, __m256i
   return (__m256i) __builtin_ia32_pblendw256 ((__v16hi)__X,
  (__v16hi)__Y,
   __M);
 }
 #else
 #define _mm256_blend_epi16(X, Y, M)\
   ((__m256i) __builtin_ia32_pblendw256 ((__v16hi)(__m256i)(X), \
(__v16hi)(__m256i)(Y), (int)(M)))
 #endif
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpeq_epi8 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v32qi)__A == (__v32qi)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpeq_epi16 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v16hi)__A == (__v16hi)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpeq_epi32 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v8si)__A == (__v8si)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpeq_epi64 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v4di)__A == (__v4di)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpgt_epi8 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v32qi)__A > (__v32qi)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpgt_epi16 (__m256i __A, __m256i __B)
 {
   return (__m256i) ((__v16hi)__A > (__v16hi)__B);
 }
 
+__GCC_X86_CONSTEXPR11
 extern __inline __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_cmpgt_epi32 (__m256i __A, __m256i __B)
 {
   return (__m256i)

Re: [PATCH 3/4, libgomp] Resolve deadlock on plugin exit, HSA plugin parts

2016-03-27 Thread Chung-Lin Tang

On 2016/3/25 02:40 AM, Martin Jambor wrote:
> On the whole, I am fine with the patch but there are two issues:
> 
> First, and generally, when you change the return type of a function,
> you must document what return values mean in the comment of the
> function.  Most importantly, it must be immediately apparent whether a
> function returns true or false on failure from its comment.  So please
> fix that.

Thanks, I'll update on that.

>> >  /* Callback of dispatch queues to report errors.  */
>> > @@ -454,7 +471,7 @@ queue_callback (hsa_status_t status,
>> >hsa_queue_t *queue __attribute__ ((unused)),
>> >void *data __attribute__ ((unused)))
>> >  {
>> > -  hsa_fatal ("Asynchronous queue error", status);
>> > +  hsa_error ("Asynchronous queue error", status);
>> >  }
> ...I believe this hunk is wrong.  Errors reported in this way mean
> that something is very wrong and generally happen during execution of
> code on HSA GPU, i.e. within GOMP_OFFLOAD_run.  And since you left
> calls in create_single_kernel_dispatch, which is called as a part of
> GOMP_OFFLOAD_run, intact, I believe you actually want to leave
> hsa_fatal here too.

Yes, a fatal exit is okay within the 'run' hook, since we're not holding
the device lock there. I was only trying to audit the GOMP_OFFLOAD_init_device()
function, where the queues are created.

I'm not familiar with the HSA runtime API; will the callback only be triggered
during GPU kernel execution (inside the 'run' hook), and not for example,
within hsa_queue_create()? If so, then yes as you advised, the above change to
queue_callback() should be reverted.

Chung-Lin

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-27 Thread kugan at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #12 from kugan at gcc dot gnu.org ---
However, the diff of the cfgexpand dumps is significantly different:
 ;; Full RTL generated for this function:
 ;;
32: NOTE_INSN_DELETED
-   38: NOTE_INSN_BASIC_BLOCK 2
+   39: NOTE_INSN_BASIC_BLOCK 2
33: r151:SI=r0:SI
34: r152:SI=r1:SI
35: r153:SI=r2:SI
36: NOTE_INSN_FUNCTION_BEG
-   40: {r141:SI=abs(r151:SI);clobber cc:CC;}
-   41: r154:SI=r153:SI-0x1
-   42: r142:SI=r152:SI+r154:SI
-   43: r155:SI=0
-   44: r156:QI=r155:SI#0
-   45: [r142:SI]=r156:QI
-   61: L61:
-   46: NOTE_INSN_BASIC_BLOCK 4
-   47: r142:SI=r142:SI-0x1
-   48: r1:SI=0xa
-   49: r0:SI=r141:SI
-   50: r0:DI=call [`__aeabi_uidivmod'] argc:0
+   41: {r141:SI=abs(r151:SI);clobber cc:CC;}
+   42: r154:SI=r153:SI-0x1
+   43: r142:SI=r152:SI+r154:SI
+   44: r155:SI=0
+   45: r156:QI=r155:SI#0
+   46: [r142:SI]=r156:QI
+   81: pc=L62
+   82: barrier
+   84: L84:
+   83: NOTE_INSN_BASIC_BLOCK 4
+   37: r142:SI=r150:SI
+   62: L62:
+   47: NOTE_INSN_BASIC_BLOCK 5
+   48: r150:SI=r142:SI-0x1
+   49: r1:SI=0xa
+   50: r0:SI=r141:SI
+   51: r0:DI=call [`__aeabi_uidivmod'] argc:0
   REG_CALL_DECL `__aeabi_uidivmod'
   REG_EH_REGION 0x8000
-   51: r162:SI=r1:SI
+   52: r162:SI=r1:SI
   REG_EQUAL umod(r141:SI,0xa)
-   52: r163:QI=r162:SI#0
-   53: r164:SI=r163:QI#0+0x30
-   54: r165:QI=r164:SI#0
-   55: [r142:SI]=r165:QI
-   56: r1:SI=0xa
-   57: r0:SI=r141:SI
-   58: r0:SI=call [`__aeabi_uidiv'] argc:0
+   53: r163:QI=r162:SI#0
+   54: r164:SI=r163:QI#0+0x30
+   55: r165:QI=r164:SI#0
+   56: [r150:SI]=r165:QI
+   57: r1:SI=0xa
+   58: r0:SI=r141:SI
+   59: r0:SI=call [`__aeabi_uidiv'] argc:0
   REG_CALL_DECL `__aeabi_uidiv'
   REG_EH_REGION 0x8000
-   59: r169:SI=r0:SI
+   60: r169:SI=r0:SI
   REG_EQUAL udiv(r141:SI,0xa)
-   60: r141:SI=r169:SI
-   62: cc:CC=cmp(r141:SI,0)
-   63: pc={(cc:CC!=0)?L61:pc}
+   61: r141:SI=r169:SI
+   63: cc:CC=cmp(r141:SI,0)
+   64: pc={(cc:CC!=0)?L84:pc}
   REG_BR_PROB 9100
-   64: NOTE_INSN_BASIC_BLOCK 5
-   65: cc:CC=cmp(r151:SI,0)
-   66: pc={(cc:CC>=0)?L72:pc}
+   65: NOTE_INSN_BASIC_BLOCK 6
+   66: cc:CC=cmp(r151:SI,0)
+   67: pc={(cc:CC>=0)?L77:pc}
   REG_BR_PROB 6335
-   67: NOTE_INSN_BASIC_BLOCK 6
-   68: r149:SI=r142:SI-0x1
-   69: r170:SI=0x2d
-   70: r171:QI=r170:SI#0
-   71: [r142:SI-0x1]=r171:QI
-   37: r142:SI=r149:SI
-   72: L72:
-   73: NOTE_INSN_BASIC_BLOCK 7
-   74: r150:SI=r142:SI
+   68: NOTE_INSN_BASIC_BLOCK 7
+   69: r149:SI=r142:SI-0x2
+   70: r170:SI=0x2d
+   71: r171:QI=r170:SI#0
+   72: [r150:SI-0x1]=r171:QI
+   38: r150:SI=r149:SI
+   77: L77:
+   80: NOTE_INSN_BASIC_BLOCK 9
78: r0:SI=r150:SI
79: use r0:SI

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-27 Thread kugan at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #11 from kugan at gcc dot gnu.org ---
The optimized GIMPLE diff between 5.3 and trunk is:

-;; Function inttostr (inttostr, funcdef_no=0, decl_uid=5268, cgraph_uid=0,
symbol_order=0)
+;; Function inttostr (inttostr, funcdef_no=0, decl_uid=4222, cgraph_uid=0,
symbol_order=0)

 Removing basic block 7
 Removing basic block 8
@@ -43,7 +43,7 @@
 goto ;

   :
-  p_22 = p_2 + 4294967294;
+  p_22 = p_16 + 4294967295;
   MEM[(char *)p_16 + 4294967295B] = 45;

   :

Constexpr in intrinsics?

2016-03-27 Thread Allan Sandfeld Jensen

Would it be possible to add constexpr to the intrinsics headers?

For instance _mm_set_XX and _mm_setzero intrinsics.

Ideally it could also be added to all intrinsics that can be evaluated at
compile time, but it is harder to tell which those are.

Does gcc have a C extension we can use to set constexpr?

Best regards
Allan

[Ada] Fix segfault on double record extension

2016-03-27 Thread Eric Botcazou

This is a regression present on the mainline and 5 branch, for a double record 
extension involving a size clause on the root type and a discriminant with 
variant part on the first extension...

Tested on x86_64-suse-linux, applied on the mainline and 5 branch.


2016-03-27  Eric Botcazou  

* gcc-interface/decl.c (components_to_record): Add special case for 
single field with representation clause at offset 0.


2016-03-27  Eric Botcazou  

* gnat.dg/specs/double_record_extension3.ads: New test.


-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 234493)
+++ gcc-interface/decl.c	(working copy)
@@ -7606,6 +7606,23 @@ components_to_record (tree gnu_record_ty
   if (p_gnu_rep_list && gnu_rep_list)
 *p_gnu_rep_list = chainon (*p_gnu_rep_list, gnu_rep_list);
 
+  /* If only one field has a rep clause and it starts at 0, put back the field
+ at the head of the regular field list.  This will avoid creating a useless
+ REP part below and deal with the annoying case of an extension of a record
+ with variable size and rep clause, for which the _Parent field is forced
+ at offset 0 and has variable size, which we do not support below.  */
+  else if (gnu_rep_list
+	   && !DECL_CHAIN (gnu_rep_list)
+	   && !variants_have_rep
+	   && first_free_pos
+	   && integer_zerop (first_free_pos)
+	   && integer_zerop (bit_position (gnu_rep_list)))
+{
+  DECL_CHAIN (gnu_rep_list) = gnu_field_list;
+  gnu_field_list = gnu_rep_list;
+  gnu_rep_list = NULL_TREE;
+}
+
   /* Otherwise, sort the fields by bit position and put them into their own
  record, before the others, if we also have fields without rep clause.  */
   else if (gnu_rep_list)
-- { dg-do compile }

package Double_Record_Extension3 is

   type Rec1 is tagged record
  Id : Integer;
   end record;

   for Rec1 use record
  Id at 8 range 0 .. 31;
   end record;

   type Rec2 (Size : Integer) is new Rec1 with record
  Data : String (1 .. Size);
   end record;

   type Rec3 is new Rec2 (Size => 128) with record
  Valid : Boolean;
   end record;

end Double_Record_Extension3;

[Bug target/70421] New: [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-27 Thread zsojka at seznam dot cz

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

            Bug ID: 70421
           Summary: [5/6 Regression] wrong code with v16si vector and
                    useless cast at -O -mavx512f
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---
            Target: x86_64-pc-linux-gnu

Created attachment 38106
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38106&action=edit
reduced testcase

Output: (using emulation)
$ x86_64-pc-linux-gnu-gcc -O -mavx512f testcase.c
$ sde64 -- ./a.out 
1010
Aborted

$ x86_64-pc-linux-gnu-gcc -v 
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-234469-checking-yes-rtl-df-nographite/bin/../libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-checking=yes,rtl,df --without-cloog --without-ppl --without-isl
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-234469-checking-yes-rtl-df-nographite
Thread model: posix
gcc version 6.0.0 20160324 (experimental) (GCC) 

Tested revisions:
trunk r234469 - FAIL
5-branch r234412 - FAIL
4_9-branch r234243 - OK

[Bug bootstrap/67728] Build fails when cross-compiling with in-tree GMP and ISL

2016-03-27 Thread andrewm.roberts at sky dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67728

--- Comment #25 from Andrew Roberts  ---
The patch works on native armv7l-unknown-linux-gnuabihf with:
gcc-6-20160320
and in tree
gmp 6.1.0
mpc 1.0.3
mpfr 3.1.4
isl 0.16.1

although I wasn't seeing a problem with check-mpc.
At least the build completes without needing the GMP snapshot or using sed
to change none- to `uname -m`- in the makefile.

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-27 Thread kkojima at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

--- Comment #12 from Kazumoto Kojima  ---
Created attachment 38105
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38105&action=edit
reduced test case for -O2 -fpic

The reload_as_needed function in reload1.c emits the error message via
error_for_asm when it checks whether the reload insns are valid.  The
problematic reload insn is:

(gdb) fr 1
#1  0x08636702 in reload_as_needed (live_known=live_known@entry=1)
at ../../ORIG/trunk/gcc/reload1.c:4698
4698 "impossible reload");
(gdb) call debug_rtx(p) 
(insn 516 508 510 18 (set (reg:SI 0 r0)
(plus:SI (reg:SI 2 r2)
(const_int 4 [0x4]))) xxx.i:100 67 {*addsi3}
 (nil))

which is invalid.  It looks like that complex asm expression requires
an add insn that is invalid on SH in order to reload an operand of the
asm insn.  Unfortunately I have no idea how to fix this issue.
